NordSteve's comments | Hacker News

And I'm sure they'll never forget to connect an AI feature to it.


Your point here being? In that case people will complain, and the next release will have it included in the settings.


A school photography company I worked for used a custom Kodak stock. They were unsatisfied with how Kodak's standard portrait film handled darker skin tones.

They were super careful to maintain the look across the transition from film to digital capture. Families display multiple years of school photos next to each other and they wanted a consistent look.


I'm guessing you don't live in a rural area in the US.


I grew up in South GA and my family still lives there. I got the hell out of dodge the week after graduating from college in 1996.

I said in another reply, I’m all for the state and the federal government helping rural America where their own tax base isn’t strong enough. I’m also for universal health care that would help rural areas far more than me. I wouldn’t complain about my taxes paying for it.

It’s rural America that keeps voting for local, state, and federal politicians that put them in this place.


> It’s rural America that keeps voting for local, state, and federal politicians that put them in this place.

In my experience, not everyone's primary policy goal is to ensure that as much taxpayer money as possible gets redistributed in their favor.

Of course, this isn't to say that the problem you described (of people wanting government services but not wanting to pay for them) does not exist, but I find that to be applicable broadly, not just to rural America.


Agree. I live in a (fairly) rural area, and most of the people out here don't want much more from the government than maybe roads to drive on, and to be left alone.


Counter-point: I live in a kinda rural area, and the people out here want the government to enforce their religious, cultural, and misinformed beliefs on everyone else, everywhere else. This is typically a higher priority than roads or public safety.


This is my experience as well. Local issues, local gossip and local struggles.


Are you saying they are going to refuse Trump’s bailouts for farmers?

Are you saying they don’t want hospitals nearby because that means they would have to accept government help? I’m sure even their basic infrastructure is funded from sources outside of their community.


I don't know any real farmers. Have a few neighbors with small herds of cows but that's not their only or even primary source of income. I try not to talk politics much.

People live out here knowing full well that the nearest hospital is 30 minutes away and if they need to call an ambulance it's probably going to be at least 15-30 minutes, maybe longer, before it shows up.


But you do know human nature. Do you think conservative farmers who are struggling are going to refuse to take government funds?

If someone is having a heart attack, are they going to refuse to go to a hospital that is backed by the federal government?


I doubt they would refuse government money, IDK.

It's kind of how I feel about Social Security. I'm not going to refuse it, especially since I have paid into it my entire working life. But I'd rather it didn't exist, and I know it isn't sustainable.


That isn’t true either. Worst case, if nothing changes, there would need to be a 20% cut in benefits. I am 51 and have modeled that in my retirement projections.


Yes, it’s not their goal until they go hungry and they are homeless.

Then they beg the president for handouts because they can’t sell the food they are producing.

It’s only socialism when it helps someone else.


Their choice is between two private clubs who both cut services when in power, and are both taking huge amounts of money from private equity.

The Democrats spent the beginning of the 2016 cycle all pretending to be for universal health care (literally the only reason why Buttigieg and Harris got on stage), then spent the rest of the cycle dishonestly campaigning against it while fixing a primary. As soon as Bernie lost, health insurance and healthcare stocks had their highest stock price bump in history.

https://www.independent.co.uk/news/world/americas/us-politic...

I agree with your opinions on government funding, I just find it gross when it's used as an excuse to put the blame on the powerless. The powerless are not powerless because they choose to be, they're powerless because they are restrained by the powerful. Not only is the information they receive about the world restricted and their educations propagandistic, but if they voted for what the powerful didn't want, their vote would be ignored.

On the other hand, you've been convinced to blame the powerless for the crimes of the powerful, so I don't know how you're any less of a sucker. I guess you're wealthier than they are, so maybe you're a support system for the people doing the suckering.

edit: I had to add the last, because this type of argument is something I consistently hear from people who are paid to do the exploiting. If you're a thief, you figure out a way to blame the people you steal from.


Yes because in 2020 when middle of the road Biden barely eked out a win against Trump, I’m sure Bernie would have won.

Bernie’s ideas were way too far out in left field for me, and I consider myself to be a bleeding-heart capitalist pig - i.e., let companies make money, tax them, and use the money to provide a safety net. Also, take away the idea of your insurance being tied to your employment.


I do.

We have paid police, because we want law and order. We have paid dump/collection center workers, because you need a place to take trash. We have paid teachers and school staff, because we want a good education for our kids. We have paid road maintenance workers, because it's really helpful to have properly maintained roads. We have paid librarians, because libraries are one of the core community centers in the area. We have paid animal control workers, because rabies is scary. We pay for ambulance service because sometimes you need medical attention asap.

And we have volunteer fire fighters, because stopping fires, in a rural, wildfire prone area is, what? Optional? Just a side gig? Something you do just for fun?

A big part of the confusion people have is that "volunteer" fire departments often include pay. It's not a full-time job, but they at least get paid for their calls. That's not always true, though, and it's weird. It's an artifact of history that our different layers of government have divvied up basic services amongst themselves in a way that leaves fire fighting as a local concern that may or may not involve paid professionals, while the sheriff and local police will be paid professionals, the roads will be maintained, and the school will have teachers, and principals, and custodians, and people running the cafeteria, and so on.

Why do we not have a "volunteer police force"? Because we treat it as a full-time position for career police officers. It's weird that this one very critical service uses "volunteers" while most of the others are full-time, paid positions, and I find it confusing and weird, despite having grown up in rural NC, just outside a town of under a thousand, and now living at the other end of rural NC, outside the limits of a different town of under a thousand people.


No big company would bother with an acquisition if the top result is 0.05% increase in revenue.


At one job, we had a garish chicken hat that lived in your office if you were the last one to break the build.

This was in the days before automated CI, so a broken commit meant that someone wasn't running the required tests.


Bad for your power bill though.


I'm sure 5 rpis will devastate the power grid


I need to heat my house too so maybe it helps a little there.


You still pay for power for the cloud.


Still less than renting the same amount of compute. Somewhere between several months and a couple years you pull ahead on costs. Unless you only run your lab a few hours a day.
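
The break-even arithmetic is easy to sanity-check. A minimal sketch; the hardware cost, draw, unit price, and cloud rent here are all made-up placeholder numbers, not anyone's real bill:

  # Rough buy-vs-rent break-even for a homelab (all figures are placeholders).
  hardware_cost = 400.0   # EUR, paid up front
  watts = 25.0            # average draw of the lab
  price_per_kwh = 0.30    # EUR per kWh
  cloud_rent = 40.0       # EUR/month for comparable rented compute

  power_per_month = watts / 1000 * 24 * 30 * price_per_kwh   # ~5.4 EUR
  monthly_saving = cloud_rent - power_per_month
  print(f"break even after {hardware_cost / monthly_saving:.1f} months")  # ~11.6

With these placeholder numbers you pull ahead in about a year, consistent with the "several months to a couple years" range, depending on what you buy and what you'd otherwise rent.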


I got past that back when I was paying for ISDN and had 5 surplus desktop PCs... write it off as 'professional development'.


What does a few rpis cost on a monthly basis?


Depends. At full load? At Irish power prices? Just the Pi, no peripherals, no NVMe? 5 units? €13/mo.

Handy: https://700c.dk/?powercalc

My Pi CM4 NAS with a PCIe switch, SATA and USB3 controllers, 6 SATA SSDs, 2 VMs, 2 LXC containers, and a Nextcloud snap pretty much sits at 17 watts most of the time, hitting 20 when a lot is being asked of it, and 26-27W at absolute max with all I/O and CPU cores pegged. €3.85/mo if I pay ESB, but I like to think that it runs fully off the solar and batteries :)
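
For anyone checking the arithmetic, a minimal sketch of the watts-to-cost conversion; the unit price is approximated from the €3.85 figure above:

  # Convert a steady power draw into a monthly electricity cost.
  def monthly_cost(watts: float, price_per_kwh: float) -> float:
      hours_per_month = 24 * 30
      return watts / 1000 * hours_per_month * price_per_kwh

  print(monthly_cost(17, 0.31))  # ~3.8 EUR/month: the NAS at its usual 17 W
  print(monthly_cost(27, 0.31))  # ~6.0 EUR/month: absolute max, everything pegged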


> Depends. At full load? At Irish power prices? Just the Pi, no peripherals, no NVMe? 5 units? €13/mo.

Pretty sure most of us aren't running anywhere close to full load 24/7, but whoa, Irish power is expensive. In the central US I pay $0.14/kWh.


Yeah, it's brutal. Was €0.39 right after Mad Vlad kicked off his vanity conflict.


That’s rough. What’s your progress on renewables? Wind has made electricity really cheap in my state and I would think Ireland would be pretty windy (esp offshore)?


Ireland has had hydro for a century[1], and wind and tidal are productive here. There are wind farms EVERYWHERE around where I live (mountains, Cork/Limerick border). There are solar farms, as well, but sun is not our strong suit. Trouble for individuals is that small, affordable wind turbines are basically useless, and most people don't have hundreds of thousands of Euros plus planning permission to erect megawatt-scale units, so solar is pretty much it.

[1] https://en.wikipedia.org/wiki/Shannon_hydroelectric_scheme


cries in west coast peak $0.71/kWh rate


Bought my IBM PC from Sears back in the day. Still have the receipt.


Worthy of its own Hacker News post. Would love to see it.


Yup, I agree, GP.

Today is the first time I heard of Sears, and the comment about the Sears Tower and IBM literally gave me goosebumps.


Here are the results I just got.

1. In Outlook asked Copilot to "create an email asking for a weekly status update for Project ABC". Iterating a bit, I asked it to be more casual and shorter. Here's what I got:

  Hey Team,

  Hope you're all doing great. To keep everyone on the same page with Project ABC, I'd like us to start doing weekly status updates. Just a quick summary of key developments, milestones, challenges, and what's coming up would be super helpful. This way, we can coordinate better and tackle any issues right away.

  Thanks for your cooperation and hard work. Looking forward to your updates!
That seems pretty on point. Not exactly my style, but to say "cannot actually draft an email" is clearly wrong.

2. I provided Copilot with this prompt:

"I am upgrading the electrical service at my house from 100a to 200a service. Create a project plan in Planner for this project."

Here's the response I got:

  Nice! Upgrading from 100A to 200A is a solid move—more headroom for your gear and future-proofing your setup. Here's a breakdown of how you could structure this as a project in Microsoft Planner, with buckets and tasks to keep it organized:

  ---

  ### **Bucket 1: Planning & Permitting**
  - **Assess current electrical load and future needs**
  - **Consult with a licensed electrician**
  - **Request utility service requirements**
  - **Obtain local permits for electrical service upgrade**
  - **Review NEC and local code requirements**

  ---

  ### **Bucket 2: Materials & Scheduling**
  - **Select upgraded panel (200A rated) and breakers**
  - **Order required materials (panel, feeder wires, conduit, ground rods, etc.)**
  - **Coordinate shutdown and install time with utility provider**
  - **Schedule rough-in inspection with local authority**

  ---

  ### **Bucket 3: Installation**
  - **Shut off power at the main disconnect (coordinate with utility)**
  - **Remove existing 100A panel and components**
  - **Install 200A panel and connect feeders**
  - **Upgrade grounding/bonding system as required**
  - **Replace meter socket if required**

  ---

  ### **Bucket 4: Inspections & Finalization**
  - **Schedule and pass electrical inspection**
  - **Have utility restore permanent power**
  - **Label circuits clearly in new panel**
  - **Update homeowner insurance with new electrical capacity**

  ---

  ### **Bucket 5: Post-Project Wrap-Up**
  - **Document panel layout and upgrades**
  - **Evaluate energy use and plan for future expansion (EV charger, generator, etc.)**
  - **Close out permits with inspection authority**

  ---

  You can assign these tasks to yourself or any contractors involved, and add due dates to align with your timelines. Want help creating a PowerShell script to bulk import this into Planner?
It did produce a working script that required one configuration edit. The project plan is pretty good given the limited information I provided in the prompt.
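
For the curious, the bulk import is a thin wrapper over the Microsoft Graph Planner endpoints. A minimal sketch of the same idea in Python rather than PowerShell; the access token and plan ID are placeholders you would supply, and auth and error handling are omitted:

  import requests

  GRAPH = "https://graph.microsoft.com/v1.0"
  HEADERS = {
      "Authorization": "Bearer <ACCESS_TOKEN>",  # placeholder token
      "Content-Type": "application/json",
  }
  PLAN_ID = "<YOUR_PLAN_ID>"  # placeholder plan id

  # Two of the buckets from the Copilot output, trimmed for brevity.
  plan = {
      "Planning & Permitting": [
          "Assess current electrical load and future needs",
          "Consult with a licensed electrician",
      ],
      "Installation": [
          "Remove existing 100A panel and components",
          "Install 200A panel and connect feeders",
      ],
  }

  for bucket_name, tasks in plan.items():
      # Create the bucket, then hang its tasks off the returned id.
      bucket = requests.post(
          f"{GRAPH}/planner/buckets", headers=HEADERS,
          json={"name": bucket_name, "planId": PLAN_ID, "orderHint": " !"},
      ).json()
      for title in tasks:
          requests.post(
              f"{GRAPH}/planner/tasks", headers=HEADERS,
              json={"planId": PLAN_ID, "bucketId": bucket["id"], "title": title},
          )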

3. I provided Copilot with this prompt:

  book a meeting with <name> for August 1 entitled "Test Meeting"
Not a very good response:

  I can't directly book meetings or access your calendar, but I can help you set it up! Here's an example of what to enter in Outlook or your preferred calendar app:

  **Title:** Test Meeting  
  **Date:** August 1  
  **Attendees:** Bryan Herbst  
  **Time:** [Insert your preferred time]  
  **Location:** [Optional]  
  **Description:** [Add any notes or agenda items]

  If you're using PowerShell or a script to automate calendar events—especially within Microsoft 365—I can help you write that too. Want to go that route?


It seems like these examples show that it has about the same capabilities as a basic chat interface like Claude or ChatGPT without any benefit from being integrated with the Microsoft ecosystem.


You nailed it. Microsoft should have a huge advantage with depth of integration, but for some reason treats Copilot in Office as a glorified chat iframe. It's a huge missed opportunity.


That first email is confusing. If I received that email I'd assume my manager was going to be the one sending out the updates.

It needs to be a lot more clear and direct about the expectations of others.

"I'd like us to do X" is super passive, and a thousand miles from "You must do X"


If you ask it to 'send' an email instead, as I did the first time I saw it pop up (in my email client...), my first real use case, it will tell you it can't, just like your calendar example.

Even 'draft', really: if you forget what you've done, close your eyes, and think about it from scratch, surely you'd expect that to be drafted in your ...drafts?


What a useless exercise. OP was saying the AI just does text output, and cannot DO any of these things.


These are trash.

1) "Iterating a bit". Like that email is barely 5 sentences, you could write it faster than coaxing it out of Copilot

2) It is drivel. Could have stopped at "Consult with a licensed electrician".

3) Well...


Cool, so you've spent just as much time "iterating a bit" as you would have done typing, your thinking and typing skills have atrophied a bit more, and you've made your colleagues lives that bit duller by sending them something written by the "average person".


I pasted your prompt:

  ffmpeg command to convert movie.mov into a reasonably sized mp4
into the Copilot app just now. Here's the response:

  ffmpeg -i movie.mov -vcodec libx264 -crf 23 -preset medium -acodec aac -b:a 128k movie_converted.mp4
with this explanation

  Here’s the breakdown:
  - -vcodec libx264: Uses the H.264 video codec, great for compression and compatibility.
  - -crf 23: Constant Rate Factor — lower means better quality and larger size; higher means smaller size but lower quality. 23 is a nice middle ground.
  - -preset medium: Controls encoding speed vs. compression efficiency. You can try slow or faster depending on your needs.
  - -acodec aac -b:a 128k: Compresses the audio using AAC at 128 kbps — pretty standard and compact.
  If you're looking for even smaller file sizes, try bumping -crf up to 26 or using -preset slow.
  Want to bake in scaling, remove metadata, or trim the video too? Happy to tailor the command further.


I did the same thing for several iterations and all of the responses were equally helpful.

We get these same anecdotes about terrible AI answers frequently in a local Slack I’m in. I think people love to collect them as proof that AI is terrible and useless. Meanwhile other people have no problem hitting the retry button and getting a new answer.

Some of the common causes of bad or weird responses that I’ve learned from having this exact same conversation over and over again:

- Some people use one never-ending singular session with Copilot chat, unaware that past context is influencing the answer to their next question. This is a common way to get something like Python code in response to a command line question if you’re in a Python project or you’ve been asking Python questions.

- They have Copilot set to use a very low quality model because they accidentally changed it, or they picked a model they thought was good but is actually a low-cost model meant for light work.

- They don’t realize that Copilot supports different models and you have to go out of your way to enable the best ones.

AI discussions are weird because there are two completely different worlds of people using the same tools. Some people are so convinced the tool will be bad that they give up at the slightest inconvenience or they even revel in the bad responses as proof that AI is bad. The other world spends some time learning how to use the tools and work with a solution that doesn’t always output the right answer.

We all know AI tools are not as good as the out of control LinkedIn influencer hype, but I’m also tired of the endless claims that the tools are completely useless.


The "pick your model" thing is so stupid.

"How dumb do you want your AI to be?"

"Why do I have to select?"

"Because smart costs money"

"So... I can have dumb AI but it's cheaper?"

"Yes"

"How would the average person know which to pick?"

"Oh you can't know."

I hope they can invent an AI that knows which AI model my question should target cheaply.


And then the model names & descriptions are virtually useless at providing any guidance.

ChatGPT lets me choose between GPT-4o ("Great for most tasks"), o3 ("Uses advanced reasoning"), o4-mini ("Fastest at advanced reasoning"), and o4-mini-high ("Great at coding and visual reasoning").

Is what I'm doing "most tasks"? How do I know when I want "advanced reasoning"? Great, I want advanced reasoning, so I should choose the faster one with the higher version number, right? etc.


Then there's GPT-4.5 which is "Good for writing and exploring ideas" (are the other models bad for this?), and GPT-4.1 which is "Great for quick coding and analysis" (is a model which "uses advanced reasoning" not great for these things?)


Can you describe your task and then ask ChatGPT which model you should use?


This presents the same problem, since none of the models are indicated to be best at choosing the model to use for a task.


Try different ones out and learn which works best for what type of work?


Without getting too much into semantics, I would suspect that most individuals would have trouble classifying their "type of work" against an opaque set of "type of work" classifiers buried in a model.


He was suggesting that you try different models for the same thing and see which output you like best. It's tedious but at least you get an answer.


Can't you just run a few examples by hand to see how they perform for your tasks, before committing to any for production?


> before committing to any for production

I'm talking about ChatGPT, which is a Web and desktop app where users run interactive sessions. What does "production" mean in this sense?


It’s simple - practice using them instead of complaining. Maybe you’ll figure out the differences on your own.


As a person who uses LLMs daily, I do in fact do this. Couple problems with this approach:

- there are billions of people who are not accustomed to using software this way, who are in the expected target market for this software. Most people cannot tell you the major version number of their mobile OS.

- this approach requires each individual to routinely perform experiments with the expanding firmament of models and versions. This is obviously user-hostile.

Anyway, my hot take here is that making things easier for users is better. I understand that is controversial on this site.


Imagine if this is what people suggested when I asked what kind of screwdriver I should use for a given screw, because they're all labelled, like, "Phillips. Phillips 2.0. Phillips.2.second. Phillips.2.second.version 2.0. Phillips Head Screwdriver. Phillips.2.The.Second.Version. Phillips.2.the.second.Version 2.0"


I think I misunderstood what people were talking about. Somehow I thought it was about their APIs, for specific uses in other apps.


To their credit, they did get this part correct. "ChatGPT" is the user-facing apps. The models have terrible names that do not include "ChatGPT".

Anthropic, by contrast, uses the same name for the user-facing app and the models. This is confusing, because the user-facing apps have capabilities not native to the models themselves.


You bring up the important point that, for a company that earns money off of wasted tokens, a confusing selection of models can translate into extra spend on experimenting with tweaking them.

Some users may not appreciate that, but many more might be drawn to the "adjust the color balance on the TV" vibes.


> I hope they can invent an AI that knows which AI model my question should target cheaply.

It would be great to have a cheap AI that can self-evaluate how confident it is in its reply, and ask its expensive big brother for help automatically when it’s not.
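
A minimal sketch of what that routing could look like; chat() is a hypothetical stand-in for whatever client you use, and the model names and 0.7 threshold are illustrative, not any vendor's defaults:

  import json

  def chat(model: str, prompt: str) -> str:
      """Placeholder: wire this to your actual LLM provider."""
      raise NotImplementedError

  def route(prompt: str, threshold: float = 0.7) -> str:
      # Ask the cheap model to answer AND self-rate its confidence.
      raw = chat(
          "cheap-model",
          "Answer the question, then rate your confidence from 0 to 1. "
          'Reply as JSON: {"answer": ..., "confidence": ...}\n' + prompt,
      )
      try:
          reply = json.loads(raw)
          if float(reply["confidence"]) >= threshold:
              return reply["answer"]
      except (ValueError, KeyError, TypeError):
          pass  # malformed reply: treat as low confidence
      # Escalate to the expensive big brother when confidence is low.
      return chat("expensive-model", prompt)

Whether self-reported confidence is trustworthy is exactly the hard part, of course; models are often confidently wrong.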


That would actually be the AGI we are waiting for, since we - as humans, in a surprisingly big portion of all cases - don't know how, or can't seem to do that either!


On the other hand, ChatGPT seems to be getting better at knowing when it should Google something for me rather than hallucinate something.

Shouldn’t asking a more expensive model for input be a similar level of «tool use»?


I think you make a good point. Cursor is doing a basic “auto” model selection feature and it could probably get smarter, but to gauge the complexity of the response you might need to run it first. You could brute force it with telemetry and caching if you can trust the way you measure success.


I usually feel that with ChatGPT picking a model is like "Which of the three stooges would you like to talk to: Curly, Larry, or Moe (or worse, Curly Joe)?" I usually only end up using o3 because GPT-4o is just that bad, so why would I ever want to talk to a lesser stooge?

If paying by API use, it probably makes more sense to talk to a lesser stooge where possible, but on a standard Pro plan I just find the lesser models aren't worth the time to use, given the frustration they cause.


I imagine that we need a bootstrap AI to help you optimize the right AI for each task.

I don’t think I’d trust the vendor’s AI to optimize, when they will likely bias toward revenue. So a good case for a local AI that only has my best interests at heart.

Currently, the guidance from vendors is “try it and see which yields the best results,” which is kind of like “buy this book, read it, and see if you like it” - and of course the publisher wants you to take this action, because they get their money.


> I hope they can invent an AI that knows which AI model my question should target cheaply.

Isn't that the idea of OpenRouter?


Not exactly, but yeah. OpenRouter is a unified API, directory and billing system for LLM providers.

I think you are getting confused by the term "Model Routing", which to be fair OpenRouter does support, but it's a secondary feature and it's not their business focus. Actually OpenRouter is more focused on helping you choose the best provider for a specific open model based on their history of price, speed, reliability, privacy...

The model routing is simply provided by NotDiamond.ai, there are a number of other startups in this space.

https://openrouter.ai/docs/features/model-routing
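
For what it's worth, the auto-router is exposed as a meta-model: you send your request to "openrouter/auto" over the OpenAI-compatible API and the router picks the underlying model. A minimal sketch, assuming the openai Python package; the API key is a placeholder:

  from openai import OpenAI

  client = OpenAI(
      base_url="https://openrouter.ai/api/v1",
      api_key="<YOUR_OPENROUTER_KEY>",  # placeholder
  )

  resp = client.chat.completions.create(
      model="openrouter/auto",  # the router chooses the actual model
      messages=[{"role": "user",
                 "content": "ffmpeg command to convert movie.mov into a reasonably sized mp4"}],
  )
  print(resp.choices[0].message.content)
  print(resp.model)  # reports which model actually handled the request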


The thing responses like this miss, I am pretty sure, is that this is a nondeterministic machine, and nondeterministic machines hidden behind a complete black-box wrapper can produce wildly different results based on context and any number of independent unknown variables. So pasting “I did the same thing and it worked fine” is essentially this argument’s version of “it worked on my local.” It boils down to “well sure, but you’re just not doing it right” when the “right” way is undefined and also context specific.


You’re both right. Some problems should be solved with better user education. And some should be solved with better UX. It’s not always clear which is which. It’s too simple to blame everything on user error, and it’s too simple to blame everything on the software.

Cell phones are full of examples. So much of this stuff is obvious now we’ve been using them for awhile, but it wasn’t obvious when they were new. “My call dropped because I went in a tunnel” is user error. “My call cut out randomly and I had to call back” is a bug. And “my call cut out because my phone battery ran out” is somewhere in the middle.

For chatbots, lots of people don’t know the rules yet. And we haven’t figured out good conventions. It’s not obvious that you can’t just continue a long conversation forever. Or that you have to (quite consciously) pick which model you use if you want the best results. When my sister first tried ChatGPT, she asked it for YouTube video recommendations that would help when teaching a class. But none of the video links worked - they were all legitimate-looking hallucinations.

We need better UX around this stuff. But also, people do just need to learn how to use chatbots properly. Eventually everyone learns that calls will probably drop when you go into a tunnel. It’s not one or the other. It’s both.


"I’m also tired of the endless claims that the tools are completely useless."

Who claimed that here?

I read a claim that Copilot is dumber than Claude and ChatGPT, and I tend to confirm this.

"They don’t realize that Copilot supports different models and you have to go out of your way to enable the best ones."

So it's possible that none of us who think that went out of our way to find out when there were working alternatives, but it would still be on Microsoft for making it hard to make good use of their tool.


Yeah I'm not sure why they'd think my point was that LLMs are useless. Clearly I'm integrating them into my work, I just think Copilot is the dumbest. It's given me the most nonsensical answers like the example I provided, and it's the one I use the least. Which is even crazier when you consider we're on a paid version of Copilot and I just use free ChatGPT and Claude.


Your entire comment sure read a lot like you were calling the tools useless. You even used the worst possible prompt to make your point. That’s likely why people are reacting badly.


I said the Copilot is the worst competitor in the space.

Where did I say anything in general about LLMs being useless?


This is part of why I really like local models. I always use the same random seed with mine so unless I'm using aider the responses are 100% deterministic. I can actually hit c-r in my shell to reproduce them without having to do anything special.
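
A minimal sketch of what pinning the output looks like, assuming a local Ollama server on its default port; the model name is a placeholder for whatever you have pulled. A fixed seed with temperature 0 makes repeated runs on the same setup return the same text:

  import requests

  resp = requests.post(
      "http://localhost:11434/api/generate",
      json={
          "model": "llama3",  # placeholder: any local model you have pulled
          "prompt": "ffmpeg command to convert movie.mov to a small mp4",
          "options": {"seed": 42, "temperature": 0},  # pin the sampler
          "stream": False,
      },
  )
  print(resp.json()["response"])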


Some are more deterministic than others, e.g. Gemini Flash.


The non-determinism comes from the sampler, not the model.


I always thought it was packaged with the model.


"Spin the chatroulette again and see if you vibe something better" is not a foundation for a business.

Well, unless your business is selling vibes.


The memory feature can also be a problem: it injects stuff into the prompt context that you didn't explicitly write, with the intent that it will help. It knows you are a Python programmer, so it responds with a Python script instead of your usual ffmpeg CLI command.


Everything is like this.

I saw an IT professional google “My PC crashed” to diagnose a server bluescreen stop error.


Reminds me of

I’m Feeling Lucky -> bad result -> Google search is useless


1. I would say that nobody did that, so you are making up a straw man

2. The Copilot or ChatGPT or Claude "Ask" buttons should then be renamed to "I'm feeling lucky". And that would be the only button available.


Yeah, except Feeling Lucky is the only button you can press, and people blame you if they don't get lucky


  Some people are so convinced the tool will be bad that they give up at the slightest inconvenience or they even revel in the bad responses as proof that AI is bad
AI derangement syndrome


I am 67.87% certain they make it dumber/smarter during the day. I think it gets faster/better during non-business hours. This needs to be tested more to be confirmed, though. However, they have exactly ZERO transparency (especially on the subscription model) into how much you are consuming and what you are consuming, so it doesn't really help with the suspicions.


I remember reading an article about different behavior between summer and winter, so working better/worse during business hours doesn't sound completely crazy.

But them turning some knobs based on load also looks reasonable.


What you and many others seem to miss is that the LLM is not deterministic.


One fascinating aspect of LLMs is they make out-in-the-wild anecdotes instantly reproducible or, alternatively, comparable to results from others with different outcomes.

A lot of our bad experiences with, say, customer support hotlines, municipal departments, bad high school teachers, whatever, are associated with a habit of speaking that adds flavor, vibes, or bends experiences into on-the-nose stories with morals, in part because we know they can't be reviewed or corrected by others.

Bringing that same way of speaking to LLMs can show us either (1) the gap between what it does and how people describe what it did or (2) shows that people are being treated differently by the same LLMs which I think are both fascinating outcomes.


LLMs are definitely not instantly reproducible. The temperature setting adjusts randomness, and the models are frequently optimized and fine-tuned. You will get very different results depending on what you have in your context, and with a tool like Microsoft Copilot, you have no idea what is in the context. There are also bugs in the tools that wrap the LLM.

Just because other people on here say “worked for me” doesn’t invalidate OP’s claim. I have had similar times where an LLM will tell me “here is a script that does X” and there is no script to be found.


I was intentionally broad in my claim to account for those possibilities, but also I would reject the idea that instant reproducibility is generally out of reach on account of contextual variance for a number of reasons.

Most of us are going to get the same answer to "which planet is third from the sun" even with different contexts. And if we're fulfilling our Healthy Internet Conversation 101 responsibility of engaging in charitable interpretation, then other people's experiences with similarly situated LLMs can, within reason, be reasonably predictive, and can reasonably be invoked to set expectations for what behavior is most likely, without that implying perfect reproducibility.


I think it really depends on the UI. If it was in some desktop-native experience, maybe it accidentally produced a response assuming there would be a code canvas or something, and sent the code response under a different JSON key.


We're also seeing a new variant of Cunningham's law:

The best way to get the right answer from an LLM is not to ask it the right question; it's to post online that it got the wrong answer.


> One fascinating aspect of LLMs is they make out-in-the-wild anecdotes instantly reproducible

How? I would argue they do the exact opposite of that.


Asking the number of Rs in the word Strawberry is probably the most famous one.


AI probably hates him so it acts dumb.


