Hacker News

Karpathy spends his days rewriting the same code for educational purposes. I'm not surprised he finds such macros useful.

For a professional, though, they're pretty useless. Sad to see Karpathy fueling vaporware.



"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html


I pity comments like this. Instead of trying to upskill to use the newest programming tools, you are setting yourself up for failure.

Sonnet 3.5 digests close to 400k bytes of text and produces coherent code that works on the first try. If someone says it's not working and they are a professional programmer, get ready to feel like you've been hit by a ton of bricks next year. The productivity boost is only going to accelerate and those who can't adopt will be left behind.


a) There is no up-skilling needed to use LLMs. They are very basic to use.

b) Many of us have used them for a while now and can speak from experience that they aren't providing a meaningful productivity boost. Simply because they don't work well enough to provide a positive ROI. And no amount of prompting expertise can change that.

c) For me it is junior developers who love these tools because they think it's a shortcut to becoming experienced. But it's akin to cheating. You're not actually learning why and how things are supposed to work. And that will hurt you in professional environments where you often need to explain why you wrote that code and introduced that bug.


Your (a) doesn't match your (b), because there are anecdotes contrary to yours (the tweet in question and my personal one). I have close to 2 decades of experience in a variety of languages and frameworks and have never felt this powerful and liberated with any of the previous tools. In the past year I have developed 2 complex products nearing market launch, working alone on a part-time basis.

My professional colleagues continue to feel the exact same way you feel and, despite my best efforts, refuse to even bother using them for anything. Using LLMs might appear to be simple, and the prompt length might be similar between an experienced user and a naive one, but the way intent is conveyed varies with skill level.

My only complaints about LLMs are: 1) Context is still a limiting factor (so only medium-sized projects) 2) I still have to copy-paste the code (no IDE truly helps here)

What has improved in the past 6 months: Sonnet happened and I no longer have to worry about the code being wrong or containing obvious mistakes. Many cases where I thought it got it wrong turned out to be a clever way to minimize the number of changes needed or to do more with less. We are approaching the point where humans are no longer intelligent enough to appreciate the LLMs.


I look forward to the day that I can be "intelligent enough" to truly appreciate LLMs. Maybe I need to buy a course from someone on X.

And not from months of experience using Claude where it over and over again will give me algorithms that are wrong, assure me every time it is right and do so using versions of libraries that are typically a year or more old.


"There is no up-skilling needed to use LLMs. They are very basic to use."

Hard disagree on that. Using LLMs effectively is deceptively deep. Sure, anyone can throw a prompt at a chatbot - but I've been using them on an almost daily basis for over two years at this point and I still feel like I'm finding out new ways to improve my prompting several times a week.

I talked more about how hard they are to use here: https://simonwillison.net/2024/Jun/27/ai-worlds-fair/#slide....

"Many of us have used them for a while now and can speak from experience that they aren't providing a meaningful productivity boost."

I'm getting a meaningful productivity boost, which gets more meaningful the more time I spend learning how best to apply them.


Many of us professionals share Karpathy's opinion on this and know for a fact that it provides a very meaningful productivity boost. It may not be for everyone, but I absolutely cannot imagine going back, and can confidently say it's not just junior developers that love these tools.


Why isn't there a single screencast (un-edited, un-cherry-picked) of anyone showing off their 10x productivity boost in a full "typical" coding session?


Someone recently asked this on Twitter; Simon Willison responded with https://simonwillison.net/2024/Jun/21/search-based-rag/ which I have not yet watched but which he claimed was a good example of this genre.


Having rewatched that myself the other day it's not actually as good an example as I thought - I use Claude 3.5 Sonnet a bit in it (which was released the morning we recorded that video) and then get a bit of benefit out of Val Town's integration with Codeium, which is similar to VS Code Copilot - but not as much of the code in it was LLM-generated as I remembered.

A better (written) description of how I use these tools is this one: https://simonwillison.net/2024/Mar/30/ocr-pdfs-images/ - and this whole series of posts: https://simonwillison.net/tags/ai-assisted-programming/


I would point out that the OCR example (and from what I see the series of posts you linked to) aren't "live" coding screen shares and don't convey the nitty gritty of how these things are used and how well they work.


Right, but they’re the best I have - I don’t do much live coding aside from that Val Town one which doesn’t use LLMs very much.


I'd love to see this operationalised as concrete predictions, as one might find on a prediction market! Do you have any specific predictions about programming next year?

I ask (for example) because I suspect shitting out CRUD apps is cheaper via LLM than via human now, and I guess probably most programming work is of that nature, but there are programmers out there whose job is not shitting out CRUD apps, and it's not clear from your statement whether you intend the sentiment to cover those programmers too.


The answer lies in your question. I foresee consolidation in programming languages and frameworks, with compact and well-known ones edging out esoteric and niche ones. In a couple of years' time, I predict that there will be new languages specifically targeting LLMs that aren't as human-readable but are extremely compact, similar to bytecode (compactness is preferred because the context size limitation is not fully going away).

So in a nutshell, I feel like most things will be LLM-generated, with human effort focused mostly on stitching together system boundaries and on extreme cases like quant and medical domains where human oversight might be needed.


Let's cross that bridge when we come to it, shall we. Meanwhile, you should be glad we are refusing to use it. If it works as well as you claim, this situation is to your advantage.


I've been waiting since 2021, when I saw demos of GitHub Copilot.


I'd like to see that. Link(s)?

I've been on the sidelines, waiting for the dust to settle. Kind of like waiting a few months before applying the latest major OS updates.


Ah yes, just like stocks can only go up. No one will feel like they were hit by a ton of bricks.


Curious to know why you think it's vaporware. Are the latest LLMs like 3.5 Sonnet bad at original programming based on your experience? It hasn't been the case for me when using it for real world projects lately.


I wrote a cycle-detecting reference counting system and asked Sonnet 3.5 to fix it, and it failed.

I hand-held it, gave it tests, called out flaws in reasoning, etc., and it still didn't fix it.

The longer the chat, the more likely it was that Sonnet forgot a clarification I provided.

Overall, a huge waste of time: I lost my train of thought, and I was helping an LLM rubber-duck and fail repeatedly.


Don't judge it for one single experience. It'll take several for you to understand well how impactful these tools are for SWE.


I had an XML file format from one app that I needed converted to a JSON file format for another app.

I threw both schemas at Claude and asked it to write converter code.

Writing mocks: Claude saves an hour+ when mocking out complex classes.

I've never written graphics code before. I had a PNG animation film strip, and Claude wrote code to load, parse, and animate it.


"vaporware"?

You may not like Cursor, but they have a product that I -- as a professional -- use every day.

Vaporware isn't the word you're looking for.


This whole "I became 10x better" is vaporware.

"It can kinda sorta maybe help sometimes a little bit" is not vaporware.


I would encourage you to learn how to use these systems rather than discounting their value.


Please record a screencast and educate everyone.


Vaporware means something is a concept that hasn't actually shipped yet. You seem to be using "vaporware" to mean over-hyped.


I agree with the other commenter. You don't know how to use them. LLMs are not programmers.


These days, someone who thinks they're writing something novel usually isn't a good professional at all.


Yeah, if you're getting paid $50K outside the Bay, maybe.

If you want big bucks, you are writing original code, no two ways about it.


Most of the code my friends and I write isn't original. And it's not just people who make $50K/year. Obviously LLM-assisted code writing is still in its infancy, but it has made a lot of mundane things a breeze already. It sucks that you have to know its shortcomings to make it actually useful for yourself (e.g. I won't ask it to write a context-aware function right away, but I know it's great at generating stubs). But we'll get there, I think.


"Andrej Karpathy (born 23 October 1986) is a Slovak-Canadian computer scientist who served as the director of artificial intelligence and Autopilot Vision at Tesla. He co-founded and formerly worked at OpenAI"

I think he should get some cred with such a track record.


You didn't learn how to use these tools properly. If you did you wouldn't have that opinion. Karpathy doesn't just write code snippets for educational purposes. Most of the code he writes is for real world systems and it's not publicly available.


What do you mean by professional that doesn't include Karpathy?


I'm trying to understand how you can so easily dismiss all the professionals thinking it's useful. A more charitable explanation could be that it's useless for you but not others.



