
I would add that most of the premium of a modern SWE has always been in understanding problems and systems thinking. LLMs raise the floor and the ceiling, to where the vast majority of it will now be in systems and relationships

the specific tasks (e.g. writing code) might disappear

but the actual work of constructing reliable systems from vague user requirements with an essentially unbounded resource (software) will still exist


Of course this is true. Just like the need to travel long distances over land will never disappear.

The skills needed to be a useful horseman, though, have almost nothing to do with the skills needed to be a useful train conductor. Most of the horseman's skills don't really transfer, other than being in the same domain of land travel. The horseman also has the problem that they have invested their life and identity into their skill with horses. It massively biases perspective. At the advent of travel by rail, the person with no experience with horses actually has the huge advantage of a beginner's mind when it comes to travel by land.

The ad nauseam software engineer "horsemen" arguments on this board that there will always be a need to travel long distances by land completely miss the point IMO.


I'm quite convinced that software (and, more broadly, implementing the systems and abstractions) seems to have virtually unlimited demand. AI raises the ceiling and broadens software's reach even further as problems that previously required some level of ingenuity or intelligence can be automated now.

Why unlimited? Populations are shrinking and there is only so much debt these economies can handle.

I would rather learn how to use these tools effectively now and ship more, at higher quality.

If these tools improve to the point where anyone can pick them up - that's great! I enjoyed my head start while it lasted.

If these tools continue to require experience and a skillset to use, that's great too - I'll continue to learn and pull ahead.


Most of the gains come from post-training RL, not pre-training (OpenAI's GPT 5.2 is using the same base model as 4o).

Also the article seems to be somewhat outdated. 'Model collapse' is not a real issue faced by frontier labs.


("The article" referred to https://www.theregister.com/2026/01/11/industry_insiders_see... - we've since changed the URL above.)

> OpenAI's GPT 5.2 is using the same base model as 4o

where’s that info from?


Not the parent, but the only other source of that claim I found was Dylan Patel's recent post from semianalysis.

Was that for 5.1 or 5.2? I recall that info spreading after 5.1’s release, I guess I naively assumed 5.2 was a delayed base model update.

You can just ask ChatGPT what its training cut-off is, and it'll say June 2024.

Ask! 5.2 says August 2025.

Oh! I stand corrected.

A lot of the recent gains are from RL but also better inference during the prefill phase, and none of that will be impacted by data poisoning.

But if you want to keep the "base model" on the edge, you need to frequently retrain it on more recent data. Which is where data poisoning becomes interesting.

Model collapse is still a very real issue, but we know how to avoid it. People (non-professionals) who train their own LoRA for image generation (in a TTRPG context at least) still have the issue regularly.

In any case, it will make the data curation more expensive.


knowledge cutoff date is different for 4o and 5.2

I'm sorry your teammates have skill issues when it comes to using these tools.

I primarily find them useful in augmenting my thinking. Grokking new parts of a codebase, discussing tradeoffs back and forth, self-critiques, catching issues with my plan, etc.

I implemented some of his setup and have been loving it so far.

My current workflow is typically 3-5 Claude Codes in parallel:

- Shallow clone, plan mode back and forth until I get the spec down, hand off to a subagent to write a plan.md

- Ralph Wiggum Claude using plan.md and skills until PR passes tests, CI/CD, auto-responds to greptile reviews, prepares the PR for me to review

- Back and forth with Claude for any incremental changes or fixes

- Playwright MCP for Claude to view the browser for frontend

I still always comb through the PRs and double check everything including local testing, which is definitely the bottleneck in my dev cycles, but I'll typically have 2-4 PRs lined up ready for me at any moment.


Do you prefer Playwright or the Chrome MCP?

3-5 parallel Claude Codes - do they work on the same repo?

Do they work on the same features/goals?


We have a giant monorepo, hence the shallow clones. Each Claude works on its own feature / bug / ticket though, sometimes in the same part of the codebase but usually in different parts (my ralph loop has them resolve any merge conflicts automatically). I also have one Claude running just for spelunking through K8s, doing research, or asking questions about the codebase I'm unfamiliar with.
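The shallow-clone trick can be sketched in a few lines of git (the repo here is a throwaway temp dir standing in for the monorepo, and the Claude invocations are omitted): a depth-1 clone fetches only the tip commit, which is what keeps per-agent checkouts cheap on a huge history.

```shell
set -e
# Build a tiny throwaway "origin" with two commits (stand-in for the monorepo).
d=$(mktemp -d)
git init -q "$d/origin-repo"
git -C "$d/origin-repo" -c user.email=a@b.c -c user.name=demo \
    commit -q --allow-empty -m "first"
git -C "$d/origin-repo" -c user.email=a@b.c -c user.name=demo \
    commit -q --allow-empty -m "second"

# Each agent gets its own depth-1 clone: only the tip commit is fetched.
# (file:// forces the transport protocol so --depth is honored for a local path.)
git clone -q --depth 1 "file://$d/origin-repo" "$d/agent-1"
git -C "$d/agent-1" rev-list --count HEAD   # prints 1, not 2
```

Each parallel session then works in its own `agent-N` directory, so the agents never step on each other's working trees.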

why do we have guides and lessons on how to use a chainsaw when we can hack the tree with an axe?


The chainsaw doesn't sometimes chop off your arm when you are using it correctly.


If you swing an axe with a lack of hand eye coordination you don't think it's possible to seriously injure yourself?


Was the axe or the chainsaw designed in such a way that it will definitely miss the log and hit your hand a fair amount of the time? If it were, would you still use it? Yes, these hand tools are dangerous, but they were not designed so that they would probably cut off your hand even 1% of the time. "Accidents happen" and "AI slop" are not even remotely the same.

So then with "AI" we're taking a tool that is known to "hallucinate", and not infrequently. So let's put this thing in charge of whatever-the-fuck we can?

I have no doubt "AI" will someday be embedded inside a "smart chainsaw", because we as humans are far more stupid than we think we are.


https://scale.com/leaderboard/swe_bench_pro_commercial

I definitely trust the totally private dataset more.

