Hacker News | jes5199's comments

I'm running Claude Code in tmux on a VPS, and I'm working on setting up a meta-agent that can talk to me over text messages

Hey - this sounds like a really interesting setup!

Would you be open to providing more details? I'd love to hear more about your workflows, etc.


Cursor makes it easier to watch what the model is doing and to make edits at the same time. I find it useful at work, where I need to be able to justify every change in a code review. It’s also great for getting a feel for what the models are capable of - like, using Cursor for a few months made it easier to use Claude Code effectively


ride the BART


I think this might be the way forward; Claude is great at project management.

I’m already telling Claude to ask Codex for a code review on PRs. Another fun pattern I found: you can give the web version of Codex an open-ended task like “make this method faster”, hit the “4x” button, and end up with four different pull requests attacking the problem in different ways. Then ask Claude to read the open PRs and make a fifth one that combines the approaches. This way Codex does the hard thinking but Claude does the glue
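The “fifth PR” step is just glue: collect the competing diffs and hand them to one agent. A toy sketch of that glue (the function and prompt wording are hypothetical, not part of any Codex or Claude API):

```python
def combine_prompt(pr_diffs):
    """Assemble a prompt asking a fifth agent to merge the best parts of
    several competing PRs. pr_diffs maps PR number -> diff text.
    (Hypothetical helper; the prompt wording is illustrative only.)"""
    parts = [
        "Several open PRs attack the same problem in different ways.",
        "Read them and produce one change that combines the best approaches:",
    ]
    for number, diff in sorted(pr_diffs.items()):
        parts.append(f"\n--- PR #{number} ---\n{diff}")
    return "\n".join(parts)
```

In practice you'd fetch the diffs with whatever CLI or API your forge provides and pipe the assembled prompt into the agent session.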


Can it read code review comments? I've been finding that having Claude write code but letting Codex review PRs is a productive workflow; Claude Code is capable of reading the feedback left in comments and is pretty good at following the advice.


I’m letting Claude Code review the code as part of a GitLab CI job. It adds inline comments (using curl and the HTTP API - a nightmare to get right, as glab does not support this)

CC can also read the inline comments and create fixes. Now I'm thinking of adding an extra CI job that will address the review comments in a separate MR.
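For reference, the fiddly inline-comment call maps to GitLab's merge request discussions endpoint. A minimal Python sketch, assuming the project ID, MR IID, token, and SHAs come from CI variables (all placeholders here):

```python
import json
import urllib.parse
import urllib.request

def inline_comment_payload(body, base_sha, head_sha, start_sha, path, line):
    """Form fields GitLab's discussions API expects for an inline
    (positioned) comment on a merge request diff."""
    return {
        "body": body,
        "position[position_type]": "text",
        "position[base_sha]": base_sha,
        "position[head_sha]": head_sha,
        "position[start_sha]": start_sha,
        "position[new_path]": path,
        "position[new_line]": str(line),
    }

def post_inline_comment(gitlab_url, project_id, mr_iid, token, payload):
    # POST /api/v4/projects/:id/merge_requests/:iid/discussions
    url = (f"{gitlab_url}/api/v4/projects/{project_id}"
           f"/merge_requests/{mr_iid}/discussions")
    data = urllib.parse.urlencode(payload).encode()
    req = urllib.request.Request(url, data=data,
                                 headers={"PRIVATE-TOKEN": token})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The three SHAs come from the MR's diff_refs; getting them wrong is most of the "nightmare" part.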


Have you tried GitHub Copilot? I've been trying it out directly in my PRs like you suggest. Works pretty well sometimes.


I find that ChatGPT’s Codex reviews - which can also be set up to happen automatically on all PRs - seem smarter than Copilot’s, and make fewer mistakes. But these things change fast, maybe Copilot caught up and I didn’t notice


No, Codex catches genuine bugs here that multiple reviewers would have overlooked, whilst Copilot only comes up with nitpicks. And Codex does none of those nitpicks, which is also great.


yesss, and OpenAI tried this first when they were going to do a “GPT store”. But REST APIs tend to be complicated because they’re built to support apps. MCP, when it works, is just simple functions

in practice it seems like command line tools work better than either of those approaches


Command line tools are my preference just because they're also very useful to humans. I think providing agents with function libraries and letting them compose in a REPL works about as well, but is higher friction due to env management.


I could imagine that in ten years git will feel strangely slow and ceremonial. Why not just work continuously and continuously deploy live-edited software?


I feel the opposite way, that git branching and merging will become a bigger part of the job as more code is written by agents in parallel and then accepted by other agents or humans.


for now yes absolutely. but I’m already hearing rumblings that some people are having luck letting multiple agents edit the same directory simultaneously instead of putting changes through PR merge hell. It just needs coordination tools, see https://github.com/Dicklesworthstone/mcp_agent_mail as one (possibly insane) prototype

for example it’s not out of the question that we could end up with tooling that does truly continuous testing and integration, automatically finding known-good deployments among a continuously edited multiplayer codebase
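The "find a known-good deployment" part of that can be stated very simply. A toy sketch, assuming some notion of codebase snapshots and a test runner passed in as a predicate (both hypothetical):

```python
def latest_known_good(snapshots, passes):
    """Scan snapshots of a continuously edited codebase from newest to
    oldest and return the first one whose tests pass - the revision an
    auto-deployer would ship. Returns None if nothing is green.
    (snapshots is oldest-first; passes is the test-run predicate.)"""
    for snapshot in reversed(snapshots):
        if passes(snapshot):
            return snapshot
    return None
```

The real work, of course, is making the test suite strong enough that "green" actually means deployable.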

we’d have to spend a lot more energy on specifications and acceptance testing, rather than review, but I think that’s inevitable - code review can’t keep up with how fast code gets written now


Having tried a janky version of this myself with a NOTES directory, I am very bearish on this being a better workflow than just improving the UI wrapper around git worktrees and the isolation that provides.

Codex already has a fantastic review mode, and Gemini and Claude are building tools around PR review that work no matter how the PR was produced, so I think this interface is going to get baked into how agents work in the near term.


That’s a very optimistic outlook for the future.


Often projects need a history of stable checkpoints, and source control is one way to provide that.


Yes, but does it need all the ceremony surrounding it? If, every time I saved the file, the changes were analyzed and committed to git, and a useful commit message included, and commits squashed automatically and pushed and tested and tagged (using magic, let's say); if the system existed in the background, seamlessly, how would our interactions with source control and with other developers look?
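The "commit on every save" step of that hypothetical system is already easy to sketch; only the message generation and squashing need the magic. A minimal sketch via git subprocess calls (the function name and pinned identity are made up for illustration):

```python
import os
import subprocess
import tempfile

def auto_commit(repo, path, message):
    """Stage one file and commit it with a (here, pre-generated) message:
    the 'commit on every save' step of the hypothetical background system.
    Identity is pinned with -c so it works in a bare environment."""
    subprocess.run(["git", "-C", repo, "add", path], check=True)
    subprocess.run(
        ["git", "-C", repo,
         "-c", "user.name=autosave", "-c", "user.email=autosave@example.com",
         "commit", "-q", "-m", message],
        check=True)
```

The automatic squash afterwards is likewise just `git reset --soft <base>` plus one fresh commit; the hard, unsolved part is writing a message worth keeping.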


> if the system existed in the background, seamlessly, how would our interactions with source control and with other developers look?

They would look like noise.

You would be the source of that noise.

One commit per edit? Nonsense.

Any other developer and I would hate to share a repository with you.


ye gods, have you never heard the word squash?


Why should I do your work instead of mine?

Even your comment is also noise =)


Who's asking you to do my work instead of yours in this hypothetical magic future system that I've invented in my head?


Automated commit messages will tell you the "what", not the "why".

In any discussion of "what makes a good commit message and why even write one", the recommendation is invariably to explain the "why" and leave out the self-evident "what".

If your stance is that commit and commit messages can be automated away then we might as well not even have them.

I don't share this view, but yeah in this world we don't need AI to do things that shouldn't be done in the first place.


> we might as well not even have them.

You can't see any value in being able to see the "what" in a short bit of English at a glance vs having to analyze a 300+ line diff to figure out what it's doing?


increasingly, the automated systems have access to the original ticket or bug report, and maybe even the conversation that happens during implementation. They can record the “why”


Use jujutsu


A counterargument to living software is that it treats "never done" products as a virtue instead of a failure of design.

Here's a thread where the person replying to me makes this case: https://news.ycombinator.com/item?id=45455963


I love it when I have a tool that’s “done”, but the software I work on in my career is never, ever done. It’s almost like there are two different things we call “software”. There are tools, like, idk, “curl”, where you can use an old version and be happy. And there are interactive organizations in the world, like, e.g., Hacker News, which mutate as the community’s needs change


Software for evolving business needs is the same for me. What's insightful is that we (I) take continuously evolving software as just that: evolving. It's a de facto virtue to continuously tinker.

Doing away with check-ins entirely is the extreme endgame of that POV. I'm in product, and every day and every week, yes, we very much continually change the product!

But I'm growing less convinced that the natural end-state of this methodology produces obviously better results.


Ohh no, you should be able to decide which changes to commit, line by line, before committing them.

What you describe sounds like a security nightmare to me.

Maybe you are using a remote dev server, and every change you make needs to be committed before you see the result?

Please set up a local environment instead. Not even F5 should be required: you save a file, you see the result in the browser.

When your work is finished, and only then, you should commit your changes.


It doesn't work as well for complex projects that require integration with other teams/software.

You would need to either have separate versions running at the same time, or never make breaking changes, or devise some other approach that makes it possible.

It's not always feasible to do it this way.


I think that’s a tooling problem. Maybe we do end up running a lot more versions of things in the future. If we believe that code has gotten cheaper, it should be easier to do so.


I wonder how many nines of uptime your team is required to have...


Imagine if someone clicks the deploy button when you're in the middle of typing something and then the service goes down due to a syntax error. To prevent this, we will need some sort of way to set a global lock to indicate that "I'm not done typing yet" and you can only deploy once everyone has released this lock.


Or you don't deploy unless the build makes it through at least testing, and a build started while someone was mid-edit would probably fail fast - unless you coincidentally hit the button right when the code is valid, but wrong.


I just remembered: in those days, there was an alias file called "ubygems" so you could pass "-rubygems" (i.e., "-r" with "ubygems" as the argument) on the command line as if it were a first-class feature

it's so typical of Ruby culture: "haha, what if I do this silly thing", and then that gets shipped to production



it’s an America thing


Other countries don't have prefects? Wikipedia seems to indicate the phenomenon is worldwide.

https://en.wikipedia.org/wiki/Class_president


We wouldn't call such a thing a president, but yes.


huh okay, so, prediction: similar to how interpreted code was eventually given JITs so that it could be as fast as compiled code, eventually LLMs will build libraries of disposable helper functions as they work, which will look a lot like “writing code”. But we’ll stop thinking about it that way

