Overall, I'm really impressed by what you accomplished! I'm not a researcher, so not sure if this is that helpful, but here are some thoughts:
- I wonder if the "move" action is difficult for the model to learn to use well. The model sees token location as positional encodings in the embedding, not sparse character offsets. Would be interesting to see something more like "jump to next/previous [token or set of tokens]". Or maybe a find/replace like most coding harness edit tools use?
- I'd move the exact training data generation details to an appendix. Could be summarized to improve the flow of the paper.
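To make the find/replace suggestion concrete: a minimal sketch of the kind of exact-match edit action most coding harnesses expose (the name `apply_edit` and the ambiguity check are my assumptions, not taken from the paper):

```python
def apply_edit(text: str, old: str, new: str) -> str:
    """Apply a find/replace edit of the kind coding-agent harnesses use.

    The model specifies an exact `old` snippet rather than a positional
    offset; the edit fails loudly if the snippet is missing or ambiguous,
    which gives the model a clear signal to retry with more context.
    """
    count = text.count(old)
    if count == 0:
        raise ValueError("edit failed: snippet not found")
    if count > 1:
        raise ValueError("edit failed: snippet is ambiguous; add more surrounding context")
    return text.replace(old, new, 1)
```

The appeal over a "move" action is that the model never has to reason about offsets at all; the target location is identified by its own content.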
Hi, thank you for your advice, I really appreciate it!
My model has been able to move pretty naturally throughout the canvas when editing; it remembers the actual canvas, including the order of the tokens, well. But I understand where you're coming from.
Jump to next/previous token is a good idea, and in the future I can definitely look into implementing it, especially for scaling the model up. Same thing with find/replace. Thanks again.
Cool concept! I think the hardest part will be getting people in the target audience to use it. A lot of indie hackers make software for other indie hackers, but that isn't true of most other verticals. And honestly, building software for indie hackers feels like a losing battle. Any ideas on how to incentivize non-builders to rank projects?
From the most recent comment, looks like this is a bug, triggered by the system inadvertently activating an internal release tool [0]. Still a pretty wild bug, but not as dramatic as the title suggests. Which is kind of unfortunate honestly, the chaos of every gas town instance automatically contributing to itself would be beautiful to see.
That was my immediate impression too! It feels like it's all AI maximalists who seem to have a need to filter their every interaction through an LLM. And the result looks and reads just like Moltbook.
Yeah, and the employee who generated an AI response to the AI-generated bug report is Jarred Sumner, the founder of Bun, which was acquired by Anthropic. Pretty sad state of affairs all around.
It feels like (though nobody can prove it) all user-facing applications are fully vibe-coded and no internal developers have any idea how they work, so they just keep redirecting user questions to Claude to answer on their behalf. That's why they're dealing with regressions and downtime every few releases; it's the usual pattern with vibe coding that bugs keep resurfacing.
If all LLM advancements stopped today, but compute and energy got cheap enough that the $30 million zettaflop was possible, I wonder what outcomes would be achievable? Would 1,000 Claudes be able to coordinate in meaningful ways? How much human intervention would be needed?
Headline/article is extremely misleading. They still have subscription plans with included usage, but those usage limits are now based on tokens instead of messages.
I like this, and think it's true for how humans learn. What's interesting to me is that it seems LLMs are significantly smarter than they were two years ago, but it doesn't feel like they have better "taste". Their failure modes are still bizarre and inhuman. I wonder what it is about their architecture/training that scales their experience without corresponding improvements in taste.
In theory, RLVR should encourage less error-prone code, similar to a human getting burned by production outages like the article mentioned. Maybe the scale in training just isn't big enough for that to matter? Perhaps we need better benchmarks that capture long-term issues that arise from bad models and unnecessary complexity.