Hacker News | fb03's comments

Can you run the same tests on Qwen3.5:9b? That's also a model that runs very well locally, and I believe it's even stronger than Gemma2B.

Yes, with a one-line change. Grab the second code block in the article; that's the test harness, rigged up to send all 80 questions and both turns through whatever model you want. Find MODEL_ID = "google/gemma-4-E2B-it" and swap it for your Hugging Face model ID, then run it. We'd love for people to keep testing different models on this. If you run Qwen through it, let us know what you find and post the results here.

We may beat you to it and we will share if we do lol


It's almost like Qwen 3.5 9B is 4 times larger.

and that 4x difference allows you to use CPUs and much cheaper hardware to achieve the same level of outcome... for free

I'm working on `tu` (terminal use), which is a way to give agents access to a full-blown virtual terminal to operate TUI apps.

https://github.com/flipbit03/terminal-use

I'm super proud, because I learned that someone at Codex used my tool to debug codex+zellij issues by running zellij within `tu`, and then codex inside zellij.


Outages are already happening. Besides that, we need vibe war-rooming.


Quick Q: OP said he used Llama 3.2:3b, which is a pretty old model. What would be a good modern model to substitute for it? Qwen3.5:4b or something?


  Show HN: forestui – TUI for git worktrees with Claude Code + GitHub integration

  I built forestui to manage the chaos of working on multiple features/branches simultaneously. It's a tmux-powered TUI that orchestrates your worktrees, editors, terminals, and Claude Code sessions across organized tmux windows (edit:repo:branch, claude:repo:branch, term:repo:branch). Switch contexts without losing state.

  Why? Git worktrees let you have multiple branches checked out at once without stashing or juggling, but I find managing them with raw git commands painful.

  What it does:

  - Worktree management: Create, rename, archive, delete worktrees with single keypresses. Each worktree gets its own directory, so you can have feat/auth, fix/bug-123, and main all open in different editor windows simultaneously.
  - GitHub integration: Lists your open issues directly in the TUI (using the gh cli). See an issue you want to work on? One keypress creates a worktree with an auto-generated branch name (42-fix-login-bug) and optionally pulls latest first.
  - Deep Claude Code integration: Tracks sessions per worktree, shows recent sessions with message counts and last activity, and resumes any session with one keypress. You can also trigger Claude's YOLO mode for fast iteration in a worktree with a single keypress.
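
  For the curious, the auto-generated branch name mentioned above (42-fix-login-bug) could be derived roughly like this. This is only a sketch, not forestui's actual code: the function name `branch_name` and the `max_len` cap are made up for illustration, and the real tool may slugify differently.

```python
import re


def branch_name(issue_number: int, title: str, max_len: int = 50) -> str:
    """Build a branch name like '42-fix-login-bug' from an issue number and title.

    Hypothetical helper: the name and the max_len cap are assumptions,
    not taken from forestui.
    """
    # Lowercase, then collapse every run of non-alphanumeric characters into one dash.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Truncate long titles and avoid ending on a dangling dash.
    return f"{issue_number}-{slug}"[:max_len].rstrip("-")


print(branch_name(42, "Fix login bug"))  # 42-fix-login-bug
```

  The resulting name could then be handed to something like `git worktree add -b <name> <path>` to create the branch and its directory in one step.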

  Tech: Python 3.14, Textual for the TUI (the app is fully reactive and works well with the mouse), Pydantic for models, libtmux for tmux integration. Strict mypy, async everywhere so the UI never blocks.

  Install:

  # One-liner (installs uv if needed)
  curl -fsSL https://raw.githubusercontent.com/flipbit03/forestui/main/install.sh | bash

  # If you already use uv
  uv tool install forestui

  Auto-updates on startup via uv. Requires tmux and, optionally, the gh cli.
Hope you enjoy!


What a coincidence! I actually implemented a harness to test this, about a week ago

https://github.com/flipbit03/caducode


I've created a toy coding agent called "caducode". More of a thought experiment that materialized into a little something.

The philosophy behind it is: instead of providing a bunch of tools to the LLM, you provide a single tool, run_python(). The agent just generates code to do whatever it needs: inspect local files, carry out edits, run commands.

https://github.com/flipbit03/caducode

It worked surprisingly well, even with a very small 30b local model.
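
To make the single-tool idea concrete, here is a rough sketch of what a run_python() tool could look like. It is not taken from the caducode repo: the subprocess approach, the `timeout` parameter, and the returned dict shape are all assumptions for illustration.

```python
import subprocess
import sys


def run_python(code: str, timeout: float = 30.0) -> dict:
    """Run a Python snippet in a fresh interpreter and capture what it prints.

    Illustrative only: the timeout and the result keys are assumptions,
    not caducode's real interface.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return {"exit_code": None, "stdout": "", "stderr": f"timed out after {timeout}s"}
    # The model sees exit code, stdout, and stderr, and decides what to run next.
    return {"exit_code": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}


# Instead of calling a dedicated "list files" tool, the agent writes code for it.
result = run_python("import os; print(sorted(os.listdir('.'))[:5])")
print(result["exit_code"], result["stdout"])
```

The appeal is that file inspection, edits, and shell commands all collapse into "write some Python", so the tool schema stays tiny. The obvious trade-off is that this runs arbitrary generated code with no sandboxing.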


"I'm using OpenAI here"

proceeds to show a piece of code importing anthropic

was pretty confusing to me


The only thing that still makes me maintain a Windows installation is playing League of Legends. Everything else (I mean, real work) is done on Linux


Yes, it was configured in his mRNA

