I have a great local assistant that works end-to-end with voice. It's built on local, web-first technologies, it fits small LLMs in memory and manages inference and TTS/STT without stuttering. I've been shaping it up over a couple years and constantly switching out new models.
If you want something simple that runs in browser, look at vosk-browser[0] and vits-web[1].
I'd also recommend checking out KittenTTS[2]. I use it and it's great for the size/performance. However, you'd need to implement a custom JavaScript harness for the model since it's a Python project. If you need help with that, shoot me an email and I can share some code.
There are other great approaches too if you don't mind Python. Personally, I chose the web as a platform in order to make my agent fully portable and remote once I release it.
And of course, NVIDIA's new model came out just last week[3], but I haven't gotten to test it yet. There was also the recent Sparrow-1[4] announcement, which shows people are finally putting money into the problems plaguing voice agents that are rigged up from several models and glue infrastructure, versus a single end-to-end model or at least a conversational turn-taking model to keep things on rails.
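To make "rigged up from several models and glue infrastructure" concrete, here is a rough sketch of the cascaded shape I mean: STT, then an LLM, then TTS, with a turn-taking check deciding when to stop listening and answer. Every interface and name below (`SpeechToText`, `ChatModel`, `TextToSpeech`, `TurnDetector`, `CascadedVoiceAgent`) is hypothetical and made up for illustration. None of it is the API of vosk-browser, vits-web, KittenTTS, or any other library.

```typescript
// Hypothetical interfaces for the three models in a cascaded voice agent.
// Real libraries will look different; these only name the roles.
interface SpeechToText {
  transcribe(audio: Float32Array): Promise<string>;
}
interface ChatModel {
  reply(history: string[], userText: string): Promise<string>;
}
interface TextToSpeech {
  synthesize(text: string): Promise<Float32Array>;
}

// Hypothetical turn-taking hook: returns true once the user seems done.
// A dedicated turn-taking model would slot in here.
type TurnDetector = (transcriptSoFar: string, silenceMs: number) => boolean;

class CascadedVoiceAgent {
  private history: string[] = [];
  private pending = "";

  constructor(
    private stt: SpeechToText,
    private llm: ChatModel,
    private tts: TextToSpeech,
    private endOfTurn: TurnDetector,
  ) {}

  // Feed one chunk of microphone audio. Returns synthesized speech once the
  // turn detector fires, or null while the agent should keep listening.
  async handleChunk(
    audio: Float32Array,
    silenceMs: number,
  ): Promise<Float32Array | null> {
    const text = await this.stt.transcribe(audio);
    this.pending = (this.pending + " " + text).trim();

    if (!this.endOfTurn(this.pending, silenceMs)) {
      return null; // user is still talking
    }

    const userText = this.pending;
    this.pending = "";
    const answer = await this.llm.reply(this.history, userText);
    this.history.push(userText, answer);
    return this.tts.synthesize(answer);
  }
}
```

Each hop in `handleChunk` adds latency, and the `endOfTurn` callback is where naive silence thresholds tend to cut people off or leave awkward pauses, which is the kind of glue problem a turn-taking or end-to-end model is supposed to address.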
I moved in with one of my closest friends a few years ago, someone I considered a brother. In less than a year, I got someone to sublet and have not spoken to him since. I had no idea someone could be such a tool.
The prototype phase meant data centers are now measured in MW instead of TFLOPS.
At a time when we were desperate to reduce emissions, data centers now consume around 20% of the energy consumed by the entire aviation sector, with consumption rising at 15% YoY.
Never mind the water required to cool them, or the energy and resources required to build them, the capital allocation, and the opportunity cost of not allocating all of that to something else.
The computing power in a crappy cheap modern phone used to fill up a warehouse and cost a ton of energy, relatively. Moore's law might not remain steadfast, but if history is any indication, we'll find a way to make the technology more efficient.
So, yes, prototypes often use more energy than the final product. That doesn't mean we shouldn't sustainably build datacenters, but that's conflating issues.
Flight changed everything when it comes to warfare. But as far as individuals are concerned, the average human on the planet will take a handful of flights in their lifetime, at best, and nearly all flights that are taken are for recreation which is ultimately fungible with other forms of recreation that don't involve taking flights, and of the flights that aren't for recreation most could be replaced by things like video calls, and the vast and overwhelming majority of the goods that make up the lifeblood of the global economy are still shipped by ship, not shipped by air.
Which is to say, the commercial aviation industry could permanently collapse tomorrow and it would have only a marginal impact on most people's lives, who would just replace planes with train, car, or boat travel. The lesson here is that even if normal people experience some tangential beneficial effects from LLMs, their most enduring legacy will likely be to entrench authority and cement the existing power structures.
It's silly to say that the ability to fly has not changed society. Or that it won't continue to change society, if we manage to become space-faring before ruining our home planet.
The phrase, "The average human on the planet will take a handful of flights in their lifetime" is doing a lot of work. What are those flights to? How meaningful/important were the experiences? What cultural knowledge was exchanged? What about crucial components that enable industries we depend on? For example, a nuclear plant might constantly be ordering parts that are flown in overnight.
In general you're really minimizing the importance of aviation without really providing anything to back up your claims.
Your problem is thinking that hype artists, professionals and skeptics are all the same voice with the same opinion. Because of that, you can't recognize when sentiment is changing among the more skeptical.
Functional illiteracy and the lack of any capacity to hold context longer than two sentences have long been a plague on HN. Now that we've outsourced our entire thinking process to "@grok is this true", it has claimed almost the entirety of the human race.
soulofmischief: complains that AI-skeptics would say the Wright brothers were idiots because they didn't immediately implement a supersonic jet
ares623: we were promised supersonic jets today or very soon (translation: AI hype and scam artists have already promised a lot now)
eru: The passive voice is doing a lot of work in your sentence. (Translation: he questions the validity of ares623's statement)
me: Here are just three examples of hype and scam promising the equivalent of super jet today, with some companies already being burned by these promises.
Apply your own "functional literacy". I made a clarification that those outside of an industry have to separate the opinions of professionals and hype artists.
The irony of your comment would be salient, if it didn't feel like I was speaking with a child. This conversation is over, there's no reason to continue speaking with you as long as you maintain this obnoxious attitude coupled with bad reading comprehension.
Here's Ryan Dahl, cofounder of Deno, creator of Node.js tweeting today:
--- start quote ---
This has been said a thousand times before, but allow me to add my own voice: the era of humans writing code is over. Disturbing for those of us who identify as SWEs, but no less true. That's not to say SWEs don't have work to do, but writing syntax directly is not it.
They have everything to gain by saying those things. It doesn’t even need to be true. All the benefits arrive at the point of tweeting.
If it turns out to be not true then they don’t lose anything.
So we are in a state where people can just say things all the time. Worse, they _have_ to say things. To them, not saying anything is just as bad as being directly against the hype. Zero accountability.
Yes, my point is that industry professionals are re-calibrating based on the last year of agentic coding advancements, and that this is different from hype men on YouTube from 1-2 years ago claiming that they don't have to write code anymore.
Congratulations, now you're starting to understand! :)
Last one is irrelevant. Of course some companies are miscalculating.
OpenAI never claimed they had achieved AGI internally. Sam was very obviously joking, and despite the joke being so obvious he even clarified hours later.
>In a post to the Reddit forum r/singularity, Mr Altman wrote “AGI has been achieved internally”, referring to artificial general intelligence – AI systems that match or exceed human intelligence.
>Mr Altman then edited his original post to add: “Obviously this is just memeing, y’all have no chill, when AGI is achieved it will not be announced with a Reddit comment.”
Dario has not said "we are months away from software jobs being obsolete". He said:
>"I think we will be there in three to six months, where AI is writing 90% of the code. And then, in 12 months, we may be in a world where AI is writing essentially all of the code"
He's maybe off by some months, but not at all a bad prediction.
Arguing with AI skeptics reminds me of debating other very zealous ideologues. It's such a strange thing to me.
Like, just use the stuff. It's right there. It's mostly the people using the stuff vs. the people who refuse to use it because they feel it'll make them ideologically impure, or they used it once two years ago when it was way worse and haven't touched it since.
The insecurity is mind-boggling. So many engineers afraid to touch this stuff for one reason or another.
I pride myself in being an extremely capable engineer who can solve any problem when given the right time and resources.
But now, random unskilled people can do in an afternoon what it might have taken me a week or more to do before. Of course, I know their work might be filled with major security issues, or terrible architectural decisions and hidden tech debt that will eventually grind development to a complete halt.
I can be negative and point out these issues, or I can adopt these tools myself, and have the skilled hand required to keep things on rails. Now what I can do in a week cannot be matched by an unskilled engineer in an afternoon, because we have the same velocity multipliers.
I remember being such a purist in my youth that I didn't even want autocomplete or intellisense, because I feared it would affect my recall or stunt my growth. How far we have come. How I code has changed completely in the last year.
I code 8-20 hours a day, all day. I actively work on several projects at once, flipping between contexts to check results, review code, iterate on design/implementation, hand off new units of work to various agents. It is not a perfect process, I am constantly screaming and pulling my hair out over how stupid and forgetful and stubborn these tools can be sometimes. My output has still dramatically increased, and I have plenty extra time to ensure the quality of the code is secure and good enough.
I've given up on expecting perfection from code I didn't write myself; but what else is new? Any skilled individual who has managed engineers before knows you have to get over this quickly and accept that code from other engineers will not match your standards 100%.
Your role is to develop and enforce guidelines and processes which ensure that any code which hits production has been thoroughly reviewed, made secure and performant. There might be some stupid inline metacomments from the LLM that slip through, but if your processes are tight enough, you can produce much more code with correct interfaces, even if the insides aren't perfect. Even then, targeted refactors are more painless than ever.
Engineers who only know how to code, and at a relatively mediocre level, which I imagine is the majority of engineers now in the field who got into it because of the money, are probably feeling the heat and worried that they won't be employable. I do not share that fear, provided that anyone at all is employable.
When running a business, you'll still need to split the workload, especially as keeping pace with competition becomes an increasingly brutal exercise. The money is still in the industry, and people with money will still find ways to use it to develop an edge.
People can say what they want about LLMs reducing intelligence/ability; the trend has clearly been that people are beginning to get more organized, document things better, enforce constraints, and think in higher-level patterns. And there's renewed interest in formal verification.
LLMs will force the skilled, employable engineer to chase both maintainability and productivity from the start, in order to maintain a competitive edge with these tools. At least until robots replace us completely.
The thing is that currently most of these projects are just done by engineers. It's easy to stay organized when the project lasts a couple of weeks and stays within <5 engineers. The issues start when the software starts living longer and you add in modern agile practices; it becomes a complete mess, with each PM trying to add random features on top of the existing code. As you add more and more code, maintainability will just become impossible.
> The issues start when the software starts living longer
There's going to be a bifurcation; caricaturing it, "operating system kernels" and "disposable code". In the latter case, you don't maintain it; you dispose of it and vibe-code up a new one.
I am aware that software complexity scales. That is literally why I suggested that having good standards from the start is becoming increasingly important.
The answer to whatever perceived (unfounded) overlap there is between gun owners and potential ICE agents is not to encourage or condone more prejudice and ostracization of the people who do not fall in both categories via speech that lumps them in with the others anyway.
That's a very disrespectful way to recognize and appreciate them.
Many of us own guns precisely to defend ourselves and our countrymen in the event of civil chaos. That's what the second amendment is for.
Most true leftists I know are armed. Don't forget what Karl Marx said about an armed populace. We are in some serious shit and this kind of divisive attitude is not productive.
Let us know when you take your guns to defend Minnesota. Or are the actions being committed there by the government not a sufficient amount of "civil chaos" for you to take action? I just wish gun owners were honest. You're not here to defend anyone or anything. You just like to make small holes in paper targets.
Individual vigilante action is not the answer. Collectivization, political organization is. Guns are for self-defense and armed conflict. I can't solve this problem on my own, until there is mass collectivization then I'm only an individual and cannot just go around taking on a rogue authoritarian government. That's absurd. Besides, I personally have physical handicaps, I'm not Batman. And I spoke for leftists. I cannot control what other people do. But divisive crap like this is not helping to unify anything.
> I just wish gun owners were honest. You're not here to defend anyone or anything. You just like to make small holes in paper targets.
You literally know nothing about me and are making assumptions about how I think and operate based on a single fact you think you know about me. That is textbook prejudice. All you're doing is showing that you don't understand the point of the Bill of Rights or what checks and balances it takes to uphold a fair government.
I'll pull the Karl Marx quote for you.
“Under no pretext should arms and ammunition be surrendered; any attempt to disarm the workers must be frustrated, by force if necessary”
You quote Marx about keeping arms while sitting on your ass doing fuck all about the corruption and tyranny surrounding us. Sounds like convenient excuses to do fuck all and pat yourself on the back as a patriot defender of America. People are being killed on the streets RIGHT NOW. But keep biding your time to act. I'm sure you will be decisive in unraveling the fascism taking over our country you brave keyboard warrior!
The thing about archives is you either parse them now or parse them later. With how much JS and other crap is served in modern social media frontends, I'm not sure WARC is the best format for archiving from them.
But that is the point of WARC: otherwise, your archival method needs some sort of general intelligence (AI or a human behind the scenes) to store exactly what you need.
With WARC (and good WARC tooling like Browsertrix Crawler) you store everything HTTP the site sent.
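For anyone who hasn't looked inside one, a WARC file is basically a sequence of records, each a small header block followed by the raw HTTP exchange. Below is a minimal sketch of serializing a single WARC/1.0 `response` record. The `CapturedResponse` type and `toWarcRecord` function are names I made up for illustration, and the payload is treated as a UTF-8 string for simplicity, whereas real crawlers write raw bytes and usually add more headers (digests, IP address, and so on).

```typescript
// Hypothetical input shape: one HTTP response exactly as the server sent it.
interface CapturedResponse {
  targetUri: string;
  capturedAt: Date;
  recordId: string; // a UUID supplied by the caller
  rawHttp: string;  // status line + headers + body (simplified to text)
}

const CRLF = "\r\n";

// Serialize a single WARC/1.0 "response" record.
function toWarcRecord(r: CapturedResponse): string {
  // Content-Length counts the bytes of the record block, not characters.
  const byteLength = new TextEncoder().encode(r.rawHttp).length;

  // WARC-Date is a UTC timestamp; drop the milliseconds from toISOString().
  const warcDate = r.capturedAt.toISOString().replace(/\.\d{3}Z$/, "Z");

  const headers = [
    "WARC/1.0",
    "WARC-Type: response",
    `WARC-Target-URI: ${r.targetUri}`,
    `WARC-Date: ${warcDate}`,
    `WARC-Record-ID: <urn:uuid:${r.recordId}>`,
    "Content-Type: application/http; msgtype=response",
    `Content-Length: ${byteLength}`,
  ];

  // Header block, blank line, the untouched HTTP message, then two CRLFs
  // that terminate the record.
  return headers.join(CRLF) + CRLF + CRLF + r.rawHttp + CRLF + CRLF;
}
```

Nothing in there parses or interprets the page, which is the point: the decision about what matters gets deferred to whoever replays or parses the archive later.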
I'm confident that they can. This isn't a new idea. Something like this would be a walk in the park for Opus 4.5 in the right harness.
Of course it likely still needs a skilled pair of eyes and a steady hand to keep it on track or keep things performant, but it's an iterative process. I've already built my own ASCII rendering engines in the past, and have recently built one with a coding model, and there was no friction.
Ok, but if you have a wooden hammer and chisel, and a steel hammer and chisel, choosing the wooden one is an artisanal choice, not a practical one. These tools enable an amount of velocity I've never had before, both in research and development.
[0] https://www.npmjs.com/package/vosk-browser
[1] https://github.com/diffusionstudio/vits-web
[2] https://github.com/KittenML/KittenTTS
[3] https://research.nvidia.com/labs/adlr/personaplex/
[4] https://www.tavus.io/post/sparrow-1-human-level-conversation...