Get them to learn the fundamentals and understand them deeply just like they should/might have in the past.
They can do so at an accelerated rate using AI on verifiable subject matter. Use something like spaced repetition (SRS) + Copilot + nano (related: https://srs.voxos.ai) to really internalize concepts.
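For the SRS part, the scheduling math is simple enough to show. Here's a minimal sketch of the classic SM-2 interval update that most spaced-repetition tools are built on (the function shape is illustrative, not any particular tool's API):

```python
def sm2_update(quality, reps, interval, ease):
    """quality: 0-5 self-grade of recall; returns (reps, interval_days, ease)."""
    if quality < 3:
        return 0, 1, ease                  # failed recall: reset, review tomorrow
    if reps == 0:
        interval = 1
    elif reps == 1:
        interval = 6
    else:
        interval = round(interval * ease)  # each success stretches the gap
    # ease factor drifts with answer quality, floored at 1.3
    ease = max(1.3, ease + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return reps + 1, interval, ease

# A perfect recall streak pushes reviews out fast:
reps, interval, ease = 0, 0, 2.5
intervals = []
for _ in range(4):
    reps, interval, ease = sm2_update(5, reps, interval, ease)
    intervals.append(interval)
print(intervals)  # [1, 6, 16, 45]
```

The point for students: the review schedule is driven entirely by honest self-grading, which is exactly the "verifiable subject matter" loop AI can help generate cards for.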
Go deep on a project while using AI. To what extreme can they take a program before AI can't offer a working solution? Professors should explore and guide their students to this boundary.
I've been building on a platform called "The Jobs Index" as a centralized way to understand:
1. Task-based automation risks
2. Real-time layoff trends
3. Most/least resilient occupations
4. Outlook of the job market over the next 10 years
While the layoffs data is starting to look scary, corporate policy and regulation will very likely lead to an explosion of human-in-the-loop (HIL) jobs.
Andrej Karpathy (https://x.com/karpathy) released (and then removed?) a cursory analysis on ~300 occupations, but JTI tracks over 600 across BLS and O*NET data.
One of the things that's becoming clear is that job security will map to lack of verifiability in a given subject matter. That's somewhat concerning.
Fun! I came up with a similar concept - except you can only type in one word at a time. It discourages self-editing while also not being as extreme as exploding text.
I built this because I'm fascinated by word embeddings. The simple, canonical example of king − man + woman ≈ queen is such an accessible concept.
I figured that an LLM (Claude Opus 4.6) could extend the concept to a number of mathematics topics, and it did!
There's an aspect to this that I feel could help people who aren't mathematically inclined internalize core concepts to the point that actual math makes more sense to them.
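For anyone who wants to play with the arithmetic directly, here's a toy sketch with hand-made 3-d vectors. Real embeddings are learned and have hundreds of dimensions; the coordinates below are made up purely to make the king − man + woman ≈ queen mechanics visible:

```python
import math

# Toy "embeddings": [royalty, masculine, feminine]
emb = {
    "king":  (0.9, 0.9, 0.1),
    "queen": (0.9, 0.1, 0.9),
    "man":   (0.1, 0.9, 0.1),
    "woman": (0.1, 0.1, 0.9),
    "apple": (0.0, 0.2, 0.2),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def nearest(vec, exclude=()):
    """Closest vocabulary word to vec by cosine similarity."""
    return max((w for w in emb if w not in exclude),
               key=lambda w: cosine(vec, emb[w]))

# king - man + woman: royalty stays, masculine cancels, feminine adds
target = tuple(k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"]))
print(nearest(target, exclude={"king", "man", "woman"}))  # queen
```

With real pretrained vectors (word2vec, GloVe) the same arithmetic works, just noisily, which is what makes the example so compelling.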
I think the thinking mode is a net negative in a significant number of cases. I've had an issue in a file that Claude considered and then dismissed out of hand in its thinking, without ever mentioning it in the regular output.
As I automate more and more of my agentic coding process, I've come to realize that a swipe-based UX is very likely to dominate corporate decision making in the years to come.
The posted link is a research report on the topic - in full disclosure generated by a custom research agent I've been working on.
I'm sure that others are working on other novel UXs for fast decision making to coordinate their agents. I'd love to hear any insights you've gained so far.
I've worked in governance for the last 15 years. Based on that experience, nobody truly cares about the UX to signify a decision. They care about the communication of the information to make the decision in the first place. So you might be right, but you are focusing on something that is fairly irrelevant. If you want to innovate in the boardroom, innovate on information flow.
I've been building something in this space ("Clink" - multi-agent coordination layer) and this research confirms some of the assumptions that motivated the project. You can't just throw more agents at a problem and expect it to get better.
The error amplification numbers are wild! 17x for independent agents vs 4x with some central coordination. Clink provides users (and more importantly their agents) the primitives to choose their own pattern.
The most relevant features are...
- work queues with claim/release for parallelizable tasks
- checkpoint dependencies when things need to be sequential
- consensus voting as a gate before anything critical happens
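To make the first of those concrete, here's a generic sketch of the claim/release work-queue pattern (illustrative only, not Clink's actual API): a claim leases a task to exactly one agent, and a release or an expired lease puts it back for others to pick up.

```python
import threading, time

class WorkQueue:
    def __init__(self):
        self._lock = threading.Lock()
        self._pending = []           # tasks nobody owns yet
        self._claimed = {}           # task -> (agent_id, lease deadline)

    def add(self, task):
        with self._lock:
            self._pending.append(task)

    def claim(self, agent_id, lease_s=30):
        """Atomically hand one task to a single agent, with a lease."""
        with self._lock:
            self._reap_expired()
            if not self._pending:
                return None
            task = self._pending.pop(0)
            self._claimed[task] = (agent_id, time.monotonic() + lease_s)
            return task

    def release(self, task):
        """Give up a claim so another agent can pick the task up."""
        with self._lock:
            if task in self._claimed:
                del self._claimed[task]
                self._pending.append(task)

    def complete(self, task):
        with self._lock:
            self._claimed.pop(task, None)

    def _reap_expired(self):
        # Crashed agents lose their lease; their tasks become claimable again.
        now = time.monotonic()
        for task, (_, deadline) in list(self._claimed.items()):
            if deadline < now:
                del self._claimed[task]
                self._pending.append(task)

q = WorkQueue()
q.add("task-1")
t = q.claim("agent-a")
q.release(t)                 # agent-a bails; task is claimable again
print(q.claim("agent-b"))    # task-1
```

The lease is the important part: without it, one dead agent silently strands a task forever.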
The part about tool count increasing coordination overhead is interesting too. I've been considering exposing just a single tool to address this, but I wonder how this plays out as people start stacking more MCP servers together. It feels like we're all still learning what works here. The docs are at https://docs.clink.voxos.ai if anyone wants to poke around!
> The part about tool count increasing coordination overhead is interesting too. I've been considering exposing just a single tool to address this, but I wonder how this plays out as people start stacking more MCP servers together.
It works really well. Whatever knowledge LLMs absorb about CLI commands seems to transfer to MCP use, so a single tool with commands/subcommands is the pattern I default to when I'm forced to use an MCP server instead of providing a CLI tool (like when the MCP server needs to be in-memory with the host process).
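The dispatch side of that pattern is tiny. Here's a sketch of a single entry point that routes `command subcommand` strings to handlers the way a CLI would (the handler names and registry are illustrative, not any particular MCP SDK's API):

```python
HANDLERS = {}

def command(name):
    """Decorator registering a handler under its CLI-style command string."""
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register

@command("projects list")
def list_projects(args):
    return ["alpha", "beta"]           # stand-in data

@command("projects create")
def create_project(args):
    return f"created {args['name']}"

def tool(command_line, args=None):
    """The single exposed tool: parse `command subcommand` and dispatch."""
    handler = HANDLERS.get(command_line)
    if handler is None:
        # Echo the valid commands back so the model can self-correct,
        # much like a CLI's usage message.
        return f"unknown command; available: {sorted(HANDLERS)}"
    return handler(args or {})

print(tool("projects list"))                      # ['alpha', 'beta']
print(tool("projects create", {"name": "gamma"})) # created gamma
```

The usage-message-on-error detail matters in practice: models recover from a bad command string far better when the error enumerates what's valid.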
I've started with the basics for now: messages (called "Clinks" because... marketing), groups, projects, milestones - all fairly non-novel, and one might say it's just Slack/Jira. What distinguishes it are proposals that facilitate distributed consensus behaviour between agents. That's paired with a human-in-the-loop proposal type that requires the fleet owner to respond via email.
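In pattern terms, the agent-to-agent proposal is just a vote with a quorum gate. A minimal generic sketch (not Clink's implementation) looks like:

```python
class Proposal:
    def __init__(self, description, voters):
        self.description = description
        self.voters = set(voters)     # agents allowed to vote
        self.votes = {}               # agent -> bool

    def vote(self, agent, approve):
        if agent not in self.voters:
            raise ValueError(f"{agent} is not a registered voter")
        self.votes[agent] = approve   # re-voting overwrites, by design

    def passed(self):
        """True once strictly more than half of all voters approved."""
        approvals = sum(self.votes.values())
        return approvals * 2 > len(self.voters)

p = Proposal("deploy to prod", ["a", "b", "c"])
p.vote("a", True)
print(p.passed())   # False: 1 of 3
p.vote("b", True)
print(p.passed())   # True: 2 of 3
```

The human-in-the-loop variant is the same gate with one mandatory voter (the fleet owner) whose vote arrives over a slow channel like email.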
That's great to hear. It makes sense given the MCP server in this case is mainly just a proxy for API calls. One thing I wonder is at what point do you decide your single tool description packs in too much context? Do you introduce a tool for each category of subcommands?
Wouldn't it be better to just stack the functionalities of multiple agents into a single agent instead of taking on this multi-agent overhead and failure risk? Many people in academia consider multi-agent systems an artifact of the current crop of LLMs, but with longer and longer reliable context and more reliable calling of larger numbers of tools in recent models, multi-agent systems seem less and less necessary.
In some cases, you might actually want to cleanly separate parallel agents' contexts, no? I suppose you could make your main agent with the stacked functionalities responsible for limiting the prompts of any subagents it spawns.
My hunch is that we'll see a number of workflows that will benefit from this type of distributed system. Namely, ones that involve agents having to collaborate across timezones and interact with humans from different departments at large organizations.
Coordination of workflows between people using different LLM providers is the big one. You prefer Anthropic's models, your coworker swears by OpenAI's. None of these companies are going to support frameworks/tools that allow agent swarms to use anything other than their own models.
Work hours is the only way I've learned to think about it productively.
It's also important to gather consensus among the team and understand if/why work hour estimates differ between individuals on the same body of work or tasks. I'd go so far as to say that a majority of project planning, scoping, and derisking can be figured out during an honest discussion about work hour estimates.
Story points are too open to interpretation and have no meaningful grounding besides the latent work hours that need to go into them.
If you have complex tasks and you have more than one person put in time to do a proper estimate, yes, you should sync up and see if you have different opinions or unclear issues.