> If anything, I think we'll see (another) splintering in the market. Companies with strong internal technical ability vs those that don't.
A tangent: I feel, again, that unfortunately AI is going to divide society into people who can use the most powerful AI tools vs those who will only be using ChatGPT at most (if at all).
I don't know why I keep worrying about these things. Is it pointless?
I do feel this divide, but from what I've read, and what I've observed, it's more a divide between people who understand the limited use-cases where machine learning is useful, and people who believe it should be used wherever possible.
For software engineering, it is useless unless you're writing snippets that already exist in the LLM's corpus.
> For software engineering, it is useless unless you're writing snippets that already exist in the LLM's corpus.
If I give something like Sonnet the docs for my JS framework, it can write code "in it" just fine. It makes the occasional mistake, but if I provide proper context and planning up front, it can knock out some fairly impressive stuff (e.g., helping me to wire up a shipping/logistics dashboard for a new ecom business).
That said, this requires me policing the chat (preferred) vs. letting an agent loose. I think the latter is just opening your wallet to model providers but shrug.
If you need a shipping dashboard, then yeah, that's a very common, very simple use-case. Just hook up an API to a UI. Even then I don't think you'll make a very maintainable app that way, especially if you have multiple views (because the LLMs are not consistent in how they use features, they're always generating from scratch and matching whatever's closest).
What I'm saying is that whenever you need to actually do some software design, i.e. tackle a novel problem, they are useless.
Since this can be a significant security issue for the state, why doesn't the government sponsor a security audit of the software? Does it upload the data, or is everything done on the device? (Also, it will have to keep up with the updates.)
Because regulation is bad, according to the current executive?
Politics aside, the FDA applies a very generous amount of regulation (mostly justifiable). I'm not sure we want to pay multiples for our consumer electronics, as it (mostly) shows acceptable behavior and rarely kills anybody.
It is bad. Regulations have been historically hijacked to benefit corporate interests. See Intuit and tax policy for example.
Voters on the right naively thought he'd work to fix it. (Wrong!) But it is very much bad for a very large number of issues. Maybe the next executive will fix it? (Wrong!)
The NSA has a bad historical reputation for this sort of thing - intentionally weakening crypto standards to make things easier for themselves to break, while keeping them "strong enough" that other agencies outside of NSA/GCHQ/GRU can't. The Crypto AG scandal [0] was pretty bad, with Clipper/Skipjack & Dual_EC_DRBG [1] being more recent ones. The NSA could do what you are asking to do, but they probably won't let us know what the really bad holes are because they want to keep using them.
If anyone wants to use skills with any other model or tool, like Gemini CLI etc., I created open-skills, which lets you use skills with any other LLM.
Caveat: needs a Mac to run.
Bonus: it runs everything locally in a container, not in the cloud nor directly on the Mac.
Surprising that they haven't made a podcast (NotebookLM-esque) based on the repo - one you can listen to on a bus ride. Something I created a while back: https://gitpodcast.com
Remember that the things a "CEO" of anything says are just what he hears from the people he has talked to. That doesn't make it obviously wrong, it just raises the question of who he has been talking to that week. I doubt Garry is doing any of the coding these days. For what it's worth, it's completely fine to ignore what he is saying - no offense.
Except he's right in this case, and it is contrary to the hypemongering we'd expect
It's 100% accurate to say that "MCP barely works", and it's meaningful to hear that even from the head of YC, which is pushing through a massive number of businesses based on MCP or using it in some way.
It's two words with no qualifiers from someone we don't think is technical. If it was, say, Karpathy, then sure, let's waste a whole thread discussing his farts, but I'm sure I'm not alone in having had Claude Code create an MCP, and I use it most times I use Claude Code. To move the conversation forward though, what limitations and issues have you run into with MCPs? I wouldn't say mine are 100% bug free, but I wouldn't say they "barely work" either. Mostly work?
> The key to changing everyday behaviour is to make the evaluation of costs (effort) and benefits (rewards) a habit that doesn’t seem too much like hard work. Even for the most apathetic among us, this holds out the hope of turning a kneejerk “no” into an ability to consider saying “yes”.
and this
> But left to his own devices he did nothing. Studies in people who develop apathy have shown that many of them just don’t find it sufficiently rewarding to take action. The cost of making the effort doesn’t seem worth the potential benefit.
It is a cycle that feeds into the next, constantly strengthening itself. Whether that is positive feedback or negative feedback is really important. It is worth a large disruption to your life to get it working for you. The deadlock is very real.
Sooner or later, I believe, there will be models which can be deployed locally on your Mac and are as good as, say, Sonnet 4.5. People should shift to completely local at that point, and use a sandbox for executing code generated by the LLM.
Edit: "completely local" meant not doing any network calls unless specifically approved. When llm calls are completely local you just need to monitor a few explicit network calls to be sure.
Unlike with Gemini, you then don't have to rely on a certain list of whitelisted domains.
>Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that".
I've been repeating something like 'keep thinking about how we would run this in the DC' at work. The cycles of pushing your compute outside the company and then bringing it back in once the next VP/Director/CTO starts because they need to be seen as doing something, and the thing that was supposed to make our lives easier is now very expensive...
I've worked on multiple large migrations between DCs and cloud providers for this company and the best thing we've ever done is abstract our compute and service use to the lowest common denominator across the cloud providers we use...
Can't find 4.5, but 3.5 Sonnet is apparently about 175 billion parameters. At 8-bit quantization that would fit on a box with 192 gigs of unified RAM.
The most RAM you can currently get in a MacBook is 128 gigs, I think, and that's a pricey machine, but it could run such a model at 4-bit or 5-bit quantization.
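Back of the envelope, assuming the 175B figure is roughly right (weights only, ignoring KV cache and runtime overhead):

    # rough weights-only memory for a ~175B-parameter model at different quantization levels
    PARAMS = 175e9

    for bits in (16, 8, 5, 4):
        gb = PARAMS * bits / 8 / 1e9
        print(f"{bits}-bit: ~{gb:.0f} GB")

    # 16-bit: ~350 GB, 8-bit: ~175 GB, 5-bit: ~109 GB, 4-bit: ~88 GB
    # so 8-bit squeezes into 192 GB of unified RAM, and 4- or 5-bit into 128 GB,
    # with whatever is left over going to the KV cache and the OS.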
As time goes on it only gets cheaper, so yes this is possible.
The question is whether bigger and bigger models will keep getting better. What I'm seeing suggests we will see a plateau, so probably not forever. Eventually affordable endpoint hardware will catch up.
That's not easy to accomplish. Even a "read the docs at URL" is going to download a ton of stuff. You can bury anything into those GETs and POSTs. I don't think that most developers are going to do what I do with my Firefox and uMatrix, that is whitelisting calls. And anyway, how can we trust the whitelisted endpoint of a POST?
> Edit: "completely local" meant not doing any network calls unless specifically approved. When llm calls are completely local you just need to monitor a few explicit network calls to be sure.
The problem is that people want the agent to be able to do "research" on the fly.
Because the article shows it isn't Gemini that is the issue, it is the tool calling. When Gemini can't get to a file (because it is blocked by .gitignore), it then uses cat to read the contents.
I've watched this with GPT-OSS as well. If the tool blocks something, it will try other ways until it gets it.
How can an LLM be at fault for something? It is a text prediction engine. WE are giving them access to tools.
Do we blame the saw for cutting off our finger?
Do we blame the gun for shooting ourselves in the foot?
Do we blame the tiger for attacking the magician?
The answer to all of those things is: no. We don't blame the thing doing what it is meant to be doing no matter what we put in front of it.
It was not meant to give access like this. That is the point.
If a gun randomly goes off and shoots someone without someone pulling the trigger, or a saw starts up when it’s not supposed to, or a car’s brakes fail because they were made wrong - companies do get sued all the time.
But the LLM can't execute code. It just predicts the next token.
The LLM is not doing anything. We are placing a program in front of it that interprets the output and executes it. It isn't the LLM, but the IDE/tool/etc.
So again, replace Gemini with any Tool-calling LLM, and they will all do the same.
When people say ‘agentic’ they mean piping that token to various degrees of directly into an execution engine. Which is what is going on here.
And people are selling that as a product.
If what you are describing was true, sure - but it isn't. The tokens the LLM is outputting are doing things - just like the ML models driving Waymos are moving servos and controls, and doing things.
It’s a distinction without a difference if it’s called through an IDE or not - especially when the IDE is from the same company.
That has effects, and those effects create liability if they cause damage.
Because it misses the point. The problem is not the model being in a cloud. The problem is that as soon as "untrusted inputs" (i.e. web content) touch your LLM context, you are vulnerable to data exfil. Running the model locally has nothing to do with avoiding this. Nor does "running code in a sandbox", as long as that sandbox can hit http / dns / whatever.
The main problem is that LLMs share both "control" and "data" channels, and you can't (so far) disambiguate between the two. There are mitigations, but nothing is 100% safe.
Sorry, I didn't elaborate. But "completely local" meant not doing any network calls unless specifically approved. When llm calls are completely local you just need to monitor a few explicit network calls to be sure.
The LLM cannot actually make the network call. It outputs text that another system interprets as a network call request, which then makes the request and sends that text back to the LLM, possibly with multiple iterations of feedback.
You would have to design the other system to require approval when it sees a request. But this of course still relies on the human to understand those requests. And will presumably become tedious and susceptible to consent fatigue.
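As a sketch of what that approval layer could look like (everything here is hypothetical, and host-level approval still can't tell a benign GET from an exfiltrating one - which is exactly where the consent fatigue comes from):

    # toy approval gate between the model's tool request and the actual network call
    import urllib.request
    from urllib.parse import urlparse

    APPROVED_HOSTS = set()  # hosts the user has explicitly allowed this session

    def gated_fetch(url):
        host = urlparse(url).netloc
        if host not in APPROVED_HOSTS:
            answer = input(f"Model requested {url}. Allow host {host}? [y/N] ")
            if answer.strip().lower() != "y":
                return "BLOCKED: user denied network access"
            APPROVED_HOSTS.add(host)
        with urllib.request.urlopen(url) as resp:
            return resp.read().decode("utf-8", errors="replace")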
I would argue that a lot of the tools will be hosted on GitHub - in fact, most of the existing repos are potentially a tool (in future). And the discovery is just a GitHub search.
btw GitHub repos are already part of the LLM's training data
So you don't even need internet to search for tools, let alone TEO
The example given by Anthropic of tools filling valuable context space is a result of bad design.
If you pass the tools below to your agent, you don't need a "search tools" tool, you need good old-fashioned architecture: limit your tools based on the state of your agent, custom tool wrappers to limit MCP tools, routing to sub-agents, etc.
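Something like the sketch below, where the tool names and states are made up for illustration:

    # only expose the tools that make sense for the agent's current state,
    # instead of sending the whole catalogue with every request
    ALL_TOOLS = {
        "search_orders":  {"states": {"triage", "support"}},
        "refund_order":   {"states": {"support"}},
        "deploy_service": {"states": {"ops"}},
    }

    def tools_for_state(state):
        return [name for name, meta in ALL_TOOLS.items() if state in meta["states"]]

    print(tools_for_state("support"))  # ['search_orders', 'refund_order']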
I don't see what's wrong in letting the LLM decide which tool to call based on a search over a long list of tools (or a binary tree of lists in case the list becomes too long, which is essentially what you alluded to with sub-agents).
I was referring to letting LLMs search GitHub and run tools from there. That's like randomly searching the internet for code snippets and blindly running them on your production machine.
Sure, that protects your machine, but what about data security?
Do I want to allow unknown code to be run on my private/corporate data?
Sandbox all you want but sooner or later your data can be exfiltrated. My point is giving an LLM unrestricted access to random code that can be run is a bad idea.
Curate carefully is my approach.
> Skills are the actualization of the dream that was set out by ChatGPT Plugins .. But I have a hypothesis that it might actually work now because the models are actually smart enough for it to work.
and earlier Simon Willison argued[1] that Skills are an even bigger deal than MCP.
But I do not see as much hype for Skills as there was for MCP - it seems people are in the MCP "inertia" and have no time to shift to Skills.
Skills are less exciting because they're effectively documentation that's selectively loaded.
They are a bigger deal in a sense because they remove the need for all the scaffolding MCPs require.
E.g. I needed Claude to work on transcripts from my Fathom account, so I just had it write a CLI script to download them, and then I had it write a SKILL.md, and didn't have to care about wrapping it up into an MCP.
At a client, I needed a way to test their APIs, so I just told Claude Code to pull out the client code from one of their projects and turn it into a CLI, and then write a SKILL.md. And again, no need to care about wrapping it up into an MCP.
But this seems a lot less remarkable, and there's a lot less room to build big complicated projects and tooling around it, and so, sure, people will talk about it less.
Skills are good for context management as everything that happens while executing the skill remains “invisible” to the parent context, but they do inherit the parent context. So it’s pretty effective for a certain set of problems.
MCP is completely different, I don’t understand why people keep comparing the two. A skill cannot connect to your Slack server.
Skills are more similar to sub-agents, the main difference being context inheritance. Sub-agents enable you to set a different system prompt for those which is super useful.
Are you sure? I thought skills were loaded into the main context, unlike (sub)agents. According to Claude, they're loaded into the main context.
Do you have a link?
Only once Claude decides a skill is needed does it load the additional details into the main context to use. It's basically lazy loading into the main context.
I agree with you. I don't see people hyping them and I think a big part of this is that we have sort of hit an LLM fatigue point right now. Also Skills require that your agent can execute arbitrary code which is a bigger buy-in cost if your app doesn't have this already.
I still don't get what is special about the skills directory - since forever I've instructed Claude Code to "please read X and do Y" - how are skills different from that?
They're not. They are just a formalization of that pattern, with a very tiny extra feature where the model harness scans that folder on startup and loads some YAML metadata into the system prompt so it knows which ones to read later on.
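For illustration, that scan can be as small as this (a sketch, not Anthropic's actual code; it assumes each SKILL.md starts with YAML frontmatter carrying name and description fields, as in the published skills repo):

    # gather skill metadata at startup so it can be appended to the system prompt
    import pathlib
    import yaml  # pip install pyyaml

    def skill_summary(skills_dir="skills"):
        lines = []
        for skill_md in pathlib.Path(skills_dir).glob("*/SKILL.md"):
            text = skill_md.read_text()
            if text.startswith("---"):
                meta = yaml.safe_load(text.split("---")[1])
                lines.append(f"- {meta['name']}: {meta['description']} "
                             f"(read {skill_md} for full instructions)")
        return "Available skills:\n" + "\n".join(lines)

    # the SKILL.md bodies are only read later, if the model decides a skill is relevant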
It's more that they are embracing that the LLM is smart enough that you don't need to build-in this functionality beyond that very minimal part.
A fun thing: Claude Code will sometimes fail to find the skill the "proper" way, and will then in fact sometimes look for the SKILL.md file with tools, and read the file with tools, showing that it's perfectly capable of doing all the steps.
You could probably "fake" skills pretty well with instructions in CLAUDE.md to use a suitable command to extract the preamble of files in a given directory, and tell it to use that to decide when to read the rest.
It's the fact that it's such a thin layer that is exciting - it means we need increasingly less special logic other than relying on just basic instructions to the model itself.
No, skills are a set of manifested and tested 'skills' which reduce the 'mental load' of the LLM and reduce the context the LLM needs to do things reproducibly.
But we are still reliant on the LLM correctly interpreting the choice to pick the right skill. So "known to work" should be understood in the very limited context of "this sub-function will do what it was designed to do reliably" rather than "if the user asks to use this sub-function it will do was it was designed to do reliably".
Skills feel like a non-feature to me. It feels more valuable to connect a user to the actual tool and let them familiarize themselves with it (and not need the LLM to find it in the future) rather than having the tool embedded in the LLM platform. I will carve out a very big exception for accessibility here - I love my home device being an egg timer - it's a wonderful egg timer (when it doesn't randomly play music), and I could buy an egg timer, but having a hands-free egg timer is actually quite valuable to me while cooking. So I believe there is real value in making these features accessible through the LLM over media where the feature would normally be difficult to use.
This is no different to an MCP, where you rely on the model to use the metadata provided to pick the right tool, and understand how to use it.
Like with MCP, you can provide a deterministic, known-good piece of code to carry out the operation once the LLM decides to use it.
But a skill can evolve from pure Markdown via inlining some shell commands, up to a large application. And if you let it, with Skills the LLM can also inspect the tool, and modify it if it will help you.
All the Skills I use now have evolved bit by bit as I've run into new use-cases and told Claude Code to update the script the skill references or the SKILL.md itself. I can evolve the tooling while I'm using it.
Choice to pick the right tool -- there is a benchmark which tracks the accuracy of this.
"Known to work" -- if it has hardcoded code, it will work 100% of the time; that's the point of Skills. If it's just markdown then yes, some probability will be involved, and it will keep improving.
Not really special, just officially supported, and I'm guessing with how best to use it baked in via RL. Claude already knows how skills work vs learning your own home-rolled solution.
I definitely see the value and versatility of Claude Skills (over what MCP is today), but I find the sandboxed execution to be painfully inefficient.
Even if we expect the LLMs to fully resolve the task, it'll heavily rely on I/O and print statements sprinkled across the execution trace to get the job done.
> but I find the sandboxed execution to be painfully inefficient
The sandbox is not mandatory here. You can execute the skills on your host machine too (with some fidgeting), but it's good practice, and probably for the better, to get into the habit of executing code in an isolated environment for security purposes.
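A minimal version of that habit, assuming Docker is available (the image, paths, and script name are placeholders):

    # run a generated script in a throwaway container with no network access
    import subprocess

    def run_in_sandbox(workdir, script):
        return subprocess.run(
            ["docker", "run", "--rm", "--network=none",
             "-v", f"{workdir}:/work:ro", "-w", "/work",
             "python:3.12-slim", "python", script],
            capture_output=True, text=True,
        )

    result = run_in_sandbox("/path/to/skill", "process_data.py")
    print(result.stdout)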
The better practice is, if it isn't a one-off, being introduced to the tool (perhaps by an LLM) and then just running the tool yourself with structured inputs when it is appropriate. I think the 2015-era novice coding habit of copying a blob of twenty shell scripts off Stack Overflow and blindly running them in your terminal (while also not good, for obvious reasons) was better than essentially the same thing happening without you being able to watch and potentially learn what those commands were.
I do think that if the agents can successfully resolve these tasks in a code execution environment, it can likely come up with better parametrized solutions with structured I/O - assuming these are workflows we want to run over and over again.
Skills are like the "end-user" version of MCP at best, where MCP is for people building systems. Any other point of view raises a lot of questions.
Aren't skills really just a collection of tagged MCP prompts, config resources, and tools, except with more lock-in since only Claude can use them? About that "agent virtual environment" that runs the scripts: how is it customized, and can it just be a container? Aren't you going to need to ship/bundle dependencies for the tools/libraries those skills require/reference, and at that point why are we avoiding MCP-style docker/npx/uvx again?
Other things that jump out are that skills are supposed to be "composable", yet afaik it's still the case that skills may not explicitly reference other skills. Huge limiting factors IMHO compared to MCP servers that can just use boring inheritance and composition with, you know, programming languages, or composition/grouping with namespacing and such at the server layer. It's unclear how we're going to extend skills, require skills, use remote skills, "deploy" reusable skills etc etc, and answering all these questions gets us most of the way back to MCP!
That said, skills do seem like a potentially useful alternate "view" on the same data/code that MCP is covering. If it really catches on, maybe we'll see skill-to-MCP converters for serious users that want to be able do the normal stuff (like scaling out, testing in isolation, doing stuff without being completely attached to the claude engine forever). Until there's interoperability I personally can't see getting interested though
Tell your agent of choice to read the preamble of all the documents in the skills directory, and tell it that when it has a task that matches one of the preambles, it should read the rest of the relevant file for full instructions.
There are far fewer dependencies for skills than for MCP. Even a model that knows nothing about tool use beyond how to run a shell command, and has no support for anything else can figure out skills.
I don't know what you mean regarding explicitly referencing other skills - Claude at least is smart enough that if you reference a skill that isn't even properly registered, it will often start using grep and find to hunt for it to figure out what you meant. I've seen this happen regularly while developing a plugin and having errors in my setup.
> There are far fewer dependencies for skills than for MCP.
This is wrong and an example of magical thinking. AI obviously does not mean that you can ship/use software without addressing dependencies. See for example https://github.com/anthropics/skills/blob/main/slack-gif-cre... or, worse, the many other skills that just punt on this and assume CLI tools and libraries are already available.
It is categorically not wrong. With an MCP you have at a minimum all the same dependencies and on top of that a dependency on your agent supporting MCP. With skills, a lot of the time you don't need to ship code at all - just an explanation to the agent of how to use standard tools to access an API for example, but when you do need to ship code, you don't need to ship any more code than with an MCP.
The trivial evidence of this, is that if you have an MCP server available, the skill can simply explain to the agent how to use the MCP server, and so even the absolute worst case for skills is parity.
It's definitely not vendor locked. For instance, I have made it work with Gemini with Open-Skills[1].
It is, after all, a collection of instructions and code that any other LLM can read and understand and then do a code execution with (via tool call / MCP call).