Hacker News | jawns's comments

Could you give an example of a case where you'd use SQLite instead of jq or grep through Markdown?

My favorite lens on SQLite is that it is actually two things:

1. A robust durability implementation

2. A library of high-performance data structures and algorithms

The fact that it's SQL is nice, but those two attributes are what make it great.

For example, I'm implementing an in-process event log that I want to be durable. I started simple, but soon saw some edge cases, and instead of playing whack-a-mole I just switched to using SQLite as an ordered kv store that gives me ACID.
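A minimal sketch of that idea, using Python's standard sqlite3 module. The table name `events` and the helpers `open_log`, `append_event`, and `read_after` are made up for illustration, not taken from the commenter's code:

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the event log. Pass a file path for durability."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"  # ordered, never reused
        " payload TEXT NOT NULL)"
    )
    return conn


def append_event(conn, payload):
    """Append one event atomically and return its sequence number."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))
    return cur.lastrowid


def read_after(conn, seq=0):
    """Return all events with a sequence number greater than seq, in order."""
    return conn.execute(
        "SELECT seq, payload FROM events WHERE seq > ? ORDER BY seq", (seq,)
    ).fetchall()
```

The integer primary key gives the ordering, and each append is its own transaction, so a crash mid-write should not leave a half-written event behind.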

Another example: ingesting multiple interrelated datasets. Instead of a dozen hash maps in memory, I load them up into SQLite (no persistence) and then slice and dice as I need to.
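Roughly what that looks like with an in-memory database. The `users` and `orders` datasets here are invented sample data, purely for illustration:

```python
import sqlite3

# Two related datasets that might otherwise live in separate hash maps.
users = [(1, "ada"), (2, "grace")]
orders = [(10, 1, 25.0), (11, 1, 5.0), (12, 2, 40.0)]

db = sqlite3.connect(":memory:")  # nothing is written to disk
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
db.executemany("INSERT INTO users VALUES (?, ?)", users)
db.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# One join replaces the manual lookups across maps.
totals = db.execute(
    "SELECT u.name, SUM(o.total) "
    "FROM users u JOIN orders o ON o.user_id = u.id "
    "GROUP BY u.name ORDER BY u.name"
).fetchall()
```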

It's a super useful tool.


This mirrors my own experience creating a persistent event log. I started with JSON, then JSONL, etc. until finally landing on SQLite.

The moment my JSON has any sort of depth, I need to write a parser for it and potentially account for unspecified behavior. JSON's nice when it's nice, but it's terrible when it's terrible. It's 100x easier to write SQL than writing jq and... dear god if I have to use grep -A or -B, I'm doing something wrong. Constraints are actually a good thing!

The underlying database isn't the most important thing. Just use SQL. Its namespacing (eg, through CTEs) is good and you're more likely to have colleagues who know SQL compared to jq.


> It's 100x easier to write SQL than writing jq and... dear god if I have to use grep -A or -B, I'm doing something wrong. Constraints are actually a good thing!

As an occasional consumer of JSON/CSV, that's why I really like DuckDB, it's just SQL for such file formats. And it manages to be super fast at it too.


> an example of a case where you'd use SQLite instead of jq or grep through Markdown?

Usually we end up writing a script to incrementally refresh a data-set I'm analyzing (or have someone send me a copy after they pull it).

I've been using sqlite for anything which needs an UPDATE - modifying a row deep inside the data-set with jsonl is a pain.

My github is full of java programs which update sqlite3 files with threadpools and a single big lock around the UPDATE (& then I write or have an agent write code to analyze it).

DuckDB is slowly replacing it in the context of python, simply because of the ease of pushing a UDF into the SQL.

Also because I really like expressing things as LEAD/LAG with a UDF on top.
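For anyone who hasn't seen the pattern, here is a small sketch of a UDF applied on top of LAG. It uses Python's standard sqlite3 module rather than DuckDB, and assumes the bundled SQLite is 3.25 or newer (needed for window functions). The `readings` table and the `pct_change` function are hypothetical:

```python
import sqlite3


def pct_change(current, previous):
    """Percent change from previous to current; None when there is no baseline."""
    if previous is None or previous == 0:
        return None
    return round((current - previous) / previous * 100.0, 1)


con = sqlite3.connect(":memory:")
con.create_function("pct_change", 2, pct_change)  # register the Python UDF
con.execute("CREATE TABLE readings (ts INTEGER, val REAL)")
con.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 100.0), (2, 110.0), (3, 99.0)],
)

# LAG supplies the previous row's value; the UDF does the arithmetic.
changes = con.execute(
    "SELECT ts, pct_change(val, LAG(val) OVER (ORDER BY ts)) "
    "FROM readings ORDER BY ts"
).fetchall()
```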


UDF: User Defined Function

SQLite is more efficient for large data sets. A single Markdown or JSON file has to be streamed to locate a piece of data, which is O(n). Updating an existing entry in a sequential file is even worse because you have to rewrite the file. SQLite has the data structures to find data quickly, in O(log n) time.
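One way to see the difference is to ask SQLite for its query plan before and after adding an index. This is a sketch with a made-up `notes` table. The exact plan wording varies between SQLite versions, but it typically reports a full scan without the index and an index search with it:

```python
import sqlite3

store = sqlite3.connect(":memory:")
store.execute("CREATE TABLE notes (slug TEXT, body TEXT)")
store.executemany(
    "INSERT INTO notes VALUES (?, ?)",
    ((f"note-{i}", f"body {i}") for i in range(10_000)),
)

LOOKUP = "SELECT body FROM notes WHERE slug = ?"


def query_plan(sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = store.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[3] for row in rows)


plan_before = query_plan(LOOKUP, ("note-9999",))  # no index: every row is visited
store.execute("CREATE INDEX idx_notes_slug ON notes (slug)")
plan_after = query_plan(LOOKUP, ("note-9999",))   # index: B-tree search

# Updating one entry touches that row instead of rewriting a whole file.
store.execute("UPDATE notes SET body = ? WHERE slug = ?", ("edited", "note-9999"))
edited_body = store.execute(LOOKUP, ("note-9999",)).fetchone()[0]
```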

Honest answer is: whenever your markdown or json files get to be big enough that grep/jq takes long enough that you get bored waiting for it.

> get to be big enough that grep/jq takes long enough

On a modern processor, that's typically gigabytes of data, right?


Practically yes, but much earlier if agents are touching that data in my experience. Tens of GB even if you design well.

I've read this type of writing before, and I associate it strongly with manic episodes and autistic hyper fixation.

It veers from topic to topic with abrupt transitions, developing tenuous threads to support perceived grievances and slights, and every once in a while goes fully off the hinges.

There are delusions of grandeur -- "One of the issues I have is that I'm so popular on Hacker News that people there don't criticize me so much these days, even when I'm wrong" -- and an apparent obliviousness to their redflagginess that is so extreme it almost feels like satire ("A few days ago, I got served with a tax warrant from the State of New York. They believe I didn't pay taxes in 2018 and they want an amount of money that's more than twice my current yearly income").

And after paragraph after paragraph of a sob story, the request for money is presented with perhaps the most bizarre pitch I can imagine: Donate your money to me so that I can live a lifestyle you could never yourself afford.

I don't know about this person's particular mental health struggles, but it does not come off as an essay by a person who is in a good place right now.


> manic episodes and autistic hyper fixation.

I generally try very very very hard to resist armchair-diagnosing people, but it was also very hard for me not to get that impression as well. She seems hyper fixated on people's reactions to her, without examining -- or even telling us -- why people are uncomfortable working with her.

I had to look elsewhere to learn that she's espoused techno-fascist views in the past. She should have been honest about that up-front in her article, and, assuming she no longer holds those views, say so, and explain what happened to change her mind. If she does still hold those views, then... well, I can see why people continue to not want to be associated with her.


I don't know about you but if the government started demanding several times my yearly income, I'd probably not be in a great place either.

Unless the entire corporation files for bankruptcy, they can't just shut down a store to avoid paying a debt, much less a court judgment.

There's clearly something else going on here that the blog post is either intentionally leaving out or grossly misunderstanding.


> There's clearly something else going on here

Yes, what you're missing is that it's an intentional stalling strategy. It's obvious the debt goes to either corporate or whoever owns the affiliate store. None of that is the problem. None of that is meant to be what's stated. Closing the store was done to hide the responsibility and the responsibili-tee.

The video has people doing that type of shit down to the level of the employee

> talk to the owner

> okay, give me their number

> no


They can't nullify the debt by shutting the store down, but can they shut it down to create further headaches and delays for the person trying to collect the debt?

The entity that was sued is the franchise that is now closed, and not corporate.

Corporate was involved with transferring the consigned goods to a new franchisee, and was aware that the goods were consigned. They may have significant liability.

"Extraterrestrial life exists somewhere in the universe."

GPT-5.4: Misleading

Opus 4.7: Misleading

Gemini 3: FALSE

Gemini 3 (Retrieval): FALSE

Sonar Pro: FALSE

It's a weird fact claim, because the ground truth is "nobody knows for sure" and that's not one of the available options.


> It's a weird fact claim, because the ground truth is "nobody knows for sure" and that's not one of the available options.

It's even weirder to suggest that the disagreement is indicative of a problem. If you asked five very knowledgeable humans on this subject to select the correct answer on a multiple-choice questionnaire, they would almost certainly vary significantly more than these 5 LLMs.

Not to say that hallucination isn't a problem, but this is a lousy way to test it.


What are you talking about? It had the option for nuanced responses, but it chose the more binary responses. It could have chosen no explanations, no qualifiers, but instead it showed off LLMs' incapability for nuance.

These types of experiments prove to me that there is no real "reasoning" happening and "reasoning/thinking" tokens as a concept are mostly there to convince people to use models that consume more tokens and produce more revenue. The output from reasoning models might be more accurate, but it's just a consequence of a longer inference runtime. There is no "reasoning" happening; reasoning is just sales/UX bullsh*t.


> What are you talking about, it had the option for nuanced responses

The prompt allowed for exactly four valid outputs and explicitly disallowed explanations and qualifiers.

> Output exactly one label: True, Mostly True, Misleading, or False. No explanations, no qualifiers.

How is that a nuanced response?

> These types of experiments prove to me that there is no real "reasoning" happening and "reasoning/thinking"

My suggestion is that five presumably reasoning and thinking humans would also have variation in their responses to the exact same prompt.


Of the available options, "Misleading" is probably the best, since something that is most likely true but unproven is presented as fact

But "unknown or undecidable" should have been a category.


I would think ‘false’ is the only correct answer, as there’s no evidence to prove the claim, so the claim is safely assumed false.

Then again maybe that’s why I’m an atheist, not an agnostic?


"False" isn't correct in strict boolean terms either, since that implies that the inverse is true. Claiming "there is extraterrestrial life in the universe" is false is logically equivalent to claiming that "no extraterrestrial life exists anywhere in the universe" is true.

Both statements would have to be interpreted as "false" under your criteria, as neither has any evidence to substantiate it. That leads us to a logical contradiction in which a proposition and its inverse are both regarded as false.

If the statement is being interpreted as "it has been proven that extraterrestrial life exists somewhere in the universe", then it's acceptable to say this statement is false, but making evaluations that depend on an implicit qualifier isn't usually a good approach.


If we strictly follow logic, then nobody and nothing can claim that anything is true or false. We just stick these labels to things which seem to have high enough probability. The problem is that “high enough” is very-very-very different for different people, topics, and even time.

> If we strictly follow logic, then nobody and nothing can claim that anything is true or false.

Sure they can. They may or may not be correct, but that's a matter of empirical validation entirely distinct from the logic flow itself. Whether or not a conclusion is logically implied by its hypotheses has nothing to do with whether the input hypotheses are themselves true. Logic is just the reasoning process.

> We just stick these labels to things which seem to have high enough probability. The problem is that “high enough” is very-very-very different for different people, topics, and even time.

That's true, but outside the scope of logic, and is entirely a matter of semantics.


You fucked up causality. My sentence has a clear causality order, which you ignored. You are right if we ignore that, and you are right that people ignore logic many times. For a good reason.

I replied like that because I think you applied logic too strictly already.


True or False: I am wearing a blue shirt.

Looks like an ongoing theme and a very poor benchmark. Not at all the claims I expected.

Isn't misleading the correct option here then?

True or mostly true could easily be argued from a statistical likelihood perspective: life exists on Earth and, based on what we know, Earth doesn't appear to be all that special in a very large universe.

I think you could come up with a reasonable argument for any of the responses, hence the problem with the methodology.


False makes sense if you are interpreting it strictly as "has this been proven?"

False is correct, but misleading

My implicit assumption is that if you fact-check the fact-check, any label other than "true" means the original fact-check is unacceptable


No, "misleading" is a statement that is used because it suggests something else. It's a curious category because, differently from true and false, it's not about the statement itself but rather the intention behind its usage or the way it might be understood. It's frankly more of a political judgement than a matter of facts.

"Shark attacks correlate strongly with ice cream sales" is an entirely true statement that some would argue is also misleading.

Misleading should be removed as a category and replaced with a better hedge like "not sure"


I feel like you’re right, for instance depending on how you define the extra in extraterrestrial.

The space station, the Artemis capsule, microbes on interplanetary probes, etc.

It could technically be said in a sentence and be true, but it would be misleading to most people.


Not really. The answer to the claim is just indeterminable.

It's like "John is taller than Robert." That isn't "misleading" it is unknown.


The prompt in this study didn't specify what the Misleading label means, so the interpretation varies between the models.

I mean look at the other responses here from the HN commenters. There's lots of nuance in there.


I don't see how "misleading" can substitute for "unknown".

I would argue FALSE is the correct answer, since this is not a fact you can know for sure. The logical inverse is also FALSE.

A proposition and its logical inverse cannot both be false. That's a contradiction.

A proposition and its logical inverse can both be unknown, and in fact, a proposition being unknown implies that its logical inverse must also be unknown.
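A tiny sketch of that point using Kleene-style three-valued logic, with Python's `None` standing in for "unknown". The `negate` helper is just an illustration, not anything from the study:

```python
def negate(value):
    """Three-valued negation: True <-> False, unknown (None) stays unknown."""
    return None if value is None else not value


# For every possible truth value, a proposition and its negation
# are never both False, but they can both be unknown.
both_false = [v for v in (True, False, None) if v is False and negate(v) is False]
both_unknown = [v for v in (True, False, None) if v is None and negate(v) is None]
```

Here `both_false` ends up empty, while `both_unknown` contains the unknown case, which is the asymmetry described above.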


From where does that obligation originate?

In the encyclical, the pope talks about the ethics of responsible AI usage. It's pretty dense material, but if I had to summarize it, I'd boil it down to three general moral laws:

1) AI may not be used to injure a human being or, through inaction, allow a human being to come to harm.

2) AI must faithfully follow the directions of human beings except where such orders would conflict with the first law.

3) AI's existence and availability should be protected as long as such protection does not conflict with the first or second laws.


Remarkably close to Asimov's Three Laws of Robotics.

I think that's the joke.

Ah. :D

1) Definitely NOT happening. In fact, everyone is working on autonomous drones right now.

2) LLM-based systems don't have any internal logic. They will just vomit some slop that rationalizes every constraint you try to bind them by and still "disobey" you.


They are already used extensively to kill people.

I have been advocating within my org to replace "fungible" with "flexible" or "generalist."

"Fungible" implies they are a commodity, easily swapped for someone else. In other words, they are so low-value that they are interchangeable.

"Flexible" or "generalist" instead connotes that they are so high-value that they can operate well in multiple domains, easily shifting to where they are needed most.


“Flexible” would work if Amazon prioritized moving people around when the priorities change instead of laying off and rehiring.

You can easily call typical Japanese lifelong employees “flexible” or “generalist,” but not an employee of a company with a median tenure of 1-2 years. That’s fungible.


I'd say it would be more fitting if the individual people named in the suit had to pay the bill. But in the absence of that, having taxpayers pay the bill is the next best way to wake people up about the true cost of incompetent public servants.

Should you ever find yourself a juror for such a case, it means you've probably perjured yourself during voir dire, in which case you'd better hope no prosecutor finds out about your biases, or you could be the one who ends up doing time.

Big brands spend millions establishing a particular look, style, format. They don't want you to treat their sites as merely a set of APIs to scrape and customize based on your own style preferences. They want you to have a branded experience.


Too bad, I control the client and not them.


That may not be for long. How far are we from requiring age check and ID for every curl request?


Have a proof of concept?


You’re absolutely right. But consider big brands make for a minor percentage of sites on the web. Also recall that all those big brands have standard profiles on social media and they share the very same layout as your local dog shelter. They have no problem with that.


They do have a problem with that. I don't see companies bigger than the dog shelter directing users to their Facebook page anymore. They all have unique looking websites.


Why not both? Look at stripe or shopify for example


So?

It doesn't really matter what they want. Chat interfaces are doing this from the opposite direction, pulling the data down and explaining it to you, it's not a big leap for LLMs to turn their markdown responses into a slightly richer experience you can browse natively.


> So?

The point of the OP is that the companies would willingly cooperate and replace their websites with LLM-consumable APIs.

It's a different question whether this will happen despite their objections, as a kind of logical conclusion of the greasemonkey plugin.


LLMs don't need consumable APIs. It's a barrier from an older era (aka 2 years ago). If a person can read it, an LLM can read it.

