More

devit · on March 19, 2025

It's cool, but how do you know it won't kill you, esp. given that it's human nature to stop paying attention since it seems to do all the work itself (even if eyes are kept on the road), and that it seems developed by a team with limited resources, using cameras only and apparently with pervasive "end-to-end" use of neural networks? (which, of course, offer no guarantee whatsoever regarding their behavior beyond what statistical testing can suggest)

hasperdi · on March 19, 2025

The driver monitoring camera works well. It beeps loudly when it detects inattention ie eyes off the road.

And bear in mind, this is an ADAS level 2. It's not a totally hands off system. I have to be prepared to take control at all times due to various reasons eg. turn beyond its steering limit, roundabouts, traffic lights, constructions. Think of it as the better version of the car's line keeping assist.

I found that it lightens the cognitive load of driving, especially useful when driving longer distances.

aitchnyu · on March 19, 2025

And does it check its "mirrors"?

devit · on March 18, 2025

You can just zero out the pointer variable with __attribute__((cleanup)) before returning its value (from a temporary copy of course).

devit · on March 18, 2025

That's completely asinine since it can't be made to work properly with inlining (including LTO), architectures that use a shadow stack or don't use a frame pointer, and also ridiculously inefficient and requiring assembly code for all architectures.

Use C++ or __attribute__((cleanup)) instead.

epolanski · on March 18, 2025

Maybe it is.

Or maybe the asinine "thing", is some folks lack of text comprehension skills that can't distinguish an experiment from a best practice recommendation despite a title and content that clearly does not invite that.

devit · on March 15, 2025

The use for donations could be for a single person whose job is to check the upstream code for any antifeatures (telemetry, ads, product placements, online service defaults, Google as paid default search engine, etc.) not in the user's interest and revert them, as well as bundling any useful extension like uBlock Origin and verifying them.

That needs minimal effort compared to building a browser, because it doesn't involve doing any of the hard work, but just removing code that serves to line the pockets of those doing most of the work at the expense of the user.

glenstein · on March 15, 2025

Do I understand correctly that you believe Mozilla doesn't currently have the resources necessary to do that from their $500MM in annual revenue? It sounds like you are talking about an ombudsman or something, which highlights my point here, which is that these are philosophical criticisms disguised as commentary on raising revenue.

Also the mission you are describing sounds like something that you might expect from a Chromium browser that has to regularly revert Google-driven changes. At Mozilla, they already own the browser and they could account for this in their ground-level philosophy.

devit · on March 17, 2025

They could, but they don't want to do that because they get paid by Google to not do it or because those actions get them money in some other way (from advertisers or whatever), or because they think only power users like some features.

glenstein · on March 17, 2025

Firefox publishes their 990 form which discloses all their sources of revenue and Google does not pay Firefox for any of the things you described. Also, it feels kind of nonsensical to suggest that it would have a development strategy of building out their ad tech and simultaneously reverting it, and I don't see how explanations about them wanting or not wanting to do it make that proposed approach for any more sense.

devit · on March 17, 2025

They pay them for making Google the default search engine, and it is hypothesized that the payment may also influence them to not provide ad-blocking by default and possibly other things that are not beneficial for Google's business.

glenstein · on March 18, 2025

>and it is hypothesized

By whom and on what basis? Those are non-optional questions that should have strong answers as preconditions to you posting about it, if the objective is to offer something more than simple bullshitting (in the Harvey Frankfurt sense of indifference to truth).

This also doesn't answer like 90% of my concerns from my previous comments. Who has ever intentionally had a software development approach of having one team develop features and another person revert those features, working in tandem? And why would they need donations that are 0.20% of what they already get in revenue to finance it? I feel like you're just improv riffing here.

devit · on March 19, 2025

The donations would go directly to the individual doing this with no relationship with Mozilla (e.g. with Patreon or similar), not to Mozilla itself.

devit · on March 13, 2025

The pioneer of AI alignment.

devit · on March 12, 2025

"styling on your end" means the _user_ (not the website designer/developer) deciding how controls should be styled and that's much easier if everything uses the same controls because you only need to specify it once and don't need a library to hook all possible UI frameworks that exist.

rerdavies · on March 13, 2025

Users deciding how input controls are styled? Who exactly asked for THAT?

ErikBjare · on March 13, 2025

Heard of userstyles?

devit · on March 11, 2025

Seems like it might be more effective to use the LLMs to write a program that plays Factorio rather than having them pick the next action given a game state.

Also in general I think the issue with Factorio is that you can just find an "optimal" factory design and build order and just follow it every time; perhaps starting with a suboptimal building layout already present and restrictions like being unable to change them or build others of the same type could help.

noddybear · on March 11, 2025

This is exactly how FLE works, the agent writes a program that executes its policy.

I think you bring up a good point, we could create tasks where the goal is to optimise a static factory, starting from a kernel of functionality like 'steam engine power supply' etc.

devit · on March 11, 2025

But it seems like it's being used to generate short snippets that in the examples seem to be equivalent to command lists as opposed to generating a full program that actually plays the whole game by itself.

The model could also then be fed back the results of running the program and iteratively change it as needed.

I.e. prompt first with "Write a program that can play Factorio automatically given an interface <INTERFACE SPECIFICATION> and a set of goals in <GOAL FORMAT>, and produces text output that can help determine whether the program is working correctly and whether tasks are performed efficiently and goals are reached as fast as possible"

And then with "the program was run and produced this text output: <TEXT OUTPUT> Determine any possible bugs, avenues of improvements or missing output information and modify the program accordingly, printing the new version".

And iterate until there doesn't seem to be an improvement anymore.

noddybear · on March 12, 2025

If I understand you correctly, this approach is sort of supported in FLE - the agents can create functions that encapsulate more complex logic. However, interaction is still synchronous/turn-based. I think to do what you propose, you will need to create event listeners that can trigger the agents program whenever appropriate.

devit · on March 10, 2025

The "person one" vs "person two" bias seems trivially solvable by running each pair evaluation twice with each possible labelling and the averaging the scores.

Although of course that behavior may be a signal that the model is sort of guessing randomly rather than actually producing a signal.

harrisonjackson · on March 10, 2025

Agreed on the second part. Correcting for bias this way might average out the scores but not in a way that correctly evaluates the HN comments.

The LLM isn't performing the desired task.

It sounds possible to cancel out the comments where reversing the labels swaps the outcome because of bias. That will leave the more "extreme" HN comments that it consistently scored regardless of the label. But that may not solve for the intended task still.

rahimnathwani · on March 10, 2025

  The LLM isn't performing the desired task.

It's 'not performing the task', in the same way that the humans ranking voice attractiveness are 'not performing the task'.

I wouldn't treat the output as complete garbage, just because it's somewhat biased by an irrelevant signal.

devit · on March 8, 2025

It's bullshit according to previous discussion on HN (https://news.ycombinator.com/item?id=43301369).

Just a bunch of undocumented hardware registers/commands, no remotely accessible backdoor.

devit · on March 8, 2025

The article is wrong on the claim that Rust panic are harder to catch than C++ exceptions: as long as you don't configure panic=abort you can catch them easily with catch_panic and they are generally implemented using the same runtime mechanism (i.e. Rust panics usually effectively are C++ exceptions for most purposes).

01HNNWZ0MV43FF · on March 8, 2025

Hm, it just feels wrong though. Panics feel bigger than exceptions

int_19h · on March 8, 2025

That's a cultural thing (and a good one, too).

The one big difference tho is that in Rust, the end user of the library - i.e. the person compiling the binary of which this library is a part - can decide at that point whether panics unwind like C++ exceptions, or just abort immediately. Conversely, this means that the library should never assume that it can catch panics, even its own internal ones, because it may be compiled with panic=abort.

So it's kinda like C++ exceptions, but libraries can only throw, never catch.