Hacker News | supermatt's comments

Well, I guess that's one way these AI companies can stop us meeting our needs with local models - make it so we just can't get the hardware to run them...

> It's a separate league from closed models entirely.

To be fair, the SOTA models aren't even a single LLM these days. They are doing all manner of tool use and specialised submodel calls behind the scenes - a far cry from in-model MoE.


Probably naive questions:

Does that also mean that Gemini-3 (the top ranked model) loses to mistral 3 40% of the time?

Does that make Gemini 1.5x better, or mistral 2/3rd as good as Gemini, or can we not quantify the difference like that?
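One way to put numbers on this is a rough sketch that assumes the leaderboard behaves like an Elo / Bradley-Terry rating and ignores ties (the thread doesn't say how the ranking is computed, so both are assumptions). The 40% figure is taken from the question above at face value, and the function names are made up for illustration:

```python
import math

def odds_ratio(win_prob: float) -> float:
    """Strength ratio implied by a head-to-head win probability
    under a Bradley-Terry model (assumed, not confirmed by the thread)."""
    return win_prob / (1.0 - win_prob)

def elo_gap(win_prob: float) -> float:
    """Rating difference on the usual 400-point logistic Elo scale."""
    return 400.0 * math.log10(odds_ratio(win_prob))

def win_prob_from_gap(gap: float) -> float:
    """Inverse of elo_gap: expected win probability for a given rating gap."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))

# Winning 60% of the time (i.e. losing 40%) gives odds of 1.5 to 1.
print(odds_ratio(0.6))
print(elo_gap(0.6))
```

Under those assumptions a 60/40 split means the odds are 1.5 to 1 and the rating gap is roughly 70 points. That is a statement about how often one answer is preferred over the other, not about one model being "1.5x as good" in any absolute sense.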


Yes, of course.

Wow. If all the trillions only produce that small of a diff... that's shocking. That's the sort of knowledge that could pop the bubble.

I wouldn't trust LMArena results much. They measure user preference and users are highly skewed by style, tone etc.

You can literally "improve" your model on LMArena by just adding a bunch of emojis.


You must be mistaken…

The maximum personal contribution to public health insurance (GKV) is capped at around 400/m for healthcare (and an additional 200 towards long-term/elderly care). Spouse and children are free if they are unemployed.

https://www.tk.de/resource/blob/2189790/9321e565c304a9cc33bb...

If you are paying more than that then you are already paying for private health insurance (PKV) or private supplementation on top of GKV for some premium coverage.


I am not mistaken. I know how to read my own payslip.

Both my wife and I are employed. We both have GKV and we’re basically paying the maximum rate. That’s around €1100/month each, pre-tax. Half of it comes from my official gross salary and the other half comes from my unofficial gross salary, which is how governments hide the costs of public healthcare. Ultimately it is all part of my salary deductions as far as my employer’s finances are concerned.

Kids are not free: kids’ doctors don’t work for free. They need to be paid, and they’re paid from the contributions that I and other employed fellow citizens pay every month.


“Traditional western reporting” is traditionally a western thing. That’s only 15% of the global population - so if anything it seems biased towards that.

The point is that you can direct it at any of the 1bn+ websites without having to write any scripts.

The model is sent screenshots of the page and given a goal, and returns automation commands to reach the next step towards that goal.
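The loop described above can be sketched roughly as follows. This is a minimal illustration, not any particular product's API: `Action`, `run_agent`, and `scripted_model` are hypothetical names, and the "model" here is a hard-coded script standing in for a real vision model.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Action:
    kind: str          # "click", "type", or "done" (hypothetical command set)
    target: str = ""   # e.g. a selector or screen coordinate
    text: str = ""     # text to type, if any

def run_agent(
    goal: str,
    take_screenshot: Callable[[], bytes],
    ask_model: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Screenshot -> model -> one automation command, repeated until done."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        action = ask_model(goal, screenshot, history)
        history.append(action)
        if action.kind == "done":
            break
        execute(action)
    return history

def scripted_model(goal: str, screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for a real model: replays a fixed script one step at a time."""
    script = [
        Action("click", "#login"),
        Action("type", "#user", "alice"),
        Action("done"),
    ]
    return script[len(history)]
```

The point of the design is that nothing in `run_agent` is specific to one website: the only site-specific knowledge lives in whatever the model infers from the screenshot.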


Hmm.. Sounds like a solution looking for a problem to me.

If I could fine tune it to fill my work time sheets, I would count it as a big win!

if you think about it for more than 5 seconds you'll see a lot of applications, it's not that hard cmon.

I really wanted to like penpot, but when I tried a few months ago, simply navigating between pages (even on the example documents) was causing parts of the document to change in bizarre ways. I didn't want that level of risk with documents I actually cared about, so continued to use figma. I guess it's time to give it another shot.

EDIT: still broken 8 months later :(


I think you should post an issue at this point D:

I raised the issue in the forums at the time, with video captures demonstrating the issue(s).

I think it would help to open an issue on GitHub that makes the following three points explicit in the report:

- steps to reproduce from scratch;

- what you expected to happen;

- what you actually observed (include the screenshot or video capture in addition to a textual description).

Otherwise, you might risk your report being ignored due to a silent misunderstanding about the mismatch between your expectations and the actual results.


At the time I wasn't sure if it was PEBCAK, which is why I started a discussion in the forums. As there were no replies, I received no notifications, and so I forgot all about it.

If anyone is interested in opening a bug report you can see the issue here: https://imgur.com/a/hZ1ja9o


Personally, I do not understand why you think there is a bug from this screen capture alone. Maybe it is because I am not that familiar with penpot and figma, but still, I do not find it obvious.

This is why it's important to describe explicitly the three points in text:

- steps to reproduce;

- what you expected to happen;

- what actual result you observe instead.

Something that might be obvious to you but isn't for others will just be silently ignored most of the time.

EDIT: I now see the problem after reading your other reply above:

https://news.ycombinator.com/item?id=46064757#46069546

This is why it's important to describe explicitly the difference between what you expected and what you observed. I swear I did not see the change in button width before reading the linked comment.


> This is why it's important to describe explicitly

That is a fair point. I will take it on board when giving people screenshots and videos of bugs in future.

> I did not see the change in button width

There are actually a lot more visual changes than just the button, but I will leave that to the reader as an exercise in spot-the-difference ;)


> There's actually a lot more visual changes than that just the button, but I will leave that to the reader as an exercise in spot-the-difference ;)

This is fair. But issues like this will never get my attention in general because I don’t have time to do this exercise - I would much rather have it all spelled out. Even if there are a bunch of related issues they won’t get fixed in a single PR, it likely will be multiple.

I guess my point is that if you really want OSS projects to improve, the issue submitter can’t just ask the maintainer to “figure it out”. It totally works this way in the corporate world though (IME).

Edit: I’m sorry to have jumped to conclusions. Leaving my comment up for accountability.


I didn’t ask the maintainer to “figure it out”. I posted a thread in the forum with multiple videos to start a discussion.

People here have stated I should have filed on GitHub, and because I don’t want to link my GitHub to this account I suggested someone else do it.

That was 6 hours ago, and people are still commenting about my lack of a suitable report rather than actually reporting it correctly themselves - as is evident from the lack of a new issue on GitHub.


I’m sorry for jumping at you like that.

No problem :)

> I swear I did not see the change in button width before reading the linked comment.

I didn’t either! I stared at that gif for a few minutes and I couldn’t tell what the problem is (or what to look for). It wasn’t until you said “changing button width” I knew where to focus my attention.


"Content not available in your region"

So, given that Penpot appears to mostly be developed in the EU, you'd need to fix that part first.


I’m not sure what you are referring to. If you mean the video link, I am in the EU and can see it.

Doesn't work in Austria, doesn't work from the UK, doesn't work from Finland.

I am physically in the EU and it's working for me.

I just checked using nordVPN for Austria and Finland and it's working there too, so maybe you have some other issue going on?

I am assuming you are in the UK, as Imgur are specifically blocking the UK: https://help.imgur.com/hc/en-us/articles/41592665292443-Imgu...

Imgur isn't my site, and I don't vote in the UK, so I'm not sure how you expect me to resolve that.



Not being able to view imgur in the UK is a pain.

I hate how every time someone even talks about an issue with an open source project, some smart alec replies "well did you raise an issue?" - or worse - "did you send a PR to fix it?".

We are all very aware how bug reporting works. And user criticism of bugs isn't somehow invalidated just because the users didn't go to the sometimes very large effort to report bugs.

I wouldn't have reported this bug either. If the example documents are getting corrupted just by navigating them that indicates that it's just a really buggy project (corroborated by other comments here) that I'm not even going to use, so why would I spend my time working on it?


I opened an issue based on the discussion here and it didn't take much time or effort.

(It was one of those form-based issue templates that requires you to explicitly list out Steps to Reproduce, Expected behavior, Actual behavior, OS version, etc. which IMO causes slightly more friction for anyone who knows how to put together a good bug report, but I've also seen enough poorly-specified issues to know that it's necessary sometimes)


"And user criticism of bugs isn't somehow invalidated just because the users didn't go to the sometimes very large effort to report bugs."

Yes, it is.


No it isn't.

Yeah, just let everyone else do the work while you sit back and gripe.

I can see both sides of the dilemma and I don't necessarily like when a maintainer defaults to "open a PR" but asking for a reproducible issue wherever requested is not too much to ask.

With a PR I understand not wanting to put the effort in as it may not be merged. But offering up a reproducible example on the correct forum is the least you could do. If you want the problem fixed that's the best way forward.


> offering up a reproducible example on the correct forum is the least you could do

I suggested someone do that 8 hrs ago:

https://news.ycombinator.com/item?id=46069471

So far no takers. Just people saying what they would do instead of actually doing it :)


I've done it:

https://github.com/penpot/penpot/issues/7850

Thanks for sharing all the details about the issue, and shame on all the armchair critics :D


Thank you :)

Same experience here. I tried it a few months ago and even with simple use I ran into so many bugs & issues that I quickly gave up. I'm willing to learn a new UI, but the tool must be reliable, and it simply was not.

Hopefully they've improved a lot recently?


Same, but the lack of Sketch import was a deal-breaker. It shouldn't have launched without that. Has that been fixed?

I've loaded an example document and do not see what you mean when navigating between pages. A problem like that should be extremely jarring and it is very hard to believe it would be ignored.

> A problem like that should be extremely jarring

Agreed - I don't see how it's not glaringly obvious to anyone who uses the app:

https://imgur.com/a/hZ1ja9o


Came with receipts lol - hopefully they can repro and fix this, but the fact it was overlooked for 8 months kind of hints at how few people are using it.

Yeah you can really see the resize comparing the before and after. https://jpst.it/4KgSB


Wow you're right. I've tried again with the same wireframing kit and it happened to me too! That's unbelievable.

Unrelated but imgur is basically malware at this point. I had to click through so many layers of nagging popups (including a “don’t support us” button, then a severely low-contrast “view in safari” button on a dialog explicitly designed to get me to accidentally click the app link), then when I finally got to your picture, any sort of interaction with the page whatsoever, including pinch-zooming to see the image, just took me away to a different page altogether.

I sincerely hate imgur and hope the whole site goes bankrupt, and I can’t stand it when anyone links to them.


Yeah, imgur had very simple & humble origins and fostered a surprisingly active, reddit-like community (though I'm sure imgurians would resent that particular comparison), and then holy shit it just turned into a bizarrely bloated overstuffed hodgepodge of fire-garbage. I just looked at the homepage for the first time in forever and—wait, what? "Arcade"?

Bleh.


Can you suggest a low-effort alternative I can use in future?


Yeah can't see that lasting. I wish someone would make one with limited adverts that just pays for the hosting and moderation costs. How hard can it be?

I mean nothing is guaranteed to last in the future, but catbox.moe has been around for like a decade already.

https://alternativeto.net/software/imgur/

Here's a comparison of alternatives


I didn’t experience any of that, using an iPad. There were ads above and to the right of the OP video, but it displayed the video with no popups.

I'm glad that at times like these, I switched to Brave browser on all of my platforms (desktop and mobile). I can't recommend it enough.

Also it's not accessible in the UK without workarounds that I'm not going to bother with for that dumpster fire of a site.

strange, I got no pop ups at all

Genuinely asking - what's the issue? I don't see it.

I click to navigate to the "Examples" page (I am gesturing with my mouse to circle around a bit I want you to look at). Then I navigate to "Main components", and back to "Examples", and the content in that area has changed. For example, the button has changed to half the original width.

It's that very bottom button you're referring to, right?

That's the one I specifically mention, sure, but there are many more changes to the page overall if you compare the before and after.

There was a kickstarter for a $3000 SLS printer a while ago. Formlabs (who have over 50% of the SLS market) promptly bought the company and shut down the kickstarter - and gave backers a $1000 coupon towards their $30000 SLS printers...

That pissed me off so much.

I was one of the backers and I was sooooo looking forward to an affordable home SLS printer. They'd done some incredible engineering, too, in service of getting the price point down to where it was.

Scaling up was going to be a massive challenge for them, but damn, I wish they'd tried instead of phoning it in early.

(Mind you, I'm sure Formlabs paid them handsomely. Would I make the same decision under the same circumstances? I honestly don't know. So far be it from me to judge them, but man do I wish someone would do something about Formlabs' ridiculous prices and monopoly over that space.)


> I wish they'd tried

I'm sure they've tried. From what I recall they've had serious reliability issues on the preview units. So I'd be skeptical if it would have even turned into a successfully delivered Kickstarter. They would have to deliver on that first before even concerning themselves with how to scale up.

So maybe they didn't even get handsomely paid in the acquisition, but were given an option to save face.


ZFS is about end-to-end integrity, not just redundancy. It stores checksums of data when writing, checks them when reading, and can perform automatic restores from mirror members if mismatches occur. During writes, ZFS generates checksums from blocks in RAM. If a bit flips in memory before the block is written, ZFS will store a checksum matching the corrupted data, breaking the integrity guarantee. That’s why ECC RAM is particularly important for ZFS - without it you risk undermining the filesystem’s end-to-end integrity. Other filesystems usually lack such guarantees.
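A toy illustration of that failure mode, as a sketch only: this is not ZFS code or its on-disk format, SHA-256 is just a stand-in checksum, and the helper names are invented for the example.

```python
import hashlib
from typing import Tuple

def checksum(block: bytes) -> str:
    """Stand-in checksum (SHA-256); real filesystems may use other algorithms."""
    return hashlib.sha256(block).hexdigest()

def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Return a copy of the block with a single bit inverted."""
    data = bytearray(block)
    data[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(data)

def write_block(block_in_ram: bytes) -> Tuple[bytes, str]:
    """Simulated write: the checksum is computed from whatever is in RAM."""
    return block_in_ram, checksum(block_in_ram)

def verify(stored: bytes, stored_sum: str) -> bool:
    """Simulated read: recompute the checksum and compare."""
    return checksum(stored) == stored_sum

original = b"important data block"

# Case 1: corruption on disk AFTER the write is detected on read.
stored, stored_sum = write_block(original)
print(verify(flip_bit(stored, 3), stored_sum))   # mismatch is caught

# Case 2: a bit flips in RAM BEFORE the write. The checksum is computed
# over the already-corrupted bytes, so verification passes on bad data.
stored, stored_sum = write_block(flip_bit(original, 3))
print(verify(stored, stored_sum), stored == original)
```

In the second case the checksum faithfully protects the wrong bytes, which is the gap ECC memory is meant to close.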

It appears that you are an American who has conveniently forgotten about FISA, EARN IT, the CLOUD Act, the PATRIOT Act, LAED, etc, etc.

You realise this hasn’t passed, right? It’s a proposal.

Seriously you should look to yourself and what you guys have actually passed into law before you start throwing stones at others.


I’m not in the US, and glad to no longer be in the EU.

My point is that there is zero chance of this unpopular legislation being repealed once the EU have forced it through.

I would rather take my chances in a sovereign parliamentary democracy. I know the UK has draconian anti privacy laws on the statute books and have retained a lot of EU rules by default. But we still cling to the belief that parliaments cannot bind the hands of future parliaments, and we expect manifestos to be published and debated prior to elections. A lot of this has been pushed to the background while the UK has been governed by incompetent untrustworthy technocrats cut from the same cloth as the Eurocrats they yearn to be, but a political tsunami is on the way. You can feel it. The globalist establishment will rage against it as ghastly Populism, but I see it as a renaissance of Democracy. It gives me hope that unpopular laws can be amended or repealed.


> I’m not in the US...

You seem to be a different person than I was replying to

> My point is...

No, that is your opinion. There is no evidence that this will ever be "forced" through in any form that would erode current rights.

> I would rather take my chances...

By all means do, although you may want to brush up on how legislation in your own country works...


I do know how it works, I also know about U-turns, backbench revolts, opinion poll panic, doorstep canvassing hostility, and voter backlashes. All ways in which the electorate can have an impact.

Good luck getting your EU Commission to change their mind about something they really want. The "right" voting buttons will be pushed eventually. But I suspect you already know that.


> something they really want.

And what exactly is that?

The widely discussed "Chat control" proposal has already been withdrawn. What remains is much narrower in scope. For instance, it no longer includes mandatory client-side scanning.

Over time, these kinds of proposals usually get watered down further until they either respect individual rights fully or fade away entirely, as has consistently happened with privacy-invasive initiatives in the past.

The only exception I can think of was the "Data Retention Directive" that was (eventually!) rejected by the ECJ - only for the UK to reimplement it after Brexit...


> For instance, it no longer includes mandatory client-side scanning.

It's still unclear whether it really is removed. They turned scanning into something voluntary, and then said big chat providers must do _something_ to monitor abuse. It seems _very_ likely that the regulatory bodies/courts will decide that the bar they must clear to meet this "something" is client-side scanning.

And I agree that the regulation still has a lot of hoops to jump through to be implemented, and will likely be further tweaked. But it's _very_ important to keep raising our concerns, otherwise there will be no pressure to change the currently problematic legislation.


> It's still unclear

EVERYTHING is unclear, because they have literally only just received a negotiating mandate to discuss the idea. This isn't even a proposal that will reach parliament in its current form - it has undergone near zero scrutiny because it simply hasn't progressed far enough to be scrutinised.

> It seems _very_ likely... client-side scanning

Actually, it seems almost completely UNLIKELY due to the protections afforded to EU citizens. The last time legislation was passed that eroded privacy, it was struck down by the ECJ (albeit many years later).

> it's _very_ important to keep raising our concerns

I totally agree


In the USA they have the 1st Amendment; in the EU we don't have it, so these things are not just about aiding law enforcement in the traditional sense - this is for Chinese-style censorship.

The EU has near equivalent rights to the 1st amendment under the charter of fundamental rights of the EU and the ECHR, with very specific exclusions and reasons to permit suspension - whereas in the USA those exclusions aren’t codified but decided by a court on an ad-hoc basis (defamation, incitement, true threats, obscenity, fraud, etc).

Basically, the EU gives you the rules up front and the USA decides after the fact.

