samwillis's comments | Hacker News

This is a great comparison, but it depends so much on what sort of website or web app you are building. If you are building a content site, with the majority of visitors arriving without a hot cache bundle size is obviously massively important. But for a web app, with users regularly visiting, it's somewhat less important.

As ever on mobile it's latency, not bandwidth, that's the issue. You can very happily transfer a lot of data, but if that network is in your interactive hot path then you will always have a significant delay.

You should optimise to use the available bandwidth to solve the latency issues, after FCP. Preload as much data as possible such that navigations are instant.
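
As a rough sketch of that preload-into-a-cache idea (hypothetical helper names; the loader is injected so the sketch isn't tied to fetch):

```typescript
type Loader = (url: string) => Promise<unknown>;

const cache = new Map<string, Promise<unknown>>();

// Kick off a load for data we expect to need soon; the loader is injected,
// e.g. (u) => fetch(u).then((r) => r.json()).
function prefetch(url: string, loader: Loader): void {
  if (!cache.has(url)) cache.set(url, loader(url));
}

// Later navigation: resolves from the warm cache when the prefetch already
// landed, otherwise falls back to a normal load.
function load(url: string, loader: Loader): Promise<unknown> {
  prefetch(url, loader);
  return cache.get(url)!;
}
```

Called after FCP (e.g. from an idle callback), this trades spare bandwidth for removing the network from the interactive hot path.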


Software is very easy to bloat, expand scope, and grow to do more than really needed, or just to release apps that are then forgotten about.

Hardware is naturally limited in scope due to manufacturing costs, and doesn't "grow" in the same way. You replace features and components rather than constantly add to them.

Apple needs someone to come in and aggressively cut scope in the software, removing features and products that are not needed. Pare it down to something manageable and sustainable.


> pare down products and features

macOS has way too many products but far too few features. In terms of feature-completeness, it's already crippled. What OS features can macOS afford to lose?


I would say it's less about losing and more about focus. Identify the lines of business you don't want to be in and sell those features to a third party who can then bundle them for $1/$10/$20. A $2T company just doesn't care, but I would bet that those excised features would be good enough for a smaller software house.

(I have the same complaint about AWS, where a bunch of services are in KTLO and would be better served by not being inside AWS)


macOS has like no features already, and they keep removing more.


If you think hardware can't bloat, I suggest you look into the history of Intel's attempts to replace x86. Or the VAX. Not to mention the many minicomputer companies that built ever more complex minis. Or the supercomputer startup bubble.


This record and replay trick is very similar to what I recently used to implement the query DSL for Tanstack DB (https://tanstack.com/db/latest/docs/guides/live-queries). We pass a RefProxy object into the where/select/join callbacks and use it to trace all the property accesses and expressions that are performed. As others have noted, you can't use JS operators to perform actions, so we built a set of small functions that we could trace (eq, gt, not, etc.). These callbacks are run once to trace the calls and build an IR of the query.

One thing we were surprised to be able to do is trace the JS spread operation, as spread is one of the rare operations you can intercept in JS.
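
A minimal sketch of the idea (hypothetical, not the actual Tanstack DB internals): run the callback once against a Proxy, record every property access and every call to a comparator like eq(), and keep the recorded ops as the query IR instead of evaluating anything.

```typescript
type Op =
  | { kind: "prop"; path: string[] }
  | { kind: "spread"; path: string[] }
  | { kind: "eq"; left: unknown; right: unknown };

const trace: Op[] = [];

function refProxy(path: string[] = []): any {
  return new Proxy({}, {
    get(_target, prop) {
      const next = [...path, String(prop)];
      trace.push({ kind: "prop", path: next });
      return refProxy(next); // chainable: row.user.id keeps tracing
    },
    // Object spread ({...row}) calls [[OwnPropertyKeys]], so it is one of
    // the few operators a Proxy can observe.
    ownKeys() {
      trace.push({ kind: "spread", path });
      return [];
    },
  });
}

// "Fake operator": records the comparison instead of performing it.
function eq(left: unknown, right: unknown): Op {
  const op: Op = { kind: "eq", left, right };
  trace.push(op);
  return op;
}

// Tracing pass: the callback runs once and never sees real data.
const row = refProxy();
eq(row.user.id, 42);
const projected = { ...row.user };
```

After this runs, `trace` holds the property paths, the eq comparison, and the spread, which a query builder could then compile.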

Kenton, if you are reading this, could you add a series of fake operators (eq, gt, in etc) to provide the capability to trace and perform them remotely?


Yes, in principle, any sort of remote compute we want to support, we could accomplish by having a custom function you have to call for it. Then the calls can be captured into the record.

But also, apps can already do this themselves. Since the record/replay mechanism already intercepts any RPC calls, the server can simply provide a library of operations as part of its RPC API. And now the mapper callback can take advantage of those.

I think this is the approach I prefer: leave it up to servers to provide these ops if they want to. Don't extend the protocol with a built-in library of ops.


Ah, yes, obviously. This is all very cool!


Just a side note - reading https://tanstack.com/db/latest/docs/guides/live-queries#reus... I see:

    const isHighValueCustomer = (row: { user: User; order: Order }) => 
     row.user.active && row.order.amount > 1000
But if I'm understanding the docs correctly on this point, doesn't this have to be:

    const isHighValueCustomer = (row: { user: User; order: Order }) => 
      and(row.user.active, gt(row.order.amount, 1000))


Yep, that's an error in the docs.


As I understand it, it's basically how PyTorch works. A clever trick, but also super confusing: while it seems like normal code, as soon as you try to do something that you could totally do in normal code, it doesn't work:

  let friendsWithPhotos = friendsPromise.map(friend => {
    return {friend, photo: friend.has_photo ? api.getUserPhoto(friend.id) : default_photo};
  });
Looks totally reasonable, but it's not going to work properly. You might not even realise until it's deployed.


> Given an image or a 3D mesh, SGS-1 can generate CAD B-Rep parts in STEP format. Unlike all other existing generative models, SGS-1 outputs are accurate and can be edited easily in traditional CAD software

This is a game changer, all the models before that output meshes were a toy at best. Super excited to see where they can take this.

I wonder if the next step is a STEP -> proprietary format (SolidWorks, NX, etc.) model that can infer constraints.


I agree, even if it just does a decent job of creating sane STEP geometry out of 3D scan meshes it would be a huge win.


I’d pay a subscription for that.

There are so many hobbyist 3D printing things I’d like to do around my abode by taking some existing piece and tweaking it. Creating a model for a one off part is pretty tedious though.


I think the bar is even lower than that: generating sane STEP geometry from STL files generated by other CAD software is already a huge win. Autodesk Fusion pretends to be able to do that, but it only works for easy demos.


Thanks for this! We are actively considering this in our next model. What did you have in mind specifically?


Really interesting post, and well done on being so clear about the apples-to-oranges nature of the benchmark. It's fascinating to see the difference the distinct architectures make.

Did you run any tests with the new transaction system in ClickHouse? It would be super interesting to see how it affected the batch updates.


If you're referring to the experimental ACID transactions, we didn't test them; I'll check what state they're in and see if it's worth including yet. If not, we'll for sure come back to them later and do a similar test!


This is a really great write up!

I work at Electric and started the PGlite and now Tanstack DB projects. The issues mentioned with PGlite are one of the major motivating factors behind Tanstack DB. We are taking those learnings and building what we believe is the missing client-side datastore that is "sync native" and completely backend agnostic. Also, being JS rather than WASM solves many of the slower-than-ideal query performance issues, and has enabled us to build an incremental query engine for it.

It's also important to note that Electric doesn't require PGlite on the client, far from it - it's essentially a "protocol first" sync engine, you can use it to write into and maintain any client side store.

This solution by the OP, diffing based on modified data, is ideal for a huge number of apps, and something that we intend to build into Tanstack DB so you can easily sync with no additional infrastructure.

SQLite (or PGlite) in the browser is awesome, and has the advantage over Tanstack DB at the moment of having persistence (it's on our roadmap), but they are also somewhat chunky downloads. For many local-first apps that's not a problem though.


I built my own offline capable, multiplayer capable sync engine with pglite and electric https://github.com/evelant/synchrotron

It is opinionated and not for every use case, also very experimental, but you might find some of the ideas interesting.


Oh cool! I'll absolutely take a look.


I'd love to hear what you think of the idea. I really like what electric is building. I'm actually hacking on getting pglite to run on react-native right now =)


Do drop into the discord and let me know who you are there. Would love to hear your take on react-native support.


Oliver is doing awesome work here. A few interesting points:

- Porffor can use typescript types to significantly improve the compilation. It's in many ways more exciting as a TS compiler.

- There's no GC yet, and likely will be a while before it gets any. But you can get very far with no GC, particularly if you are doing something like serving web requests. You can fork a process per request and throw it away each time reclaiming all memory, or have a very simple arena allocator that works at the request level. It would be incredibly performant and not have the overhead of a full GC implementation.

- many of the restrictions that people associate with JS are due to VMs being designed to run untrusted code. If you compile your trusted TS/JS to native you can do many new things, such as use traditional threads, fork, and have proper low level memory access. Separating the concept of TS/JS from the runtime is long overdue.

- using WASM as the IR (intermediate representation) is inspired. It is unlikely that many people would run something compiled with Porffor in a WASM runtime, but the portability it brings is very compelling.

This experiment from Oliver doesn't show that Porffor is ready for production, but it does validate that he is on the right track, and that the ideas he is exploring are correct. That's the important takeaway. Give it 12 months and exciting things will be happening.


I'm very excited by Porffor too, but a lot of what you've said here isn't correct.

> - Porffor can use typescript types to significantly improve the compilation. It's in many ways more exciting as a TS compiler.

Porffor could use types, but TypeScript's type system is very unsound and doing so could lead to serious bugs and security vulnerabilities. I haven't kept track of what Oliver's doing here lately, but I think the best and still safe thing you could do is compile an optimistic, optimized version of functions (and maybe basic blocks) based on the declared argument types, but you'd still need a type guard to fall back to the general version when the types aren't as expected.

This isn't far from what a multi-tier JIT does, and the JIT has a lot more flexibility to generate functions for the actual observed types, not just the declared types. This can be a big help when the declared types are interfaces, but in an execution you only see specific concrete types.
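
The guard-then-fallback pattern described above can be sketched roughly like this (hypothetical names, not Porffor's actual output): TypeScript types are erased at runtime, so the "compiled" fast path has to re-check them before trusting the declaration.

```typescript
// General version: full JS semantics (numeric add, string concat, ...).
function addGeneric(x: unknown, y: unknown): unknown {
  return (x as any) + (y as any);
}

// Optimistic version specialized on the declared types; in a real compiler
// this could lower to a raw machine add with no type checks inside.
function addOptimized(x: number, y: number): number {
  return x + y;
}

// Entry point: a runtime type guard picks the fast path only when the
// declared types actually hold, and deopts to the general version otherwise.
function add(x: unknown, y: unknown): unknown {
  if (typeof x === "number" && typeof y === "number") {
    return addOptimized(x, y);
  }
  return addGeneric(x, y); // the declared types were a lie
}
```

The guard is exactly why unsound declarations can't cause memory unsafety here; they just cost you the fast path.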

> or have a very simple arena allocator that works at the request level.

This isn't viable. JS semantics mean that the request handling path can generate objects that are held from outside the request's arena. You can't free them or you'd get use-after-free problems.

> - many of the restrictions that people associate with JS are due to VMs being designed to run untrusted code

This is true to some extent, but most of the restrictions are baked into the language design. JS is a single-threaded non-shared memory language by design. The lack of threads has nothing to do with security. Other sandboxed languages, famously Java, have threads. Apple experimented with multithreaded JS and it hasn't moved forward not because of security but because it breaks JS semantics. Fork is possible in JS already, because it's a VM concept, not a language concept. Low-level memory access would completely break the memory model of JS and open up even trusted code to serious bugs and security vulnerabilities.

> It is unlikely that many people would run something compiled with Porffor in a WASM runtime

Running JS in WASM is actually the thing I'm most excited about from Porffor. There are more and more WASM runtimes, and JS is handicapped there compared to Rust. Being able to intermix JS, Rust, and Go in a single portable, secure runtime is a killer feature.


> I haven't kept track of what Oliver's doing here lately

Please do go and check on the current state of using types to inform the compiler (I'm not incorrect)

On the arena allocator, I wasn't clear enough; as stated elsewhere, this was in relation to having something similar to isolates, each having a memory space that's cleaned up on exit.

Python has almost identical semantics to JS, and has threads - there is nothing in the ECMAScript standard that would prevent them.


It is absolutely true that it is unsafe to trust TypeScript types. I've chatted briefly with Oliver on socials before and he knows this. So I am a bit confused by this issue: https://github.com/CanadaHonk/porffor/issues/234 which says "presume the types are good and have been validated by the user before compiling". This is just not a thing that's possible. Types are often wrong in subtle ways. Casts throw everything out the window.

Dart had very similar issues and constraints and they couldn't do a proper AOT compiler that considered types until they made the type system sound. TypeScript can never do that and maintain compatibility with JS.

Isolates are already available as workers. The key thing is that you can't have shared memory, otherwise you can get cross-isolate references and have all the synchronization problems of threads.

And ECMAScript is simply just specified as a single-threaded language. You break it with shared-memory threads.

In JS, this always logs '4'. With threads that's not always the case.

    let x = 4;
    console.log(x); // with shared-memory threads, another thread could have changed x by now


> It is absolutely true that it is unsafe to trust TypeScript types... This is just not a thing that's possible.

Well... unsafe and impossible aren't quite the same thing. I guess this is possible if you throw out "safe" as a requirement?


Based on how much imported libraries are relied upon, it makes sense to treat everything as untrusted. Unless you write every line yourself/in-house, code should be considered untrusted.

I would be curious which attack vectors change or become safe after compiling though.


The point of the js engine sandbox is to protect the user in the browser - it's completely redundant on the server. Supply chain attacks are real, but only Deno has tried to fix that through permissions/rules.

I don't think anything changes with compile to native on the server.


Totally disagree. A spec-compliant JS engine has to support the features that allow vulnerabilities like prototype pollution, which can be exploited through user input alone.


Also none of the third party code will be thread safe. Hell, some of it isn’t even reentrant.


> many of the restrictions that people associate with JS are due to VMs being designed to run untrusted code. If you compile your trusted TS/JS to native you can do many new things, such as use traditional threads, fork, and have proper low level memory access. Separating the concept of TS/JS from the runtime is long overdue.

This is just outright wrong. JS limitations come from lots of things:

1. The language has almost zero undefined behavior by design. Code will essentially never behave differently on different platforms.

2. JS has traditional threads in the form of web workers. This interface exists not for untrusted code but because of thread safety. That's a language design decision, like channels in Go, rather than a sandboxing consideration.

3. Pretty much every non-browser JS runtime has the ability to fork.

4. JS is fully garbage collected, of course you don't get your own memory management. You can use buffers to manage your own memory if you really want to. WASM lets you manage your own memory and it can run "untrusted" code in the browser with the WASM runtime; your example just doesn't hold water. There's no way you could fiddle with the stack or heap in JS without making it not JS.

5. The language comes with thirty years of baggage, and the language spec almost never breaks backwards compatibility.

Ironically Porffor has no IO at the moment, which is present in literally every JS runtime. It really has nothing to do with untrusted code like you're suggesting.

> You can fork a process per request and throw it away each time reclaiming all memory, or have a very simple arena allocator that works at the request level. It would be incredibly performant and not have the overhead of a full GC implementation.

You also must admit that this would make Porffor incompatible with existing runtimes. Code today can modify the global state, and that state can and does persist across requests. It's a common pattern to keep in-memory caches or to lazily initialize libraries. If every request is fully isolated in the future but not now, you can end up with performance cliffs or a system where a series of requests on Node return different results than a series of requests on Porffor.

As for arena allocation, this makes it even less compatible with Node (if not intractable). It means you can't write (in JS) any code that mutates memory that was initialized during startup. If you store a reference to an object in an arena in an object initialized during startup, at the end of the request when the arena is freed you now have a pointer into freed memory.

How do you tell the developer what they can and cannot mutate? You can't, because any existing variable might be a reference to memory initialized during startup. Your function might receive an object as an argument that was initialized during startup or one that wasn't, and there's no way to know whether it's safe to mutate it.

Long story short, JS must have a garbage collector to free memory, or it's not JS.

> It is unlikely that many people would run something compiled with Porffor in a WASM runtime, but the portability it brings is very compelling.

Node (via SEA in v20), bun, and deno all have built in tooling for generating a self-contained binary. Granted, the runtime needs to work for your OS and CPU, but the exact same thing could be said about a WASM runtime.

And of course there are hundreds of mature bundlers that can compile JS into a single file that runs in various runtimes without ever thinking about platform. It's weird to even consider portability of JS as a benefit because JS is already almost maximally portable.

> This experiment from Oliver doesn't show that Porffor is ready for production, but it does validate that he is on the right track, and that the ideas he is exploring are correct.

It validates that the approach to building a compiler is correct, but it says little about whether the project will eventually be usable and good. It's unlikely it'll get faster, because robust JS compatibility will require more edge cases to be handled than it currently does, and as Porffor's own README says, it's still slower than most JITted runtimes. A stable release might not yield much.


What a strange (and strangely adversarial) comment.

Almost none of your criticisms connects with anything that the other person wrote.


> JS has traditional threads in the form of web workers.

There is no language I’m aware of where workers behave like “traditional threads”. They’re isolates. Not threads.


Web workers don't share memory (other than SAB) with the main thread; they are far from traditional threads. These APIs are designed the way they are to protect end users - to stop sites from consuming resources, or bad code from blocking the main thread. None of that needs to be that way on the server. There is zero reason that a JS implementation cannot implement proper threads within the same memory space. The issue is that all JS engines are derived from the browser, where that isn't wanted; they simply don't have support for it. Traditional threads need careful use.

Nowhere did I say that full, or even any, compatibility with Node is needed - it isn't.

We need to stop conflating JS the language with the runtimes.

A JS runtime absolutely can get by without a GC, you just never dealloc and consume indefinitely. That doesn't change any semantics of the language, if a value/object is inaccessible, it's inaccessible...

An arena allocator provides a route to say embedding a js-to-native app in a single threaded web server like Nginx, you don't need to share memory between what in effect become "isolates".


NodeJS has worker threads[0] already

[0]: https://nodejs.org/docs/latest/api/worker_threads.html


These are very similar to web workers, they don't share memory other than via SharedArrayBuffer instances. For anything else you use message passing.
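
A tiny sketch of what that sharing looks like. For illustration both views live on one thread here; in a real program one of them would belong to a worker_threads Worker that received the buffer via postMessage:

```typescript
// A SharedArrayBuffer is the one kind of memory two JS threads can truly
// share; everything else crosses the boundary via structured clone.
const sab = new SharedArrayBuffer(4);
const mainView = new Int32Array(sab);
const workerView = new Int32Array(sab); // stand-in for the worker's view

Atomics.store(mainView, 0, 42);           // write through one view...
const seen = Atomics.load(workerView, 0); // ...visible through the other
```

Anything richer than these flat numeric buffers (objects, closures, prototypes) cannot be shared, which is exactly why workers aren't traditional threads.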


> Web workers don't share memory (other than SAB) with the main thread, they are far from traditional threads. These APIs are designed the way they are to protect end users, stop sites from consuming resources or bad code blocking the main thread. None of that is needed to be that way on the server.

It doesn't protect end users any more than it protects servers. Node could easily expose raw threading, but they don't because nearly the whole language isn't thread safe and everything would break. It has almost nothing to do with protecting users, it's a language design decision that enforces other design constraints.

> We need to stop conflating JS the language with the runtimes

If you're just sharing syntax but the standard library is different and essentially none of the code is compatible, it's not the same language. ECMAScript specifies all of the things you're talking about, and that is JavaScript, irrespective of the runtime.

> A JS runtime absolutely can get by without a GC, you just never dealloc and consume indefinitely. That doesn't change any semantics of the language, if a value/object is inaccessible, it's inaccessible...

If you throw away the whole heap on every request, then every request is definitionally a "cold start". Which negates the singular benefit that this post is calling out. Porffor is still not faster than JITed engines at runtime, and initializing the code still has to happen.

> Nowhere did I say that full, or even any, compatibility with Node is needed - it isn't.

You have to square what you're saying with this statement. What you're describing is JavaScript in syntax only. You're talking about major departures from the formal language spec. Existing JavaScript code is likely to break. Why not just make a new language and call it something else, like Crystal is to Ruby? It works differently, you're saying it doesn't care about compatibility... Why even call it JS then?


> ECMAScript specifies all of the things you're talking about, and that is JavaScript, irrespective of the runtime.

I suggest you go and read the ECMAScript standard: https://ecma-international.org/publications-and-standards/st...

There is nothing in there about browser APIs, and in fact it explicitly states that the browser runtime, or any other runtime + API, are not ECMAScript.


Porffor is doing that: JS -> WASM (as an IR) -> C -> native.

For TypeScript it uses the types as hints to the compiler, for example it has int types that alias number.

Very early still, but very cool.

https://porffor.dev/


There is a lot of valid concern about the accessibility and abuse this could result in, but I think it's important to see the other side of the argument.

There was a really good thread on Twitter a couple of days ago:

> In light of recent Figma news, lemme reiterate that of all the goods that can happen to the web, 90% of them can't happen due to not having access to font rendering & metrics in JS

https://x.com/_chenglou/status/1951481453046538493

And a few choice replies:

> It’s kind of crazy that a platform specifically designed for presenting text doesn’t provide functionality to manipulate text at a detail level

> Brute forcing text measurement in tldraw breaks my heart

Love it or hate it, the web is a platform for application development; making this easier is only good for everyone.

My argument on web APIs is that we should continue to go lower level, so font and text metrics APIs for canvas would be awesome as an alternative to this. But I'm also a proponent of "using the platform", and for text layout, web engines are incredible and very performant. Extending that capability to layout inside a canvas enables many awesome features.

One that I've repeatedly gone back to over the years is paginated rich text editing. It's simply impossible to do with contenteditable in a product level way - one of the reasons Google docs has a custom layout engine. This proposal would enable full use of contenteditable for rich text, but with full page/print layout control.

I hope it lands in the browsers.


> of all the goods that can happen to the web, 90% of them can't happen due to not having access to font rendering & metrics in JS

I’d be interested to see a representative excerpt of this person’s “goods that can happen to the web”, because it sounds pretty ridiculous to me. Not much needs that stuff, and a lot of that stuff is exposed in JS these days, and a lot of the rest you can work around it without it being ruinous to performance.

It’s also pretty irrelevant here (that is, about HTML-in-Canvas): allowing drawing HTML to canvas doesn’t shift the needle in these areas at all.


> excerpt of this person’s “goods that can happen to the web"

100% of my concern about the Web is about privacy and security... and why they don't happen.


> One that I've repeatedly gone back to over the years is paginated rich text editing. It's simply impossible to do with contenteditable in a product level way - one of the reasons Google docs has a custom layout engine.

As do we at Nutrient, we use Harfbuzz in WASM plus our own layouting - see the demo here: https://document-authoring-demo.nutrient.io/

Getting APIs for that into the Platform would make life significantly easier, but thanks to WASM it’s not a total showstopper.

Btw, I saw you’re working on sync at ElectricSQL - say hi to Oleksii :)


I just tested this demo and noticed it doesn't support Arabic text rendering (the letters should be connected), which is a main feature of Harfbuzz.


Whoa, that's some heavy SVG lifting going on there!

If I get it right, every glyph used from the given font is rendered once as an SVG path (upside down! huh!), and then the whole page is a single huge SVG element in which every typed character is a <use> with a reference to that rendered glyph, translated with a CSS transform to the right place (I assume these coordinates come out of HarfBuzz?). Kinda mad that you had to redo 90% of the browser that way, but the result is pretty impressive!

I'm curious why you render the glyphs to paths and not have the browser render those directly using eg svg <text> elements?

Was it hard to get this to work cross browser?

ps. srsly I love this about the web. You're doing this amazing engineering feat and I can just pop the trunk and learn all about it. Obviously feel free to not answer anything that's deemed a trade secret, I'm just geeking out hard on this thing :-) :-)


You can't size other SVG elements around text ones since you don't know how much space the text element will occupy.


I don't mean HTML text nodes, I mean still the single big SVG like they do now, but with SVG <text> elements instead of <path> elements. They do know (I suppose) how much space that element would take since they're asking HarfBuzz to tell them.


You can't know the size of an SVG <text> element.


why not? if you control the font and the font size then can't you have harfbuzz tell you how wide each glyph is going to be, and it'll fit precisely? (assuming the browser also uses harfbuzz for this, which it does)


Because SVG isn't defined to call into harfbuzz.

You don't control the font if the SVG fragment is embedded in HTML.


And, I think we've come full circle. I'm pretty sure that's how I was rendering text for the online office suite[] I wrote in ~1998 -- a Java Applet embedded in the browser.

[] VCs: "We're not investing in this crap! No company in their right mind would store their precious, confidential documents on the Internet!"


> I hope it lands in the browsers.

Why would you want the world's least performant layout/UI engine to infect canvas? This literally just cements the situation you quote about having no access to good APIs.

A reminder that Figma had to "create a browser inside a browser" to work around DOM limitations: https://www.figma.com/blog/building-a-professional-design-to...

> It's simply impossible to do with contenteditable in a product level way - one of the reasons Google docs has a custom layout engine. This proposal would enable full use of contenteditable for rich text, but with full page/print layout control.

Why would it enable contenteditable for rich text if you yourself are saying that it doesn't work, and Google had to implement its own engine?


Although we haven't built offline support yet, it is very much planned.

