
Good catch - looks like it's a PNG image, with an alpha channel for the rounded corners, and a subtle gradient in the background. The gradient is rendered with dithering, to prevent colour banding. The dither pattern is random, which introduces lots of noise. Since noise can't be losslessly compressed, the PNG is an enormous 6.2 bits per pixel.
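To illustrate the "noise can't be losslessly compressed" point, here's a toy sketch (zlib standing in for PNG's DEFLATE stage; not real PNG encoding):

```python
import random
import zlib

# One row of a smooth 8-bit gradient, repeated to form a small "image".
smooth = bytes((x * 255) // 4095 for x in range(4096)) * 16

# The same gradient with random +/-1 dither applied to every pixel.
random.seed(0)
dithered = bytes(min(255, max(0, b + random.choice((-1, 0, 1)))) for b in smooth)

c_smooth = zlib.compress(smooth, 9)
c_dithered = zlib.compress(dithered, 9)
print(len(c_smooth), len(c_dithered))  # the dithered buffer is far larger
```

The random ±1 dither costs roughly log2(3) ≈ 1.6 bits per pixel all by itself, because the pattern has to be stored verbatim - which is why the two compressed sizes diverge so sharply.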

While working on a web-based graphics editor, I've noticed that users upload a lot of PNG assets with this problem. I've never tracked down the cause... is there a popular raster image editor which recently switched to dithered rendering of gradients?


My reasoning is that once upon a time, I was using Macromedia Fireworks, and PNGs gave far, far better results than JPGs did at the time, at least in terms of output quality. Almost certainly because I didn't understand JPG compression, but for web work in the mid-2000s PNGs became my favourite. Not to mention proper alpha channels!

...and so it's stuck, two decades on haha


I only recently learned that JPEG and MPEG-1 were designed for near-lossless compression, so the massive bitrate reductions which came further down the road had nothing to do with the original design.

"Inelegant" is the right word; it's hard to shake off the feeling that we might have missed something important. I suspect the next big breakthrough might be waiting for researchers to focus on lower-quality compression specifically, rather than requiring every new codec to improve the state of the art in near-lossless compression.


> for researchers to focus on lower-quality compression specifically

JPEG XL already does this because it uses VarDCT (variable-size Discrete Cosine Transform), aka adaptive block sizes (2×2 up to 256×256). Large smooth areas use huge blocks, and areas of fine detail use small blocks to preserve that detail. JXL spends bits where your eyes care most instead of evenly across the image. It also uses several techniques specifically aimed at keeping edges sharp.


JPEG XL achieves about half the bitrate of an equal-quality JPEG, even at lower quality levels. That's a real achievement, but the complexity cost is high; I'd estimate that JPEG XL decoders are at least ten times more complex than JPEG decoders. Modern lossy image codecs are "JPEG, with three decades of patch notes" :-)

I think we're badly in need of an entirely new image compression technique; the block-based DCT has serious flaws, such as its high coding cost for edges and its tendency to create block artefacts. The modern hardware landscape is quite different from 1992, so it's plausible that the original researchers might have missed something important, all those years ago.


The discrete wavelet transform (DWT) compresses an image by repeatedly downscaling it, and storing the information which was lost during downscaling. Here's an image which has been downscaled twice, with its difference images (residuals): https://commons.wikimedia.org/wiki/File:Jpeg2000_2-level_wav.... To decompress that image, you essentially just 2x-upscale it, and then use the residuals to restore its fine details.
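Here's a minimal sketch of that decompose/reconstruct round trip (a toy 2×2 average-pooling pyramid, not a real wavelet filter bank):

```python
import numpy as np

def decompose(img):
    """Split an image into a 2x-downscaled base and a residual."""
    h, w = img.shape
    base = img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))  # 2x2 average pooling
    upscaled = np.kron(base, np.ones((2, 2)))                   # naive 2x upscale
    residual = img - upscaled                                   # detail lost by downscaling
    return base, residual

def reconstruct(base, residual):
    """Invert decompose(): upscale the base, then restore the details."""
    return np.kron(base, np.ones((2, 2))) + residual

img = np.random.default_rng(0).random((8, 8))
base, residual = decompose(img)
assert np.allclose(reconstruct(base, residual), img)  # lossless round trip
```

A real codec applies this recursively and then quantises the residuals; smooth regions produce near-zero residuals that cost almost nothing to store.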

Wavelet compression is better than the block-based DCT for preserving sharp edges and gradients, but worse for preserving fine texture (noise). The DCT can emulate noise by storing just a couple of high-frequency coefficients for a 64-pixel block, but the DWT would need to store dozens of coefficients to achieve noise synthesis of similar quality.

The end result is that JPEG and JPEG 2000 achieve roughly the same lossy compression ratio before image artefacts show up. JPEG blurs edges, JPEG 2000 blurs texture. At very low bitrates, JPEG becomes blocky, and JPEG 2000 looks like a low-resolution image which has been upscaled (because it's hardly storing any residuals at all!)

FFmpeg has a `jpeg2000` codec; if you're interested in image compression, running a manual comparison between JPEG and JPEG 2000 is a worthwhile way to spend an hour or two.


I found a JPEG 2000 reference PDF somewhere. It may as well have been written in Mandarin. I got as far as extracting the width and height. It's much more advanced than JPEG. Forget about writing a decoder.



What about JPEG XL or AVIF? Do they use DCT or DWT, or perhaps something else?


Both formats are DCT-based (except for lossless JPEG XL). JPEG 2000's use of the DWT was unusual; in general, still-image lossy compression research has spent the last 35 years iteratively improving on JPEG's design. This is partly for compatibility reasons, but it's also because the original design was very good.

Since JPEG, improvements have included better lossless compression (entropy coding) of the DCT coefficients; deblocking filters, which blur the image across block boundaries; predicting the contents of DCT blocks from their neighbours, especially prediction of sharp edges; variable DCT block sizes, rather than a fixed 8x8 grid; the ability to compress some DCT blocks more aggressively than others within the same image; encoding colour channels together, rather than splitting them into three completely separate images; and the option to synthesise fake noise in the decoder, since real noise can't be compressed.

You might be interested in this paper: https://arxiv.org/pdf/2506.05987. It's a very approachable summary of JPEG XL, which is roughly the state of the art in still-image compression.


Thanks. The paper is fascinating. I only skimmed around so far and it is full of interesting details. Even beyond compression. They really tried hard to make the USB of image formats, by supporting as many features and use cases as possible. Even things like multiple layers and non-destructive cropping. I like the section where they talk about previous image formats, why many of them failed and how they tried to learn from past mistakes.

Regarding algorithms: Searching for "learned image compression", there are a lot of research papers which use neural networks rather than analytic algorithms like DCT. The compression rates seem to already outperform conventional compression. I guess the bottleneck is more slow decoding speed than compression rate. At least that's the issue with neural video compression.


As I understand it, very small neural networks have already been incorporated into both VVC and AV2 for intra prediction. You're correct that this strategy is limited by decoding performance, especially when predicting large blocks.

In general, I'm pessimistic about prediction-and-residuals strategies for lossy compression. They tend to amplify noise; they create data dependencies, which interfere with parallel decoding; they require non-local optimisation in the encoder; really good prediction involves expensive analysis of a large number of decoded pixels; and it all feels theoretically unsound (because predictors usually produce just one value, rather than a probability distribution).

I'm more optimistic about lossy image codecs based on explicitly-coded summary statistics, with very little prediction. That approach worked well for lossy JPEG XL.


Everything after JPEG is still fundamentally the same, but individual parts of the algorithm are supercharged.

JPEG has 8x8 blocks, modern codecs have variable-sized blocks from 4x4 to 128x128.

JPEG has RLE+Huffman, modern codecs have context-adaptive variations of arithmetic coding.

JPEG has a single quality scale for the whole image, modern codecs allow quality to be tweaked in different areas of the image.

JPEG applies block coefficients on top of a single flat color per block (the DC coefficient), modern codecs use a "prediction" made by smearing the previous couple of blocks as the starting point.

They're JPEGs with more of everything.


I'd describe that as a trend, rather than a consensus.

It wasn't an entirely bad idea, because comments carry a high maintenance cost. They usually need to be rewritten when nearby code is edited, and they sometimes need to be rewritten when remote code is edited - a form of coupling which can't be checked by the compiler. It's easy to squander this high cost by writing comments which are more noise than signal.

However, there's plenty of useful information which can only be communicated using prose. "Avoid unnecessary comments" is a very good suggestion, but I think a lot of people over-corrected, distorting the message into "never write comments" or "comments are a code smell".


You've rediscovered a state-of-the-art technique, currently used by JPEG XL, AV1, and the HEVC range extensions. It's called "chroma from luma" or "cross-component prediction".

This technique has a weakness: the most interesting and high-entropy data shared between the luma and chroma planes is their edge geometry. To suppress block artefacts near edges, you need to code an approximation of the edge contours. This is the purpose of your quadtree structure.

In a codec which compresses both luma and chroma, you can re-use the luma quadtree as a chroma quadtree, but the quadtree itself is not the main cost here. For each block touched by a particular edge, you're redundantly coding that edge's chroma slope value, `(chroma_inside - chroma_outside) / (luma_inside - luma_outside)`. Small blocks can tolerate a lower-precision slope, but it's a general rule that coding many imprecise values is more expensive than coding a few precise values, so this strategy costs a lot of bits.
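To make the slope concrete, here's a hypothetical per-block chroma-from-luma fit (the general shape of CfL, not any particular codec's exact rule):

```python
import numpy as np

def cfl_fit(luma, chroma):
    """Fit chroma ~= alpha * luma + beta over one block, by least squares."""
    alpha, beta = np.polyfit(luma.ravel(), chroma.ravel(), 1)
    return alpha, beta

# A synthetic block where chroma really is a linear function of luma.
rng = np.random.default_rng(1)
luma = rng.random((8, 8))
chroma = 0.4 * luma + 0.2

alpha, beta = cfl_fit(luma, chroma)
predicted = alpha * luma + beta
residual = chroma - predicted  # near zero here, so very cheap to code
```

The encoder then only needs to signal alpha and beta (or just alpha) per block, plus whatever residual survives the prediction - and when an edge crosses the block, it's exactly this per-block slope that gets redundantly re-signalled by every touched block.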

JPEG XL compensates for this problem by representing the local chroma-from-luma slope as a low-resolution 2D image, which is then recursively compressed as a lossless JPEG XL image. This is similar to your idea of using PNG-like compression (delta prediction, followed by DEFLATE).

Of course, since you're capable of rediscovering the state of the art, you're also capable of improving on it :-)

One idea would be to write a function which, given a block of luma pixels, can detect when the block contains two discrete luma shades (e.g. "30% of these pixels have a luminance value close to 0.8, 65% have a luminance value close to 0.5, and the remaining 5% seem to be anti-aliased edge pixels"). If you run an identical shade-detection algorithm in both the encoder and decoder, you can then code chroma information separately for each side of the edge. Because this would reduce edge artefacts, it might enable you to make your quadtree leaf nodes much larger, reducing your overall data rate.
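As a sketch of that shade-detection idea (purely hypothetical - a 1-D two-cluster k-means over a block's luma samples):

```python
def detect_two_shades(luma, iters=10):
    """Split a block's luma samples into two clusters (1-D 2-means)."""
    lo, hi = min(luma), max(luma)
    for _ in range(iters):
        mid = (lo + hi) / 2
        a = [v for v in luma if v <= mid]
        b = [v for v in luma if v > mid]
        if not a or not b:
            break
        lo, hi = sum(a) / len(a), sum(b) / len(b)
    return lo, hi  # the two detected shades

# Mostly 0.5s and 0.8s, plus a few anti-aliased edge pixels in between.
block = [0.5] * 40 + [0.8] * 20 + [0.62, 0.71, 0.55]
dark, bright = detect_two_shades(block)
```

Because both encoder and decoder can run this deterministically on already-decoded luma, the two shades themselves never need to be signalled - only the chroma value attached to each one.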


Thanks for the feedback, and the interesting ideas. It's good to know that I was on to something and not completely off :-)

I'm mostly doing this for learning purposes, but a hidden agenda is to create a low-latency codec that can be used in conjunction with other codecs that deal primarily with luma information. AV1 and friends are usually too heavy in those settings, so I try to keep things simple.


There was a constraint - since 2009, the Joint Photographic Experts Group had published JPEG XR, JPEG XT and JPEG XS, and they were probably reluctant to break that naming scheme.

They're running out of good options, but I hope they stick with it long enough to release "JPEG XP" :-)


JPEG XP would have been a nice name for a successor of JPEG 2000, I suppose :)

There's also a JPEG XE now (https://jpeg.org/jpegxe/index.html), by the way.


Incidentally, JPEG Vista would be thematically appropriate.


They can tack on more letters, or increment the X, as required.


Good one - made me and a coworker both LOL (in the literal sense) :D


JPEG ME


The rods are only active in low-light conditions; they're fully active under the moon and stars, or partially active under a dim street light. Under normal lighting conditions, every rod is fully saturated, so they make no contribution to vision. (Some recent papers have pushed back against this orthodox model of rods and cones, but it's good enough for practical use.)

This assumption that rods are "the luminance cells" is an easy mistake to make. It's particularly annoying that the rods have a sensitivity peak between the blue and green cones [1], so it feels like they should contribute to colour perception, but they just don't.

[1]: https://en.wikipedia.org/wiki/Rod_cell#/media/File:Cone-abso...


Consider myself educated, thanks!


Protanopia and protanomaly shift luminance perception away from the longest wavelengths of visible light, which causes highly-saturated red colours to appear dark or black. Deuteranopia and deuteranomaly don't have this effect. [1]

Blue cones make little or no contribution to luminance. Red cones are sensitive across the full spectrum of visual light, but green cones have no sensitivity to the longest wavelengths [2]. Since protans don't have the "hardware" to sense long wavelengths, it's inevitable that they'd have unusual luminance perception.

I'm not sure why deutans have such a normal luminous efficiency curve (and I can't find anything in a quick literature search), but it must involve the blue cones, because there's no way to produce that curve from the red-cone response alone.

[1]: https://en.wikipedia.org/wiki/Luminous_efficiency_function#C...

[2]: https://commons.wikimedia.org/wiki/File:Cone-fundamentals-wi...


Back of the envelope, in the two years since the game was released, this single bug has wasted at least US$10,000,000 of hardware resources. That's a conservative estimate (20% of people who own the game keep it installed, the marginal cost of wasted SSD storage in a gaming PC is US$2.50 per TB per month, the install base grew linearly over time), so the true number is probably several times higher.
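For what it's worth, here's that envelope in code (every number is my own guess, including an assumed ~15 million copies sold):

```python
copies_sold = 15_000_000   # rough guess
install_rate = 0.20        # 20% of owners keep it installed
wasted_tb = 131 / 1000     # 131 GB of duplicate data per install
cost_per_tb_month = 2.50   # marginal SSD cost, US$
months = 24                # two years since release

# Linear growth in installs means the average install count over the
# period is half the final count.
avg_installs = copies_sold * install_rate / 2

total_cost = avg_installs * wasted_tb * cost_per_tb_month * months
print(f"US${total_cost:,.0f}")  # → US$11,790,000
```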

In other words, the game studio externalised an eight-figure hardware cost onto their users, to avoid a five-to-six-figure engineering cost on their side.

Data duplication can't just be banned by Steam, because it's a legitimate optimisation in some cases. The only safeguard against this sort of waste is a company culture which values software quality. I'm glad the developers fixed this bug, but it should never have been released to users in the first place.


>Data duplication can't just be banned by Steam

Steam compresses games as much as possible, so in the case of Helldivers 2, you had to download between ~30 and ~40 GB, which was then unpacked to 150 GB (according to SteamDB[0])

[0] https://steamdb.info/app/553850/depots/


You are missing that each update takes AGES while it tortures your disk for patching the files (on my machine it takes 15min or so, and that's on an SSD). So I agree that this is careless and reminds me of the GTA5 startup time that was fixed by a dedicated player who finally had enough and reverse engineered the problem (see https://nee.lv/2021/02/28/How-I-cut-GTA-Online-loading-times...). I still find these things hard to accept.


Here’s a funny one from Halo Infinite network downloading the same picture thousands of times: https://www.reddit.com/r/halo/comments/w5af08/infinite_downl...


Steam update durations depend on compression + CPU performance + SSD I/O. Things get harder when the disk is almost full and live defragmentation kicks in to free up space for contiguous files. Some SSDs are fast enough to keep up with such a load, but a lot of them will quickly hit their DRAM limits, and suddenly that advertised gigabyte-per-second write speed isn't all that fast. Bonus points for when your SSD doesn't have a heatsink or any moving air over it, making the controller throttle hard.

Patching 150GiB with a compressed 15GiB download just takes a lot of I/O. The alternative is downloading a fresh copy of the 150GiB install file, but those playing on DSL will probably let their SSD whizz a few minutes longer than spend another day downloading updates.

If your SSD is slower than your internet capacity, deleting install files and re-downloading the entire game will probably save you some time.


The update where they got a $10k reward from R* brought a smile to my face


In this case, the bug was 131 GB of wasted disk space after installation. Because the waste came from duplicate files, it should have had little impact on download size (unless there's a separate bug in the installer...)

This is why the cost of the bug was so easy for the studio to ignore. An extra 131 GB of bandwidth per download would have cost Steam several million dollars over the last two years, so they might have asked the game studio to look into it.


This article presents it as a big success, but it could be read the opposite way: "Developers of Helldivers 2 wasted 130 GB for years and didn't care, because it was other people's computers"


> An extra 131 GB of bandwidth per download would have cost Steam several million dollars over the last two years

Nah, not even close. Let's guess and say there were about 15 million copies sold. 15M * 131GB is about 2M TB (2000 PB / 2 EB). At 30% mean utilisation, a 100Gb/s port will do 10 PB in a month, and at most IXPs that costs $2000-$3000/month. That makes it about $400k in bandwidth charges (I imagine 90%+ is peered or hosted inside ISPs, not via transit), and you could quite easily build a server that would push 100Gb/s of static objects for under $10k a pop.
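Rough check of that arithmetic in code (same guessed figures as above):

```python
copies = 15_000_000
extra_gb = 131
total_pb = copies * extra_gb / 1e6  # GB -> PB; roughly 2 EB in total

# Throughput of one 100Gb/s port at 30% mean utilisation, per month.
port_gbps = 100
utilisation = 0.30
seconds_per_month = 30 * 24 * 3600
pb_per_port_month = port_gbps * 1e9 * utilisation * seconds_per_month / 8 / 1e15

port_months = total_pb / pb_per_port_month
cost = port_months * 2000  # low end of the $2,000-3,000 per-port-month range

print(round(total_pb), round(pb_per_port_month), round(cost))
```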

It would surprise me if the total additional costs were over $1M, considering they already have their own CDN setup. One of the big cloud vendors would charge $100M just for the bandwidth, let alone the infrastructure to serve it, based on some quick calculation I've done (probably incorrectly) -- though interestingly, HN's fave non-cloud vendor Hetzner would only charge $2M :P


Isn't it a little reductive to look at basic infrastructure costs? I used Hetzner as a surrogate for the raw cost of bandwidth, plus overheads. If you need to serve data outside Europe, the budget tier of BunnyCDN is four times more expensive than Hetzner.

But you might be right - in a market where the price of the same good varies by two orders of magnitude, I could believe that even the nice vendors are charging a 400% markup.


Yea, I always laugh when folks talk about how expensive they claim bandwidth is for companies. Large “internet” companies are just paying a small monthly cost for transit at an IX. They aren't paying $xx/gig ($1/gig) like the average consumer is. If you buy a 100gig port for $2k, it costs the same whether you're using 5 GB a day or 8 PB per day.


Off topic question.

> I imagine 90%+ is peered or hosted inside ISPs, not via transit

How does hosting inside ISPs work? Does the ISP have to MITM the traffic? I've heard similar claims for Netflix and other streaming media - that ISPs host/cache the data themselves. Do they have to have some agreement with Steam/Netflix?


Yea, Netflix will ship a server to an ISP (Cox, Comcast, Starlink, Rogers, Telus etc) so the customers of that ISP can access that server directly. It improves performance for those users and reduces the load on the ISP's backbone/transit. I'm guessing other large companies do this as well.

A lot of people are using large distributed DNS resolvers like 8.8.8.8 or 1.1.1.1, and these can sometimes direct users to incorrect CDN servers, so the EDNS Client Subnet extension was created to help with this. I always use 9.9.9.11 instead of 9.9.9.9 to hopefully help improve performance.


The CDN/content provider ships servers to the ISP which puts them into their network. The provider is just providing connectivity and not involved on a content-level, so no MITM etc needed.



Makes sense - the initial claim was that HD2's size was mainly because of duplicated assets, and any compression worth its salt would de-duplicate things effectively.


From the story:

> Originally, the game’s large install size was attributed to optimization for mechanical hard drives since duplicating data is used to reduce loading times on older storage media. However, it turns out that Arrowhead’s estimates for load times on HDDs, based on industry data, were incorrect.

It wasn't a bug. They made a decision on what to optimise which was based on incomplete / incorrect data and performed the wrong optimisation as a result.

As a player of the game, I didn't really care that it took up so much space on my PC. I have 2TB dedicated for gaming.


Why not offer 2 versions for download and let the user choose whether they want to block their whole disk with a single game or accept a bit longer loading times? Or let the user make an informed decision at installation time by explaining the supposed optimization? Or let the user decide before downloading what resolution (ergo textures) they want as the highest resolution they will play the game at, and only download the textures they need up to that resolution?

Questions, questions, questions.


Because all of these suggestions require developer resources. A quick web search suggests they have ~150 employees, while a lot of triple-A studios have thousands or tens of thousands. So they are a relatively small game studio.

Also note that they are adding more game modes, more warbonds, and the game is multi-platform and multiplayer. The dev team is relatively small compared to other game studios.

The engine the game is built on is discontinued and, I believe, unsupported. IIRC they are rewriting the game in UE5 because of the issues with the unsupported engine.

A lot of people have problems with Arrowhead (there's been some drama between Arrowhead and the community). The install size of the game, while a problem, wasn't the top problem. Bigger issues in my mind, as someone that plays the game regularly, are:

- The newest updates to the game with some of new enemy types which are quite unfair to fight against IMO (Dragon Roach and the War Strider).

- The other complaint was that the game's performance/stability was causing issues with streamers' PCs. Some people claimed the game was breaking their PCs (I think this was BS and their PCs were just broken anyway). However, there was a real performance problem in the game, which was resolved with a patch a few weeks ago. That greatly improved the game IMO.


I can't answer all of these questions, but "why not offer 2 versions and allow the user to choose" was mentioned here [0].

Helldivers 2 is a multiplayer game; for the game to start, everyone in the lobby needs it to be fully loaded. If one person chose the slower version, it would make everyone wait longer - definitely not a trade-off you're willing to make as a game developer, because it makes the experience worse for other players.

There could be other options and better optimizations, such as lower textures that you mentioned, but I agree with the developers on having only a "fast version".

[0] https://www.pcgamer.com/games/action/helldivers-2-dev-explai...


SSD or HDD?


SSD. Prices got reasonable sometime last year for 2TB NVME/SSD


> the marginal cost of wasted SSD storage in a gaming PC is US$2.50 per TB per month

Out of curiosity, how do you come up with a number for this? I would have zero idea of how to even start estimating such a thing, or even be able to tell you whether "marginal cost of wasted hard drive storage" is even a thing for consumers.


I'd be very interested in hearing alternative estimates, but here's my working:

The lowest cost I could find to rent a server SSD was US$5 per TB-month, and it's often much higher. If we assume that markets are efficient (or inefficient in a way that disadvantages gaming PCs), we could stop thinking there and just use US$2.50 as a conservative lower bound.

I checked the cost of buying a machine with a 2 TB rather than 1 TB SSD; it varied a lot by manufacturer, but it seemed to line up with $2.50 to $5 per TB-month on a two-to-five-year upgrade cycle.

One reason I halved the number is because some users (say, a teenager who only plays one game) might have lots of unused space in their SSD, so wasting that space doesn't directly cost them anything. However, unused storage costs money, and the "default" or "safe" size of the SSD in a gaming PC is mostly determined by the size of games - so install size bloat may explain why that "free" space was purchased in the first place.

> whether "marginal cost of wasted hard drive storage" is even a thing for consumers

As long as storage has a price, use of storage will have a price :-)


Maybe average cost of next-size-up SSD price divided by a SWAG of a gaming PC lifetime? So if I had to buy a 2 TB NVMe stick instead of a 1 TB stick it's an extra $70 and I upgrade after 5 years that's only about $1 per TB-Month. I don't game I have no idea if those are good numbers.

The cheapest storage tier on s3 with instant retrieval is $.004 per GB-Month which implies AWS can still make money at $4 per TB-Month so $2.50 for consumer hardware sounds reasonable to me.


"having one less game installed on your SSD" isn't exactly same as cost per TB, it's just slight wasted convenience at worst


> the marginal cost of wasted SSD storage in a gaming PC is US$2.50 per TB per month

Where are you getting this number from? Not necessarily arguing with it, just curious.


I should probably look up the company that made the game or the publisher and avoid games they make in the future.


That would be a shame because the game is honestly very good despite its flaws, is a lot of fun and has a decent community.


Good luck convincing the MBAs with spreadsheets that 'software quality' is of any value whatsoever.


Experienced software developer, currently available for freelance work. I'd be happy to sign either a conventional months-long software development contract, or a short consulting contract.

    Location: UK
    Remote: Yes
    Willing to relocate: No
    Resume/CV: By request
    Email: (my username) at protonmail dot com
My preferred coding style is high-rigour and highly-documented, which tends to be a great fit for contract work. I have a track record of delivering quality results under minimal supervision.

My specialist skills:

- The Rust language, which has been my daily driver for more than a decade.

- Multimedia (2D rendering, vector graphics, video, audio, libav, Web Codecs...)

- Performance optimisation (parallel programming, SIMD, GPU acceleration, soft-realtime programming...)

Fields in which I'm highly experienced, but below specialist level:

- Web development, with a frontend bias (TypeScript, React, JS build systems, the Web platform, WebGL, WebGPU, Node, WebAssembly...)

- Native development (native build systems, FFIs, low-level Win32, basic fluency in C/C++...)

- Leadership, communication and technical writing, all learned in a previous career.

I should also mention a modest level of experience in computer vision, greenfield R&D, game engine development, programming language development, and data compression. I'm comfortable with ubiquitous tools like Bash, Make, Git, GitHub, Docker and Figma. (Sorry for the keyword spam; you know how it is.)

I'm currently offering a 50% discount for any contract which seems highly educational. My areas of interest include acoustics, DSP, embedded programming, Svelte, Solid, functional languages, Swift, and backend development in general.

I can be flexible on time zones. No interest in relocating long-term, but I'd be happy to visit your team in person for a few days to break the ice.

Thanks for reading, and I look forward to hearing from you :-)

