Anyone who has worked on a large migration eventually lands on a pattern that goes something like this:
1. Double-write to the old system and the new system. Nothing uses the new system;
2. Verify the output in the new system vs the old system with appropriate scripts. If there are issues, which there will be for a while, go back to (1);
3. Start reading from the new system with a small group of users and then an increasingly large group. Still use the old system as the source of truth. Log whenever the output differs. Keep making changes until it always matches;
4. Once you're at 100% rollout you can start decommissioning the old system (a rough sketch of the first two steps follows the list).
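Here's a minimal sketch of steps 1 and 2 in Go, with a made-up Store interface standing in for whatever the old and new systems actually are (names are mine, not from any particular system):

```go
package migration

import (
	"bytes"
	"context"
	"log"
)

// Store is a stand-in for whatever the old and new systems expose.
type Store interface {
	Write(ctx context.Context, key string, value []byte) error
	Read(ctx context.Context, key string) ([]byte, error)
}

// DualWriter is step 1: every write goes to both systems, but callers
// still read only from the old one.
type DualWriter struct {
	Old, New Store
}

func (d *DualWriter) Write(ctx context.Context, key string, value []byte) error {
	// The old system stays the source of truth, so its failures are surfaced.
	if err := d.Old.Write(ctx, key, value); err != nil {
		return err
	}
	// While the new system is dark, its failures are only logged.
	if err := d.New.Write(ctx, key, value); err != nil {
		log.Printf("new-system write failed for %q: %v", key, err)
	}
	return nil
}

// Verify is the heart of steps 2 and 3: read both, log any mismatch.
func (d *DualWriter) Verify(ctx context.Context, key string) bool {
	oldVal, oldErr := d.Old.Read(ctx, key)
	newVal, newErr := d.New.Read(ctx, key)
	if oldErr != nil || newErr != nil || !bytes.Equal(oldVal, newVal) {
		log.Printf("mismatch for %q: oldErr=%v newErr=%v", key, oldErr, newErr)
		return false
	}
	return true
}
```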
This approach is incremental, verifiable and reversible. You need all of these things. If you engage in a massive rewrite in a silo for a year or two you're going to have a bad time. If you have no way of verifying your new system's output, you're going to have a bad time. In fact, people are going to die, as is the case here.
If you're going to accuse someone of a criminal act, a system just saying it happened should NEVER be sufficient. It should be able to show its work. The person or people who are ultimately responsible for turning a fraud detection into a criminal complaint should themselves be criminally liable if they make a false complaint.
We had a famous example of this with Hertz mistakenly reporting cars stolen, something they ultimately had to pay for in a lawsuit [1] but that's woefully insufficient. It is expensive, stressful and time-consuming to have to criminally defend yourself against a felony charge. People will often be forced to take a plea because absolutely everything is stacked in the prosecution's favor despite the theoretical presumption of innocence.
As such, an erroneous or false criminal complaint by a company should itself be a criminal charge.
In Hertz's case, a human should eyeball the alleged theft and look for records like "do we have the car?", "do we know where it is?" and "is there a record of them checking it in?"
In the UK post office scandal, a detection of fraud from accounting records should be verified by comparison to the existing system in a transition period AND, more so in the beginning, by double-checking results with forensic accountants (actual humans) before any criminal complaint is filed.
[1]: https://www.npr.org/2022/12/06/1140998674/hertz-false-accusa...
I realize scale makes everything more difficult but at the end of the day, Netflix is encoding and serving several thousand videos via a CDN. It can't be this hard. There are a few statements in this that gave me pause.
The core problem seems to be development in isolation. Put another way: microservices. This post hints at microservices having complete autonomy over their data storage and developing their own GraphQL models. The first is normal for microservices (but an indictment at the same time). The second is... weird.
The whole point of GraphQL is to create a unified view of something, not to have 23 different versions of "Movie". Attributes are optional. Pull what you need. Common subsets of data can be organized in fragments. If you're not doing that, why are you using GraphQL?
So I worked at Facebook and may be a bit biased here because I encountered a couple of ex-Netflix engineers in my time who basically wanted to throw away FB's internal infrastructure and reinvent Netflix microservices.
Anyway, at FB there is a Video GraphQL object. There aren't 23 or 7 or even 2.
Data storage for most things was via a write-through in-memory graph database called TAO that persisted things to sharded MySQL servers. On top of this, you'd use EntQL to add a bunch of behavior to TAO like permissions, privacy policies, observers and such. And again, there was one Video entity. There were offline data pipelines that would generally process logging data (ie outside TAO).
Maybe someone more experienced with microservices can speak to this: does UDA make sense? Is it solving an actual problem? Or just a self-created problem?
I think they are just trying to put in place the common data model that, as you point out, they need.
(So their micro services can work together usefully and efficiently -- I would guess that currently the communication burden between microservice teams is high and still is not that effective.)
> The whole point of GraphQL is to create a unified view of something
It can do that, but that's not really the point of GraphQL. I suppose you're saying that's how it was used at FB. That's fine, IMO, but it sounds like this NF team decided to use something more abstract for the same purpose.
I can't comment on their choices without doing a bunch more analysis, but in my own experience I've found off-the-shelf data modeling formats have too much flexibility in some places (forcing you to add additional custom controls or require certain usage patterns) and not enough in others (forcing you to add custom extensions). The nice thing about your own format is you can make it able to express everything you want and nothing you don't. And have a well-defined projection to Graphql (and sqlite and oracle and protobufs and xml and/or whatever other thing you're using).
I totally agree. Especially with Fusion it’s very easy to establish core types in self-contained subgraphs and then extend those types in domain-specific subgraphs. IMO the hardest part about this approach is just namespacing all the things, because GraphQL doesn’t have any real conventions for organizing service- (or product-) specific types.
> The whole point of GraphQL is to create a unified view of something, not to have 23 different versions of "Movie".
GraphQL is great at federating APIs, and is a standardized API protocol. It is not a data modeling language. We actually tried really hard with GraphQL first.
>at the end of the day, Netflix is encoding and serving several thousand videos via a CDN. It can't be this hard
Yeah, maybe 10 years ago, but today Netflix is one of the top production companies on the planet. In the article, they even point to how this addresses their issues in content engineering.
So I've worked for Google (and Facebook) and it really drives the point home of just how cheap hardware is and how not worth it optimizing code is most of the time.
More than a decade ago Google had to start managing their resource usage in data centers. Every project has a budget. CPU cores, hard disk space, flash storage, hard disk spindles, memory, etc. And these are generally convertible to each other so you can see the relative cost.
Fun fact: even though at the time flash storage was ~20x the cost of hard disk storage, it was often cheaper net because of the spindle bottleneck.
Anyway, all of these things can be turned into software engineer hours, often called "mili-SWEs" meaning a thousandth of the effort of 1 SWE for 1 year. So projects could save on hardware and hire more people or hire fewer people but get more hardware within their current budgets.
I don't remember the exact number of CPU cores that amounted to a single SWE but IIRC it was in the thousands. So if you spend 1 SWE year working on optimization across your project and you're not saving 5000 CPU cores, it's a net loss.
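As a toy back-of-the-envelope check (the 5,000-core figure is my recollection above, not an official number, and the savings number is invented):

```go
package main

import "fmt"

func main() {
	// Assumption from the comment above: ~5,000 CPU cores cost about as much
	// as one SWE for one year. Convert both sides to SWE-year units.
	const coresPerSWEYear = 5000.0

	sweYearsSpent := 1.0 // one engineer spends a year optimizing
	coresSaved := 3000.0 // steady-state cores the optimization frees up

	net := coresSaved/coresPerSWEYear - sweYearsSpent
	fmt.Printf("net value: %.2f SWE-years\n", net) // prints -0.40: a net loss
}
```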
Some projects were incredibly large and used much more than that so optimization made sense. But so often it didn't, particularly when whatever code you wrote would probably get replaced at some point anyway.
The other side of this is that there is (IMHO) a general usability problem with the Web in that it simply shouldn't take the resources it does. If you know people who had to or still do data entry for their jobs, you'll know that the mouse is pretty inefficient. The old terminals from 30-40+ years ago that were text-based had some incredibly efficient interfaces at a tiny fraction of the resource usage.
I had expected that at some point the Web would be "solved" in the sense that there'd be a generally expected technology stack and we'd move on to other problems but it simply hasn't happened. There's still a "framework of the week" and we're still doing dumb things like reimplementing scroll bars in user code that don't work right with the mouse wheel.
I don't know how to solve that problem or even if it will ever be "solved".
I worked there too and you're talking about performance in terms of optimal usage of CPU on a per-project basis.
Google DID put a ton of effort into two other aspects of performance: latency, and overall machine utilization. Both of these were top-down directives that absorbed a lot of time and attention from thousands of engineers. The salary costs were huge. But, if you're machine constrained you really don't want a lot of cores idling for no reason even if they're individually cheap (because the opportunity cost of waiting on new DC builds is high). And if your usage is very sensitive to latency then it makes sense to shave milliseconds off because of business metrics, not hardware $ savings.
The key part here is "machine utilization" and absolutely there was a ton of effort put into this. I think before my time servers were allocated to projects, but even early on in my time at Google, Borg had already adopted shared machine usage and there was a whole system of resource quota implemented via cgroups.
Likewise there have been many optimization projects and they used to call these out at TGIF. No idea if they still do. One I remember was reducing the health checks via UDP for Stubby and given that every single Google product extensively uses Stubby then even a small (5%? I forget) reduction in UDP traffic amounted to 50,000+ cores, which is (and was) absolutely worth doing.
I wouldn't even put latency in the same category as "performance optimization" because often you decrease latency by increasing resource usage. For example, you may send duplicate RPCs and wait for the fastest to reply. That could be doubling or tripling the effort.
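A rough sketch of that hedged-request idea (everything here is illustrative; real implementations usually delay the second request rather than firing both at once):

```go
package main

import (
	"context"
	"fmt"
	"math/rand"
	"time"
)

// hedged fires the same call twice and returns whichever reply lands first,
// trading up to 2x backend load for lower tail latency.
func hedged(ctx context.Context, call func(context.Context) (string, error)) (string, error) {
	type result struct {
		val string
		err error
	}
	results := make(chan result, 2) // buffered so the losing goroutine can exit
	for i := 0; i < 2; i++ {
		go func() {
			val, err := call(ctx)
			results <- result{val, err}
		}()
	}
	select {
	case r := <-results:
		return r.val, r.err
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

func main() {
	fakeBackend := func(ctx context.Context) (string, error) {
		time.Sleep(time.Duration(rand.Intn(100)) * time.Millisecond) // simulated latency
		return "ok", nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	fmt.Println(hedged(ctx, fakeBackend))
}
```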
Except you’re self selecting for a company that has high engineering costs, big fat margins to accommodate expenses like additional hardware, and lots of projects for engineers to work on.
The evaluation needs to happen in the margins, even if it saves pennies/year on the dollar, it’s best to have those engineers doing that than have them idling.
The problem is that almost no one is doing it, because the way we make these decisions has nothing to do with the economic calculus behind it; most people just do “what Google does”, which explains a lot of the dysfunction.
I think the parent's point is that if Google with millions of servers can't make performance optimization worthwhile, then it is very unlikely that a smaller company can. If salaries dominate over compute costs, then minimizing the latter at the expense of the former is counterproductive.
> The evaluation needs to happen in the margins, even if it saves pennies/year on the dollar, it’s best to have those engineers doing that than have them idling.
That's debatable. Performance optimization almost always leads to increased complexity. Doubled performance can easily cause quadrupled complexity. Then one has to consider whether the maintenance burden is worth the extra performance.
I think it's the reverse: a small company doesn't have the liquidity, buying power or ability to convert more resources into more money like Google.
And of course a lot of small companies will be paying Google with a fat margin to use their cloud.
Getting by with fewer resources, or even reduced on-premise hardware, will be a way bigger win. That's why they'll pay a full-time DBA to optimize their database and reduce costs by 2 to 3x their salary. Or have a full team of infra guys mostly dealing with SRE and performance.
> I don't remember the exact number of CPU cores amounted to a single SWE but IIRC it was in the thousands.
I think this probably holds true for outfits like Google because 1) on their scale "a core" is much cheaper than average, and 2) their salaries are much higher than average. But for your average business, even large businesses? A lot less so.
I think this is a classic "Facebook/Google/Netflix/etc. are in a class of their own and almost none of their practices will work for you"-type thing.
Maybe not to the same extent, but an AWS EC2 m5.large VM with 2 cores and 8 GB RAM costs ~$500/year (1 year reserved). Even if your engineers are being paid $50k/year, that's the same as 100 VMs or 200 cores + 800 GB RAM.
> I don't know how to solve that problem or even if it will ever be "solved".
It will not be “solved” because it’s a non-problem.
You can run a thought experiment imagining an alternative universe where human resources were directed towards optimization, and that alternative universe would look nothing like ours. One extra engineer working on optimization means one less engineer working on features. For what exactly? To save some CPU cycles? Don’t make me laugh.
Google has over the years tried to get several new languages off the ground. Go is by far the most successful.
What I find fascinating is that all of them that come to mind were conceived by people who didn't really understand the space they were operating in and/or had no clear idea of what problem the language solved.
There was Dart, which was originally intended to be shipped as a VM in Chrome until the Chrome team said no.
But Go was originally designed as a systems programming language. There's a lot of historical revisionism around this now but I guarantee you it was. And what's surprising about that is that having GC makes that an immediate non-starter. Yet it happened anyway.
The other big surprise for me was that Go launched without external dependency management as a first-class citizen of the Go ecosystem. For the longest time there were two methods of declaring dependencies: either with URLs (usually GitHub) in the import statements or with badly supported manifests. Like, just copy what Maven did for Java. Not the bloated XML of course.
But Go has done many things right like having a fairly simple (and thus fast to compile) syntax, shipping with gofmt from the start and favoring error return types over exceptions, even though it's kind of verbose (and Rust's matching is IMHO superior).
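For readers who haven't used Go, the error-return style being described looks roughly like this (the file name and helper are made up for illustration):

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// readPort shows the standard Go idiom: every fallible call returns an error
// value that the caller explicitly checks and wraps, instead of throwing.
func readPort(path string) (int, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, fmt.Errorf("reading %s: %w", path, err)
	}
	port, err := strconv.Atoi(strings.TrimSpace(string(data)))
	if err != nil {
		return 0, fmt.Errorf("parsing port in %s: %w", path, err)
	}
	return port, nil
}

func main() {
	port, err := readPort("port.txt") // hypothetical config file
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("listening on", port)
}
```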
Channels were a nice idea but I've become convinced that cooperative async-await is a superior programming model.
Anyway, Go never became the C replacement the team set out to make. If anything, it's a better Python in many ways.
Good luck to Ian in whatever comes next. I certainly understand the issues he faced, which is essentially managing political infighting and fiefdoms.
Some of us believe GC[0] isn't an impediment for systems programming languages.
They haven't taken off as much as Xerox PARC, ETHZ, DEC, Olivetti, Compaq and Microsoft desired, more due to politics, external or internal (in MS's case), than technical impediments.
Hence why I like the way Swift and Java/Kotlin[1] are pushed on mobile OSes, to the point "my way or get out".
I might quibble with many of Go's decisions regarding minimalist language design, however I will gladly advocate for its suitability as a systems language.
The kind of systems we used to program for a few decades ago, compilers, linkers, runtimes, drivers, OS services, bare metal deployments (see TamaGo),...
[0] - Any form of GC, as per computer science definition, not street knowledge.
[1] - The NDK is relatively constrained, and nowadays there is Kotlin Native as well.
Sure. First you need to separate buffered and unbuffered channels.
Unbuffered channels basically operate like cooperative async/await but without the explicitness. In cooperative multitasking, putting something on an unbuffered channel is essentially a yield().
An awful lot of day-to-day programming is servicing requests. That could be HTTP, an RPC (eg gRPC, Thrift) or otherwise. For this kind of model IMHO you almost never want to be dealing with thread primitives in application code. It's a recipe for disaster. It's so easy to make mistakes. Plus, you often need to make expensive calls of your own (eg reading from or writing to a data store of some kind) so there's not really a performance benefit.
That's what makes cooperative async/await so good for application code. The system should provide compatible APIs for doing network requests (etc). You never have to worry about out-of-order processing, mutexes, thread pool starvation or a million other issues.
Which brings me to the more complicated case of buffered channels. IME buffered channels are almost always a premature optimization that is often hiding concurrency issues. As in if that buffered channels fills up you may deadlock where you otherwise wouldn't if the buffer wasn't full. That can be hard to test for or find until it happens in production.
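A contrived sketch of that failure mode: with an unbuffered channel the bug shows up on the very first send, but the buffer hides it until load exceeds the buffer size (names and sizes are arbitrary):

```go
package main

import "fmt"

func main() {
	requests := make(chan int, 2) // the buffer masks the bug for the first two sends
	start := make(chan struct{})

	// Consumer: only starts draining after it's told to.
	go func() {
		<-start
		for r := range requests {
			fmt.Println("handled", r)
		}
	}()

	// Producer: queues all the work before signalling the consumer.
	for i := 0; i < 5; i++ {
		requests <- i // blocks forever at i == 2: buffer full, consumer not draining yet
	}
	close(start)
	close(requests)
}
```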
But let's revisit why you're optimizing this with a buffered channel. It's rare that you're CPU-bound. If the channel consumer talks to the network any perceived benefit of concurrency is automatically gone.
So async/await doesn't allow you to buffer and create bugs for little benefit and otherwise acts like unbuffered channels. That's why I think it's a superior programming model for most applications.
Buffers are there to deal with flow variances. What you are describing as the "ideal system" is a clockwork. Your async-awaits are meshed gears. For this approach to be "ideal" it needs to be able to uniformly handle the dynamic range of the load on the system. This means every part of the clockwork requires the same performance envelope. (a little wheel is spinning so fast that it causes metal fatigue; a flow hits the performance ceiling of an intermediary component). So it either fails or limits the system's cyclical rate. These 'speed bumps' are (because of the clockwork approach) felt throughout the flow. That is why we put buffers in between two active components. Now we have a greater dynamic range window of operation without speed bumps.
It shouldn't be too difficult to address testing of buffered systems at implementation time. Possibly pragma/compile-time capabilities allowing for injecting 'delay' in the sink side to trivially create "full buffer" conditions and test for it.
There are no golden hammers because the problem domain is not as simple as a nail. Tradeoffs and considerations. I don't think I will ever ditch either (shallow, preferred) buffers or channels. They have their use.
I agree with many of your points, including coroutines being a good abstraction.
The reality is though that you are directly fighting or reimplementing the OS scheduler.
I haven’t found an abstraction that does exactly what I want but unfortunately any sort of structured concurrency tends to end up with coloured functions.
Something like C++ stdexec seems interesting but there are still elements of function colouring in there if you need to deal with async. The advantage is that you can compose coroutines and synchronous code.
For me I want a solution where I don’t need to care whether a function is running on the async event loop, a separate thread, a coprocessor or even a different computer and the actor/CSP model tends to model that the best way. Coroutines are an implementation detail and shouldn’t be exposed in an API but that is a strong opinion.
As you probably know, Rust ended up with async/await. This video goes deep into that and the alternatives, and it changed my opinions a bit: https://www.youtube.com/watch?v=lJ3NC-R3gSI
Golang differs from Rust by having a runtime underneath. If you're already paying for that, it's probably better to do greenthreading than async/await, which is what Go did. I still find the Go syntax for this more bothersome and error-prone, as you said, but there are other solutions to that.
I can see the appeal for simplicity of concept and not requiring any runtime, but it has some hard tradeoffs. In particular the ones around colored functions and how that makes it feel like concurrency was sort of tacked onto the languages that use it. Being cooperative adds a performance cost as well which I'm not sure I'd be on board with.
“Systems programming language” is an ambiguous term and for some definitions (like, a server process that handles lots of network requests) garbage collection can be ok, if latency is acceptable.
Google has lots of processes handling protobuf requests written in both Java and C++. (Or at least, it did at the time I was there. I don’t think Go ever got out of third place?)
My working definition of "systems programming" is "programming software that controls the workings of other software". So kernels, hypervisors, emulators, interpreters, and compilers. "Meta" stuff. Any other software that "lives inside" a systems program will take on the performance characteristics of its host, so you need to provide predictable and low overhead.
GC[0] works for servers because network latency will dominate allocation latency; so you might as well use a heap scanner. But I wouldn't ever want to use GC in, say, audio workloads; where allocation latency is such a threat that even malloc/free has to be isolated into a separate thread so that it can't block sample generation. And that also means anything that audio code lives in has to not use GC. So your audio code needs to be written in a systems language, too; and nobody is going to want an OS kernel that locks up during near-OOM to go scrub many GBs of RAM.
[0] Specifically, heap-scanning deallocators, automatic refcount is a different animal.
I wouldn’t include compilers in that list. A traditional compiler is a batch process that needs to be fast enough, but isn’t particularly latency sensitive; garbage collection is fine. Compilers can and are written in high-level languages like Haskell.
Interpreters are a whole different thing. Go is pretty terrible for writing a fast interpreter since you can’t do low-level unsafe stuff like NaN boxing. It’s okay if performance isn’t critical.
You don't (usually) inherit the performance characteristics of your compiler, but you do inherit the performance characteristics of the language your compiler implements.
From what I remember, Go started out because a C++ application took 30 minutes to compile even though they were using Google infrastructure. You could say that they set out to create a systems programming language (they certainly thought so), but mostly I think the real goal was recreating C++ features without the compile time, and in that, they were successful.
A lot of the time, a lack of bugfixes comes from the incentive structure management has created. Specifically, you rarely get rewarded for fixing things. You get rewarded for shipping new things. In effect, you're punished for fixing things because that's time you're not shipping new things.
Ownership is another one. For example, product teams who are responsible for shipping new things but support for existing things get increasingly pushed onto support teams. This is really a consequence of the same incentive structure.
This is partially why I don't think that all subscription software is bad. The Adobe end of the spectrum is bad. The Jetbrains end is good. There is value in creating good, reliable software. If your only source of revenue is new sales then bugs are even less of a priority until it's so bad it makes your software virtually unusable. And usually it took a long while to get there with many ignored warnings.
The whole New UI debacle really set the tone and expectations and I don't see them changing. They seem like a different company these days? Maybe I didn't really notice in the past.
JetBrains is dead within 5 years unless they can get their AI game figured out (which they’re not).
Don’t get me wrong, I love JetBrains products. However, their value has been almost exclusively in QoL for devs. AI is drastically cutting the need for that.
The jetbrains model is every new release fixes that one critical bug that's killing you, and adds 2 new critical bugs that will drive you mad. I eventually got fed up and jumped off that train.
Hmm, I’ve pretty much never experienced a bug in JetBrains products.
They’re one of the few products that just amazes me with how robust it is. Often, it will tell me I have issues before I even know about them (e.g my runtime is incorrect) and offer 1-click fixes.
Not really sure what you guys are talking about. I've been using Rider for years and it's been great. I'm using the new UI and I have no problems with commits or anything else.
Recently joined a new team where I have to use VS because we have to work through a remote desktop where I can't install new stuff without a lengthy process, and having used VS for a while now it's so much worse. I miss Rider practically every second I'm writing code. There is nothing that I need that VS does better, it's either the same or usually worse for everything I do.
I hope I'll get a bit more used to it over time but so far I hate it. Feels like it's significantly reducing my velocity compared to Rider.
Where to? There's nothing even remotely comparable for many tech stacks. I've been looking for alternatives for many years (also being fed up with their disregard for bugs and performance), but there are none (except for proper VS for Windows-first C++/C#).
Sadly, I just accepted having worse productivity. I didn't really have a choice, their bugs were actively breaking my workflow, like causing builds to fail. It definitely made me more frustrated and less productive on a day-to-day basis.
Netbeans is not for real development. Sorry, I love Netbeans. I grew up using it. It just doesn't have good support for real-world Java development. As for Eclipse, I'll use notepad over that any day. I've been programming in Java since high school, 20+ years ago.
IntelliJ is the best there is for Java, warts and all.
I just accepted I wasn’t going to find anything comparable, and just have to bite the bullet and accept software that has way less features, but at least consistently works, and doesn’t randomly decide to run at 800% CPU when a single file changes.
Now on team Zed. We’ll see how long that is good before it enshittifies too. I’m not sure if I should be happy they’re still not charging me for it.
In the next release, currently in beta; but they relented and moved it to an unsupported plugin. Not sure if the idea.properties setting, which still works, will be removed.
You're removing autonomy from the support team, this will demoralize them.
The issue becomes, you have two teams, one moving fast, adding new features, often nonsensical to the support team, and the second one cleaning up afterward. Being in clean-up crew ain't fun at all.
This builds up resentment, i.e. "Why are they doing this?".
EDIT: If you make it so support team approval is necessary for feature team, you'll remove autonomy from feature team, causing resentment in their ranks (i.e. "Why are they slowing us down? We need this to hit our KPIs!").
Some 20+ years ago we solved this by leapfrogging.
Team A does majority of new features in major release N.
Team B for N+1.
Team A for N+2.
Team A maintains N until N+1 ships.
Team B maintains N+1 until N+2 ships.
Xoogler here. I never worked on Fuchsia (or Android) but I knew a bunch of people who did and in other ways I was kinda adjacent to them and platforms in general.
Some have suggested Fuchsia was never intended to replace Android. That's either a much later pivot (after I left Google) or it's historical revisionism. It absolutely was intended to replace Android and a bunch of ex-Android people were involved with it from the start. The basic premise was:
1. Linux's driver situation for Android is fundamentally broken and (in the opinion of the Fuchsia team) cannot be fixed. Microsoft, for example, spent a lot of effort in Windows isolating faults within drivers to avoid kernel panics, and it created a relatively stable ABI for drivers. Linux doesn't do that. The process of upstreaming drivers is tedious and (IIRC) it often doesn't happen; and
2. (Again, in the opinion of the Fuchsia team) Android needed an ecosystem reset. I think this was a little more vague and, from what I could gather, meant different things to different people. But Android has a strange architecture. Certain parts are in the AOSP but an increasing amount was in what was then called Google Play Services. IIRC, an example was an SSL library. AOSP had one. Play had one.
Fuchsia, at least at the time, pretty much moved everything (including drivers) from kernel space into user space. More broadly, Fuchsia can be viewed in a similar way to, say, Plan 9 and micro-kernel architectures as a whole. Some think this can work. Some people who are way more knowledgeable and experienced in OS design are pretty vocal saying it can't because of the context switching. You can find such treatises online.
In my opinion, Fuchsia always struck me as one of those greenfield vanity projects meant to keep very senior engineers around. Put another way: it was a solution in search of a problem. You can argue the flaws in Android's architecture are real but remember, Google doesn't control the hardware. At that time at least, it was Samsung. It probably still is. Samsung doesn't like being beholden to Google. They've tried (and failed) to create their own OS. Why would they abandon one ecosystem they don't control for another they don't control? If you can't answer that, then you shouldn't be investing billions (quite literally) into the project.
Stepping back a bit, Eric Schmidt when he was CEO seemed to hold the view that ChromeOS and Android could coexist. They could compete with one another. There was no need to "unify" them. So often, such efforts to unify different projects just lead to billions of dollars spent, years of stagnation and a product that is the lowest common denominator of the things it "unified". I personally thought it was smart not to bother but I also suspect at some point someone would because that's always what happens. Microsoft completely missed the mobile revolution by trying to unify everything under Windows OS. Apple were smart to leave iOS and MacOS separate.
The only fruit of this investment and a decade of effort by now is Nest devices. I believe they tried (and failed) to embed themselves with Chromecast.
But I imagine a whole bunch of people got promoted and isn't that the real point?
This is probably the most complete story told publicly, but there was a lot of timeline with a lot of people in it, so as with any such complicated history "it depends who you ask and how you frame the question": https://9to5google.com/2022/08/30/fuchsia-director-interview...
I remember reading the fuchsia slide deck and being absolutely flabbergasted at the levels of architecture astronautics going on in it. It kept flipping back and forth between some generic PM desire ("users should be able to see notifications on both their phone and their tablet!") to some ridiculous overcomplication ("all disk access should happen via a content-addressable filesystem that's transparently synchronized across every device the user owns").
The slide with all of the "1.0s" shipped by the Fuchsia team did not inspire confidence, as someone who was still regularly cleaning up the messes left by a few select members, a decade later.
I worked on the Nest HomeHub devices and the push to completely rewrite an already shipped product from web/HTML/Chromecast to Flutter/Fuchsia was one of the most insane pointless wastes of money and goodwill I've seen in my career. The fuchsia teams were allowed to grow to seemingly infinite headcount and make delivery promises they could not possibly satisfy -- miss them and then continue with new promises to miss --while the existing software stack was left to basically rot, and disrespected. Eventually they just killed the whole product line so what was the point?
It was exactly the model of how not to do large scale software development.
Fuchsia the actual software looks very cool. Too bad it was Google doing it.
Linux's ever evolving ABI is a feature, not a bug. It's how Linux maintains technical excellence. I'll take that over a crusty backwards compatibility layer written 30 years ago that is full of warts.
1. Over time, profits will tend to decrease. The only way to sustain or increase profits is to cut costs or increase prices;
2. Executive compensation is tied to short-term profit making and/or (worse) the share price;
3. The above leads to every aspect of a company becoming financialized. We put the accountants in charge of everything. If you look at pretty much any company (eg Boeing) you can trace back their downfall to an era of making short-term profit decisions;
4. Intel has spent ~$152 billion in share buybacks over the last 35 years [1]. Why they need any subsidies is beyond me;
5. We keep giving money to these companies without getting anything in return. We fund research. Pharma companies get to profit off that without giving anything back. We bailed out banks after 2008. Why didn't we just nationalize those failing banks, restructure them, then sell (like any other bankruptcy)? We hand out subsidies with no strings attached. A lot of political hay is made out of "welfare" abuse. Well, the biggest form of welfare abuse is corporate welfare;
6. It is important to maintain a high corporate tax rate. Why? Because a low corporate tax rate means there is little to no cost to returning money to shareholders instead of investing in the business. You make $1 billion in profits. What do you do? If you invest it in the business, you get to spend $1 billion. If you pay a dividend or do a buyback, you get to give back $790 million (@ 21% corporate tax rate). Now imagine that corporate tax rate was 40% instead. It completely changes the decision-making process.
7. The Intel of 20 years ago was a fabrication behemoth that led the industry. It's crazy how far it's fallen and how it's unable to produce anything. It's been completely eclipsed by TSMC. Looking back, the decade-long delays in 10nm should've set off alarm bells at many points along the way.
8. There is no downside to malfeasance by corporate executives. None. In a just world, every one of the Sacklers would die penniless in a prison cell.
> Now imagine that corporate tax rate was 40% instead. It completely changes the decision-making process.
Seems more like a question of degree. Dividends are also taxed as income so ~36% is already paid in tax depending on the income of the shareholder. Increasing the corporate tax rate to 40% brings the effective tax rate to ~52%.
In my experience there's a more fundamental problem with large companies. In a small company, the best way to succeed as an individual (whatever position you have) is for the company as a whole to succeed. At a very large company, the best way to succeed is to be promoted up the ladder, whatever the cost. This effect is the worst at the levels just below the top: you have everything to lose and nothing to gain by the company being successful. It's far more effective to sabotage your peers and elevate yourself rather than work hard and increase the value of the company by a couple of percentage points.
The thing is, the people that have been there since the beginning still have the mindset of helping the company as a whole succeed, but after enough time and enough people have been rotated out, you're left with people at the top who only care about the politics. To them the company is simply a fixture - it existed before them and will continue to exist regardless of what they do.
You're alluding to the double taxation problem with dividends. This is a problem and has had a bunch of bad solutions (eg the passthrough tax break from 2017) when in fact the solution is incredibly simple.
In Australia, dividends come with what are called "franking credits". Imagine a company has a $1 billion profit and wants to pay that out as a dividend. The corporate tax rate is 30%. $700M is paid to shareholders. It comes with $300M (30%) in franking credits.
Let's say you own 1% of this company. When you do your taxes, you've made $10M in gross income (1% of $1B), been paid $7M and have $3M in tax credits. If your tax rate is 40% then you owe $4M on that $10M but you have already effectively paid $3M on that already.
The point is, the net tax rate on your $10M gross payout is still whatever your marginal tax rate is. There is no double taxation.
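Spelling out that arithmetic (simplifying Australian franking to the basic gross-up-and-credit mechanism, numbers as in the example above):

```go
package main

import "fmt"

func main() {
	// Numbers from the example above.
	companyProfit := 1_000_000_000.0 // $1B profit paid out as a fully franked dividend
	corporateRate := 0.30
	ownership := 0.01    // you own 1% of the company
	marginalRate := 0.40 // your personal marginal tax rate

	grossIncome := companyProfit * ownership             // $10M attributed to you
	frankingCredit := grossIncome * corporateRate        // $3M tax the company already paid
	cashReceived := grossIncome - frankingCredit         // $7M cash dividend
	taxOwed := grossIncome*marginalRate - frankingCredit // $4M liability minus the $3M credit

	fmt.Printf("cash received: $%.0f, additional tax owed: $%.0f\n", cashReceived, taxOwed)
	fmt.Printf("total tax on $%.0f gross: $%.0f (%.0f%%)\n",
		grossIncome, frankingCredit+taxOwed, 100*marginalRate)
	// Net effect: you pay your own marginal rate on the gross amount, once.
}
```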
That being said, dividends have largely fallen out of favor in favor of share buybacks. Some of those reasons are:
1. It's discretionary. Not every shareholder wants the income. Selling on the open market lets you choose whether you want the money or not;
2. Share buybacks are capital gains and generally enjoy lower tax rates than income;
3. Reducing the pool of available shares puts upward pressure on the share price; and
4. Double taxation of dividends.
There are some who demonize share buybacks specifically. I'm not one of them. It's simply a vehicle for returning money to shareholders, functionally very similar to dividends. My problem is doing either to the point of destroying the business.
Good points but AFAIK the bank loans from 2008 were paid back with interest, so that was definitely not free money. I would focus on root causes instead of shallow populist statements like that: too little regulation and oversight allowed the creation of securities that should never have existed in the first place.
No industry will self-regulate, as you write the lure of short term bonuses for execs is too high and punishment for failures are non existent. I expect current US admin will make this even worse, greed and short term profit seems to be the only focus.
I'm all for root cause analysis. A big part of that is that large companies become extremely risk-tolerant because history has shown there is little to no downside to their actions. If the government always bails you out, what incentive is there to be prudent? You may as well fly close to the Sun and pay out big bonuses now. Insolvency is a "next quarter" problem.
I'm aware that TARP funds were repaid. Still, a bunch of that money went straight into bonuses [1]. Honestly, I'd rather the company be seized, restructured and sold.
You know who ends up making sacrifices to keep a company afloat? The labor force. After 2008, auto workers took voluntary pay cuts, gave up benefits and otherwise did what they could to keep the company afloat, benefits it took them ~15 years to fight to get back. In a just world, executive compensation would go down to $1 until such a time that labor sacrifices are repaid.
On #6, that's an individual income tax (or capital gain tax, depends on how you define things). Corporate income tax is the one that is applied independently of the money being invested on the corporation or distributed.
I don't think you should subsidize reinvesting in huge companies anyway. What do you expect to gain from them becoming larger?
It's much better (for society) to let them send the money back to shareholders so they can invest on something else.
Reinvesting in the company is the one thing we should absolutely subsidize. That goes to wages, capital expenditure and other measures to sustain and grow the company.
Paying out dividends and doing share buybacks just strips the company for cash until there's nothing of value left. It's why enshittification is a thing.
Treating all wages as expenses seems fine to me. But have you noticed that large companies just stop growing at some point and it doesn't matter how much money you pour at them?
That is, unless they use the extra capital to buy legally-enforced monopolies, or bribe regulators out of their way.
And no, enshittification is a thing because people want those companies to grow and grow, and keep growing. Sometimes even after they have the majority of humanity as customers.
Speaking as a former Google Fiber software engineer, I'm honestly surprised this is still around.
In 2017, basically all the Google Fiber software teams went on hiatus (mine included). I can't speak to the timing or rationale but my theory is that Google leadership couldn't decide if the future of the Internet was wired or wireless, and a huge investment in wired might be invalidated if the future turned out to be wireless, so rather than guessing wrong, the leadership simply decided to definitely lose by mothballing the whole thing.
At that time, several proposed cities were put on hiatus, some of which had already hired local people. In 2019, Google Fiber exited Louisville, KY, paying penalties for doing so [1]. That really seemed like the end.
I also speculated that Google had tried or was trying to sell the whole thing. I do wonder if the resurrection it seems to have undergone is simply a result of the inability to find a buyer. I have no information to suggest that one way or the other.
There were missteps along the way. A big example was the TV software that was originally an acquisition, SageTV [2]. Somebody decided it would be a good idea to completely rewrite this Java app into Web technologies on an embedded Chrome instance on a memory-limited embedded CPU in a set-top box. Originally planned to take 6 months, it took (IIRC) 3.5+ years.
But that didn't actually matter at all in the grand scheme of things because the biggest problem and the biggest cost was physical network infrastructure. It is incredibly expensive and most of the issues are hyperlocal (eg soil conditions, city ordinances) as well as decades of lobbying by ISPs of state and local governments to create barriers against competition.
> In 2019, Google Fiber exited Louisville, KY, paying penalties for doing so
Those mistakes in Louisville were huge. Literally street destroying mistakes that city Civil Engineers predicted and fought from happening in the first place, but Google Fiber did them anyway. Left a huge bill to the city taxpayers. It wasn't bigger news and a bigger upset because of NDAs and other contract protection things involved, but as an outsider to those NDAs/contracts, I can say it was an incredibly bad job on too many fronts, and should have left Google Fiber with a much more tarnished reputation than it did.
> There were missteps along the way. A big example was the TV software that was originally an acquisition, SageTV [2]. Somebody decided it would be a good idea to completely rewrite this Java app into Web technologies on an embedded Chrome instance on a memory-limited embedded CPU in a set-top box. Originally planned to take 6 months, it took (IIRC) 3.5+ years.
I worked on the "misstep" with a small team, and it’s wild to see Fiber still around and even expanding to new cities. As far as I can tell, the set-top box software had nothing to do with why Fiber was scaled down. Also, usability surveys showed people really liked the GUI!
The client supported on-demand streaming, live TV, and DVR on hardware with... let’s call them challenging specs. Still, it turned out to be a pretty slick app. We worked hard to keep the UI snappy (min 30 FPS), often ditching DOM for canvas or WebGL to squeeze out the needed performance. A migration to Cobalt [1], a much lighter browser than embedded Chromium, was on the table, but the project ended before that could happen.
Personally, it was a great experience working with the Web Platform (always a solid bet) on less-traditional hardware.
+1 to what was said above; the UI didn't take 3.5 years to make - we launched it fairly quickly and then continued to improve on it. Later there was large UX refresh, so maybe that's where OP is getting confused? Either way, that software continued to work for years after the team was moved on to other projects. SageTV was good, but the UI wasn't java - it was a custom xml-like layout.
> In 2017, basically all the Google Fiber software teams went on hiatus (mine included).
What does a hiatus entail in this case? Did these teams all just stop working on Fiber stuff and sit around all day hoping they would be given something to do?
They laid us all off. They had huge plans - millions of users! Then they intersected reality in KC where all people wanted was 5Mbit service and free TV... There were many, many people working to perfect the settop box for example. We got fq_codel running on the wifi, we never got anywhere on the shaper, the plan was to move 1+m units of that (horrible integrated chip the comcerto C2000 - it didn't have coherent cache in some cases), I think they barely cracked 100k before pulling the plug on it all....
and still that box was better than what most fiber folk have delivered to date.
At least some good science was done about how ISPs really work... and published.
Too bitter. I referenced a little of that "adventure" here, in 2021... gfiber was attempting to restart with refreshing their now obsolete hardware... https://blog.cerowrt.org/post/trouble_in_paradise/
I was thinking the same thing, not to mention that when Google Fiber was first announced, I was happy to be all in on Google for services but now, I’d be hesitant to use them for anything more than I’m already tied to.
Sorry but this is just incorrect on many fronts. I can speak to this issue as a former engineer on Google Fiber so I got to see just how the sausage was made.
Existing national ISPs just have inbuilt advantages that a newcomer cannot replicate or can't replicate cheaply. This is the result of decades of lobbying state and local governments.
Take something as simple as where you run cables in the streets. You basically have two choices: you dig trenches or you string up cables on a pole. There is no best answer here as it depends on a lot of factors like weather and climate, local soil conditions, natural disaster risks (eg wildfires, earthquakes), distances involved and existing infrastructure and legislation.
So imagine in a given area trenching is uneconomical. This could be just because there's a lot of limestone rock in the soil so it's difficult, slow and expensive to actually dig the trenches and this may be complicated by local noise ordinances, permitting, surveying, existing trenches and so on. So you end up stringing up cables on poles.
Who owns those poles? Is it the city? Is it AT&T? You may have rights to string up cables on those poles but the devil really is in the details. You might have to apply for a permit for each and every pole separately. They might be approved by the city but then how does the work happen? Can you do it? Maybe. But you might need AT&T (or whoever) to do something first like move their own cables. Maybe several other companies have to move cables first. Maybe each company has 90 days to do that work and this can add up so it can take over a year just to be able to put a cable up on a pole. And you can't really do any work until all the poles are available. That's just how fiber works.
And where do you run the fiber to? Do you run it back possibly several miles to a POP? There are advantages in that but obvious disadvantages like cost and just overall cable size and weight. Or do you use local substations? If so, what kind of building is that? Is it a large building that residents find "ugly" and object to on aesthetic grounds or maybe even environmental grounds that means more delays? How much does that cost? Is AT&T grandfathered in with their substations and nodes?
And then after you've done all that and you have your last mile fiber, how many customers do you get? Roughly 30-40% of houses passed will take up fiber, so how many companies are you splitting that pool with? You have to amortize your entire network build over your projected customer base and it makes a massive difference if it's 10% of dwellings or 15% or 30% or 40%.
In industry parlance this is called an "overbuild" and is inherently economically inefficient. It'll actually raise the cost of every ISP because each will get a lower overall take up rate.
That's why the best solution is municipal broadband that either provides service or acts as a wholesaler to virtual ISPs.
The cost of running a fiber cable from a POP to a house has only gone up over the years and it's the majority of your cost. That's really why Internet costs haven't come down. And it's also why the best Internet in the US is municipal broadband, and it isn't even close.
Network protocols are slow to change. Just look at IPv6 adoption. Some of this is for good reason. Some isn't. Because of everything from threat reduction to lack of imagination, equipment at every step of the process will tend to throw away anything that looks weird, a process somebody dubbed "ossification". You'll be surprised how long-lasting some of these things are.
Story time: I worked on Google Fiber years ago. One of the things I worked on was services to support the TV product. Now if you know anything about video delivery over IP you know you have lots of choices. There are also layers like the protocols, the container format and the transport protocol. The TV product, for whatever reason, used a transport protocol called MPEG2-TS (Transport Streams).
What is that? It's a CBR (constant bit rate) protocol that stuffs seven 188-byte payloads into a single UDP packet that was (IPv4) multicast. Why 7? Well because 7 payloads (plus headers) came in under 1500 bytes and you start to run into problems with any IP network once you have larger packets than that (ie an MTU of 1500 or 1536 is pretty standard). This is a big issue with high bandwidth NICs such that you have things like Jumbo frames to increase throughput and decrease CPU overhead but support is sketchy on a heterogeneous network.
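A rough sketch of that packing in Go, assuming plain MPEG2-TS over UDP (the multicast address is made up and the TS packets are empty apart from the sync byte):

```go
package main

import (
	"fmt"
	"net"
)

const (
	tsPacketSize    = 188  // fixed MPEG2-TS packet size
	tsPacketsPerUDP = 7    // 7 * 188 = 1316 bytes of payload
	typicalMTU      = 1500 // IP + UDP headers must also fit under this
)

func main() {
	payload := tsPacketsPerUDP * tsPacketSize
	fmt.Printf("payload %d + 20 (IPv4) + 8 (UDP) = %d bytes, under a %d MTU\n",
		payload, payload+28, typicalMTU)

	// Illustrative send to a multicast group (address invented for the example).
	conn, err := net.Dial("udp", "239.1.2.3:5000")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	datagram := make([]byte, 0, payload)
	for i := 0; i < tsPacketsPerUDP; i++ {
		ts := make([]byte, tsPacketSize)
		ts[0] = 0x47 // every TS packet starts with a sync byte
		datagram = append(datagram, ts...)
	}
	if _, err := conn.Write(datagram); err != nil {
		panic(err)
	}
}
```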
Why 188-byte payloads? For compatibility with Asynchronous Transfer Mode ("ATM"), a long-dead fixed-packet-size protocol (53-byte cells including 48 bytes of payload IIRC; I'm not sure how you get from 48 to 188 because 4x48=192) designed for fiber networks. I kind of thought of it as Fibre Channel 2.0. I'm not sure that's correct however.
But my point is that this was an entirely owned and operated Google network and it still had 20-30+ year old decisions impacting its architecture.
Back to Homa, a few thoughts:
1. Focusing on at-least-once delivery instead of at-most-once delivery seems like a good goal. It allows you to send the same packet twice. Plus you're worried about data offset, not ACKing each specific packet (a toy sketch of this idea follows the list);
2. Priority never seems to work out. Like this has been tried. IP has an urgent bit. You have QoS on even consumer routers. If you're saying it's fine to discard a packet then what happens to that data if the receiver is still expecting it? It's well-intentioned but I suspect it just won't work in practice, like it never has previously;
3. Lack of connections also means lack of a standard model for encryption (ie SSL). Yes, encryption still matters inside a data center on purely internal connections;
4. QUIC (HTTP3) has become the de-facto standard for this sort of thing, although it's largely implementing your own connections in userspace over UDP; and
5. A ton of hardware has been built to optimize TCP and offload as much as possible from the CPU (eg checksumming packets). You see this effect with QUIC. It has significantly higher CPU overhead per payload byte than TCP does. Now maybe it'll catch up over time. It may also change as QUIC gets migrated into the Linux kernel (which is an ongoing project) and other OSs.
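Here's the toy sketch mentioned in (1): an idempotent receiver that tracks byte offsets, so duplicate packets are harmless and the only acknowledgement state that needs to go back to the sender is how far the contiguous prefix extends. This is not Homa's actual design, just the general idea:

```go
package main

import "fmt"

const chunkSize = 4 // fixed-size chunks to keep the sketch simple

// receiver applies chunks of a stream by byte offset. Writes are idempotent,
// so at-least-once delivery (duplicates included) is harmless.
type receiver struct {
	buf        []byte
	seen       map[int]bool // offsets already applied
	contiguous int          // everything before this offset has arrived
}

func (r *receiver) accept(offset int, data []byte) {
	if r.seen[offset] {
		return // duplicate delivery: drop it, no harm done
	}
	if offset+len(data) > len(r.buf) {
		grown := make([]byte, offset+len(data))
		copy(grown, r.buf)
		r.buf = grown
	}
	copy(r.buf[offset:], data)
	r.seen[offset] = true
	// Only this watermark needs reporting back to the sender,
	// rather than ACKing every individual packet.
	for r.seen[r.contiguous] {
		r.contiguous += chunkSize
	}
}

func main() {
	r := &receiver{seen: map[int]bool{}}
	r.accept(0, []byte("hell"))
	r.accept(0, []byte("hell")) // retransmitted duplicate: ignored
	r.accept(8, []byte("rld!")) // arrives out of order
	r.accept(4, []byte("o wo"))
	fmt.Println(string(r.buf[:r.contiguous])) // hello world!
}
```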