It is extremely complicated for browsers to change their existing behavior, and we should be cautious about them doing so.
We could, however, introduce another attribute that has the correct behavior, deprecate `disabled`, and throw warnings during ARIA validation/linting for pages that continue to use it. Call it `blocked` or `unusable`, or if you want to be fancy, make a `status` attribute or something that accepts multiple values. Whatever seems most reasonable.
But my point is we're not trapped in a world where developers need to poorly replicate browser functionality for every single form they make; we could still have an attribute that makes it easy for developers to build forms with the correct behavior by default.
In JavaScript, adding `let` and `const` didn't require us to get rid of `var`. We didn't have to change `var` behavior to make `let` throw errors on redeclarations. There are options here for providing tools within browsers that work correctly.
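To make the `var`/`let` comparison concrete, here is a minimal sketch. The `parses` helper is a hypothetical name made up for illustration; it only compiles each snippet with `new Function` so that the redeclaration error can be caught instead of breaking the whole script.

```javascript
// Returns true if `source` parses as a function body, false on a SyntaxError.
// new Function() only compiles the code here; the body is never executed.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Old behavior, left untouched: redeclaring with `var` is still allowed.
const varRedeclares = parses("var x = 1; var x = 2;");

// Newer keyword, stricter behavior: redeclaring with `let` is rejected.
const letRedeclares = parses("let y = 1; let y = 2;");

console.log({ varRedeclares, letRedeclares });
// { varRedeclares: true, letRedeclares: false }
```

The old keyword keeps its old semantics and the stricter rule only applies to code that opts in, which is the same shape as adding a new attribute next to `disabled`.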
They can, but they have to think a lot about backwards compatibility and existing code. So sometimes they can't, and in this particular case I think it's going to be hard, mostly because of existing JavaScript code.
Also there's standardization and coordination between browsers, but if there is a genuine need for something they tend to find a way.
Even if the browser is somehow required to move the official "focus" to an arbitrary useless location, they don't have to treat that focus as the sole gospel truth of where to read from.
No, they literally have to. The accessibility tool landscape is a wild place, since there are so many different human disability types, and people need different assistance in different ways, so there is no one tool that can help everyone end-to-end. Some people can see, but their hands can't type, so they use a custom input method. Some people can't see, but they can type, so they just need to be able to hear the page but otherwise can interact with the page with their normal keyboard. Some people have issues with seeing, hearing, and typing, so they have their own custom setup.
The browser's focus has to be taken as the gospel truth, because that is the only thing that all these different assistance tools can agree on. There is no standard otherwise. If you want to be a smartass and develop a speech-to-text tool that does its own focus management separate from the browser's, well, now people who need a custom input solution can't use your tool, because what they are typing into (the browser's focus) is different from what your tool is focusing on.
No one is inputting directly to `window.document`, so in the scenario in the article you need to move the browser focus separately anyway. You'll notice I specifically said "where to read from", which is a distinctly different question from where input goes. Yes, you'll want to bring them together pretty often, but that's still better than dumping the user at the document root anytime a developer does something normal and predictable like disabling a button.
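For reference, the kind of manual workaround being argued about looks roughly like the sketch below. `disableKeepingFocus` is a hypothetical helper, not a browser API, and the choice of `fallback` is an assumption left to the page author. The stand-in objects exist only so the sketch runs outside a browser; on a real page you would pass `document` and actual elements.

```javascript
// Hypothetical helper, not a browser API. `doc` only needs an `activeElement`
// property and `fallback` only needs a `focus()` method, so this works with a
// real document or with simple stand-in objects.
function disableKeepingFocus(doc, button, fallback) {
  // Move focus first, while the button can still be compared against it.
  if (doc.activeElement === button) {
    fallback.focus();
  }
  button.disabled = true;
}

// Stand-in objects so the sketch runs outside a browser.
const fakeDoc = { activeElement: null };
function fakeElement(name) {
  return {
    name,
    disabled: false,
    focus() { fakeDoc.activeElement = this; },
  };
}

const submit = fakeElement("submit");
const firstField = fakeElement("first-field");

submit.focus();
disableKeepingFocus(fakeDoc, submit, firstField);
console.log(fakeDoc.activeElement.name, submit.disabled);
```

Every form that disables a possibly-focused control needs some version of this, which is the per-form replication of browser behavior the thread is complaining about.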