So, in practice there are some limitations here. Chat interfaces force you to feed the entire context to the model everytime you ping it. Even multi step tool calls have a similar thing going.
So, yeah we may effectively turn all of this effectively into autoregressive models too.