So you’re saying the experts chosen are a more literal mixture of layers from each model? Rather than a simple “pick which model to run”?