Kind of. More like a mixture of a mixture of experts.
The problem is that MoE on its own can't use the context as a scratchpad for differentiated CoT trees.
So you get a mixture of token suggestions, but a single chain of thought.
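To make that distinction concrete, here's a toy sketch (all names made up, nothing resembling a real implementation): in plain MoE the mixing happens per token, inside one decoding trajectory, whereas a "mixture of CoTs" would branch at the sequence level, e.g. self-consistency-style sampling over several chains and voting.

```python
# Toy illustration of where the "mixture" happens. Everything here
# (experts, moe_step, mixture_of_cots) is hypothetical scaffolding,
# not a real architecture.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 8  # tiny toy vocabulary

# A few "experts": each is just a fixed logit vector over the vocab here.
experts = [rng.normal(size=VOCAB) for _ in range(4)]

def moe_step(ctx):
    """One MoE decoding step: the router mixes *expert outputs*,
    but the result is still a single next-token distribution."""
    # Uniform gates for simplicity; a real router would condition on ctx.
    gates = np.ones(len(experts)) / len(experts)
    logits = sum(g * e for g, e in zip(gates, experts))
    probs = np.exp(logits) / np.exp(logits).sum()
    return rng.choice(VOCAB, p=probs)

def single_cot(steps=10):
    """MoE alone: a mixture of token suggestions, one chain of thought."""
    ctx = []
    for _ in range(steps):
        ctx.append(moe_step(ctx))
    return ctx

def mixture_of_cots(k=5, steps=10):
    """Sequence-level mixture: sample k independent chains, then
    aggregate (here, a self-consistency-style vote on the final token)."""
    chains = [single_cot(steps) for _ in range(k)]
    finals = [c[-1] for c in chains]
    return max(set(finals), key=finals.count)

print("one chain:", single_cot())
print("vote over chains:", mixture_of_cots())
```

The point of the sketch is just that the first mixture lives inside a single trajectory, while the second mixes across trajectories.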
A mixture of both will probably perform better than a mixture of the former alone, especially given everything we know by now about in-context learning and how much signal synthetic data carries.
So… a sort of mixture of experts, if you will.