This repo contains PyTorch model definitions, pre-trained weights, and training/sampling code for our paper, "Scaling Diffusion Transformers to 16 Billion Parameters" (DiT-MoE). DiT-MoE, as a sparse ...