2026

Grounding and Enhancing Informativeness and Utility in Dataset Distillation

Shaobo Wang, Yantai Yang, Guo Chen, Peiru Li, Kaixin Li, Yufa Zhou, Zhaorun Chen, Linfeng Zhang

ICLR 2026

We propose InfoUtil, a principled dataset distillation framework that formalizes informativeness and utility, leveraging Shapley value attribution and gradient-norm-based selection to synthesize compact yet highly effective datasets, achieving consistent SOTA performance and efficiency across architectures and benchmarks.

BibTeX Citation
@inproceedings{wang2026infoutil, title={Grounding and Enhancing Informativeness and Utility in Dataset Distillation}, author={Wang, Shaobo and Yang, Yantai and Chen, Guo and Li, Peiru and Li, Kaixin and Zhou, Yufa and Chen, Zhaorun and Zhang, Linfeng}, booktitle={The Fourteenth International Conference on Learning Representations}, year={2026}, url={https://openreview.net/forum?id=ThsYRbpv2F} }
FastCar: Cache Attentive Replay for Fast Auto-Regressive Video Generation on the Edge

Xuan Shen, Weize Ma, Yufa Zhou, Enhao Tang, Yanyue Xie, Zhengang Li, Yifan Gong, Quanyi Wang, Henghui Ding, Yiwei Wang, Yanzhi Wang, Pu Zhao, Jun Lin, Jiuxiang Gu

ICLR 2026

We propose FastCar, a unified framework that accelerates auto-regressive video generation by exploiting temporal redundancy through a Temporal Attention Score for selective computation reuse, integrating with sparse attention and dynamic scheduling to enable real-time, high-resolution synthesis with over 2.1× speedup and minimal quality loss.

BibTeX Citation
@article{shen2025fastcar, title={Fastcar: Cache attentive replay for fast auto-regressive video generation on the edge}, author={Shen, Xuan and Ma, Weize and Zhou, Yufa and Tang, Enhao and Xie, Yanyue and Li, Zhengang and Gong, Yifan and Wang, Quanyi and Ding, Henghui and Wang, Yiwei and others}, journal={arXiv preprint arXiv:2505.14709}, year={2025} }
The Geometry of Reasoning: Flowing Logics in Representation Space

Yufa Zhou*, Yixiao Wang*, Xunjian Yin*, Shuyan Zhou, Anru R. Zhang (* equal contribution)

ICLR 2026

We study how LLMs “think” through their embeddings by introducing a geometric framework of reasoning flows, where reasoning emerges as smooth trajectories in representation space whose velocity and curvature are governed by logical structure rather than surface semantics, validated through cross-topic and cross-language experiments, opening a new lens for interpretability.
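
A minimal way to make the summary's "velocity and curvature" concrete (an assumed discretization for intuition only; the paper may define these quantities differently): given hidden states $h_1, \dots, h_T \in \mathbb{R}^d$, one per reasoning step,

\[
v_t = h_{t+1} - h_t,
\qquad
\kappa_t = \arccos\!\left(\frac{\langle v_t,\, v_{t+1}\rangle}{\|v_t\|\,\|v_{t+1}\|}\right),
\]

so $\|v_t\|$ tracks how far the representation moves per step and $\kappa_t$ how sharply the trajectory turns.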

BibTeX Citation
@inproceedings{zhou2025geometry, title = {The Geometry of Reasoning: Flowing Logics in Representation Space}, author = {Zhou, Yufa and Wang, Yixiao and Yin, Xunjian and Zhou, Shuyan and Zhang, Anru R.}, booktitle = {The Fourteenth International Conference on Learning Representations}, year = {2026}, url = {https://openreview.net/forum?id=ixr5Pcabq7} }

2025

Automating Structural Engineering Workflows with Large Language Model Agents

Haoran Liang*, Yufa Zhou*, Mohammad Talebi-Kalaleh, Qipei Mei (* equal contribution)

arXiv 2025

We present MASSE, the first multi-agent system that automates structural engineering workflows by integrating reasoning, planning, and tool use to perform complex design and verification tasks—achieving training-free automation that cuts expert workload from hours to minutes and demonstrates tangible real-world impact.

BibTeX Citation
@article{liang2025masse, title = {Automating Structural Engineering Workflows with Large Language Model Agents}, author = {Haoran Liang and Yufa Zhou and Mohammad Talebi-Kalaleh and Qipei Mei}, journal = {arXiv preprint arXiv:2510.11004}, year = {2025} }
Why Do Transformers Fail to Forecast Time Series In-Context?

Yufa Zhou*, Yixiao Wang*, Surbhi Goel, Anru R. Zhang (* equal contribution)

NeurIPS 2025 Workshop: What Can('t) Transformers Do? Oral (3/68 ≈ 4.4%)

We analyze why Transformers fail in time-series forecasting through in-context learning theory, proving that, under AR($p$) data, linear self-attention cannot outperform classical linear predictors and suffers a strict $O(1/n)$ excess-risk gap, while chain-of-thought inference compounds errors exponentially—revealing fundamental representational limits of attention and offering principled insights.
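
For intuition, the setting in the summary can be written out as follows (an assumed formalization, not the paper's exact statement): the data follow an AR($p$) process, and the excess risk of an in-context forecaster $f$ built from $n$ past observations is measured against the best linear predictor,

\[
x_t = \sum_{i=1}^{p} a_i\, x_{t-i} + \varepsilon_t,
\qquad
\mathcal{E}(f) = \mathbb{E}\big[(x_t - f(x_{t-n}, \dots, x_{t-1}))^2\big]
\;-\; \min_{w \in \mathbb{R}^{p}} \mathbb{E}\Big[\big(x_t - \textstyle\sum_{i=1}^{p} w_i\, x_{t-i}\big)^2\Big];
\]

the summary's $O(1/n)$ excess-risk gap refers to this quantity for linear self-attention.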

BibTeX Citation
@article{zhou2025tsf, title={Why Do Transformers Fail to Forecast Time Series In-Context?}, author={Zhou, Yufa and Wang, Yixiao and Goel, Surbhi and Zhang, Anru R.}, journal={arXiv preprint arXiv:2510.09776}, year={2025} }
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Zichen Wen, Shaobo Wang, Yufa Zhou, Junyuan Zhang, Qintong Zhang, Yifeng Gao, Zhaorun Chen, Bin Wang, Weijia Li, Conghui He, Linfeng Zhang

NeurIPS 2025

We propose EPIC, a progressive consistency distillation framework that mitigates the training difficulty of token compression in multi-modal LLMs by enforcing token- and layer-level consistency, achieving superior efficiency, robustness, and generalization across benchmarks.

BibTeX Citation
@inproceedings{wen2025efficient, title={Efficient Multi-modal Large Language Models via Progressive Consistency Distillation}, author={Wen, Zichen and Wang, Shaobo and Zhou, Yufa and Zhang, Junyuan and Zhang, Qintong and Gao, Yifeng and Chen, Zhaorun and Wang, Bin and Li, Weijia and He, Conghui and Zhang, Linfeng}, booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems}, year={2025}, url={https://openreview.net/forum?id=gZjPllL9jM} }
Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective

Yingyu Liang*, Zhizhou Sha*, Zhenmei Shi*, Zhao Song*, Mingda Wan*, Yufa Zhou* (α–β alphabetical order)

ICCV 2025

We provide a theoretical analysis showing that for diffusion models with Gaussian mixture data, the diffusion process preserves the mixture structure; we derive tight, component-independent bounds on Lipschitz constants and second moments, and establish error guarantees for diffusion solvers—offering deeper insights into the diffusion dynamics under common data distributions.
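
The mixture-preservation claim can be illustrated with the standard DDPM-style forward process (assumed notation, not necessarily the paper's): if the data are a $K$-component Gaussian mixture, then each noisy marginal is again a $K$-component Gaussian mixture,

\[
x_0 \sim \sum_{k=1}^{K} \pi_k\, \mathcal{N}(\mu_k, \Sigma_k),
\qquad
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,\;\; \epsilon \sim \mathcal{N}(0, I)
\;\;\Longrightarrow\;\;
x_t \sim \sum_{k=1}^{K} \pi_k\, \mathcal{N}\!\big(\sqrt{\bar{\alpha}_t}\, \mu_k,\; \bar{\alpha}_t \Sigma_k + (1 - \bar{\alpha}_t) I\big),
\]

since an affine map of a Gaussian mixture plus independent Gaussian noise is again a Gaussian mixture with the same weights.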

BibTeX Citation
@inproceedings{liang2025unraveling, author = {Liang, Yingyu and Sha, Zhizhou and Shi, Zhenmei and Song, Zhao and Wan, Mingda and Zhou, Yufa}, title = {Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective}, booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)}, month = {October}, year = {2025}, pages = {11436-11446} }
Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs

Yufa Zhou*, Shaobo Wang*, Xingyu Dong*, Xiangqi Jin, Yifang Chen, Yue Min, Kexin Yang, Xingzhang Ren, Dayiheng Liu, Linfeng Zhang (* equal contribution)

arXiv 2025

We investigate whether post-training techniques such as SFT and RLVR can generalize to multi-agent systems, and introduce Recon—a 7B model trained on a curated dataset of economic reasoning problems—which achieves strong benchmark performance and exhibits emergent strategic generalization in multi-agent games.

BibTeX Citation
@article{zhou2025recon, title={Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs}, author={Zhou, Yufa and Wang, Shaobo and Dong, Xingyu and Jin, Xiangqi and Chen, Yifang and Min, Yue and Yang, Kexin and Ren, Xingzhang and Liu, Dayiheng and Zhang, Linfeng}, journal={arXiv preprint arXiv:2506.00577}, year={2025}, url={https://arxiv.org/abs/2506.00577} }
DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance

Xuan Shen*, Chenxia Han*, Yufa Zhou*, Yanyue Xie, Yifan Gong, Quanyi Wang, Yiwei Wang, Yanzhi Wang, Pu Zhao, Jiuxiang Gu (* equal contribution)

arXiv 2025

We propose DraftAttention, a method that accelerates video diffusion transformers by leveraging low-resolution pooled attention maps to enable dynamic sparse attention and hardware-efficient execution, achieving up to 1.75× speedup with minimal quality loss.
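
The described mechanism can be sketched generically in PyTorch (a minimal illustration of "pooled low-resolution attention map → block-sparse full attention", not the authors' implementation; the function name, pooling size, and keep ratio are assumptions, and the masked scores are still computed densely here purely for clarity):

import torch
import torch.nn.functional as F

def draft_sparse_attention(q, k, v, pool=8, keep_ratio=0.25):
    # q, k, v: (B, H, N, D) with N divisible by `pool`; purely illustrative.
    B, H, N, D = q.shape
    q_, k_, v_ = (x.reshape(B * H, N, D) for x in (q, k, v))
    # 1) Low-resolution "draft" attention from average-pooled queries/keys.
    q_low = F.avg_pool1d(q_.transpose(1, 2), pool).transpose(1, 2)   # (B*H, N/pool, D)
    k_low = F.avg_pool1d(k_.transpose(1, 2), pool).transpose(1, 2)
    draft = torch.softmax(q_low @ k_low.transpose(1, 2) / D ** 0.5, dim=-1)
    # 2) Keep only the highest-scoring key blocks for each query block.
    n_keep = max(1, int(keep_ratio * draft.shape[-1]))
    top = torch.topk(draft, n_keep, dim=-1).indices
    block_mask = torch.zeros_like(draft, dtype=torch.bool).scatter_(-1, top, True)
    # 3) Expand the block mask to token resolution and run masked attention
    #    (a real kernel would skip the masked blocks instead of masking them).
    mask = block_mask.repeat_interleave(pool, dim=-2).repeat_interleave(pool, dim=-1)
    scores = (q_ @ k_.transpose(1, 2)) / D ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    out = torch.softmax(scores, dim=-1) @ v_
    return out.reshape(B, H, N, D)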

BibTeX Citation
@article{shen2025draftattention, title={DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance}, author={Shen, Xuan and Han, Chenxia and Zhou, Yufa and Xie, Yanyue and Gong, Yifan and Wang, Quanyi and Wang, Yiwei and Wang, Yanzhi and Zhao, Pu and Gu, Jiuxiang}, journal={arXiv preprint arXiv:2505.14708}, year={2025} }
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers

Yingyu Liang*, Zhizhou Sha*, Zhenmei Shi*, Zhao Song*, Yufa Zhou* (α–β alphabetical order)

AISTATS 2025

We demonstrate that a looped 23-layer ReLU-MLP can function as a universal programmable computer—revealing that simple neural network modules possess greater expressive power than previously thought and can perform complex tasks without relying on advanced architectures like Transformers.

BibTeX Citation
@inproceedings{liang2025looped, title={Looped ReLU MLPs May Be All You Need as Practical Programmable Computers}, author={Liang, Yingyu and Sha, Zhizhou and Shi, Zhenmei and Song, Zhao and Zhou, Yufa}, booktitle={International Conference on Artificial Intelligence and Statistics}, pages={2647--2655}, year={2025}, organization={PMLR} }
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix

Yingyu Liang*, Jiangxuan Long*, Zhenmei Shi*, Zhao Song*, Yufa Zhou* (α–β alphabetical order)

ICLR 2025

We introduce a novel LLM weight pruning method that directly optimizes for approximating the non-linear attention matrix—with theoretical convergence guarantees—effectively reducing computational costs while maintaining model performance.

BibTeX Citation
@inproceedings{liang2025beyond, title={Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix}, author={Yingyu Liang and Jiangxuan Long and Zhenmei Shi and Zhao Song and Yufa Zhou}, booktitle={The Thirteenth International Conference on Learning Representations}, year={2025}, url={https://openreview.net/forum?id=sgbI8Pxwie} }
Numerical Pruning for Efficient Autoregressive Models

Xuan Shen, Zhao Song, Yufa Zhou, Bo Chen, Jing Liu, Ruiyi Zhang, Ryan A. Rossi, Hao Tan, Tong Yu, Xiang Chen, Yufan Zhou, Tong Sun, Pu Zhao, Yanzhi Wang, Jiuxiang Gu

AAAI 2025

We present a training-free structural pruning method using Newton’s method and compensation algorithms to efficiently compress decoder-only transformer models, achieving state-of-the-art performance with reduced memory usage and faster generation on GPUs.

BibTeX Citation
@inproceedings{shen2025numerical, title={Numerical pruning for efficient autoregressive models}, author={Shen, Xuan and Song, Zhao and Zhou, Yufa and Chen, Bo and Liu, Jing and Zhang, Ruiyi and Rossi, Ryan A and Tan, Hao and Yu, Tong and Chen, Xiang and others}, booktitle={Proceedings of the AAAI Conference on Artificial Intelligence}, volume={39}, number={19}, pages={20418--20426}, year={2025} }
LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

Xuan Shen, Zhao Song, Yufa Zhou, Bo Chen, Yanyu Li, Yifan Gong, Kai Zhang, Hao Tan, Jason Kuen, Henghui Ding, Zhihao Shu, Wei Niu, Pu Zhao, Yanzhi Wang, Jiuxiang Gu

AAAI 2025

We present LazyDiT, a framework that accelerates Diffusion Transformers by reusing computations from previous steps and dynamically skipping redundancies, achieving superior performance over existing methods like DDIM across multiple models and devices.

BibTeX Citation
@inproceedings{shen2025lazydit, title={Lazydit: Lazy learning for the acceleration of diffusion transformers}, author={Shen, Xuan and Song, Zhao and Zhou, Yufa and Chen, Bo and Li, Yanyu and Gong, Yifan and Zhang, Kai and Tan, Hao and Kuen, Jason and Ding, Henghui and others}, booktitle={Proceedings of the AAAI Conference on Artificial Intelligence}, volume={39}, number={19}, pages={20409--20417}, year={2025} }