
Efficient LLMs and RL & Agentic Systems

When RL Meets Adaptive Speculative Training: A Unified Training-Serving System

Junxiong Wang*†, Fengxiang Bie*†, Jisen Li†, Zhongzhu Zhou†, Zelei Shao†, Yubo Wang†, Yinghui Liu†, Qingyang Wu, Avner May, Sri Yanamandra, Yineng Zhang, Ce Zhang, Tri Dao, Percy Liang, Ben Athiwaratkun, Shuaiwen Leon Song, Chenfeng Xu† (co-lead), Xiaoxia Wu [Website]


Aurora is the first speculative decoding system with day-0 support! You can start accelerating your LLM today! 😎


Beat the long tail: Distribution-Aware Speculative Decoding for RL Training

Zelei Shao, Vikranth Srivatsa, Sanjana Srivastava, Qingyang Wu, Alpay Ariyak, Xiaoxia Wu, Ameen Patel, Jue Wang, Percy Liang, Tri Dao, Ce Zhang, Yiying Zhang, Ben Athiwaratkun, Chenfeng Xu, Junxiong Wang [MLSys 2026]


Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives

Wang Qinsi*, Jinghan Ke*, Masayoshi Tomizuka, Kurt Keutzer, Chenfeng Xu [ICLR 2025]


ThunderAgent: A Fast, Simple, and Robust Program-Aware Agentic Inference System

Hao Kang*, Ziyang Li*, Xinyu Yang*, Weili Xu, Yinfang Chen, Junxiong Wang, Beidi Chen, Tushar Krishna, Chenfeng Xu, Simran Arora [website]


CDLM: Consistency Diffusion Language Models For Faster Sampling

Minseo Kim, Chenfeng Xu, Coleman Hooper, Harman Singh, Ben Athiwaratkun, Ce Zhang, Kurt Keutzer, Amir Gholami [MLSys 2026]


Angles Don’t Lie: Unlocking Training-Efficient RL Through the Model’s Own Signals

Qinsi Wang*, Jinghan Ke*, Hancheng Ye, Yueqian Lin, Yuzhe Fu, Jianyi Zhang, Kurt Keutzer, Chenfeng Xu†, Yiran Chen† [NeurIPS 2025 Spotlight]
