Journal

On the Value of Myopic Behavior in Policy Reuse.

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025

We present a framework called Selective Myopic bEhavior Control (SMEC), which results from the insight that the short-term behaviors of prior policies are sharable across tasks.

Chenjia Bai, Kang Xu, Shuang Qiu, Haoran He, Bin Zhao, Zhen Wang, Wei Li, Xuelong Li

MoRE: Mixture of Residual Experts for Humanoid Lifelike Gaits Learning on Complex Terrains

In Pattern Recognition, 2025 (under review)

We propose a novel framework that enables humanoid robots to traverse complex terrains with controllable human-like gaits using a mixture of latent residual experts and multi-discriminators.

Dewei Wang, Xinmiao Wang, Xinzhe Liu, Jiyuan Shi, Yingnan Zhao, Chenjia Bai^✉, Xuelong Li^✉

Cross-Domain Offline Policy Adaptation with Dynamics- and Value-Aligned Data Filtering

arXiv preprint

We propose DVDF, a method for cross-domain offline RL that filters source data by both dynamics and value alignment, achieving strong performance in challenging settings.

Zhongjian Qiao, Rui Yang, Jiafei Lyu, Chenjia Bai, Xiu Li, Zhuoran Yang, Siyang Gao, Shuang Qiu

Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach

arXiv preprint arXiv:2512.02834

We propose TACO, a test-time-scaling framework for VLAs that improves inference stability and success rates by preventing distribution shifts at test time.

Siyuan Yang, Yang Zhang, Haoran He, Ling Pan, Xiu Li, Chenjia Bai^✉, Xuelong Li^✉
