Publications

(2025). VisualPRM: An Effective Process Reward Model for Multimodal Reasoning.
(2024). Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.
(2024). Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization.
(2024). MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding.
(2024). MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity. Science China Information Sciences (SCIS).