Xiangbo Gao

I love to use deep learning to solve real-world problems.

About me

Autonomous Driving | Multi-agent Collaborative Perception | Aerial & Grounded Agent Cooperation | Current PhD @ TAMU | MS @ Umich | BS @ UCI

  • Location: Texas A&M University, College Station, TX
  • Age: 25
  • Nationality: China
  • Interests: Snowboarding, Skiing, Rock climbing
  • Study: Texas A&M University, College Station, TX

Publications

Selected

LangCoop: Collaborative Driving with Language

Xiangbo Gao, Runsheng Xu, Jiachen Li, Ziran Wang, Zhiwen Fan, Zhengzhong Tu

CVPR 2025

Multi-agent collaboration enhances autonomous driving by enabling connected vehicles to share information, but current communication methods suffer from bandwidth, heterogeneity, and information loss issues. We propose LangCoop, a language-driven collaboration framework that uses natural language as a compact, expressive medium for inter-agent communication. Featuring M3CoT for structured reasoning and LangPack for efficient message encoding, LangCoop achieves a 96% reduction in bandwidth while maintaining strong closed-loop driving performance in CARLA simulations.

AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving

Shuo Xing, Hongyuan Hua, Xiangbo Gao, Shenzhe Zhu, Renjie Li, Kexin Tian, Xiaopeng Li, Heng Huang, Tianbao Yang, Zhangyang Wang, Yang Zhou, Huaxiu Yao, Zhengzhong Tu

TMLR 2026

AutoTrust is a benchmark for assessing the trustworthiness of large vision-language models for autonomous driving (DriveVLMs). It aims to improve public safety by evaluating whether DriveVLMs operate reliably across critical trustworthiness dimensions.

STAMP: Scalable Task- And Model-agnostic Collaborative Perception

Xiangbo Gao, Runsheng Xu, Jiachen Li, Ziran Wang, Zhiwen Fan, Zhengzhong Tu

ICLR 2025

STAMP is a new framework for multi-agent collaborative perception in autonomous driving that enables diverse vehicles to share sensor data efficiently. Using adapter-reverter pairs to convert between agent-specific and shared feature formats in Bird's Eye View, it achieves better accuracy than existing methods while reducing computational costs and maintaining security across heterogeneous systems.

MambaST: A Plug-and-Play Cross-Spectral Spatial-Temporal Fuser for Efficient Pedestrian Detection

Xiangbo Gao, Asiegbu Miracle Kanu-Asiegbu, Xiaoxiao Du

ITSC 2024

MambaST is a new framework for pedestrian detection that combines RGB and thermal camera data while leveraging temporal information. It uses a novel Multi-head Hierarchical Patching and Aggregation structure with state space models to efficiently process multi-spectral data, achieving better results on small-scale detection while being more computationally efficient than transformer-based approaches.

Scale-free and Task-agnostic Attack: Generating Photo-realistic Adversarial Patterns with Patch Quilting Generator

Xiangbo Gao, Cheng Luo, Qinliang Lin, Weicheng Xie, Minmin Liu, Linlin Shen, Keerthy Kusumam, Siyang Song

ICASSP 2024

PQ-GAN is a novel scale-free generator for adversarial attacks that works on images of any size. Unlike previous methods limited to local or fixed-scale attacks, it demonstrates superior transferability, defense resistance, and visual quality when tested against other attack methods on ImageNet and CityScapes datasets.

Sample Hardness Based Gradient Loss for Long-Tailed Cervical Cell Detection

Minmin Liu, Xuechen Li, Xiangbo Gao, Junliang Chen, Linlin Shen, Huisi Wu

MICCAI 2022

A new Grad-Libra Loss method improves cancer cell detection in imbalanced cervical cancer datasets by adjusting for both sample difficulty and category distribution, achieving 7.8% better accuracy than standard approaches.

Preprints

AirV2X: Unified Air-Ground Vehicle-to-Everything Collaboration

Xiangbo Gao, Yuheng Wu, Fengze Yang, Xuewen Luo, Keshu Wu, Xinghao Chen, Yuping Wang, Chenxi Liu, Yang Zhou, Zhengzhong Tu

ArXiv 2025

While multi-vehicle collaboration improves safety and efficiency, traditional infrastructure-based V2X systems face high deployment costs and poor coverage in rural areas. To address this, we introduce AirV2X-Perception, a large-scale dataset that uses UAVs as flexible, low-cost perception units providing dynamic, occlusion-free bird’s-eye views. Spanning 6.73 hours of diverse driving scenarios, the dataset enables standardized development and evaluation of Vehicle-to-Drone (V2D) algorithms for aerial-assisted autonomous driving.

SafeCoop: Unravelling Full Stack Safety in Agentic Collaborative Driving

Xiangbo Gao, Tzu-Hsiang Lin, Ruojing Song, Yuheng Wu, Kuan-Ru Huang, Zicheng Jin, Fangzhou Lin, Shinan Liu, Zhengzhong Tu

ArXiv 2025

Collaborative driving systems utilize vehicle-to-everything (V2X) communication to enhance safety and efficiency, but traditional approaches face bandwidth, semantic, and interoperability limitations. Emerging language-driven V2X frameworks offer richer semantics and reasoning capabilities yet introduce new vulnerabilities such as message loss and semantic manipulation. To address these, we propose SafeCoop, an agentic defense pipeline that safeguards language-based collaboration through semantic firewalls, consistency checks, and multi-source consensus, achieving significant safety gains in closed-loop evaluations.

More Publications

Simulating the Unseen: Crash Prediction Must Learn from What Did Not Happen

Z Li, X Cao, X Gao, K Tian, K Wu, M Anis, H Zhang, K Long, J Jiang, X Li, ...

arXiv preprint arXiv:2505.21743, 2025

Automated Vehicles Should be Connected with Natural Language

X Gao, K Wu, H Zhang, K Tian, Y Zhou, Z Tu

arXiv preprint arXiv:2507.01059, 2025

DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving

M Godbole, X Gao, Z Tu

arXiv preprint arXiv:2506.17590, 2025

Professional Services

Conference and Journal Paper Reviewing

  • CV & ML: ICCV, CVPR, NeurIPS, T-PAMI
  • Robotics: ICRA, IROS, RA-L
  • Transportation: ITSC, TRB

Resume

Education

Ph.D. in Computer Science

Texas A&M University | 2025.1 - Present

M.S. in Robotics

University of Michigan, Ann Arbor | 2023.9 - 2024.12

B.S. in Computer Science | B.S. in Mathematics

University of California, Irvine | 2018.9 - 2023.3

Employment

Graduate Research Assistant

TACO Group @ Texas A&M University | 2025.1 - Present

Graduate Research Assistant

Map and Motion Lab @ University of Michigan, Ann Arbor | 2024.7 - 2024.12

Graduate Research Assistant

UM Ford Center for Autonomous Vehicles (FCAV) | 2023.12 - 2024.6

Perception Research Intern

Anhui Cowa ROBOT Co., Ltd, Shanghai, China | 2023.4 - 2023.7

Full-stack Software Developer

Tandll Investment Management Limited, China | 2020.6 - 2020.8

VR Software Developer

Calit2, University of California, Irvine | 2019.2 - 2019.7

Competitions

CVPR MEIS workshop 2025, Best Paper Award

N/A | 2025.6

"LangCoop: Collaborative Driving with Language" received the Best Paper Award at the CVPR MEIS workshop 2025.

CVPR Camera-based online HD map construction challenge 2023

N/A | 2023.5

Ranked 13th in the CVPR Camera-based online HD map construction challenge 2023.

UCI 2020 Machine Learning Hackathon

University of California, Irvine, CA, USA | 2020.4

1st place in the subproject on 3D Human Pose with Scene Constraints

Google Hash Code 2020 Algorithms Competition

Irvine, CA | 2020.2

2nd place at UCI | Team name: ε=.99