publications

Happy to be cited!

2026

  1. MedSAM-Agent: Empowering Interactive Medical Image Segmentation with Multi-turn Agentic Reinforcement Learning
    Shengyuan Liu, Liuxin Bao, Qi Yang, and 6 more authors
    2026
  2. Beyond Sequential Distance: Inter-Modal Distance Invariant Position Encoding
    Lin Chen, Bolin Ni, Qi Yang, and 5 more authors
    2026
  3. CC-VQA: Conflict- and Correlation-Aware Method for Mitigating Knowledge Conflict in Knowledge-Based Visual Question Answering
    Yuyang Hong, Jiaqi Gu, Yujin Lou, and 7 more authors
    2026

2025

  1. HunyuanOCR Technical Report
    Hunyuan Vision Team, Pengyuan Lyu, Xiang Wan, and 7 more authors
    2025
  2. Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering
    Yuyang Hong, Jiaqi Gu, Qi Yang, and 6 more authors
    2025
  3. R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning
    Qi Yang, Bolin Ni, Shiming Xiang, and 3 more authors
    2025
  4. Re-ranking Reasoning Context with Tree Search Makes Large Vision-Language Models Stronger
    Qi Yang, Chenghao Zhang, Lubin Fan, and 3 more authors
    2025
  5. Efficient Redundancy Reduction for Open-Vocabulary Semantic Segmentation
    Lin Chen, Qi Yang, Kun Ding, and 5 more authors
    Neurocomputing, 2025
  6. SAM-MI: A Mask-Injected Framework for Enhancing Open-Vocabulary Semantic Segmentation with SAM
    Lin Chen, Yingjian Zhu, Qi Yang, and 3 more authors
    2025
  7. Taming Modality Entanglement in Continual Audio-Visual Segmentation
    Yuyang Hong, Qi Yang, Tao Zhang, and 5 more authors
    2025

2024

  1. Mask2Former with Improved Query for Semantic Segmentation in Remote-Sensing Images
    Shichen Guo, Qi Yang, Shiming Xiang, and 2 more authors
    Mathematics, 2024
  2. Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation
    Qi Yang, Xing Nie, Tong Li, and 5 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
    CVPR 2024 Highlight
  3. Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis
    Qi Yang, Binjie Mao, Zili Wang, and 6 more authors
    2024
  4. AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation
    Zili Wang, Qi Yang, Linsu Shi, and 4 more authors
    2024
  5. Continuous Speculative Decoding for Autoregressive Image Generation
    Zili Wang, Robert Zhang, Kun Ding, and 3 more authors
    2024

2023

  1. Dynamic High-Resolution Network for Semantic Segmentation in Remote-Sensing Images
    Shichen Guo, Qi Yang, Shiming Xiang, and 2 more authors
    Remote Sensing, 2023
  2. Continual Semantic Segmentation via Scalable Contrastive Clustering and Background Diversity
    Qi Yang, Xing Nie, Linsu Shi, and 3 more authors
    In 2023 IEEE International Conference on Data Mining (ICDM), 2023