IMLIP 2026 Invited Talk | Haifeng Wang: Native Multimodal Large Model

About the Conference

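The International Conference on Multilingual Intelligent Information Processing (IMLIP) is a leading international academic conference in multilingual intelligent information processing and a high-level, internationally influential venue that brings together industry, academia, and research in the field. It aims to give scholars in China and abroad a platform for academic exchange and collaborative research, and to advance linguistics and natural language processing research on the languages of China's ethnic minorities, of countries along the Belt and Road, and of other regions.

The 3rd International Conference on Multilingual Intelligent Information Processing (IMLIP 2026) is hosted by the Chinese Association for Artificial Intelligence (CAAI) and co-organized by the CAAI Technical Committee on Multilingual Intelligent Information Processing and Kunming University of Science and Technology. It will be held in Kunming, Yunnan, on July 24-26, 2026, with an expected attendance of 300 to 500.

The conference will invite more than ten well-known experts from China and abroad, including academicians, internationally renowned linguists, and specialists in multilingual processing technology, to give invited talks. The goal is to promote exchange among multilingual linguistics researchers, the intelligent information processing research community, industry practitioners in big data and artificial intelligence, and interested members of the public.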

The opening is near, with much to look forward to
Let us look forward to it together

Talk Title

Native Multimodal Large Model

Abstract:

Unified multimodal modeling is one of the frontier trends in large model technology. This talk presents the technology behind a trillion-scale native multimodal large model: a natively autoregressive foundation model designed for unified understanding and generation across text, image, video, and audio. All modalities are trained from scratch under a unified next-group-of-tokens prediction objective. The model is built on an ultra-sparse mixture-of-experts (MoE) architecture, in which modality-agnostic expert routing and a shared expert pool improve cross-modal learning and generalization. In the pre-training stage, it pioneers an elastic training paradigm that allows a flexible trade-off among performance, model size, and inference latency in memory- or time-constrained scenarios. In the post-training stage, unified multimodal reinforcement learning is introduced to improve the convergence stability and multimodal reasoning capabilities of the trillion-scale ultra-sparse model. Techniques in the PaddlePaddle framework, such as multi-dimensional hybrid parallel training and flexible attention masking, substantially improve training efficiency. Comprehensive benchmark results show strong and balanced performance across multiple modalities.
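To make the routing idea in the abstract concrete, here is a minimal sketch of a sparse MoE layer with modality-agnostic top-k routing plus always-active shared experts. It is an illustration under stated assumptions, not the model's implementation, which the abstract does not describe: the names `ToyExpert` and `ToyMoELayer`, the sizes, the softmax gate, and the use of plain linear maps as experts are all hypothetical, and the code uses only the Python standard library rather than PaddlePaddle.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class ToyExpert:
    """Hypothetical expert: one square linear map, no bias or nonlinearity."""

    def __init__(self, dim, rng):
        self.w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def __call__(self, x):
        return [dot(row, x) for row in self.w]


class ToyMoELayer:
    """Assumed layout: a pool of routed experts plus a few shared experts."""

    def __init__(self, dim, num_routed, num_shared, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # One gate vector per routed expert; the gate sees only the embedding.
        self.gate = [[rng.uniform(-1, 1) for _ in range(dim)]
                     for _ in range(num_routed)]
        self.routed = [ToyExpert(dim, rng) for _ in range(num_routed)]
        self.shared = [ToyExpert(dim, rng) for _ in range(num_shared)]

    def route(self, x):
        """Return [(expert_index, weight)] for the top-k experts.

        The weights are renormalized so that they sum to 1.
        """
        probs = softmax([dot(g, x) for g in self.gate])
        top = sorted(range(len(probs)), key=lambda i: probs[i],
                     reverse=True)[:self.top_k]
        norm = sum(probs[i] for i in top)
        return [(i, probs[i] / norm) for i in top]

    def forward(self, x):
        out = [0.0] * len(x)
        # Shared experts process every token, whatever its modality.
        for expert in self.shared:
            out = [o + y for o, y in zip(out, expert(x))]
        # Only the top-k routed experts run, which keeps the layer sparse.
        for i, w in self.route(x):
            out = [o + w * y for o, y in zip(out, self.routed[i](x))]
        return out


layer = ToyMoELayer(dim=4, num_routed=8, num_shared=1, top_k=2)
tokens = [
    ("text", [0.1, -0.3, 0.7, 0.2]),
    ("image", [0.1, -0.3, 0.7, 0.2]),   # same embedding as the text token
    ("audio", [-0.5, 0.4, 0.0, 0.9]),
]
for modality, x in tokens:
    # The modality tag is deliberately never passed to the router.
    print(modality, layer.route(x))
```

Because the router receives only the token embedding, the text and image tokens above, which share an embedding, are sent to the same experts with the same weights. That is the sense in which this sketch is modality-agnostic; a production system would differ in gating function, load balancing, and expert design.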

About the Speaker

Haifeng Wang

Haifeng Wang is a National Outstanding Engineer, Chief Technology Officer (CTO) of Baidu, and Director of the National Engineering Research Center of Deep Learning Technology and Application.

He is the first Chinese President of the Association for Computational Linguistics (ACL), the founding President of the ACL Asia-Pacific Chapter, an ACL Fellow, an IEEE Fellow, and an Academician of the International Eurasian Academy of Sciences. He also serves as Vice President of the Chinese Society of Engineers, the Chinese Institute of Electronics, and the Chinese Information Processing Society of China.

As the first-listed contributor, he has won the Second Prize of the National Technological Invention Award, the Second Prize of the National Science and Technology Progress Award, the China Patent Gold Award, the First Prize of the Beijing Science and Technology Progress Award, the Grand Prize of the Wu Wenjun AI Science and Technology Progress Award, and the First Prize of the Science and Technology Progress Award from the Chinese Institute of Electronics, among others. He is a recipient of the Guanghua Engineering Science and Technology Award, the inaugural National Innovation and Excellence Award, and the first-ever Wu Wenjun AI Outstanding Contribution Award. He was selected for the National Hundred, Thousand, and Ten Thousand Talents Project and was awarded the title of "Young and Middle-aged Expert with Outstanding Contributions." He receives a special government allowance from the State Council and has been selected as a Beijing Scholar.
