Knowledge-Intensive Multimodal Reasoning
ICCV 2025, October 22, 2025, Hawaii

About The Workshop

This workshop aims to advance the frontier of multimodal AI systems that can effectively reason across specialized domains requiring extensive domain knowledge. Recent advancements in multimodal AI—combining information from text, images, audio, and structured data—have unlocked impressive capabilities in general-purpose reasoning. However, significant challenges persist when these systems encounter scenarios demanding deep domain expertise in fields such as medicine, engineering, and scientific research. Such contexts require expert-level perception and reasoning grounded in extensive subject knowledge, highlighting the need for specialized strategies to handle domain-specific complexity. Through invited talks, panel discussions, and interactive poster sessions, researchers and practitioners from diverse backgrounds will share the latest developments, ongoing hurdles, and promising future directions for knowledge-intensive multimodal reasoning. The workshop seeks to foster collaboration and stimulate innovation toward next-generation multimodal AI systems capable of reliable, transparent, and contextually grounded reasoning in specialized, high-stakes environments.

Schedule

Coming Soon



Topics

The workshop will cover a range of topics, including but not limited to:

Knowledge-intensive Multimodal Learning:

This topic focuses on methods and architectures that integrate domain-specific knowledge with diverse data sources (e.g., text, images, sensor data, and structured data) across specialized fields. We will cover data curation strategies, modality fusion techniques, representation learning frameworks, and explainability methods aimed at ensuring that models capture the domain knowledge crucial for reliable reasoning in high-stakes settings.

Multimodal Foundation Models for Specialized Domains:

This topic investigates how to adapt large-scale and general-purpose multimodal foundation models for domains where specialized expertise is essential, such as clinical diagnostics, scientific research, and advanced engineering applications. We will cover strategies for efficient fine-tuning, prompt engineering, domain-centric pre-training, and knowledge distillation to blend foundational capabilities with expert-level insights.

Embodied AI for Knowledge-Intensive Scenarios:

This topic explores the integration of multimodal reasoning in physical or interactive domains, ranging from industrial automation to laboratory robotics. Key discussion points include sensor fusion, adaptive learning with minimal supervision, human-robot collaboration, and simulation-to-real transfer in safety-critical scenarios. Emphasis will be placed on how advanced reasoning techniques—grounded in specialized domain knowledge—can help ensure transparency, robustness, and trustworthiness in embodied AI systems.

Evaluation and Benchmarking:

Robust evaluation protocols and benchmarks are essential for gauging progress and ensuring the reliability of domain-specific multimodal AI. We will cover the development of standardized benchmarks, performance metrics, and testing methodologies designed to capture the full spectrum of specialized domain challenges for multimodal reasoning.

Exploring Broader Topics in Knowledge-intensive Multimodal Reasoning:

In addition to the core themes above, our discussions will expand to emerging areas such as integrating symbolic and neural methods for structured reasoning, ensuring privacy and security with sensitive data, exploring multi-agent collaboration for complex decision-making, and examining societal and ethical considerations when deploying multimodal systems in real-world, high-stakes environments.



Call For Papers

Key Dates

  • Submission Deadline: August 5, 2025 (AoE)
  • Notification: August 30, 2025 (AoE)
Deadlines are strict and will not be extended under any circumstances. All deadlines follow the Anywhere on Earth (AoE) timezone.

Submission Site

Submissions will be managed via OpenReview. Papers will remain private during the review process. All authors must maintain up-to-date OpenReview profiles to ensure proper conflict-of-interest management and paper matching. Incomplete profiles may result in desk rejection.



Accepted Papers

Coming Soon



Student Registration Grant

Coming Soon

Speakers and Panelists

Coming Soon

Organizers

This workshop is organized by

Faculty Organizers

Xiangliang Zhang

Notre Dame

Manling Li

Northwestern

Yapeng Tian

UT Dallas

Student Organizers

Yilun Zhao

Yale

Tianyu Yang

Notre Dame

Zhenting Qi

Harvard
