Abstract

Multi-modal medical image segmentation leverages complementary information across different modalities to enhance diagnostic accuracy, but faces two critical challenges: the requirement for extensive paired annotations and the difficulty in capturing complex inter-modality relationships. While Active Learning (AL) can reduce annotation burden through strategic sample selection, conventional methods suffer from unreliable uncertainty quantification. Meanwhile, Vector Quantization (VQ) offers a mechanism for encoding inter-modality relationships, yet existing implementations struggle with codebook misalignment across modalities. To address these limitations, we propose a novel Vector Quantization - Bimodal Entropy-Guided Active Learning (VQ-BEGAL) framework that employs a dual-encoder architecture with VQ to discretize continuous features into distinct codewords, effectively preserving modality-specific information while mitigating feature co-linearity. Unlike conventional AL methods that separate sample selection from model training, our approach integrates feature-level uncertainty estimation from cross-modal discriminator outputs into the training process, strategically allocating samples with different uncertainty characteristics to optimize specific network components, enhancing both feature extraction stability and decoder robustness. Experiments on benchmark datasets demonstrate that our approach achieves state-of-the-art performance while requiring significantly fewer annotations, making it particularly valuable for real-world clinical applications where labeled data is scarce. The code is available at https://github.com/xf-DU/vq-begal.
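
For readers unfamiliar with the quantization step the abstract mentions, the following is a minimal sketch of how VQ maps continuous feature vectors to their nearest codewords. It is not the authors' implementation: the function names, the toy codebook, and the toy features are all hypothetical, and a real model would learn the codebook and operate on encoder feature maps.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Replace each feature vector by its nearest codeword.

    Returns the chosen codeword indices and the quantized vectors.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for f in features:
        # Nearest-neighbour lookup over the whole codebook.
        idx = min(range(len(codebook)),
                  key=lambda k: squared_distance(f, codebook[k]))
        indices.append(idx)
        quantized.append(list(codebook[idx]))
    return indices, quantized


# Hypothetical toy codebook (3 codewords) and toy features (2-D).
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
features = [[0.1, -0.2], [0.9, 1.2], [0.2, 1.8]]
indices, quantized = quantize(features, codebook)
print(indices)  # each feature snaps to the closest codeword
```

Each continuous feature is replaced by a discrete codeword index, which is the sense in which VQ "discretizes" the feature space.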

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/4222_paper.pdf

SharedIt Link: https://rdcu.be/eHwXv

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04981-0_64

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/xf-DU/vq-begal

Link to the Dataset(s)

CHAOS dataset: https://chaos.grand-challenge.org/

AMOS dataset: https://amos22.grand-challenge.org/

BibTex

@InProceedings{DuXia_VectorQuantizationDriven_MICCAI2025,
        author = { Du, Xiaofei AND Wang, Haoran AND Wang, Manning AND Song, Zhijian},
        title = { { Vector-Quantization-Driven Active Learning for Efficient Multi-Modal Medical Segmentation with Cross-Modal Assistance } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15966},
        month = {September},
        pages = {680--689}
}

Reviews

Review #1

Please describe the contribution of the paper

The authors present VQ-BEGAL, a novel framework for multi-modal medical image segmentation that combines vector quantization with active learning. The proposed method features a dual-encoder architecture designed to extract modality-specific features while aligning them within a unified latent space, effectively addressing vector mismatch challenges. To further enhance segmentation performance, VQ-BEGAL incorporates an active learning mechanism that leverages uncertainty to selectively train specific components of the network using strategically chosen samples.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. Overall, the paper is well-written and easy to follow.
2. Adding sample selection into the training process using entropy calculation is an interesting approach.
3. The authors performed sufficient ablation studies to show the importance of each component of the framework.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. “Our approach integrates uncertainty estimation into the training process.” - This statement is misleading as they incorporate uncertainty estimation of the features to determine what to do with each sample. However, their method doesn’t quantify the uncertainty of the segmentation itself.
2. “High-uncertainty samples with complementary cross-modal information train the decoder (De), while low-uncertainty samples with redundant information stabilize encoder training (Ec and Em).” - The motivation for this choice is not clear from the text. How do the high-uncertainty samples help with decoder training and vice versa?
3. Only dealing with liver segmentation downplays the contribution of the framework, as most of the existing work focuses on multi-class segmentation in these kinds of scenarios. If only liver segmentation is used, the authors should also consider testing on out-of-distribution settings.
4. In section 3.2, no explanation for β, α1, α2, α3, α4 parameters is given. Where are these parameters used?
5. The authors should include at least one qualitative result to showcase the performance of the proposed framework.
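
To make the allocation rule questioned in the list above concrete, here is a minimal sketch of entropy-based routing. It assumes the discriminator outputs a single probability per sample and that a fixed entropy threshold separates the two groups; the function names, the threshold of 0.8, and the toy probabilities are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def binary_entropy(p: float, eps: float = 1e-12) -> float:
    """Shannon entropy (in bits) of a Bernoulli variable with probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def allocate(
    disc_probs: Sequence[float], threshold: float = 0.8
) -> Tuple[List[int], List[int]]:
    """Split sample indices by the entropy of a discriminator probability.

    High-entropy (uncertain) samples go to the decoder group and
    low-entropy samples go to the encoder group, mirroring the rule
    quoted by the reviewer. The threshold is an assumed value.
    """
    decoder_ids: List[int] = []
    encoder_ids: List[int] = []
    for i, p in enumerate(disc_probs):
        if binary_entropy(p) >= threshold:
            decoder_ids.append(i)
        else:
            encoder_ids.append(i)
    return decoder_ids, encoder_ids


# Hypothetical discriminator outputs for four samples.
probs = [0.5, 0.95, 0.4, 0.02]
decoder_ids, encoder_ids = allocate(probs)
print(decoder_ids, encoder_ids)
```

Probabilities near 0.5 mean the discriminator cannot tell the modalities apart, so those samples have the highest entropy; this measures feature-level uncertainty, not the uncertainty of the segmentation output, which is the distinction the reviewer draws in point 1.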
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Although the performance improvement is noticeable, only using liver segmentation (which does not have a lot of variability in general) makes the results incomplete.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #2

Please describe the contribution of the paper

This paper presents “Vector-Quantization-Driven Active Learning for Efficient Multi-Modal Medical Segmentation with Cross-Modal Assistance”, introducing three key innovations: (1) A dual-encoder architecture enhanced by vector quantization (VQ) to resolve vector mismatch via modality-specific feature extraction while learning a unified representation space. (2) The proposed VQ-BEGAL framework uniquely combines VQ-based feature disentanglement with an embedded active learning strategy—dynamically selecting and allocating high-uncertainty samples to optimize specific network components during training. (3) Rigorous validation on two public datasets, where the method demonstrates superior performance over state-of-the-art approaches in multi-modal segmentation tasks. The integration of VQ and active learning is novel and well-motivated, addressing both feature alignment and annotation efficiency. Overall, this is a technically sound contribution with promising empirical results for medical image segmentation.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Clear and Original Research Contribution: This paper presents a novel idea to fill a gap in existing literature.
- Logical Structure and Coherence: This article is well-organized, the different sections are well-separated, and the article is easy to follow. The limitations of the problem are well explained and the challenges in this area are addressed. Contributions are explicitly stated. The formulation of the problem is good.
- Good Evidence and Data Support: Experiments have been conducted on two public datasets and the results have been compared with related methods to determine the superiority of the proposed model. The authors have indicated that after publishing the article, they will release the relevant code for reproducibility.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Superficial Literature Review: The review of relevant literature appears weak, which could be due to the limitation of the number of pages.
- Methodological Flaws: The loss function, which appears to be a weighted combination of several terms, is not introduced, and the choice of weight for each term is not justified.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The authors claimed to release the source code and/or dataset upon acceptance of the submission.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
- Further ablation studies on the active learning component’s impact and scalability to larger datasets could strengthen the claims.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The novelty of the proposed method, the sufficiency of the evidence provided to validate it, and the ease of following the paper led me to make this decision.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #3

Please describe the contribution of the paper
- the authors implemented an active learning approach, including dual-encoder architecture and vector quantization to extract discrete codebook features of each of the two modalities; the sampling of train cases was then based on feature certainty, promoting cases with complementary features
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- novel approach to cross-modality feature disentanglement and active learning
- the proposed approach demonstrated the best liver segmentation performance on the CHAOS and AMOS datasets with respect to relevant established approaches
- the manuscript is well-structured and clearly written
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- experiments were limited to liver segmentation
- statistical significance of the improvements was not demonstrated
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
- The experiments were limited to liver segmentation, which is a rather large and well-contrasted organ; the submission could be made stronger by addressing a more challenging segmentation problem, e.g. brain cortex (thin articulated structure), brain tumors and lesions (high variability), etc., which also benefit from efficient cross-modal feature utilization
- It would be beneficial to include a visual presentation of the segmentation results in the manuscript
- the authors use the term “significant improvements” whereas statistical significance was not demonstrated; please use the term “substantial”
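
As an illustration of the kind of check the last comment above asks for, the sketch below runs an exact paired sign-flip permutation test on per-case scores of two methods. This is one possible test, not one prescribed by the reviewer or used in the paper, and the Dice values are made-up toy numbers that do not come from the paper.

```python
from itertools import product
from typing import Sequence


def paired_sign_flip_test(scores_a: Sequence[float],
                          scores_b: Sequence[float]) -> float:
    """Exact two-sided paired permutation (sign-flip) test.

    Under the null hypothesis the sign of each per-case difference is
    arbitrary, so every sign assignment is enumerated and the fraction
    whose absolute summed difference is at least as large as the
    observed one is returned as the p-value. Suitable for small n only,
    since 2**n assignments are visited.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    count = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            count += 1
    return count / total


# Made-up per-case Dice scores for two methods on the same six cases.
method_a = [0.91, 0.93, 0.90, 0.94, 0.92, 0.95]
method_b = [0.89, 0.90, 0.88, 0.91, 0.91, 0.92]
p_value = paired_sign_flip_test(method_a, method_b)
print(p_value)
```

With only six paired cases the smallest achievable two-sided p-value is 2/64, which shows why small test sets make it hard to support a claim of statistical significance.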
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Novelty of the proposed approach, clear presentation of the methodology, and evaluation against relevant established baseline approaches, with demonstration of superior performance.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Author Feedback

N/A

Meta-Review

Meta-review #1

Your recommendation

Provisional Accept
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A

Vector-Quantization-Driven Active Learning for Efficient Multi-Modal Medical Segmentation with Cross-Modal Assistance

Author(s):