Abstract

Lymph node (LN) metastasis diagnosis in computed tomography (CT) scans is an essential yet very challenging task for esophageal cancer staging and treatment planning. Recent advances in deep learning have markedly improved the performance of LN metastasis classification. However, these methods often focus more on the averaged features of all CT slices containing a 3D LN instance, lacking effective fusion of key slice-wise features, which is important in the LN metastasis analysis by physicians. In addition, existing deep learning models are trained using CT scans in an end-to-end fashion, thus lacking the explicit incorporation of clinically relevant meta-imaging features (i.e., morphological and radiomic features). Meta-imaging features play a crucial role in LN assessment and may not be effectively captured by direct end-to-end deep learning models. To address these issues, we formulate the 3D LN metastasis classification as a multiple instance learning (MIL) problem by extracting and fusing slice-level features (instances) into a comprehensive bag representation. Building on this, we propose a two-stream MIL framework with a prototype-guided aggregation method that effectively captures LN characteristics at both local and global scales. Furthermore, a multi-scale multi-source fusion module is introduced to integrate the heterogeneous meta-imaging features with deep learning features, enhancing the comprehensive representation of LNs. Five-fold cross-validation on a cohort of 284 esophageal cancer patients with 809 pathology-confirmed LN instances demonstrates the superiority of our method over state-of-the-art approaches, with improvements of +2.66% in AUROC and +4.81% in sensitivity.
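
To make the MIL idea in the abstract concrete, the following is a minimal sketch of fusing slice-level features (instances) into a single bag representation using a prototype vector as an attention query. It is an illustration under stated assumptions, not the authors' implementation: the function name `prototype_attention_pool`, the scaled dot-product scoring, and the fixed (non-learned) prototype are all hypothetical choices, and the paper's actual two-stream aggregation may differ.

```python
import math


def prototype_attention_pool(instances, prototype):
    """Fuse slice-level feature vectors into one bag vector.

    instances: list of N feature vectors (one per slice group), each of length D.
    prototype: query vector of length D (learnable in a real model; fixed here,
               which is an assumption made for illustration only).
    Returns (bag_vector, attention_weights).
    """
    if not instances:
        raise ValueError("a bag needs at least one instance")
    dim = len(prototype)
    scale = math.sqrt(dim)
    # Scaled dot-product score between the prototype and every instance.
    scores = [sum(p * x for p, x in zip(prototype, inst)) / scale
              for inst in instances]
    # Numerically stable softmax over the instances.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted sum of the instances is the bag representation.
    bag = [sum(w * inst[d] for w, inst in zip(weights, instances))
           for d in range(dim)]
    return bag, weights


# Toy bag: three slice-level features of dimension 2.
instances = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prototype = [2.0, 0.0]
bag, weights = prototype_attention_pool(instances, prototype)
```

In this toy bag, the instances that align with the prototype receive larger attention weights than the orthogonal one, so they dominate the bag vector. This is the behavior that distinguishes attention-style aggregation from plain averaging over slices.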

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1417_paper.pdf

SharedIt Link: https://rdcu.be/eHwLe

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04927-8_30

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{LiHao_Lymph_MICCAI2025,
        author = { Li, Haoshen AND Ai, Tashan AND Wang, Yirui AND Ji, Zhanghexuan AND Yu, Qinji AND Lu, Le AND Dong, Bin AND Zhang, Li AND Ye, Xianghua AND Zhao, Kuaile AND Jin, Dakai},
        title = { { Lymph Node Metastasis Classification with Prototype-guided Multiple Instance Aggregation and Heterogeneous Feature Fusion } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15960},
        month = {September},
        pages = {312--321}
}

Reviews

Review #1

Please describe the contribution of the paper

This paper proposed an MIL method tailored for lymph node metastasis classification tasks with prototype-guided aggregation. They further introduced a multi-scale fusion module to integrate various types of features. Comprehensive experiments demonstrated the superiority compared to competitors.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

It’s a novel formulation of MIL and is also a new application of MIL to 3D CT scans. Technically speaking, this paper introduced learnable prototypes that act as cls tokens in transformers to capture global knowledge from different scales.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

This paper is well formulated and has conducted comprehensive experiments. Although it is well-organized overall, the technical descriptions in Section 2.1 are not very clear, leading to some confusion. For example, the rationale behind “group the CT slices into multiple sets of 3-channel images” is not fully explained—what is the purpose of this grouping? Could it involve combinations of images with more than 3 channels? I suggest that the authors provide a more formalized, symbolic description of this process. Also, there is a typo in paragraph 2 of section 2.1, where it should be P_g instead of F_g in the sentence “Formally, the learnable global prototype F_g …”. Additionally, there is a lack of detailed explanation on how the local stream is constructed and the reasoning behind the need to crop the ROI.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

My recommendation is based on the overall clarity of the problem definition and the details provided about the algorithm. However, Section 2.1 lacks a more detailed symbolic representation and formal definition, which significantly affected my evaluation.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #2

Please describe the contribution of the paper

As the main contribution, the authors build upon a two-stream MIL framework to capture LN-relevant features and introduce a novel feature fusion module that integrates meta-imaging features with deep learning–based meta features.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper is well written, and the illustrations effectively support understanding of the proposed framework.
- The authors review the literature and identify key limitations regarding the slice-level information of metastatic lymph nodes (LNs) and the potential of clinically relevant meta-imaging features.
- Both quantitative and qualitative data help demonstrate the clinical relevance, as well as the strengths and limitations of the method. The results in Table 1 highlight the superior performance, particularly in comparison to 3D methods.
- The authors also provide clear reasoning for both the method’s successes and its limitations. The effects of meta-imaging features and slice count are evident and reinforce the method’s robustness. The improvements over state-of-the-art methods are clear and statistically significant.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- My concern here is the validation scheme. Recent work [1] on LN metastasis for a different pathology shows limitations when much more robust validation schemes are used, i.e., multi-center and external validation.
[1] https://link.springer.com/article/10.1007/s11263-024-02314-1
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

It would be helpful to provide the source code.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The approach is novel and clinically relevant. The framework presents an effective feature fusion design. To the best of my knowledge, no prior work has combined MIL with clinically relevant meta-imaging features derived from radiomics and morphology to address the challenging problem of lymph node metastasis.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #3

Please describe the contribution of the paper

The paper uses an impressive cohort for esophageal cancer and deep learning to assess lymph node metastasis, which is very relevant. The deep learning approach uses CT images, which is the standard clinical practice. The paper uses MIL to fuse features from a local and global level in CT images in 2.5D and fuses them together with radiomics features.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The MIL approach is very novel and is very well described. The size of the cohort is quite large for an esophageal cancer cohort, though it appears to be from a single centre. The paper compares its approach against other modern methods to provide a very relevant comparison. The approach of the study is significantly better than the other approaches.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

The study would be more powerful with an external validation. The study uses a model to automatically detect LN, which is then used to make local patches for part of the modality fusion. It would potentially yield better results if the encoder of the local/global features was taken from a segmentation model directly.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(6) Strong Accept — must be accepted due to excellence
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The dataset is impressive but single centre. The analysis and results are robust. The methodology is very interesting.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Author Feedback

We thank the reviewers for their highly constructive and positive feedback. The following are our responses to review comments.

Response to Reviewer #1: We acknowledge the importance of external validation and plan to evaluate our method on additional datasets in future work. Furthermore, we will explore utilizing the encoder from a pre-trained segmentation model as the backbone for feature extraction, which we believe can further enhance the performance of both local and global feature learning.

Response to Reviewer #2: We acknowledge the limitation of using a single dataset and plan to extend our work to multi-center data in future studies to evaluate the robustness and generalizability of our method.

Response to Reviewer #3: Grouping the CT slices into multiple sets of 3-channel images lets us apply a 2D backbone to extract features from each group. Compared with a 3D model, a 2D model has better initialization weights, since it benefits from pre-training on ImageNet, and is computationally more efficient. These features are then aggregated by a multiple instance learning (MIL) framework, allowing effective interaction and fusion across slice groups. Regarding the unclear descriptions in Section 2.1, we will make detailed revisions in the camera-ready version to improve clarity.
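
The slice grouping described in this response can be sketched as follows. This is a hedged illustration only: the function name `group_slices`, the non-overlapping default stride, and the padding of the last group by repeating the final slice are assumptions, since the feedback does not specify how groups are formed or how leftover slices are handled.

```python
def group_slices(slices, channels=3, stride=3):
    """Group an ordered list of 2D slices into `channels`-slice stacks.

    Each stack can be fed to a 2D backbone as one multi-channel image.
    The last stack is padded by repeating the final slice when needed
    (a hypothetical choice, not taken from the paper).
    """
    if not slices:
        raise ValueError("need at least one slice")
    if channels < 1 or stride < 1:
        raise ValueError("channels and stride must be positive")
    groups = []
    for start in range(0, len(slices), stride):
        group = list(slices[start:start + channels])
        # Pad an incomplete trailing group with copies of its last slice.
        while len(group) < channels:
            group.append(group[-1])
        groups.append(group)
        # Stop once this group has reached the end of the volume.
        if start + channels >= len(slices):
            break
    return groups


# Seven axial slices (labels stand in for 2D arrays) -> three 3-channel images.
lymph_node_slices = ["s0", "s1", "s2", "s3", "s4", "s5", "s6"]
groups = group_slices(lymph_node_slices)
overlapping = group_slices(lymph_node_slices[:5], channels=3, stride=1)
```

With the default stride, the groups do not overlap. Setting `stride=1` gives a sliding window in which neighbouring groups share slices, one possible way to preserve more through-plane context at the cost of more instances per bag.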

Meta-Review

Meta-review #1

Your recommendation

Provisional Accept
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A

Lymph Node Metastasis Classification with Prototype-guided Multiple Instance Aggregation and Heterogeneous Feature Fusion

Author(s):