Abstract

We present CheXtriev, a graph-based, anatomy-aware framework for chest radiograph retrieval. Unlike prior methods focussed on global features, our method leverages graph transformers to extract informative features from specific anatomical regions. Furthermore, it captures spatial context and the interplay between anatomical location and findings. This contextualization, grounded in evidence-based anatomy, results in a richer anatomy-aware representation and leads to more accurate, effective and efficient retrieval, particularly for less prevalent findings. CheXtriev outperforms state-of-the-art global and local approaches by 18% to 26% in retrieval accuracy and 11% to 23% in ranking quality.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3636_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/3636_supp.pdf

Link to the Code Repository

https://github.com/cvit-mip/chextriev

Link to the Dataset(s)

https://physionet.org/content/mimic-cxr/2.0.0/

https://physionet.org/content/mimic-cxr-jpg/2.1.0/

https://physionet.org/content/chest-imagenome/1.0.0/

BibTex

@InProceedings{Aka_CheXtriev_MICCAI2024,
        author = { Akash R. J., Naren and Tadanki, Arihanth and Sivaswamy, Jayanthi},
        title = { { CheXtriev: Anatomy-Centered Representation for Case-Based Retrieval of Chest Radiographs } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15001},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The manuscript proposes an anatomy-aware framework for chest radiograph retrieval; using evidence-based anatomy ensures greater precision in describing less common findings.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1. The manuscript leverages graph transformers to extract local features, circumventing non-specific abnormalities.
    2. It may integrate specific anatomical location information with subtle details within the region, which enriches the representation.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. The motivation is not sufficiently clear. The anatomy-based innovation lacks robust support, appearing to be a naive intuitive conjecture or to be based on post-experimental interpretability analysis.
    2. Although this manuscript mentions multi-label classification issues, the authors fail to elucidate the limitations of multi-label retrieval in this retrieval task.
    3. The clinical value should be emphasized. Intuitively, text similarity search appears to be more efficient and more widely accepted in clinical applications.
    4. The research content is not sufficiently pertinent. The introduction mentions that this retrieval framework requires chest radiographs with high similarity to the target, so the similarity query appears to be the bottleneck of this task. However, the focus of the manuscript lies in the robust extraction of fine-grained features, which seems biased.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    CheXtriev builds a graph transformer on anatomical priors to extract fine-grained local features for chest radiograph similarity retrieval, and its effectiveness is validated through multiple experiments. However, from a clinical application perspective, the lack of robust support for the anatomical priors and the potential loss of efficiency associated with such annotations are noteworthy. The details are as follows.

    Strengths:
    1. The manuscript leverages graph transformers to extract local features, circumventing non-specific abnormalities.
    2. It may integrate specific anatomical location information with subtle details within the region, which enriches the representation.

    Weaknesses:
    1. The motivation is not sufficiently clear. The anatomy-based innovation lacks robust support, appearing to be a naive intuitive conjecture or to be based on post-experimental interpretability analysis.
    2. Although this manuscript mentions multi-label classification issues, the authors fail to elucidate the limitations of multi-label retrieval in this retrieval task.
    3. The clinical value should be emphasized. Intuitively, text similarity search appears to be more efficient and more widely accepted in clinical applications.
    4. The research content is not sufficiently pertinent. The introduction mentions that this retrieval framework requires chest radiographs with high similarity to the target, so the similarity query appears to be the bottleneck of this task. However, the focus of the manuscript lies in the robust extraction of fine-grained features, which seems biased.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Authors should clarify the practical clinical significance of this study.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The authors addressed all my concerns in the response. Therefore, I have changed my score.



Review #2

  • Please describe the contribution of the paper

    The authors propose a method for case-based retrieval of chest radiographs that utilizes a graph transformer with local anatomy-aware features. The authors show that the proposed method can outperform the SOTA on multiple chest radiograph datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Using anatomy-aware features for retrieval.
    2. Learnable edge features that are independent of the input to encode the context.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Many symbols in Sec. 2.2 are not sufficiently explained. For example, many superscripts and subscripts are used without explanation.
    2. The encoding of the local features is very similar to AnaXNet but not enough comparison is given, which makes distinguishing the authors’ contribution difficult.
    3. In Table 1, V0 actually outperforms AnaXNet. However, both V0 and AnaXNet use very similar local feature encoding methods. It is unclear why V0 is better than AnaXNet. It would be better to provide details on how AnaXNet is evaluated.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    na

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    In addition to the weakness,

    1. it would be better to distinguish the proposed work from its predecessor (e.g., AnaXNet) by emphasizing the proposed additions. For example, the description of the problem formulation or common architecture design can be one subsection and the proposed method can be a separate subsection that include only the changes.
    2. the symbols can be simplified to remove many unnecessary confusions. This way, better explanations could be given while still fitting the page limit.
    3. The ablation study in Table 1 does not align with the description in Section 2.2. The subsubsection names could be better formulated to match the ablation.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method seems to provide a significant performance boost. However, there are a few concerning factors in the main weaknesses that could challenge the performance gain. Therefore, the authors’ rebuttal is important for my recommendation.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors sufficiently addressed my concerns.



Review #3

  • Please describe the contribution of the paper

    The authors presented a graph-based anatomy aware framework for chest xray retrieval. Their method leverages graph transformers to extract informative features from specific anatomical regions. It also captures spatial context and the interplay between anatomical location and findings. Their method outperforms state-of-the-art global and local approaches by 18% to 26% in retrieval accuracy and 11% to 23% in ranking quality.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) Image retrieval is a very useful application, and chest xrays are especially difficult to interpret, therefore giving the paper a good amount of clinical significance; 2) The authors conducted different ablation experiments in order to prove that the proposed network was the optimal design; 3) The authors utilized a common dataset, therefore making it possible to directly compare their result to results in previous literature.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    While this might not count as a major weakness, the experiments & results sections could be strengthened in the following ways: 1) Evaluation metrics: the explanation of various evaluation metrics (AP, HR, RR) was very brief. For readers who were not familiar with image retrieval tasks, this could be difficult to understand; 2) The saliency map results presented in Fig 2 were not intuitive to understand. Was the Figure supposed to show superior performance of the proposed algorithm? If so, it should be spelt out.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Overall the authors have done a good job describing their method as well as comparing their method to other state-of-the-art methods. I would suggest the authors improve the evaluation and result sections so that the paper would be more friendly to readers who are unfamiliar with the image retrieval task (see my comments above).

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    1) The paper describes an important clinical task: retrieving similar xray images. This is particularly important for xray images because of how difficult and subtle they can be; 2) The authors have done a thorough comparison to other similar methods and showed superior results.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    Authors’ reply to my question was satisfactory.




Author Feedback

We thank the reviewers for recognizing our work’s significance in case-based CXR retrieval. Our approach achieves substantial gains (up to 26% in accuracy, 23% in ranking) over SOTA, especially for less prevalent findings. We appreciate the recognition of our thorough experiments and effective graph-transformer design. We acknowledge the concerns about missing detail, which was due to space constraints, and will incorporate all suggestions, including those related to presentation, in the final version.

@R4 (3.2) AnaXNet vs CheXtriev: Existing approaches rely on global features. We argue that anatomy-aware local features better capture subtle variations. AnaXNet (designed to classify) uses hand-crafted heuristics based on label co-occurrence. It lacks clinically relevant location information. Its naive 0/1 edges fail to capture higher-dimensional heterogeneity and neighbourhood aggregation limits expressiveness. Classifiers prioritize discriminative features, potentially losing details crucial for similarity.

CheXtriev, our proposed graph transformer (GT) framework, addresses these limitations with: (i) learnable location embeddings (spatial context), (ii) learnable continuous edge attributes (latent relationships between regions), (iii) GT layers with edge-aware attention (interactions between any pair of nodes), and (iv) shared-source gated residual connections (selectively combining features at different granularities). GTs have shown superior and more generalizable representations (favorable for retrieval) compared to graph convolutional networks (GCNs), supported by a statistically significant boost (p<0.05) across metrics.
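
As a rough illustration of points (ii) and (iii), the sketch below shows one common way to make attention edge-aware: a per-edge attribute is projected to a scalar and added to the scaled dot-product score before the softmax. This is a minimal single-head sketch under stated assumptions, not the CheXtriev implementation; the function name `edge_aware_attention`, the additive-bias formulation, and all toy values are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(vec, mat):
    """Row vector (d_in) times matrix (d_in x d_out)."""
    return [dot(vec, col) for col in zip(*mat)]

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def edge_aware_attention(nodes, edges, w_q, w_k, w_v, w_e):
    """Single-head attention over all node pairs with an additive edge bias.

    nodes: n region feature vectors; edges[i][j]: attribute vector of edge i -> j;
    w_q, w_k, w_v: projection matrices; w_e: maps an edge attribute to a scalar.
    All of these are assumptions made for illustration.
    """
    n, d = len(nodes), len(w_q[0])
    q = [matvec(h, w_q) for h in nodes]
    k = [matvec(h, w_k) for h in nodes]
    v = [matvec(h, w_v) for h in nodes]
    attn = []
    for i in range(n):
        scores = [dot(q[i], k[j]) / math.sqrt(d) + dot(edges[i][j], w_e)
                  for j in range(n)]
        attn.append(softmax(scores))
    out = [[sum(attn[i][j] * v[j][c] for j in range(n)) for c in range(d)]
           for i in range(n)]
    return out, attn

# Toy example: 3 regions, 2-d features, 1-d edge attributes (all values made up).
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
edges = [[[0.0] for _ in range(3)] for _ in range(3)]
edges[0][2] = [10.0]  # a strong (hypothetical) learned affinity from region 0 to region 2
out, attn = edge_aware_attention(nodes, edges, identity, identity, identity, w_e=[1.0])
```

In this toy setting the large edge attribute makes region 0 attend almost exclusively to region 2, which is the kind of pairwise interaction that fixed 0/1 adjacency cannot express.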

@R4 (3.3) AnaXNet vs V0: V0 averages region features for similarity. AnaXNet employs a GCN before averaging. We use AnaXNet’s official code with region-level classification; its optimization for classification influences its design. We noticed that when V1 was constructed with a GCN instead of a GT (i.e., AnaXNet but with graph-level classification), it achieved lower performance than V0 (50.2 mAP, 37.6 mHR, 52.2 mRR). This indicates that classification-optimized features may not be the most effective for retrieval.
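
For readers less familiar with ranking metrics, the sketch below computes average precision (AP) and reciprocal rank (RR) for a single query from a binary relevance list, then averages over queries to give mAP and mRR. These are the textbook definitions and are assumptions here: the paper's exact cut-offs, relevance criterion, and its definition of HR are not given on this page, so HR is deliberately left out. The helper names are hypothetical.

```python
def average_precision(relevance):
    """AP over a ranked list: mean of precision@rank at each relevant rank."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def reciprocal_rank(relevance):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def mean_metric(metric, runs):
    """Average a per-query metric over several ranked lists."""
    return sum(metric(r) for r in runs) / len(runs)

# Two made-up queries; 1 marks a retrieved image judged relevant to the query.
runs = [[1, 0, 1, 0], [0, 0, 1]]
m_ap = mean_metric(average_precision, runs)
m_rr = mean_metric(reciprocal_rank, runs)
```

For the first query, relevant items sit at ranks 1 and 3, so AP is the mean of 1/1 and 2/3; for the second, the only relevant item is at rank 3, so both AP and RR equal 1/3.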

@R1 (3.1) Motivation of anatomy-awareness: We contend that anatomy priors are not naive. Our method mimics radiologists’ systematic approach to reading CXRs [13,17,25], reducing missed findings in blind spots. The benefits are evident for less prevalent findings: FO/HF (+94%), PTX (+254%), CONS (+35%), PN (+20%). Obtaining automated anatomy annotations with high accuracy (>0.98 IoU for most regions [14]) is straightforward.

@R1 (3.3) Clinical value: MIR aids diagnosis and treatment planning [3], with retrieval augmentation [12] benefitting CAD. Even without query labels (our basic assumption), it helps clinicians retrieve visually similar CXRs to explore differentials. Findings are not unique; their underlying causes differ. For instance, two CXRs with the same labels (PN, CONS) might show lower CONS with a sharp border (bacterial) and upper CONS with a blurred border (fungal), requiring different treatments. Label similarity might miss these subtle visual differences (a limitation shared with hashing). Reports (though not common) are more accessible than labels in clinics, but medical text similarity remains a challenge in NLP.

@R1 (3.4) Focus on features: Accurate relevance measurement (cosine similarity) hinges on the quality of the representations. Current multi-label retrieval methods often lose crucial disease and ROI detail [10, 6]. Clinicians rely on anatomy to identify pathologies, making it essential that retrieval representations are discriminative and informative. We focus on learning sufficiently detailed fine-grained features for effective similarity measurement, following clinical reasoning.
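
The dependence of retrieval on representation quality can be seen in a minimal cosine-similarity ranking sketch. Here mean pooling over region features stands in for whatever encoder produces the image embedding (it loosely mirrors the V0 baseline described above, but is an assumption, not the authors' code), and `retrieve`, the database keys, and all vectors are hypothetical.

```python
import math

def mean_pool(region_features):
    """Average per-region feature vectors into one image-level embedding."""
    n = len(region_features)
    return [sum(col) / n for col in zip(*region_features)]

def cosine_similarity(a, b):
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as dissimilar to everything
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)

def retrieve(query, database, top_k=3):
    """Return the top_k (key, score) pairs ranked by cosine similarity."""
    scored = [(key, cosine_similarity(query, emb)) for key, emb in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Made-up 2-d embeddings for three stored images and one query.
database = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
query = [1.0, 0.1]
top = retrieve(query, database, top_k=2)
pooled = mean_pool([[1.0, 0.0], [0.0, 1.0]])
```

Since the ranking is fully determined by the embeddings, any detail lost by the encoder cannot be recovered at query time.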

@R3 (3.2) Saliency maps: We adapted [20] with anatomy-occlusions to generate maps for query and retrieved pairs and show the interpretability of our results, often ignored in retrieval literature. We’ve shown these maps may not be meaningful for methods that don’t use anatomy-aware features (in interpretability analysis subsection).
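
One generic way to obtain region-level saliency for a query and retrieved pair is occlusion: mask one anatomical region at a time and record how much the similarity to the retrieved image drops. The sketch below illustrates that idea only; it is not necessarily the procedure of [20], and the zero-vector occlusion, the mean-pooling `embed`, the region names, and the feature values are all assumptions.

```python
import math

def embed(regions):
    """Hypothetical encoder: mean-pool the per-region feature vectors."""
    vectors = list(regions.values())
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(a, b):
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)

def occlusion_saliency(query_regions, retrieved_embedding):
    """Score each region by the similarity drop when it is zeroed out."""
    full = cosine(embed(query_regions), retrieved_embedding)
    saliency = {}
    for name, features in query_regions.items():
        occluded = dict(query_regions)
        occluded[name] = [0.0] * len(features)  # occlude this region only
        saliency[name] = full - cosine(embed(occluded), retrieved_embedding)
    return saliency

# Made-up example: the retrieved image matches the query's left-lung features.
query_regions = {"left_lung": [1.0, 0.0], "right_lung": [0.0, 1.0]}
saliency = occlusion_saliency(query_regions, retrieved_embedding=[1.0, 0.0])
```

A positive score means the region supported the match, and a negative score means removing it made the pair look more similar.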




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Satisfactory review.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    Satisfactory review.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    All reviewers found the paper an interesting use of anatomy information in retrieval. It appears to improve upon AnaXNet in many ways, and the rebuttal addresses the reviewers’ concerns.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    All reviewers found the paper an interesting use of anatomy information in retrieval. It appears to improve upon AnaXNet in many ways, and the rebuttal addresses the reviewers’ concerns.


