Abstract

Patients with Intracranial Hemorrhage (ICH) face a potentially life-threatening condition, and patient-centered individualized treatment remains challenging due to possible clinical complications. Deep-Learning-based methods can efficiently analyze the routinely acquired head CTs to support clinical decision-making. The majority of early work focuses on the detection and segmentation of ICH, but does not model the complex relations between ICH and adjacent brain structures. In this work, we design a tailored object detection method for ICH, which we unite with segmentation-grounded Scene Graph Generation (SGG) methods to learn a holistic representation of the clinical cerebral scene. To the best of our knowledge, this is the first application of SGG for 3D voxel images. We evaluate our method on two head-CT datasets and demonstrate that our model can recall up to 74% of clinically relevant relations. This work lays the foundation towards SGG for 3D voxel data. The generated Scene Graphs can already provide insights for the clinician, but are also valuable for all downstream tasks as a compact and interpretable representation.
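
To make the reported relation recall concrete, the following is a minimal sketch of how a recall-at-K score over relation triplets can be computed. It is not taken from the paper or its repository: the function name `relation_recall_at_k`, the triplet layout (subject, predicate, object), and all labels in the example are hypothetical and chosen for illustration only.

```python
from typing import Iterable, List, Tuple

# A relation is a (subject, predicate, object) triplet.
# The labels used below are hypothetical, not the paper's annotation scheme.
Triplet = Tuple[str, str, str]


def relation_recall_at_k(
    ranked_predictions: List[Triplet],
    ground_truth: Iterable[Triplet],
    k: int,
) -> float:
    """Fraction of ground-truth triplets found among the top-k predictions."""
    truth = set(ground_truth)
    if not truth:
        return 0.0
    top_k = set(ranked_predictions[:k])
    return len(truth & top_k) / len(truth)


# Toy example: two annotated relations, three predictions sorted by confidence.
ground_truth = [
    ("bleeding_1", "flows_into", "ventricle_system"),
    ("bleeding_1", "causes_shift", "midline"),
]
predictions = [
    ("bleeding_1", "flows_into", "ventricle_system"),
    ("bleeding_2", "flows_into", "ventricle_system"),
    ("bleeding_1", "causes_shift", "midline"),
]

recall_at_2 = relation_recall_at_k(predictions, ground_truth, k=2)
recall_at_3 = relation_recall_at_k(predictions, ground_truth, k=3)
```

In this toy case only one of the two annotated relations appears in the top two predictions, while both appear in the top three, so the score rises with K.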

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0751_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/0751_supp.pdf

Link to the Code Repository

https://github.com/MECLabTUDA/VoxelSceneGraph

Link to the Dataset(s)

https://instance.grand-challenge.org/

BibTex

@InProceedings{San_Voxel_MICCAI2024,
        author = { Sanner, Antoine P. and Grauhan, Nils F. and Brockmann, Marc A. and Othman, Ahmed E. and Mukhopadhyay, Anirban},
        title = { { Voxel Scene Graph for Intracranial Hemorrhage } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15002},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper introduces a 3D voxel Scene Graph Generation (SGG) framework for Intracranial Hemorrhage, which serves as a foundation for future research. The framework consists of two stages. In the first stage, the authors employ Retina-UNet to extract features from Regions of Interest (ROI) as semantic segmentation outputs. In the second stage, two different SGG variants are used to perform relation prediction. The paper also introduces a better segmentation method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Strengths:

    1. Introduction of Scene Graph Generation (SGG) into 3D CT imaging.
    2. Transition from Neural Motifs and interactive message passing to V-MOTIF and V-IMP, representing advancements in methodology.
    3. Evaluation conducted on both a public dataset and a private dataset, enhancing the comprehensiveness of the study.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Weaknesses:

    1. Lack of research novelty, as the components of the framework are existing methods, without proposing new algorithms.
    2. Insufficient details in the method section, particularly regarding the implementation of Retina-UNet and the modifications made to the variants used.
    3. Lack of justification for evaluating bleeding detection from the ventricle system and midline, and ambiguity regarding the specific nnDetection method employed. Additionally, there is a need for more transparency regarding parameter fine-tuning and preprocessing efforts to ensure fair comparisons. Furthermore, the paper lacks an explanation for the superior performance of V-MOTIF and V-IMP in scene graph generation and for the object classification accuracy of the base method.
    4. Absence of ablation studies, such as using nnDetection as a method to demonstrate whether improved segmentation results lead to better scene graph predictions.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The author stated their intention to release the code upon acceptance of the paper.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. Enhance the method section by providing more detailed descriptions. Present the methodology with mathematical formulations to precisely define the problem. Specify the components borrowed from existing works and highlight the innovations introduced by your framework. Ensure that the evaluation clearly demonstrates the impact of your innovations. Ensure consistency in the evaluation section. If the author claims better object detection performance, the evaluation should include results related to object detection, not just bleeding detection.
    2. Provide additional details about the methods used for evaluation. Specify which version of nnDetection was used and clarify whether it was re-implemented or its results were taken directly from the original paper.
    3. Explain the results in the evaluation section, particularly regarding the improvement in SGG results attributable to better segmentation. Include nnDetection segmentation results and demonstrate how they contribute to the performance of the V-MOTIF and V-IMP frameworks.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. Lack of research novelty: The paper lacks novelty as it predominantly relies on existing methods without introducing new approaches or algorithms.
    2. Discrepancy between evaluation and conclusion: The evaluation section does not adequately reflect the conclusions stated in the introduction, particularly regarding the first contribution.
    3. Insufficient implementation details in the method section: The method part lacks detailed descriptions, particularly regarding the implementation of specific methods.
    4. Inadequate description in the evaluation section: The evaluation part lacks clarity and detail, making it difficult to understand and interpret the results effectively.
  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper presents a method to detect relationships between Intracranial Hemorrhage (ICH) and adjacent brain structures. It uses a detection method (based on reference 9) together with a segmentation-grounded Scene Graph Generation (SGG) method to learn a representation of the “clinical cerebral scene”, which can then be used for downstream tasks such as patient outcome prediction.

    Detection is done using a method based on Retina Net (ref 9). Next relations between parts of objects are found using two SGG methods - Neural Motifs [ref 20], and Iterative Message Passing [ref 18].

    The work is evaluated on two head-CT datasets and demonstrates that the model can recall up to 74% of clinically relevant relations.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper combines and adapts a few published methods into a system that is able to detect, segment and build relationships between parts that are bleeding and the main anatomical structures that are involved in the evolution of the ICH disease - the ventricle system and the midline.

    • It is the first work to apply SGG to voxel data in a practically useful application.

    • The authors propose an improvement over Retina-UNet to deal with overlapping structures. The bleeding detection method outperforms the state-of-the-art nnDetection [ref 3].

    • The paper is clearly written and the steps are described well despite the concise form that the MICCAI short format imposes. I like in particular Fig 2 that brings together all the steps and methods involved.

    • The work is well validated on two datasets: the INSTANCE2022 challenge dataset [ref 12, 120 CT scans of patients with ICH] and a private cohort (18 non-contrast head CTs of patients diagnosed with ICH). An internal tool that will be made open source was developed for the graph annotation.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The description of the modified Retina-UNet detection method lacks details. On pg 4, the detection method is said to “leverage Retina-UNet’s segmentation capabilities by detecting both anatomies from the predicted semantic segmentation”, but it is difficult to understand how this is done. I realize that a MICCAI paper might not provide enough space for an accurate description of the method. Both the detection and graph generation are evaluated using established methods.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The SGG methods are well described but details are missing from the modified detection-segmentation method. Authors will make the code available upon acceptance so the method will be reproducible using the code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    I would encourage the author to clarify the detection method as space allows.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    • The paper is well written and clear. It proposes a relatively complex system that uses several methods, but the experiments prove that it works well.
    • It proposes an interesting application of SGG to voxel data, to learn a structured representation of the brain entities involved in ICH.
    • It is important for the community to go beyond segmentation works.
  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors argue that semantic segmentation allows detecting an object, for instance, intracranial hemorrhage, but not its relation to other structures, for example, midline shift, which may indicate a severe case of hemorrhage. In this paper, the authors propose the use of scene graphs to model the relations between objects in a scene for a more robust detection of the object of interest in the image. They compare the performance of their method with nnDetection on the problem of intracranial hemorrhage detection. While nnDetection is a generic auto-configured detector, the authors show that, due to the explicit relation definition, their method is able to surpass nnDetection. The authors evaluated their method on a public dataset (INSTANCE2022) and a private dataset with harder cases. The paper is well written.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The idea of modeling the relation between objects for a more robust detection of critical cases is intuitive and is based on a series of works on scene parsing that is already established.

    • The comparison with a strong method, nnDetection, as a baseline makes a good case for the quality of the proposed method. Also, given the complexity of the rules in nnDetection, it is an advantage that in the proposed method the “rules” are learnt from the data, based on the indicated relations with the surrounding structures.

    • The increase in performance is significant, which seems to show that the approach goes in the right direction.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The definition of the key structures and their relations requires solid domain knowledge to be effective, which may indicate that the method is not so easily adapted to other problems.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?
    1. While the authors promise a code repository, the paper lacks sufficient detail on the specific structure implementation. This makes it difficult to fully grasp how information flows through the system. For example, Figure 1, while illustrating execution, could be enhanced with annotations or a supplementary diagram to clarify how information is utilized within each component of the system.

    2. The results section is very extensive for the available space and compresses many tests. I would prefer moving some to the supplementary material and including more discussion. The results suggest that modeling the relationship between structures was effective.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. While the authors promise a code repository, the paper lacks sufficient detail on the specific structure implementation. This makes it difficult to fully grasp how information flows through the system. For example, Figure 1, while illustrating execution, could be enhanced with annotations or a supplementary diagram to clarify how information is utilized within each component of the system.

    2. The results section is very extensive for the available space and compresses many tests. I would prefer moving some to the supplementary material and including more discussion. The results suggest that modeling the relationship between structures was effective.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    As pointed out above, the document should have a more detailed description of the methods and more interaction with the tables of results; however, the authors present several studies showing the robustness of the method. Also, the baseline is a strong method, yet they present a large improvement over it. They also cite key references that will help an interested reader understand their system in more detail. In addition, they say they will make the code available, which will help to see the details. In general, it seems a well-grounded method.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We thank all the reviewers for the detailed and constructive feedback. Thank you for appreciating the technical challenges we addressed while introducing scene graphs to voxel data. There is no publicly available data for any application. There is no publicly available annotation tool. But beyond this, most open source libraries for object detection (e.g. torchvision, detectron2, mmdetection) are hard coded to only support 2D bounding boxes or point clouds for some 3D applications. Our proposed method is the first one for Voxel Scene Graph encompassing the entirety of the Computer Vision and Machine Learning literature. Our framework also has no existing equivalent. We will gladly incorporate the minor changes suggested by the reviewers into the final manuscript.
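
As an illustration of what moving from 2D to 3D bounding boxes involves at the lowest level, the following minimal sketch computes the intersection over union of two axis-aligned 3D boxes. It is not code from the VoxelSceneGraph repository: the `(z1, y1, x1, z2, y2, x2)` corner layout and the names `box_volume` and `iou_3d` are assumptions made for this example.

```python
from typing import Tuple

# Axis-aligned 3D box as (z1, y1, x1, z2, y2, x2) corner coordinates.
# This layout is an assumption for the example, not the repository's convention.
Box3D = Tuple[float, float, float, float, float, float]


def box_volume(box: Box3D) -> float:
    """Volume of a box; degenerate or inverted boxes count as empty."""
    z1, y1, x1, z2, y2, x2 = box
    return max(0.0, z2 - z1) * max(0.0, y2 - y1) * max(0.0, x2 - x1)


def iou_3d(a: Box3D, b: Box3D) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    # The intersection takes the larger lower corner and the smaller upper corner.
    intersection: Box3D = (
        max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
        min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]),
    )
    inter_vol = box_volume(intersection)
    union_vol = box_volume(a) + box_volume(b) - inter_vol
    return inter_vol / union_vol if union_vol > 0 else 0.0
```

The only change relative to the familiar 2D case is the third axis, but every box utility built on top of such a function (matching, non-maximum suppression, anchor generation) has to carry that extra dimension as well.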




Meta-Review

Meta-review not available, early accepted paper.


