Abstract

Medical anomaly detection is a critical research area aimed at recognizing abnormal images to aid in diagnosis. Most existing methods adopt synthetic anomalies and image restoration on normal samples to detect anomaly. The unlabeled data consisting of both normal and abnormal data is not well explored. We introduce a novel Spatial-aware Attention Generative Adversarial Network (SAGAN) for one-class semi-supervised generation of health images. Our core insight is the utilization of position encoding and attention to accurately focus on restoring abnormal regions and preserving normal regions. To fully utilize the unlabelled data, SAGAN relaxes the cyclic consistency requirement of the existing unpaired image-to-image conversion methods, and generates high-quality health images corresponding to unlabeled data, guided by the reconstruction of normal images and restoration of pseudo-anomaly images. Subsequently, the discrepancy between the generated healthy image and the original image is utilized as an anomaly score. Extensive experiments on three medical datasets demonstrate that the proposed SAGAN outperforms the state-of-the-art methods. Code is available at https://github.com/zzr728/SAGAN

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1816_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

https://github.com/zzr728/SAGAN

Link to the Dataset(s)

https://www.kaggle.com/c/rsna-pneumonia-detection-challenge

https://www.kaggle.com/c/vinbigdata-chest-xray-abnormalities-detection

BibTex

@InProceedings{Zha_Spatialaware_MICCAI2024,
        author = { Zhang, Zerui and Sun, Zhichao and Liu, Zelong and Zhao, Zhou and Yu, Rui and Du, Bo and Xu, Yongchao},
        title = { { Spatial-aware Attention Generative Adversarial Network for Semi-supervised Anomaly Detection in Medical Image } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15005},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper introduces a spatially aware attention generative adversarial network model that employs a semi-supervised learning strategy for generating healthy images to carry out the task of anomaly detection in medical images. The network effectively utilizes the anomalous spatial features within unlabeled images, ensuring the accurate reconstruction of normal data and the precise restoration of pseudo-anomalous regions. This approach addresses the challenge of accurate restoration when there is a significant disparity between the anomalous and normal areas. The model has achieved the highest results in terms of Average Precision (AP) and Area Under the Curve (AUC) on three datasets: VinDr-CXR, RSNA, and LAG.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. By recognizing the anatomical consistency inherent in most medical images, the model encodes positional information into input image patches. Utilizing structural similarity, it enhances recovery quality.
    2. The model incorporates an attention gating mechanism inspired by prior work, which dynamically highlights regions of interest and generates corresponding masks. These masks enable the model’s decoder to preserve normal areas while effectively discarding and reconstructing the normal structures of the targeted anomalous regions.
    3. A spatially aware generator has been designed, featuring an anatomical consistency module and an anomalous region restoration module. This design emphasizes anomalous areas and restores their normal structures based on the same location’s spatial information.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The presentation of the resultant images lacks detailed interpretation of the detected anomalies. It would be beneficial to provide a more thorough explanation of the anomalous regions within the images, thereby allowing readers to gain a clearer understanding of the model’s reconstruction effectiveness.

    2. The paper does not include a comparison with recent anomaly detection methods based on diffusion models. To establish the relevance and advancement of the proposed model, comparative analysis with diffusion model-based methods is recommended.

    3. The code associated with the article has not been made available as open-source. Facilitating access to the source code would significantly enhance replicability and comprehension for readers. The authors are encouraged to consider open sourcing the code to foster transparency and facilitate further research in the field.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The source code associated with the article has not been made available as open-source. Providing access to the source code would greatly enhance both replicability and understanding for readers. The authors are encouraged to consider open-sourcing the code to promote transparency and support further research in this field.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The exposition of the resultant images could be improved with a more detailed interpretation of the anomalies detected. An enhanced elucidation of the anomalous regions within the images would enable readers to more accurately ascertain the efficacy of the model’s reconstruction capabilities.

    2. The paper lacks a comparative evaluation against contemporary anomaly detection methods based on diffusion models. To underscore the significance and innovation of the proposed model, it is advisable to conduct a comparative analysis against established diffusion model-based techniques.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The issue explored in this paper is of considerable significance, and the methodology introduced exhibits an advanced approach. Nonetheless, the paper does not provide a comparative analysis with the latest diffusion models. Such an analysis is crucial to validate the proposed method’s advanced nature and its effectiveness.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    1) The proposed work improves upon existing methods by not requiring paired images for training and by generating high-quality health images from unlabeled data, which includes both normal and abnormal images, thus making better use of available data. 2) The proposed method demonstrates superior performance over state-of-the-art methods in detecting anomalies across three different medical datasets, showcasing its effectiveness and versatility in medical anomaly detection. 3) The paper is presented in a way that highlights its innovative approach to utilizing unlabeled data and spatial features for anomaly detection, making a significant contribution to the field of medical image analysis.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper makes a significant contribution to medical image analysis by presenting SAGAN, an innovative method that generates high-quality health images without needing paired training data, effectively utilizing both normal and abnormal unlabeled images for anomaly detection. This approach not only enhances data utilization but also simplifies the complexity of the process, emphasizing the novel use of spatial features and unlabeled data in advancing the field.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The related work section suffers from two main shortcomings. First, it omits relevant and important papers, such as the one by Tang et al. (2021) on disentangled generative models for disease decomposition in chest X-rays. Second, the authors fail to clearly outline the limitations of previous studies and how their own work addresses these limitations. Specifically, after reading page 2, it remains unclear how the proposed approach tackles the shortcomings of existing methods. Additionally, the data splitting procedure needs clarification. Splitting the data only once raises concerns about the generalizability of the results obtained from a single split. Finally, the results comparison section would benefit from a more in-depth analysis. A more detailed explanation of why SAGAN outperforms other methods would be valuable for understanding the strengths of the proposed approach.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Please provide the code if possible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1) Page 2: Before using the acronyms, define ‘DDAD’ and ‘AMAE’. 2) Equation 3: Define ‘x1’

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors must revise the “Introduction” section, as the current version is confusing. Additionally, they need to provide more details on the data split method used and evidence of reproducibility.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors have proposed a semi-supervised anomaly detection method using SAGAN. Compared to the original SAGAN network, the authors have made several improvements in the following aspects:

    1. The fundamental idea of this manuscript is to conduct data augmentation. Due to the lack of real medical data, the authors proposed to supervise the reconstruction of unlabeled images with two steps. One is to restore the real normal images, and the other is to recover the pseudo abnormal images.
    2. In the encoding part of the generative network, the ACM module is introduced to provide the spatial-aware encoder for every input medical image. It is consistent with the previous method in terms of anatomy and can help to generate more accurate medical images.
    3. In the decoding part of the generative network, the attention mechanism is added to dynamically produce masks for the anomaly regions. It can also accurately protect regular regions without affecting the detection of the abnormal parts.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Firstly, SAGAN helps to generate highly resolved medical images with better fidelity while the ACM mechanism ensures the anatomical consistency of the generated images with the original ones. The combination of the two techniques further improves the quality of the generated images. Secondly, attention model can let the network focus more on the anomaly parts of the images, which increases the accuracy of image restoration while keeping the consistency of the abnormal regions with their neighbor parts.

    1. The proposed model can be improved in terms of learning efficiencies and system performance with the help of supervised tokens/signals. The supervised information comes from two aspects. The first is the normal image restoration and the other is the restoration of pseudo abnormal images.
    2. A comprehensive numerical study has been conducted based on three benchmark medical image datasets, and results can validate the advantages of the proposed method against the SOTA models.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Limitations of this SAGAN model are not discussed in a scientific manner. For example, the training still relies heavily on a considerable amount of annotated normal data, which somewhat contradicts the hypothesis of the article.
    2. More detailed descriptions on the process of generating pseudo abnormal images are needed since this can help to validate the effects of the proposed SAGAN network and to provide some hints for possible improvements in future research.
    3. While the proposed method enhances the baseline model, Brainomaly, by introducing the Anatomical Consistency Module (ACM) and Abnormal Region Restoration Module (ARRM), there’s a notable absence of a clear delineation of the specific improvements made upon the baseline.
    4. The use of binary positional condition encoding in ACM, added to the channel dimension, might not be the most efficient approach. Moreover, the lack of comparisons with other positional encoding techniques leaves room for improvement in the methodology’s robustness and effectiveness.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors may say a little more on the process of training the hyper parameters in building the network.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. It is better to provide some detailed descriptions of the substitution method to replace the pseudo anomaly generation techniques.
    2. It would be more convincing to say a little more about reducing the need for the normal annotations that the training usually relies on. In other words, the generalization performance of the proposed model can be further analyzed to unveil the pros and cons of the technique.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The performance of this SAGAN network is better than the SOTA techniques in terms of the current indices AP and AUC on all the three benchmark datasets. However, the application of this method in real medical cases still needs further exploration.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We sincerely appreciate the reviewers for providing highly insightful comments. We greatly appreciate the feedback, as it will undoubtedly help us improve the quality of our paper. We have considered suggestions and addressed concerns to the best of our ability. According to the “Rebuttal Guide,” new experimental results in the rebuttal are not allowed. We will show additional results for suggested experiments when releasing source codes on GitHub.

Q1: Open source code (R3, R4, R5): We will provide open access to the source code in the camera-ready version of our paper.

Q2: Setting of training data (R3, R5): There is a key misunderstanding that the training setup is less likely to require normal data. Instead, in addition to normal data, our core idea is to fully utilize unlabeled data through normal data supervision. Therefore, our proposed SAGAN supervises the restoration of unlabeled data by ensuring accurate reconstruction of normal data and precise restoration of pseudo-anomaly data. We acknowledge the practical suggestion of reducing normal annotations and will analyze SAGAN’s generalization performance with less normal data in future work. Additionally, we only control the anomaly ratio of the unlabeled data and randomly select the appropriate number of anomalous samples. We have conducted several experiments and report the average value as the final result to ensure reproducibility. We will clarify this.
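As an illustration of the data setup described in Q2, the sketch below builds an unlabeled set with a controlled anomaly ratio by random selection and averages a metric over several seeded runs. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names `build_unlabeled_set` and `average_over_runs`, the use of integer sample IDs, and the rounding rule for the number of anomalous samples are all hypothetical.

```python
import random
from statistics import mean


def build_unlabeled_set(normal_ids, abnormal_ids, size, anomaly_ratio, seed):
    """Randomly draw an unlabeled set in which a fixed fraction is anomalous.

    Assumption: the number of anomalous samples is round(size * anomaly_ratio).
    """
    rng = random.Random(seed)  # seeded so that each run is repeatable
    num_abnormal = round(size * anomaly_ratio)
    num_normal = size - num_abnormal
    if num_abnormal > len(abnormal_ids) or num_normal > len(normal_ids):
        raise ValueError("not enough samples for the requested size and ratio")
    chosen = rng.sample(abnormal_ids, num_abnormal) + rng.sample(normal_ids, num_normal)
    rng.shuffle(chosen)  # labels are not used downstream, so mix the two groups
    return chosen


def average_over_runs(run_fn, seeds):
    """Call run_fn(seed) once per seed and return the mean of the results."""
    return mean(run_fn(seed) for seed in seeds)


# Hypothetical IDs: 0-99 are normal images, 100-149 are abnormal images.
normal_ids = list(range(100))
abnormal_ids = list(range(100, 150))
unlabeled = build_unlabeled_set(normal_ids, abnormal_ids, size=20, anomaly_ratio=0.25, seed=0)
print(sum(1 for i in unlabeled if i >= 100))  # number of anomalous samples drawn
```

Fixing the seed per run keeps each split repeatable, while averaging across seeds reduces the dependence of the reported number on any single random draw.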

Q3: Explanation of specific improvements made upon the baseline (R3, R5): SAGAN addresses limitations in previous works in two key ways. First, previous methods overlook the recurrent anatomical structures present in most radiography images. SAGAN introduces the Anatomical Consistency Module (ACM) to leverage these similarities in structures at the same positions, thereby enhancing the restoration quality. Second, previous methods fail to accurately restore anomalous regions. SAGAN proposes Abnormal Region Restoration Module (ARRM) which incorporates an attention gate mechanism. The attention gate enables the SAGAN decoder to retain normal regions while effectively discarding abnormal regions of interest and generating the corresponding normal structures. Ablation experiments on VinDr-CXR dataset show the quantitative enhancement of ACM and ARRM over the baseline.

Q4: Additional positional encoding techniques (R3): We have already conducted an ablation study on different positional encoding techniques. Due to limited space, we did not include them in the paper. These results will be released on our GitHub repository.
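For readers unfamiliar with the encoding that Review #3 refers to (a binary positional condition encoding added to the channel dimension), the following sketch shows one plausible reading: the index of a patch is written in binary, and each bit becomes a constant channel appended to the patch. The exact scheme used in SAGAN is not specified on this page, so the bit ordering, the channel layout `[channel][row][col]`, and the helper names are assumptions made for illustration only.

```python
def binary_position_code(index, num_positions):
    """Return the patch index as a fixed-width list of bits, most significant first."""
    if not 0 <= index < num_positions:
        raise ValueError("index out of range")
    width = max(1, (num_positions - 1).bit_length())
    return [(index >> shift) & 1 for shift in range(width - 1, -1, -1)]


def append_position_channels(patch, index, num_positions):
    """Append one constant channel per bit to a patch stored as [channel][row][col]."""
    height = len(patch[0])
    width = len(patch[0][0])
    bit_planes = [
        [[float(bit)] * width for _ in range(height)]
        for bit in binary_position_code(index, num_positions)
    ]
    return patch + bit_planes  # new list; the input patch is left unchanged


# Hypothetical example: a 1-channel 2x2 patch at position 5 of a 4x4 grid (16 positions).
patch = [[[0.1, 0.2], [0.3, 0.4]]]
encoded = append_position_channels(patch, index=5, num_positions=16)
print(binary_position_code(5, 16))  # bits of the patch index
print(len(encoded))                 # image channel plus one channel per bit
```

Under this reading the channel overhead grows only logarithmically with the number of patch positions, which is one reason a binary code may be chosen over a one-hot code; alternatives such as sinusoidal or learned encodings would replace `binary_position_code` only.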

Q5: More thorough explanation of the anomalous regions (R4): We detect anomalies by comparing the difference between the restored image and the original one. Our results for detecting anomalous regions are demonstrated in two figures. In Fig. 2, we highlight the anomaly regions of interest using anomaly maps generated by the attention gate. In Fig. 3, we systematically present heatmaps derived from different data reconstructions. The heatmaps are based on the difference maps between the restored images and the original ones.
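The scoring step described in Q5 can be sketched as follows: a pixel-wise difference map between the original and the restored image serves as the heatmap, and an image-level anomaly score is obtained by aggregating it. This is a minimal sketch rather than the released implementation; the absolute difference and the mean aggregation are assumptions, and plain nested lists stand in for image arrays so that the example has no dependencies.

```python
def difference_map(original, restored):
    """Pixel-wise absolute difference between two same-sized grayscale images."""
    if len(original) != len(restored) or any(
        len(a) != len(b) for a, b in zip(original, restored)
    ):
        raise ValueError("images must have the same shape")
    return [
        [abs(o - r) for o, r in zip(row_o, row_r)]
        for row_o, row_r in zip(original, restored)
    ]


def anomaly_score(original, restored):
    """Image-level score: mean of the difference map (an assumed aggregation)."""
    diff = difference_map(original, restored)
    total = sum(sum(row) for row in diff)
    count = sum(len(row) for row in diff)
    return total / count


# Hypothetical 2x2 images: the restoration changes only the bottom-left pixel.
original = [[0.0, 0.5], [1.0, 0.5]]
restored = [[0.0, 0.5], [0.5, 0.5]]
print(difference_map(original, restored))  # [[0.0, 0.0], [0.5, 0.0]]
print(anomaly_score(original, restored))   # 0.125
```

An image whose restoration leaves it almost unchanged receives a score near zero, whereas an image with a large restored region receives a higher score.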

Q6: Other minor weaknesses (R5): Thanks for suggesting the valuable article by Tang et al. (2021). We will include it in the camera-ready version. The definitions of “DDAD” and “AMAE” will also be added before the acronyms are used. In Eq. (3), we have defined that x1 belongs to xn, which denotes normal data.




Meta-Review

Meta-review not available, early accepted paper.


