Abstract

Neurologic emergencies need to treat unspecified anomalies with various shapes, intensities, and locations in 3D non-contrast brain CT. However, in practice, patients with anomalies take a relatively small portion of total CT volumes. In this situation, excluding unremarkable scans could reduce radiologists’ workload. We used a generative unsupervised anomaly detection (GUAD) with 3D Hierarchical Diffusion AutoEncoder (HDAE) model to develop this. In this study, we considered anomalies in two perspectives and made models. One is a Coarse-Morphological anomaly detection Model (CMM), and the other is a Fine-Grained anomaly detection Model (FGM). We ensembled these models’ decisions for the exclusion of the unremarkable scans. Models were trained with normal scans of 28,510 from institution A. For evaluation, we mainly used two consecutive test sets of 544 scans of institution A and 1,795 scans of another institution B. Among clinically significant and unremarkable scans, our study showed [NPV (Negative Predictive Value)/workload reduction] of [98.1%/9.7%] and [96.7%/19.9%] for institutions A and B, respectively. Additionally, we used a public dataset (NPV of 98.5%) and five other external hospitals’ hemorrhage sets (NPV of 96.0%) to evaluate robustness. Under the reasonable NPV, models showed the potential for workload reduction by omitting unremarkable scans. Compared to individual results of CMM or FGM, the ensembled decision usually shows NPV advantages. Also, with visual results, we observed our model could detect various types of anomalies.
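The two headline metrics in the abstract, NPV and workload reduction, can be illustrated with a small sketch. The abstract does not spell out either formula or the ensemble rule, so the definitions below are assumptions: NPV is taken in its standard form TN / (TN + FN), workload reduction is assumed to be the fraction of all scans excluded from reading, and the ensemble is assumed to exclude a scan only when both CMM and FGM score it below their thresholds. All names, thresholds, and counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Confusion counts where 'negative' means the scan was excluded as unremarkable."""
    tp: int  # abnormal scans kept for reading
    fp: int  # unremarkable scans kept for reading
    tn: int  # unremarkable scans excluded
    fn: int  # abnormal scans excluded (missed)


def npv(c: ScreeningCounts) -> float:
    """Standard negative predictive value: TN / (TN + FN)."""
    return c.tn / (c.tn + c.fn)


def workload_reduction(c: ScreeningCounts) -> float:
    """ASSUMED definition: share of all scans that radiologists no longer read."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tn + c.fn) / total


def ensemble_exclude(cmm_score: float, fgm_score: float,
                     cmm_thr: float, fgm_thr: float) -> bool:
    """ASSUMED rule: exclude only if both models score the scan as unremarkable."""
    return cmm_score < cmm_thr and fgm_score < fgm_thr


# Hypothetical counts, not taken from the paper.
counts = ScreeningCounts(tp=300, fp=600, tn=98, fn=2)
print(f"NPV = {npv(counts):.3f}, WLR = {workload_reduction(counts):.3f}")
```

Under this assumed rule, lowering either threshold excludes fewer scans, which would reduce workload reduction in exchange for fewer missed anomalies.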

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3060_paper.pdf

SharedIt Link: https://rdcu.be/eHwPH

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04947-6_21

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/Krying/WLR_ANO_3D

Link to the Dataset(s)

N/A

BibTex

@InProceedings{WonJon_Generative_MICCAI2025,
        author = { Won, Jongjun AND Kim, Jihwan AND Oh, Joonseo AND Yoo, Yereen AND Yum, Jieun AND Lee, Joonsang AND Park, Joon Hyung AND Jo, Wooyoung AND Nam, Yoojin AND Lee, Hyunki AND Hong, Gil-sun AND Kim, Namkug},
        title = { { Generative Unsupervised Anomaly Detection with Coarse-Fine Ensemble for Workload Reduction in 3D Non-contrast Brain CT of Emergency Room } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15962},
        month = {September},
        page = {214 -- 223}
}

Reviews

Review #1

Please describe the contribution of the paper

This study presents an innovative method for unsupervised anomaly detection, termed GUAD. The approach combines two complementary models: a coarse-grained morphological anomaly detection model and a fine-grained intensity anomaly detection model. By integrating decisions from both models, GUAD aims to reduce radiologists’ workload in emergency departments when analyzing 3D non-contrast brain CT scans. The method leverages a 3D HDAE to reconstruct input images, identifying anomalies through the differences between reconstructed normal images and the original scans.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

(1) Comprehensive Dual-Model Framework: GUAD integrates CMM and FGM, effectively addressing both morphological abnormalities and subtle intensity variations for thorough anomaly detection.
(2) Robust Performance Across Datasets: The model demonstrates reliability, achieving a consistently high negative predictive value (NPV) of over 96% across multiple datasets.
(3) Clinical Utility in Emergency Settings: By excluding normal scans, the method reduces radiologists’ workload in high-pressure emergency environments, showcasing its practical significance.
(4) Rich Data and Visualization: The study provides detailed metrics alongside extensive visualizations, presenting clear evidence of the model’s effectiveness and capabilities.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

(1) Absence of Comparative Evaluation: The research does not include comparisons with existing anomaly detection methods, which limits the ability to substantiate the model’s advantages over state-of-the-art approaches.
(2) Limited Innovation in Methodology: While effective, the proposed method shows minimal novelty in its underlying design and techniques compared to established models.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(2) Reject — should be rejected, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper’s structure is incomplete, as it does not include comprehensive comparative experiments with recent methods or detailed ablation studies of the proposed modules.
(1) Incorporate Comparative Studies: To enhance the study’s contributions, experiments comparing GUAD to existing methods, including diffusion-based and other generative models, should be conducted.
(2) Examine Model Design and Parameters: A deeper exploration of the model’s architecture and parameter settings would provide greater transparency and contextualize its performance within the field.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Reject
[Post rebuttal] Please justify your final decision from above.

The authors’ reply did not address my concerns adequately. The paper suffers from a lack of comparative evaluation and demonstrates limited innovation in its methodology. Consequently, I believe the paper is not suitable for publication due to its incomplete framework.

Firstly, to improve the study’s contributions, it is essential to incorporate comparative studies. Conducting experiments that compare GUAD with existing methods, such as diffusion-based and other generative models, would provide valuable insights.

Secondly, a more thorough examination of the model’s design and parameters is necessary. This in-depth exploration would offer greater transparency and better contextualize the model’s performance within the field.

Review #2

Please describe the contribution of the paper

The authors proposed an anomaly detection algorithm designed to identify various types of brain abnormalities visible on non-contrast CT scans taken in the emergency room. The model was trained in an unsupervised manner and is composed of two main components: a Coarse-Morphological anomaly detection model (CMM) and a Fine-Grained anomaly detection model (FGM). Both the training and testing datasets were collected independently by the authors, and the model’s performance was evaluated using metrics such as negative predictive value (NPV) and workload reduction.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Research on anomaly detection in brain CT is relatively limited, and most of the existing studies have primarily utilized GAN-based approaches. The authors of the submitted manuscript propose an unsupervised anomaly detection method using a diffusion-based autoencoder model. While their approach also involves generating normal brain images and comparing them with input images—similar to previous studies—the methodology differs in key aspects.

The model was trained in two different ways: one involved generating normal images by adding noise to normal inputs, following the general framework of diffusion models; the other involved generating anomalous images by combining normal inputs with both a condition mask and noise. The model was then trained to detect anomalies by comparing these two cases. Ultimately, it appears that the novelty of this study lies in its unique approach to image generation.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

While the image generation approach shows some novelty, it is somewhat disappointing that the study does not deviate meaningfully from the overall framework of previously reported GAN-based studies. It would have been more compelling if a direct performance comparison with existing GAN models had been included, but the lack of such evaluation stands out as a limitation.

https://www.nature.com/articles/s41467-022-31808-0

https://www.spiedigitallibrary.org/journals/journal-of-medical-imaging/volume-11/issue-4/044508/Generative-adversarial-networkbased-reconstruction-of-healthy-anatomy-for-anomaly-detection/10.1117/1.JMI.11.4.044508.full

https://www.spiedigitallibrary.org/conference-proceedings-of-spie/12465/1246504/Unsupervised-learning-of-healthy-anatomy-for-anomaly-detection-in-brain/10.1117/12.2653889.short
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The approach presents methodological novelty, and the manuscript appears to be well-structured in journal format. However, for a methodological proposal to be convincing, it should be supported by comparative evaluations with existing studies. The lack of such comparisons is a notable shortcoming.
Reviewer confidence

Somewhat confident (2)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

It is clearly a limitation that the manuscript does not demonstrate that the proposed model achieves state-of-the-art performance through a comparison with existing studies or models of similar purpose prior to submission. However, it is worth acknowledging that the model was evaluated on a relatively large number of test cases compared to previously reported studies, and that it achieved a high negative predictive value (NPV), which is particularly important in anomaly detection.

Review #3

Please describe the contribution of the paper

The paper proposes a generative unsupervised anomaly detection framework based on a hierarchical diffusion autoencoder to automatically screen out unremarkable 3D non-contrast brain CT scans in ERs. This method introduces two specialized models, a Coarse-Morphological anomaly detection Model (CMM) for identifying larger structural abnormalities, and a Fine-Grained anomaly detection Model (FGM) for detecting subtler intensity-based lesions. Both models are trained on thousands of normal scans from a single institution, then evaluated on various in-house and external multi-center datasets. The Negative predictive value (NPV) and workload reduction (WLR) are used as evaluation metrics. Results demonstrate that the ensemble of CMM and FGM can detect a wide range of neurological conditions while reducing the radiologists’ reading burden. The method shows robust performance across institutions, highlighting its clinical utility, scalability, and potential for real-world deployment.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1. The motivation of the paper is very clear and well justified, focusing on the urgent need to reduce radiologists’ workload by automatically filtering out normal brain CT scans in the emergency rooms.
2. The paper proposes two complementary models, including a coarse morphological model to detect large scale structural abnormalities and a fine grained model to capture subtle intensity variations. Their ensemble decision strategy is conceptually sound, ensuring that neither large nor small anomalies are overlooked. The overall design is novel.
3. The paper is well organized and flows smoothly. Figures are clear, well labeled, and make the paper easy to follow.
4. The authors validate their framework on a large internal dataset and on multiple external test sets, including data collected from different institutions and different scanner types, covering a variety of pathologies. This demonstrates their method’s strong generalizability and robustness under real world conditions.
5. The paper reports the NPV and the WLR, which directly ties performance to real clinical goals. They also provide a thorough threshold analysis, which allows users to tune the system for their own safety versus efficiency needs.
6. The paper is clinically meaningful. It has the potential to accelerate emergency radiology workflows by dismissing unremarkable scans and enabling radiologists to focus on serious cases first.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

1. While the reported NPVs are high, the system still missed critical cases as mentioned by the authors. Could you discuss how you plan to address this issue in the future, such as adding a simple safety net or performing a secondary backstop check?
2. The paper presents strong absolute performance but does not compare against existing methods. Adding such comparisons would help readers better estimate the true benefit of the proposed method.
3. The paper lacks details on inference speed, response times, hardware requirements, and memory usage. Providing these details would better clarify whether the method is applicable in real-time clinical use.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(6) Strong Accept — must be accepted due to excellence
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper solves a real-world problem of reducing radiologists’ workload by automatically filtering out normal brain CT scans in emergency rooms. The motivation is clear, the writing is well-organized, and the experiments are thorough. The method shows strong clinical usefulness and real-world potential. Overall, I recommend accepting this paper.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

My opinion remains unchanged. This would be a good paper if it included a more detailed comparison and description of the network architecture.

Author Feedback

We sincerely thank the reviewers for their insightful comments. The reviewers noted the strengths of our work: the clinical utility of workload reduction (WLR) and the proper assessment across multiple datasets. The primary concern raised by all reviewers was the lack of comparison with other models. We recognize this and will explicitly address it as a limitation in the final version. We abbreviate Reviewer#N’s Question part n to R#N-n.

#Absence of Comparative Evaluation

Although direct head-to-head comparisons of our model with those from R#2-7’s three references were challenging due to rebuttal rules and differences in their primary study purposes, evaluation metrics, post-processing, and test sets, we compared the reported sensitivity or accuracy of lesion detection of the models to assess whether they can catch all the anomalies. The first reference reported sensitivity up to 1.00 (41 cases) and 0.96 (197 cases) for the two test sets. The second reference’s reported lesion accuracy was 0.92 (15 cases) and 0.93 (11 cases). In the third, the sensitivity was 0.858. Our model demonstrated greater consistency across larger datasets, as sensitivities derived from the NPVs reported in our paper (Fig. 1. e and Table 1) were 0.99 (118 cases), 0.96 (270 cases), 0.97 (36 cases), and 0.94 (269 cases) for the internal and three external datasets, respectively. Moreover, the first reference used median filtering with a window size of 17 for post-processing, where small lesions might have vanished. In our case, small lesions could be visualized (Fig. 2. f and Fig. 3. a, c), whereas such visualizations were not found in the reference.
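The rebuttal says the sensitivities were derived from the reported NPVs without showing the arithmetic. One way such a conversion can be done, assuming the number of excluded scans and the number of abnormal cases are both known, is sketched below. The function name and the example numbers are hypothetical and are not the paper's values.

```python
def sensitivity_from_npv(npv: float, n_excluded: int, n_abnormal: int) -> float:
    """Recover sensitivity from NPV.

    NPV = TN / (TN + FN) and n_excluded = TN + FN, so FN = (1 - NPV) * n_excluded.
    Sensitivity = TP / (TP + FN) = 1 - FN / n_abnormal.
    """
    if n_excluded < 0 or n_abnormal <= 0:
        raise ValueError("n_excluded must be >= 0 and n_abnormal must be > 0")
    missed = (1.0 - npv) * n_excluded  # false negatives among excluded scans
    return 1.0 - missed / n_abnormal


# Hypothetical example: 100 scans excluded at NPV 0.98, 200 abnormal cases in total.
example = sensitivity_from_npv(npv=0.98, n_excluded=100, n_abnormal=200)
print(round(example, 3))
```

The conversion depends on how many scans were excluded, so the same NPV can correspond to different sensitivities on test sets with different exclusion rates.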

#Other points

R#2-7 questioned the advantage (meaningful deviation) of our method compared to GAN-based frameworks. Our strategy is end-to-end, while GAN frameworks usually require two-stage training (one for the GAN itself, the other for the encoder for image reconstruction). Also, our diffusion-based model showed convergence stability, as it was able to reconstruct a clear brain shape after only 1 epoch of training. We utilized a pre-existing anomaly scoring method, as it requires manual thresholding, which may improve user interaction by allowing customized use for varying patient populations or availability of resources (WLR targets, etc.) in clinical settings, as described in Section 3.3, line 6. For R#3-7-3, we offer practically acceptable inference time per scan: CMM ~88s, FGM ~120s (RTX3090, VRAM<6 GB). We’ll describe the details in the final version. For R#3-7-1, we are currently developing an improved model with secondary check methods.
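The per-scan inference times quoted above can be turned into a rough throughput estimate. The sketch below assumes either that CMM and FGM run one after the other on a single GPU, or that they run concurrently, with no other overhead in both cases; the rebuttal does not say which applies, so the two figures are only illustrative bounds.

```python
CMM_SECONDS = 88.0   # approximate per-scan time quoted in the rebuttal
FGM_SECONDS = 120.0  # approximate per-scan time quoted in the rebuttal


def scans_per_hour(seconds_per_scan: float) -> float:
    """Number of scans processed in one hour at a fixed per-scan latency."""
    return 3600.0 / seconds_per_scan


# ASSUMPTION: the two models run back to back on one GPU.
sequential = scans_per_hour(CMM_SECONDS + FGM_SECONDS)
# ASSUMPTION: the two models run concurrently, so the slower one dominates.
parallel = scans_per_hour(max(CMM_SECONDS, FGM_SECONDS))
print(f"sequential: {sequential:.1f} scans/h, parallel: {parallel:.1f} scans/h")
```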

#Clarification on Paper Focus

To the best of our knowledge, we have not found a similar study about “WLR on brain CT”. Our study could become one of the initial steps in this research direction. Given that this research is on a relatively unexplored subject, we primarily aimed to demonstrate the impact and clinical relevance of the deep learning application.

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

Two of the three reviewers recommend acceptance, highlighting the clinical relevance and practical value of the proposed method. While one reviewer raises concerns about methodological novelty and missing comparisons, the submission is positioned as an application paper and fits well within the translational focus of the MICCAI application track. That said, future work would benefit from broader comparisons to recent diffusion-based approaches.

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

Two of the three reviewers recommend acceptance after rebuttal, despite all still mentioning the lack of comparison to the state of the art. However, the innovation and novelty are highlighted by the reviewers.

Generative Unsupervised Anomaly Detection with Coarse-Fine Ensemble for Workload Reduction in 3D Non-contrast Brain CT of Emergency Room

Author(s):