Abstract

Motivated by the question, “Can we generate tumors with desired attributes?” this study leverages radiomics features to explore the feasibility of generating synthetic tumor images. Characterized by its low-dimensional yet biologically meaningful markers, radiomics bridges the gap between complex medical imaging data and actionable clinical insights. We present RadiomicsFill-Mammo, the first of the RadiomicsFill series, an innovative technique that generates realistic mammogram mass images mirroring specific radiomics attributes using masked images and opposite breast images, leveraging a recent stable diffusion model. This approach also allows for the incorporation of essential clinical variables, such as BI-RADS and breast density, alongside radiomics features as conditions for mass generation. Results indicate that RadiomicsFill-Mammo effectively generates diverse and realistic tumor images based on various radiomics conditions. Results also demonstrate a significant improvement in mass detection capabilities, leveraging RadiomicsFill-Mammo as a strategy to generate simulated samples. Furthermore, RadiomicsFill-Mammo not only advances medical imaging research but also opens new avenues for enhancing treatment planning and tumor simulation. Our code is available at https://github.com/nainye/RadiomicsFill.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1807_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1807_supp.pdf

Link to the Code Repository

https://github.com/nainye/RadiomicsFill

Link to the Dataset(s)

https://www.physionet.org/content/vindr-mammo/1.0.0/

https://www.kaggle.com/datasets/martholi/inbreast

BibTex

@InProceedings{Na_RadiomicsFillMammo_MICCAI2024,
        author = { Na, Inye and Kim, Jonghun and Ko, Eun Sook and Park, Hyunjin},
        title = { { RadiomicsFill-Mammo: Synthetic Mammogram Mass Manipulation with Radiomics Features } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        pages = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This work proposes a new method, based on diffusion models, to synthesize realistic mammographic images that contain masses reflecting certain attributes/features of the mass and the healthy breast.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) This is an important research topic that could potentially help address class imbalance in breast cancer screening datasets. 2) The proposed method is conceptually innovative. 3) Extensive evaluation experiments, including external validation and evaluation in downstream tasks. 4) The authors have used two publicly available datasets and have also made their code available on GitHub.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) Potential data leakage in model evaluation: it is unclear whether data splits were made at the patient, image, or mass level. Given that each mammographic exam contains multiple views, a mass may be visible in more than one view, and an exam may contain more than one mass, the way the data was split is critical. 2) More details regarding the “tumor conditions” (i.e., specific shape/radiomic features) embedded in the model are essential. Also, the molecular type of the mass has not been considered, although it largely affects the mass appearance. 3) No quantitative results in the evaluations of Section 3.1. The violin plots do not suffice to prove the realism of synthetic images, especially when features are grouped. Pairwise quantitative evaluations are missing. 4) Without standard deviations or confidence intervals, and p-values for comparisons, the statistical significance of the comparisons shown in Tables 1-3 is unclear. 5) The use of 2D versus 3D mammograms is a limitation that needs to be acknowledged.
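
    One common way to attach the confidence intervals requested in point 4 to a paired comparison is a percentile bootstrap over test cases. The following is a minimal sketch, assuming per-case scores (for example, a hit/miss indicator per image) are available for both detectors on the same test set. The function name `paired_bootstrap_ci` and its inputs are hypothetical and are not taken from the paper; for a metric such as AP, the same resampled indices would be used to recompute the metric rather than a mean.

```python
import random


def paired_bootstrap_ci(scores_a, scores_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(scores_b) - mean(scores_a).

    scores_a and scores_b are per-case scores of two models evaluated on the
    same test cases (hypothetical inputs; the paper reports only aggregates).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have equal length")
    rng = random.Random(seed)
    n = len(scores_a)
    diffs = []
    for _ in range(n_boot):
        # Resample test cases with replacement, keeping the pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        mean_a = sum(scores_a[i] for i in idx) / n
        mean_b = sum(scores_b[i] for i in idx) / n
        diffs.append(mean_b - mean_a)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

    If the resulting interval excludes zero, the difference between the two models is unlikely to be explained by test-set sampling alone.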

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Despite the various strengths of this work, there are major weaknesses to be addressed as listed above.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Good novelty, important research topic, major methodological revisions needed.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper explores the feasibility of radiomics features to generate synthetic tumor images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. This paper is well and clearly written.
    2. The idea is new and interesting.
    3. The evaluations and visualizations look good.
    4. The experiment result successfully shows radiomics features are useful for data augmentation with RadiomicsFill-Mammo.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. More details of the dataset need to be given. In Table 2, how many LB, LM, HB, and HM cases are used for training and validation? Does the 1.000 AP for L-B and L-M come from the unbalanced dataset?
    2. It is still not clear why the opposite breast is needed as input to the model. The generated region should not be symmetric to the other side, should it?
    3. The paper should evaluate the diversity of the generated masses, to rule out the model simply memorizing the existing masses.
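
    One simple way to probe the memorization concern in point 3 is to measure, for each generated mass, the distance to its nearest training mass in some feature space. The sketch below assumes each mass is already summarized as a numeric feature vector (for instance, radiomics features); the function names, the Euclidean metric, and the threshold are illustrative assumptions, not the authors' evaluation.

```python
import math


def nearest_train_distance(generated, training):
    """Distance from each generated feature vector to its closest training vector."""
    return [min(math.dist(g, t) for t in training) for g in generated]


def memorization_rate(generated, training, threshold):
    """Fraction of generated samples lying within `threshold` of a training sample.

    A high rate would suggest near-copies of training masses; the threshold is
    a hypothetical, dataset-dependent choice.
    """
    dists = nearest_train_distance(generated, training)
    return sum(d <= threshold for d in dists) / len(dists)
```

    In practice the features would be standardized first so that no single feature dominates the distance.
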
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    See the weakness.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well written with solid experiments.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper presents a Stable Diffusion-based framework, “RadiomicsFill-Mammo”, that leverages radiomics features to synthesize realistic masses in mammograms. RadiomicsFill-Mammo relies on a tabular encoder to integrate relevant tumor characteristics into mass generation. The evaluation of the generated tumors suggests that the framework can generate tumors with the same feature distribution as the real test set, with the radiomics-based encoder performing slightly better than the text-based clinical ones. The authors also evaluated the use of the synthetic tumors in a downstream mass detection task using YOLOv8.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Extendible work: The proposed method can be applied to other clinical problems of relevance to MICCAI.

    Novelty: Most generative AI work in medical imaging relies on image-only frameworks such as CycleGANs, or incorporates text prompts only. The idea of incorporating radiomics features and other conditions (the opposite breast, leveraging breast symmetry) to synthesize masses with the desired attributes seems promising.

    Evaluation: The authors experimented with different prompt types (MassTextFill, ClinicalTextFill, and RadiomicsFill with two different encoder architectures), which provides context on the contribution of the radiomics features and the proposed framework.

    Comprehensive evaluation: The evaluation uses pixel-based metrics (PSNR, SSIM, MS-SSIM) and a feature-based metric (FID). The authors also demonstrated that the model can be fine-tuned on an external dataset.
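
    For readers unfamiliar with the pixel-based metrics mentioned above, PSNR is defined as 10 * log10(MAX^2 / MSE), where MSE is the mean squared error between a reference and a generated image. The following is a minimal pure-Python sketch of that standard definition; the function name `psnr`, the nested-list image format, and the default `max_value` are assumptions for illustration and do not reflect the authors' evaluation code.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized grayscale images.

    Images are given as nested lists of pixel intensities (rows of values).
    """
    ref = [p for row in reference for p in row]
    gen = [p for row in generated for p in row]
    if len(ref) != len(gen):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - g) ** 2 for r, g in zip(ref, gen)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

    Higher values indicate closer pixel-wise agreement, which is why PSNR is usually reported together with structural (SSIM) and feature-based (FID) metrics.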

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The radiomics feature selection wasn’t provided. Also an analysis on the impact of those features on the generated masses could be beneficial.

    The training hyperparameters and input sizes weren’t mentioned. I would include them in the supplementary material.

    Limited discussion of model performance on the two mammogram views (CC vs. MLO). The authors mentioned that the model was trained on both views and both breast sides; however, the results provide little detail on the performance for each view on its own.

    Lack of detail in the downstream task: The paper lacks a thorough discussion of the fine-tuning task on the INbreast dataset, specifically a comparison of the detection model’s performance when trained on real versus synthetic data. The results should be presented within the same context to evaluate the relative performance and impact of using synthetic data. It is unclear why performance degrades on low-density images when using synthetic data on INbreast, while it remains high on the VinDr-Mammo dataset.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    I would recommend including a discussion of which radiomics features were selected and why.

    A comprehensive evaluation of the utility is warranted: How can the framework mitigate well-known issues, some of which were presented in the paper, such as detection performance in dense breasts? Also, breast density can be categorized into four categories or even expressed as a percentage; might simplifying it to low vs. high contribute to the problem?

    Clinical validation: experts’ evaluation on the generated images would be valuable.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents a novel framework for lesion synthesis, which is currently an important topic. The quality of the work and its evaluation is also a strength, as is the demonstration that the same framework could be extended to other datasets and downstream tasks. Overall, the paper is strong, with some edits needed.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We sincerely thank the reviewers for dedicating their time to review our paper and for providing constructive feedback. We are encouraged by the reviewers’ recognition of the novelty, significance, and clinical relevance of our work. To improve the quality of this paper, we will do our best to address all of the reviewers’ concerns in the final version. Issues that are difficult to resolve within this paper will be improved in future work. Please see the clarifications below.

  • R3-W1 & R4-W1) Data split: All datasets were randomly stratified and split at the patient level, eliminating data leakage risks. For VinDr-Mammo, we used the organizer’s splits; for INbreast, we performed our own split. We will add the number of images per class and clarify the text. R4) The 1.0 AP performance for low-density cases is due to the few low-density samples and their clear visibility. Improving detection rates for high-density cases, where masses are less visible, is crucial.
  • R3-W2 & R5-W1) Details on tumor conditions: We used all features from Pyradiomics [1]. While some features may be redundant, the tabular encoder’s pretraining with feature-specific embeddings captures relationships between features. R3) The public datasets did not provide molecular type information, but we believe shape-based radiomics features suffice. R5) We plan to analyze the impact of selected features on generated masses in future work. [1] Van Griethuysen, Joost JM, et al. “Computational radiomics system to decode the radiographic phenotype.” Cancer research 77.21 (2017).
  • R4-W2) Why is the opposite side needed? In 2D mammograms, normal and abnormal tissues overlap due to projection. A rectangular masked region contains normal tissue and, depending on the radiomics condition, optionally abnormal tissue. Our model uses the information of both the given and opposite sides to fill the masked region with the texture of normal tissue while generating masses that reflect the given conditions. This will be clarified.
  • R4-W3) Diversity in generated mass: Our model does not merely reproduce training masses. It effectively incorporates varying radiomics features for generation. Since the generated samples are diverse and different from the training samples, it leads to improved mass detection performance with synthetic samples (Tables 2 and 3).
  • R5-W3) Performance on different mammogram views: Our model was trained on both CC and MLO views combined without distinguishing between them. Our evaluation includes both views, reflecting overall performance. We did not consider developing separate models for each view, because they provide information from different angles rather than new information like multi-parametric inputs. We plan to explore CC-specific and MLO-specific models in the future.
  • R5-W4) Details on downstream task experiments: As noted, we had detailed comparisons among models in Table 2, while a reduced set of comparisons was done in Table 3. This was because Table 2 was the primary dataset with more samples, while Table 3 was an external validation with limited samples. Still, we will pursue expanded comparison in the future. We believe performance differences between datasets may arise from their different origins (Vietnam vs. Portugal implying different demographics). Additionally, differences in the composition of cases per class (class imbalance) between the datasets could also contribute to the varying performance.
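
The patient-level, stratified split described in the first response above can be sketched as follows. This is only an illustration under stated assumptions: each image is represented as a hypothetical `(patient_id, label)` record, a patient's stratum is taken from the first label seen for that patient, and the function name `patient_level_split` is invented here. It is not the authors' or the dataset organizers' splitting code.

```python
import random
from collections import defaultdict


def patient_level_split(records, test_fraction=0.2, seed=0):
    """Split image records so that no patient appears in both train and test.

    records: list of (patient_id, label) tuples, one per image (hypothetical
    format). Patients are grouped by label and sampled per group, so the
    class balance is roughly preserved in both splits.
    """
    patient_label = {}
    for pid, label in records:
        patient_label.setdefault(pid, label)  # first label defines the stratum
    by_label = defaultdict(list)
    for pid, label in patient_label.items():
        by_label[label].append(pid)

    rng = random.Random(seed)
    test_patients = set()
    for label in sorted(by_label):
        pids = sorted(by_label[label])
        rng.shuffle(pids)
        # Very small strata may contribute no test patients after rounding.
        n_test = round(len(pids) * test_fraction)
        test_patients.update(pids[:n_test])

    train = [r for r in records if r[0] not in test_patients]
    test = [r for r in records if r[0] in test_patients]
    return train, test
```

Because all views of a patient (CC and MLO, left and right) share one `patient_id`, they always land in the same split, which addresses the leakage scenario the reviewer raised.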




Meta-Review

Meta-review not available, early accepted paper.


