Abstract

AI for cancer detection encounters the bottleneck of data scarcity, annotation difficulty, and low prevalence of early tumors. Tumor synthesis seeks to create artificial tumors in medical images, which can greatly diversify the data and annotations for AI training. However, current tumor synthesis approaches are not applicable across different organs due to their need for specific expertise and design. This paper establishes a set of generic rules to simulate tumor development. Each cell (pixel) is initially assigned a state between zero and ten to represent the tumor population, and a tumor can be developed based on three rules to describe the process of growth, invasion, and death. We apply these three generic rules to simulate tumor development—from pixel to cancer—using cellular automata. We then integrate the tumor state into the original computed tomography (CT) images to generate synthetic tumors across different organs. This tumor synthesis approach allows for sampling tumors at multiple stages and analyzing tumor-organ interaction. Clinically, a reader study involving three expert radiologists reveals that the synthetic tumors and their developing trajectories are convincingly realistic. Technically, we analyze and simulate tumor development at various stages using 9,262 raw, unlabeled CT images sourced from 68 hospitals worldwide. The performance in segmenting tumors in the liver, pancreas, and kidneys exceeds prevailing literature benchmarks, underlining the immense potential of tumor synthesis, especially for earlier cancer detection.

The code and models are available at https://github.com/MrGiovanni/Pixel2Cancer
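
As a rough illustration of the rule-based idea described in the abstract, the sketch below runs a toy three-rule cellular automaton on a small 3D grid. It is not the released Pixel2Cancer code. The function name `simulate_tumor`, the 6-connected neighbourhood, the growth probability `grow_prob`, the invasion condition, the death condition, and the `DEAD` sentinel are all assumptions, chosen only to show how growth, invasion, and death could interact on cell states between zero and ten.

```python
import random

MAX_STATE = 10  # the abstract gives each cell a state between zero and ten
DEAD = -1       # assumption: a sentinel outside 0..10 marks dead cells

# Assumption: 6-connected neighbourhood in 3D.
OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def simulate_tumor(shape, seed_voxel, steps, grow_prob=0.5, enable_death=True, rng_seed=0):
    """Run a toy three-rule automaton and return {voxel: state} for touched cells."""
    rng = random.Random(rng_seed)
    state = {seed_voxel: 1}
    for _ in range(steps):
        updates = {}  # synchronous update: read `state`, write `updates`
        for voxel, value in state.items():
            if value == DEAD:
                continue
            if value < MAX_STATE:
                # Growth: the local population increases by one unit.
                if rng.random() < grow_prob:
                    updates[voxel] = value + 1
                continue
            neighbours = [tuple(c + d for c, d in zip(voxel, off)) for off in OFFSETS]
            neighbours = [n for n in neighbours
                          if all(0 <= c < size for c, size in zip(n, shape))]
            empty = [n for n in neighbours if n not in state and n not in updates]
            if empty:
                # Invasion: a saturated cell seeds one random empty neighbour.
                updates[rng.choice(empty)] = 1
            elif enable_death and all(state.get(n) in (MAX_STATE, DEAD) for n in neighbours):
                # Death: a saturated cell enclosed by saturated or dead cells dies.
                updates[voxel] = DEAD
        state.update(updates)
    return state


if __name__ == "__main__":
    tumor = simulate_tumor(shape=(32, 32, 32), seed_voxel=(16, 16, 16), steps=200)
    print(len(tumor), "voxels touched after 200 steps")
```

Calling `simulate_tumor` with different `steps` values from the same seed gives snapshots of one hypothetical tumor at several stages, which is the kind of multi-stage sampling the abstract refers to.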

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1596_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1596_supp.pdf

Link to the Code Repository

https://github.com/MrGiovanni/Pixel2Cancer

Link to the Dataset(s)

https://huggingface.co/datasets/AbdomenAtlas/AbdomenAtlas1.0Mini

BibTex

@InProceedings{Lai_From_MICCAI2024,
        author = { Lai, Yuxiang and Chen, Xiaoxi and Wang, Angtian and Yuille, Alan and Zhou, Zongwei},
        title = { { From Pixel to Cancer: Cellular Automata in Computed Tomography } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15001},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    In this paper, the authors propose a method for synthetic tumor generation that is applicable across different organs. They use cellular automata to simulate tumor development with 3 rules. The method is evaluated clinically with a reader study, in which 3 radiologists were convinced by the synthetic tumors. The method is also evaluated technically with segmentation algorithms trained only on those synthetic tumors. They achieve better results than 2 other algorithms (1 trained with synthetic tumors generated with a different strategy and 1 trained with real tumors) for 3 different use cases (liver, pancreas, and kidney tumors).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The idea of tumor synthesis has huge potential to address the bottleneck of data scarcity, annotation difficulty, and low prevalence of early tumors in public datasets. Using cellular automata to simulate tumor development is interesting and novel. The paper is overall well written and easy to read and follow. The authors performed a thorough evaluation with a multi-reader study, a technical evaluation in which the synthetic tumors are used as training data for segmentation algorithms, and an ablation study. Improved small-tumor detection and accurate boundary segmentation are demonstrated.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some small details are missing:

    1. In the implementation of the tumor development: How is the invasion direction chosen? The explanation of the interaction with the organ is not clear (bottom of p. 4). The bottom line of Fig. 2 is not clear from the legend.

    2. In the segmentation algorithm implementation: Why were 3 backbones chosen? Training details should be added: how many images are used for training, evaluation, and testing for the 3 methods?

    3. Data used: There are no healthy livers in the LiTS dataset. The authors should clarify what is meant by “healthy subjects in LiTS.”

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    It would be interesting to show an image with the generated synthetic tumors (one classified as real by the 3 radiologists and one correctly classified as synthetic). This image could be added next to Table 1, for example. The novelty of the proposed approach in comparison to [11] should be explained in the introduction.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Good paper with a novel and interesting approach and a good evaluation.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper presents a tumour synthesis approach in real CT data from the liver, pancreas, and kidney. The proposed method models three processes (growth, invasion, and death) using cellular automata. Tumours are generated in 9,262 raw, unlabeled CT images sourced from 68 hospitals worldwide. The method is evaluated in two ways: (1) three expert radiologists classify tumours as real vs. synthetic in a Visual Turing Test [ref 4]; (2) the synthetic tumours are used as training data for segmentation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Generating synthetic tumours is important for (1) training segmentation methods and (2) predicting tumour growth for improved treatment. The method proved very effective for (1).

    The three mechanisms considered, namely growth (proliferation), invasion, and interaction with surrounding organs, agree with the literature in the area.

    The method proved very useful in improving segmentation accuracy in three main SOTA segmentation networks as part of the MONAI framework: U-Net [ref 25], Swin-UNETR [ref 8], and nnU-Net [ref 14]. As an example, for liver segmentation, Pixel2Cancer outperforms the methods trained on real data by 5.7% in NSD. It also surpasses Hu et al. by 4.4% in DSC and 6.1% in NSD. This demonstrates the usefulness of the method in improving segmentation performance.

    The generated tumours were very realistic. The Visual Turing Test performed by the three radiologists (Table 1) demonstrates the realistic visual appearance of the simulated tumours. As an example, R3 (10 years) misidentifies 47.1% of synthetic tumors as real.

    Contribution 3, synthesizing tumors across organs, is quite novel and important. The method works very well in the liver, pancreas, and kidney. The fact that the method does not require manual annotations makes it adaptable to other structures as well.
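
For readers less familiar with the overlap metric quoted in this review, DSC is the Dice similarity coefficient, 2|P ∩ T| / (|P| + |T|) for a predicted mask P and a reference mask T. The sketch below computes it on sets of foreground voxel coordinates. The function name `dice_score`, the set-of-coordinates representation, and the convention for two empty masks are illustrative assumptions, not taken from the paper. NSD (normalized surface Dice) additionally requires surface distances and a tolerance, and is not shown.

```python
def dice_score(pred, truth):
    """Dice similarity coefficient between two collections of foreground voxels."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))


if __name__ == "__main__":
    predicted = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
    reference = {(0, 0, 0), (0, 0, 1), (1, 1, 1)}
    # Two shared voxels out of 3 + 3 foreground voxels.
    print(round(dice_score(predicted, reference), 3))
```

A difference of "4.4% in DSC" in the review therefore refers to a difference in this ratio (expressed in percentage points), typically averaged over test cases.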

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    One major structure left out is the brain. The authors did not comment on the applicability of the method to brain tumour simulation. Another concern related to translation to brain data is that CT is often not available. How would the method be adapted for MRI data, where quantifying tissue properties is more difficult?

    Another concern with the current model is the mass effect that tumours exert on the surrounding tissue, changing the appearance and shape of nearby structures (as in ref 28). It is not clear whether this problem is addressed in the current study.

    The approach is purely macroscopic and discrete, at the voxel level. Cells are much smaller than that. Most mathematical models (like ref 7) propose a continuous formulation that could be applied at any resolution.

    The growth model from Section 2.2 appears a bit heuristic. There are many sound mathematical growth models. How does the proposed approach relate to those models? Is there any way to incorporate one of the known mathematical models for growth (proliferation), invasion (diffusion), or mass effect into this formulation?

    Nevertheless, the validation shows that the model is able to create realistic looking tumours.

    There is no comparison with any other tumour growth model.
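
For orientation, the continuous proliferation and diffusion models alluded to in this review are commonly written in reaction-diffusion form. The equation below is one widely used example and is given only as background; it is not claimed to be the exact model of ref 7 or a formulation used in the paper. The symbols are assumptions introduced here: u is the tumour cell density, D the diffusion (invasion) coefficient, ρ the proliferation rate, and K the carrying capacity.

```latex
\frac{\partial u}{\partial t}
  = \nabla \cdot \left( D \, \nabla u \right)
  + \rho \, u \left( 1 - \frac{u}{K} \right)
```

Read against the rule-based approach, the first term plays the role of invasion and the second the role of growth saturating at a maximum population, which is one way to frame the relation the reviewer asks about.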

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors included the code with the submission. The code will be released.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Introduction, “Moreover, no prior work has modeled how tumors develop over time”: there are many works on tumour growth prediction (for example, mathematical models as in ref 7). The statement needs clarification.

    As mentioned in Section 5, another important application of tumour synthesis is growth prediction when starting from an existing tumour. How could the method be applied in this case? This would open up the opportunity to validate the method in the context of tumour growth.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The approach is well validated and seems effective in producing visually realistic tumours that prove useful for training segmentation networks. This is a valuable contribution.

    My main concern is the lack of theoretical grounding for the method, considering all the mathematical or learning-based (e.g., [1] below) works on tumour growth. Also, no comparison with other models is made.

    [1] A Generative Approach for Image-Based Modeling of Tumor Growth, Menze et al, IPMI 2011

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper proposes a rule-based method to synthesize artificial tumors (and their development) in CT data. The focus is on synthesizing tumors for scenarios with data scarcity, annotation difficulty, and low prevalence, such as early tumors. The 3D cellular automata approach incorporates three generic rules for tumor growth, invasion, and death. The tumor population map is converted to CT using a straightforward mapping function, i.e., no learning-based method is used to synthesize the tumors. To test the effectiveness of the proposed approach, tumor realism is evaluated by three clinicians, and three deep learning-based segmentation architectures are trained and their segmentation performance is evaluated across three application areas: liver, pancreas, and kidneys. Both the clinical reader study and the performance of the models trained on the synthetic tumors show good results.
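
To make the idea of a "straightforward mapping function" concrete, the sketch below darkens CT intensities in proportion to the tumor state. It is only a guess at the general shape of such a mapping and not the authors' function: the name `blend_tumor_into_ct`, the linear relationship, and the contrast parameter `max_drop_hu` (the Hounsfield-unit reduction at full state) are all hypothetical. Lowering intensity is chosen here because the reviews describe the synthesized tumors as dark relative to the organ.

```python
MAX_STATE = 10  # tumor states are described as ranging from zero to ten


def blend_tumor_into_ct(ct_hu, tumor_state, max_drop_hu=40.0):
    """Return a copy of ct_hu darkened in proportion to the tumor state.

    ct_hu and tumor_state are equally shaped nested lists indexed [z][y][x].
    States outside 0..MAX_STATE are clamped before the linear mapping.
    """
    return [
        [
            [
                hu - max_drop_hu * min(max(state, 0), MAX_STATE) / MAX_STATE
                for hu, state in zip(ct_row, state_row)
            ]
            for ct_row, state_row in zip(ct_plane, state_plane)
        ]
        for ct_plane, state_plane in zip(ct_hu, tumor_state)
    ]


if __name__ == "__main__":
    ct = [[[100.0, 100.0, 100.0]]]   # one row of organ tissue, in HU
    states = [[[0, 5, 10]]]          # empty, half-populated, saturated
    print(blend_tumor_into_ct(ct, states))
```

A mapping of this kind makes the output depend only on the population map and a few scalar parameters, which is consistent with the reviewer's point that no learning-based component is involved, and also with the concern about limited appearance diversity.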

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • The authors investigate a very relevant and interesting field of tumor synthesis for improving models for early detection of cancer.
    • The proposed tumor synthesis method is not learning-based and thus not data dependent. Knowledge about tumor growth and general anatomy is utilized.
    • It is one of the first methods I’ve seen synthesizing tumor growth.
    • The method is applied to multiple tumor types.
    • Models trained on the synthesized tumors show improved performance.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • The paper proposes tumor synthesis for improved early tumor detection. However, only a small ablation experiment result is shown (Figure 4b). From the figure and text it is also not clear which segmentation model produced that result. Looking at the pancreatic tumor detection sensitivity, the presented results are very low compared to what nnU-Net already achieved on the Medical Segmentation Decathlon in 2019 on the same task: https://decathlon-10.grand-challenge.org/evaluation/e7503ac3-57b3-44a1-b448-b76314701b02/

    • While the authors present results of a clinical study in which tumor realism is evaluated, the diversity of the synthesized tumors is somewhat in question. It is well known that not all tumors present as large hypo-intense tumors. The synthesized kidney tumors are all very well defined, in line with how kidney cysts would look rather than tumors. The proposed framework (in its current form) seems to be limited to producing hypo-intense (dark) and round tumors. Further extending or improving the mapping from tumor population map to CT for greater diversity would be a great addition. I do think this requires a lot of domain knowledge and should likely be specific to each tumor type.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The code to simulate the tumor growth, mapping to CT and training code for the segmentation models are provided. The pretrained weights are also provided on the github repository. Everything should be reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    • Given that the automata model acts in 3D and the segmentation models were always trained on 3D voxel data, why “pixel to cancer”? “Voxel to cancer” is more accurate.
    • More details regarding the tumor detection performance and implementation are welcome.
    • I think an incorrect reference to the “Theory of self-reproducing automata” [23] is used.
    • The proposed tumor synthesis approach seems very promising, with a great deal of control provided in the synthesis process. While the same rules are currently applied in the three domains, as the authors point out in the conclusion, there are peculiarities of each tumor that are currently not captured. Future work could dive into these.
    • In the abstract the authors wrote: “Technically, we generate tumors at varied stages in 9,262 raw, unlabeled CT images sourced from 68 hospitals worldwide”. I do not see how this number was obtained or whether that many tumors were synthesized.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposes an interesting cellular automata rule-based method for synthesizing artificial tumors in CT data, emphasizing scenarios with scarce data and early tumor detection. The proposed approach does not rely on extensive datasets and instead maps tumor growth models to CT data in a straightforward way. This lowers the barrier to entry, allowing research labs without access to large datasets to employ the proposed method. The effectiveness is demonstrated through evaluations involving clinical assessments and deep learning models across multiple organ types, with promising results. However, the study reveals limitations in the diversity of the synthesized tumors, and the details on tumor detection performance could be clarified.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

N/A




Meta-Review

Meta-review not available, early accepted paper.


