Abstract

Although existing medical image segmentation methods provide impressive pixel-wise accuracy, they often neglect topological correctness, making their segmentations unusable for many downstream tasks. One option is to retrain such models whilst including a topology-driven loss component. However, this is computationally expensive and often impractical. A better solution would be a versatile plug-and-play topology refinement method that is compatible with any domain-specific segmentation pipeline. Directly training a post-processing model to mitigate topological errors often fails, as such models tend to be biased towards the topological errors of a target segmentation network. The diversity of these errors is confined to the information provided by a labelled training set, which is especially problematic for small datasets. Our method solves this problem by training a model-agnostic topology refinement network with synthetic segmentations that cover a wide variety of topological errors. Inspired by the Stone-Weierstrass theorem, we synthesize topology-perturbation masks with randomly sampled coefficients of orthogonal polynomial bases, which ensures a complete and unbiased representation. In practice, we verify the efficiency and effectiveness of our method with multiple families of polynomial bases, and show evidence that our universal plug-and-play topology refinement network outperforms both existing topology-driven learning-based and post-processing methods. We also show that combining our method with learning-based models provides an effortless add-on that can further improve the performance of existing approaches.
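The core idea of the abstract, sampling random coefficients over an orthogonal polynomial basis and turning the resulting field into a topology-perturbation mask, can be sketched roughly as follows. This is a simplified 2D illustration, not the authors' implementation: the function name, the normalization, and the thresholding rule are assumptions.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def random_perturbation_mask(size=64, order=6, thresh=0.3, rng=None):
    """Sample a 2D field from a random-coefficient Chebyshev expansion
    and threshold it into a binary topology-perturbation mask."""
    rng = np.random.default_rng(rng)
    # Random coefficients c[i, j] for the basis terms T_i(x) * T_j(y)
    coeffs = rng.standard_normal((order + 1, order + 1))
    x = np.linspace(-1.0, 1.0, size)
    # chebgrid2d evaluates sum_{i,j} c[i,j] T_i(x) T_j(y) on the grid
    field = C.chebgrid2d(x, x, coeffs)
    field /= np.abs(field).max()  # normalize to [-1, 1]
    return (field > thresh).astype(np.uint8)

mask = random_perturbation_mask(rng=0)
print(mask.shape)  # (64, 64)
```

A mask like this would then be combined with a ground-truth label map (e.g. via XOR or masking) to create synthetic topological errors for training the refinement network, per the paper's description.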

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2215_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/2215_supp.pdf

Link to the Code Repository

https://github.com/smilell/Universal-Topology-Refinement

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Li_Universal_MICCAI2024,
        author = { Li, Liu and Wang, Hanchun and Baugh, Matthew and Ma, Qiang and Zhang, Weitong and Ouyang, Cheng and Rueckert, Daniel and Kainz, Bernhard},
        title = { { Universal Topology Refinement for Medical Image Segmentation with Polynomial Feature Synthesis } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15009},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    (1) The authors propose a universal topology refinement pipeline to revise neural-network-derived segmentations. Their approach serves as a plug-and-play module and can adapt to any segmentation backbone. (2) To address a wide range of potential topological errors, they develop two types of synthesis: continuous topology-perturbation synthesis and segmentation map synthesis. These designs guarantee the variety and completeness of the synthetic training data. (3) By leveraging an abundance of synthetic samples, their approach exhibits more robust performance, especially for high-dimensional datasets with limited GT labels. They demonstrate superior performance compared to existing state-of-the-art methods and showcase generalization capacity by revealing the method's potential to improve existing segmentation models without the need for fine-tuning.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed workflow for generating unbiased topology-aware segmentation maps is a novel way of generating data. The authors provide sufficient theoretical justification for it.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Experiments are carried out on only one dataset with a single modality; the effectiveness of the proposed method needs to be verified on more datasets. The method only targets tubular structures, so it is inappropriate to call it universal. The comparison methods are not very new; in particular, reference [23] is a review, which is inappropriate as a baseline. It also needs to be explained why the Dice score did not improve at all but instead declined slightly. In summary, I need more experiments showing the improvement of the method on more specific segmentation tasks, such as blood vessels, airways, etc. Please report whether your approach improves the baseline on these tasks.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Additional experiments: The proposed method can be used for more specific scenarios that require fine segmentation tasks, such as blood vessels, airways, etc. Report whether your approach improves the baseline on these tasks.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed topology refinement post-processing scheme has high reproducibility, the authors have justified their method theoretically, and they have promised to make the code public. However, the effectiveness of the proposed method needs to be verified on more datasets.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Reject — should be rejected, independent of rebuttal (2)

  • [Post rebuttal] Please justify your decision

    The proposed workflow for generating unbiased topology-aware segmentation maps is a novel way of generating data. However, the effectiveness of the proposed method was only verified on one dataset, as presented in the paper. The authors claimed that the proposed method has been verified on two more datasets, which were not included in the Supplementary. In addition, the comparison methods are not very new, including the method presented in the rebuttal, which dates from 2010. The rebuttal did not address my concerns.



Review #2

  • Please describe the contribution of the paper

    This paper solves an issue in medical image segmentation: the lack of topological correctness in existing approaches, which limits their use in many downstream applications. Current techniques focus on pixel-level precision but frequently overlook the topological structure of the images, which is critical for practical applications. The research proposes a model-independent topology refinement network that uses synthetic segmentations to train for a wide range of topological defects. This method uses the Stone-Weierstrass theorem to create topology-perturbation masks from orthogonal polynomial bases, resulting in a complete and unbiased error representation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The method is compatible with any domain-specific segmentation pipeline, making it an adaptable option for a variety of applications.

    2. Using synthetic data based on polynomial perturbations provides wide coverage of potential topological defects, increasing the model’s robustness.

    3. Demonstrated success in segmentation refinement, outperforming other approaches in terms of topology correctness.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The method’s use of polynomial synthesis to generate topology-perturbation masks may not always reflect the complex and varied nature of topological flaws in medical images, restricting its usefulness in a variety of real-world circumstances.

    2. When repairing some topological defects, refinement procedures may inadvertently generate new errors or distortions, especially if the corrections are not fully aligned with the true underlying anatomy.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The paper experimented with three distinct polynomial families. How did you choose these families, and how did they affect the results? Why were these families picked over others?

    2. Why did you choose the Wilcoxon signed-rank test to statistically validate your findings? Are there any additional statistical methods that could reveal different aspects of the model’s performance?

    3. Can you explain how different polynomial orders affect the performance of topology refinement? What prompted the decision to test up to N = 10, and how does adjusting this parameter affect the outcomes?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper introduces a universal topology refinement method as an advanced approach to dealing with topological errors in medical image segmentation. To increase topological accuracy of segmentations, this method uses model-agnostic post-processing networks trained on synthetic data. The use of polynomial-based data synthesis to construct topology-perturbation masks seems new and is important. However, I doubt whether such an approach can ensure topological correctness with any sort of guarantee.

    Integrating the topology refinement method with other cutting-edge machine learning techniques, such as deep learning models that explicitly incorporate topological priors or adversarial training methods, may result in even greater segmentation accuracy and error correction.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    I appreciate the authors for providing a rebuttal. I believe this is a solid paper after reading other reviews and the rebuttal. It shows that topological errors in segmentation can be effectively reduced.



Review #3

  • Please describe the contribution of the paper

    This paper proposes a topological refinement method to postprocess probability maps output by any segmentation model in a model-agnostic way. It generates topology-perturbation masks using orthogonal polynomial bases with randomly sampled coefficients. Ground truth images perturbed with the masks are used to train a refinement network. It is claimed that the proposed procedure generates topology-perturbed probability maps in a comprehensive and unbiased manner. The authors experiment with multiple families of polynomial bases, and show that their refinement model outperforms existing methods and can be plugged into any segmentation model to enhance the quality of output segmentations without having to fine-tune the refinement network.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper proposes a novel method to generate topology-perturbed images to augment the training of the refinement network. The proposed topology refinement model can be used with any segmentation model without fine-tuning. The claims of improved performance are backed with experimental results and ablation studies.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Important details about the refinement network are missing, such as: a) What was the network architecture? b) How much synthetic data was used to train the network? c) How does the proposed method compare to others in terms of training time? The writing could be further improved. For example, mathematical notation needs to be defined and explained when it is first used and the captions need to explain the figures better.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    While the synthetic data generation is well explained, no details about the refinement network architecture or the training protocol are provided. There’s no way to reproduce the results based on provided information.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • Figure 1 - The notation is not explained which makes it hard to understand the figure. The caption needs to be self-explanatory.
    • It is not clear what architecture was used for the refinement network. Details about the architecture, training procedure, etc. must be provided.
    • It seems to me that the proposed model needs to cover a much larger input space to account for the diversity of topological errors. How much training data was used to train the network? How many perturbed images were generated from each ground truth sample?
    • Each synthetic input is generated by perturbing the ground truth. So while the proposed network is model-agnostic, is it not limited by the available ground truth data? Also, the method is model-agnostic as long as the models are performing the same task. It is not domain or task agnostic. So what’s the advantage of having a plug-and-play module?
    • Different segmentation tasks can have different requirements in terms of topology. For example, segmenting blood vessels and segmenting cellular structures have opposite topological constraints - one requires acyclic structure, the other needs cycles. Can the proposed refinement network be used in both situations without retraining or fine-tuning? If yes, please explain how. If not, can you justify the need for a plug-and-play module that only works for the specific domain and task it was designed for?
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposes a novel method to generate synthetic topology-perturbed data to train a refinement network independent of the segmentation model. This idea is powerful in its own right. While the authors claim to have trained a refinement network and provide experiment results no details about the model or the training procedure are provided. These details are essential for the paper to be complete. I expect these details to be provided in the rebuttal.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors have addressed my concerns satisfactorily. This work is novel and deserves to be published.




Author Feedback

We thank the reviewers for their positive feedback on the novelty and theoretical rigor of our polynomial-based topology refinement method.

  1. Completeness and Adaptability [R1, R3] Our approach utilizes a complete polynomial basis to synthesize training data, allowing it to approximate any complex topological structure with arbitrary precision, as guaranteed by the Stone-Weierstrass theorem [24]. This ensures that our method can adapt to a diverse set of topologies across different real-world scenarios, making it universally applicable.
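The Stone-Weierstrass argument invoked here, that a complete polynomial basis can approximate any continuous function on a closed interval to arbitrary precision, can be illustrated numerically. This is a generic sketch, not from the paper; the target function and degrees are arbitrary choices.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Approximate a non-polynomial target on [-1, 1] with Chebyshev fits
# of increasing degree; the maximum error shrinks toward zero.
x = np.linspace(-1.0, 1.0, 1001)
target = np.exp(np.sin(3 * x))

errors = []
for deg in (2, 5, 10, 20):
    fit = C.Chebyshev.fit(x, target, deg)
    errors.append(float(np.max(np.abs(fit(x) - target))))

print(errors)  # strictly decreasing as the degree grows
```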

  2. Cross-dataset Evaluation [R1] We evaluate the TopCoW dataset for 3D segmentation on purpose due to its challenging topological structures, including both 2D and 3D topological errors (holes). We also test our method on the CREMI and FIVES datasets for neuron and retina vessel segmentation, respectively. These results were not included in the submission due to space constraints and redundancy but will be provided in the supplement. Our method outperforms SOTA baselines in all metrics on these datasets, e.g., achieving 83.92 Dice and 6.11 Betti error on CREMI, compared to Warp-loss [9] results of 83.03 and 11.78. We emphasize that our goal is to enhance topological correctness while maintaining Dice performance.

  3. Baseline [23] [R1] We compare with a heuristic post-processing method that filters small isolated outliers. We will add [1] as additional reference. We disagree with the reviewer and find that a review paper is suitable for referencing a family of methods and their common post-processing techniques. [1] Vlachos, M., Dermatas, E.: Multi-scale retinal vessel segmentation using line tracking. CMIG, 2010.

  4. Introduction of New Errors [R3] We experimentally demonstrate that our refinement eliminates topological errors without introducing new errors. Our method improves performance across all baseline models and avoids dataset-specific biases, as shown in Tab. 3.

  5. Polynomial Families and Order [R3] As stated in “Preliminary”, our three orthogonal polynomial bases are selected for their favorable analytical properties. For instance, Chebyshev polynomials Tn(x) have n-1 critical points when n ≥ 1 (see Appendix). The polynomial order is chosen experimentally: incorporating more high-frequency terms can lead to anatomically implausible structures, as discussed in “Ablation study”.
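The critical-point property cited here can be sanity-checked numerically (not part of the paper; a quick verification using NumPy's Chebyshev class): the derivative of T_n is n·U_{n-1}, whose n-1 roots all lie strictly inside (-1, 1).

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def interior_critical_points(n):
    """Count real critical points of the Chebyshev polynomial T_n
    strictly inside (-1, 1)."""
    Tn = C.Chebyshev.basis(n)
    roots = np.atleast_1d(Tn.deriv().roots())
    real = roots[np.isclose(roots.imag, 0)].real if roots.size else roots.real
    return int(np.sum((real > -1) & (real < 1)))

for n in range(1, 6):
    print(n, interior_critical_points(n))  # each line shows n, n - 1
```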

  6. Statistical Significance [R3] Following [17], we conducted a Wilcoxon signed-rank test to assess statistical significance. A t-test was also conducted; both indicate statistical significance (p<0.05).

  7. Topological Guarantees and Integration with SOTA Methods [R3] Our method is compatible with any other segmentation method, including topology-aware methods with varying guarantees. However, none of these methods can guarantee topological correctness during inference; they rely solely on the network’s learned representation capability. Our method learns a robust representation by addressing synthetic topological errors with guaranteed variability. As illustrated in Tab. 3, we combine our method with cl-Dice, Boundary loss, PH loss, and warp loss, showing a consistent reduction in topological errors.

  8. Network Architecture [R4] Our post-processing network uses a U-Net [21, 13] for fair comparison, as stated in “Quantitative Evaluation”. We will make our code available together with the camera ready version.

  9. Training Protocol and Time [R4] Our method randomly generates synthetic data in real-time, with a training time of 0.29 seconds per batch, 21.0 times faster than the baseline method PH-loss [10].

  10. Clarification of Domain-Agnostic Design [R4] Our model, trained on synthetic data using only the GT labelmap, is agnostic to covariate (domain) shifts, e.g., imaging devices and scanning protocols. Such a design makes our method a robust plug-and-play option for various upstream segmentation models. Retraining our model on new tasks with different topological structures is straightforward and quick.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This is a solid and novel work to generate synthetic topology-perturbed data to train a refinement network independent of the segmentation model. Experiments generally demonstrate its effectiveness. Hence, I recommend acceptance. However, there are still two concerns: 1) it uses the challenge data, yet, results are not reported on the official testing set. 2) The reported betti number 0 seems much larger than the top methods reported in the TopoCoW dataset. (1.07 vs 0.37 ~ 0.6)

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Reviewer 1 has concerns about not using the latest baselines but did not specify any for comparison. Nonetheless, all reviewers acknowledge the overall novelty of this paper. Therefore, I recommend accepting it.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).



