Abstract

Segmentation models for thyroid ultrasound images are challenged by domain gaps across multi-center data. Existing methods address this issue either by enforcing consistency across multiple domains or by simulating domain gaps through augmentation of a single domain. Among them, single-domain generalization methods offer a more universal solution, but their heavy reliance on data augmentation causes two issues for ultrasound image segmentation. First, the corruption introduced by data augmentation may alter the distribution of diagnostically significant grayscale values, degrading the model’s segmentation ability. Second, the real domain gap between ultrasound images is difficult to simulate, so the learned features remain correlated with the domain, which in turn prevents the construction of a domain-independent latent space. To address these issues, and given that the shape distribution of nodules is task-relevant but domain-independent, the SHape-prior Affine Network (SHAN) is proposed. SHAN uses the shape prior as a stable latent mapping space, learning the aspect ratio, size, and location of nodules through affine transformation of the prior. Thus, our method enhances the segmentation capability and cross-domain generalization of the model without any data augmentation. Additionally, SHAN is designed as a plug-and-play method that can improve the performance of segmentation models with an encoder-decoder structure. Our experiments are performed on the public dataset TN3K and a private dataset TUI with 6 domains. By combining SHAN with several segmentation methods and comparing them with other single-domain generalization methods, we show that SHAN achieves the best performance on both source and target domain data.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0407_paper.pdf

SharedIt Link: https://rdcu.be/dY6jN

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72083-3_68

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/0407_supp.pdf

Link to the Code Repository

N/A

Link to the Dataset(s)

https://drive.google.com/file/d/1reHyY5eTZ5uePXMVMzFOq5j3eFOSp50F/view?usp=sharing

BibTex

@InProceedings{Zha_SHAN_MICCAI2024,
        author = { Zhang, Ruixuan and Lu, Wenhuan and Guan, Cuntai and Gao, Jie and Wei, Xi and Li, Xuewei},
        title = { { SHAN: Shape Guided Network for Thyroid Nodule Ultrasound Cross-Domain Segmentation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15004},
        month = {October},
        pages = {732--741}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    In this paper, the authors propose a shape-prior affine network for cross-domain segmentation of ultrasound images. Prior knowledge of the nodule shape is introduced to constrain a stable, domain-invariant representation in the model, and the global affine module is designed to construct the affine mapping between the nodule shape distribution and the latent features, thus providing a preliminary estimate of the nodule shape. The neighbor affine module is then introduced to refine this preliminary shape estimate and complete the segmentation of the nodule. Superior performance is obtained in extensive experiments on public and private datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper builds on the observation that a thyroid nodule is usually approximately elliptical, a property that is domain-independent but task-relevant. The authors initialize a nodule shape prior, establish a mapping between the latent features and the axial/transverse (A/T) distribution of the nodule, and search for stable domain-invariant representations. The nodule shape prior S, whose key state is A/T, is a fixed binary circle of radius r centered at the image center.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The statement in section 2.1 Nodule Shape Prior of the text is not clear enough, e.g., how is the shape prior generated, in what way is SHAN initialized to obtain the shape prior knowledge, and is it position-dependent?
    2. In section 2.2 Global Affine Module, what is the meaning of each term in Eq. 1? In addition, given the shape diversity of nodules, how are the offset factors v_x and v_y and the rotation angle theta determined?
    3. In section 2.3 Neighbor Affine Module, the pixel-by-pixel offset of the elliptical prediction is given as A = f(D(E), U). How is f specifically implemented? The paper lacks the relevant explanation.
    4. In this paper, how effective is the proposed method on samples with significant shape differences, such as non-elliptical samples?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. In section 2.1 Nodule Shape Prior, it is necessary for the authors to explain the method of shape prior generation, and supplement how SHAN initializes to obtain shape prior knowledge, so as to help readers understand nodule shape prior.
    2. In section 2.2 Global Affine Module, please explain the meaning of each term in Eq. 1 and the basis for the affine transformation. In addition, it is necessary to give the basis for determining the offset factors v_x, v_y, and the rotation angle theta in the equation.
    3. In section 2.3 Neighbor Affine Module, please supplement specific implementation methods for f_n.
    4. In the experimental section, please add experiments on samples with significant shape differences, such as non-elliptical samples, to further demonstrate the effectiveness of the proposed method.
    5. In this paper, what is the complexity of the proposed model? Please provide the number of parameters for comparison of different models.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The major factors behind my overall score are as follows. The authors propose a shape-prior affine network that uses the nodule shape prior to design the global affine module and the neighbor affine module. By constructing an affine relationship between the distribution of nodule shapes and the latent features, the method performs a preliminary estimation and then a refinement of the nodule shape to complete the segmentation. The method is novel and the experimental results are better than those of the compared models, but the description of the method’s innovation is insufficient, and the experimental part lacks segmentation results on difficult samples to illustrate the effectiveness of the method.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper introduces the Shape-prior Affine Network (SHAN), a novel approach to enhancing segmentation models for thyroid ultrasound images, particularly addressing the challenge of domain gaps in multi-center data. SHAN leverages shape priors, specifically the shape distribution of nodules which is both task-relevant and domain-independent, to create a stable latent mapping space without relying on data augmentation. This method not only preserves the diagnostic integrity of grayscale values but also effectively constructs a domain-independent latent space to improve segmentation accuracy and cross-domain generalization. Additionally, SHAN is versatile, designed as a plug-and-play method compatible with existing encoder-decoder segmentation models, and has demonstrated superior performance across multiple domains in comparative tests.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strengths of the paper are as follows:

    1. Novel Formulation: The Shape-prior Affine Network (SHAN) introduces a novel formulation in the domain of medical image segmentation. Unlike traditional approaches that heavily rely on data augmentation to bridge domain gaps, SHAN utilizes shape priors to create a stable latent mapping space. This is particularly innovative because it directly leverages the inherent characteristics of nodules (shape, size, aspect ratio) which are consistent across different datasets, thereby ensuring that the model is less dependent on the specific characteristics of the training data and more robust to variations across different centers.
    2. Original Use of Data: The use of shape priors as a stable latent mapping space represents an original approach to dealing with domain variability in medical imaging. This method acknowledges that while the appearance of images may vary across domains, certain biological features (like the shape of thyroid nodules) remain consistent. By focusing on these consistent features, SHAN effectively mitigates the challenges posed by domain-specific variations without the corruption risks associated with traditional data augmentation techniques.
    3. Strong Evaluation: The evaluation of SHAN is thorough and robust, involving both a public dataset (TN3K) and a private multi-domain dataset (TUI). This dual-dataset strategy not only enhances the validity of the results but also demonstrates the model’s effectiveness across different settings, which is crucial for clinical applications. The comparison of SHAN with other single-domain generalization methods, and its superior performance, further underscore its practical utility and the effectiveness of its novel approach.
    4. Clinical Feasibility: SHAN’s design as a plug-and-play method compatible with existing encoder-decoder structures enhances its clinical feasibility. This design choice means that SHAN can be easily integrated into existing clinical workflows and systems without the need for extensive modifications. This ease of integration, coupled with its demonstrated effectiveness across different domains, makes it a viable option for real-world clinical applications, potentially leading to better diagnostic outcomes in thyroid ultrasound imaging.
    5. Cross-Domain Generalization: The paper specifically addresses the challenge of domain gaps, which is a significant issue in medical imaging due to the variability in imaging equipment and techniques across different clinics or hospitals. SHAN’s ability to generalize across domains without the need for domain-specific tuning or extensive data augmentation is a substantial advantage, as it simplifies the deployment of robust medical imaging solutions in diverse clinical environments.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    While the SHAN framework presents several innovative aspects for the segmentation of thyroid nodules in ultrasound images, there are potential weaknesses or areas that could be improved or further clarified. Here are some points that could be considered as weaknesses:

    1. Dependency on Shape Prior Initialization: The effectiveness of SHAN heavily relies on the initialization of the shape prior, which is assumed to be a binarized circle with a fixed A/T ratio of 1. This assumption might not be universally valid across all thyroid nodules, which can vary greatly in shape and size depending on the patient and pathology. The model’s performance could be limited by this initial assumption, particularly in cases where nodules do not conform to this idealized shape. Also, the authors should investigate the effect of changing the hyper-parameter r.
    2. Generalization to Other Nodule Characteristics: While focusing on the A/T ratio and the elliptic shape is relevant, thyroid nodules exhibit other characteristics that are also diagnostically significant, such as echogenicity, composition, and the presence of calcifications. The paper does not discuss how SHAN accommodates these other features, which could be critical for a comprehensive diagnostic assessment according to the TI-RADS guidelines.
    3. Lack of Comparative Analysis with Other Domain Adaptation Techniques: While SHAN is compared with other single-domain generalization methods, there is less emphasis on how it performs against other domain adaptation or transfer learning strategies that do not rely on shape priors. A broader comparative analysis could provide a clearer picture of where SHAN stands in the spectrum of current technologies.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    I think it would be valuable for the authors to make their code publicly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    To provide detailed and constructive feedback for the authors of the SHAN framework paper, here are some suggested comments for each identified weakness:

    1. Dependency on Shape Prior Initialization Comment: The initialization of the shape prior as a binarized circle with a fixed A/T ratio of 1 is a critical assumption in your model. While this simplification aids in model design, it may not accurately represent the morphological diversity seen in thyroid nodules across different patients and pathological conditions. To enhance the model’s applicability and robustness, it would be beneficial to explore how variations in the initial shape prior, particularly the radius r and other parameters, affect the model’s performance.

    Suggestion: Consider implementing a sensitivity analysis to determine the impact of various initial shape priors. This could include experimenting with different geometric shapes, varying the A/T ratio, or even using statistical shape models that can adapt to the observed distribution of nodule shapes in your training dataset. Additionally, employing a parameter search or optimization strategy for r and other shape parameters based on validation set performance could make the model more adaptive and potentially improve segmentation accuracy.

    2. Generalization to Other Nodule Characteristics Comment: Your focus on the A/T ratio and the elliptic shape is well justified given their diagnostic relevance. However, thyroid nodules present a variety of other characteristics that are crucial for a comprehensive TI-RADS assessment, such as echogenicity, internal composition, and calcifications. The exclusion of these features could limit the diagnostic utility of your model.

    Suggestion: Expand the feature set used in your model to include additional ultrasound characteristics of thyroid nodules. This could involve integrating multi-channel input where each channel represents a specific feature of the nodule, or modifying the network architecture to incorporate additional pathways that can learn to identify and utilize these features. Additionally, collaborating with radiologists to obtain expert insights on the most predictive features for thyroid nodule diagnosis could enhance the clinical relevance of your model.

    3. Lack of Comparative Analysis with Other Domain Adaptation Techniques Comment: The paper provides a comparison with other single-domain generalization methods, which is useful for demonstrating the efficacy of your approach. However, the absence of a comparison with broader domain adaptation or transfer learning strategies may leave some questions unanswered regarding the full potential and positioning of SHAN within the wider landscape of medical imaging adaptations.

    Suggestion: To solidify the contributions of your work, consider extending your comparative analysis to include various domain adaptation and transfer learning techniques. This could involve testing your model against approaches that utilize different mechanisms, such as adversarial training, feature disentanglement, or instance normalization, which have shown promise in managing domain shifts in medical imaging. Detailed comparisons, including both quantitative metrics and qualitative assessments (e.g., visual segmentations), would provide a more comprehensive evaluation of where SHAN excels or may require further improvement.

    By addressing these points, the authors can significantly enhance the robustness, applicability, and scientific contribution of their work, potentially leading to broader acceptance and use in clinical practice.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. Innovation and Novelty: The introduction of the Shape-prior Affine Network (SHAN) represents a significant innovation in the field of medical image segmentation. The novel use of shape priors to address the issue of domain variability is a unique approach that shifts the focus from traditional data augmentation methods, which can corrupt meaningful diagnostic information, to a more stable and reliable feature of medical images—the shape of anatomical structures.
    2. Technical Rigor and Validation: The technical development of SHAN is robust, and the experiments are solid.
  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The study proposes an approach to mitigate the domain gap problem in segmenting thyroid ultrasound images. Instead of relying on approaches such as data augmentation or domain adaptation, the proposed method attempts to find a domain-invariant representation of thyroid nodules that is domain-independent but task-relevant.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Proposing the Global Affine Module (GAM) and Neighbour Affine Module (NAM) modules as a part of a plug-and-play design.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • The study could provide more insight into the selection of loss function terms.
    • Releasing the code as a plug-and-play method would enhance the impact of the proposed approach.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The manuscript is well-written and technically sound, and the results are promising. Here are a few questions and comments that I hope enhance its quality:

    1. As we know, in a task such as segmenting nodules in thyroid ultrasound images, targets are often much smaller compared to the background region. Considering this inherent class imbalance within the dataset, wouldn’t it be more beneficial for the authors to consider, for instance, focal loss instead of a standard cross-entropy loss?

    2. Following the previous comment and regardless of the inherent imbalance, it is noted that the loss function of SHAN consists of the two cross-entropy losses between the elliptical prediction from the encoder, the final segmentation result from the decoder, and the Ground Truth, respectively. While one term aims to achieve coarse segmentation and the other tries to perform refinement, it would be advantageous if the author could offer further insights into their selection of loss terms. More specifically, would utilizing the same loss function with identical settings for both terms (scenarios) be best?

    3. It is mentioned that the r of the shape prior is chosen as 20. The manuscript would benefit from some explanations regarding how the value of r for the shape prior is determined. Is this selection based on the statistics present in the datasets? For example, is it derived from the mean of all masks across the dataset? If so, did the authors also include the test dataset in this calculation?

    4. In both Table 1 and Table 3, assuming that “SHAN (ours)” refers to the plug-and-play SHAN in a vanilla UNet, it would be beneficial if the author could provide some insight into why the vanilla UNet outperformed UNet++* where SHAN had been integrated in both cases.

    5. Considering the encoder architectures utilized in the study, trying to mitigate the shift variance problem by incorporating methods such as Pyramidal BlurPooling could potentially improve the method’s performance.

    6. “The SHAN has the most significant improvement on TransDL”: since the authors have not conducted statistical tests or reported p-values, and since significance is a statistical concept, it would be advisable for the authors to use another term to avoid confusion.

    7. In the caption of Fig. 1 and Fig. 2 in Supplementary Material: “begign” –> “benign”. Many white spaces were missed after periods. For instance: “… into 6 domains.For TN3K …”. “and the our method” –> “and our method”

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The manuscript is well-written, and the results are promising. However, the authors could enhance it by addressing some concerns.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We are sincerely grateful for the insightful and valuable feedback provided by the reviewers. Our responses to the comments are listed below:

To Reviewer#1, Thank you for the constructive comments.

  • About the Nodule Shape Prior: In Sec. 2.1, we gave the details of how the prior is initialized only after discussing its necessity, so the initialization may not have been easy to follow. Here we give more information: SHAN initializes a 2D binary matrix S, i.e. the shape prior of the nodule, in which pixels within a radius r of the center of S are set to 1 and all others are set to 0. The size and position of the shape in S are fixed.
  • About Eq.1: The meaning of each term in Eq.1 is given in the paragraphs immediately before and after it. Among these, theta, z_{x, y}, and v_{x, y} are the outputs of the fully connected layer f_c, and therefore depend on the input.
  • About the f_n: Thank you for the reminder. We missed the explanation of f_n. The f_n is the Neighbor Decoder in Fig. 2, which will be added to the manuscript.
  • About the non-elliptical nodules: 1) In fact, most nodules are non-elliptical, as shown by the GT in Fig.4 of the manuscript and the GT in Fig.1&2 of the supplementary. 2) SHAN generates a transformation matrix that refines the initial elliptical prediction by affine transformation, enabling SHAN to fit different nodules. 3) The HD95 metric measures the consistency of the boundaries between the prediction and the GT, as shown in Tab.2. It can be seen that SHAN has excellent boundary fitting capability.
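The initialization described above, together with a global affine warp of the prior, can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `make_shape_prior` and `affine_warp`, the 128x128 grid size, and the order in which scale, rotation, and offset are composed are all hypothetical, since the exact form of Eq. 1 is not reproduced on this page. The radius r = 20 is the value quoted in Review #3. In SHAN the parameters theta, z_{x, y}, and v_{x, y} are produced by the fully connected layer f_c; here they are passed in by hand.

```python
import numpy as np

def make_shape_prior(size=128, r=20):
    """Binary matrix S: 1 within radius r of the image center, 0 elsewhere.
    The grid size is an assumption; r = 20 is the value quoted in the reviews."""
    c = (size - 1) / 2.0
    ys, xs = np.mgrid[0:size, 0:size]
    return ((xs - c) ** 2 + (ys - c) ** 2 <= r ** 2).astype(np.uint8)

def affine_warp(prior, theta, zx, zy, vx, vy):
    """Warp the prior with scales (zx, zy), rotation theta (radians), and a
    pixel offset (vx, vy). Assumed composition: scale, then rotate, then shift.
    Uses inverse mapping with nearest-neighbor sampling."""
    h, w = prior.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    # Undo the offset, then the rotation, then the scaling.
    dx, dy = xs - cx - vx, ys - cy - vy
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    ux = (cos_t * dx + sin_t * dy) / zx
    uy = (-sin_t * dx + cos_t * dy) / zy
    sx = np.rint(ux + cx).astype(int)
    sy = np.rint(uy + cy).astype(int)
    inside = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(prior)
    out[inside] = prior[sy[inside], sx[inside]]
    return out

S = make_shape_prior()                                  # circle, A/T = 1
ellipse = affine_warp(S, theta=0.0, zx=2.0, zy=1.0, vx=0.0, vy=0.0)
```

Stretching the circle along x alone yields a wider-than-tall ellipse, which is the kind of coarse elliptical estimate that the rebuttal says is subsequently refined.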

To Reviewer#3, Thank you for the constructive comments.

  • About the Nodule Shape Prior: Almost no nodules are perfectly circular. The choice of A/T=1 is due to two reasons: 1) It is an intermediate state. A/T<1 tends to indicate benignity, while the converse indicates malignancy. 2) The deformation is learnable. It is more intuitive to learn the different affine transformations of benign and malignant nodules in orthogonal directions from an intermediate state. The experiment with the hyperparameter r is crucial. Based on the statistical data of the dataset, we can even provide different initialization states for benign and malignant nodules.
  • About the other characteristics: In TIRADS, apart from Shape, the other 4 factors have not been treated as prior in SHAN for the following reasons: 1) the dataset lacks annotations, while the shape can be derived from the mask. 2) Shape is more relevant to the segmentation task compared to the others. Thank you for the useful suggestion which we will explore in our future work.
  • About the comparison with other DA methods: We did not show how SHAN performs against other DA or TL methods for the following reasons: 1) The purpose of SHAN is to enhance the generalization ability of the basic segmentation model. In our view, SDG is one of the most direct ways to validate generalization. Methods like DA can learn from the unseen domain during training, which diminishes the contribution of the segmentation model itself. 2) SHAN is designed as a plug-and-play method. Combining SHAN with existing DA methods might be a better indication of SHAN’s generalization ability than a simple comparison. These experiments will be part of our future work.
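As a small illustration of the point that the shape can be derived from the mask, the sketch below computes an A/T ratio from a binary segmentation mask. It assumes that A is the vertical extent and T the horizontal extent of the mask's bounding box; the function name `at_ratio` and this bounding-box definition are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def at_ratio(mask):
    """A/T ratio of a binary mask, taken here (as an assumption) to be the
    bounding-box height divided by the bounding-box width."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no foreground pixels")
    height = int(ys.max() - ys.min() + 1)
    width = int(xs.max() - xs.min() + 1)
    return height / width

# Toy mask: a 4-pixel-tall, 8-pixel-wide rectangle (wider than tall).
mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:6, 1:9] = 1
ratio = at_ratio(mask)
```

Under this definition a wider-than-tall mask gives A/T < 1 and a taller-than-wide mask gives A/T > 1, matching the two directions of deformation the rebuttal describes around the A/T = 1 starting state.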

To Reviewer#4, Thank you for the constructive comments.

  • About the loss: This is a good suggestion and could further enhance SHAN’s performance. However, SHAN is designed as a plug-and-play method, and different host methods often have different loss functions. To minimize the running cost, we therefore decided to combine our loss functions with those of the different methods in the most direct way (i.e., weighted equally).
  • About the code: We have considered releasing our code, which would undoubtedly benefit our manuscript. Since we plan to further expand our work, we will release the code in the extended version.
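The loss discussion can be made concrete with a short NumPy sketch. It assumes a binary (nodule versus background) setting and per-pixel foreground probabilities; the names `bce`, `focal`, and `two_term_loss` are hypothetical. The equally weighted sum of a coarse (elliptical prediction) term and a final (segmentation) term follows the description in Review #3 and the reply above, while the focal variant is the reviewer's suggestion and not something the paper reports using.

```python
import numpy as np

def bce(prob, target, eps=1e-7):
    """Mean binary cross-entropy between foreground probabilities and a 0/1 mask."""
    prob = np.clip(prob, eps, 1 - eps)
    return float(np.mean(-(target * np.log(prob) + (1 - target) * np.log(1 - prob))))

def focal(prob, target, gamma=2.0, eps=1e-7):
    """Mean binary focal loss without class weighting; gamma = 0 reduces to BCE."""
    prob = np.clip(prob, eps, 1 - eps)
    p_t = np.where(target == 1, prob, 1 - prob)   # probability of the true class
    return float(np.mean(-((1 - p_t) ** gamma) * np.log(p_t)))

def two_term_loss(coarse_prob, final_prob, target, term=bce,
                  w_coarse=1.0, w_final=1.0):
    """Coarse-prediction term plus final-segmentation term, weighted equally by default."""
    return w_coarse * term(coarse_prob, target) + w_final * term(final_prob, target)

# Toy example: a small nodule (4 of 64 pixels) on a mostly background image.
target = np.zeros((8, 8))
target[3:5, 3:5] = 1
coarse = np.full((8, 8), 0.3)
final = np.where(target == 1, 0.8, 0.1)
loss_ce = two_term_loss(coarse, final, target)
loss_focal = two_term_loss(coarse, final, target, term=focal)
```

Passing `term=focal` swaps both terms at once; whether identical settings for the coarse and refinement terms are the best choice is the open question raised in the review.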

Thank you very much for your careful review and helpful comments on this paper.




Meta-Review

Meta-review not available, early accepted paper.


