Abstract

Segmenting complex layer structures, including subcutaneous fat, skeletal muscle, and bone in arm musculoskeletal ultrasound (MSKUS), is vital for diagnosing and monitoring the progression of Breast-Cancer-Related Lymphedema (BCRL). Nevertheless, previous research has primarily focused on individual muscle or bone segmentation in MSKUS, overlooking the intricate and hybrid-layer morphology that characterizes these structures. To address this limitation, we propose a novel approach called the hybrid structure-oriented Transformer (HSformer), which effectively captures hierarchical structures with diverse morphology in MSKUS. Specifically, HSformer combines a hierarchical-consistency relative position encoding and a structure-biased constraint for hierarchical structure attention. Our experiments on arm MSKUS datasets demonstrate that HSformer achieves state-of-the-art performance in segmenting subcutaneous fat, skeletal muscle, and bone.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1638_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1638_supp.zip

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Che_HybridStructureOriented_MICCAI2024,
        author = { Chen, Lingyu and Wang, Yue and Zhao, Zhe and Liao, Hongen and Zhang, Daoqiang and Han, Haojie and Chen, Fang},
        title = { { Hybrid-Structure-Oriented Transformer for Arm Musculoskeletal Ultrasound Segmentation } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15001},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a novel hybrid structure-oriented transformer for arm musculoskeletal ultrasound segmentation to aid in diagnosis and screening of cancer lymphedema. The results are compelling when compared to existing segmentation approaches.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Overall, this paper is very well organized, and the experimental plan and results are clear. The approach to the transformer model design was well justified. The results of this model, both qualitative and quantitative, for segmenting different anatomical layers from ultrasound images of the arm were very compelling.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There were limited weaknesses for this paper, but it could benefit from some additional discussion of the results and some of the clinical relevance. Future work could include utilizing more MSK datasets to further validate these preliminary findings on this one dataset.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Overall, this paper was very strong. As noted above, it could be improved with more discussion regarding the clinical significance of the findings and future work.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper addresses an important problem in MSK segmentation that has applicability even beyond BCRL. Findings are relevant to the MICCAI community, and the paper was very well organized and written.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper proposes a novel deep learning architecture for the segmentation of musculoskeletal structures from ultrasound images. The main contribution is the hybrid structure-oriented transformer, which includes several components that leverage the intra- and inter-layer information within the tissues to improve the segmentation performance. The method was carefully engineered to account for curvilinear structure segmentation. The proposed approach has shown improvements over multiple state-of-the-art models, and was validated on internal and public databases.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The proposal of a novel architecture for MSK structures in US images.
    2. Proposal of a positional embedding module that leverages the hierarchical consistency between vertical and horizontal feature pixels located in similar structures.
    3. Adding new constraints (SBC), which fit with the proposed spherical transformation based attentions to account for curvilinear structures.
    4. Extensive validations on internal and external public databases with ablation studies on the proposed modules.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Unclear definitions of the polar transformation and relationships to the settings in the imaged structure geometry.
    2. Insufficient discussion of the failure modes in the proposed method.
    3. Insufficient explanation of image/acquisition differences between the internal and public databases.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The paper uses an internal (unpublished) database and an open database. Neither the code nor the internal database was made publicly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The proposed methods sound novel in terms of model architecture and positional embedding models that fit with the addressed task. The method has shown excellent performance in all metrics compared with all the tested models. The ablation study has shown that the improvements contributed to the overall performance. It would be nice if more visualizations were added on the ablation study results. For example, removing the SBC layer would give better performance on all metrics and structures compared with the other components, but this performance was not discussed in the paper.

    2. It’s not clear how the HCPE and SBC polar transformations are set, how the origin of the coordinate system is chosen, and how sensitive it is to the curvilinearity of the target structures. A simple explanation of these settings would help the reader understand the concepts and generalize them to other similar tasks.

    3. The results have shown significant drops in the internal database in Table 3 (by 10% margin) in the skin, fat and muscle layers. Why did this drop happen? Any differences between the public and internal databases in terms of image quality or imaging conditions?

    4. It’s not clear from the figure which color represents which structures. A color legend would be helpful.

    5. The failures of U-Net and the other models look somewhat subtle, and no details are provided on the training scenarios of those approaches. A reference to, or a listing of, the training settings would help in understanding the weak outcomes of those methods.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well structured, and has high novelty in terms of the methods and rigidity of the experiments, which led me to an acceptance decision.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper
    1. The paper proposed an arm ultrasound (US) segmentation model for musculoskeletal (MSK) structures, considering musculoskeletal layers in the image.
    2. The paper proposed hierarchical-consistency position embedding (HCPE) and attention with structure-biased constraint (SBC) modules for the Transformer, designed explicitly for MSKUS.
    3. Experiments using a public arm MSKUS dataset to compare the proposed method with several popular and recent ones. A small-scale in-house dataset was further used to validate the generalizability.
    4. Ablation studies to validate the proposed HCPE and SBC modules.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The motivation and difficulty for arm MSKUS segmentation were well-described. The complex hierarchical layers and diverse morphology of MSKUS significantly degrade the segmentation model performance.
    2. Unlike other models, the proposed method tried to improve the segmentation performance by designing specific modules for the hierarchical layers and diverse morphology, proposing HCPE and SBC modules for Transformer. By converting the 2D image into a sphere image, the attention consistency for the same structure was enhanced/maintained.
    3. Benchmarking using a public arm MSKUS dataset showed a significant improvement over existing methods, some of which were recently proposed for MSKUS segmentation.
    4. The results of the ablation study provide compelling evidence of the effectiveness of the proposed HCPE and SBC. The HCPE outperformed the traditional learnable positional embedding and relative positional embedding.
    5. The generalizability of the proposed method was validated using a small-scale in-house dataset. The proposed method showed reasonable results on the in-house dataset.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The proposed model’s architecture basically followed the design convention that used stages of encoding with a segmentation head. While the detailed configuration of the model architecture was given, the scaling performance was not discussed.
    2. The generalizability of the proposed method was validated using a different dataset without re-training. However, it’s important to note that the validation dataset was small-scale, and there was no comparison with other methods. It is not clear whether the proposed method is still the optimal one on the validation dataset. There is a possibility that other methods may generalize better.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The main dataset used in this paper is a public one; the small-scale one for generalizability validation does not seem to be available. No statement about code availability. It is anticipated that the code will be opened.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The model was actually designed as a general backbone for several tasks, which can be adopted by switching the model head. I am also interested in its performance on other ultrasound tasks, such as classification and object detection. In the conclusion section, it might be better to mention the model’s potential for other US-based tasks.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well-organized, and the figures are clear. The proposal to solve complex layers and structure bias in MSKUS is novel. The improvement of the proposed method was solid. This is a strong paper with minor weaknesses.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We gratefully thank the reviewers for their remarks and suggestions. We appreciate the encouraging comments, such as “well-organized paper, well-justified model design, clear and compelling results; addresses an important problem in MSK segmentation that has applicability even beyond BCRL” from R1, “the paper is well structured, and has high novelty in terms of the methods and rigidity of the experiments” from R2, and “the proposal to solve complex layers and structure bias in MSKUS is novel; the improvement of the proposed method was solid” from R3. Our responses to the questions and concerns are given below.

(1) Q (R2): “Definitions of the polar transformation and relationships to the settings in the imaged structure geometry”

A: MSKUS images contain complex, hybrid-layer morphology, with horizontal or irregularly curvilinear tissue layer structures. To promote the understanding of these hierarchical structures for segmentation, we convert the Cartesian coordinates of image pixels into polar coordinates, then calculate the hierarchical-consistency position embedding according to formulas (1) - (2), and combine it with the structure-biased constraint for the final attention calculation. Empirical results show that the proposed HSformer outperforms other segmentation models and has potential for wider clinical application. The specific conversion process will be demonstrated in the publicly available code.

(2) Q (R2): “Explanation of image/acquisition differences between the internal and public databases”

A: The internal and public datasets differ as follows. First, the collection equipment is different: the public dataset uses an Alpinion E-Cube 12 system (Bothell, WA, USA) with an L3-12H high-density linear probe for MSKUS imaging, while the internal dataset uses a SonoScape E2 machine. Second, as shown in Figure 5 and Figure A3 in the appendix, the anatomical structures of the MSKUS images from the two datasets are consistent, with a large amount of speckle noise and shadow artifacts that pose greater challenges to the segmentation task. Although there are significant differences in image quality and style between the two datasets, the proposed HSformer achieves a mean DSC of 0.90 on the public dataset and 0.82 on the internal dataset without additional model training, which demonstrates the strong generalization of our model.
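
For readers wondering what the Cartesian-to-polar step in response (1) might look like, the following is a minimal sketch only. It is not the authors' implementation, which had not been released at the time of this page. The function name `pixel_polar_coords`, the `origin` parameter, and the default choice of the image center as origin are all assumptions made for illustration; the rebuttal does not state how the origin is chosen, and the HCPE formulas (1) - (2) are not reproduced here.

```python
import numpy as np


def pixel_polar_coords(height, width, origin=None):
    """Return per-pixel polar coordinates (r, theta) for an image grid.

    Hypothetical helper: `origin` is (row, col). Defaulting to the image
    center is an assumption, not something stated by the authors.
    """
    if origin is None:
        origin = ((height - 1) / 2.0, (width - 1) / 2.0)

    # Row and column index of every pixel, each of shape (height, width).
    rows, cols = np.mgrid[0:height, 0:width].astype(np.float64)

    # Offsets from the chosen origin (dy along rows, dx along columns).
    dy = rows - origin[0]
    dx = cols - origin[1]

    r = np.hypot(dx, dy)        # radial distance from the origin
    theta = np.arctan2(dy, dx)  # angle in radians, in [-pi, pi]
    return r, theta


# Example: polar coordinates for a 5x5 grid centered at pixel (2, 2).
r, theta = pixel_polar_coords(5, 5)
```

A position encoding could then be built from `r` and `theta` instead of from row and column offsets; how HSformer actually does this is defined by the paper's formulas, not by this sketch.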




Meta-Review

Meta-review not available, early accepted paper.


