Abstract

Recent AI-based scoliosis screening methods primarily rely on large-scale silhouette datasets, often neglecting clinically relevant postural asymmetries, which are key indicators in traditional screening. In contrast, pose data provide an intuitive skeletal representation, enhancing clinical interpretability across various medical applications. However, pose-based scoliosis screening remains underexplored due to two main challenges: (1) the scarcity of large-scale, annotated pose datasets; and (2) the discrete and noise-sensitive nature of raw pose coordinates, which hinders the modeling of subtle asymmetries. To address these limitations, we introduce Scoliosis1K-Pose, a 2D human pose annotation set that extends the original Scoliosis1K dataset, comprising 447,900 frames of 2D keypoints from 1,050 adolescents. Building on this dataset, we introduce the Dual Representation Framework (DRF), which integrates a continuous skeleton map to preserve spatial structure with a discrete Postural Asymmetry Vector (PAV) that encodes clinically relevant asymmetry descriptors. A novel PAV-Guided Attention (PGA) module further uses the PAV as a clinical prior to direct feature extraction from the skeleton map, focusing on clinically meaningful asymmetries. Extensive experiments demonstrate that DRF achieves state-of-the-art performance. Visualizations further confirm that the model leverages clinical asymmetry cues to guide feature extraction and promote synergy between its dual representations. The dataset and code are publicly available at https://zhouzi180.github.io/Scoliosis1K/.
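
To make the idea of a continuous skeleton map concrete, the following is a minimal sketch that renders discrete 2D keypoints as a heatmap by placing a Gaussian at each joint. It is an illustration only, not the authors' implementation: the function name `skeleton_map`, the `sigma` value, the grid size, the max-over-joints rule, and the toy keypoints are all assumptions, and limb rendering is omitted.

```python
import math

def skeleton_map(keypoints, height, width, sigma=1.5):
    """Render 2D keypoints as a heatmap: pixel value = max over per-joint Gaussians.

    `sigma` and the max-combination rule are illustrative assumptions.
    """
    heatmap = [[0.0] * width for _ in range(height)]
    for kx, ky in keypoints:
        for y in range(height):
            for x in range(width):
                squared_dist = (x - kx) ** 2 + (y - ky) ** 2
                value = math.exp(-squared_dist / (2.0 * sigma ** 2))
                if value > heatmap[y][x]:
                    heatmap[y][x] = value
    return heatmap

# Hypothetical toy pose on an 8x8 grid: two shoulders and one mid-hip point (x, y).
toy_pose = [(2.0, 1.0), (5.0, 1.0), (3.5, 6.0)]
hm = skeleton_map(toy_pose, height=8, width=8)
```

Each joint contributes a smooth bump rather than a single coordinate, so small localization errors shift the map slightly instead of changing a discrete value outright.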

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/0904_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://zhouzi180.github.io/Scoliosis1K/

Link to the Dataset(s)

https://zhouzi180.github.io/Scoliosis1K/

BibTex

@InProceedings{ZhoZir_Pose_MICCAI2025,
        author = { Zhou, Zirui and Peng, Zizhao and Jin, Dongyang and Fan, Chao and An, Fengwei and Yu, Shiqi},
        title = { { Pose as Clinical Prior: Learning Dual Representations for Scoliosis Screening } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15972},
        month = {September},
        pages = {467--476}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper introduces a clinically guided, pose-based framework for adolescent scoliosis screening. The contributions of the paper are the following:

    • Scoliosis1K-Pose dataset which is an extension of the Scoliosis1K dataset with 2D pose annotations (449K frames from 1,000+ adolescents). This extension enables pose-driven scoliosis research with privacy-preserving natural walking data.

    • A novel approach named Dual-Pose Representation Framework (DRF) combining: (1) Skeleton Maps: Gaussian-based heatmaps of body keypoints for robust skeletal structure representation. (2) Postural Asymmetry Vectors (PAV) with clinically inspired metrics (vertical, midline, angular deviations) that capture bilateral asymmetries relevant to scoliosis diagnosis. (3) An attention module that embeds clinical asymmetry knowledge into feature learning.

    • The proposed method achieves state-of-the-art performance on the Scoliosis1K benchmark and enhances interpretability through attention heatmaps that align with clinical reasoning.
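
    The three families of PAV metrics named above (vertical, midline, angular deviations) can be illustrated for a single left/right joint pair. This is a hedged sketch: the page does not give the paper's exact formulas, so the definitions below, the function name `pav_for_pair`, and the toy coordinates are all assumptions chosen for illustration.

```python
import math

def pav_for_pair(left, right, midline_x):
    """Three hypothetical asymmetry descriptors for one left/right joint pair.

    Coordinates are (x, y) in image space. These definitions are assumptions,
    not the formulas used in the paper.
    """
    (lx, ly), (rx, ry) = left, right
    vertical = ly - ry                          # height difference between the pair
    midline = (lx + rx) / 2.0 - midline_x       # offset of the pair's centre from the body midline
    angular = math.degrees(math.atan2(ly - ry, abs(lx - rx)))  # tilt of the connecting line
    return vertical, midline, angular

# Hypothetical shoulders: the left shoulder sits one pixel higher than the right.
tilted = pav_for_pair(left=(6.0, 2.0), right=(2.0, 3.0), midline_x=4.0)
level = pav_for_pair(left=(6.0, 2.0), right=(2.0, 2.0), midline_x=4.0)
```

    A perfectly level, centred pair yields zeros for all three descriptors, so non-zero entries directly flag the kind of bilateral asymmetry a clinician would look for.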

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper proposes a novel dual-pose representation that combines skeleton maps and clinically inspired Postural Asymmetry Vectors (PAV) to capture both overall posture and specific asymmetries, closely reflecting how clinicians assess scoliosis.

    • It introduces a clinically-guided attention mechanism (PAV-Attention) that uses PAV to guide the model’s attention during feature extraction, enhancing diagnostic precision and interpretability.

    • The authors present a new Scoliosis1K-Pose Dataset, which is the first large-scale collection of 2D pose sequences for scoliosis screening, allowing privacy-safe analysis from natural walking videos.

    • The framework demonstrates strong empirical performance, outperforming previous methods with a +3.5% gain in accuracy and +4.2% in F1-score over ScoNet-MT[1], backed by ablation studies and baseline comparisons.

    [1] Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 2921–2929 (2016)

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • The use of skeleton maps shows limited novelty, as the Gaussian-based approach is adapted from earlier works in gait and action recognition, such as SkeletonGait [2] and the work of Duan et al. [3].

    • The PAV-Attention mechanism brings only incremental innovation, resembling existing modules like CBAM [4], with its main distinction being the incorporation of clinical priors.

    • The temporal modeling of asymmetry is simplistic, relying on time-averaged features that may overlook brief or phase-specific gait patterns important for scoliosis detection. It would have been better to see other strategies besides time-averaged features.

    • The PAV metrics used in the paper are not sufficiently motivated, nor are they compared with other clinical measures used in scoliosis assessment, such as pelvic obliquity.

    • The model’s generalizability remains unclear, as it is only tested on the Scoliosis1K-Pose dataset.

    [2] Fan, C., Ma, J., Jin, D., Shen, C., Yu, S.: Skeletongait: Gait recognition using skeleton maps. In: Proceedings of the AAAI conference on artificial intelligence. vol. 38, pp. 1662–1669 (2024)

    [3] Duan, H., Zhao, Y., Chen, K., Lin, D., Dai, B.: Revisiting skeleton-based action recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 2969–2978 (2022)

    [4] Woo, S., Park, J., Lee, J., Kweon, I.: CBAM: Convolutional Block Attention Module. In: Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8–14 (2018)
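
    The concern about time-averaged features can be illustrated with a toy sequence in which an asymmetry appears in only one frame. The sketch below is an assumption-laden example, not the paper's pipeline: the per-frame values are invented, and peak pooling is shown only as one possible alternative to the mean.

```python
def temporal_mean(values):
    """Average a per-frame descriptor over the whole sequence."""
    return sum(values) / len(values)

def temporal_peak(values):
    """Keep the frame with the largest magnitude (one alternative to averaging)."""
    return max(values, key=abs)

# Hypothetical per-frame shoulder tilt in degrees: level for nine frames,
# then a brief 10-degree deviation in a single gait phase.
shoulder_tilt = [0.0] * 9 + [10.0]

mean_tilt = temporal_mean(shoulder_tilt)   # the brief deviation is diluted
peak_tilt = temporal_peak(shoulder_tilt)   # the brief deviation is retained
```

    Averaging reduces the single-frame deviation to a tenth of its size, whereas peak pooling preserves it at the cost of being more sensitive to pose-estimation noise.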

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    The paper is clear and well written. However, please discuss novelty more precisely (many components, such as the use of skeleton maps and attention mechanisms, have strong precedents in prior literature). Also, please correct spelling/grammar mistakes:

    • “a skeleton maps that convert discrete” -> skeleton map
    • “Given a pose data sequence” -> Given a sequence of pose data
    • “We extend this principle to dynamic assessment” -> assessments
    • “limb segment connecting joints” -> connecting the joints
    • “pelvis alignment” -> pelvic alignment
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While this paper presents a well-motivated and clinically relevant approach to pose-based scoliosis screening, I am assigning a weak reject score due to some concerns that limit its current contribution. The core idea of integrating skeletal maps with clinically inspired asymmetry metrics is promising, and the introduction of the Scoliosis1K-Pose dataset adds potential value for the field. However, the paper’s novelty is somewhat incremental, as key components (such as skeleton maps and the attention mechanism) are derived from existing techniques in gait and pose analysis. Additionally, the use of mean aggregation for temporal modeling oversimplifies the dynamics of gait-related asymmetries. These issues prevent a stronger recommendation since the paper’s weaknesses outweigh its strengths.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper introduces the Dual-pose Representation Framework (DRF) for scoliosis classification, combining skeleton maps with a clinically inspired Postural Asymmetry Vector (PAV). The authors also pledge to release Scoliosis1K-Pose, a dataset extension with 2D skeletal keypoints used in this study.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This work presents several key strengths. It introduces the Dual-pose Representation Framework (DRF), which effectively integrates skeleton maps and a clinically inspired Postural Asymmetry Vector (PAV) to bridge data-driven methods with clinical assessment. The novel PAV-Attention mechanism uses the PAV as a domain prior to guide feature learning, offering a principled alternative to standard attention. Experiments on the Scoliosis1K dataset show state-of-the-art performance, with ablation studies validating the impact of PAV-Attention.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Despite its strengths, the paper has notable weaknesses. The choice of asymmetry metrics for the PAV (Vertical, Midline, Angular Deviations) lacks justification. The novelty of the PAV-Attention mechanism could be better contextualized within existing methods that incorporate priors, especially given its simplicity. The evaluation omits per-class metrics (Precision, Recall, F1), which is problematic given the severe class imbalance; this limits insight into the model’s effectiveness for detecting scoliosis cases. Additionally, the Scoliosis1K-Pose dataset, created by applying pose estimation to Scoliosis1K videos, introduces a layer of AI-derived error without clear evidence of verification or quality control, potentially affecting downstream model reliability. Finally, while the approach is innovative, the clinical relevance of pose-based scoliosis screening remains uncertain without stronger links to real-world diagnostic workflows.
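
    The point about per-class metrics under class imbalance can be shown with a small worked example. The labels, counts, and the helper name `per_class_metrics` below are hypothetical and are not taken from the paper's results; the sketch only shows how a high overall accuracy can hide a class that is never detected.

```python
def per_class_metrics(y_true, y_pred, classes):
    """Return {class: (precision, recall, f1)} computed one-vs-rest."""
    report = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[c] = (precision, recall, f1)
    return report

# Hypothetical imbalanced test set: 8 majority-class and 2 minority-class subjects,
# scored against a degenerate model that always predicts the majority class.
y_true = ["neg"] * 8 + ["pos"] * 2
y_pred = ["neg"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
report = per_class_metrics(y_true, y_pred, classes=["neg", "pos"])
```

    Here the accuracy is 0.8 even though recall for the minority class is zero, which is why per-class precision, recall, and F1 are more informative than a single aggregate score on imbalanced screening data.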

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper introduces a creative use of existing pose estimation methods with a novel attention mechanism that incorporates clinically inspired priors, offering a promising direction for AI-assisted scoliosis screening with state-of-the-art results. However, the paper lacks clarity regarding the dataset generated through AI-based pose estimation, which may introduce unverified biases. The evaluation would benefit from class-specific metrics, especially given the dataset’s imbalance. Lastly, while the approach is technically sound, its clinical relevance remains limited without stronger ties to real-world diagnostics.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposes to enhance an existing non-intrusive, video-based (silhouette-based, in fact) method for adolescent idiopathic scoliosis screening by making use of pose information (extracted from the video using existing deep learning methods) and asymmetry measured from this pose information, which is incorporated into the video analysis as an attention mechanism. Evaluation is carried out on a substantial dataset and demonstrates strong added value for this new method.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The experiments and results in this paper are very strong. The proposed method is compared to several alternatives, including the former silhouette-only based method, methods based only on pose information, and various combinations thereof, with and without the proposed attention mechanism, and using alternative attention mechanisms not explicitly based on pose asymmetry. The improvements in screening quality (measured by accuracy and F1-score) are substantial and given the magnitude of the improvements and the size of the dataset, there is no doubt that they are statistically significant. They are also probably quite significant clinically speaking. The paper is also clearly written and easy to understand.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    One weakness of this paper, though I would not call it major, is that the method used for extracting pose information from the video data was developed and tested for a more general population than that reflected in the Scoliosis1K dataset. Its accuracy was not evaluated specifically for adolescent individuals, and, perhaps more problematically, it was not evaluated for individuals with scoliosis. Providing such an evaluation, possibly on a small subset of scoliotic individuals in the Scoliosis1K dataset, would certainly add strength to the paper.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    The bibliography and literature review could be a little bit more diverse. There are many works on non-invasive scoliosis screening/assessment imaging modalities, including surface topography, that pre-date the works cited when referring to “early work”. The work carried out at the Ste-Justine Hospital Research Centre and Polytechnique Montreal comes to mind.

    “dual-pose representation”: here the hyphen implies that it is the pose that is dual, not the representation. This is an important distinction for an accurate understanding of the paper.

    I would avoid using the terminology “scoliosis classification” when referring to screening. “Scoliosis classification” generally refers to assessment of a known deformity according to a specific classification system (e.g. the Lenke classification system), which is not what is being described in this paper. Similarly, it would be better to avoid using the term “assessment” (which would typically be performed using X-ray/EOS imaging) when what is meant is actually screening.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The experimental methods and results are very robust, and they are also new.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

Response to Reviewers

We would like to sincerely thank all reviewers for their thoughtful and constructive feedback on our paper titled “Pose-Based Scoliosis Screening: Leveraging Clinical Asymmetry with Dual Pose Representation”. We deeply appreciate the time and effort each reviewer devoted to evaluating our work.

We have carefully considered all suggestions and comments provided. While the paper has been accepted, we recognize the value of these insights for guiding our future research and improving the clarity, rigor, and clinical relevance of our work. Specifically:

- To Reviewer #1: Thank you for your encouraging review and for highlighting the strength of our experiments. We will address your suggestions on reference diversity and terminology in the camera-ready version. Your comment on the generality of the pose estimation method is highly insightful. While, as noted in our Introduction, pose representations have shown strong performance across various medical tasks using similar estimators, we agree that evaluating them specifically on adolescents with scoliosis would strengthen the study. We will consider this in future validation efforts.

- To Reviewer #2: Thank you for your detailed feedback and constructive suggestions. We will carefully consider them in our work.

- To Reviewer #3: Thank you for your valuable comments. We appreciate your recognition of our framework’s strengths and the thoughtful concerns you raised. Regarding the design of the Postural Asymmetry Vector (PAV), the choice of asymmetry metrics was informed by clinical experience and validated through preliminary statistical studies. These were omitted due to space constraints but will be briefly mentioned in the camera-ready version.

Once again, we are truly grateful for your valuable feedback, which will help us refine and extend our work moving forward.

Sincerely, The Authors




Meta-Review

Meta-review #1

  • Your recommendation

    Provisional Accept

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A


