Abstract

Patient mobility monitoring in intensive care is critical for ensuring timely interventions and improving clinical outcomes. While accelerometry-based sensor data are widely adopted in training artificial intelligence models to estimate patient mobility, existing approaches face two key limitations highlighted in clinical practice: (1) modeling the long-term accelerometer data is challenging due to the high dimensionality, variability, and noise, and (2) the absence of efficient and robust methods for long-term mobility assessment. To overcome these challenges, we introduce MELON, a novel multimodal framework designed to predict 12-hour mobility status in the critical care setting. MELON leverages the power of a dual-branch network architecture, combining the strengths of spectrogram-based visual representations and sequential accelerometer statistical features. MELON effectively captures global and fine-grained mobility patterns by integrating a pre-trained image encoder for rich frequency-domain feature extraction and a Mixture-of-Experts encoder for sequence modeling. We trained and evaluated the MELON model on the multimodal dataset of 126 patients recruited from nine Intensive Care Units. Experiments showed that MELON outperforms conventional approaches for 12-hour mobility status estimation with an overall area under the receiver operating characteristic curve (AU-ROC) of 0.82 (95% confidence interval 0.78-0.86). Notably, our experiments also revealed that accelerometer data collected from the wrist provides robust predictive performance compared with data from the ankle, suggesting a single-sensor solution that can reduce patient burden and lower deployment costs. Project repository: https://github.com/iheallab/MELON.
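
To illustrate the two input views named in the abstract (a spectrogram and a sequence of statistical features), here is a minimal sketch on a synthetic signal. It is not the authors' preprocessing: the 100 Hz sampling rate, window sizes, chosen statistics, and function names are all assumptions made for this example.

```python
import numpy as np

def spectrogram(signal, fs, win_len=256, hop=128):
    """Log-power spectrogram from a Hann-windowed short-time FFT."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    return freqs, np.log1p(power).T  # shape: (n_freqs, n_frames)

def window_statistics(signal, fs, window_seconds=1.0):
    """Per-window mean, std, min, and max, forming a feature sequence."""
    step = int(fs * window_seconds)
    n_windows = len(signal) // step
    chunks = signal[: n_windows * step].reshape(n_windows, step)
    return np.column_stack([chunks.mean(axis=1), chunks.std(axis=1),
                            chunks.min(axis=1), chunks.max(axis=1)])

# Synthetic single-axis signal: a 2 Hz oscillation plus noise (hypothetical data).
rng = np.random.default_rng(0)
fs = 100  # assumed sampling rate in Hz
t = np.arange(60 * fs) / fs  # 60 seconds
signal = np.sin(2 * np.pi * 2.0 * t) + 0.1 * rng.standard_normal(t.size)

freqs, spec = spectrogram(signal, fs)   # frequency-domain view (image-like)
stats = window_statistics(signal, fs)   # time-domain view (sequence)
peak_hz = freqs[spec.mean(axis=1).argmax()]
print(spec.shape, stats.shape, peak_hz)
```

The spectrogram makes the periodic component visible as a ridge near 2 Hz, while the statistics sequence tracks amplitude and variability over time; both come from the same raw signal.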

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/5426_paper.pdf

SharedIt Link: https://rdcu.be/eG4Ds

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-05182-0_34

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/iheallab/MELON

Link to the Dataset(s)

https://github.com/iheallab/MELON

BibTex

@InProceedings{ZhaJia_MELON_MICCAI2025,
        author = { Zhang, Jiaqing AND Contreras, Miguel AND Sena, Jessica AND Davidson, Andrea AND Ren, Yuanfang AND Guan, Ziyuan AND Ozrazgat-Baslanti, Tezcan AND Loftus, Tyler J. AND Nerella, Subhash AND Bihorac, Azra AND Rashidi, Parisa},
        title = { { MELON: Multimodal Mixture-of-Experts with Spectral-Temporal Fusion for Long-Term MObility EstimatioN in Critical Care } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15974},
        month = {September},
        pages = {343--352}
}

Reviews

Review #1

Please describe the contribution of the paper

This paper describes a method for classifying mobility levels from data measured by accelerometers attached to ICU patients. The classification levels are completely immobile / very limited / strictly limited / limited. The classification model has two branches: one takes spatio-frequency images (spectrograms) as input, and the other takes accelerometer statistical features. The proposed method is shown to outperform GRUs and Transformers, and an ablation study was performed on the two branches.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The proposed method is shown to outperform GRUs and Transformers, and an ablation study was performed on the two branches.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

The motivation for the two-branch architecture was unclear to me. The authors argue that the spatio-frequency features and the accelerometer statistical features are different modalities. However, both are extracted from the same accelerometer outputs. I think the authors should explain which aspects of the accelerometer data each of these two feature types is meant to capture.

Similarly, the two-branch architecture is presented by the authors, but how these branches were designed is not explained.
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(2) Reject — should be rejected, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

A major disadvantage of this paper is that the motivation and design of the classification model are not clear. I think the authors should explain the characteristics of the raw accelerometer outputs, the spatio-frequency image features, and the accelerometer statistical features, and then how they designed the classification model.
Reviewer confidence

Somewhat confident (2)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #2

Please describe the contribution of the paper

The paper introduces MELON, a model trying to predict 12-hour ICU patient mobility from wrist/ankle accelerometer data. Their main idea is to process the data in two ways – as spectrogram images and as sequences of statistical features – using separate network branches (ResNet and a Mixture-of-Experts Transformer) and then combine them.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

- Addresses an important problem: Automatic ICU mobility monitoring is definitely needed.
- Multimodal idea is intuitive: Combining frequency (spectrogram) and time-domain stats makes sense; they might capture different aspects of movement.
- Wrist sensor finding is practical: Showing the wrist works well enough is a useful result for deployment.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

Weak Validation: This is my main issue. They tested on only 126 patients from their own institution(s). That feels pretty small to claim robust performance (like the 0.82 AUROC) for something as variable as ICU mobility. There’s no external validation at all. How do we know this isn’t just tuned to their specific patient mix or setup?

Model Complexity vs. Gain: The MELON architecture seems quite complex: two specialized encoders (pre-trained ResNet, pre-trained Time-MoE), attention fusion, etc. Is this necessary? The ablation shows dropping a branch hurts their model, but the overall gain compared to, say, just the Time-MoE branch (0.78 vs 0.82 AUROC) doesn’t scream “breakthrough” given the added complexity. Maybe a simpler approach could get close? The paper doesn’t really justify this level of engineering.

Class Imbalance Handling: They note some classes are rare (“Completely immobile”, “No limitation”). While the overall AUROC looks okay, the paper doesn’t dig into how reliable the predictions are for these specific, potentially critical, minority classes. The wide CIs in Table 2 for these classes make me skeptical. This needs more focus if it’s meant for clinical use.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I’m leaning towards reject because the validation feels insufficient for the claims being made. An AUROC of 0.82 is decent, but based on only 126 patients with no external testing, I can’t be sure it means much in the real world. The model also seems potentially overly complex for the problem, and the benefit over slightly simpler approaches isn’t convincingly large or well-justified. Finally, the handling of imbalanced classes needs more careful analysis than just reporting overall AUROC.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #3

Please describe the contribution of the paper

The paper deals with the estimation of the mobility level of a patient.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- very relevant application
- it seems that a new data set has been recorded and considered in this study
- large data set has been considered
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- the methodological part is too brief
- implementation is not available
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I guess the paper can be accepted, but the authors should make the code and data available to the public.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Author Feedback

We thank the reviewers for their thoughtful feedback. For clarity, each comment is listed in bullet form, followed by our response.

Reviewer 1

Comment: The methodological part is too brief, and the implementation is not available.

Thank you for the valuable feedback. We have enlarged the Methods section of the paper by condensing the Introduction. We will post a workflow diagram, pseudocode, hyperparameter tables, and the fully documented code on GitHub immediately after acceptance to preserve double-blind review. Raw sensor data, governed by IRB constraints, will be shared on reasonable request.

Reviewer 2

Comment: The motivation of the dual-branch architecture is unclear, as is the rationale behind the design of the two branches.

Thank you for highlighting the need for a more explicit rationale. While both branches start from the same accelerometer stream, they are crafted to isolate complementary information. Spectrograms represent the frequency-based features that emphasize periodicities, rhythms, and oscillations in the signal that are not immediately observable in the raw time domain. In contrast, statistical feature sequences capture structured and fine-grained variations over time, focusing on temporal variations such as signal amplitude changes, localized anomalies, bursts, and overall variability. The architecture mirrors this distinction. Spectrograms enter a pre-trained ResNet to transfer frequency-based patterns to the mobility task. The statistics sequence passes through a MoE structure trained autoregressively; its self-attention with rotary embeddings and sparse experts specialises in long sequences.

Reviewer 3

Comment: Weak Validation

Thank you for raising this important point. We recognise that external, multi-centre validation is the ideal next step, and we acknowledge this limitation in the Discussion. However, the current study represents a meaningful step toward generalisability for two reasons:

- No public ICU mobility dataset currently exists, necessitating single-centre studies. We will openly share preprocessing and code to support replication and dataset pooling.
- Our cohort is diverse, large, and logistically challenging to assemble in critical care settings, including 126 adults (2019–2024) from nine distinct ICUs (Cardiology, Cardiac, Medical, Neuromedicine, Neuro-vascular, Thoracic & Lung Transplant, Trauma, and Surgery), monitored using two sensor brands (Shimmer ECG, ActiGraph GTX3+) worn on wrist/ankle. Such variation in patient mix, workflow, and hardware reduces unit-specific bias and supports broader future validation.

Comment: Model Complexity vs. Gain

Thank you for your thoughtful feedback. A rise in AUROC from 0.78 to 0.82, though numerically small, can meaningfully improve decision-making for thousands of ICU patients each year. For example, a 4% difference in sensitivity would translate into 40 additional patients for every 1000 being correctly identified for their mobility status in the ICU. Taking into account that around 5 million patients are admitted every year to ICUs in the US, this would hypothetically impact ~240,000 patients. As for complexity, MELON’s architecture is purpose-built to model both long-range temporal dependencies and subtle, structured fluctuations that simpler models (GRU, Transformers, and other traditional approaches) could not capture in our experiments.

Comment: Class Imbalance Handling

Thank you for raising this important concern. We agree that reliable estimates for the “Completely immobile” and “No limitation” classes are critical for clinical use. Unfortunately, these extremes are genuinely uncommon in our current dataset, with only five test-set cases in total, which inflates the CIs you observed even after mitigating by class weighting. Because additional data collection is still in progress, a more granular class-level analysis cannot be completed within the scope of the present manuscript.
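
As a rough check of the back-of-the-envelope estimate in the Model Complexity vs. Gain response, the sketch below recomputes the figures from the stated inputs (a 4% sensitivity difference and roughly 5 million annual ICU admissions). The function names are illustrative and not part of MELON. Under these inputs the direct product is 200,000 patients per year, so the quoted ~240,000 would correspond to an admission count closer to 6 million. Treating an AUROC gain as an equal sensitivity gain is itself a simplifying assumption.

```python
def additional_correct(sensitivity_gain: float, n_patients: int) -> int:
    """Extra patients correctly identified for an absolute sensitivity gain."""
    return round(sensitivity_gain * n_patients)

def implied_population(extra_patients: int, sensitivity_gain: float) -> int:
    """Population size needed for a given number of extra correct cases."""
    return round(extra_patients / sensitivity_gain)

per_thousand = additional_correct(0.04, 1_000)      # 4% of 1,000 patients
national = additional_correct(0.04, 5_000_000)      # 4% of ~5 million admissions
implied = implied_population(240_000, 0.04)         # admissions implied by ~240,000

print(per_thousand, national, implied)
```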

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

There was some diversity of opinion. However, each reviewer offered thoughtful comments and insights, attention to which would markedly improve the paper.
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

MELON: Multimodal Mixture-of-Experts with Spectral-Temporal Fusion for Long-Term MObility EstimatioN in Critical Care

Author(s):