Abstract

Deep learning (DL) has achieved remarkable progress in the field of medical imaging. However, adapting DL models to medical tasks remains a significant challenge, primarily due to two key factors: (1) architecture selection, as different tasks necessitate specialized model designs, and (2) weight initialization, which directly impacts the convergence speed and final performance of the models. Although transfer learning from ImageNet is a widely adopted strategy, its effectiveness is constrained by the substantial differences between natural and medical images. To address these challenges, we introduce Medical Neural Network Search (MedNNS), the first Neural Network Search framework for medical imaging applications. MedNNS jointly optimizes architecture selection and weight initialization by constructing a meta-space that encodes datasets and models based on how well they perform together. We build this space using a Supernetwork-based approach, expanding the model zoo size by 51x over previous state-of-the-art (SOTA) methods. Moreover, we introduce rank loss and Fréchet Inception Distance (FID) loss into the construction of the space to capture inter-model and inter-dataset relationships, thereby achieving more accurate alignment in the meta-space. Experimental results across multiple datasets demonstrate that MedNNS significantly outperforms both ImageNet pretrained DL models and SOTA Neural Architecture Search (NAS) methods, achieving an average accuracy improvement of 1.7% across datasets while converging substantially faster. The code and the processed meta-space are available at https://github.com/BioMedIA-MBZUAI/MedNNS.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2702_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/BioMedIA-MBZUAI/MedNNS

Link to the Dataset(s)

MEDMNIST: https://medmnist.com/

BibTex

@InProceedings{MecLot_MedNNS_MICCAI2025,
        author = { Mecharbat, Lotfi Abdelkrim and Almakky, Ibrahim and Takac, Martin and Yaqub, Mohammad},
        title = { { MedNNS: Supernet-based Medical Task-Adaptive Neural Network Search } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15965},
        month = {September},
        page = {456 -- 466}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes MedNNS, a neural network search (NNS) framework tailored for medical imaging tasks. It aims to jointly address architecture selection and weight initialization by building a meta-learning space based on supernetwork-extracted subnetworks and performance-predictive embeddings. The authors introduce a rank loss and a FID-based dataset similarity loss to enhance the alignment between datasets and models in the latent space. The method is evaluated on MedMNIST datasets and compared against both standard models and neural architecture search (NAS) baselines.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The idea of jointly learning dataset and model embeddings to guide network selection is interesting, and the proposed use of rank loss and FID loss as additional regularization terms is novel in its formulation.
    2. The paper utilizes weight-sharing Supernetworks to generate a large number of subnetworks (over 720k model-dataset pairs), which reduces the training overhead compared to traditional NAS like TANS.
    3. The authors report not only final accuracy (100 epoch) but also early-epoch (10 epoch) performance, which is useful for assessing adaptation speed.
    4. The visualization of the method (Fig. 2) and dataset and model embeddings (Fig. 3) are clear and beautiful.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The authors claim MedNNS is the first NNS framework for medical imaging, which is incorrect. For example, “Automatic searching and pruning of deep neural networks for medical imaging diagnostic” [https://ieeexplore.ieee.org/abstract/document/9222548] predates this work. Such overstatements raise concerns about the thoroughness of the literature review.
    2. The importance of architecture selection and weight initialization has diminished with the emergence of powerful pretrained vision foundation models (e.g., ViT variants, CLIP, MedSAM). This work does not justify why task-specific architecture search is still needed or preferable, especially in the absence of large-scale pretrained backbones in their pipeline.
    3. Experimental settings: 1) The MedMNIST benchmark, while diverse in modality, is widely recognized as a toy dataset and not representative of clinical imaging challenges. There is no experiment on 3D imaging or full-size clinical datasets, limiting the applicability of results. 2) All models are based on ResNet-like backbones, which have been largely outperformed in recent literature. 3) The choice of optimizer and hyperparameters, such as Adam with a learning rate of 1e-2, is questionable and not well-justified.
    4. The paper does not present any clinical task, downstream diagnostic use case, or interpretation-related application. Thus, the work is better characterized as a general-purpose NAS strategy developed in the medical domain, rather than a contribution to medical AI specifically.
    5. Despite claims of improved efficiency, the method involves training Supernetworks and evaluating over 720k model-dataset pairs. The real-world feasibility of this approach is not demonstrated, and no wall-clock time or GPU budget is reported.
    6. The presentation still lacks clarity: 1) The introduction is overloaded with equations and does not clearly delineate the limitations of prior work. 2) The flow between sections is hard to follow, and the method description is unnecessarily verbose in parts while underdeveloped in others (e.g., the FID loss lacks sufficient intuition).
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (2) Reject — should be rejected, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper introduces a technically elaborate but limited contribution. The novelty is overstated, the chosen datasets are too simplistic, and the architectural choices are outdated. Furthermore, there is no demonstrated clinical value, and the method imposes significant computational costs. While the meta-space formulation has conceptual merit, the authors do not convincingly show that it is necessary or competitive compared to modern transfer learning with large-scale vision models. Overall, the paper lacks the rigor, relevance, and clarity expected at MICCAI.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    While the authors provide a thoughtful response addressing several concerns, key issues remain unresolved. First, the justification for task-specific architecture search is unconvincing given the growing availability of powerful pretrained medical vision models; the absence of comparative baselines weakens the argument. Most critically, the experimental setting relies entirely on MedMNIST, a toy dataset not representative of real-world clinical challenges. The paper lacks evaluation on full-resolution 2D/3D datasets or any clinical use case, limiting its practical relevance. Overall, the submission does not meet the bar for publication.



Review #2

  • Please describe the contribution of the paper

    This paper proposes a novel approach for selecting models and pretrained weights. Experimental results demonstrate that the proposed method outperforms existing approaches in most cases.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed method incorporates novel elements, including new loss terms and the application of techniques aimed at reducing training costs.
    • The experiments are extensive and thoughtfully designed to minimize data-related biases. The results further demonstrate the effectiveness of the proposed method.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • As the authors mentioned, weights pretrained on ImageNet may not transfer well to medical imaging tasks due to domain differences. However, the proposed loss function incorporates FID, which relies on embeddings from a model trained on ImageNet. Although the ablation study suggests that this component contributes positively to performance, it would be helpful to know whether the authors considered the potential limitations of using such domain-specific embeddings.

    • It would be beneficial to include a comparison of the computational overhead between the proposed method and the baseline models, including the cost of training the model zoos as well as inference time.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method is novel, and the paper is clear and well-organized. The experiments are extensive and demonstrate superior performance.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    I believe the authors have adequately addressed most of the reviewers’ concerns. Overall, this work could be helpful for the community.



Review #3

  • Please describe the contribution of the paper

    The paper introduces MedNNS, presented as the first Neural Network Search (NNS) framework specifically designed for medical imaging tasks. Unlike traditional Neural Architecture Search (NAS), MedNNS aims to jointly optimize both the architecture selection and the weight initialization for a given target medical dataset. It does so by:

    1. Constructing a large “model zoo” efficiently using a Supernetwork trained on various source medical datasets, allowing the extraction of numerous subnetworks (architecture + inherited weights) without individual retraining.

    2. Building a meta-learning space where dataset embeddings and model embeddings are aligned. This alignment is optimized using a novel combination of losses: a performance prediction loss, a rank loss, and an FID-based loss (to ensure datasets with similar feature distributions are close in the embedding space).

    3. Enabling efficient querying of this meta-space with a new target dataset to retrieve the most promising pre-trained model (architecture + weights) for faster convergence and better performance compared to standard ImageNet pre-training or typical NAS approaches.
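
    The querying step in item 3 can be illustrated with a minimal sketch. It assumes that the dataset and model embeddings are already computed and that candidates are ranked by cosine similarity; the function name `query_meta_space`, the toy embeddings, and the similarity measure are all hypothetical and are not taken from the paper or its code.

```python
import numpy as np

def query_meta_space(dataset_embedding, model_embeddings, top_k=3):
    """Rank model embeddings by cosine similarity to one dataset embedding.

    Hypothetical helper: the real framework may score candidates differently.
    """
    d = dataset_embedding / np.linalg.norm(dataset_embedding)
    m = model_embeddings / np.linalg.norm(model_embeddings, axis=1, keepdims=True)
    scores = m @ d                       # cosine similarity per model
    order = np.argsort(-scores)[:top_k]  # indices of the best matches first
    return order, scores[order]

# Toy meta-space with four models embedded in 2-D (illustrative values only).
model_embeddings = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.7, 0.7],
    [-1.0, 0.0],
])
dataset_embedding = np.array([0.9, 0.1])

indices, scores = query_meta_space(dataset_embedding, model_embeddings, top_k=2)
print(indices, scores)
```

    In this toy setup the retrieved indices would identify which subnetwork (architecture plus inherited weights) to fine-tune on the target dataset.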

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Novel Problem Formulation: Addresses the dual challenge of architecture selection and weight initialization in the context of medical imaging due to domain shift from natural images and the diversity of tasks/modalities. Framing this as an NNS problem specifically for medical imaging, leveraging a meta-learning space, is a novel and relevant contribution.


    Efficient Model Zoo Construction: The use of a Supernetwork combined with weight sharing and fair sampling to generate a large model zoo (almost 51x larger than in previous SOTA methods) is a significant strength, making the meta-learning approach more feasible.


    Novel Meta-Space Optimization: The introduction of the rank loss and the FID loss for constructing the meta-space appears novel in this context. The rank loss aims to capture relative model performance more effectively than standard contrastive losses used in TANS, while the FID loss explicitly incorporates inter-dataset similarity. The advantages of incorporating these losses are evident in the results of the ablation study shown in Table 2.


    Demonstrated Performance Gains & Faster Convergence: Experimental results on MedMNIST datasets show MedNNS variants generally outperform both standard DL models (with ImageNet pre-training) and several SOTA NAS methods in terms of final accuracy. Critically, the accuracy at 10 epochs is often substantially higher compared to the other methods, demonstrating much faster convergence.


    Strong visualization and analysis: The paper provides visualization of the learned meta-space, showing clear clustering and correlation between embedding proximity and model performance.


    Reproducibility: The authors promise to release the source code and the processed meta-space.
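
    To make the idea of a rank loss more concrete, the sketch below shows one generic pairwise margin formulation: for a fixed dataset, a model with higher measured accuracy should receive a higher similarity score than a model with lower accuracy. This is an assumption for illustration only; the paper's exact rank loss is not reproduced here, and the function name, margin value, and toy numbers are hypothetical.

```python
import numpy as np

def pairwise_rank_loss(similarities, accuracies, margin=0.1):
    """Generic pairwise hinge ranking loss (illustrative, not the paper's loss).

    similarities: dataset-model similarity scores in the embedding space.
    accuracies:   measured accuracy of each model on that dataset.
    """
    s = np.asarray(similarities, dtype=float)
    a = np.asarray(accuracies, dtype=float)
    better = a[:, None] > a[None, :]   # True where model i beats model j
    if not better.any():
        return 0.0
    # Penalize pairs where the better model is not ahead by at least `margin`.
    violations = np.maximum(0.0, margin - (s[:, None] - s[None, :]))
    return float(violations[better].mean())

accuracies = [0.80, 0.70, 0.60]
aligned = pairwise_rank_loss([0.9, 0.5, 0.1], accuracies)   # ordering agrees
reversed_ = pairwise_rank_loss([0.1, 0.5, 0.9], accuracies)  # ordering flipped
print(aligned, reversed_)
```

    The loss is zero when the similarity ordering matches the accuracy ordering with enough margin, and grows as the ordering is violated.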


  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Limited Comparison to TANS: a direct empirical comparison with TANS is missing (stated as due to lack of TANS model zoo structure file). TANS used contrastive loss and it would have been interesting to compare it against the Rank/FID/Perf loss combinations proposed in this paper, and also better discern the contribution of the novel combined loss function proposed here vs the supernetwork structure.



    Generalization Issues and comparison with basic transfer learning: The paper acknowledges poorer performance on the TissueMNIST dataset, attributing it to high FID distance (dissimilarity) from other datasets in the meta-space. This highlights a potential limitation: the effectiveness of MedNNS depends on having sufficiently related datasets within the meta-space construction phase.

    Extending this point, given the similarity between the OrganS, OrganC, and OrganA datasets, how would the proposed method fare against a basic transfer learning approach after training on S, C, or A?

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    R1: Fig 2, In A.2 you can use the notations E_m and E_d for the encoders (makes it easier to follow the text).

    Q. What is the ‘performance’ that is repeatedly mentioned in the paper?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper tackles a highly relevant problem in medical AI – jointly optimizing architecture and initialization – with a novel NNS framework (MedNNS) tailored for medical tasks. The core strengths are the innovative use of Supernetworks for model zoo creation and the novel meta-space learning approach incorporating rank and FID losses. These contributions are technically sound and well-motivated. The empirical results convincingly demonstrate the primary benefit: significantly faster convergence and improved performance compared to standard ImageNet transfer learning and various NAS methods across multiple medical datasets. The clarity of the presentation and the promise of code release further strengthen the paper. While weaknesses exist (lack of direct TANS comparison, modest final accuracy gains in some cases, potential generalization limits) they can be easily addressed in future work and the strengths significantly outweigh these weaknesses especially for presentation at MICCAI. The work opens a promising direction for efficiently adapting deep learning models to diverse medical imaging tasks.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    I thank the authors for their comments. While the concerns with using FID still persist, the strengths of this paper and its potential for clinical impact and future discussions significantly outweigh any drawbacks that have been highlighted by other reviewers.

    As a foundational, first-attempt work, this paper is good to go in its current form plus a bit of polishing to improve clarity and remove repetitive material. My final decision is Accept.



Review #4

  • Please describe the contribution of the paper

    The paper introduces MedNNS, a novel framework for medical neural network search that constructs a shared meta-space embedding both models and datasets. This joint embedding allows for efficient retrieval of the best architecture and pretrained weights for a new medical imaging task.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1- The paper addresses a significant challenge in medical imaging.

    2- The paper is well-written and logically organized. The motivation, methodology, and experimental results are presented in a clear and accessible way, making it easy to follow the proposed framework

    3- The experiments are rigorous, covering six diverse medical datasets with proper cross-validation. Results demonstrate consistent improvements.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    1) The paper does not provide sufficient details about how FID is computed. If FID is calculated using InceptionV3 pretrained on ImageNet — as is standard — it raises concerns about whether the resulting similarity scores truly reflect medically relevant features.

    2) The ablation study in Table 2 includes a contrastive loss, which is not used in MedNNS. While it seems to be included in a previous work, its presence is confusing in the context of an ablation. Typically, ablation studies are expected to isolate and test the effect of components that are part of the proposed method.
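
    For reference on point 1, the standard Fréchet distance between two feature sets, each modelled as a Gaussian, is ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2)). The sketch below computes it with NumPy on precomputed feature matrices. How the features are extracted (for example, from an ImageNet-pretrained InceptionV3, as is standard for FID) is outside the sketch, and the function name and random features are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature matrices.

    feats_a, feats_b: arrays of shape (n_samples, n_features), n_features >= 2.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((C_a C_b)^(1/2)) equals the sum of square roots of the eigenvalues
    # of C_a C_b, which are real and non-negative for covariance matrices.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)

# Random stand-in features; real use would pass network activations.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 4))
same = frechet_distance(features, features)           # identical sets
shifted = frechet_distance(features, features + 3.0)  # mean shifted by 3 per dim
print(same, shifted)
```

    Because the distance depends entirely on the feature extractor, the reviewer's question about whether ImageNet features capture medically relevant similarity applies to whatever is passed in as `feats_a` and `feats_b`.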

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (6) Strong Accept — must be accepted due to excellence

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well-written and easy to follow, and the experimental setup is thorough, demonstrating consistent improvements over SOTA methods.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors have addressed all my concerns.




Author Feedback

We thank all reviewers (R1–R4) for their valuable feedback and for recognizing several contributions, such as novelty (R1, R3), efficiency (R1, R2), clarity (all), and rigorous analysis (R1, R3, R4). Our responses to the comments follow.

[R2] First Medical NNS: Unlike our proposed MedNNS, “DNNDeepeningPruning” follows a typical two-stage NAS pipeline, which decouples architecture selection and weight initialization, making the method task-specific and requiring it to be repeated for each new dataset. In contrast, MedNNS jointly optimizes architecture and initialization within a meta-space that is built once and reused for new datasets by encoding only a few samples. While earlier methods apply NAS to medical imaging (see survey MedNAS), MedNNS is, to our knowledge, the first to unify architecture selection and weight initialization in a task-adaptive framework for this domain.

[R2] Problem relevance: Although medical pretrained foundation models are powerful, they perform poorly on specific medical datasets unless fine-tuned. However, fine-tuning can cost more than training small models from scratch, and large models also incur high inference costs. MedNNS facilitates the automatic selection of task-specific small models, enabling higher performance with lower computational cost.

[R3,R4] FID Loss: We acknowledge the reviewers’ points regarding the FID loss, which relies on ImageNet embeddings. While these embeddings may not be optimal for transfer learning in medical imaging due to domain differences, they are sufficiently expressive to quantify similarities between datasets based on image samples. This is supported by Fig. 1, where FID correlates with the visual similarity of datasets. Also, [16] indicates that FID correlates with feature reuse. Note that FID is computed using standard methods, and its positive impact is demonstrated in our ablation study.

[R1] Comparison with transfer learning: MedNNS surpasses basic transfer learning. For example, transfer from OrganC to OrganS results in ~70% accuracy (Fig. 1), whereas MedNNS achieves ~81% on OrganS (Tab. 1). This gain stems from transfer learning’s reliance on manually chosen architectures and weights, which may be suboptimal for the target task.

[R2,R3] Computational Cost: MedNNS enhances efficiency compared to prior methods. Even so, constructing the supernet-based model zoo still requires moderate resources (~3 h on an A100 GPU per dataset). However, this one-time process enables rapid model selection for any new medical dataset by encoding only a few samples (~1 s). Also, MedNNS reduces training costs for any target medical dataset by matching or surpassing 100-epoch models in just 10 epochs (Tab. 1). We will clarify this further in the paper.

[R2] Dataset and Architecture Choice: MedMNIST (224×224) is a diverse benchmark built from real-world medical datasets, including X-ray, histopathology, and OCT (see Tab. 2 in [26] for data sources), making it well-suited for evaluating generalization. We used ResNet-like architectures for efficient supernet design with flexible depth and width. As shown in Table 1, MedNNS-picked ResNets outperform baselines like DenseNet, validating our method. Meta-space hyperparameters were set empirically; they affect the quality of the space, not the SOTA comparison. See the Experiments section for the models’ training settings.

[R2] Clinical relevance: Though not targeting a clinical task, MedNNS tackles a key medical AI challenge: adapting DL models to diverse medical data. As noted in the introduction, the unique nature of medical images often limits the effectiveness of transfer learning from natural images. This motivates a specialized NNS framework with tailored architectures and initializations for the medical domain.

[R4] Contrastive loss: The contrastive loss is not part of MedNNS but is included in the ablation to: 1) isolate the impact of the proposed Rank and FID losses from the effect of the supernetwork and show their added value, and 2) enable an indirect comparison with TANS, since a direct comparison was not possible.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Even though the article has merit and is of interest to the MICCAI community, as recognized by all reviewers, I tend to agree with Reviewer 2 that it cannot be accepted in its current form. The main reason is the lack of comparison with fine-tuned or adapted pretrained vision foundation models (e.g., LoRA, adapters, prompt tuning on DINOv2, CLIP, MedSAM). Such a comparison, both in terms of performance and computational cost, is essential to justify why task-specific architecture search is still necessary or preferable.


