Abstract

The success of deep learning in medical imaging applications has led several companies to deploy proprietary models in diagnostic workflows, offering monetized services. Even though model weights are hidden to protect the intellectual property of the service provider, these models are exposed to model stealing (MS) attacks, where adversaries can clone the model’s functionality by querying it with a proxy dataset and training a thief model on the acquired predictions. While extensively studied on general vision tasks, the susceptibility of medical imaging models to MS attacks remains inadequately explored. This paper investigates the vulnerability of black-box medical imaging models to MS attacks under realistic conditions, where the adversary lacks knowledge of the victim model’s training data and operates with a limited query budget. We demonstrate that adversaries can effectively execute MS attacks using publicly available datasets. To further enhance MS capabilities under limited query budgets, we propose a two-step model stealing approach termed QueryWise. This method capitalizes on unlabeled data obtained from a proxy distribution to train the thief model without incurring additional queries. Evaluation on two medical imaging models, for Gallbladder Cancer and COVID-19 classification, substantiates the effectiveness of the proposed attack. The source code is available at https://github.com/rajankita/QueryWise.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2993_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/2993_supp.pdf

Link to the Code Repository

https://github.com/rajankita/QueryWise

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Raj_Assessing_MICCAI2024,
        author = { Raj, Ankita and Swaika, Harsh and Varma, Deepankar and Arora, Chetan},
        title = { { Assessing Risk of Stealing Proprietary Models for Medical Imaging Tasks } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15011},
        month = {October},
        pages = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper explores the vulnerability of black-box models for medical imaging tasks. Specifically, the authors show that model stealing (MS) attacks can be mounted simply by querying the model with a publicly available dataset and training a thief model to clone its behavior. An enhanced MS approach, named QueryWise, is also proposed, which leverages unlabeled data to train a promising thief model under a limited query budget.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The idea of using existing datasets to attack black-box models to steal the model’s parameters and performance is a very interesting and worthwhile security problem.
    2. The written English of this paper is very clear, and the method proposed is very easy to follow.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The main idea of model stealing attacks proposed in this paper is very similar to the idea of “model extraction attack” [1]. However, the authors didn’t discuss any papers about model extraction attacks or compare the performance of their method to any existing model extraction attack approaches [1-3]. The authors should discuss the differences between model stealing attacks and model extraction attacks.
    2. As mentioned in the abstract and Sec.1, the proposed method QueryWise could use a limited query budget to train the thief model. However, in the experimental section, this number is set to 5000, which is a large number for most of the medical imaging datasets. I don’t think this could be treated as “limited”.
    3. The proposed method uses a teacher-student framework and uses pseudo-labeling as supervision signal, which shows high similarity to the methods used in knowledge distillation, the authors should discuss this in the paper.
    4. The purpose of using the anchor model is confusing to me. As claimed in the paper, the anchor model is proposed as a guide for thief model training. However, since the authors already used a teacher-student framework (the teacher could guide the student), the use of an anchor model seems unnecessary.

    [1] Stealing Machine Learning Models via Prediction APIs. 25th USENIX Security Symposium (USENIX Security 16), 2016
    [2] Increasing the Cost of Model Extraction with Calibrated Proof of Work. International Conference on Learning Representations (ICLR), 2022
    [3] Data-Free Model Extraction Attacks in the Context of Object Detection. International Conference on Computer Vision Systems (ICVS), 2023
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please refer to the weaknesses part for details.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. Some concepts are not discussed in the paper.
    2. Some concepts need more explanations.
  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • [Post rebuttal] Please justify your decision

    Thanks for the authors’ response. Although the authors claim in the rebuttal that they cited [23] in the main paper as a discussion of model extraction (ME) attacks, they did not actually discuss ME; indeed, [23] is described as a model stealing method (see Sec. 1). Although 5,000 samples are indeed very few for conventional images, it is a large number for medical image datasets. The authors’ method is mainly aimed at medical images, so training the thief model with 5,000 samples cannot be described as “limited”. This amount of data can enable the model to achieve very good performance and should not be considered “limited”.

    Although the author’s response addressed some of my concerns, I believe the two issues mentioned above are very serious. Therefore, while I have raised my score, I’m still inclined to recommend rejection of this paper.



Review #2

  • Please describe the contribution of the paper

    A ‘thief’ trains using predictions of a black-box ‘victim’ model during a Model Stealing (MS) attack. Authors propose an additional training step to enhance thief performance with minimal victim queries. Their Query-Wise (QW) method clones the thief (or ‘anchor’) to create a ‘teacher’, and uses both models to train a new thief (the ‘student’). The teacher updates based on the moving average of the student, while the anchor is fixed. The teacher sees only unlabelled data while the student sees both. On labelled data, QW matches student and teacher predictions whilst aligning the probability distribution of the student to the anchor. On unlabelled data, QW aligns the probability distribution of the student to the teacher and anchor (only when the prediction confidence is high). Authors demonstrate QW on Gallbladder and COVID-19 US datasets, and natural images, measuring model accuracy, sensitivity, and specificity. They compare the anchor model to the updated QW thief, and measure the agreement between the thieves and the victim.
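    To make this summary concrete, below is a minimal, hypothetical PyTorch sketch of a training step in the spirit of what the review describes. The function name, the confidence threshold, and the equal weighting of loss terms are illustrative assumptions, not the authors' exact objective.

```python
import torch
import torch.nn.functional as F

def querywise_step(student, teacher, anchor, x_lab, y_victim, x_unlab,
                   conf_thresh=0.9):
    # Labelled branch: fit the victim's hard labels and align the
    # student's output distribution to the fixed anchor.
    s_lab = student(x_lab)
    loss = F.cross_entropy(s_lab, y_victim)
    with torch.no_grad():
        a_lab = F.softmax(anchor(x_lab), dim=1)
    loss = loss + F.kl_div(F.log_softmax(s_lab, dim=1), a_lab,
                           reduction="batchmean")

    # Unlabelled branch: align the student to teacher and anchor, but
    # only on samples where the teacher's pseudo-label is confident.
    s_unlab = F.log_softmax(student(x_unlab), dim=1)
    with torch.no_grad():
        t_probs = F.softmax(teacher(x_unlab), dim=1)
        a_probs = F.softmax(anchor(x_unlab), dim=1)
    mask = (t_probs.max(dim=1).values > conf_thresh).float()
    kl_t = F.kl_div(s_unlab, t_probs, reduction="none").sum(dim=1)
    kl_a = F.kl_div(s_unlab, a_probs, reduction="none").sum(dim=1)
    loss = loss + ((kl_t + kl_a) * mask).mean()
    return loss
```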

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper clearly lays out the need for studying MS attacks in the medical imaging domain. It demonstrates that the proposed method does indeed improve over multiple thief-training paradigms and architectures. Authors provide clear Figures providing useful insights into their QW method. Figure 3 is particularly interesting, showing the improvement of the student over the anchor on unlabelled data as training progresses. Datasets used are publicly available, and the training framework is mostly well-described.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Authors make 2 claims which are misleading and must be addressed: (1) the claim to novelty of studying MS attacks on medical imaging models, and (2) the claim that they have invalidated credible MS attack defences.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    There are some aspects of the work which are not clear enough to reproduce, e.g. Logit Adjustment is described too generically. It would be useful to have the source code for QW available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. §1 paragraph 4, pg 3 - Authors state that “a key insight of [their] work is that…hard labels queried from the victim may lack meaningful information” because the proxy dataset is out of distribution (OOD). However, this fact has been mentioned several times in their citations and is a well-known problem in machine learning (and modelling) in general: using OOD data yields inaccurate model responses (sometimes purposefully, to avoid MS attacks). Authors may wish to clarify what is meant by their claim.

    2. §1 paragraph 5, pg 3 - The 1st contribution stated here needs clarification: it is misleading for Authors to claim they are the first to attack a medical imaging model. They even cite a work [32] which performs text-based attacks on a model trained with lung CT.

    3. §1 paragraph 5, pg 3 and §5 - The 3rd contribution is misleading: results in the supplementary table do not evidence the claim that their selected “stealing defences” have been “invalidated”. In fact, in some instances the accuracy of the thief models is reduced by the defences. Authors also state this in their results section, invalidating their own claim. The idea of MS attack defences is given very little discussion in this work and is a large topic to try and tack onto the end. It is suggested that the Authors give more time to this section to give more context on MS defences, and to clarify their claim, or leave this subject for subsequent work where they can explore the impact more thoroughly.

    4. Authors assess their QW method on 3 classification tasks: Gallbladder Cancer classification, COVID-19 classification, and natural image classification. Can Authors comment on the utility of their method for regression models, e.g. scoring, or natural language prediction models, e.g. image to medical report?

    5. The query budget (number of queries to the ‘victim’) is fixed at 5000 for the medical imaging datasets, and 500k for the natural image experiment. It is most likely that the models which fall victim to MS attacks are those which work on highly specialized data and therefore have very limited publicly available datasets. Can Authors comment on the effectiveness of their QW method when the number of queries is much lower, as would be necessary in this regime?

    6. Authors briefly mention that they use Logit Adjustment when training the student: please can they clarify how this is achieved - on a post-hoc basis, or using an auxiliary model to estimate class weights as in [19]? This weighting will directly influence the predictions of the resulting student model, and so this step should be clarified to improve reproducibility.

    7. Table 1 shows that when a thief model has higher agreement with the victim, the performance metrics are usually lower - do authors have an intuition for this behaviour?

    8. §4 paragraph 3, pg 7 - Authors make a claim that having a thief model out-perform a human radiologist is a serious threat to the proprietary model. Is this only in the context where a malicious actor is able to claim their stolen model is comparable to human analysis? Or do the Authors have an alternative meaning here?

    Minor Concerns

    1. Authors’ experiments are in the regime where only hard labels are returned by the victim. Given that some MS attacks are designed to benefit from confidence scores returned by some victims, can Authors comment on whether QW would further improve such attacks? And would that improvement be as large as with hard-label MS attacks?

    2. In their experiments, Authors compare with FixMatch [26] which uses image augmentations to further train the thief model. Could Authors comment on whether there would be any benefit to incorporating this idea into QW?

    3. It is noted that QW is consistently better in the natural image experiments: is this a result of increased availability of training data, or can authors account for this differently?

    4. Authors do not adequately describe the make-up of a training mini-batch. It is clear that a mini-batch contains both labelled and unlabelled data, but what is the ratio of these?

    Additional Comments

    1. §1 paragraph 5, pg 3 - Typos: “This ensures our techn
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Authors make several claims which have not been evidenced sufficiently. There are some sections in the paper that are not given the time which they deserve, e.g. discussion of model defences.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Reject — should be rejected, independent of rebuttal (2)

  • [Post rebuttal] Please justify your decision

    Authors have not given sufficient detail in their rebuttal to address the concerns of this reviewer regarding Authors’ claims. More detail is also needed to justify the inclusion of the short section on defenses.



Review #3

  • Please describe the contribution of the paper

    This paper investigates the vulnerability of black-box medical imaging models to MS attacks under realistic constraints like limited query budgets and lack of access to victim model’s training data, addressing the challenging issue of model stealing attacks in medical imaging.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The “QueryWise” method that is proposed in this paper effectively combines labeled and unlabeled data to enhance the capability of MS attacks.
    2. The effectiveness of this method is demonstrated through experiments on two medical imaging models focused on Gallbladder Cancer and COVID-19 classification.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. While the authors emphasize their use of the labelled dataset compared to existing MS methods, this aspect is not well demonstrated in the experiment section.
    2. Lack of Ablation Study: There is no ablation study in the experimental section to substantiate the effectiveness of the model.
    3. The impact of the findings on clinical practice remains unknown.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors provided relevant information about the implementation and experimental setup.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. In the experiment section, it would be beneficial to compare the performance of the proposed method when utilizing labeled data, and contrast this with the performance of existing MS methods.
    2. Ablation study is missing in the paper, which could reveal the contributions of individual components of your approach.
    3. More discussion is needed on the implications of your findings for clinical practice. How might these model stealing attacks affect real-world medical diagnostics?
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well written and addresses an underexplored issue, model stealing attacks in medical imaging. The experimental design and methodological approach are sound, with a clear and structured implementation.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The authors have addressed my concerns. Overall, I do think that this paper is sound and thus I maintain my weak accept decision.




Author Feedback

We are encouraged that reviewers found our work interesting (R1), experiments insightful (R3), our method effective (R4), and our paper easy to follow (R1). We especially thank R3 for their constructive comments, which we will include in the final paper.

R1: Model stealing (MS) and extraction attacks (ME). No comparison with [1-3].

Existing literature uses MS and ME attacks interchangeably. We compared our method to ME [23] as well as MS [21] techniques. We didn’t compare with [1-3] as they are not directly relevant: [1] focuses on decision trees and MLPs, [2] is a defense approach, and [3] requires 5 million queries, which is unsuitable for medical imaging. Note: 1-3 are references provided by the reviewer, 21 and 23 are from the paper.

R1: 5000 is a large number for most medical imaging datasets; cannot be treated as “limited”.

We refer to 5000 as a “limited” query budget in comparison to existing MS attacks on general vision tasks [21,7] requiring millions of queries. We note that our method requires only “unlabeled” images for querying, which are easier to curate than labeled images, even in medical settings.

R1: Similarity to knowledge distillation (KD), and “…anchor model seems unnecessary”.

Our strategy is inspired by KD but extends it to include both labeled and unlabeled data. Knowledge from labeled data is embedded in a fixed anchor model, while the teacher model incorporates knowledge from both labeled and unlabeled data, and is dynamically updated from the student. Thus, the anchor and teacher complement each other.
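As a rough illustration of the "dynamically updated" teacher, the standard mean-teacher-style exponential moving average update is sketched below; the decay value is an assumption, not a detail from the paper.

```python
import torch

@torch.no_grad()
def update_teacher(teacher, student, decay=0.999):
    # The teacher's parameters track an exponential moving average of the
    # student's; the anchor, by contrast, stays frozen after its training.
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(decay).add_(s_p, alpha=1.0 - decay)
```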

R3: “… the novelty of studying MS attacks on medical imaging models is misleading”. “[32] performs text-based attacks on a model trained with lung CT”.

The victim model in [32] is a 1D CNN trained on textual data, with the attack also using text. To our knowledge, our work is the first formal study on MS for medical imaging models under (a) a realistic threat model of 5000 queries, and (b) hard-label access, unlike [32,21].

R3: Results in supplementary do not evidence the claim that MS defenses have been invalidated.

None of the evaluated defenses consistently reduce thief accuracy for all MS methods. Moreover, any drop in thief accuracy is usually accompanied by a drop in victim accuracy, thus questioning the utility of these defenses. Therefore, we advocate more research to develop stronger defenses.

R3: Key insight: “hard labels may lack meaningful information…” is well known.

The key insight is actually in the next sentence: soft pseudo-labels produced by the anchor and teacher help a student capture the class structure better than the hard labels returned by the victim model, which are of limited use on OOD data.

R3: insufficient information for reproducibility.

We shall release the source code post acceptance of our paper.

R3: Clarify how Logit Adjustment is achieved.

We use the logit-adjusted softmax cross-entropy loss from [19] during training.
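For concreteness, the logit-adjusted softmax cross-entropy of [19] shifts each class logit by a scaled log class prior before the usual cross-entropy. A minimal sketch follows; how the priors are estimated (e.g., from the empirical distribution of queried victim labels) is an assumption here, not a detail from the paper.

```python
import torch
import torch.nn.functional as F

def logit_adjusted_ce(logits, targets, class_priors, tau=1.0):
    # Add tau * log(prior) to each class logit, then apply the standard
    # softmax cross-entropy, as in Menon et al. [19].
    return F.cross_entropy(logits + tau * torch.log(class_priors), targets)
```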

R4: While authors emphasize their use of the labeled dataset compared to existing MS methods, it is not well demonstrated in experiments.

We want to clarify that our main contribution is in the use of unlabeled data along with labeled, unlike existing MS methods that rely solely on labeled data. Table 1 shows that our method (Random+QW / k-Center+QW) outperforms existing methods using labeled data alone (Random, k-Center).

R4: Ablation study is missing.

Due to space limits, we could not include our ablation study findings in the paper. Our CIFAR-10 experiments showed that removing the anchor model significantly lowers thief accuracy while removing the teacher model has a smaller but notable impact. Logit adjustment yielded mixed results.

R4: Impact on clinical practice remains unknown.

Our findings indicate that the IP of proprietary medical imaging models deployed in the real world is not secure. This has huge implications for the IP owners and underscores the need to implement more stringent security measures before deployment.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper studies the problem of model stealing (MS) attacks, which can be mounted by querying the model with a public dataset and cloning its functionality. The paper also proposes an MS method, called QueryWise, which uses a limited amount of unlabelled query data to train the thief model. The paper received (reject -> weak reject, weak reject -> reject, weak accept -> weak accept) scores (before -> after rebuttal). The reviewers listed the following strengths: very interesting idea and clear explanations. As for the weaknesses, the reviewers listed the following issues: although interesting, the idea of model stealing is similar to model extraction attack (but a discussion is missing in the paper); and QueryWise uses relatively large datasets. In general, I think the paper is promising with a very nice idea to explore, but the issues identified will need to be addressed before the paper can be published.




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The authors did not fully address the concerns raised by reviewers.




Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper received mixed reviews and the criticism relates to the somewhat early stage of research. This meta reviewer argues that the paper makes a valuable and complementary contribution to MICCAI despite its limitations. In particular, the idea was highlighted as promising and interesting. The authors should improve the description and try adding requested details, improving the overall presentation as requested in the reviews and AC comments. Overall, the paper makes a valuable contribution and may inspire further research.



