Abstract

Learning from label proportions (LLP) is a weakly supervised classification task in which training instances are grouped into bags annotated only with class proportions. While this task emerges naturally in many applications, its performance is often evaluated on bags generated artificially by sampling uniformly from balanced, annotated datasets. In contrast, we study the LLP task in multi-class blood cell detection, where each image can be seen as a “bag” of cells and class proportions can be obtained using a hematocytometer. This application introduces several challenges that are not appropriately captured by the usual LLP evaluation regime, including variable bag size, noisy proportion annotations, and inherent class imbalance. In this paper, we propose the Vertex Proportion loss, a new, principled loss for LLP, which uses optimal transport to infer instance labels from label proportions, and a Deep Sparse Detector that leverages the sparsity of the images to localize and learn a useful representation of the cells in a self-supervised way. We demonstrate the advantages of the proposed method over existing approaches when evaluated on real and synthetic white blood cell datasets.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3997_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/3997_supp.pdf

Link to the Code Repository

https://github.com/carolina-pacheco/LLP_multiclass_cell_detection/

Link to the Dataset(s)

https://github.com/carolina-pacheco/LLP_multiclass_cell_detection/

BibTex

@InProceedings{Pac_Vertex_MICCAI2024,
        author = { Pacheco, Carolina and Yellin, Florence and Vidal, René and Haeffele, Benjamin},
        title = { { Vertex Proportion Loss for Multi-Class Cell Detection from Label Proportions } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15012},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors proposed a vertex proportion (VP) loss based on optimal transport that propagates global annotations to instance labels treated as latent variables. A self-supervised model was introduced to localize cells and learn useful representations to classify them in a unified framework. The method was evaluated on the WBC subtype classification task.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. A vertex proportion (VP) loss based on optimal transport was proposed. It appears simpler than other OT-based pseudo-labeling methods.
    2. A deep sparse detector (DSD) was introduced to localize cells and provide features for classification.
    3. The WBC subtype classification task was viewed as an LLP problem for the first time.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. VP-L2 and VP-CE achieved the best results on the real data and the synthetic data, respectively. However, on synthetic data VP-L2 performed worse than the LLP-PLOT method, and on real data VP-CE performed worse than LLP-PLOT. The results do not seem robust. How should one choose between VP-L2 and VP-CE in a real application?
    2. In the description of the bag-level loss results, it is unclear how the entropy is computed, and there is no Table or Figure showing the entropy for these methods. This is hard to understand.
    3. Implementation details are lacking, such as the model, optimizer, learning rate, and number of epochs.
    4. The writing needs more precision to avoid typos and inconsistent Table formatting.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. More implementation details are needed.
    2. Add an explanation of why the results of the VP-L2 and VP-CE losses show different trends on synthetic and real data. Significance tests also need to be added.
    3. The writing needs improvement.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper lacks experimental details, and the writing needs improvement.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    Learning from label proportions (LLP) is a weakly supervised classification task where instances are grouped into bags with class proportions. This study focuses on multi-class blood cell detection, treating each image as a “bag” of cells with proportion annotations from a hematocytometer. Challenges like variable bag size, noisy annotations, and class imbalance are addressed using the Vertex Proportion loss and a Deep Sparse Detector. Results demonstrate the superiority of this approach on real and synthetic white blood cell datasets compared to existing methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The problem of the paper is interesting and practical.
    2. The paper is technically sound, and results on different datasets are presented.
    3. The idea is interesting and the framework is clear.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. It is recommended to add an anonymous GitHub link for reproducibility.
    2. Can you discuss potential solutions to the limitations of the paper?
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    See Strengths and Weaknesses

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    See Strengths and Weaknesses

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose a Vertex Proportion Loss, based on Optimal Transport, for Learning from Label Proportions (LLP). They apply their method to the task of multi-class blood cell detection in lensless imaging. For this task, they also propose a Deep Sparse Detector that learns deep vector representations of cells by minimizing a reconstruction loss (and possibly the LLP classification loss). The loss attempts to minimize the cost of a transportation plan matching the distribution of predicted probabilities (that live on a simplex) to the ground truth distribution (supported by the vertices of the simplex). The OT amounts to solving a linear problem, and backpropagating the loss through the network can be done straightforwardly thanks to a result from optimization theory.
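The OT matching described above can be illustrated with a small sketch (this is an interpretation of the review's description, not the authors' released code): predictions live on the probability simplex, targets are simplex vertices replicated according to the bag's class proportions, and the transport plan reduces to a linear assignment problem. The function name, the rounding of proportions to integer counts, and the choice of L2 or cross-entropy cost are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def vertex_proportion_loss(probs, proportions, cost="l2"):
    """Illustrative vertex-proportion-style OT loss (a sketch, not the paper's code).

    probs:       (N, K) predicted class probabilities (rows on the simplex).
    proportions: (K,) ground-truth class proportions for the bag.
    Matches each prediction to a simplex vertex so that vertex counts follow
    the proportions, then returns the minimal mean transport cost and the
    inferred one-hot instance labels.
    """
    n, k = probs.shape
    # Round proportions to integer counts summing to n (largest remainders).
    raw = proportions * n
    counts = np.floor(raw).astype(int)
    remainder = n - counts.sum()
    counts[np.argsort(raw - np.floor(raw))[::-1][:remainder]] += 1
    # Vertex targets: the one-hot vector of class c repeated counts[c] times.
    targets = np.repeat(np.eye(k), counts, axis=0)          # (N, K)
    # Pairwise cost between each prediction and each vertex target.
    if cost == "l2":
        c = ((probs[:, None, :] - targets[None, :, :]) ** 2).sum(-1)
    else:  # cross-entropy to the one-hot vertex
        c = -(targets[None, :, :] * np.log(probs[:, None, :] + 1e-12)).sum(-1)
    rows, cols = linear_sum_assignment(c)                   # optimal transport plan
    return c[rows, cols].mean(), targets[cols]
```

As the review notes, the matching is a linear program, so in a differentiable setting one could hold the plan fixed and backpropagate through the cost entries of the selected matches.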

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper contributes methodologically by proposing a novel Vertex Proportion loss based on Optimal Transport
    • The paper proposes an encoder for dealing with WBC holographic data
    • The paper is clear
    • The results are satisfactory. The loss compares favorably to the KL divergence and matches the performance of more computationally demanding SOTA methods.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Reproducibility is not guaranteed (no code, no implementation details)
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors do not claim to make the code public upon acceptance. No details are given about the implementation. (I am not taking into account the supplementary material beyond the allowed 2 pages.)

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Minor comment: It is not clearly stated whether the training of the DSD and classification head is end-to-end or whether the DSD is trained purely from the unsupervised reconstruction loss, and in a second stage these representations are used for LLP.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The strengths mentioned above.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    After reading the reviews and rebuttal, I maintain my original score, especially as the authors have clarified on reproducibility / sharing the code.




Author Feedback

We thank the reviewers for their insightful feedback. We are glad they appreciated the methodological contribution of our proposed VP loss (R1, R3, R4), found the problem interesting (R3), the evaluation and results satisfactory (R1, R3), and the paper clear (R1, R3).

[R1, R3, R4] Concerns about reproducibility: We apologize for not making it clear that the synthetic WBC holographic dataset and the code (loss computation as well as pretrained models) will be released upon publication. Unfortunately, the real WBC data cannot be publicly released as it is proprietary medical data. In addition, the supplemental document will be reduced to focus on Sections 4 and 5, corresponding to implementation details (R1, R3, R4), and fit in the 2-page limit (R1).

[R1] Are encoder and classifier jointly trained?: We will specify in the paper that the encoder and classifier are trained separately as joint training leads to a decrease in detection performance (see example in Section 6 of the supplemental doc.).

[R3] What are limitations and potential solutions?: We identify two main limitations (which are listed in the conclusion section): (i) trading off recall for computational efficiency, and (ii) being restricted to sparse images. On the one hand, we hypothesize that recall could be improved (while maintaining efficiency) if a supervisory signal was provided to guide the support of our sparse encoding volume. On the other hand, sparsity is a fundamental assumption in the formulation of our method, and therefore it could not be applied to arbitrary domains unless the images are sparse under a differentiable and invertible transformation.

[R4] Inconsistent results in real and synthetic data: We would like to correct R4’s observation and point out that in fact VP-CE outperforms LLP-PLOT on both synthetic and real data (note that the metrics differ: for synthetic data we use accuracy, whereas for real data we use proportion prediction error). We also wish to clarify that, as noted by R1, we claim “similar or better performance” with respect to LLP-PLOT, with our method having methodological novelty and computational advantages.

[R4] How to select a cost function (CE vs L2)?: VP-CE loss would be a reasonable choice for classification applications as CE is a widely used loss for classification tasks and it consistently shows good results throughout our experiments. In the paper we include results with a different cost function (i.e. L2) to emphasize the flexibility of our framework.

[R4] How is entropy computed?: The numbers reported in the results section correspond to the entropy of the output of the classifier, averaged across samples. Ideally, we would like each prediction to be close to one of the vertices of the simplex, in which case the average entropy is small.
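The entropy described in this response can be sketched directly (a minimal illustration, assuming the classifier outputs an (N, K) matrix of class probabilities; the function name is hypothetical):

```python
import numpy as np


def mean_prediction_entropy(probs, eps=1e-12):
    """Shannon entropy (nats) of each predicted distribution, averaged over samples.

    probs: (N, K) predicted class probabilities. A small value indicates that
    predictions concentrate near the vertices of the simplex.
    """
    per_sample = -np.sum(probs * np.log(probs + eps), axis=1)
    return float(per_sample.mean())
```

Predictions at a vertex give entropy near 0, while uniform predictions over K classes give the maximum, log K.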

[R4] Writing and table formatting: We are happy to incorporate specific suggestions if appropriate, but unfortunately, we are not sure what specifically the reviewer is referring to. We note the other two reviewers give strong scores and comments for clarity.

[R4] Details of implementation: Please see our response above on reproducibility.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper proposes a Vertex Proportion Loss based on optimal transport for Learning from Label Proportions and applies it to cell detection. The reviewers highlighted the novelty of the loss and the clear description as strengths. As weaknesses, they noted concerns about reproducibility, the robustness of the results, and the need for a more detailed description of the methodological parts. The decision was split: Accept (A), Weak Accept (WA), and Weak Reject (WR). Since the rebuttal addressed many issues and presented the novel idea effectively, the meta-reviewer recommends accepting this paper.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


