Abstract

High resolution is crucial for precise segmentation in fundus images, yet handling high-resolution inputs incurs considerable GPU memory costs, with diminishing performance gains as overhead increases. To address this issue while tackling the challenge of segmenting tiny objects, recent studies have explored local-global feature fusion methods. These methods preserve fine details using local regions and capture context information from downscaled global images. However, the necessity of multiple forward passes inevitably incurs significant computational overhead, greatly affecting inference speed. In this paper, we propose HRDecoder, a simple High-Resolution Decoder network for fundus image segmentation. It integrates a High-resolution Representation Learning (HRL) module to capture fine-grained local features and a High-resolution Feature Fusion (HFF) module to fuse multi-scale local-global feature maps. HRDecoder effectively improves the overall segmentation accuracy of fundus lesions while maintaining reasonable memory usage, computational overhead, and inference speed. Experimental results on the IDRID and DDR datasets demonstrate the effectiveness of our method. The code is available at https://github.com/CVIU-CSU/HRDecoder.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/4020_paper.pdf

SharedIt Link: https://rdcu.be/dV51j

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72114-4_32

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/4020_supp.pdf

Link to the Code Repository

https://github.com/CVIU-CSU/HRDecoder

Link to the Dataset(s)

https://ieee-dataport.s3.amazonaws.com/open/3754/A.%20Segmentation.zip?response-content-disposition=attachment%3B%20filename%3D%22A.%20Segmentation.zip%22&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJOHYI4KJCE6Q7MIQ%2F20240315%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240315T053048Z&X-Amz-SignedHeaders=Host&X-Amz-Expires=86400&X-Amz-Signature=de00f49e9a770569b25982825c74f0b74a69e40fb965de881a8efca993f5b71f

https://drive.google.com/drive/folders/1z6tSFmxW_aNayUqVxx6h6bY4kwGzUTEC

BibTex

@InProceedings{Din_HRDecoder_MICCAI2024,
        author = { Ding, Ziyuan and Liang, Yixiong and Kan, Shichao and Liu, Qing},
        title = { { HRDecoder: High-Resolution Decoder Network for Fundus Image Lesion Segmentation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15009},
        month = {October},
        pages = {328--338}
}

Reviews

Review #1

Please describe the contribution of the paper

The paper entitled “HRDecoder: High-Resolution Decoder Network for Fundus Image Lesion Segmentation” is well written. It extends existing HR image segmentation techniques to reduce the computation time and memory usage. The data shown in Table 1 for the comparisons with SOTA indicate that the HRDecoder improves the results in some situations; and the data shown in Table 2 indicate that the memory usage, computational overhead, and inference speed are still reasonable.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

This paper extends existing HR image segmentation techniques to reduce the computation time and memory usage. The data shown in Table 1 for the comparisons with SOTA indicate that the HRDecoder improves the results in some situations; and the data shown in Table 2 indicate that the memory usage, computational overhead, and inference speed are still reasonable. In general, I think the paper has a positive impact because the method may be applied to obtain better segmentation results in some situations.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

For the writing, the paper needs to be polished for publication. To make the paper easier for readers, K, σ, H, W, C, h, and w should be defined or explained before their usages. There are some typos in the paper.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Do you have any additional comments regarding the paper’s reproducibility?

N/A
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

For the writing, the paper needs to be polished for publication. To make the paper easier for readers, K, σ, H, W, C, h, and w should be defined or explained before they are used. There are some typos in the paper. Some complete sentences should end with a semicolon rather than a comma. In the following, the notation A –> B means that A should be changed to B.

In Section 2.1, maintain details, simultaneously original image –> maintain details; and simultaneously original image

Where –> where

Dice loss, –> Dice loss and

In Section 2.2, the following is not a complete sentence, and “is” should be changed to “are”: “Furthermore, given that the resource overheads of decoder is significantly lower than that of the encoder.”

In the title of Section 3.1, Implemention –> Implementation

protocols, and we set –> protocols; and we set

FPS in Table 2 is not defined.

In Section 3.2, comparable or or even –> comparable or even

In Section 3.3, the following is not a complete sentence: “While too many HR crops may lead the model to focus excessively on tiny lesions e.g. MA and neglect slightly larger lesions e.g. SE.”

The following sentence may be confusing: “Fig. 3b illustrates the influence of the scale ratio (1-δ,1+δ).” Does the sentence mean that the scale ratio is in the interval (1-δ,1+δ)? If so, then why should the scale ratio be in that interval?

weighted fusion is favored, so we –> weighted fusion is favored, we

In Section 4, the following two sentences: “Certain limitations do exist: 1.Simple feature fusion is specifically designed for fundus images, while scale attention may find broader applicability in various tasks. 2.Only a simple CNN-based decoder is short in capturing contextual information.” may need to be changed to: “There are certain limitations as follows: simple feature fusion is specifically designed for fundus images while scale attention may find broader applicability in various tasks; only a simple CNN-based decoder is short in capturing contextual information.”

we believe our approach provides –> we believe that our approach provides
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Weak Accept — could be accepted, dependent on rebuttal (4)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper extends existing HR image segmentation techniques to reduce the computation time and memory usage. The data shown in Table 1 for the comparisons with SOTA indicate that the HRDecoder improves the results in some situations; and the data shown in Table 2 indicate that the memory usage, computational overhead, and inference speed are still reasonable. In general, I think the paper has a positive impact because the method may be applied to obtain better segmentation results in some situations. For the writing, the paper needs to be polished for publication.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

This manuscript presents a novel approach for fundus lesion segmentation from OCT images. The method is technically sound, and all three main contributions are properly validated. The presentation of all the core parts of the manuscript is also very good and clear. The developed method is rather generic, with potential for application to a range of similar problems.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

• Technically-sound method, very well presented
• All the contributions are properly validated, extensive ablation studies performed
• The figures are very informative and easy to read
• Can potentially, with minor adjustments, be applied to different data
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

• The manuscript needs proof-reading
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Do you have any additional comments regarding the paper’s reproducibility?

N/A
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
1. The manuscript, including the supplementary section, needs additional proof-reading.
2. The number of classes K is used from the beginning of Section 2, but is introduced only much later, on the next page.
3. On page 5 the reference to the figure number is missing. The value assigned to the λ weight is repeated at the bottom of this page, which I find unnecessary.
4. On page 6, on the second-to-last line, the authors mention “methods [26,1]”, while there is no “[1]” method in the corresponding table.
5. The supplementary material is not referenced in the main manuscript; please fix this by providing corresponding pointers.
6. The authors might want to duplicate the title of the vertical axis on the right axis as well; this will, in my opinion, improve understanding.
7. The captions of the supplementary figure and the tables can be improved by adding information about, e.g., the corresponding validation metric(s); in their current form, the reader has to fish this information out of the text.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Strong Accept — must be accepted due to excellence (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This is a very strong and complete submission, with very minor points for improvement
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

This paper proposes a framework to segment lesions in fundus images. The novelty is in the decoder, where one module learns small features from upscaled feature maps and another module performs multi-scale fusion. Evaluation on public datasets shows performance on par with the state of the art while having significantly lower computational demands.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- the approach is evaluated comprehensively against state of the art methods on public datasets and outperforms most methods while keeping computational demands low
- the decoder architecture is simple but effective
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

I think the paper is very well written and evaluated. I don’t see any major weaknesses besides the mixing of the introduction and methods sections. For clarity and to save space, the introduction could be shortened and the parts describing the method could be merged into the methods section. That would allow more evaluation details from the supplementary material to become part of the paper.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Do you have any additional comments regarding the paper’s reproducibility?

N/A
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

The paper is well written and understandable. I’m surprised that the method works so well despite mainly relying on upscaling the feature map after the encoder. My assumption would be that scaling the original input image by a factor of 2 still preserves most of the fine details in the original image for the method to work. The hyperparameter lambda has a missing figure reference that is probably referring to the figure in the supplementary material. The influence of the HR loss weight could be integrated into the main paper for better clarity. Overall, the method is very well evaluated and it is clear that it strikes a good balance between accuracy and speed.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Strong Accept — must be accepted due to excellence (6)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper proposes a simple yet effective framework, which is comprehensively evaluated and compared to state-of-the-art methods. The results show a very good balance between accuracy and computational demand, which is very useful for practical application.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Author Feedback

We sincerely appreciate the insightful reviews of our paper. We are pleased by the reviewers’ acknowledgment of its contribution. We highly value the constructive comments regarding the writing quality, and we will carefully address each suggestion to improve the quality of our paper.

For the writing. (#R1, #R3, #R4) We acknowledge the need for further polishing to enhance clarity and readability. We will make the suggested corrections, including refining incomplete sentences and correcting grammatical errors. (#R1, #R3) We will carefully revise the captions and contents of some figures to improve understanding for readers. (#R3) We also recognize the importance of presenting more detailed information within the main part of the paper. We will try to reorganize the introduction, method, and evaluation sections of our paper to provide a more comprehensive understanding of our approach. (#R4)

For the missing figure reference on page 5, Section 2.2. (#R3, #R4) We sincerely apologize for not thoroughly proofreading our manuscript before submission, which resulted in this mistake. As mentioned by #R4, the missing reference corresponds to Fig. S1 in the supplementary material. Due to space limitations, we moved the discussion on λ from the main paper to the supplementary material and inadvertently left the reference behind in the main paper. We will take the reviewer’s advice and move this part to the main paper. We will also restructure the main paper to make it more informative and understandable.

“Fig. 3b illustrates the influence of the scale ratio (1-δ,1+δ).” Does the sentence mean that the scale ratio is in the interval (1-δ,1+δ)? If so, then why should the scale ratio be in that interval? (#R1) Yes, the sentence means that the scale ratio lies in the interval (1-δ,1+δ). This design is consistent with the random crop strategy used during the pipeline stage. We vary the crop size within a certain range so that the cropped feature blocks can capture features at diverse scales. If δ is set too large, the size of the cropped feature maps varies significantly: crops that are too small may fail to capture the details of large lesions, while crops that are too large make it challenging to learn features of small lesions. In other words, an overly large δ introduces more uncertainty, whereas an overly small δ inhibits the model from learning multi-scale features. We therefore carried out the ablation study shown in Fig. 3b and finally set δ=0.25, which achieved the best result.
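The scale-ratio sampling described in this response can be illustrated with a minimal sketch. It is not taken from the HRDecoder repository: the function names `sample_crop_size` and `sample_crop_box`, the uniform distribution, the use of a single shared ratio for height and width, and the example sizes are all assumptions made for illustration. Only the interval (1-δ,1+δ) and the value δ=0.25 come from the text above.

```python
import random


def sample_crop_size(base_size, delta=0.25, rng=random):
    """Scale a base (height, width) by a ratio drawn from [1 - delta, 1 + delta].

    Assumption: the ratio is uniform and shared by both dimensions.
    """
    if not 0 <= delta < 1:
        raise ValueError("delta must be in [0, 1)")
    ratio = rng.uniform(1 - delta, 1 + delta)
    h, w = base_size
    return max(1, round(h * ratio)), max(1, round(w * ratio))


def sample_crop_box(feat_size, base_size, delta=0.25, rng=random):
    """Pick a random crop box (top, left, height, width) inside a feature map.

    feat_size and base_size are hypothetical (height, width) tuples.
    """
    H, W = feat_size
    h, w = sample_crop_size(base_size, delta, rng)
    # Clamp so the crop never exceeds the feature map.
    h, w = min(h, H), min(w, W)
    top = rng.randint(0, H - h)
    left = rng.randint(0, W - w)
    return top, left, h, w


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_crop_box((128, 128), (64, 64), delta=0.25, rng=rng))
```

With a base size of 64 and δ=0.25, the sampled crop sides stay between 48 and 80. Under these assumptions, a larger δ widens that range (more size variability), while δ=0 fixes every crop at the base size and removes the multi-scale effect.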

Meta-Review

Meta-review not available, early accepted paper.

HRDecoder: High-Resolution Decoder Network for Fundus Image Lesion Segmentation

Author(s):