Abstract

In large-scale electron microscopy (EM), the demand for rapid imaging often results in significant amounts of imaging noise, which considerably compromises segmentation accuracy. While conventional approaches typically incorporate denoising as a preliminary stage, there is limited exploration into the potential synergies between denoising and segmentation processes. To bridge this gap, we propose an instance-aware interaction framework to tackle EM image denoising and segmentation simultaneously, aiming at mutual enhancement between the two tasks. Specifically, our framework comprises three components: a denoising network, a segmentation network, and a fusion network facilitating feature-level interaction. Firstly, the denoising network mitigates noise degradation. Subsequently, the segmentation network learns an instance-level affinity prior, encoding vital spatial structural information. Finally, in the fusion network, we propose a novel Instance-aware Embedding Module (IEM) to utilize vital spatial structure information from segmentation features for denoising. IEM enables interaction between the two tasks within a unified framework, which also facilitates implicit feedback from denoising for segmentation with a joint training mechanism. Through extensive experiments across multiple datasets, our framework demonstrates substantial performance improvements over existing solutions. Moreover, our framework exhibits strong generalization capabilities across different network architectures. Code is available at https://github.com/zhichengwang-tri/EM-DenoiSeg.
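As a rough illustration of the core idea (not the paper's actual implementation), a minimal PyTorch-style sketch of an instance-aware fusion step is given below: pixel-wise cross-attention in which queries come from denoising (image) features and keys/values come from segmentation (affinity) features. The channel width, single-head layout, and residual fusion are illustrative assumptions.

import torch
import torch.nn as nn

class InstanceAwareEmbeddingSketch(nn.Module):
    # Illustrative pixel-wise cross-attention between image (denoising) features
    # and segmentation (affinity) features; hyperparameters are assumptions.
    def __init__(self, channels=32):
        super().__init__()
        self.to_q = nn.Conv2d(channels, channels, kernel_size=1)  # queries from image features
        self.to_k = nn.Conv2d(channels, channels, kernel_size=1)  # keys from segmentation features
        self.to_v = nn.Conv2d(channels, channels, kernel_size=1)  # values from segmentation features
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.scale = channels ** -0.5

    def forward(self, img_feat, seg_feat):
        b, c, h, w = img_feat.shape
        q = self.to_q(img_feat).flatten(2).transpose(1, 2)   # (b, h*w, c)
        k = self.to_k(seg_feat).flatten(2)                    # (b, c, h*w)
        v = self.to_v(seg_feat).flatten(2).transpose(1, 2)    # (b, h*w, c)
        attn = torch.softmax((q @ k) * self.scale, dim=-1)    # (b, h*w, h*w)
        fused = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return img_feat + self.proj(fused)                    # residual fusion into image features

# Toy usage on small feature maps (full-resolution attention would be memory-heavy):
iem = InstanceAwareEmbeddingSketch(channels=32)
img_feat = torch.randn(1, 32, 64, 64)   # features from the denoising branch
seg_feat = torch.randn(1, 32, 64, 64)   # affinity features from the segmentation branch
out = iem(img_feat, seg_feat)           # shape (1, 32, 64, 64)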

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1351_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1351_supp.pdf

Link to the Code Repository

https://github.com/zhichengwang-tri/EM-DenoiSeg

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Wan_Joint_MICCAI2024,
        author = { Wang, Zhicheng and Li, Jiacheng and Chen, Yinda and Shou, Jiateng and Deng, Shiyu and Huang, Wei and Xiong, Zhiwei},
        title = { { Joint EM Image Denoising and Segmentation with Instance-aware Interaction } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15007},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a novel framework that integrates denoising and segmentation for electron microscopy (EM) images using an instance-aware interaction approach. It combines a denoising network, segmentation network, and fusion network, allowing cross-domain interaction to improve task synergy. The instance-aware embedding module (IEM) facilitates the fusion of semantic and image features, enhancing the joint denoising and segmentation process. Extensive experiments across public benchmarks demonstrate the framework’s effectiveness, showing significant performance improvements over existing solutions, with potential applications in various EM imaging scenarios.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The paper introduces a novel framework that combines denoising and segmentation in electron microscopy (EM) images. By merging these two tasks, the framework creates a unique approach that capitalizes on the synergy between them. 2) The inclusion of an Instance-aware Embedding Module (IEM) for fusing semantic and image features adds novelty to the method. This module enhances the interaction between the denoising and segmentation networks, leading to better results. 3) Extensive experiments are conducted on various public benchmarks, demonstrating significant performance improvements in both denoising and segmentation metrics, including PSNR, SSIM, VOI, and ARAND. 4) The paper is well-structured and the writing style contributes to the paper’s readability.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The dual-domain framework with instance-aware embedding could result in high computational costs. The authors do not provide detailed information on the total number of parameters or the computational costs. 2) The paper lacks information about implementation details, such as specific hyperparameters and training configurations. 3) Discrepancies in the results in Table 3. Comparing the results on CREMI-C in Table 1 with those in Table 3 reveals discrepancies in PSNR and SSIM. This inconsistency highlights the need for a more detailed analysis of the parameter counts of the different network architectures.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Reproducibility concerns arise from the lack of detailed implementation information, such as the total number of parameters and specific hyperparameters (such as training details).

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1) The generalization experiments on different network architectures are a strong point of the paper. However, more discussion on the robustness of the proposed method to varying levels of noise and different datasets would strengthen the claims of generalization. 2) The paper mentions using a dual U-Net architecture for the denoising and segmentation networks (as in Tables 1, 2, and 4). However, a more detailed explanation of the specifics of this U-Net architecture is needed; this information could be illustrated in a diagram. 3) Understanding the computational cost is essential for evaluating the practicality of the proposed framework. The authors could analyze the computational cost of their framework to demonstrate its advantage over state-of-the-art methods.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents an interesting approach to integrating denoising and segmentation, which is a novel idea in the context of large-scale electron microscopy (EM). However, some points need to be addressed, such as: 1) Inconsistencies in results: as noted, there are discrepancies in the results when comparing different network combinations. This inconsistency raises questions about the robustness of the proposed framework and requires further investigation. 2) Lack of detailed analysis: the computational cost of each component and the total parameter count for the framework are not thoroughly analyzed.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposes the joint training of segmentation and denoising tasks, achieving mutual gains between the two.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors argue that there is a gap in the ‘denoising–segmentation’ pipeline and subsequently propose a framework that jointly trains the two tasks simultaneously; the experiments demonstrate the superiority of this framework.
    2. Based on an attention mechanism, the authors design an instance-aware interaction framework to achieve interaction between denoising and segmentation.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Weak innovation: the idea of interaction between denoising and high-level vision tasks, including segmentation, has already been proposed in reference [13]: D. Liu, B. Wen, X. Liu, Z. Wang, and T. S. Huang. When image denoising meets high-level vision tasks: A deep learning approach. arXiv preprint arXiv:1706.04284, 2017. Please highlight the difference between this article and the work in reference [13].
    2. Lack of comparison with existing denoising methods.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    There are a few grammar errors present. For example, ‘Traditionally, methods for image denoising and segmentation have treated these tasks independently [18,6,23,1]’: this sentence is grammatically incorrect and somewhat perplexing, and might be better replaced with an expression like ‘Traditionally, image denoising and segmentation techniques have been developed and applied in isolation, with each task being approached separately.’

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. Currently, there is relatively little research on denoising and segmentation of EM data; this paper's exploration of the modality is meaningful.
    2. Lack of innovation in the method: the idea of jointly training denoising and high-level vision tasks, including segmentation, has been proposed in previous work [13]: D. Liu, B. Wen, X. Liu, Z. Wang, and T. S. Huang. When image denoising meets high-level vision tasks: A deep learning approach. arXiv preprint arXiv:1706.04284, 2017.
  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper introduces a joint segmentation and denoising neural network for electron microscopy. This network is built using a novel instance-aware embedding block. The results show that interactive segmentation and denoising improve both tasks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The IEM block uses self-attention to combine both denoising and segmentation features; this is novel. The experiments conducted are very thorough, cover many ablations, and compare against the state of the art. The results look very promising to me.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some statistical testing would help to solidify the performance of the methods. I feel Figures 1(a) and 1(b) could be split into two larger images. There is a lot going on in Figure 1(a), and it is very important to understanding the work. The description of the backbones could be improved; as it stands, I am unclear where the IEM is added. This could also be shown in a larger Figure 1(a).

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Code is not available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Consider whether any other tasks could also be integrated to improve performance. Improve the figure and the description in Section 2.2.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think the results speak for themselves, and the method has enough novelty.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We thank the reviewers for giving a positive consensus on our work. We further address your concerns below.

R1 & R2 & R3: Reproduction & Source code.  We thank all the reviewers for raising the important issue of reproducibility. We will release the code along with the camera-ready version to benefit the community and to help others build upon our work. Regarding the concerns about the training configurations and specific hyperparameters, we will also provide complete details in the open-source code.

R1: Discrepancies of results in Table 3. Thanks for pointing out the discrepancies. As mentioned in Section 3.3, ‘Due to the high memory demands of transformer models, Table 3 adopts a tile-based testing strategy (256 × 256), resulting in a marginal decline in denoising performance compared to Table 1.’ To be specific, for the models evaluated in Table 3, the tile-based testing strategy was adopted across all cases to ensure a fair comparison. To clarify, this strategy involves splitting the full-sized input image into non-overlapping tiles of 256 × 256 resolution, processing each tile independently, and then stitching the outputs back together to reconstruct the full image. In contrast, for Table 1, the models were evaluated on the full-sized images directly during the testing phase, without splitting them into tiles.
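For clarity, a minimal sketch of such a tile-based testing loop is shown below. It assumes the image dimensions are divisible by the tile size and omits overlap/padding handling; it illustrates the described strategy rather than reproducing the exact evaluation code.

import torch

def tile_inference(model, image, tile=256):
    # Split a full-sized input (b, c, H, W) into non-overlapping tiles,
    # run the model on each tile independently, and stitch the outputs back.
    _, _, h, w = image.shape
    output = torch.zeros_like(image)
    with torch.no_grad():
        for y in range(0, h, tile):
            for x in range(0, w, tile):
                patch = image[:, :, y:y + tile, x:x + tile]
                output[:, :, y:y + tile, x:x + tile] = model(patch)
    return output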

R1: Details on model parameters and computational cost. For our framework, the main increase in computational cost originates from the fusion network. Our advantage is that during testing, if only the segmentation result is needed, we do not need to run inference through the fusion network, thus avoiding any additional computational overhead. Compared to the classic U-Net, our fusion network introduces additional convolutional layers for the transformation of affinity features, as well as two Instance-aware Embedding Modules (IEMs) that perform pixel-wise attention between image and semantic features. The total number of parameters of the fusion network is approximately 1.981M, and its computational complexity is around 27.220G FLOPs for a 256 × 256 input image.
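For reference, a parameter figure such as the ~1.981M quoted above can be reproduced with a short PyTorch helper like the one below; the toy module in the usage example is a stand-in, not the actual fusion network, and FLOPs would be measured separately with a profiling tool.

import torch.nn as nn

def count_parameters_m(module):
    # Trainable parameter count in millions.
    return sum(p.numel() for p in module.parameters() if p.requires_grad) / 1e6

# Example on a stand-in module (not the actual fusion network):
toy_fusion = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 1, kernel_size=3, padding=1),
)
print(f"{count_parameters_m(toy_fusion):.3f}M parameters")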

R1 & R2 & R3: More explanation, correction of grammar errors, and clearer illustration. Thanks for your careful review. We will fully revise the manuscript to correct all grammar errors and typos, add more explanation of the design and related work, and clarify the illustrations as suggested in future versions.

Again, we appreciate all reviewers’ suggestions, which will help us further improve the quality of the paper.




Meta-Review

Meta-review not available, early accepted paper.


