Abstract

Analysis and visualization of 3D microscopy images pose challenges due to anisotropic axial resolution, demanding volumetric super-resolution along the axial direction. While training a learning-based 3D super-resolution model seems to be a straightforward solution, it requires ground truth isotropic volumes and suffers from the curse of dimensionality. Therefore, existing methods utilize 2D neural networks to reconstruct each axial slice, eventually piecing together the entire volume. However, reconstructing each slice in the pixel domain fails to give consistent reconstruction in all directions, leading to misalignment artifacts. In this work, we present a reconstruction framework based on implicit neural representation (INR), which allows 3D coherency even when optimized by independent axial slices in a batch-wise manner. Our method optimizes a continuous volumetric representation from low-resolution axial slices, using a 2D diffusion prior trained on high-resolution lateral slices without requiring isotropic volumes. Through experiments on real and synthetic anisotropic microscopy images, we demonstrate that our method surpasses other state-of-the-art reconstruction methods.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2483_paper.pdf

SharedIt Link: https://rdcu.be/dV5Ey

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72104-5_57

Supplementary Material: N/A

Link to the Code Repository

https://github.com/hvcl/INR-diffusion

Link to the Dataset(s)

CREMI: https://cremi.org/data/
FIB25: https://www.janelia.org/tools-and-data-release
Fluorescence microscopy image of Zebrafish Retina: https://publications.mpi-cbg.de/publications-sites/7207/

BibTex

@InProceedings{Lee_Referencefree_MICCAI2024,
        author = { Lee, Kyungryun and Jeong, Won-Ki},
        title = { { Reference-free Axial Super-resolution of 3D Microscopy Images using Implicit Neural Representation with a 2D Diffusion Prior } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15007},
        month = {October},
        pages = {593--602}
}

Reviews

Review #1

Please describe the contribution of the paper

In this study, the authors introduce a reconstruction framework based on implicit neural representation (INR), which enables 3D coherence even when optimized independently by axial slices in a batch-wise manner by optimizing a continuous volumetric representation from low-resolution axial slices, and utilizing a 2D diffusion prior trained on high-resolution lateral slices, without the need for isotropic volumes.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1. The application is novel, as few works address super-resolution of 3D microscopy images. 2. The paper formulates isotropic volume reconstruction with an implicit neural representation as a straightforward optimization problem, which could be reused for other reconstruction tasks. 3. Although score-based diffusion is not new, using it as a prior adds some innovation to the work.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1. From a methodological point of view, the linear degradation process Ax in Eq. (1) doesn’t make much sense to me. In reality, the degradation process is far more complicated than a linear transform. The authors should state the conditions under which the method applies. 2. The results don’t support the claim of superior performance. In Table 1, LPIPS is a higher-level metric that aligns better with human perception than PSNR and SSIM, yet the table shows that TPDM is consistently better across the three directions. 3. The example images do not clearly show the superiority of the proposed method. In Figure 2, it is hard to say by eye that ‘Ours’ is better than ‘TPDM’. In Figure 3, the white arrow does show that TPDM has some artifacts, but ‘Ours’ has blurrier cell boundaries than ‘TPDM’.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?

N/A
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

I would suggest the following: 1. If the linear degradation model can’t be changed, showing more cases would help demonstrate that the method is applicable even with this linear transform. 2. Fine-tune the proposed model more intensively, which will hopefully improve the LPIPS results, and then present better example images. 3. The organization and writing are clear, and I appreciate that.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Weak Reject — could be rejected, dependent on rebuttal (3)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Although the proposed model is interesting and the application is innovative, since the microscopy field needs more work on super-resolution, the results shown in this paper do not demonstrate that the proposed method outperforms existing methods.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Weak Reject — could be rejected, dependent on rebuttal (3)
[Post rebuttal] Please justify your decision

Regarding the degradation process, I accept the rebuttal response but would encourage exploring more complicated processes in future work. Zooming in on Figure 2, I don’t agree that “we observed severe misalignment artifacts in the ZY view of the TPDM result”, as the other methods are even worse and the proposed method’s result is also blurry.

Review #2

Please describe the contribution of the paper

This paper introduces a microscopy axial super-resolution framework based on implicit neural representation (INR), which utilizes a pretrained 2D diffusion model as a prior/regularization, therefore eliminating the need for an isotropic 3D volume. As a result, the proposed method can effectively reconstruct the volume without the loss of 3D coherency.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The key contributions are:
1. An effective framework for microscopy axial super-resolution, which reconstructs the volume without the loss of 3D coherency and eliminates the need for an isotropic 3D volume.
2. A comprehensive evaluation of the proposed framework with related baselines on both simulated data and real-world cases. And the experimental results demonstrate the detailed and reliable reconstructions.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The main weaknesses are:
1. Comparisons among different priors R(x). This work proposes to use a pretrained 2D diffusion model as a prior/regularizer. Comparisons with other priors, such as TV, are neither discussed nor tested experimentally.
An ablation study comparing different priors is essential, since it would give a more comprehensive understanding and quantify the effectiveness of the proposed pretrained 2D diffusion prior.
2. Study of the hyper-parameter lambda. Lambda is introduced as a balancing hyper-parameter. However, there is no ablation study on its strength.
An ablation study on the impact of the hyper-parameter lambda would be appropriate.
3. Typos in equations. There are some minor typos in Eq. (1) and Eq. (2). In Eq. (1), vectors such as y_n and x are not bold-faced; if they are scalars, then the L2 norm is not a suitable operator in the equation. Besides, it is much better to put the x underneath the argmin, i.e., $\underset{\mathbf{x}}{\arg\min}$; otherwise, it may be confused with other subscripted notation. In Eq. (2), the expectation operator should contain the L2 norm only, without including the regularization term R(x).
It is advisable to correct these typos to improve the readability and understanding of this paper.
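For illustration, the notation suggested in point 3 could be typeset as follows. This is only a sketch: the paper's actual equations are not reproduced on this page, so the regularized least-squares form, the per-slice operator $\mathbf{A}_n$, and the sum over slices $n$ are assumptions based on the symbols named in this review (y_n, x, Ax, R(x), lambda).

```latex
% Assumed form of Eq. (1): bold vectors, optimization variable under the arg min
\hat{\mathbf{x}} = \underset{\mathbf{x}}{\arg\min}\; \sum_{n} \left\| \mathbf{y}_n - \mathbf{A}_n \mathbf{x} \right\|_2^2 + \lambda R(\mathbf{x})

% Assumed form of Eq. (2): the expectation wraps only the L2 data term
\hat{\mathbf{x}} = \underset{\mathbf{x}}{\arg\min}\; \mathbb{E}\!\left[ \left\| \mathbf{y}_n - \mathbf{A}_n \mathbf{x} \right\|_2^2 \right] + \lambda R(\mathbf{x})
```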
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?

No.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

Please refer to the weaknesses.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Weak Accept — could be accepted, dependent on rebuttal (4)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Please refer to the contributions and weaknesses.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Weak Accept — could be accepted, dependent on rebuttal (4)
[Post rebuttal] Please justify your decision

After reading the reviews from other reviewers and the rebuttal from the authors, I thought the authors didn’t fully address my concerns, where the hyper-parameter is chosen empirically, and they claimed that they observed that the proposed diffusion regularizer is effective compared to other regularizers, such as TV; to the best of my knowledge, the TV prior is the most popular choice for various INR methods. Therefore, I keep the ‘Weak Accept’ for this work.

Review #3

Please describe the contribution of the paper
- This paper presents a novel combination of utilizing a diffusion model and Implicit Neural Representations (INR) for the axial super-resolution of microscopy images. The authors describe a novel method aiming to enforce better 3D consistency when super-resolving 2D slices by using 3D INRs with 2D diffusion. For this, two separate losses (diffusion prior loss and consistency loss) are used in the INR reconstruction/super-resolution to achieve 3D consistency while working on a slice-per-slice basis.
- The authors compare their method against current state-of-the-art methods in microscopy super-resolution for single-channel and multi-channel datasets, which indicate superior quantitative and qualitative results, especially for consistency/boundary issues.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- To the best of my knowledge, the connection between using a diffusion model as a regularization method for INRs has not been made (at least not in (bio-)medical imaging), and it is not only exciting and relevant to the microscopy SR field but to other (bio-)medical INR applications as well.
- I really enjoyed the fact that the authors did both SR for single-channel and dual-channel experiments, which I believe highlights the utility of INRs, especially in the latter (which seemed more convincing to me with respect to the visual results).
- The conducted benchmarks are relevant and demonstrate the applicability and effectiveness of the paper.
- The paper is written in a clear, sound, concise, and comprehensive way, making it a charm to read, and I believe thus easier to reproduce.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
Major:
- The presented method does not outperform the established baselines in terms of LPIPS scores, which are widely regarded as highly indicative of human perception. Despite this, I find the method very interesting and relevant. In particular, the two-channel results demonstrate visual superiority, underscoring the potential utility of this approach. Moreover, the authors are very transparent about it, which I would like to highlight here (i.e. they were not obliged to report the LPIPS score.)
- I am puzzled by the relatively high LPIPS score for the presented method. I suggest the authors verify whether the same normalization was consistently applied across all baseline methods, as depicted in Figure 2. A review of these details might reveal why the LPIPS score isn’t as low (i.e., better) as one might expect for this method.
- The combined use of Fourier Features with SIREN is notably rare. Could you provide the rationale behind integrating these two techniques?
Minor:
- On page 4, the phrase “Often choices” should be revised to “A frequent choice for”.
- On page 5, modify [We use a same diffusion model for all the methods] to [We use the same diffusion model for all methods] to correct grammatical errors.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?
- I would like to encourage the authors to open-source their implementation either via anonymous GitHub in the rebuttal or in the camera-ready version. I believe their work is relevant to the microscopy imaging community, and it would be beneficial to share it.
- In terms of datasets, I checked and I was not ultimately sure if all datasets are readily available. I would ask the authors to comment on this, and to state if it would be even possible to share the downsampled images (which is not mandatory, but could be helpful for others).
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

(1) It is highly uncommon in the INR field to mix both Fourier Features (Tancik et al.) and SIREN (Sitzmann et al.), as Fourier Features are commonly seen as a 1-layer SIREN. Have you tried your approach with SIREN or FF only? If so, can you please comment on this? What is the motivation for choosing both?

(2) While I really like the utilized method and related work section, I feel that the paper could be improved by adding some more / prior INR works related to generative models (leveraging INRs) in the related work section. This would set it properly into the context of the INR field, however I do not see them as required baselines.

[1] Gao S, Liu X, Zeng B, Xu S, Li Y, Luo X, Liu J, Zhen X, Zhang B. Implicit diffusion models for continuous super-resolution. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 10021-10030).

[2] Chan ER, Monteiro M, Kellnhofer P, Wu J, Wetzstein G. pi-gan: Periodic implicit generative adversarial networks for 3d-aware image synthesis. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 5799-5809).

(3) Which LPIPS implementation are you currently using - you can also try to use the LPIPS weights from the MONAI repository, which have also been trained on medical images. (https://docs.monai.io/en/stable/losses.html)

(4) Utility of modeling the Point Spread Function (PSF) for future work:

Since you are dealing with highly anisotropic images, I would recommend looking into the modeling of the Point Spread Function (PSF) as in NesVor [3].

[3] Xu J, Moyer D, Gagoski B, Iglesias JE, Grant PE, Golland P, Adalsteinsson E. NeSVoR: Implicit neural representation for slice-to-volume reconstruction in MRI. IEEE Transactions on Medical Imaging. 2023 Jan 11.

(5) Quantitative Assessment of Superiority in Two-channel Reconstruction:

I personally believe that the two-channel reconstruction results are (visually) the most convincing experiment to demonstrate the utility of your method. Have you thought about using the FID score to assess this experiment?

(6) Can the authors comment (or even better: show in the supplementary materials) how they selected the hyperparameter settings for the INR method?
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Accept — should be accepted, independent of rebuttal (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
- This paper presents a novel method for combining 2D diffusion models with 3D INR-based consistency modeling, which constitutes an interesting method that may spark novel ideas in other biomedical INR applications and seems to be a promising solution for single- and multi-channel microscopy reconstruction.
- The paper is very transparent about all implementation details, including the experimental setup, e.g., the downsampling process, unfavourable metrics (LPIPS), and ablates their method on different datasets. The authors mention a lot of details regarding their training process, all used hyperparameters etc.
- The results for reconstruction in single-channel and two-channel microscopy imaging are convincing.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Accept — should be accepted, independent of rebuttal (5)
[Post rebuttal] Please justify your decision

The authors have adequately addressed my questions/concerns, and as the approach is very novel and interesting, I have decided to keep my score, as I believe this work is a great fit for MICCAI.

Author Feedback

Thank you for the valuable reviews and suggestions. Here are our responses:

Defining Problem as Linear degradation [R1, R4] Although the microscopy imaging process is much more complicated in the real world, we simplified it as a PSF convolution followed by a downsampling operator, both of which are linear, following previous works [1], [2]. Successful reconstruction of the two real-world datasets (CREMI, Retina) shows that this simplified approximation of the degradation process is applicable.
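The degradation model described above (PSF convolution followed by downsampling) can be sketched as below. This is a minimal illustration, not the authors' implementation: the Gaussian kernel standing in for the axial PSF, the function names `gaussian_kernel` and `degrade_axial`, and the values `factor=4` and `sigma=2.0` are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian kernel, used here as a stand-in for the axial PSF (assumption)."""
    t = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()

def degrade_axial(volume, factor=4, sigma=2.0):
    """Apply A: blur along the axial axis (axis 0), then keep every `factor`-th slice."""
    kernel = gaussian_kernel(sigma, radius=int(3 * sigma))
    blurred = np.apply_along_axis(
        lambda line: np.convolve(line, kernel, mode="same"), 0, volume
    )
    return blurred[::factor]

rng = np.random.default_rng(0)
x1 = rng.random((32, 8, 8))  # toy "isotropic" volumes, axes ordered (z, y, x)
x2 = rng.random((32, 8, 8))
y = degrade_axial(x1)        # anisotropic observation with 4x fewer axial slices

# Convolution and subsampling are both linear, so A(a*x1 + b*x2) == a*A(x1) + b*A(x2).
lhs = degrade_axial(2.0 * x1 + 3.0 * x2)
rhs = 2.0 * degrade_axial(x1) + 3.0 * degrade_axial(x2)
is_linear = np.allclose(lhs, rhs)
```

The linearity check at the end is the property the rebuttal relies on; a real microscope's degradation may include nonlinear effects that this sketch does not model.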

[1] Isotropic reconstruction of 3D fluorescence microscopy images using convolutional neural networks. MICCAI 2017
[2] DiffuseIR: Diffusion Models for Isotropic Reconstruction of 3D Microscopic Images. MICCAI 2023

Not superior results, especially in terms of LPIPS [R1] Although our method falls behind TPDM in LPIPS scores, we believe that our reconstruction is more reliable and closer to the ground truth because of its higher PSNR/SSIM. Even though TPDM's LPIPS scores are better, we observed severe misalignment artifacts in the ZY view of the TPDM result, which is reconstructed using the XY and ZX planes. Moreover, all the other methods, including TPDM, which directly uses the diffusion model, require 1000 reverse diffusion steps, while ours requires far fewer iterations, between 200 and 500. In addition, unlike TPDM, our method does not require backpropagating through the diffusion model during reconstruction, which leads to up to a 10-fold speedup depending on the choice of INR.

Ablation study on R(x) and weight lambda [R3, R4] Although the ablation study is not included in the main manuscript, we observed that the proposed diffusion regularizer is effective compared to other regularizers, such as total variation. The weight parameter lambda is chosen empirically. A small lambda (= 0.01) behaves similarly to “no regularization”. A large lambda imposes a strong diffusion prior that conflicts with the consistency loss and consequently fails to reconstruct. The best values we found are between 0.1 and 0.25.
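The role of lambda can be illustrated with a toy version of the regularized objective. This is only a sketch under stated assumptions: total variation is used as R(x) because it is the baseline regularizer named above (the diffusion prior itself cannot be reproduced in a few lines), plain slice subsampling stands in for the degradation operator, and the names `total_variation` and `objective` are hypothetical.

```python
import numpy as np

def total_variation(volume):
    """Anisotropic TV: sum of absolute differences between neighbouring voxels on every axis."""
    return float(sum(np.abs(np.diff(volume, axis=a)).sum() for a in range(volume.ndim)))

def objective(x, y, degrade, regularizer, lam):
    """Data-consistency term ||y - A x||^2 plus lam * R(x)."""
    residual = y - degrade(x)
    return float(np.sum(residual ** 2) + lam * regularizer(x))

# Toy problem: a step volume observed through 4x axial subsampling (stand-in for A).
x = np.zeros((8, 4, 4))
x[4:] = 1.0
y = np.zeros((2, 4, 4))
subsample = lambda v: v[::4]

# Sweep the weight over the magnitudes mentioned in the rebuttal.
losses = {lam: objective(x, y, subsample, total_variation, lam) for lam in (0.01, 0.1, 0.25)}
```

With a very small weight the regularizer contributes almost nothing to the total, which matches the “no regularization” behaviour described for lambda = 0.01.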

FFE+SIREN [R4] Fourier feature embedding (FFE) and periodic activation functions (SIREN) are both well known for capturing high-frequency details. Following other INR-based medical image reconstruction methods [3], [4], we adopt both techniques to capture fine details. We compared random Gaussian FFE with alternatives such as linear or exponential embeddings; however, these could not fit the data properly. Moreover, a Gaussian FFE with a small standard deviation (2, 4, 8), i.e., lower frequencies, fails to capture detail and generates blurry results. We empirically found that SIREN is more capable of learning texture-rich data (FIB) and is easier to train than a vanilla (ReLU activation) INR. Although the vanilla setting works, it produces blurrier results.
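The combination discussed above, a random Gaussian Fourier feature embedding followed by sine-activated layers, can be sketched as a forward pass in NumPy. This is not the authors' architecture: the layer sizes, the embedding standard deviation of 16, the frequency scale `w0=30`, and the weight initialization are illustrative assumptions, and the network is untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def fourier_features(coords, B):
    """Random Gaussian Fourier feature embedding: [cos(2*pi*coords@B), sin(2*pi*coords@B)]."""
    proj = 2.0 * np.pi * coords @ B
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

def sine_layer(h, W, b, w0=30.0):
    """SIREN-style layer: affine map followed by a sine activation with frequency scale w0."""
    return np.sin(w0 * (h @ W + b))

# Illustrative sizes (assumptions); B has one row per input coordinate (z, y, x).
n_freq, hidden, std = 64, 32, 16.0
B = rng.normal(0.0, std, size=(3, n_freq))
W1 = rng.uniform(-1.0, 1.0, size=(2 * n_freq, hidden)) / (2 * n_freq)
b1 = np.zeros(hidden)
W2 = rng.uniform(-1.0, 1.0, size=(hidden, 1)) * np.sqrt(6.0 / hidden) / 30.0
b2 = np.zeros(1)

def inr(coords):
    """Map continuous 3D coordinates in [0, 1]^3 to one intensity value each."""
    h = fourier_features(coords, B)
    h = sine_layer(h, W1, b1)
    return h @ W2 + b2

coords = rng.random((5, 3))
embedded = fourier_features(coords, B)
values = inr(coords)
```

Because the network takes continuous coordinates, it can be queried at any axial position, which is what lets a single representation be supervised slice by slice while remaining one coherent volume.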

[3] NeRP: implicit neural representation learning with prior embedding for sparsely sampled image reconstruction. IEEE TNNLS 2022
[4] Multi-contrast MRI Super-resolution via Implicit Neural Representations. MICCAI 2023

Reproducibility [R1, R3, R4] We plan to release the source code upon acceptance. The original datasets can be found via the official links of the following sources.

FIB: Synaptic circuits and their variations within different columns in the visual system of Drosophila. PNAS 2015
CREMI: https://cremi.org
Zebrafish Retina: Content-aware image restoration: pushing the limits of fluorescence microscopy. Nature Methods 2018

MONAI LPIPS, FID [R4] We were not aware of the MONAI LPIPS when submitting the paper, so we used the LPIPS metric with the pretrained default VGG which may not be optimal for microscopy images. However, we used this metric as a relative comparison metric. We thought that the FID score was less appropriate in our case to represent the reconstruction quality (it represents the generation quality of the generative model). Therefore, we introduced the visual results in Figs 2 and 3 for qualitative evaluation.

Meta-Review

Meta-review #1

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

This is an interesting paper that addresses an important problem in microscopy imaging. There are some issues with the quality of the produced results, but the interesting methodology still deserves to be presented at MICCAI.
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

This is an interesting paper that addresses an important problem in microscopy imaging. There are some issues with the quality of the produced results, but the interesting methodology still deserves to be presented at MICCAI.

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

N/A

Reference-free Axial Super-resolution of 3D Microscopy Images using Implicit Neural Representation with a 2D Diffusion Prior

Author(s):