Abstract

Volumetric optical imaging is an essential tool for understanding various biological processes. However, due to inherent limitations such as long imaging time, volume scanning techniques reduce volumetric information into sparse 2D slices. Although many deep learning methods attempt to reconstruct 3D volumes from sparse slices, they struggle with out-of-distribution (OOD) data, which arise from the diversity of biological structures, and with the limited structural information in sparse slices. To overcome these challenges, we propose Sparse3Diff, a novel diffusion-based framework that reconstructs high-fidelity 3D volumes from sparse 2D slices. Sparse3Diff incorporates a sparse slice-guided position-aware diffusion process that utilizes sparse slices as guidance and conditions on z-position to maintain structural coherence along the z-axis. Additionally, to achieve stable reconstruction under sparse OOD data, we propose a self-alignment strategy that enables the model to be gradually fine-tuned by leveraging its own inferred slices as self-guidance. Experimental results demonstrate that even with sparse OOD data, Sparse3Diff achieves accurate 3D reconstruction and remains robust across various scanning datasets.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/5171_paper.pdf

SharedIt Link: https://rdcu.be/eHaYv

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04965-0_48

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{LeeHyu_Sparse3Diff_MICCAI2025,
        author = { Lee, Hyun Jung AND Jo, Eunjung AND Lim, Minjoo AND Son, Young-Han AND Kang, Bogyeong AND Nam, Hyeonyeong AND Jeong, Ji-Hoon AND Shin, Dong-Hee AND Kam, Tae-Eui},
        title = { { Sparse3Diff: A Diffusion Framework for 3D Reconstruction from Sparse 2D Slices in Volumetric Optical Imaging } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15963},
        month = {September},
        pages = {510--519}
}

Reviews

Review #1

Please describe the contribution of the paper

This paper introduces a new diffusion-based framework for reconstructing 3D volumes from sparse 2D slices by generating intermediate slices between sparse slices. The model integrates z-position information to enhance spatial correspondence between slices, and a self-alignment strategy that fine-tunes the pre-trained model to sufficiently use sparse slices for slice generation. The experiments on three datasets evaluated its effectiveness for sparse 3D volume reconstruction in optical imaging.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The idea of incorporating z-position for conditional diffusion modeling is novel for this z-axis sparse-to-dense reconstruction application.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

1. The paper is not well illustrated or explained. For instance, why is the V-loss able to preserve structural details more effectively, given that V(X_t) is just a combination of noise and the intermediate slice? And what happens if the prediction errors of the intermediate slices accumulate? 2. The self-alignment strategy is difficult to understand in both Fig. 1 and the text, and its rationale is not clearly explained.
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

1. In Fig. 1(B), the flowchart should be refined to make the self-alignment strategy easier to read and understand. 2. The authors should explain why the parameters are set to 10000 in Equation (3).
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(2) Reject — should be rejected, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The novelty and clarity of the model are limited, considering the existing application of diffusion models in image synthesis.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Reject
[Post rebuttal] Please justify your final decision from above.

The novelty and clarity of the method are limited, considering the existing application of diffusion models in image synthesis.

Review #2

Please describe the contribution of the paper

The authors introduce Sparse3Diff, a diffusion-based deep learning framework designed for reconstructing 3D volumes from sparse 2D slices, specifically addressing the challenges inherent in volumetric optical imaging. To effectively handle sparse out-of-distribution (OOD) data, the authors propose a self-alignment strategy, which leverages self-guided fine-tuning to gradually align the model to unfamiliar datasets. The diffusion model exploits similarities between neighboring slices and the target slice to infer intermediate layers, guided by a conditioning factor c, which encodes slice positions along the z-axis. While the position-aware conditional model ensures structural coherence at precise z-positions, the unconditional model focuses on reconstructing fine details. By combining these two approaches, the authors demonstrate improved reconstruction quality.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. This approach leverages an explicitly z-position conditioned encoding diffusion model with classifier-free guidance (CFG) for sparse slice–guided 3D reconstruction, effectively addressing the challenge of reconstructing intricate structures from a minimal set of sparse slices.
2. The authors introduce a self-alignment strategy tailored for sparse and OOD data. By iteratively using the model’s own inferred slices as self-guidance during the fine-tuning of the pre-trained model, this approach enhances robustness against unfamiliar biological structures.
3. In the evaluation, the authors demonstrate the performance of Sparse3Diff across multiple datasets containing diverse biological structures, clearly showcasing its robustness and superiority over existing methods.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. I appreciate the innovative approach of Sparse3Diff in handling sparse and OOD data. However, the manuscript describes discarding the conditioning factor ‘c’ with a specific probability during training. It would be helpful if the authors could provide more details or guidelines on how this probability is determined or optimized.
2. The self-alignment strategy proposed for handling sparse OOD data is an interesting and promising approach. However, I am curious about the reliance on the quality of the initially generated intermediate slices. If the initial inference of these slices is not accurate, how does the model correct or improve them in subsequent iterations? It would be beneficial if the authors could provide insights or experimental evidence on how robust the model is to initial inaccuracies and how effectively it self-corrects through fine-tuning.
3. I noticed that Sparse3Diff achieves higher SSIM but lower PSNR compared to INR and MicroDiffusion for DNA 3D volumes. Given that SSIM and PSNR often correlate, could the authors elaborate on why this discrepancy might occur?
4. The authors provide a comprehensive explanation of Sparse3Diff, from problem formulation to experimental results. However, I believe the study would benefit from ablation experiments to assess the contribution of individual components, such as the impact of the conditioning factor ‘c’ and the linear combination of conditional and unconditional results.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The manuscript presents a compelling idea by employing a self-alignment strategy for sparse OOD data—a form of self-consistent training. However, this approach can easily collapse when the constraints are insufficient because reconstructing a 3D volume from limited slices is inherently an ill-posed problem. More details should be disclosed to ensure reproducibility, and it is recommended that the code be provided.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #3

Please describe the contribution of the paper

The authors propose “Sparse3Diff”, a diffusion-based framework for reconstructing 3D volumes from sparse 2D slices by generating the missing intermediate slices. Sparse3Diff addresses the key limitations in volumetric imaging, such as long acquisition times, low temporal resolution, and cellular damage, by eliminating the need for full 3D scans. To overcome the loss of structural continuity caused by sparse sampling, the authors introduce a position-aware diffusion process guided by the sparse slices and their z-positions, ensuring structural coherence throughout the volume. Additionally, a self-alignment mechanism enables the model to fine-tune itself using its predictions, improving robustness to out-of-distribution (OOD) data. The authors validate their method on two scanning datasets, showing strong performance and robustness across both, with potential impact across biomedical imaging applications.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The authors present Sparse3Diff, a diffusion-based approach for reconstructing 3D volumes from sparse 2D slices by generating intermediate slices. This formulation is novel and directly addresses the challenge of missing volumetric information in volumetric imaging caused by imaging constraints such as long acquisition times, low temporal resolution, and potential cellular damage. A unique contribution of the work is its use of a position-aware or self-guided diffusion process that leverages sparse slices as guidance and incorporates z-position information. This ensures structural coherence along the z-axis, a critical aspect of high-fidelity 3D reconstruction. The proposed self-alignment method enables the model to iteratively fine-tune itself using its own inferred slices, enhancing robustness, especially in the presence of OOD data. The self-guidance approach is both elegant and practical. Sparse3Diff is validated on two scanning datasets, showing effective reconstruction results and robustness across different domains, which underscores the generalizability of the method. The method offers a generalizable and data-efficient solution to 3D reconstruction with potential impact across various biomedical and medical applications where acquiring dense volumetric data is difficult.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

The authors did not discuss any limitations of the proposed method. While the authors claim their approach outperforms existing techniques such as INR and MicroDiffusion on simulated non-diffracting beam datasets, this is not consistently reflected in the reported results. For example, in Figure 4, the MicroDiffusion method yields a higher PSNR than both INR and the proposed method on the Neuron dataset. Additionally, in Tables 1 and 2, INR and MicroDiffusion achieve either the best or second-best results in SSIM and PSNR metrics across DNA, Membrane, and sparse OOD datasets such as Neuron. These competitive results are not acknowledged or discussed by the authors. Furthermore, although the authors emphasize the challenges of long acquisition time, low temporal resolution, and cellular damage in volumetric imaging, they do not report or compare the computational or processing time of their method relative to existing approaches, which is critical for evaluating its practical impact.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

The authors should revise the in-text citation numbering to ensure they appear in the correct sequential order.

In Figure 3(b), the 3D visualization of the vasculature reveals a slight difference between the reconstruction produced by the proposed method and the ground truth. It would be helpful for the authors to comment on or justify these discrepancies, particularly in light of their claim that the self-alignment strategy effectively adapts the model to sparse OOD data.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The authors introduce a novel and impactful framework for 3D reconstruction from sparse 2D slices. The use of a sparse slice-guided position-aware diffusion process, combined with a self-alignment strategy for handling OOD data, is both innovative and well-executed. These contributions directly address critical challenges in volumetric imaging, such as long acquisition times and missing intermediate slices. While there are minor inconsistencies in how the performance results are discussed, for example, some baselines outperform the proposed method in select metrics, these do not overshadow the overall strength of the method or its demonstrated robustness across datasets. Its contributions are likely to inspire further research and practical application, which strongly supports acceptance.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

The authors addressed the major concerns raised in the initial review. Specifically, they clarified discrepancies in performance metrics by contextualizing the comparative results of INR-based models, particularly concerning dataset characteristics such as sparsity and noise levels. Their explanation helped justify why the proposed method performs competitively in certain cases and where trade-offs exist. Additionally, the authors acknowledged the limitations of their approach and included a thoughtful discussion of scenarios where alternative methods outperform theirs. They also responded to concerns regarding computational efficiency by providing runtime comparisons and demonstrating that their method offers a practical balance between accuracy and speed for volumetric imaging tasks. Their thorough and constructive response improves the clarity and transparency of the work. Combined with the novelty of their approach and its potential impact on non-diffracting beam reconstruction, I believe the paper is now suitable for acceptance.

Author Feedback

We sincerely appreciate the reviewers for their time and constructive feedback.

(R1&R2) SELF-ALIGNMENT (SA): We clarify that our proposed SA strategy fine-tunes the pre-trained model on sparse OOD data by leveraging “inferred intermediate slices X̂” as self-guidance. This is one of the main contributions of our work, as this approach has not been previously explored in any diffusion-related studies. We acknowledge that the original manuscript may have lacked sufficient clarity on how SA avoids error accumulation.

To address this, we provide further clarification based on Sec 2.3:
Step 1: We utilize a pre-trained model to infer X̂, using the available sparse OOD slices 𝙎 as guidance.
Step 2: To align the pre-trained model to the OOD dataset, we fine-tune it using X̂ as self-guidance. Specifically, we treat X̂ as pseudo-sparse slices and apply the diffusion process described in Sec 2.2.

Note that, unlike in Sec 2.2, we do not compute the V-loss over all intermediate slices during this diffusion process. Instead, we compute the V-loss only for those intermediate slices that correspond to the same locations as 𝙎. This selective fine-tuning prevents error accumulation from X̂, ensuring the model is guided toward correction using the reliable sparse slices 𝙎. Empirically, this robustness is demonstrated in Tab 2, especially on the Neuron dataset, where SA consistently improves performance, confirming that our method does not suffer from error accumulation.
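
As an illustration only (not the authors' released code), the selective loss described above can be sketched as a mean-squared error restricted to the z-positions where a measured sparse slice exists. The function name `masked_v_loss`, the (Z, H, W) array layout, and the use of integer z-indices are assumptions made for this example.

```python
import numpy as np

def masked_v_loss(v_pred, v_target, sparse_z_indices):
    """Mean-squared error computed only at z-positions of measured sparse slices.

    v_pred, v_target: arrays of shape (Z, H, W), one 2D slice per z-position.
    sparse_z_indices: z-positions at which a real (measured) slice is available.
    All names and shapes here are hypothetical, chosen for illustration.
    """
    v_pred = np.asarray(v_pred, dtype=np.float64)
    v_target = np.asarray(v_target, dtype=np.float64)

    # Boolean mask over the z-axis: True where a measured slice exists.
    mask = np.zeros(v_pred.shape[0], dtype=bool)
    mask[list(sparse_z_indices)] = True

    # Slices without a measured counterpart contribute nothing to the loss,
    # so errors in the inferred pseudo-slices are not used as supervision.
    diff = v_pred[mask] - v_target[mask]
    return float(np.mean(diff ** 2))
```

With a volume of four slices and measured slices at z = 0 and z = 2, only those two positions enter the average; any mismatch at z = 1 or z = 3 is ignored.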

(R1) IMPLEMENTATION DETAIL: We apologize for the lack of clarity due to page limits and agree that additional explanation of the V-loss is necessary. V-loss defines the training target as a combination of noise and intermediate slices, which changes more consistently over timesteps compared to pure noise. This stabilizes training and helps the model better capture structural details.
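
For readers unfamiliar with this kind of target, the sketch below shows the widely used velocity (v-prediction) parameterization, in which the target mixes noise and the clean slice. Whether the paper's V-loss uses exactly this form is an assumption; the function names and the variance-preserving schedule (alpha_t² + sigma_t² = 1) are also assumptions for illustration.

```python
import numpy as np

def noisy_sample(x0, eps, alpha_t, sigma_t):
    """Forward diffusion sample x_t = alpha_t * x0 + sigma_t * eps."""
    return alpha_t * x0 + sigma_t * eps

def v_target(x0, eps, alpha_t, sigma_t):
    """Velocity target: a weighted combination of noise and the clean slice."""
    return alpha_t * eps - sigma_t * x0

def x0_from_v(x_t, v, alpha_t, sigma_t):
    """Recover the clean slice from x_t and v.

    Valid only when alpha_t**2 + sigma_t**2 == 1 (assumed schedule).
    """
    return alpha_t * x_t - sigma_t * v
```

At low noise (alpha_t close to 1) this target is close to the noise itself, and at high noise (alpha_t close to 0) it is close to the negated clean slice, so it varies smoothly between the two across timesteps.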

(R2&R3) SSIM/PSNR ANALYSIS: For the DNA dataset, our method achieves the highest SSIM but shows slightly lower PSNR compared to INR-based models (e.g., MicroDiffusion). This is because PSNR measures absolute pixel-wise accuracy, while SSIM better reflects perceptual quality by considering luminance, contrast, and structure (ref [9] in the manuscript). As shown in Fig 2, INR-based models tend to emphasize bright regions, leading to pixel intensities that more closely match the GT in some areas, resulting in higher PSNR. However, this often comes at the cost of fine structural details and overall contrast, explaining their lower SSIM. For the Neuron dataset, our method shows slightly lower performance compared to INR-based models. This is likely due to the dataset’s extremely sparse structures and low intensity values, which differ substantially from the base distribution used during pre-training.
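
A toy example can show how the two metrics may rank reconstructions differently. The sketch below is not taken from the paper: it uses a simplified single-window ("global") SSIM rather than the usual locally windowed version, and the tiny arrays are made up purely for illustration.

```python
import numpy as np

def psnr(ref, img, data_range=1.0):
    """Peak signal-to-noise ratio in dB, based on pixel-wise squared error."""
    ref = np.asarray(ref, dtype=np.float64)
    img = np.asarray(img, dtype=np.float64)
    mse = np.mean((ref - img) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))

def global_ssim(ref, img, data_range=1.0):
    """Simplified SSIM over the whole image (one window, no Gaussian weighting)."""
    ref = np.asarray(ref, dtype=np.float64)
    img = np.asarray(img, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_r, mu_i = ref.mean(), img.mean()
    var_r, var_i = ref.var(), img.var()
    cov = np.mean((ref - mu_r) * (img - mu_i))
    luminance = (2 * mu_r * mu_i + c1) / (mu_r ** 2 + mu_i ** 2 + c1)
    contrast_structure = (2 * cov + c2) / (var_r + var_i + c2)
    return float(luminance * contrast_structure)

# Made-up ground truth with alternating intensities (i.e. it has structure).
gt = np.array([0.4, 0.6, 0.4, 0.6])
shifted = gt + 0.15            # keeps the structure, but every pixel is too bright
flat = np.full_like(gt, 0.5)   # matches the mean intensity, but loses all structure
```

In this toy case `flat` has the smaller pixel-wise error and therefore the higher PSNR, while `shifted` preserves contrast and structure and therefore scores the higher (global) SSIM.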

(R2) CONDITIONAL FACTOR: We follow the Classifier-Free Guidance (CFG) approach by randomly discarding the conditioning factor c with a 10% probability during training, allowing the model to learn both conditional and unconditional generation. The impact of CFG is reflected in the results shown in Tab 1&2. Regarding the linear combination in Eq. 5, we set the guidance scale w=7.5, which is empirically known to offer a good trade-off—higher values improve condition alignment but often degrade image quality.
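
The two ingredients mentioned above, dropping the condition during training and linearly combining predictions at inference, can be sketched as follows. The 10% drop probability and w = 7.5 come from the rebuttal; the exact form of Eq. 5 is not reproduced on this page, so the commonly used combination `uncond + w * (cond - uncond)` is an assumption, as are the function names and the use of `None` as the null condition.

```python
import numpy as np

P_DROP = 0.1  # probability of discarding the condition c during training
W = 7.5       # guidance scale w used at inference

def maybe_drop_condition(c, rng, p_drop=P_DROP, null_condition=None):
    """Return the null condition with probability p_drop, otherwise return c."""
    return null_condition if rng.random() < p_drop else c

def guided_prediction(pred_cond, pred_uncond, w=W):
    """Linear combination of conditional and unconditional model outputs.

    Assumed form; the paper's Eq. 5 may be parameterized differently.
    w = 0 gives the unconditional output, w = 1 the conditional output,
    and w > 1 extrapolates further toward the condition.
    """
    pred_cond = np.asarray(pred_cond, dtype=np.float64)
    pred_uncond = np.asarray(pred_uncond, dtype=np.float64)
    return pred_uncond + w * (pred_cond - pred_uncond)
```

During training, `maybe_drop_condition(z_position, rng)` would be called once per sample so that the same network learns both conditional and unconditional prediction; at inference, both outputs are computed and merged with `guided_prediction`.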

(R3) LIMITATION & COST: As with other diffusion-based methods, our approach involves longer training and inference times than INR-based models, but overall shows stronger performance across datasets. In practical settings, acquiring a full 3D volume by laser scanning typically takes several hours (<6 mins per slice in the base dataset). In contrast, our method’s inference time is relatively short (<20 sec per slice), making it a practical solution for real-world volumetric imaging workflows.

Upon acceptance, all clarifications and minor comments will be addressed in the final version, and the code will be released for reproducibility.

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Reject
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

There is a clear consensus that substantial revisions are necessary; however, such major changes cannot be accommodated within the MICCAI review process. Therefore, the recommendation is rejection.

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Sparse3Diff: A Diffusion Framework for 3D Reconstruction from Sparse 2D Slices in Volumetric Optical Imaging

Author(s):