Abstract
Interactive segmentation in medical imaging remains challenged by progressive loss of crucial interaction cues (click responsiveness, boundary fidelity) in deep networks. To address this limitation, we propose Interactive Kolmogorov-Arnold Network with adaptive modulation (IKAN), a unified framework that synergistically preserves interaction signals through spline-activated basis functions while enabling iterative anatomical refinement. The architecture achieves enhanced diagnostic fidelity by integrating three core components: hierarchical multi-scale feature extraction through Hierarchical Inception and Channel Attention Module (HICAM), dual-branch adaptive probability modulation for backbone/side-feature fusion, and click density-guided prediction sharpening. By dynamically correlating user-provided clicks with multi-modal data patterns, our method resolves ambiguous boundaries in complex clinical scenarios. Evaluated across OCT, BUSI, and AISD datasets, our method demonstrates enhanced segmentation accuracy in complex clinical scenarios, outperforming state-of-the-art approaches through systematic preservation and amplification of diagnostic interaction cues. The code is available online.
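For readers unfamiliar with the "spline-activated basis functions" mentioned in the abstract, the following is a minimal sketch of a KAN-style layer, in which every input-output edge carries its own learnable univariate function expressed as a weighted sum of spline basis functions. It is not the authors' implementation: the names `hat_basis` and `kan_layer`, the uniform grid, and the use of piecewise-linear (order-1) B-splines instead of higher-order splines are all assumptions made to keep the example short.

```python
import numpy as np

def hat_basis(x, grid):
    """Order-1 B-spline (hat) basis on a uniform grid.

    x: array of shape (n_in,), grid: array of shape (G,).
    Returns an array of shape (n_in, G).
    """
    h = grid[1] - grid[0]
    # Each basis function is 1 at its own knot and falls linearly to 0
    # at the neighbouring knots.
    return np.clip(1.0 - np.abs(x[..., None] - grid) / h, 0.0, None)

def kan_layer(x, coeffs, grid):
    """Hypothetical KAN-style layer: out[o] = sum_i phi_{o,i}(x[i]).

    coeffs has shape (n_out, n_in, G); phi_{o,i} is the spline whose
    knot values are coeffs[o, i, :].
    """
    x = np.clip(np.asarray(x, dtype=float), grid[0], grid[-1])
    basis = hat_basis(x, grid)                    # (n_in, G)
    return np.einsum("oig,ig->o", coeffs, basis)  # (n_out,)

# Illustrative check: if every edge function interpolates the identity,
# the layer simply sums its inputs.
grid = np.linspace(-1.0, 1.0, 5)
identity_coeffs = np.tile(grid, (1, 2, 1))        # shape (1, 2, 5)
out = kan_layer([0.3, -0.7], identity_coeffs, grid)
```

In a trained network the coefficients would be learned, so each edge function can take an arbitrary piecewise shape rather than the identity used here.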
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3869_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/Umbrellaliu/IKAN
Link to the Dataset(s)
BUSI dataset: https://github.com/hugofigueiras/Breast-Cancer-Imaging-Datasets
AISD dataset: https://github.com/GriffinLiang/AISD
BibTex
@InProceedings{LiuSih_IKAN_MICCAI2025,
author = { Liu, Sihan and Wan, Tonghua and Cai, Yuxin and Chen, Shengcai and Hu, Bo and Wan, Yan and Qiu, Wu},
title = { { IKAN: Interactive KAN with Modulation Fusion for Medical Image Segmentation } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15960},
month = {September},
pages = {272 -- 281}
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper introduces IKAN, a unified framework for interactive segmentation in medical imaging that addresses the loss of crucial interaction cues like click responsiveness and boundary fidelity. IKAN employs spline-activated basis functions to preserve interaction signals and integrates three core components: hierarchical multi-scale feature extraction via channel-attentive HICAM modules, dual-branch adaptive probability modulation for feature fusion, and click density-guided prediction sharpening. By dynamically correlating user inputs with multi-modal data patterns, IKAN achieves superior segmentation accuracy in complex clinical scenarios, outperforming state-of-the-art methods on OCT, BUSI, and AISD datasets.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The experiments are extensive, validating the method’s effectiveness across three datasets and comparing it with other approaches, which are largely innovative. The visualizations further aid understanding: Figure 3 clarifies the modulation effects, and Figure 5 illustrates the impact of the number of points on the method’s performance.
2. The innovation of the proposed modulation mechanism is highlighted by its capability to resolve the limitations of prior methods.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. The paper has notable shortcomings in its writing, such as logical discontinuities and redundant phrasing. Furthermore, the handling of module abbreviations is problematic. For example, the HICAM module is introduced in the abstract using only its abbreviation without first stating the full name, leaving readers unclear about its meaning at the outset.
2. The HICAM and AFF modules lack innovation, as they represent conventional channel attention mechanisms and fusion operations.
3. The visualization experiment in Figure 4 should include results from more methods for comparison, but currently, it only showcases the performance of the proposed method under varying numbers of points.
4. The lower portion of Figure 1 depicts the processes for “round t” and “round t+1,” which are not addressed in the text. The paper’s writing should prioritize content based on significance; however, it overemphasizes the less innovative HICAM module while failing to mention this aspect.
- Please rate the clarity and organization of this paper
Poor
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
1. The discussion of “Large models tailored for medical tasks…” in the abstract comes across as disjointed and should be more effectively linked to the preceding and subsequent content.
2. The phrase “Critical innovations include HICAM’s…” in the introduction is repetitive and could be rephrased for better clarity and conciseness.
3. The Discussion and Conclusion sections are overly complex and redundant, with the first and second paragraphs repeating the same content.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This paper presents adequate experiments, and while the modulation mechanism module demonstrates novelty, the remaining modules lack innovation. Furthermore, the paper suffers from notable writing problems. Based on these considerations, an overall assessment is offered.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
The research proposes a new interactive segmentation framework based on Kolmogorov-Arnold Networks. An argument is made that current foundation models (SAMMed2D/MedSAM) do not translate well to new domains not covered during pretraining. As such, strong architectures are still required that can optimally leverage the information in the specific domain. The work builds on UKAN and proposes three components: 1) hierarchical preprocessing for multi-scale feature extraction, 2) probability map modulation with adaptive components, and 3) a dual-branch fusion architecture. The proposed model outperforms various other methods when trained/finetuned on the same datasets, suggesting a strong architecture for interactive segmentation.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Encouraging to see Kolmogorov-Arnold Networks (KANs) being explored in new domains such as interactive segmentation. Valuable that the authors will release the code publicly. The proposed framework demonstrates strong performance across multiple benchmarks
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
What is the exact role of the KAN/UKAN in the proposed framework? By titling the work IKAN, it is suggested that KAN offers some advantage over another approach (standard convolutions, for instance) when it comes to interactive models. The work introduces various other components that improve the interactive segmentation performance but provides little evidence that KAN does. What is the “no” KAN in the ablation? What are OCT, BUSI and AISD? These abbreviations are never introduced. FocalClick and RITM also have variants with larger backbones, hrnet32-S2 and segformerB3-S2. How do they compare? I find it difficult to follow along with the design decisions in 2.1 – 2.3. What is the reason for each of the implementations? Sure, it increases performance, but what prompted its implementation? Would FocalClick with a KAN backbone yield the same improvements?
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
Missing space and extra space in the sentence “The complete transformation can be formally expressed as : “; typically, equations can also be written as part of the sentence. Please introduce all abbreviations in the abstract. The symbols u and v in equation 3, α in equation 4, and γ in equation 5 are not introduced.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper presents an interesting application of Kolmogorov-Arnold Networks (KANs) to interactive segmentation, proposing a novel framework that achieves strong benchmark performance. The inclusion of hierarchical preprocessing, probability map modulation, and a dual-branch fusion architecture adds value to the design. However, the specific contribution of KANs remains unclear, and important comparisons and clarifications (e.g., ablations, dataset descriptions, backbone variants) are missing. Despite these limitations, the strong empirical results make it a worthwhile contribution, especially if the authors address the concerns in a revision.
- Reviewer confidence
Somewhat confident (2)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
Given the additional clarifications and promise of code and data, I think the contributions of the paper would be valuable for the community.
Review #3
- Please describe the contribution of the paper
The authors propose a novel Kolmogorov-Arnold Network (KAN)-based framework built on U-KAN for interactive medical image segmentation. The proposed IKAN enhances segmentation accuracy and outperforms state-of-the-art methods across OCT, BUSI, and AISD datasets.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The proposed IKAN has novelties with
- a KAN framework with a sparse fusion module that leverages multi-source inputs and iterative refinements;
- a hierarchical multi-scale feature extraction module with channel attention to prioritize informative cues;
- a dual-branch fusion strategy with adaptive probability modulation.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Some references to prior work missing. The authors have compared the proposed IKAN to some of them: RITM, FocalClick, CFR-ICL, MFP, and SAM-Med2D, but there are still some that are not. Current methods for interactive segmentation can be categorized into three groups: dense fusion, sparse fusion, and dense & sparse fusions. Dense fusion: RITM, FocalClick, SimpleClick, InterFormer, CFR-ICL, and SegNext. Sparse fusion: DynaMITe, SAM, MFP, HR-SAM, and HQ-SAM. Dense & sparse fusions: OIS (accepted by ICLR 2025, https://openreview.net/forum?id=8ZLzw5pIrc, source code is pending release).
- The source code cannot be validated during review but will be released.
- Some flaws in writing.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
1) The abstract should not contain any undefined abbreviations or unspecified references, e.g., HICAM. Abbreviations should be defined at first mention and used consistently thereafter, such as AFF and HICAM.
2) The figure captions of figures should provide context and explanation in detail in the camera-ready version, for instance, WT Conv-KAN isn’t explained in this paper. ‘Interactive Loop’ in the caption of Fig. 1 should be ‘interactive loop’.
3) “Authors are not allowed to change the default margins, font size, font type, and document style.” The default color for URLs in the LaTeX template and final publishing PDF file is blue (\renewcommand\UrlFont{\color{blue}\rmfamily}), not green or red. Fig.~\ref{fig1} should be used in the text.
4) Correct typos. For instance, ‘CFR-ICL[10]’ in Table 1 should be CFR-ICL [16].
5) The Reference section should follow the Springer Reference Style. If there are more than 6 authors, list the first author followed by “et al.”. The first letter of the word after the colon in the title should be lowercase, which is not an acronym or abbreviation. The format for Springer proceedings is unique to others, please refer to the previous papers in the MICCAI proceedings.
6) Could the authors provide an anonymized link to the source code for review?
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This paper provides important new insights or theoretical understanding, and the experimental validation is partially sufficient.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
My concerns are resolved after reading the authors’ rebuttal.
Author Feedback
We sincerely appreciate the reviewers’ valuable comments. Major concerns have been addressed below, while minor issues and flaws in writing will be revised in the final version, subject to space constraints.
R1) Role of KAN/UKAN: KAN’s spline-activated basis functions explicitly model hierarchical anatomical patterns such as tissue boundaries and lesion textures via interpretable representations, aligning with clinical traceability needs. Integrating wavelet convolutions (WT Conv-KAN) preserves multi-scale feature interpretability, crucial for ambiguous boundaries. In the “No KAN” ablation setting, we replaced KAN blocks with standard convolutional blocks, keeping all other components unchanged. This caused a clear performance drop, as shown in Table 2, confirming the critical role of KAN in the framework.
R1) Comparisons with Different Backbones and KAN Integration: IKAN is evaluated against RITM and FocalClick using the HRNet18s backbone for fair comparison. While larger backbones may improve performance, exploring the trade-offs of other architectures remains a topic for future investigation. Additionally, the effectiveness of KAN in IKAN arises from its close integration with our dual-branch structure and probability modulation strategy. As such, directly transplanting KAN into other frameworks, such as FocalClick, is unlikely to produce comparable benefits. Nonetheless, adapting KAN to alternative architectures presents a promising direction for future research.
R1) Implementations of HICAM: HICAM’s two-stage design is essential for addressing the challenges posed by the often-overlooked heterogeneous nature of multi-source inputs in interactive segmentation. The first stage, a channel attention layer, dynamically prioritizes important modalities, such as user clicks, which provide key localization signals. The second stage, a multi-scale inception block, captures both local details and global context, while the secondary attention layer fine-tunes features based on iterative user inputs, ensuring relevance.
R2) Contribution of HICAM: HICAM addresses a core challenge in interactive segmentation: adaptively distinguishing the relative importance of multi-source inputs, including raw images, user click prompts, and prior probability maps. Unlike conventional methods that simply concatenate inputs or apply basic convolutions, thereby overlooking their distinct semantic roles, HICAM employs a hierarchical attention mechanism to explicitly quantify and prioritize each modality’s contribution, enabling more informed and context-aware feature fusion.
R2) Novelty of AFF Module in KAN-CNN Fusion Strategy: The core innovation of our fusion strategy is the heterogeneous integration of KAN and CNN, rather than the AFF module itself. While AFF provides the implementation framework, the novelty stems from the complementary roles of KAN and CNN in addressing a fundamental challenge in interactive segmentation: combining interpretable global modeling with localized, user-driven refinement, a dichotomy rarely addressed in existing methods. The KAN branch uses spline-activated functions to model hierarchical structures like tissue layers in OCT or lesion morphology in NCCT, while maintaining interaction signal traceability across depth. The CNN branch focuses on refining boundaries near user clicks through low-level feature adaptation. The fusion strategy combines these strengths: KAN provides global structural coherence, and CNN offers precise adjustments. Ablation studies confirm that optimal performance is achieved only when the dual-branch architecture, modulation strategy, and AFF module work in concert. Thus, the key innovation lies in the strategic fusion of KAN and CNN, offering a unique balance of clinical interpretability and responsiveness beyond the capabilities of conventional architectures.
R1, R3) Reproducibility: To ensure reproducibility and aid understanding, we will release the source code and dataset upon acceptance.
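To make the first HICAM stage described in the rebuttal more concrete, the sketch below applies a channel attention gate to a stack of multi-source inputs (image, positive and negative click maps, and the previous round's probability map). This is only an illustration under stated assumptions: the rebuttal does not specify the layer definitions, so the squeeze-and-excitation style gate, the function names `stack_inputs` and `channel_attention`, the four-channel layout, and the random weights are all hypothetical. The multi-scale inception block and the secondary attention layer are not modelled.

```python
import numpy as np

def stack_inputs(image, pos_clicks, neg_clicks, prev_prob):
    """Stack the multi-source inputs into one (C, H, W) array (assumed layout)."""
    return np.stack([image, pos_clicks, neg_clicks, prev_prob], axis=0)

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation style gate (an assumed stand-in for HICAM stage 1).

    features: (C, H, W); w1: (C_reduced, C); w2: (C, C_reduced).
    Returns the reweighted features and the per-channel weights.
    """
    squeezed = features.mean(axis=(1, 2))            # global average pool, (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)          # ReLU bottleneck
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate in (0, 1)
    return features * weights[:, None, None], weights

# Toy inputs: an 8x8 image, one positive and one negative click,
# and an uninformative prior probability map.
rng = np.random.default_rng(0)
H, W = 8, 8
image = rng.random((H, W))
pos_clicks = np.zeros((H, W))
pos_clicks[3, 4] = 1.0
neg_clicks = np.zeros((H, W))
neg_clicks[0, 0] = 1.0
prev_prob = np.full((H, W), 0.5)

x = stack_inputs(image, pos_clicks, neg_clicks, prev_prob)
w1 = rng.standard_normal((2, 4))   # untrained, illustrative weights
w2 = rng.standard_normal((4, 2))
reweighted, weights = channel_attention(x, w1, w2)
```

With learned weights, the per-channel gate is the mechanism that would let such a layer emphasise, for example, the click channels over the raw image when the clicks carry the key localization signal.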
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The reviewers recognize the significance of this work. The experiments are extensive. However, there are some issues in the writing. The authors are expected to fix these problems in the final version.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Reject
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
Based on the three reviewer evaluations, this paper suffers from unclear contribution of the core KAN component and excessive module combination without clear justification. While titled “IKAN” suggesting KAN’s central role, the work fails to demonstrate what specific advantages KAN provides over standard convolutions for interactive segmentation, and the ablation studies don’t clarify the “no KAN” baseline. The paper appears to be a patchwork of multiple modules (HICAM, AFF, dual-branch fusion, probability modulation) with limited individual novelty - several components are described as “conventional channel attention mechanisms and fusion operations.” Additionally, significant presentation issues persist, including poor writing quality, undefined abbreviations, logical discontinuities, and missing comparisons with recent state-of-the-art methods.