Abstract

Medical image segmentation is the core technology of precision medicine, which can improve diagnostic accuracy, optimize treatment plans, and enhance research efficiency. U-Net is a classical and fundamental model in this field. Because of its excellent architecture, Transformer and MLP have been fused on top of it in subsequent work, all with good results. Each of these methods has advantages, but none further explores the image’s low-frequency feature information. The low-frequency feature information reflects the overall structure and contour of the image and provides key background and boundary information for image segmentation. To address this problem, we explore the potential of Wavelet Convolutions for medical segmentation tasks by proposing a novel feature extraction block: the Image Multi-frequency Feature Information Extraction (IMFIE) block. The IMFIE block can effectively extract both high-frequency and low-frequency feature information from images by combining Wavelet Convolutions. This approach takes full advantage of their excellent ability to mine and utilize low-frequency information in images while expanding the receptive field at a low cost. We propose a novel model, UWT-Net, which leverages the IMFIE block and reconstructs the classical U-Net. Experiments on three public pathology image datasets show that the proposed method outperforms the state-of-the-art baseline U-KAN. Code is available at https://github.com/zpc2002zpc/UWT-Net.git.
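The frequency split described above can be illustrated with a minimal NumPy sketch of a 2D Haar decomposition. This is not the authors' IMFIE block or their Wavelet Convolution code; the function names `haar_dwt2` and `haar_pyramid` are hypothetical, and the subband labels follow one common convention (LL is low-pass in both directions). Recursing on the LL band shows why deeper levels cover larger image regions cheaply: each level-k LL coefficient summarizes a 2^k x 2^k patch of the input.

```python
import numpy as np


def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform.

    x must be an (H, W) array with even H and W.
    Returns (LL, LH, HL, HH), each of shape (H/2, W/2).
    """
    x = np.asarray(x, dtype=np.float64)
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass in both directions
    lh = (a + b - c - d) / 2.0  # low-pass horizontally, high-pass vertically
    hl = (a - b + c - d) / 2.0  # high-pass horizontally, low-pass vertically
    hh = (a - b - c + d) / 2.0  # high-pass in both directions
    return ll, lh, hl, hh


def haar_pyramid(x, levels):
    """Apply haar_dwt2 recursively to the LL band.

    Returns the final LL band and a list with one (LH, HL, HH)
    tuple per level, finest level first.
    """
    ll = np.asarray(x, dtype=np.float64)
    details = []
    for _ in range(levels):
        ll, lh, hl, hh = haar_dwt2(ll)
        details.append((lh, hl, hh))
    return ll, details


if __name__ == "__main__":
    image = np.random.default_rng(0).random((64, 64))
    ll, details = haar_pyramid(image, levels=3)
    print("final LL shape:", ll.shape)
    for level, bands in enumerate(details, start=1):
        print("level", level, "detail shape:", bands[0].shape)
```

Because the transform is orthonormal, the four subbands together preserve the energy of the input, so no information is lost by the split itself; how a network then weights and fuses the subbands is a separate design choice.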

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1637_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/zpc2002zpc/UWT-Net.git

Link to the Dataset(s)

BUSI dataset: https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset

GLAS dataset: https://websignon.warwick.ac.uk/origin/slogin?shire=https%3A%2F%2Fwarwick.ac.uk%2Fsitebuilder2%2Fshire-read&providerId=urn%3Awarwick.ac.uk%3Asitebuilder2%3Aread%3Aservice&target=https%3A%2F%2Fwarwick.ac.uk%2Ffac%2Fcross_fac%2Ftia%2Fdata%2Fglascontest&status=notloggedin

CVC-ClinicDB dataset: https://www.kaggle.com/datasets/balraj98/cvcclinicdb

BibTex

@InProceedings{ZhaPen_UWTNet_MICCAI2025,
        author = { Zhang, Pengcheng and Ouyang, Xiaocao and Peng, Ran},
        title = { { UWT-Net: Mining low-frequency feature information for medical image segmentation } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15969},
        month = {September},
        pages = {616 -- 625}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a new feature extraction block: the Image Multi-frequency Feature Information Extraction (IMFIE) block to leverage the image’s low-frequency feature information for medical image segmentation.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper explores how to effectively utilize low-frequency information to enhance medical image segmentation performance.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The sentences “Exploiting Dual Tree Complex Wavelet Transform (DTCWT) [14]. Introducing Wavelet Transform to the Downsampling Module of Semantic Segmentation CNNs” are incomplete and lack proper structure.
    2. The proposed network architecture demonstrates insufficient innovation compared to existing approaches.
    3. The comparative experiments are inadequate. The paper should include comparisons with other network architectures that utilize wavelet transforms, such as those presented in references [20], [14], and [6].
    4. Figure 2 lacks sufficient textual descriptions and explanations to clearly convey the network structure.
    5. The mathematical formulations in Section 2.1 are not presented with sufficient rigor and professionalism.
    6. The ablation studies focus primarily on the presence/absence of the IMFIE block and different WT-levels, but don’t thoroughly investigate alternative designs or compare against other wavelet-based architectures.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (2) Reject — should be rejected, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the paper mentions other wavelet-based methods, it doesn’t directly compare against them, making it difficult to assess the novelty of the IMFIE block relative to existing wavelet techniques.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper introduces UWT-Net, a novel model for medical image segmentation that addresses the underutilization of low-frequency feature information in existing methods. The key innovation is the Image Multi-frequency Feature Information Extraction (IMFIE) block, which combines Wavelet Convolutions (WTConv) with standard convolutions to effectively extract and leverage both high- and low-frequency features. Low-frequency information, which captures global structure and spatial relationships, is often overlooked but proves critical for improving segmentation accuracy. Experiments on three public datasets (BUSI, GlaS, and CVC-ClinicDB) demonstrate that UWT-Net outperforms state-of-the-art baselines, including U-KAN, showcasing its robustness and generalization ability. Ablation studies further validate the importance of the IMFIE block and the benefits of deeper wavelet decomposition. Additionally, the model is scalable, with multiple size configurations to balance performance and computational cost, making it adaptable to diverse clinical needs. Overall, UWT-Net advances medical image segmentation by effectively integrating wavelet-based feature extraction into the U-Net architecture, enhancing both accuracy and practical utility.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    While prior works (e.g., U-Net, Transformer-based models) focus on high-frequency details (edges, textures), this work systematically exploits low-frequency components (global structure, contours) to improve segmentation. The use of Haar wavelet decomposition (Eq. 1-4) provides a mathematically grounded way to separate and process multi-frequency features. The model is tested on three diverse medical datasets (BUSI, GlaS, CVC-ClinicDB), covering different modalities (ultrasound, histopathology, colonoscopy). The default model (64–1024 channels) achieves a balance between accuracy and computational cost (32.44 GFLOPs), making it practical for clinical deployment. The code is publicly available, adhering to open science principles. This combination of innovation, thorough evaluation, and clinical relevance makes the paper a significant contribution to the field.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Compare IMFIE to other wavelet/frequency-based methods (e.g., DTCWT, Fourier Neural Operators).

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    no

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The core idea is useful (wavelets for medical segmentation), even if not groundbreaking.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose a novel encoder-decoder segmentation model, UWT-Net, which integrates Wavelet Convolutions into both the encoder and decoder stages via a custom-designed module named Image Multi-Frequency Feature Information Extraction (IMFIE). Specifically, the model adopts a Haar-Wavelets Decomposition to extract and fuse multi-frequency discriminative features, with an emphasis on underused low-frequency information.

    Experiments on three public medical image segmentation datasets demonstrate that UWT-Net achieves state-of-the-art performance, outperforming seven baseline methods in terms of Intersection over Union (IoU) and F1-score.
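    For reference, the two reported metrics can be computed for a pair of binary masks as in the sketch below. It is illustrative only: the function name `iou_and_f1` and the `eps` smoothing term are assumptions, and the paper's exact evaluation protocol (thresholding, per-image versus per-dataset averaging) is not stated on this page.

```python
import numpy as np


def iou_and_f1(pred, target, eps=1e-7):
    """Return (IoU, F1) for two binary masks of the same shape.

    eps is an assumed smoothing term that avoids division by zero
    when both masks are empty.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    tp = np.logical_and(pred, target).sum()   # predicted and present
    fp = np.logical_and(pred, ~target).sum()  # predicted but absent
    fn = np.logical_and(~pred, target).sum()  # missed foreground
    iou = tp / (tp + fp + fn + eps)
    f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return float(iou), float(f1)


if __name__ == "__main__":
    pred = np.array([[1, 1], [0, 0]])
    target = np.array([[1, 0], [0, 0]])
    print(iou_and_f1(pred, target))
```

    For a single pair of masks the two metrics are tied by F1 = 2·IoU / (1 + IoU), so they rank individual predictions identically; they can diverge only after averaging over a dataset.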

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Mathematically Grounded Novel Module: The core innovation lies in the proposed Image Multi-Frequency Feature Information Extraction (IMFIE) block, which is inspired by wavelet theory. This module combines standard convolutions with Wavelet Convolutions (WTConv) based on Haar filters. The new module thus enables multi-resolution frequency decomposition, capturing both low- and high-frequency components.
    • Low-Frequency Feature Information: The module exploits low-frequency information, which is often ignored in conventional architectures. The paper argues and demonstrates that low-frequency features are vital for global structure and noise robustness—critical for medical image segmentation.
    • Applicability Across Domains: The model’s generalization capability is evaluated on three different types of public medical datasets: breast ultrasound (BUSI), gland histology (GlaS), and colonoscopy (CVC-ClinicDB). This can serve as evidence of the applicability of the proposed model across imaging modalities and anatomical regions.
    • Benchmarking and SOTA Results: The paper compares UWT-Net against seven well-established baselines, including recent transformer- and MLP-based methods like U-KAN, U-Mamba, and U-NeXt. The results consistently show that UWT-Net achieves a new state-of-the-art performance in terms of IoU and F1 score across all datasets.
    • Comprehensive Ablation Studies: The authors conduct ablation studies on the effect of the number of wavelet decomposition levels and of model capacity (S, base, L). These analyses confirm that the benefits rely on the wavelet-based design, rather than just on increased parameter count or deeper networks.
    • Clinical Scalability: By offering multiple versions of the model (small, base, large), the authors demonstrate the flexibility and scalability of their method.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • Limited Comparison to Related Wavelet-Based Methods: The paper does not provide a direct comparison to recent wavelet-based architectures, e.g., WeT-UNet (https://pubmed.ncbi.nlm.nih.gov/38567422/), or even to different wavelet-based kernels.
    • Clarity and Language Quality: The overall text clarity and English writing quality could be improved. Errors in the technical description and in the writing style can hinder readability.
    • Equation and Module Description: The mathematical explanation could be more rigorous and better structured. The notation and formulation are at times inconsistent.
    • Figure Quality and Captions: The captions and visualizations of Figures 1 and 2 can be improved. For instance, Figure 1 could explicitly label the LL, LH, HL, and HH wavelet subbands, and Figure 2 could depict the IMFIE block in a more modular and visually informative manner.
    • External Validation: All evaluations rely on internal dataset splits. Including an external or cross-institution test set would better support claims of generalizability, e.g., Kvasir for polyp segmentation.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    This paper presents a promising direction by adopting wavelet theory to address the underuse of low-frequency information in medical image segmentation. The integration of multi-level Haar Wavelet decomposition into a learnable convolutional block is well-motivated and supported by strong empirical results across diverse datasets. However, to further improve the clarity and impact of the work, I encourage the authors to:

    • Improve the technical writing quality, particularly in the explanation of the model and equations. The results are solid, but the manuscript would benefit from more precise and polished language.
    • Include comparisons to other wavelet-based segmentation methods, which would help better situate the contribution within existing literature.
    • In the ablation studies, consider evaluating alternative wavelet kernels (e.g., Daubechies) to better assess the generality of the proposed module.
    • Enhance the figure design and captions, especially for Figures 1 and 2, to make them more self-explanatory and visually informative. For example, Figure 1 could label the LL, LH, HL, and HH components explicitly, and Figure 2 could present the block architecture more clearly.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper addresses an important and underused aspect in medical image segmentation, i.e., exploiting multi-frequency information via wavelet theory. The proposed IMFIE block and UWT-Net architecture are technically sound and demonstrate strong empirical performance across three diverse datasets, outperforming several competitive baselines. However, the overall impact of the work is reduced by issues in clarity and presentation. The manuscript suffers from language and technical writing issues, particularly in the model description and equations. Furthermore, adding comparisons with existing wavelet-based methods would help round out the evaluation. Despite these weaknesses, the paper presents a novel contribution with strong results, and with revision, could become a solid addition to the field. These factors led to a moderately positive recommendation.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

Dear Reviewers and Area Chairs, We thank the reviewers for their valuable comments. Below, we clarify the main concern (MC) and answer the other questions (Q).

MC: (@R1&R2&R4) Limited Comparison A: We investigated prior studies in medical image segmentation based on U-Net and the wavelet transform, but very few works combine the two. We did not compare against related wavelet-based methods mainly for the following two reasons: (1) Our approach uses the wavelet transform for feature extraction, whereas many works use it only for sampling. Spectral U-Net [14] employs DTCWT and iDTCWT for down-sampling and up-sampling, respectively, and [6] uses the wavelet transform to modify the pooling method (MWP); (2) WeT-UNet [20] leverages the wavelet transform for feature extraction within a U-Net architecture. However, it relies on pre-trained weights (ResNet-50), which makes the performance comparison unfair, and its source code is not publicly available. To demonstrate the advantages of our proposed model, we compared it with methods published in top journals or conferences, such as U-KAN (AAAI’25), Rolling-UNet (AAAI’24) and U-NeXt (MICCAI’22). In particular, we benchmark against the latest state-of-the-art method, U-KAN [10]. Experimental results show that our model outperforms this powerful KAN-based method, which demonstrates the effectiveness of our approach and highlights the importance of low-frequency feature information in medical image segmentation.

Q1: (@R2) Insufficient innovation A: We respectfully address the concern regarding the novelty of our work by highlighting the following key contributions: (1) The feature extraction operations of our UWT-Net are based on convolution. Despite its simplicity, our model outperforms recent complex and powerful models such as KAN-, Mamba- and attention-based models. This emphasizes the critical role of low-frequency feature information in enhancing medical image segmentation performance. (2) Our proposed UWT-Net leverages the hierarchical and recursive structure of the wavelet transform to thoroughly mine informative low-frequency representations. (3) Our proposed IMFIE block in UWT-Net not only extracts discriminative low-frequency features but also effectively fuses them with high-frequency details, enabling comprehensive frequency-aware feature representation. (4) Our model achieves state-of-the-art results across three computational scales. Notably, it surpasses the latest SOTA method U-KAN (AAAI’25) even at the smallest scale (8.37 GFLOPs), demonstrating its suitability for diverse clinical scenarios with varying accuracy and computational requirements.

Q2: (@R2) Ablation Studies A: As mentioned at the bottom of page 7, Table 2 presents a comprehensive evaluation of substituting our proposed IMFIE block. We will revise the table for clarity in the final version. The IMFIE block is designed to deeply mine low-frequency feature information and to fuse it effectively with high-frequency features. The results in Table 2 show that simply replacing the IMFIE block with a block consisting of standard convolutions leads to a significant performance degradation, demonstrating the effectiveness of the IMFIE block.

Q3: (@R4) External Validation A: We follow the validation methods of existing works [10, 11] to ensure fairness. We validated our model on three datasets with different modalities, and the results demonstrate the effectiveness of our model. In future work, we will use external validation to evaluate the generalizability of the model more comprehensively.

Q4: (@R2&R4) Language, Figure and Caption Quality, Equation and Module Description A: We are grateful for your valuable comments, and we will revise the manuscript to improve the writing clarity, refine the mathematical formulations, and enhance both the visual quality and explanatory detail of all figures and captions in the camera-ready version.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Given that the reviewers did not revise their ratings, and that the authors’ rebuttal did not substantively address the reviewers’ initial concerns, the issues raised remain unresolved.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


