Abstract

Reliable 3D reconstruction of tissue architecture from sequential 2D multiplex images is challenging due to the noise and distortions introduced by ultrathin (50 nm) slicing and complex alignment procedures. Conventional cell tracking methods often fail under such conditions, resulting in inaccurate linkage of cells across sections. To bridge this gap, we propose a Bayesian Transformer framework that incorporates uncertainty-aware feature embeddings and higher-order graph matching with belief propagation. By tracking cells across consecutive sections, our method facilitates the 3D reconstruction of volumetric tissue organization, even in highly noise-prone scenarios. The methodology begins with a standard segmentation step, followed by feature extraction that computes morphological, shape, and texture descriptors, as well as deep CNN embeddings. These rich, uncertainty-sensitive representations reduce errors caused by both registration artifacts and morphological variability. We validate the effectiveness of the proposed approach on a private multiplex dataset of fixed tissue sections and further demonstrate its generalizability on public time-lapse microscopy videos, showcasing adaptability to diverse datasets. Experimental comparisons reveal that our method outperforms baseline tracking techniques, achieving higher accuracy and more consistent cell linkages across multiple serial sections. The code used in this research, together with a sample dataset, is publicly available at https://github.com/NabaviLab/bayesian-transformer-cell-tracking.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/4914_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/NabaviLab/bayesian-transformer-cell-tracking

Link to the Dataset(s)

N/A

BibTex

@InProceedings{KarMos_Bayesian_MICCAI2025,
        author = { Karami, Mostafa and Hamzehei, Sahand and Arce, David and Raimondi, Gianna and Ostroff, Linnaea and Nabavi, Sheida},
        title = { { Bayesian Transformers and Higher-Order Graph Matching for Cell Tracking in Serial Tissue Sections } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15971},
        month = {September},

}


Reviews

Review #1

  • Please describe the contribution of the paper

    (1) The paper proposes a cell tracking method based on Bayesian transformers and higher-order graph matching, which can be used for 3D tissue reconstruction. (2) The proposed framework is trained in an unsupervised setting, reducing the data annotation cost. (3) The authors construct a private multiplex imaging dataset and validate the model on it, where it outperforms other existing methods. (However, I am not sure how model performance is evaluated on unlabeled data.)

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) The data contribution is significant if the authors are willing to make it public. (2) The experiments are reasonably adequate, though some minor issues remain. (3) The motivation for the method design is clear.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    (1) The method section is hard to read; for example, the calculation of W_q, W_k, and W_v is not clear. (2) The equation for W is important because it helps the reader understand how W_q, W_k, and W_v are calculated, so it should be displayed as a separate, single-row equation. In contrast, Eq. 2 is common in deep learning, yet it is displayed separately. These details deserve attention in the writing.
    (3) In addition, the dimensions of the input and output vectors are not clear, e.g., d, d_h, d_in, d_out. I think there may be an error in the dimensions. (How is X multiplied with W_q? The dimension of X is B×1×d and the dimension of W_q is d_out×d_in.) These should be revised. (4) The table format should be improved; for example, changing “No DE” to “w/o DE” would be clearer to read.

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    In general, the proposed cell tracking method can be used for 3D tissue reconstruction. I think the motivation is good, and the authors also contribute a private dataset. (I am not sure whether the data are annotated.) However, the writing and some details need further improvement. Hence, I am inclined toward a weak reject at the moment.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    The clarity of the paper would remain a big issue …



Review #2

  • Please describe the contribution of the paper

    The authors propose a machine learning model for cell tracking based on Bayesian transformers and hypergraph construction. In experiments, they find their approach beats the baselines across several metrics.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed method is theoretically well-founded and quantitative results are good.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Some qualitative results would have been nice. Furthermore, to investigate the transformer-based feature embedding, a PCA or t-SNE plot would be insightful.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    • 2 Methods:
      • Do Bayesian linear layers include biases besides weights?
      • Is there a mathematical justification for letting “two Bayesian linear layers, $W_\mu$ and $W_{\log \sigma^2}$ […] yield these parameters”? Are their weights not themselves sampled using $\mu$ and $\sigma$? How does multiplication with the random weights yield the parameters?
      • From the explanations before, I got the impression that $\mu$ and $\sigma$ and thus z are matrix-valued. Which norm do you compute in (3)?
      • Where does the index k in (3) come from? I assume this is crucial in the context of the contrastive loss.
      • Are nodes clustered into disjoint triplets or can nodes belong to multiple triplets?
      • You use $\mathcal{M}$, $\mathbf{M}$ and $M$. Are these all separate variables?
      • Please explain how triplet edges can contain single-node edges.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    • the belief propagation (message) term is not sufficiently well explained (see specific comments above)
    • qualitative evaluation is lacking
  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors sufficiently addressed all my questions.



Review #3

  • Please describe the contribution of the paper

    The authors introduce a novel method for 2D cell tracking that operates using only segmentation masks. Their approach leverages Bayesian transformers to model uncertainty in the tracking process and employs graph matching to associate cells across consecutive frames. They demonstrate the method’s effectiveness on both publicly available datasets and an in-house dataset.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper provides a well-structured and accessible introduction to the problem, clearly outlining the setting and related work in a way that is informative even for non-expert readers.

    The contributions are clearly stated, and Figure 2 is particularly effective in illustrating the model’s architecture and its modular components, aiding reader comprehension.

    The derivation of x^{t} and the description of the Bayesian transformer embedding are both well explained, providing clear insight into how the model processes input images. The construction of the higher-order graph is also described in a concise and understandable manner.

    Overall, the paper is well written, with a level of clarity that makes it accessible even to those new to the field.

    The experimental results are strong, demonstrating the method’s effectiveness across multiple datasets and its superior performance compared to existing baselines from the literature.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    It is unclear what exactly panels (a) and (b) in Figure 1 are intended to illustrate. While the caption mentions that the red circle indicates artifacts, the artifact appears only in panel 3. Additionally, the most relevant details in the figure are quite small, while the inclusion of a large context image may not be necessary—rescaling or zooming in on key regions could improve clarity.

    The authors mention using Cellpose to extract binary masks. While Cellpose is a strong segmentation tool, it may occasionally miss cells. It would be helpful to clarify how the proposed model handles cases where cells are not properly segmented or completely missed.

    The left panel of Figure 2c could benefit from simplification, as its current complexity makes interpretation difficult. Streamlining the visual presentation could make it more accessible to readers.

    Lastly, the term “ultraplex fluorescence” could use a brief explanation for readers unfamiliar with the term, to ensure clarity.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is very well written, with a strong methodological background, and it showcases performance on various datasets and setups.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

We thank the reviewers for their constructive feedback, which has improved the paper.

To Reviewer 2: 1) We will add a line in Sec. 2.2: $\mathbf Q=\mathbf X\mathbf W_Q,\; \mathbf K=\mathbf X\mathbf W_K,\; \mathbf V=\mathbf X\mathbf W_V, \qquad \mathbf W_{Q,K,V}\in\mathbb R^{d_{\mathrm{model}}\times d_{\mathrm{model}}}$. 2) The line above will be shown once on a single row, immediately before Eq. 2. Eq. 2 will stay separate because it serves a different purpose. 3) Input tensor: $\mathbf X\in\mathbb R^{B\times L\times d_{\mathrm{model}}}$ with $L=1$. Model parameters: $d_{\mathrm{model}}=64,\;h=2,\;d_h=32$. Because the projection keeps the same width, $d_{\text{in}}=d_{\text{out}}=d_{\mathrm{model}}$; these redundant symbols will be removed. Hence $(B,1,64)\times(64,64)\rightarrow(B,1,64)$ is dimensionally consistent; no error exists. 4) In Table 3 we will replace “No DE” with “w/o DE” (without deep embeddings) and slightly widen the columns for clarity.
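The shape argument in point 3 can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes right-multiplication of the input by square projection matrices as stated above, and the batch size `B = 4`, the random initialization, and the per-head reshape are illustrative choices that do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# B is an arbitrary batch size chosen for illustration; L, d_model, h follow the rebuttal.
B, L, d_model, h = 4, 1, 64, 2
d_h = d_model // h  # width per attention head (32)

X = rng.standard_normal((B, L, d_model))        # input tensor, shape (B, 1, 64)
W_Q = rng.standard_normal((d_model, d_model))   # square projections, shape (64, 64)
W_K = rng.standard_normal((d_model, d_model))
W_V = rng.standard_normal((d_model, d_model))

# Right-multiplication: (B, 1, 64) @ (64, 64) -> (B, 1, 64)
Q, K, V = X @ W_Q, X @ W_K, X @ W_V

# Assumed multi-head layout: split the model width into h heads -> (B, h, L, d_h)
Q_heads = Q.reshape(B, L, h, d_h).transpose(0, 2, 1, 3)

print(Q.shape, Q_heads.shape)  # (4, 1, 64) (4, 2, 1, 32)
```

With the projection applied on the right, the product is well defined for any batch size, which is the consistency claim made in the rebuttal.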

To Reviewer 3: 1) Each Bayesian linear layer maintains Gaussian posteriors for both weights and biases: $\mathbf W\sim\mathcal N(\mu_W,\sigma_W^2)$ and $\mathbf b\sim\mathcal N(\mu_b,\sigma_b^2)$. Both are sampled (re-parameterization) during training; at test time we use their posterior means. 2) The final hidden state $\mathbf H\in\mathbb R^{B\times L\times d_{\mathrm{model}}}$ is passed through two independent Bayesian linear layers $\mathbf W_\mu,\mathbf W_{\log\sigma^2}\in\mathbb R^{d_{\mathrm{model}}\times d_{\mathrm{embed}}}$ to obtain the Gaussian parameters $(\mu,\log\sigma^2)$. Because these layers are stochastic, the predicted $\mu$ and $\sigma$ carry the model’s uncertainty, and they remain fully differentiable via the re-parameterization trick. This design follows standard practice in Bayesian VAEs and Bayesian Transformers. 3) In Eq. 3, we use the squared Euclidean norm along that dimension and then average over the batch. For clarity, we have added a line explicitly stating the specific norm used in this equation. 4) For each anchor $i$ we pick one positive $j$ (same cell in the next frame) and one negative $k$ (a different cell); $k$ therefore indexes the negative sample in the contrastive triplet. Considering the space constraints, we refer the reader to the original work on contrastive loss for further details. 5) Each cell can and does appear in multiple triplets. We enumerate every 3-cell combination within a distance threshold. This overlap lets belief-propagation share geometric evidence across neighboring triplets. We will state explicitly in Sec. 2.3 that triplets are not required to be disjoint. 6) Thank you for drawing our attention to the notation. The symbols represent the same message values; based on your helpful feedback, we have revised and double-checked the notation for $M$ to ensure consistency throughout the manuscript. 7) A triplet edge $(i_1,i_2,i_3)\to(j_1,j_2,j_3)$ implicitly includes the three first-order edges $(i_s\to j_s),\ s=1,2,3$. During belief propagation the triplet’s cost is added to the message of each of these edges, thereby enforcing geometric consistency without introducing extra variables.
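Points 1 to 4 can be illustrated with a small NumPy sketch. It is not the authors' implementation and rests on several assumptions: the class name `BayesianLinear`, the initial log-variance of -6, the embedding width `d_embed = 16`, and the hinge form with `margin = 1.0` are all hypothetical. The rebuttal only states that weights and biases are sampled by re-parameterization during training, that posterior means are used at test time, and that a squared Euclidean norm is averaged over the batch, so the exact form of Eq. 3 may differ.

```python
import numpy as np

rng = np.random.default_rng(0)


class BayesianLinear:
    """Linear layer with a factorized Gaussian posterior over weights and biases."""

    def __init__(self, d_in, d_out, init_log_var=-6.0):
        # Posterior means and log-variances (initial values are assumptions).
        self.mu_W = 0.1 * rng.standard_normal((d_in, d_out))
        self.log_var_W = np.full((d_in, d_out), init_log_var)
        self.mu_b = np.zeros(d_out)
        self.log_var_b = np.full(d_out, init_log_var)

    def __call__(self, x, sample=True):
        if sample:
            # Re-parameterization: parameter = mean + std * standard normal noise.
            W = self.mu_W + np.exp(0.5 * self.log_var_W) * rng.standard_normal(self.mu_W.shape)
            b = self.mu_b + np.exp(0.5 * self.log_var_b) * rng.standard_normal(self.mu_b.shape)
        else:
            # Test time: use the posterior means.
            W, b = self.mu_W, self.mu_b
        return x @ W + b


def triplet_loss(z_i, z_j, z_k, margin=1.0):
    """Hinge-style triplet loss (assumed form): anchor i, positive j, negative k."""
    d_pos = np.sum((z_i - z_j) ** 2, axis=-1)  # squared Euclidean norm over the embedding axis
    d_neg = np.sum((z_i - z_k) ** 2, axis=-1)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))  # mean over the batch


B, L, d_model, d_embed = 4, 1, 64, 16  # B and d_embed are illustrative
H = rng.standard_normal((B, L, d_model))        # stand-in for the final hidden state
head_mu = BayesianLinear(d_model, d_embed)      # plays the role of W_mu
head_log_var = BayesianLinear(d_model, d_embed) # plays the role of W_{log sigma^2}

mu = head_mu(H)
log_var = head_log_var(H)
z = mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)  # sampled embedding

mu_test = head_mu(H, sample=False)  # deterministic prediction from posterior means
```

In this sketch the two heads are themselves stochastic, so repeated training-time calls return different `mu` and `log_var`, while the `sample=False` path is deterministic.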

To Reviewer 4: 1) We will crop non-informative background and add a zoom-in so the red-circled artifact is clearly visible. 2) Missed masks are rare (checked by an expert annotator) and, crucially, the same Cellpose output is fed to all compared methods, so performance remains fairly benchmarked. Our model treats a frame-specific unmatched region as a singleton node, preventing breakage if a cell is absent in the mask. 3) We will split the graphic into (i) a three-block pipeline overview and (ii) a small inset for message-passing, removing extraneous arrows for faster comprehension. 4) We will add an explanation of the term “ultraplex fluorescence”.

We thank reviewers for suggesting PCA/t-SNE. Space constraints prevented their inclusion, but we will consider them in future work.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper proposes a Bayesian transformer-based method for embedding extraction and a high-order graph matching method for cell tracking. While the approach is promising, its method section is below the MICCAI standard.

    The method section needs significant improvement, where important equations and illustrations are poorly presented. One major issue is the lack of clarity of the belief propagation formulation. It’s important to separately write down the cost function and the message updates (variable -> factor and factor -> variable). Currently, Eq (4) is incomplete due to the lack of definition of M(i,j), while it’s unclear how Eq (5) is derived.
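For reference, the separation requested here (cost function, variable-to-factor messages, factor-to-variable messages) has the following generic textbook form for min-sum belief propagation on a factor graph. This is a sketch under assumed notation, not the paper's Eq. (4) or Eq. (5): the symbols $\theta_i$, $\theta_c$, $m$, $b_i$, and $N(i)$ are introduced only for illustration, with $i$ indexing variables and $c$ indexing higher-order factors such as triplets.

```latex
% Cost over assignments x, with unary costs \theta_i and higher-order costs \theta_c
E(\mathbf{x}) = \sum_{i} \theta_i(x_i) + \sum_{c} \theta_c(\mathbf{x}_c)

% Variable -> factor message
m_{i \to c}(x_i) = \theta_i(x_i) + \sum_{c' \in N(i) \setminus \{c\}} m_{c' \to i}(x_i)

% Factor -> variable message
m_{c \to i}(x_i) = \min_{\mathbf{x}_c \setminus x_i}
  \Big[ \theta_c(\mathbf{x}_c) + \sum_{i' \in c \setminus \{i\}} m_{i' \to c}(x_{i'}) \Big]

% Belief used to read out the assignment
b_i(x_i) = \theta_i(x_i) + \sum_{c \in N(i)} m_{c \to i}(x_i),
\qquad \hat{x}_i = \arg\min_{x_i} b_i(x_i)
```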



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper introduces an unsupervised cell-tracking framework that couples Bayesian-transformer embeddings with triplet-based hypergraph matching. Across two public benchmarks and a newly collected multiplex-fluorescence dataset, the method consistently surpasses established baselines, and the authors pledge to release both code and data.

    During review, the evaluations converged toward acceptance. Reviewer 3, initially uncertain about the derivation of the Bayesian layers and the details of the message passing, upgraded the score to Accept once the rebuttal clarified these points. Reviewer 4 endorsed the work from the outset, highlighting its sound theory, strong empirical results, and clear motivation. Reviewer 2 maintained a Reject, chiefly because of lingering concerns over prose clarity, but acknowledged the method’s technical soundness. With two Accepts against one Reject, the AC recommends accept: the probabilistic-transformer plus triplet-graph formulation is a novel and persuasive contribution to cell tracking.

    For the final version, the manuscript should incorporate the clarified equation, ensure consistent dimensional notation, and streamline Figures 1 and 2 by zooming in on the artefact and simplifying the belief-propagation inset. With these revisions, the paper warrants acceptance.


