Abstract

Reinforcement learning (RL)-based tractography is a competitive alternative to machine learning and classical tractography algorithms due to its high anatomical accuracy, obtained without the need for any annotated data. However, the reward functions used so far to train RL agents do not encapsulate anatomical knowledge, which causes agents to generate spurious false-positive tracts. In this paper, we propose a new RL tractography system, TractOracle, which relies on a reward network trained for streamline classification. This network is used both as a reward function during training and as a means of stopping the tracking process early, thus reducing the number of false-positive streamlines. This makes our system a unique method that evaluates and reconstructs white matter (WM) streamlines at the same time. We report ratios of true and false positives improved by almost 20% on one dataset and a 2x improvement in the number of true positives on another dataset, by far the best results ever reported in tractography.
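The dual role of the reward network described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_oracle` is a simple straightness score standing in for the trained classifier (it is not TractOracle-Net), and `track`, `stop_threshold` and `min_steps` are hypothetical names and values, not the authors' implementation.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]


def toy_oracle(streamline: List[Point]) -> float:
    """Hypothetical stand-in scorer in [0, 1]; NOT TractOracle-Net.

    Returns 1.0 for a perfectly straight streamline and lower values
    as consecutive segments turn more sharply.
    """
    if len(streamline) < 3:
        return 1.0
    cosines = []
    for a, b, c in zip(streamline, streamline[1:], streamline[2:]):
        u = [b[i] - a[i] for i in range(3)]
        v = [c[i] - b[i] for i in range(3)]
        norm = math.hypot(*u) * math.hypot(*v)
        cosines.append(sum(x * y for x, y in zip(u, v)) / norm if norm else 0.0)
    # Map the mean cosine from [-1, 1] to [0, 1].
    return (sum(cosines) / len(cosines) + 1.0) / 2.0


def track(
    seed: Point,
    policy: Callable[[List[Point]], Point],
    oracle: Callable[[List[Point]], float],
    max_steps: int = 50,
    min_steps: int = 3,
    stop_threshold: float = 0.8,  # assumed value, for illustration only
) -> Tuple[List[Point], float]:
    """Grow a streamline step by step, using the same scorer twice:
    as an early-stopping test during tracking and as the final reward."""
    streamline = [seed]
    for _ in range(max_steps):
        step = policy(streamline)
        last = streamline[-1]
        streamline.append(tuple(last[i] + step[i] for i in range(3)))
        # Early stop: the scorer judges the partial streamline implausible.
        if len(streamline) > min_steps and oracle(streamline) < stop_threshold:
            break
    # Reward: the scorer's plausibility estimate for the final streamline.
    return streamline, oracle(streamline)


def straight_policy(streamline: List[Point]) -> Point:
    return (1.0, 0.0, 0.0)


def turning_policy(streamline: List[Point]) -> Point:
    # Alternates between two orthogonal directions (a 90-degree turn each step).
    return (1.0, 0.0, 0.0) if len(streamline) % 2 == 1 else (0.0, 1.0, 0.0)


straight, straight_reward = track((0.0, 0.0, 0.0), straight_policy, toy_oracle)
turning, turning_reward = track((0.0, 0.0, 0.0), turning_policy, toy_oracle)
```

In this toy setting the straight streamline runs to `max_steps` and receives the maximal reward, while the sharply turning one is cut off as soon as its score falls below the threshold, so an implausible streamline is both penalized and kept short.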

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1898_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1898_supp.pdf

Link to the Code Repository

https://github.com/scil-vital/TrackToLearn

Link to the Dataset(s)

N/A

BibTex

@InProceedings{The_TractOracle_MICCAI2024,
        author = { Théberge, Antoine and Descoteaux, Maxime and Jodoin, Pierre-Marc},
        title = { { TractOracle: towards an anatomically-informed reward function for RL-based tractography } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15002},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes TractOracle, a novel reinforcement learning (RL) tractography system designed to address the limitations of existing methods by incorporating anatomical knowledge. TractOracle utilizes a reward network, TractOracle-Net, trained for streamline classification, which is used both for training the RL agent and for stopping the tracking process early to reduce false positives. TractOracle demonstrates significant improvements in true positive ratios and reductions in false positive ratios compared to existing methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper introduces an innovative approach by incorporating anatomical prior knowledge into reinforcement learning-based tractography. (It should be noted that the innovation lies only in incorporating anatomical knowledge within the reinforcement learning framework; the way the anatomical knowledge is incorporated does not show significant innovation.) This integration addresses the challenge of low valid connections. The results show that the streamlines reconstructed by the proposed method have more valid connections.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The proposed method only adds the probability of being in a white matter region as anatomical knowledge to determine the termination condition of streamlines. Such methods have long been proposed or solved by others, such as Anatomically-Constrained Tractography.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    no

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1. This study lacks an experimental analysis of the impact of parameters on the results.
    2. The paper lacks comparisons with similar methods, such as SD_Stream+ACT and iFOD2+ACT.
    3. In Table 3, Track-to-Learn seems to have higher OL and OR, but this does not seem to be the case in Fig. 2, although the two are from different datasets.
    4. Why does the proposed method retain so many valid fibers in Table 4? The gap between the methods in the quantitative results for VC, IC, and NC in Table 3 is not particularly large.
    5. The authors mention in the introduction that SIFT and COMMIT require high processing power. Does this method not require high processing power as well?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I believe the proposed method shows innovation by integrating anatomical prior knowledge into RL-based tractography, addressing the issue of low valid connections. However, it lacks some effective experiments, such as comparisons with the ACT method and analysis of the impact of parameter settings on the results.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This study achieved high accuracy in reconstructing streamlines by utilizing TractOracle-Net to assign plausibility scores to streamlines reconstructed by TractOracle-RL. This method helps prevent ‘reward hacking’ and enables concurrent training and filtering.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well written. The authors introduce an anatomically plausible score that contributes to training an agent with improved reconstruction accuracy. Additionally, a novel stopping criterion, guided by the predictions of TractOracle-Net, has been employed to halt streamlines when they deviate from anatomical plausibility. As a result, this approach effectively reduces the false positive rate in the reconstructed tractogram.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some problems should be solved before the paper is considered for publication. First, it is crucial to provide quantitative results on the in-vivo data, since the TractoInferno dataset does offer an evaluation pipeline. Second, the experimental results are inadequately explained; the reasons behind the obtained results need a detailed explanation, for example the relatively low OL values.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Several details could be improved, such as adding a text annotation to Fig. 1(a) to differentiate between the two tractograms.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The motivation for this research is sensible. The methodology is clearly described, and the experiments validated the effectiveness of the proposed method.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    Development of a reinforcement-learning method for tractography that takes anatomy into account. Results show the method outperforms previous state-of-the-art algorithms.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Efficient implementation for taking anatomy information into account in the reinforcement learning (a separate network predicts the anatomical score).

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The validity of the anatomical scoring is unclear. Ultimately, the performance of the algorithm hinges on the performance of RecoBundles, and it is not clear from the work what the limitations of this algorithm are. Failure modes of the whole approach are not described.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Open data sets used for training, but code is not provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Inspiring work. Computation times for training and application would be helpful to know.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Strong Accept — must be accepted due to excellence (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method vastly outperforms the state of the art in today’s tractography.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

Dear reviewers,

We thank you for the overwhelmingly positive reviews of our work “TractOracle: towards an anatomically-informed reward function for RL-based tractography”. Below, we summarize and address some of the concerns raised by the reviewers.

“Using the TractoInferno pipeline to provide quantitative results”

We argue that the pipeline does not evaluate metrics that are meaningful with respect to the intent of the proposed method, which is to reduce the number of invalid connections. The TractoInferno pipeline compares the volume of segmented bundles with the subject’s reference bundles. In that sense, the TractoInferno evaluation pipeline measures accuracy in terms of voxels, whereas our method focuses on accuracy in terms of connections. We would also like to point out that Section 3.4 and Table 4 do include quantitative results for in-vivo subjects. By tracking with exactly the same number of seed points, we evaluated which method produced the most “valid streamlines” according to three streamline segmentation methods.
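The distinction between voxel-level and connection-level accuracy can be made concrete with a small sketch. The definitions below are simplified, Tractometer-style formulations assumed for illustration; the function names and the toy data are hypothetical, and this is not the TractoInferno pipeline's code.

```python
from typing import Iterable, Set, Tuple

Voxel = Tuple[int, int, int]


def valid_connection_ratio(is_valid: Iterable[bool]) -> float:
    """Connection-level metric: fraction of streamlines labelled valid."""
    flags = list(is_valid)
    return sum(flags) / len(flags) if flags else 0.0


def overlap(tracked: Set[Voxel], reference: Set[Voxel]) -> float:
    """Voxel-level metric: fraction of reference voxels that were covered."""
    return len(tracked & reference) / len(reference) if reference else 0.0


def overreach(tracked: Set[Voxel], reference: Set[Voxel]) -> float:
    """Voxel-level metric: voxels outside the reference, relative to its size."""
    return len(tracked - reference) / len(reference) if reference else 0.0


# Toy data (made up): a 4-voxel reference bundle and a tractogram of
# 4 streamlines, 3 of which connect valid regions.
reference_voxels = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)}
tracked_voxels = {(0, 0, 0), (1, 0, 0), (1, 1, 0)}
streamline_labels = [True, True, True, False]

vc = valid_connection_ratio(streamline_labels)
ol = overlap(tracked_voxels, reference_voxels)
orr = overreach(tracked_voxels, reference_voxels)
```

In this toy case most streamlines are valid connections even though only half of the reference volume is covered, which shows how a method can score well on one kind of metric and modestly on the other.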

“Inadequate explanation of experimental results”

Table 3 reports the mean metrics over five reconstructions from five models trained with the same hyperparameters but different random seeds. As far as we are aware at the time of writing, a ratio of 88% valid connections is by far the highest reported.

TractOracle was sometimes able to recover a 20th bundle, whereas both sd_stream and ifod2 failed to do so. We theorize that the 20th bundle may not be well recognized by TractOracle-Net, which would explain why it is reconstructed only some of the time. TractOracle-RL recovered the fewest IB, highlighting again that the algorithm is highly accurate.

OL results for classical algorithms are consistent with previous literature on deterministic vs probabilistic tractography algorithms. Both “learned” algorithms obtained higher OR than ifod2 and sd_stream. This could be due to the learned procedure, or mask thresholds related to tracking termination criteria.

“TractOracle vs Anatomically-Constrained Tractography, comparison with sd_stream + ACT, iFOD2+ ACT”

We agree that anatomically constrained tractography has been proven to greatly improve the performance of tracking algorithms by reducing the number of false positives. However, we wanted to perform a “fair” evaluation by giving all algorithms the same “input” or “knowledge” about the underlying anatomy. Indeed, neither Track-to-Learn nor TractOracle has as part of its input the type of tissue at the head of the streamline. The algorithms are completely agnostic to the underlying tissues and instead rely on the shape of the streamlines being tracked. We agree, however, that the algorithm could greatly benefit from this information to propagate streamlines, as well as to select or discard them at the end of the tracking procedure. Future work could integrate this information to bring further anatomical knowledge into the tracking process.

“Discrepancies between in-silico and in-vivo results”

Indeed, there seems to be a discrepancy between the volume-related results in Table 3 and Figure 2 for Track-to-Learn and TractOracle. There may be a few reasons for this phenomenon, including the number of training subjects, the shape of the anatomy of the considered test subjects, the evaluation method (Recobundles vs streamline segmentation via regions of interest), etc.

“The validity of the anatomical scoring is unclear and failure modes of the whole approach are not described.”

Indeed, future work should focus on exploring the limitations and failure modes of TractOracle-Net to ensure anatomical accuracy. One research avenue would be to replace Recobundles with other streamline filtering methods such as SIFT or extractor_flow, or perhaps with a combination of several algorithms. Comparing streamline segmentations with post-mortem analysis could validate that the streamlines are properly segmented.




Meta-Review

Meta-review not available, early accepted paper.


