Abstract

Learning neuron-level circuit networks enables automatic neuron classification and connection prediction, both of which are fundamental tasks for connectome reconstruction and deciphering brain function. Traditional approaches have relied on extensive neuron typing and labor-intensive proofreading. In this paper, we introduce FlyGCL, a self-supervised learning approach designed to automatically learn neuron-level circuit networks, enabling it to capture the connectome’s topological features. Specifically, we leverage graph augmentation methods to generate various contrastive graph views. The proposed method differentiates between positive and negative samples in these views, allowing it to encode the structural representation of neurons as adaptable latent features for downstream tasks such as neuron classification and connection prediction. To evaluate our method, we construct two new neuron-level circuit network datasets, named HemiBrain-C and Manc-C, derived from the FlyEM project. Experimental results show that FlyGCL attains neuron classification accuracies of 73.8% and 57.4% on the two datasets, respectively, with >0.95 AUC in connection prediction tasks. Our code and data are available at https://github.com/mxz12119/FlyGCL.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2059_paper.pdf

SharedIt Link: https://rdcu.be/dV587

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72120-5_55

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/2059_supp.pdf

Link to the Code Repository

https://github.com/mxz12119/FlyGCL

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Li_SelfSupervisedContrastive_MICCAI2024,
        author = { Li, Junchi and Wan, Guojia and Liao, Minghui and Liao, Fei and Du, Bo},
        title = { { Self-Supervised Contrastive Graph Views for Learning Neuron-level Circuit Network } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15011},
        month = {October},
        pages = {590 -- 600}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors describe a self-supervised contrastive learning technique for graph neural networks, targeting downstream tasks such as neuron classification and connection prediction. They show significant performance gains over other relevant approaches. Further, they derive two datasets, HemiBrain-C and Manc-C, from the FlyEM project to evaluate their approach.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The authors aim to bridge the learning gap between graph learning and brain wiring diagram study by incorporating a self-supervised contrastive learning framework at the node level.

    • They composed two new circuit networks collected from the FlyEM project.

    • They achieve significant performance gains for the tasks of neuron classification and connection prediction, outperforming all compared approaches.

    • The paper is well written and easy to follow.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The authors claim to present a novel approach for graph self-supervised training in this paper; however, significant work has been done in this regard.
      • For example, [1] surveys various techniques for pre-training graph neural networks, including the augmentation techniques discussed in this paper.
      • The paper does make novel use of the technique at the application level; however, concerns remain about the novelty of the core methodology.
    • The authors claim to have constructed the Hemibrain-C and Manc-C datasets from the FlyEM project; are they releasing these datasets?

    • The paper is missing the ID.

    • The paper has minor typos.

    [1] Liu, Yixin, Ming Jin, Shirui Pan, Chuan Zhou, Yu Zheng, Feng Xia, and S. Yu Philip. “Graph self-supervised learning: A survey.” IEEE Transactions on Knowledge and Data Engineering 35, no. 6 (2022): 5879-5900.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    As explained in the weaknesses, I request the authors to explain in detail the specific novel aspect of their approach in terms of methodology. They claim “a novel self-supervised graph learning approach”, which would be misleading without specific details indicating either a novel use of such a framework or novelty in the core methodology; the two have different impacts on the contributions. Additionally, I would like the authors to clarify their contributions to constructing the datasets.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    As explained in the strengths, the paper utilizes a contrastive self-supervised mechanism to learn graph neural networks and achieves a remarkable performance gain. However, there are still gaps in the manuscript that the authors need to clarify thoroughly, as explained in the weaknesses and comments. I am tending toward slight rejection at the moment; however, upon clarification, I am happy to change my judgment and accept the paper.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors clarified my biggest concerns about the novelty and the dataset, so I am happy to increase my rating.



Review #2

  • Please describe the contribution of the paper

    The manuscript presents a self-supervised learning approach designed to automatically learn neuron-level circuit networks, enabling it to capture the connectome’s topological features. The encoded neural structural representations can be used for downstream tasks such as neuron classification and connection prediction. The proposed method was assessed using two new neuron-level circuit network datasets, named HemiBrain-C and Manc-C, created by the authors. The paper focuses on the fruit fly connectome.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The topic addressed - learning the connectome - is interesting, challenging, and relevant. 2) The paper is clearly written and easy to follow. 3) The proposed method (FlyCGV) is based on self-supervised learning and as such does not require human annotations. 4) The method was applied to two fruit fly connectome datasets reconstructed by the authors. 5) F1-score classification results (Table 1) show that FlyCGV outperforms seven competing methods on the two datasets. 6) AUC results for neuron connection prediction obtained by FlyCGV outperform four other competing methods (Fig. 3). 7) Two ablation studies were performed to test the effectiveness of the base encoder and the impact of different graph augmentations.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The proposed contribution does not seem to be methodological but application-specific. 2) The literature review/related work is very limited and relevant works are missing; it is therefore difficult to evaluate the contribution of the manuscript. 3) To this reviewer’s understanding, the compared methods are general and not specific to the connectome data on which they were tested. 4) The extent of the second claimed contribution - building the Hemibrain-C and Manc-C datasets - is unclear. 5) Essential details about the data, its challenges, and typical graph dimensions are missing. 6) The downstream tasks, and in particular the classification, are not well explained.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The code is enclosed in the supplementary material.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1) Novelty: GNNs are well-established deep learning tools, and the contrastive learning approach used is well known. The same applies to the proposed augmentations. In addition, the proposed InfoNCE loss function was proposed in [12]. The authors claim that this is the first self-supervised neuron-level circuit network learning model; however, it is difficult to evaluate the contribution since the literature review is limited. 2) What are the differences/advantages with respect to (for example) the following self-supervised graph approaches to exploring the connectome? A. https://link.springer.com/chapter/10.1007/978-3-031-43993-330 MICCAI 2023 B. https://ieeexplore.ieee.org/abstract/document/10122156?casa_token=UpNSrzsNx3gAAAAA:qoti2ng1ew9DFrdOS590pDu28BsR-i7Jzt7rVJwB3ZkPh6Tcx847UmtZoZOAcnrS-wa9SA9vRM- JBHI 2023 C. https://arxiv.org/abs/2403.01433, 2024 3) One of the two claimed contributions is building the Hemibrain-C and Manc-C datasets from the FlyEM project. Indeed, looking at the project webpage (referred to in the paper) https://www.janelia.org/project-team/flyem/, one can find details on both Hemibrain/Manc. The authors write: “we only keep the connectivity information and manufactured Neuron-level Circuit Networks, resulting two practical datasets, Hemibrain-C and Manc-C”. What was required for doing this? Can it be considered a contribution? 4) The paper does not provide sufficient details about the data used in practice: What are the initial node features? How many are there? What is the typical number of edges (connections) per node in a graph? 5) For readers unfamiliar with these data it is difficult to evaluate the difficulty of the classification task. What does each class represent (explain Fig. 2 (2nd plot) & S1 (supp.))? 6) The paper addresses a multi-class classification task with 25 classes for HemiBrain-C and 7 classes for Manc-C. However, y in Eq. 3 is binary. Have you solved a one-vs-all problem or a pairwise classification? Please clarify this point. 7) It would be interesting to see a t-SNE visualization for Manc-C. 8) Minor: In some cases, there is an upper-case letter in the middle of a sentence.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Please see my comments in 10. In general, the authors should clarify their contributions with respect to existing works in the domain and provide insight into the challenges.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The authors’ feedback is only partially convincing. The methodological novelty here is very limited. However, I agree that the proposed application allows the detection of neuron-level connections that reveal more fundamental brain patterns. I therefore keep my pre-rebuttal rating.



Review #3

  • Please describe the contribution of the paper

    The authors propose FlyCGV, a self-supervised learning approach for learning neuron-level circuit networks. Based on contrastive graph neural networks, it leverages graph augmentation to create contrastive graph views, enabling the model to learn topological features of neuron circuits without labels. It performs well in neuron classification and connection prediction tasks, using new datasets constructed by the authors from the FlyEM project.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Novelty
      • Contrastive graph neural networks at the neural-circuit level.
      • Experiments using a real fly-brain dataset.
      • Two real datasets were constructed from the FlyEM project.
    2. High performance compared to previous approaches
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Weak interpretation and discussion of results; please refer to the detailed comments.
    • Inconsistent results across the two datasets, which make interpretation difficult and raise questions about the quality of the datasets.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors already provided anonymized code in the supplementary materials.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • To establish the contribution and make Hemibrain-C and Manc-C viable benchmark datasets, their construction should be emphasized and described better. More detailed statistics than Table S1 would be necessary, including statistics for each class.

    • Is feature dropping a special case of feature masking? Is FD a scenario where the mask consists of column-wise 0s and 1s? If so, is there a fundamental difference in effect between the two?

    • In Figure 4, why does CN decrease as the train ratio increases? This seems counterintuitive.

    • Additionally, there are some inconsistent results that make interpretation difficult. In Figure 4, why does GCN perform well on HemiBrain (almost as well as FlyCGV) while it fails on Manc? What is the main reason for the different results between the two datasets (amount of data? classes?)?

    • In Figure 5, a colormap with labels would be beneficial. The figure is difficult to understand, and it is not immediately clear that “Distinct clusters are evident, with crisp boundaries”, as claimed by the authors. Some clusters appear indistinct in the t-SNE plot. Which clusters are distinct, and which are not? Also, there are colors present in the t-SNE plots that are absent from the brain image; for example, there is no orange on the right. Are they simply occluded?


    Minor comments

    • (Line 4) zembrafish –> zebrafish
    • Recommend clarifying that the critic function $\theta$ represents cosine similarity.
    • Please specify the reason for using F1 (Micro). Was there class imbalance? If so, what was the distribution among the classes?
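    The critic-as-cosine-similarity reading mentioned above can be made concrete with a short sketch of the standard InfoNCE objective (an illustrative reconstruction of the common formulation from the literature, not the paper's actual code; the array shapes and temperature value are assumptions):

    ```python
    import numpy as np

    def info_nce(z1, z2, tau=0.5):
        """InfoNCE loss with a cosine-similarity critic.

        z1[i] and z2[i] are embeddings of the two augmented views of node i
        (the positive pair); all cross-node pairs serve as negatives.
        """
        z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
        z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
        s = (z1 @ z2.T) / tau  # (N, N) cosine similarities scaled by temperature
        # log-softmax over each row; the positive pairs sit on the diagonal
        log_prob = s - np.log(np.exp(s).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    rng = np.random.default_rng(0)
    z = rng.normal(size=(8, 16))
    # Identical views should yield a lower loss than mismatched (shuffled) views.
    loss_matched = info_nce(z, z)
    loss_shuffled = info_nce(z, z[::-1])
    ```

    Minimizing this loss pulls the two views of the same node together while pushing apart views of different nodes, which is what lets the encoder learn structure without labels.
    
    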
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The algorithmic novelty appears to be modest, but the application to connectome datasets with appropriate adaptation and experiments is the main strength. Overall, the paper is well-written.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors have clarified the contribution, provided details for dataset construction and methods. I increase the score and hope to see a revised version following reviewers’ suggestions.




Author Feedback

We deeply appreciate all constructive comments for improving our work. Here we respond to the main concerns; minor issues will be corrected in the revised manuscript.

Reviewer 1: Q1: “..explain the specific novel aspect in terms of methodology” A1: We are aware that we failed to clearly explain our major contribution in the paper and somewhat overclaimed our method’s novelty. We agree with the reviewer’s comment that almost all major components of the proposed method have appeared individually in previous works [1,2]. To avoid overclaiming methodological novelty, we will rename FlyCGV to FlyGCL, make clear citations, and revise our claims in the manuscript accordingly. It should be noted that our main contribution is the introduction of graph contrastive learning to neuron classification and neuron connection prediction at the neuron level, both of which have rarely been studied from a graph learning perspective so far. We believe that integrating existing contrastive learning components into a new system to address critical neuroscience problems is non-trivial and falls within the scope of MICCAI topics. Acquiring connectome data heavily relies on human annotation [3]. We believe that graph contrastive learning is simple and user-friendly, requiring fewer training labels. Q2: “..they releasing these datasets?” and “..clarify their contributions to constructing the datasets.” A2: Thank you for your comments. The raw data have been released at https://neuprint.janelia.org/, including neuropil, neuron morphology, neuron connections, and typing results. Unfortunately, these data cannot be directly used for deep learning tasks. We have made the following efforts:

  1. We constructed the Hemibrain-C and Manc-C networks from 20 million and 84 million synaptic connections, respectively.
  2. The original classification adopted by neuroscientists was a mixed standard involving GAL4, morphology, region, cell body fibers, and lineage, which resulted in a very long-tailed distribution of categories. We coarsened the classification method through expert refinement, making it better suited for deep learning tasks.
  3. We followed the standard for constructing OGB-like [4] graph datasets using PyTorch Geometric (if interested, please see src/data/manc.py and hemibrain.py). This enables researchers to easily engage in scalable, reproducible, multi-task studies in neuroscience. We apologize for not providing sufficient details on how the proposed datasets were constructed; we will include this information in the revised manuscript.

Refs: [1] You, Y. et al. 2020 NeurIPS (GCL). [2] Khosla, P. et al. 2020 NeurIPS (InfoNCE). [3] Dorkenwald, S. et al. 2022 Nature Methods (FlyWire). [4] Hu, W. et al. 2020 NeurIPS (OGB).

Reviewer 3: Q1: “… literature review is limited” and “…differences with respect to the following…” A1: Thanks for your feedback. Before the FlyEM project, neuron-level connectome data were rare. The principal difference between our work and the studies mentioned is that their data are derived from fMRI, which only identifies region-level brain signals, rather than neuron-level connections that reveal more fundamental brain patterns. We will include additional discussion of other non-neuron-level connectomes in the revised manuscript.

Reviewer 4: Q1: “Is feature dropping a special case of feature masking?” A1: In extreme cases, feature dropping can be considered a special case of feature masking. However, in graph learning they have different implications. FD completely removes certain features, simulating missing or corrupted data to build robustness. FM sets features to zero temporarily, promoting flexibility and generalization. Q2: “..Why does CN.., not general?” A2: CN is a classic heuristic method for link prediction that predicts connections by counting common neighbors. As the training ratio increases, common-neighbor patterns are prone to being disturbed by high-connectivity nodes, resulting in a performance drop.
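The FD/FM distinction discussed above can be illustrated with a small NumPy sketch (our illustration of the general technique, not the authors' implementation; the drop/mask rate is an assumption). Feature masking zeroes individual entries independently, while feature dropping zeroes entire feature columns, i.e., applies a column-constant mask:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(6, 10))  # node-feature matrix: 6 neurons, 10 feature dims

def feature_masking(X, rate=0.3, rng=rng):
    """FM: zero out individual entries at random (element-wise mask)."""
    mask = rng.random(X.shape) >= rate
    return X * mask

def feature_dropping(X, rate=0.3, rng=rng):
    """FD: zero out whole feature columns at random (column-wise mask),
    simulating missing or corrupted feature dimensions."""
    keep = rng.random(X.shape[1]) >= rate
    return X * keep  # broadcasting applies the same mask to every row

Xm = feature_masking(X)
Xd = feature_dropping(X)
# In FD, every column is either untouched or entirely zero, which is
# exactly FM restricted to column-constant masks.
```

This makes the reviewer's observation concrete: FD is the column-wise special case of FM, but the two perturb the learned invariances differently, since FD removes a feature dimension for all nodes at once.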




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The rebuttal addresses reviewers’ concerns well on the overclaimed novelty and the lack of details. It is an exciting application paper with solid performance gain.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The novelty of the method was questioned by multiple reviewers (R1,R3,R4) but the authors have expressed that they are open to reducing the claim of novelty in the revised paper.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).



