Model From “Attractor Metagenes” Team Tops The September 1 Leaderboard!

Please join all of us at DREAM and Sage Bionetworks in congratulating the “Attractor Metagenes” team for their September 1 Leaderboard Winner Achievement. Please read on to hear from Wei-Yi Cheng, who submitted the winning model on behalf of his team. The BCC Support Team

Dear fellow BCC challenge participants and organizers,

This is Wei-Yi Cheng, along with my teammates Tai-Hsien Ou Yang and Professor Dimitris Anastassiou at Columbia University. It is our great honor to be highlighted as the top team on September 1st in the competition. Tai-Hsien and I are currently Ph.D. students in Prof. Anastassiou’s Genomic Information Systems Laboratory (GISL), and the three of us have recently been working extensively to develop prognostic models for this challenge. I would like to thank the organizers for giving me the opportunity to introduce our team and the ideas that we have been using.

The main topic of my thesis will be the discovery of biomolecular mechanisms in cancers using an iterative computational process that converges to what we call “attractor metagenes” or just “attractors.” In contrast to other methods of finding modules of co-expressed genes, the attractor methodology is unconstrained, so it can point to the core genes of the biomolecular event that it represents. Remarkably, we found that some of these attractors are present in nearly identical form in all cancer types that we tried, suggesting that they represent universal mechanisms. We like to think of two of these attractors as “bioinformatic hallmarks of cancer.” We call them the “mesenchymal transition attractor” and the “mitotic chromosomal instability (CIN) attractor.” We believe that they reflect universal biological mechanisms empowering cancer cells to invade surrounding tissues and to divide uncontrollably, respectively. They are also strongly associated with tumor stage and grade, respectively, as well as other phenotypes. We also found many other attractors, including amplicons, particularly one prominent universal amplicon at chr8q24.3, and some attractors that are cancer-type specific, such as the estrogen receptor attractor. For information about the underlying algorithm and additional results, please see our preprint at http://arxiv.org/abs/1204.6538.
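
As a rough illustration of the kind of iterative process described above (a minimal sketch, not the team's actual implementation), the following Python code starts from a seed gene, weights every gene by its association with the current metagene, rebuilds the metagene as the weighted average, and repeats until the weights stop changing. The function name `find_attractor`, the use of Pearson correlation as the association measure, the exponent of 5, and the toy expression values are all assumptions made here for illustration; the preprint linked above describes the real algorithm and its parameters.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def find_attractor(expr, seed, exponent=5.0, tol=1e-6, max_iter=100):
    """Iterate from a seed gene toward a stable weighted metagene.

    expr maps gene name -> list of expression values, one per sample.
    Returns (weights, metagene). Hypothetical helper for illustration only.
    """
    genes = sorted(expr)
    n_samples = len(expr[seed])
    metagene = list(expr[seed])  # the first metagene is the seed gene itself
    weights = {}
    for _ in range(max_iter):
        # Weight each gene by its (non-negative) association with the metagene,
        # sharpened by the exponent so strongly associated genes dominate.
        new_weights = {
            g: max(pearson(expr[g], metagene), 0.0) ** exponent for g in genes
        }
        total = sum(new_weights.values())
        if total == 0:
            break
        # The next metagene is the weighted average of all genes, per sample.
        metagene = [
            sum(new_weights[g] * expr[g][s] for g in genes) / total
            for s in range(n_samples)
        ]
        change = max(abs(new_weights[g] - weights.get(g, 0.0)) for g in genes)
        weights = new_weights
        if change < tol:  # weights have stopped moving: treat as converged
            break
    return weights, metagene

# Toy data (invented): A, B, C are co-expressed; D and E are not.
toy_expr = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8],
    "B": [1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.2, 7.9],
    "C": [0.8, 2.1, 2.9, 3.8, 5.3, 5.9, 6.8, 8.1],
    "D": [5, 1, 4, 2, 5, 1, 4, 2],
    "E": [3, 3, 1, 5, 2, 4, 3, 3],
}
weights, metagene = find_attractor(toy_expr, seed="A")
top_genes = sorted(weights, key=weights.get, reverse=True)[:3]
```

On this toy data the three co-expressed genes end up with the largest weights, which mirrors the idea that the converged metagene points to the core genes of a co-expression signature regardless of which of them was used as the seed.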

In our models, we use the attractor metagenes for survival prediction. The mitotic CIN metagene is the most prognostic, but the other ones provide significant additional help. We think that using such metagenes representing biomolecular events is preferable to using individual genes or classification into subtypes. For example, one of the features of our top model as of September 1st (#118304) is the replacement of the PAM50 molecular subtype classifier by three of our attractor metagenes: the mitotic CIN attractor, the estrogen receptor attractor, and a chr7p11.2 amplicon involving EGFR. We do not claim that these three metagenes contain all the information in PAM50. But we believe that the effort to discover mutually exclusive “subtypes” of cancer (not just in breast cancer but in all types of cancer) may have done the community a disservice. Instead, we think that simply focusing on precise biomolecular events will lead to a better understanding of the underlying mechanisms. For example, although similar subtypes have been identified across cancer types, this similarity has not been strong enough to infer that it reflects the same biological event. In contrast, the attractor metagenes are found to be nearly identical across cancer types.

In our submission, we used AIC with a Cox regression model to select other clinical features, and included a GBM model fed with relevant clinical features and several other attractor metagenes. The R package for finding attractor metagenes is available under Synapse ID syn1123167.
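
To make the AIC-based feature selection mentioned above more concrete, here is a minimal Python sketch of forward selection driven by AIC = 2k - 2 ln L, where k is the number of model parameters and L is the fitted (partial) likelihood. It is not the team's code: the forward direction, the one-parameter-per-feature assumption, the helper names `aic` and `forward_select`, the feature names, and the stand-in `toy_loglik` function (which replaces an actual Cox model fit) are all hypothetical and chosen only for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def forward_select(candidates, fit_loglik):
    """Greedy forward selection by AIC.

    fit_loglik(features) must return the maximized log-likelihood of a model
    using exactly those features (for a Cox model, the partial
    log-likelihood). Assumes one parameter per feature.
    """
    selected = []
    best = aic(fit_loglik(selected), 0)
    while True:
        best_feature = None
        for feature in candidates:
            if feature in selected:
                continue
            score = aic(fit_loglik(selected + [feature]), len(selected) + 1)
            if score < best:
                best, best_feature = score, feature
        if best_feature is None:  # no remaining feature lowers the AIC
            return selected, best
        selected.append(best_feature)

# Stand-in for a real model fit: each invented feature adds a fixed amount
# to the log-likelihood. A real version would fit a Cox model here.
TOY_GAINS = {"age": 6.0, "tumor_size": 3.5, "lymph_nodes": 4.0, "noise": 0.4}

def toy_loglik(features):
    return -100.0 + sum(TOY_GAINS[f] for f in features)

selected, final_aic = forward_select(list(TOY_GAINS), toy_loglik)
```

Because each extra parameter costs 2 AIC units, a feature is kept only if it raises the log-likelihood by more than 1; in the toy example the `noise` feature fails that bar and is left out.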

I would like to express my appreciation and admiration to Adam, Erhan, Thea, and all the other challenge organizers, as well as everyone who contributed funding, data, or infrastructure to make this challenge possible. The design and implementation of the challenge provided an open and transparent environment for us to know how we are doing, and to learn from others at the same time. We believe that such an open-source environment can really help push innovation further for better applications of bioinformatic tools. It will be rewarding if this wonderful and worthy collective effort leads to an improvement in the prognosis of this devastating disease. May the best model win!

Wei-Yi Cheng

Graduate Research Assistant, Ph.D. Candidate in Electrical Engineering

Genomic Information Systems Laboratory,

Columbia University
