Using readily available computer programs, researchers have developed a system to identify genes that will be useful in the classification of breast cancer. The algorithm, described in BioMed Central’s open access Journal of Experimental & Clinical Cancer Research, will enable researchers to quickly generate valuable gene signatures without specialized software or extensive bioinformatics training.
Robin Hallett, a graduate student working under the supervision of Dr. John Hassell, and other members of Hassell’s research team at McMaster University, Ontario, Canada, developed the algorithm and used it to identify a 20-gene signature, which performed well on a 151-patient validation dataset.
Hallett said, “Until now, constructing such a signature requires the use of various clustering and classification algorithms, which in turn require specialized software and bioinformatics training. Importantly, we completed all steps of our algorithm using Microsoft Excel 2007. This software is widely, if not universally, accessible to the biological research community, suggesting that implementation of this technique will not be hampered by lack of software or training.”
The researchers used data from a group of 144 patients to train the algorithm to identify genes whose expression levels correlated with patient survival. The 10 highest-ranked genes predictive of poor prognosis and the 10 highest-ranked genes predictive of good prognosis together formed a 20-gene expression-based predictor, which was found to perform as well as two other models in the validation group.
According to Hassell, “Our algorithm produces prediction models with comparable accuracy to other feature selection techniques while having generally better accessibility and useability for biological research scientists. We’ve begun using our algorithm to generate gene expression based prediction models of breast cancer cell sensitivity to commonly used anti-cancer therapies.”
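For readers who want a concrete picture of the gene-ranking step described above, the following is a minimal sketch in Python. It is not the researchers' method, which was carried out in Microsoft Excel 2007 and whose exact scoring is not given here. The sketch assumes a plain Pearson correlation between each gene's expression and survival time, and a simple risk score equal to the mean expression of the poor-prognosis genes minus the mean expression of the good-prognosis genes. The function names `pearson`, `select_signature`, and `risk_score` are hypothetical.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)


def select_signature(expression, survival, n_per_group=10):
    """Pick the genes most associated with poor and with good prognosis.

    expression: dict mapping gene name -> list of expression values, one per patient
    survival:   list of survival times, in the same patient order
    Assumption: correlation with survival time stands in for the article's ranking.
    """
    if 2 * n_per_group > len(expression):
        raise ValueError("not enough genes for two non-overlapping groups")
    # Sort from most negative correlation (high expression, short survival)
    # to most positive correlation (high expression, long survival).
    ranked = sorted(expression, key=lambda g: pearson(expression[g], survival))
    poor = ranked[:n_per_group]
    good = ranked[-n_per_group:]
    return poor, good


def risk_score(patient, poor, good):
    """Assumed score: higher values suggest a poorer predicted outcome.

    patient: dict mapping gene name -> expression value for one patient
    """
    return mean(patient[g] for g in poor) - mean(patient[g] for g in good)
```

With the default `n_per_group=10`, `select_signature` returns two lists of 10 genes each, mirroring the 20-gene predictor described in the article. A real analysis would also need to handle censored survival data and would be checked on a separate validation group, as the researchers did.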