Stanford Report Online




Stanford Report, June 6, 2001
New method of analyzing genetic data enhances value of microarray technology

BY BETSY MASON

A group of scientists at Stanford University Medical Center has cleared a major hurdle in the race to create practical applications of the human genome. They have developed a new way to analyze the data generated by microarrays -- a technology that is already revolutionizing biomedical research.

Gilbert Chu, associate professor of medicine and of biochemistry, and Virginia Goss Tusher, a graduate student in biochemistry, are studying the genetic effects of radiation treatments on cancer patients. Because current techniques for analyzing genetic data were inadequate for their purposes, they worked with biostatistician Rob Tibshirani, professor of health research and policy and of statistics, to develop a new approach.

The result is an innovative method known as Significance Analysis of Microarrays, or SAM, that enables scientists to measure the reliability and accuracy of their data. Tusher and Chu are using SAM to reveal which genes are responsible for the adverse side effects of ionizing radiation therapy for cancer patients.

"Ideally, oncologists would like to predict which anti-cancer agents will kill the tumor while leaving the patient unharmed," said Chu, whose research is supported by the National Cancer Institute and the Burroughs Wellcome Fund.

Two recent advances have made it possible to generate huge amounts of genetic data. First, the sequencing of the human genome has given scientists a map of our genetic landscape. Second, microarray technology attaches a copy of each gene to a solid surface less than 1 inch square, allowing scientists to determine which genes are active in human cells.

"It's a way of analyzing how all of the genes in the genome are behaving," Chu said. "Microarrays will play a key role in harvesting medical benefits from the human genome project."

Microarrays, often called gene chips, contain DNA corresponding to each gene in a precisely mapped location. A sample of cellular RNA, tagged with fluorescent molecules, is squirted onto the chip. The active genes that determine a cell's function will find their uniquely complementary strands and stick to them. These genes light up as green spots on the chip and can be precisely identified.

"In our first experiments, we wanted to see which genes are turned on and off by ionizing radiation," Tusher said.

She performed experiments with normal cells that were either exposed or unexposed to a dose of ionizing radiation similar to that used in treating cancer. She then searched for genes that lit up the array in each case with the hope of generating a list of potential culprits for adverse side effects.

"The problem was that every time we changed the analysis a little, we got a different set of genes," Tusher said. "We were using a low dose of radiation, and we needed something to pick up really subtle changes."

The greatest difficulty with analyzing the data from the microarrays is determining which results are significant. With previous analysis techniques, the number of false positives -- results generated by chance -- rendered Tusher's results virtually useless. SAM solved this problem by decreasing the rate of false results and revealing which genes were affected by radiation.

The trio is now analyzing the behavior of those genes in cancer patients to determine which ones are associated with adverse side effects from radiation treatment.

Chu likes to describe the problem in terms of schoolchildren flipping coins. Suppose there are 7,000 or so kids and each has a penny. The kids flip their coins six times indoors and six times outdoors to see if the sun has any effect on the results. Out of the thousands of children, a few will flip only heads indoors and only tails outdoors, or vice versa, and a couple hundred will come close. These kids appear to be extraordinary.

"But are those kids really different?" Chu said. "With 7,000 kids, or 35,000 genes, amazing things can happen. SAM tells scientists what percent of the results are random by measuring the behavior of scrambled experiments."

In the hypothetical coin-toss experiment, each result would be tagged with the word "indoors" or "outdoors" according to where the toss occurred. If those tags are randomly reassigned to the same results, counting the "amazing" results that still turn up gives an estimate of how many arise by chance alone. This estimate would be compared to the number of amazing results in the actual experiment to determine the percentage of false results.

In the coin-toss experiment, the number of "amazing" results predicted to be false would approach 100 percent. But in biological experiments where genes are genuinely affected, the percentage will be much lower.
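
The relabeling idea in the coin-toss example can be sketched in a few lines of Python. This is only an illustration of the scrambling step, not the SAM software or its statistic. The function names, the threshold of 4 (a kid counts as "amazing" when the heads totals indoors and outdoors differ by at least four), the 20 shuffles and the 500 "sun-sensitive" kids are all assumptions made for the example. The threshold was picked because it flags a couple hundred of 7,000 fair-coin kids; requiring all heads against all tails would flag only about three.

```python
import random

N_FLIPS = 6  # flips per location, as in the coin-toss example
LABELS = ["indoors"] * N_FLIPS + ["outdoors"] * N_FLIPS


def count_amazing(flips, labels, threshold):
    """Count kids whose indoor and outdoor heads totals differ by >= threshold."""
    amazing = 0
    for kid in flips:
        indoors = sum(f for f, lab in zip(kid, labels) if lab == "indoors")
        outdoors = sum(kid) - indoors
        if abs(indoors - outdoors) >= threshold:
            amazing += 1
    return amazing


def false_fraction(flips, labels, threshold, n_shuffles, rng):
    """Estimate the share of 'amazing' results expected from chance alone."""
    observed = count_amazing(flips, labels, threshold)
    total = 0
    for _ in range(n_shuffles):
        shuffled = list(labels)
        rng.shuffle(shuffled)  # scramble which tosses count as indoors
        # The same scrambled labels are applied to every kid in this shuffle.
        total += count_amazing(flips, shuffled, threshold)
    by_chance = total / n_shuffles
    fraction = min(1.0, by_chance / observed) if observed else 0.0
    return observed, by_chance, fraction


def toss(rng, p_indoors, p_outdoors):
    """One kid's 12 tosses (1 = heads), with separate heads odds per location."""
    return ([int(rng.random() < p_indoors) for _ in range(N_FLIPS)]
            + [int(rng.random() < p_outdoors) for _ in range(N_FLIPS)])


rng = random.Random(0)  # fixed seed so the run is repeatable
fair_kids = [toss(rng, 0.5, 0.5) for _ in range(7000)]
# Hypothetical contrast: 500 extra kids whose coins really do respond to the sun.
sun_kids = [toss(rng, 0.95, 0.05) for _ in range(500)]

fair_result = false_fraction(fair_kids, LABELS, 4, 20, rng)
mixed_result = false_fraction(fair_kids + sun_kids, LABELS, 4, 20, rng)
print("fair coins only:   observed=%d, by chance=%.1f, false=%.0f%%"
      % (fair_result[0], fair_result[1], 100 * fair_result[2]))
print("with real effects: observed=%d, by chance=%.1f, false=%.0f%%"
      % (mixed_result[0], mixed_result[1], 100 * mixed_result[2]))
```

With fair coins only, the scrambled counts roughly match the observed count, so nearly all of the "amazing" kids are estimated to be flukes. Once some coins truly behave differently outdoors, the observed count rises well above the scrambled counts and the estimated false percentage drops.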

"SAM gives you a really nice picture of what the cell is doing," Chu said. With a reliable list of genes that are active in a given cell, scientists can focus their research much more effectively.

"The potential for benefit is obviously enormous," he added.

The new method was published in the April 24 issue of Proceedings of the National Academy of Sciences and has already gained a lot of attention.

"It's been available to the world for a few weeks, and already we've had about 270 downloads from places like Japan, China, Australia, Europe, Canada and of course the United States," Tibshirani said. "It's very popular because it's easy to use. It's built right into Excel."

The software was designed and written by Balasubramanian Narasimhan, senior research associate in statistics.

"Because cancer is so important and genetically diverse, it is the focus of much of the microarray work," Tibshirani said.

He's working with a group at Stanford using SAM to find genes related to survival in breast cancer, and another group is using it to study a promising drug for head and neck cancer. Other researchers at Stanford are using the method to study ovarian cancer and kidney cancer, he noted.

Betsy Mason is a science-writing intern for Stanford University News Service.