Patterns from chips

Expert as they are in their own field, few scientists can write software capable of crunching the millions of bytes of data their experiments can generate. To their aid come biostatisticians and computational biologists who have developed programs, some for sale, some free over the Internet, for handling the numerical side of biological research.

One of the most popular free products among those involved in genomic research is the DNA-Chip Analyzer (dChip), developed by Dana-Farber's Cheng Li, PhD, and his mentor, Wing Wong, PhD (now at Stanford University).

Downloadable from the Web, the software handles data from "microarray" experiments that register the activity of thousands of genes, letting researchers find patterns that distinguish cancer cells from normal ones and separate one form of cancer from another. Since the program's introduction four years ago, Li has received more than 2,000 user requests. It has also spawned an Internet-based support group where users can ask one another, and Li, questions about dChip's capabilities.

The software can help investigators determine if their gene-array experiments are picking up true differences between various tissue samples, or if the differences are due to statistical "noise." It can also help them make sure the molecular probes for tracking individual genes are doing their job. "dChip offers investigators a way to be sure the comparisons they're making are valid," Li says. "Then it gives them the ability to do the high-level analysis involved in genomic research."
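To make the idea of separating a true difference from statistical "noise" concrete, here is a minimal sketch in Python. It is not dChip's own method, which the article does not describe. It uses a generic exact permutation test on the difference in group means, and the function name and all expression values are made up for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means.

    Returns the fraction of all possible relabelings of the pooled samples
    whose absolute mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)

    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance so floating-point rounding does not drop ties.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical log-scale expression values for a single gene (made-up numbers).
tumor = [8.1, 8.4, 7.9, 8.6]
normal = [6.2, 6.0, 6.5, 6.1]
p_separated = permutation_p_value(tumor, normal)

# Two groups whose values overlap heavily, as pure noise would.
noisy_a = [7.0, 6.1, 7.9, 6.6]
noisy_b = [6.8, 7.5, 6.3, 7.1]
p_noisy = permutation_p_value(noisy_a, noisy_b)

print(f"clearly separated groups: p = {p_separated:.3f}")
print(f"overlapping groups:       p = {p_noisy:.3f}")
```

A small p-value suggests the gap between the groups would rarely arise from random relabeling alone, while a large one is consistent with noise. Real microarray analyses test thousands of genes at once, so they also need a correction for multiple comparisons, which this sketch leaves out.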

According to Li, making dChip freely available was not only an important "selling point" to the National Institutes of Health, which funded its development, but also the surest way to build a large, loyal audience. "Our users are the best source of feedback on how the software can be improved," he remarks, noting that the program has been cited by the authors of more than 350 published scientific studies to date.

dChip is just one of the computer tools developed by Dana-Farber biostatisticians and computational biologists to help scientists manage mountains of data. Others include Bioconductor, an "open source" software project led by Robert Gentleman, PhD (now at the Fred Hutchinson Cancer Center), for analyzing and understanding genomic data, and an algorithm developed by Xiaole (Shirley) Liu, PhD, to help scientists study the portions of genes that act as on-off switches. Also up for sharing are special-use tools created by Robert Gray, PhD, for controlling error rates in microarray studies and sorting out the risks in people who have multiple health problems.