Graphic by Ana Marija Sokovic

When Charles Darwin took his historic voyage aboard the HMS Beagle from 1831 to 1836, “big data” was measured in pages. On his travels, the young naturalist produced at least 20 field notebooks, zoological and geological diaries, a catalogue of the thousands of specimens he brought back and a personal journal that would later be turned into The Voyage of the Beagle. But it took more than two decades for Darwin to distill all of that information into his theory of natural selection and publish On the Origin of Species.

While biological data may have since transitioned from analog pages to digital bits, extracting knowledge from data has only become more difficult as datasets have grown larger and larger. To wedge open this bottleneck, the University of Chicago Biological Sciences Division and the Computation Institute launched their very own Beagle — a 150-teraflop Cray XE6 supercomputer that ranks among the most powerful machines dedicated to biomedical research. Since the Beagle’s debut in 2010, over 300 researchers from across the University have run more than 80 projects on the system, yielding over 30 publications.

“We haven’t had to beat the bushes for users; we went up to 100 percent usage on day one, and have held pretty steady since that time,” said CI director Ian Foster in his opening remarks. “Supercomputers have a reputation as being hard to use, but because of the Beagle team’s efforts, because the machine is well engineered, and because the community was ready for it, we’ve really seen rapid uptake of the computer.”

A sampler of those projects was on display last week as part of the first Day of the Beagle symposium, an exploration of scientific discovery on the supercomputer. The projects ranged from the very big (networks of genes, regulators and diseases built by UIC’s Yves Lussier) to the very small (atomic models of molecular motion in immunological factors, cell structures and cancer drugs). Beagle’s flexibility in handling projects from across the landscape of biology and medicine ably demonstrated how computation has solidified into a key branch of research in these disciplines alongside traditional theory and experimentation.


