
Archive for April, 2013

Graphic by Ana Marija Sokovic

When Charles Darwin took his historic voyage aboard the HMS Beagle from 1831 to 1836, “big data” was measured in pages. On his travels, the young naturalist produced at least 20 field notebooks, zoological and geological diaries, a catalogue of the thousands of specimens he brought back and a personal journal that would later be turned into The Voyage of the Beagle. But it took Darwin more than two decades to process all of that information into his theory of natural selection and the publication of On the Origin of Species.

While biological data may have since transitioned from analog pages to digital bits, extracting knowledge from data has only become more difficult as datasets have grown larger and larger. To wedge open this bottleneck, the University of Chicago Biological Sciences Division and the Computation Institute launched their very own Beagle — a 150-teraflop Cray XE6 supercomputer that ranks among the most powerful machines dedicated to biomedical research. Since the Beagle’s debut in 2010, over 300 researchers from across the University have run more than 80 projects on the system, yielding over 30 publications.

“We haven’t had to beat the bushes for users; we went up to 100 percent usage on day one, and have held pretty steady since that time,” said CI director Ian Foster in his opening remarks. “Supercomputers have a reputation as being hard to use, but because of the Beagle team’s efforts, because the machine is well engineered, and because the community was ready for it, we’ve really seen rapid uptake of the computer.”

A sampler of those projects was on display last week as part of the first Day of the Beagle symposium, an exploration of scientific discovery on the supercomputer. The projects ranged from the very big — networks of genes, regulators and diseases built by UIC’s Yves Lussier — to the very small — atomic models of molecular motion in immunological factors, cell structures and cancer drugs. Beagle’s flexibility in handling projects from across the landscape of biology and medicine ably demonstrated how computation has solidified into a key branch of research in these disciplines, alongside traditional theory and experimentation.

Read Full Post »

Newer, faster supercomputers have allowed scientists to create detailed models of blood flow that help doctors understand what happens at the molecular level. (Photo from Argonne)

This week, some 25 cities around the world are hosting events online and offline as part of Big Data Week, described by its organizers as a “global community and festival of data.” The Chicago portion of the event features several people from the Computation Institute, including two panels on Thursday: “Data Complexity in the Sciences: The Computation Institute” featuring Ian Foster, Charlie Catlett, Rayid Ghani and Bob George, and “Science Session with the Open Cloud Consortium” featuring Robert Grossman and his collaborators. Both events are in downtown Chicago, free, and you can register at the above links.

But the CI’s participation in Big Data Week started with two webcast presentations on Tuesday and Wednesday that demonstrated the broad scope of the topic. The biggest data of all is being produced by simulations on the world’s fastest supercomputers, including Argonne’s Mira, the fourth-fastest machine in the world. Mira can perform 10 quadrillion floating point operations per second, but how do you make sense of the terabytes of data such powerful computation produces on a daily basis?

In his talk “Big Vis,” Joseph Insley of Argonne and the CI explained how he and his team have developed equally impressive visualization technology to keep pace with Mira’s data firehose. Tukey, a 96-node visualization cluster, is Mira’s sidekick, sharing the same software and file systems with its big sibling so it can more easily take in data and transform it into images. Insley demonstrated how visualization was instrumental in two major simulations conducted on Mira: one studying arterial blood flow and aneurysm rupture in the brain, and another on nothing less than the evolution of the entire universe.

Read Full Post »

People who work in laboratories take a lot of things for granted. When they come into work in the morning, they expect the equipment to have power, the sink to produce hot and cold water, and the internet and e-mail to be functional. Because these routine services are taken care of “behind the scenes” by facilities and IT staff, scientists can get started right away on their research.

But increasingly, scientists are hitting a new speed bump in their day-to-day activities: the storage, movement and analysis of data. As datasets grow far beyond what can easily be handled on a single desktop computer and long-distance collaborations become increasingly common, frustrated researchers find themselves spending more and more time and money on data management. To get the march of science back up to top speed, new services must be provided that make handling data as simple as switching on the lights.

That mission was the common thread through the second day of the GlobusWorld conference, an annual meeting for the makers and users of the data management service, held this year at Argonne National Laboratory. As Globus software has evolved from enabling the grids that connect computing centers around the world to a cloud-based service for moving and sharing data, the focus has shifted from large, Big Science collaborations to individual researchers. Easing the headache for those smaller laboratories with little to no IT budget can make a big impact on the pace of their science, said Ian Foster, Computation Institute Director and Globus co-founder, in his keynote address.

“We are sometimes described as plumbers,” Foster said. “We are trying to build software and services that automate activities that get in the way of discovery and innovation in research labs, that no one wants to be an expert in, that people find time-consuming and painful to do themselves, and that can be done more effectively when automated. By providing the right services, we believe we can accelerate discovery and reduce costs, which are often two sides of the same coin.”
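
For readers wondering what that plumbing looks like from a researcher’s seat, here is a minimal sketch of a Globus file transfer using the present-day Globus Python SDK (globus_sdk). The SDK in this form postdates the 2013 service Foster described, and the endpoint IDs, paths and access token below are placeholders, so treat it as an illustration of the idea rather than the exact interface discussed at GlobusWorld.

```python
# Minimal sketch of a Globus-managed transfer between two endpoints.
# The access token, endpoint UUIDs and paths are placeholders.
import globus_sdk

ACCESS_TOKEN = "TRANSFER_API_ACCESS_TOKEN"  # hypothetical token from Globus Auth
SRC_ENDPOINT = "11111111-2222-3333-4444-555555555555"  # placeholder UUID
DST_ENDPOINT = "66666666-7777-8888-9999-000000000000"  # placeholder UUID

# Authenticate the transfer client with an already-issued token.
tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(ACCESS_TOKEN)
)

# Describe the transfer; once submitted, Globus handles retries,
# integrity checks and notification behind the scenes.
task = globus_sdk.TransferData(
    tc, SRC_ENDPOINT, DST_ENDPOINT, label="lab-to-archive copy"
)
task.add_item("/instrument/run42/data.h5", "/archive/run42/data.h5")

result = tc.submit_transfer(task)
print("Submitted transfer task:", result["task_id"])
```

The point of the service is that everything after submit_transfer (retries, integrity checks, performance tuning) happens without the researcher babysitting the copy.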

Read Full Post »

The median deviation of simulated 2012 county-level yields from linear trend, as a percentage of county-specific trend yields for 1979 to 2011. Image courtesy Joshua Elliott.

[This article ran originally at International Science Grid This Week. Reprinted with permission.]

In 2012, the United States suffered its worst agricultural drought in 24 years. Farmland across the country experienced a devastating combination of high temperatures and low precipitation, leading to the worst harvest yields in nearly two decades. At its peak, nearly two-thirds of the country experienced drought conditions, according to the US Drought Monitor. Worse still, rather than an anomalous year of bad weather, 2012 may have provided an alarming preview of how climate change will impact the future of agriculture.

These warning signs make 2012 an ideal year for validating crop yield and climate impact models that simulate the effects of climate on agriculture. Current climate change models predict that global temperatures could rise several degrees over the next century, making hotter growing seasons the new norm and truly extreme seasons (like 2012) more commonplace.
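
The map above expresses each county’s simulated 2012 yield as a deviation from its own 1979–2011 linear trend. As a rough, single-county illustration of that metric (not the researchers’ actual code), here is a short sketch with made-up numbers:

```python
import numpy as np

# Hypothetical historical yields (e.g. bushels/acre) for a single county, 1979-2011.
years = np.arange(1979, 2012)
rng = np.random.default_rng(0)
yields = 100 + 1.2 * (years - 1979) + rng.normal(0, 5, years.size)

# Fit a linear trend to the historical period and extrapolate it to 2012.
slope, intercept = np.polyfit(years, yields, deg=1)
trend_2012 = slope * 2012 + intercept

# Express a simulated 2012 yield as a percentage deviation from that trend.
simulated_2012 = 0.75 * trend_2012  # placeholder for a crop-model output
deviation_pct = 100.0 * (simulated_2012 - trend_2012) / trend_2012
print(f"2012 deviation from county trend: {deviation_pct:+.1f}%")
```

In the study itself, a calculation along these lines would be repeated for every county, with the map reporting the median deviation; the sketch covers only the single-county arithmetic.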

“A world four degrees warmer than it is now is not a world that we’ve ever seen before,” says Joshua Elliott, a fellow at the Computation Institute and the Center for Robust Decision-Making on Climate and Energy Policy. “Studying years like 2012 in detail can potentially be very useful for helping us understand whether our models can hope to accurately capture the future.”

Read Full Post »

If you received a surprisingly personalized e-mail or Facebook message from the Obama 2012 campaign, it was likely the product of the campaign’s groundbreaking analytics tools. As chief scientist of that acclaimed team, Rayid Ghani helped bring the computational techniques of data mining, machine learning and network analysis to the political world, helping the re-election campaign raise funds and get out the vote in powerful new ways. Now that Barack Obama is back in the White House, we are pleased to announce that Ghani is joining the University of Chicago and the Computation Institute. Here, he will shift his attention and expertise to even bigger goals: using data and computation to address complex social problems in education, healthcare, public safety, transportation and energy.

Though he only started on April 1st, Ghani already has a full plate, including a position as Chief Data Scientist at the Urban Center for Computation and Data and a role developing a new data-driven curriculum at the Harris School for Public Policy. But Ghani’s most immediate project is The Eric and Wendy Schmidt Data Science for Social Good Fellowship, which hopes to train and seed a new community of scientists interested in applying statistics, data and programming skills to society’s greatest challenges. We spoke to Ghani about his time with the campaign and plans for the future.

Q: So what brought you to the University of Chicago and the Computation Institute?

Ghani: The reason I got involved with the campaign was that I was looking to combine the things that I care about with the things that I’m good at. I was good at machine learning and data mining research and I cared about making a social impact in the world. The campaign was the beginning of that, but not a long-term plan. After the campaign, I was even more enthusiastic – if we could do all that we did in a year and a half, there’s certainly a lot more we can do if there is a more focused effort that can last.

Read Full Post »

When TED Meets CERN

We’re happy to announce that Computation Institute director Ian Foster will be speaking at the first-ever TEDxCERN conference, to be held May 3rd at the particle physics laboratory in Geneva, Switzerland. The theme of the conference is “Multiplying Dimensions,” and Foster will speak in the second session on the topic of “Big Process for Big Data.” Other speakers include geneticist George Church, chemist Lee Cronin and philosopher John Searle. A webcast of the conference (hosted by Nobel Laureate George Smoot) will run on the TEDxCERN website, but the CI will also host a viewing party at the University of Chicago. Stay tuned for details, and enjoy the TEDxCERN animation on the origin of the universe — one of five animations (including one on big data) that will premiere at the event.

Read Full Post »

CGI for Science

An image from a model of how endophilin sculpts membrane vesicles into a network of tubules. (Mijo Simunovic/CMTS)

Computer graphics have greatly expanded the possibilities of cinema. Special effects using CGI (computer-generated imagery) today enable directors to shoot scenes that were once considered impossible or impractical, from interstellar combat to apocalyptic action sequences to fantastical digital characters that realistically interact with human actors.

In science, computer graphics are also creating sights that have never been seen before. But where movie special effects artists are realizing the vision of a screenwriter and director, scientific computer models are inspiring new discoveries by revealing a restless molecular world we cannot yet see with the naked eye.

Using computers to peer into this hidden universe was the theme of CI faculty and senior fellow Gregory Voth’s Chicago Council on Science and Technology talk last week, titled Molecular Modeling: A Window to the Biochemical World. Scientists at Voth’s Center for Multiscale Theory and Simulation use computers to recreate real-world physics and produce awe-inspiring, intricate images, pushing the frontiers of discovery one femtosecond and nanometer at a time.

[Some of those images, including the one above by Mijo Simunovic, were on display as a “Science as Art” gallery, which you can view in a slideshow here.]

“The computer simulation allows us to make a movie, if you will, but it’s a movie describing what the laws of physics tell us,” Voth said. “It’s not a movie where we tell the computer we want this figure to run and shoot this figure. We don’t know what’s going to happen. We know the equations, we feed them in [to a supercomputer], and we solve those equations…and we can reach scales we never dreamed of reaching before.”
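
What Voth is describing is, at its core, numerical integration of the equations of motion. Production molecular dynamics codes are vastly more sophisticated than this, but the heart of the method looks something like the toy velocity Verlet loop below, which advances a single particle on an invented harmonic “bond” one small timestep at a time (all parameters here are made up for illustration):

```python
# Toy molecular dynamics: one particle on a harmonic "bond".
# A pedagogical sketch, not a real force field or production code.
k = 1.0          # spring constant (arbitrary units)
m = 1.0          # particle mass (arbitrary units)
dt = 1e-3        # timestep, standing in for the ~1 femtosecond steps of real MD
x, v = 1.0, 0.0  # initial position and velocity

def force(x):
    """Harmonic restoring force, a stand-in for real interatomic forces."""
    return -k * x

# Velocity Verlet: the standard integrator for Newton's equations in MD.
f = force(x)
for step in range(10_000):
    x += v * dt + 0.5 * (f / m) * dt**2      # advance position
    f_new = force(x)
    v += 0.5 * (f + f_new) / m * dt          # advance velocity with averaged force
    f = f_new

print(f"Position after {step + 1} steps: {x:.4f}")
```

Feeding in realistic forces for millions of atoms and repeating that step billions of times is what demands a supercomputer.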

Read Full Post »
