Finding a better way to fight cancer doesn’t always mean discovering a new drug or surgical technique. Sometimes just defining the disease in greater detail can make a big difference. A more specific diagnosis may allow a physician to better tailor a patient’s treatment, using available therapies proven to work better on a specific subtype of disease or avoiding unnecessary complications for less aggressive cases.

“Finding better ways to stratify kids when they present and decide who needs more therapy and who needs less therapy is one of the ways in which we’ve gotten much better at treating pediatric cancer,” said Samuel Volchenboum, Computation Institute Fellow, Assistant Professor of Pediatrics at Comer Children’s Hospital and Director of the UChicago Center for Research Informatics. “For example, kids can be put in one of several different groups for leukemia, and each group has its own treatment course.”

Classically, patients have been sorted into risk or treatment groups based on demographic factors such as age or gender, and relatively simple results from laboratory tests or biopsies. Because cancer is a genetic disease, physicians hope that genetic factors will point the way to even more precise classifications. Yet despite this promise, many of the “genetic signatures” found to correlate with different subtypes of cancer are too complex – involving dozens or hundreds of genes – for clinical use and difficult to validate across patient populations.
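A toy sketch of the classical kind of stratification described above, using simple age and white-blood-cell-count cutoffs of the sort long used for pediatric leukemia. The thresholds here are illustrative only, not clinical guidance, and the function is a hypothetical example rather than any actual protocol:

```python
# Toy two-factor risk stratification of the kind described above.
# Thresholds are illustrative only -- not clinical guidance.

def risk_group(age_years, wbc_per_uL):
    """Assign a treatment-intensity group from two simple factors:
    age at presentation and white-blood-cell count."""
    if 1 <= age_years < 10 and wbc_per_uL < 50_000:
        return "standard-risk"
    return "high-risk"

print(risk_group(5, 20_000))   # young child, low count
print(risk_group(12, 80_000))  # older age and high count -> more therapy
```

Genetic signatures promise finer-grained versions of this same idea, but with dozens or hundreds of input features instead of two.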


As the Data Science for Social Good fellowship enters its final month, many of the projects with nonprofit organizations and government agencies are picking up momentum. At the DSSG website, we’re posting regular updates on the fellows’ progress: how they determined the right problem to solve, what analytic and software tools they’re using to attack those problems, and what they have learned along the way. Some of the articles even offer a glimpse at early results and prototypes developed by the team over the first two months. Here’s a sampling of those progress reports.

Cook County Land Bank: The Problem

The Cook County Land Bank Authority was established earlier this year as a new government agency charged with acquiring and redeveloping vacant and abandoned properties. DSSG fellows are working with the Institute for Housing Studies at DePaul University to develop a tool — a sort of “Trulia for abandoned properties” — that will help the agency determine which properties to purchase in order to produce the greatest benefit for the surrounding community.

The Cook County land bank wants to play the midwife, proactively targeting individual properties that have redevelopment potential and could help stabilize local areas.

But there are tens of thousands of boarded up homes and overgrown lots in Cook County, and the land bank’s budget is limited. How will the agency figure out which of these properties to acquire, and what to do with them?

Where can it actually step in and be effective, investing in properties that would not otherwise have been redeveloped, instead of soon-to-be-sold or unsellable ones?



By Kevin Jiang, University of Chicago Medicine

Just 12 molecules of water cause the long post-activation recovery period required by potassium ion channels before they can function again. Using molecular simulations that modeled a potassium channel and its immediate cellular environment, atom for atom, University of Chicago scientists have revealed this new mechanism in the function of a nearly universal biological structure, with implications ranging from fundamental biology to the design of pharmaceuticals. Their findings were published online July 28 in Nature.

“Our research clarifies the nature of this previously mysterious inactivation state. This gives us better understanding of fundamental biology and should improve the rational design of drugs, which often target the inactivated state of channels,” said Benoît Roux, PhD, professor of biochemistry and molecular biology at the University of Chicago and senior fellow at the Computation Institute.

Potassium channels, present in the cells of virtually all living organisms, are core components in bioelectricity generation and cellular communication. Required for functions such as neural firing and muscle contraction, they serve as common targets in pharmaceutical development.



John Lafferty in the Mansueto Library. (Photo by Jason Smith)

Learning a subject well means moving beyond the recitation of facts to a deeper knowledge that can be applied to new problems. Designing computers that can transcend rote calculations to more nuanced understanding has challenged scientists for years. Only in the past decade have researchers’ flexible, evolving algorithms—known as machine learning—matured from theory to everyday practice, underlying search and language-translation websites and the automated trading strategies used by Wall Street firms.

These applications only hint at machine learning’s potential to affect daily life, according to John Lafferty, the Louis Block Professor in Statistics and Computer Science. With his two appointments, Lafferty bridges these disciplines to develop theories and methods that expand the horizon of machine learning to make predictions and extract meaning from data.

“Computer science is becoming more focused on data rather than computation, and modern statistics requires more computational sophistication to work with large data sets,” Lafferty says. “Machine learning draws on and pushes forward both of these disciplines.”




China’s Milky Way 2 supercomputer was recently declared the fastest supercomputer in the world by industry scorekeeper Top500, the latest move in the increasingly international race for high performance computing supremacy. Late last month, CI Senior Fellow Rick Stevens appeared on Science Friday, alongside Top500 editor Horst Simon, to talk about why that competition matters, and what the global push for faster computation will do for medicine, engineering and other sciences.

“These top supercomputers are like time machines,” Stevens said. “They give us access to a capability that won’t be broadly available for five to ten years. So whoever has the time machine is able to do experiments, able to see into the future deeper and more clearly than those that don’t have such machines.”

The same time machine metaphor was also picked up by the University of Chicago’s profile of Mira, our local Top500 competitor, which was bumped down to #5 by Milky Way 2’s top ranking. But there’s no shame in fifth-best, when fifth-best can run 10 quadrillion calculations per second — the equivalent computing power of 58 million iPads. CI Senior Fellow Gregory Voth is quoted on how access to such a world-class resource helps both today’s and tomorrow’s scientists.
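A quick back-of-the-envelope check of that comparison (taking “calculations per second” as floating-point operations and the 58-million-iPad figure at face value):

```python
# Sanity-check the "58 million iPads" comparison.
mira_rate = 10e15   # 10 quadrillion calculations per second (10 petaflops)
n_ipads = 58e6      # 58 million iPads, per the article

per_ipad = mira_rate / n_ipads
# Implies roughly 172 million calculations per second per iPad,
# a plausible figure for a 2013-era tablet.
print(f"Implied per-iPad rate: {per_ipad / 1e6:.0f} million calculations/sec")
```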

“Having access to a computing resource like Mira provides excellent opportunities and experience for educating up-and-coming young scientists as it forces them to think about how to properly utilize such a grand resource very early in their careers,” Voth says. “This gives them a unique perspective on how to solve challenging scientific problems and puts them in an excellent position to utilize computing hardware being imagined now for tomorrow.”


The Data Science for Social Good fellowship has reached the halfway point, and the website is starting to fill up with interesting content about the projects. Some fellows have already produced tools for the community to use, such as Paul Meinshausen’s interactive tree map of the City of Chicago’s Data Portal. Instead of a cold, no-frills list of the datasets available for public download, Meinshausen’s map uses color and shape to guide users quickly to the data they are seeking and to compare dataset sizes at a glance. The visualization was popular enough that programmers in Boston and San Francisco quickly applied his code to their own cities’ data portals, while another programmer built a common map for every city that uses Socrata software to share its data.
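The core layout idea behind such a treemap can be sketched in a few lines. The dataset names and sizes below are hypothetical stand-ins, and production treemaps (likely including Meinshausen’s) typically use a “squarified” algorithm rather than this simple slice-and-dice variant:

```python
# Minimal slice-and-dice treemap layout: each dataset gets a rectangle
# whose area is proportional to its size. Names and row counts here are
# hypothetical stand-ins for a data portal's catalog.

def slice_and_dice(items, x=0.0, y=0.0, w=1.0, h=1.0, vertical=True):
    """Return {name: (x, y, width, height)} rectangles tiling the region."""
    total = sum(size for _, size in items)
    rects = {}
    offset = 0.0
    for name, size in items:
        frac = size / total
        if vertical:  # slice the region into side-by-side vertical strips
            rects[name] = (x + offset * w, y, frac * w, h)
        else:         # slice the region into stacked horizontal strips
            rects[name] = (x, y + offset * h, w, frac * h)
        offset += frac
    return rects

datasets = [("Crimes", 5_000_000), ("311 Calls", 3_000_000),
            ("Business Licenses", 1_000_000), ("Food Inspections", 1_000_000)]
layout = slice_and_dice(datasets)
for name, (rx, ry, rw, rh) in layout.items():
    print(f"{name:18s} area={rw * rh:.2f}")
```

The rectangles tile the unit square exactly, so each area is the dataset’s share of the total — which is what lets a viewer compare sizes at a glance.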


Will the cities of tomorrow be built on a foundation of data and computation? Among the CI-related events at the 2013 University of Chicago Alumni Weekend was a panel discussing the growing role of data-driven urban policy, featuring Urban Center for Computation and Data director Charlie Catlett, Dean of the Harris School of Public Policy Colm O’Muircheartaigh and Lewis-Sebring Distinguished Service Professor Stephen W. Raudenbush.

In his remarks, Catlett talks about the current window of opportunity for studying cities, produced by the dramatic expansion from narrow, outdated data snapshots to constantly updated streams of open data available to researchers and the public.

“The new opportunity that we have in Chicago is that the city has taken the lead…in publishing data about the city: business permits, food safety inspections, 311 calls, crimes,” Catlett said. “So for the first time ever, if you’re a social scientist, economist, somebody who studies cities, you can actually get real time data from the city of Chicago and begin to study what’s happening in the city right now, not what was happening over the last 20 years or so. The ultimate goal is to be able to ask the question ‘What should we do now?’ as opposed to looking back and saying ‘What should we have done 10 years ago?'”

In response to a question about the insights to be found in large datasets, Catlett also used a colorful metaphor: “As you get to volumes of data, you start to see patterns that you wouldn’t see if you were closer; in a similar way that crop circles aren’t visible if you’re on the ground, but as you get higher up you start to see them.”

The full video of the panel is available below.


Even the world’s fastest supercomputers need some time to prep themselves to join society. After eight months of construction and nearly a year of early research projects testing out its capabilities, the 10-petaflop IBM Blue Gene/Q system finally made its official public bow this Monday in a dedication ceremony at the suburban Argonne campus. At the event, Illinois Senator Dick Durbin said that the current fifth-fastest supercomputer in the world will allow Argonne and the United States as a whole to continue pushing the boundaries of science and reaping the benefits of research.

“Mira ensures the lab remains a linchpin of scientific research, enabling researchers to tackle extremely complex challenges ranging from improving combustion efficiency in car engines to modeling the progression of deadly diseases in the human body,” Durbin said. “High-performance computing is crucial to U.S. economic growth and competitiveness, saving time, money and energy, boosting our national security and strengthening our economy.  If the United States is to remain a leader in the 21st century, we need to continue investing in the science and innovation that will address our growing energy and environmental demands while building the industries of the future.”

The types of projects that will run on the now fully active Mira demonstrate how the applications of high-performance computing are broader than ever. Beyond more traditional uses in cosmology, physics and climate modeling — such as a simulation of the universe’s expansion — Mira’s 786,000 processors will also be put to work on models of cellular and viral proteins and on testing designs for energy-efficient engineering.

“As supercomputers continue to improve, so do the results. Faster and more sophisticated computers mean better simulations and more accurate predictions,” said CI Senior Fellow Rick Stevens. “Mira will help us tackle increasingly complex problems, achieve faster times to solutions and create more robust models of everything from car engines to the human body.”

For more information about Mira and the dedication ceremony, visit the story from the Argonne Newsroom or watch the video below.