Archive for the ‘Visualization’ Category



China’s Milky Way 2 (Tianhe-2) supercomputer was recently declared the fastest supercomputer in the world by industry scorekeeper Top500, the latest move in the increasingly international race for high-performance computing supremacy. Late last month, CI Senior Fellow Rick Stevens appeared on Science Friday, alongside Top500 editor Horst Simon, to talk about why that competition matters, and what the global push for faster computation will do for medicine, engineering and other sciences.

“These top supercomputers are like time machines,” Stevens said. “They give us access to a capability that won’t be broadly available for five to ten years. So whoever has the time machine is able to do experiments, able to see into the future deeper and more clearly than those that don’t have such machines.”

The same time machine metaphor was also picked up by the University of Chicago’s profile of Mira, our local Top500 competitor, which was bumped down to #5 by Milky Way 2’s top ranking. But there’s no shame in fifth-best, when fifth-best can run 10 quadrillion calculations per second — the equivalent computing power of 58 million iPads. CI Senior Fellow Gregory Voth is quoted on how access to such a world-class resource helps both today’s and tomorrow’s scientists.

“Having access to a computing resource like Mira provides excellent opportunities and experience for educating up-and-coming young scientists as it forces them to think about how to properly utilize such a grand resource very early in their careers,” Voth says. “This gives them a unique perspective on how to solve challenging scientific problems and puts them in an excellent position to utilize computing hardware being imagined now for tomorrow.”
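The “58 million iPads” comparison above is back-of-envelope arithmetic that’s easy to reproduce; here is a quick sketch in Python (the per-device figure it implies is an inference from the two numbers in the post, not a benchmark result):

```python
# Sanity-check the "58 million iPads" comparison for Mira.
mira_flops = 10e15          # ~10 quadrillion calculations per second
ipad_equivalents = 58e6     # the figure quoted in the profile

# What per-device performance does that comparison imply?
flops_per_ipad = mira_flops / ipad_equivalents
print(f"{flops_per_ipad / 1e6:.0f} MFLOPS per iPad")  # ~172 MFLOPS
```

A figure in the low hundreds of megaflops is plausible for a tablet of that era, which is a reasonable consistency check on the comparison.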


The Data Science for Social Good fellowship has reached its halfway point, and the website is starting to fill up with interesting content about the projects. Some fellows have already produced tools for the community to use, such as Paul Meinshausen’s interactive tree map of the City of Chicago’s Data Portal. Instead of a cold, no-frills list of the datasets available for public download, Meinshausen’s map uses color and shape to guide users quickly to the data they are seeking and to make rapid comparisons of dataset sizes. The visualization was popular enough that programmers in Boston and San Francisco quickly applied his code to their own cities’ data portals, while another developer built a common map for every city that uses Socrata software to share its data.
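Meinshausen’s actual implementation isn’t shown here, but the core idea of a treemap — tiling a rectangle with sub-rectangles whose areas are proportional to each dataset’s size — can be sketched in a few lines of Python. The dataset names and row counts below are hypothetical:

```python
# Minimal one-level treemap layout: split a rectangle into vertical strips
# whose widths are proportional to each dataset's size.
def treemap(sizes, x=0.0, y=0.0, w=1.0, h=1.0):
    total = sum(sizes)
    rects = []
    for s in sizes:
        strip = w * s / total          # width proportional to this dataset
        rects.append((x, y, strip, h)) # (x, y, width, height)
        x += strip
    return rects

# Hypothetical portal datasets, sized by number of rows.
sizes = [120_000, 60_000, 20_000]
for name, rect in zip(["crimes", "permits", "potholes"], treemap(sizes)):
    print(name, rect)
```

Real treemap implementations recurse this split for hierarchical data and use “squarified” layouts to keep tiles closer to square, but the proportional-area principle is the same.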


Read Full Post »

While you’re planning for a summer vacation on the beach, we’re planning to host three dozen aspiring data scientists for The Eric and Wendy Schmidt Data Science for Social Good Fellowship. In just a couple of weeks, 550 undergraduate and graduate students from around the world applied for the program, as visualized above. While the lucky 6.5% don’t arrive until early next month, the fellowship’s website launched today with portraits and Twitter/GitHub links for all the fellows, mentors and staff involved in this exciting effort. There’s also a debut post on the DSSG blog by organizers Rayid Ghani, Matt Gee and Juan-Pablo Velez that nicely lays out the grand motivation for organizing this first-of-its-kind program.

By analyzing data from police reports to website clicks to sensor signals, governments are starting to spot problems in real-time and design programs for maximum impact. More nonprofits are measuring whether or not they’re helping people, and experimenting to find interventions that work.

None of this is inevitable, however.

We’re just realizing the potential of using data for social impact. We face hurdles to the widespread adoption of analytics in this space:

  • Most governments and nonprofits simply don’t know what’s possible yet.
  • There are too few data scientists out there – and too many spending their days optimizing ads instead of bettering lives.

To make an impact, we need to show social good organizations the power of data by doing high-impact analytics projects. And we need to expose data scientists to the problems that really matter.

That’s exactly why we’re doing the Eric and Wendy Schmidt Data Science for Social Good summer fellowship at the University of Chicago.

We want to bring three dozen aspiring data scientists to Chicago, and have them work on data science projects with social impact.

Be sure to browse through the fellows and watch the website for frequent updates as the fellowship gets to work this summer. For more on the concept of training data scientists to apply their talents to making the world a better place, read Chicago Magazine’s in-depth interview with Rayid Ghani, posted yesterday.

Read Full Post »

Newer, faster supercomputers have allowed scientists to create detailed models of blood flow that help doctors understand what happens at the molecular level. (Photo from Argonne)


This week, some 25 cities around the world are hosting events, online and offline, as part of Big Data Week, described by its organizers as a “global community and festival of data.” The Chicago portion of the event features several people from the Computation Institute, including two panels on Thursday: “Data Complexity in the Sciences: The Computation Institute” featuring Ian Foster, Charlie Catlett, Rayid Ghani and Bob George, and “Science Session with the Open Cloud Consortium” featuring Robert Grossman and his collaborators. Both events are free, take place in downtown Chicago, and you can register at the links above.

But the CI’s participation in Big Data Week started with two webcast presentations on Tuesday and Wednesday that demonstrated the broad scope of the topic. The biggest data of all is being produced by simulations on the world’s fastest supercomputers, including Argonne’s Mira, the fourth-fastest machine in the world. Mira boasts the ability to perform 10 quadrillion floating-point operations per second, but how do you make sense of the terabytes of data such powerful computation produces on a daily basis?

In his talk “Big Vis,” Joseph Insley of Argonne and the CI explained how he and his team have developed equally impressive visualization technology to keep pace with Mira’s data firehose. Tukey, a 96-node visualization cluster, is Mira’s sidekick, sharing the same software and file systems with its big sibling so it can more easily take in data and transform it into images. Insley demonstrated how visualization was instrumental in two major simulations conducted on Mira: one studying arterial blood flow and aneurysm rupture in the brain, and another on nothing less than the evolution of the entire universe.


Read Full Post »

Belshazzar's Feast (Rembrandt, circa 1635-1638)


They say a picture is worth a thousand words. But if your camera is good enough, the photos it takes could also be worth billions of data points. As digital cameras grew increasingly popular over the last two decades, they also became exponentially more powerful in terms of image resolution. The highest-end cameras today claim 50-gigapixel resolution, meaning they are capable of taking images made up of 50 billion pixels. Many of these incredible cameras are so advanced that they have outpaced the resolution of the displays used to view their images – and the ability of humans to find meaningful information within them.
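The gap between capture and display is easy to quantify with rough arithmetic; a quick sketch, assuming uncompressed 8-bit RGB and a standard 4K display (both assumptions of mine, not figures from the talk):

```python
# Roughly how unwieldy is a 50-gigapixel image?
image_pixels = 50e9              # 50 gigapixels
display_pixels = 3840 * 2160     # one 4K display: ~8.3 megapixels
bytes_per_pixel = 3              # uncompressed 8-bit RGB

print(f"{image_pixels * bytes_per_pixel / 1e9:.0f} GB uncompressed")
print(f"{image_pixels / display_pixels:.0f} full 4K screens to show every pixel")
```

Even before compression or tiling tricks, a single such image is far too large to view at full detail on any one screen, which is exactly the gap the talk addresses.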

Closing this gap was the focus of Amitabh Varshney‘s talk for the Research Computing Center’s Show and Tell: Visualizing the Life of the Mind series in late February. Varshney, a professor of computer science at the University of Maryland-College Park, discussed the visual component of today’s big data challenges and the solutions that scientists are developing to help extract maximum value out of the new wave of ultra-detailed images — a kind of next-level Where’s Waldo? search. The methods he discussed combine some classic psychology about how vision and attention work in humans with advanced computational techniques.

As the centerpiece of the talk, Varshney displayed a 5 gigapixel photo of Mt. Whitney in California. If you knew what to look for, the amount of detail was incredible – Varshney could zoom in thousands of times on a given region of the photograph to show a group of hikers or a bear walking up the side of a mountain. But when you don’t already know what interesting information such a complex image contains, the search can be tedious and frustrating as you zoom in and laboriously check every individual pixel.
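The post doesn’t detail Varshney’s specific algorithms, but one classic building block for directing attention in huge images is a center-surround saliency map: a pixel is interesting when it differs sharply from its neighborhood. Here is an illustrative plain-Python sketch on a tiny grayscale grid (the technique choice is mine, not necessarily his):

```python
# Center-surround saliency on a tiny grayscale image: a pixel is salient
# when it differs strongly from the mean of its 3x3 neighborhood.
def saliency(img):
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # mean of the surrounding pixels (clipped at the borders)
            nbrs = [img[a][b]
                    for a in range(max(0, i - 1), min(h, i + 2))
                    for b in range(max(0, j - 1), min(w, j + 2))
                    if (a, b) != (i, j)]
            out[i][j] = abs(img[i][j] - sum(nbrs) / len(nbrs))
    return out

# A dark image with one bright "hiker" pixel in the middle.
img = [[0.0] * 5 for _ in range(5)]
img[2][2] = 1.0
sal = saliency(img)
peak = max((v, i, j) for i, row in enumerate(sal) for j, v in enumerate(row))
print(peak)  # the bright pixel stands out at (2, 2)
```

At gigapixel scale the same idea runs on GPUs over image pyramids, so viewers can be steered to salient regions instead of panning across every tile by hand.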


Read Full Post »

Humans have a visual bias, even hundreds of thousands of years after our pattern-recognition skills evolved from the prehistoric demands of hunting and predator avoidance. In a newspaper or a scientific article, a well-designed graphic or picture can often convey information more quickly and efficiently than raw data or a lengthy chunk of text. And as the era of data science dawns, the interpretive role of visualization is more important than ever. It’s hard to even imagine the size of a petabyte of data, much less the complex analysis necessary to extract knowledge from the flood of information within.

Fortunately, scientists and engineers were studying this need for visualization long before Big Data became a buzzword. The Electronic Visualization Laboratory, housed at the University of Illinois at Chicago, has been active in this field long enough to have done special effects work on the original Star Wars. EVL researchers have pioneered methods in computer animation, virtual reality and touchscreen displays, and adapted those technologies for use by scientists in academia and industry. But in EVL director Jason Leigh‘s talk at the University of Chicago Medical Center on January 29th, the killer app he focused on most was almost as old as those hunter-gatherer ancestors: collaboration.


Read Full Post »