University of Minnesota
School of Physics & Astronomy


Galaxy Zoo

Kyle Willett
Richard Anderson

Kyle Willett is a postdoctoral researcher working on the Galaxy Zoo experiment with Professor Lucy Fortson. Galaxy Zoo is a citizen science project in which hundreds of thousands of online volunteers help scientists sift through research data. The idea behind Galaxy Zoo and other citizen science initiatives is that there are certain types of identification tasks that are very difficult to program a computer to do, but which people, even members of the general public with a small amount of introduction, can do readily.

Willett says that to measure detailed galactic structure, like the number of arms on a spiral galaxy, is difficult even for a sophisticated program. “The human brain is actually really good at this and it does not require a detailed level of training.” Without the help of volunteers it would take the astrophysicists in the Galaxy Zoo experiment decades to sift through the vast quantity of data they have from instruments such as the Hubble Space Telescope and the Sloan Digital Sky Survey.

Willett is using Galaxy Zoo to study how galaxies form and evolve and to understand the physics that governs their formation. “We can now look at a wide range of populations, not just those that look like the Milky Way, that form stars at different rates.” Galaxy Zoo has more than one million images that can be used to measure things like the brightness and size of galaxies. One of the main drivers of the physics governing a galaxy is its morphology, or shape. Does it make a difference in star production if a galaxy is elliptical instead of a disc with spiral arms (like the Milky Way)? The Galaxy Zoo project started at Oxford, and from there has grown into more than thirty web-based citizen science projects. The original project was followed by a second phase, Galaxy Zoo 2, which looked at those same galaxies but asked more detailed questions, like “How many arms does the galaxy have? How round is it? How large is the bulge in the center?”

One of the major projects Willett has done at Minnesota was to process and catalog the galaxies in Galaxy Zoo 2, taking 60 million classifications from users and reducing them to a usable format. Each galaxy is classified by 40-50 people, and it was necessary to find the correct way to add those classifications together and to control for incorrect ones, such as those from users who consistently disagree with the majority. Galaxies that are farther away are harder to classify, so Willett also has to correct for that. The data reduction amounts to a lot of time spent in front of the computer with huge data tables containing millions of rows. The simplest correction is for the distance effect: Willett looks at properties of nearby galaxies, such as their relative populations, and extrapolates to those far out in the Universe. Local galaxies provide the baseline for the correction.
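
As a rough illustration of how many classifications of one galaxy might be combined while down-weighting users who consistently disagree with the majority, here is a minimal sketch in Python. It is not the Galaxy Zoo 2 pipeline: the function name, the agreement-based weighting rule, the `min_weight` floor, and the toy votes are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def weighted_vote_fractions(classifications, min_weight=0.1):
    """Combine (user, galaxy, answer) votes into weighted vote fractions.

    Hypothetical scheme: a user's weight is the fraction of their votes
    that agree with the unweighted majority, floored at min_weight.
    """
    # Pass 1: unweighted majority answer for each galaxy.
    votes = defaultdict(Counter)
    for user, galaxy, answer in classifications:
        votes[galaxy][answer] += 1
    majority = {g: c.most_common(1)[0][0] for g, c in votes.items()}

    # Pass 2: measure how often each user agrees with the majority.
    agree, total = Counter(), Counter()
    for user, galaxy, answer in classifications:
        total[user] += 1
        if answer == majority[galaxy]:
            agree[user] += 1
    weights = {u: max(agree[u] / total[u], min_weight) for u in total}

    # Pass 3: re-tally every galaxy using the user weights.
    tallies = defaultdict(lambda: defaultdict(float))
    for user, galaxy, answer in classifications:
        tallies[galaxy][answer] += weights[user]
    fractions = {}
    for galaxy, answers in tallies.items():
        norm = sum(answers.values())
        fractions[galaxy] = {a: w / norm for a, w in answers.items()}
    return fractions, weights


# Toy votes (made up): user "d" disagrees with the majority every time.
toy_votes = [
    ("a", "g1", "spiral"), ("b", "g1", "spiral"),
    ("c", "g1", "spiral"), ("d", "g1", "smooth"),
    ("a", "g2", "smooth"), ("b", "g2", "smooth"),
    ("c", "g2", "smooth"), ("d", "g2", "spiral"),
]
fractions, weights = weighted_vote_fractions(toy_votes)
```

In this toy case the dissenting user ends up with the floor weight, so the "spiral" fraction for the first galaxy rises above the unweighted 3 out of 4. A real catalog would also need the distance correction described above, which this sketch does not attempt.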

Willett is working with PhD student Melanie Galloway on galaxies that have a bar shape, studying whether this bar has an effect on the galactic nucleus. Astrophysicists think that most galaxies have massive black holes at the center. While most (perhaps all) galaxies have this black hole, only a small fraction are in an “active” state, meaning that the nucleus emits radiation by the heating of gas and dust. “How do we get gas from the outer parts of the galaxy to the center to ‘feed’ the black hole? Maybe the bar is the structure that is feeding the black hole,” Willett says. If so, there should be more barred galaxies with active black holes. Using Galaxy Zoo 2, they now have more than 10,000 examples that can be studied. When they looked at galaxies of a particular color and mass, they found that bars are not the only mechanism feeding the nucleus, but they likely play an important part. The large sample provided by Galaxy Zoo is the only way to accurately make this measurement.
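
The comparison behind that test can be sketched in a few lines of Python: group galaxies of similar color and mass, then compare the fraction with bars among active and inactive nuclei. This is only an illustration, not the analysis Willett and Galloway ran; the record fields (`bin`, `active`, `barred`), the function name, and the toy records are all made up for the example.

```python
from collections import defaultdict


def bar_fraction_by_bin(galaxies):
    """Return {(bin, active): fraction of galaxies in that group with a bar}."""
    counts = defaultdict(lambda: [0, 0])  # (bin, active) -> [barred, total]
    for g in galaxies:
        key = (g["bin"], g["active"])
        counts[key][0] += int(g["barred"])
        counts[key][1] += 1
    return {key: barred / total for key, (barred, total) in counts.items()}


# Toy records (made up): one color-and-mass bin, four galaxies per group.
toy_galaxies = [
    {"bin": "red-massive", "active": True, "barred": True},
    {"bin": "red-massive", "active": True, "barred": True},
    {"bin": "red-massive", "active": True, "barred": False},
    {"bin": "red-massive", "active": True, "barred": True},
    {"bin": "red-massive", "active": False, "barred": True},
    {"bin": "red-massive", "active": False, "barred": False},
    {"bin": "red-massive", "active": False, "barred": False},
    {"bin": "red-massive", "active": False, "barred": False},
]
bar_fractions = bar_fraction_by_bin(toy_galaxies)
```

A higher bar fraction in the active group than in the inactive group of the same bin would be consistent with bars helping to feed the nucleus. With samples this small the difference would mean nothing, which is why a sample of more than 10,000 galaxies matters.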

Willett says a next step for Galaxy Zoo is to supplement the classifications made by volunteers with automated measurements. This is important because it will not be long before the amount of data retrieved in telescope observations far outstrips the output of a few hundred thousand online volunteers. The data generated by the volunteers will still be essential, though, and used to help accurately train the computer algorithms trying to perform these automated classifications.