Extra Grind The Blog of Gregory Hubacek
Sun, 2011-02-20 19:08

Recently, I had the opportunity to work with GOOD magazine on a few infographics. It was an awesome assignment, and something I've wanted to do for quite some time. The Transparency series by GOOD is an awesome source of professionally wrought infographics that I've always admired. The first three assignments, dealing with Obesity, Drunk Driving, and Media Coverage, went mainly without a hitch.

In January, GOOD and I started working on our fourth infographic, which was to focus on data from the American Community Survey (US Census). In all transparency, it was basically up to me to determine how to represent what data. Ultimately I decided to focus on three elements - High School Graduation rates, Bachelor's Degree rates, and Median Household Income. The idea was to show a correlation - if any - between the various levels of education and financial compensation.

One of the first things I noticed about the data was that it was broken out by county, which is an incredibly specific geographic measurement. Immediately I made the decision to attempt to represent that wide range of data in the final product. While this has its complications, I believed that it was important to maintain the nature of the census. The one thing going for me was that all the ranges were already broken out into a map mode on the census website (here's a link to the Median Income map, for instance). By downloading these maps and chopping them apart in Illustrator, I began changing the color values of certain maps and overlaying them to see the relationships.

See the fullsize final graphic here

After completing the graphic, I took a look at it and immediately discounted the entire thing as too difficult to read. However, while attempting to prove this to myself, I found that I was able to pull out fascinating correlations, thus talking myself out of my original conclusion. There are a few keys to understanding how exactly this map works.

Value Ranges

Generally, the darker an area, the higher it scores on all three values. The lighter an area, the lower it scores on all three. Pretty simple back and forth.

Color Shifts

An area that exhibits a color shift toward one color or another would imply a bias in the data toward that value. For instance, a pink area would indicate higher levels of high school graduates with fairly few college graduates and low incomes. A blue county would have a high income level despite low graduation rates.

Color Blends

An area with a color blend would show a relationship between the two higher values. For instance, a green county would have high levels of college graduates and high income rates while suffering from low high school grad rates. An orange or red county would show high levels of education and low compensation. Purple counties lack college graduates while having high incomes and high school graduates.
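To be clear, the map itself was built by overlaying tinted layers in Illustrator, not with code. Still, the blending logic above can be sketched in a few lines of Python. This is only a rough model under a big assumption: that the three layers behave like cyan, magenta, and yellow inks, with income as cyan, high school as magenta, and college as yellow. That assignment is inferred from the color descriptions above, and the function name `blend` and the 0-to-1 inputs are made up for illustration.

```python
def blend(high_school, college, income):
    """Mix three 0..1 values like inks and return an (R, G, B) tuple, 0..255.

    Assumed ink assignment (inferred, not taken from the original artwork):
    income -> cyan, high school -> magenta, college -> yellow.
    """
    cyan, magenta, yellow = income, high_school, college
    # Subtractive mixing: each ink removes its complementary light channel.
    red = round(255 * (1 - cyan))
    green = round(255 * (1 - magenta))
    blue = round(255 * (1 - yellow))
    return (red, green, blue)


print(blend(1.0, 0.0, 0.0))  # high school only -> (255, 0, 255), a pink/magenta
print(blend(0.0, 1.0, 1.0))  # college + income -> (0, 255, 0), green
print(blend(1.0, 1.0, 0.0))  # both education values, low income -> (255, 0, 0), red
print(blend(1.0, 1.0, 1.0))  # everything high -> (0, 0, 0), the darkest value
```

Under this model, a county that is low on everything comes out white and a county that is high on everything comes out black, which matches the light-to-dark rule under Value Ranges.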

Current Population Only

I went to high school in Cedar Rapids, Iowa, attended college in Minneapolis, but currently live and work in Philadelphia. In this map, all my data would be attributed to Philadelphia. This is a result of the data being taken from the census report. So what this map is showing is the values of people currently living in that area.

High School vs. College

Another questionable relationship could be interpreted between the high school and college graduation rates. Notice that the range for high school runs from 60% at its lowest value to 92% at its highest. The college graduation range runs from 10% to 45%.

The fact that the college and high school rates are adjusted for their highest and lowest values neutralizes the population loss between a diploma and a degree. It puts all values on a curve.
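Here's a minimal sketch of what "on a curve" means in practice, using the two ranges quoted above. The five-step binning comes from the five ranges per data set used on the map; the clamping of out-of-range values and the function names are my own assumptions for illustration, not something taken from the census maps.

```python
# Low and high ends of each range, as quoted in the post.
RANGES = {
    "high_school": (60.0, 92.0),
    "college": (10.0, 45.0),
}


def normalize(value, low, high):
    """Rescale a percentage to 0..1 within [low, high], clamping outliers."""
    scaled = (value - low) / (high - low)
    return max(0.0, min(1.0, scaled))


def to_bin(value, low, high, bins=5):
    """Map a percentage to a bin from 0 (lightest) to bins - 1 (darkest)."""
    return min(bins - 1, int(normalize(value, low, high) * bins))


# A 76% high school rate and a 27.5% college rate both sit at the midpoint
# of their own ranges, so they land on the same step.
print(normalize(76.0, *RANGES["high_school"]))  # 0.5
print(normalize(27.5, *RANGES["college"]))      # 0.5
print(to_bin(76.0, *RANGES["high_school"]))     # 2
print(to_bin(27.5, *RANGES["college"]))         # 2
```

The point is that the two rates are never compared to each other directly. Each one is only compared to its own low and high ends.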

It's a lot to keep in mind. But, given this, allow me to highlight some areas that I thought showed fascinating results:

In general, there are darker ranges in areas with large cities: Seattle, Portland, Oakland, San Francisco, Denver, Minneapolis, etc. However, if you look closely at the Eastern United States, where cities are a bit older and counties a bit smaller, you'll find a weird effect around some cities.

Notice the donut-like appearance of cities like Richmond, Baltimore, Philadelphia, etc. These areas are showing the difference between city and suburbs. In these rare areas where cities and suburbs are carved into different counties, we can see the variation that's usually just averaged across a single county. Aside from these anomalies, very rarely are there sweeping changes between neighboring counties. That is, unless you're Colorado.

I mean, just look at that. The variation in color is almost as great as the variation between the flat plains of Eastern CO and the Rocky Mountains (or between the political leanings of Boulder and Colorado Springs). Colorado, you so crazy.

California is generally well-paid. Aside from far Northern California, look at the swaths of green, blue, and purple - all shades that represent high income levels. Go west, young man.

One of the main criticisms leveled at the map has been the inclusion of both high school and college grad rates - as they essentially measure the same thing. NY is a great place to show how the values can speak to something more. Notice in the numbers an 84% high school grad rate along with 57% receiving a degree. While that would seem normal, 84% is actually only slightly better than average when it comes to high school, while 57% of the population with a college degree is actually one of the highest rates anywhere. Keep in mind the curve-adjusted relationship between college and high school, along with the current-residents-only rule. NY, specifically Manhattan, displays a high level of both income and college graduates. One could guess this is due to the high number of transplants along with the highest property values in the country. What this begins to reveal is a massive class difference, with the overall high incomes averaging the county out of the stratosphere while local high school graduation rates are suffering.

Overall, I'm really proud of this graphic. It shows the information in a way that doesn't attempt to over-simplify the situation, or speak down to the audience. One thing that always bothered me about a lot of infographics is that they are essentially just an illustration of the data. This graphic uses the data to make aesthetic decisions, allowing the colors to be dictated by the values.

So what did others think about it? Well, surprisingly there was a bit of attention around it. Fast Company ran an article on it calling the infographic "actually quite unintuitive, but once you get the hang of it, it's amazing", adding that it was "grim to see just how much gray there is." Visualizing Data called it "a design which demonstrates an innovative approach to representing three variables of data overlayed onto a geographical landscape." Meanwhile, Flowing Data asked whether the graphic was too confusing, adding, "My first reaction was, gosh, that's a lot of colors for my brain to process. But is there useful information to glean from map? What's going on in the light pink and grayish areas in the southeast? What's with all the green on the California coast?"

However, it was far from a glowing response.

"Ug. I'm sorry. My brain can't translate "Color" into a mathmatical strength. It's kind of hard." - Michael Gray

"Ugly and poorly presented. Sorry Greg." - Mike

"This graph makes no sense! YOu guys are ranking in useless info right up there with the half baked research that eHow does! Shame it's not better then a D-! " - Robotbound

"This. Is the WORST infographic of the century. " - Guest

"This is too hard to read quickly, which I thought was the point of infographics. A rare failure for GOOD." - Guest

"This is totally incomprehensible. If a chart takes more than a minute or two to comprehend, it has very limited value. A written paragraph describing the relationship would be better." - Another Guest

"I'm not trying to be overly critical or mean spirited when I say this, but I think this is practically unreadable. I'm not able to decipher much of anything from this as it's just way too complex without REALLY studying it, and even then I'm not entirely certain if I'm interpreting what I'm seeing correctly. You're making me work to hard for the payoff. " - John Telford

"Tufte’s head just exploded at the site of that." - Anna O'Brien

Oof. There were lots of positive comments as well, but these are a lot more fun to post. Ultimately, I stand by one of my earlier responses to the graph, saying that "The concept of the map was not necessarily to show exactly what each county had in terms of numbers, but rather to paint a picture showing the ups and downs in relation to the entire US. In short, we sacrificed specific data to show trends across the country. Admittedly, reading this map requires a bit of color theory. A point we deliberated on quite some time before determining that it should be plenty legible considering GOOD's audience. Any time you try to show five ranges of three different data sets across 3000+ counties in one graphic it's going to get a little hairy."

I guess you win some, you lose some, but overall it's really interesting to see the infographic community get so wrapped up in defending something one way or another. There are lots of opinions on it, and I wouldn't do it any other way in hindsight.