Color Psychology

As I usually do in my workshops, I talked to a group in Warsaw, Poland about how we should use color intentionally in our data visualization and that, in fact, the color choice itself can help us tell our story. I prepared a little activity around this issue, in which I asked them to generate a list of the words, phrases, feelings, memories, etc that come to mind when seeing this color:

Go ahead and try it for yourself. Try to come up with 3-4 things in your list.

The participants in my workshop said things like easy, energetic, and too bright.

And, go figure, it’s the color I pulled from a common Polish convenience store logo:

Convenience stores are meant to be exactly those things: easy, energetic, too bright. Makes sense.

And that’s the field of color psychology, which studies how humans have associations with specific colors. But it is tricky because associations with colors are shared, culturally (as in, green means growth to many in the US), and also incredibly personal and unique (but in Lansing, green means Michigan State Spartans). It can be a difficult area to gather data and, as a result, difficult to use color to support your story in your data visualizations, particularly if your audience is global.

I anticipated that there might be some cultural differences regarding this yellow, so just before my workshop, I posted this color to Twitter and asked people to reply with whatever comes to mind for them. People chimed in from around the world. Now, culture is not necessarily geographically-bound, but here’s a map of your associations with this particular shade of yellow.

Obviously, this isn’t enough data to state anything with any kind of certainty. But some interesting things popped up:

Denver folks mentioned a children’s game and characters from a kid’s movie. Must be a fun place to live.

It meant sunshine and warmth in places that don’t really get much of that, like Portland and Boston, and in places that have it to excess, like Tucson.

Minnesota brought up yellow snow. Makes sense.

Nebraska folks mentioned a kid’s TV show and a school bus. When I clicked back on their Twitter accounts, they both worked in education.

DC and London were the only places that just said, straight up, “yellow.” Go figure.

While it was generally associated with happiness all over the globe, some places said the opposite, such as envy in Kuwait, cowardice in southern California, and illness in Kentucky.

What else do you notice in these responses? How do your own associations fit in this map?

I don’t know what to make of all of this, exactly, other than to say that while there are some shared cultural associations (happiness), there are enough associations that counter them (painful) to warrant our being very thoughtful about which colors we select and apply to our reports, slides, dashboards, and graphs. As tricky as it is, anything will be better than the Excel default colors.

Investigate some of the common (to the Western world) associations with different colors and pair these up with the colors in your logo or mandated color scheme so you know what to use, when.

And throughout the writing of this entire post, I had this song stuck in my head.

https://www.youtube.com/watch?v=qGXafeBeZcY

 

Color is an entire chapter in my latest book, Presenting Data Effectively, 2nd edition. I talk at length about how to choose the right colors and apply them to your work so your story is clear.

Easy, Simple One Page Handout

Few things are more tragic than excellent non-profits that do great work to help struggling families but can’t tell their story effectively. In this blog post, I’m going to step you through the redesign of a one page handout I created for my clients at the Education Development Center.

Their Michigan team was trying to talk about the clients they serve, using a page layout that looks like it came right out of the bowels of Microsoft Word.

I started by focusing in on the title. This currently isn’t telling me much. Are we talking about the impacts of service on these families? Are we talking about their demographics? Also, let’s swap out the unfriendly “FY16” for something that is more plain language. The new title is already giving out important details and explaining that the rest of the page will be demographic descriptions of the families.

Also note how I’ve already swapped out boring default Calibri for a stronger font (Segoe UI Black). I took out the italics, which actually typically make text look lighter and deemphasized, and left justified everything since that’s where we start reading in Western cultures.

Next I tackled the gender section. It took me a moment, but once I started really comparing their numbers, I realized that while gender is pretty evenly distributed among children, EDC is serving far far more women than men. That’s a huge story that’s lost in the current format (and the icons definitely aren’t cutting it).

In the new version, I used the precious title space to tell that story directly.

Their next section covered race and ethnicity, though there was no title to say as much.

So I added a title that spoke to the fact that most of the adults and children they serve are Black/African American.

The bottom of the handout showed age ranges, this time with a title “Age” for adults but no title for children.

The new title is more descriptive about the ages of their clients (and you can see I’m planning to do some color coding in my graphs, rather than color coding by page background).

Finally, in the middle of the handout, they were squeezing in data about education, throwing off the layout of both the ethnicity and age graphs. So I reordered the sections and added a new, descriptive title.

At this point, the titles of each section are going a long way in telling clear stories about the families served. They also act as section placeholders, marking off the handout into distinct areas that will hold the graphs. So let’s revisit those graphs!

Icons and large numbers can sometimes work in data visualizations. But now that we know, from the title/our interpretation of the data, that there’s a big gender disparity among children and adults, we need a better way to visualize that comparison.

Stacked bars will work well here. They show part to whole pretty well, especially when there are only 2 categories. And they fit into a small space, which comes in handy because I’m going to massage a lot of data into one page. Notice I started color coding adults as darker blue and children as a brighter blue.

Next I tackled the race bar charts. In the original, the bars were 2 different sizes, both in terms of their y-axis scale and the bar width – these are the kinds of things that will make a reader wonder what’s going on. Does the varying width mean something or is this just a lack of attention to detail (and if we lack attention to detail here, where else is it occurring in this organization)?

In the new bar charts, I ordered the race categories greatest to least, carried over my color coding, and made the bars much thicker, for a stronger visual impact. They are the same width and on the same scale. I also swapped columns for bars so that my race category labels could stay horizontal and on one line instead of wrapping onto 5 lines like the original.

For the ethnicity graphs, I switched out the thin doughnut charts for another set of stacked bars, which again take up MUCH less space on the handout and better let us compare children to adults.

Since age is age, no matter whether it is the ages of adults or children, I put the two age datasets in the same graph type. It didn’t make sense to me to have one that was a bar chart and another that was a pie. Check out how I used my color coding to highlight the sections of the age graphs that point out the primary clientele.

And because these age data are so closely related, I put them on the same scale.

Lastly, I added a stacked bar to the education data that previously had no graph at all. I kept their spirit of icons but relied on the new built-in options from PowerPoint. Again, color coding helps make it clear that this data is referring to adults.

AND I got rid of the page border, which was eating up some of my space.

Things I didn’t do: Leave room for their contact information and logo, though we certainly could. I definitely didn’t worry about the fact that this is just a bunch of bars and people might find it boring. I also didn’t communicate the fiscal year.

Creating the new version probably took just as much time as the original. This isn’t an issue of time. Or fonts (I used defaults). Or software (I used PowerPoint). It’s a matter of some education around presenting data effectively. That’s what I do in my workshops and my books. My clients have already incorporated this redesign into their reporting for this page and others and now they are leading with the story in their data.

Interactive Heat Maps

guest post by Jenny Lyons

One of the most common ways to analyze qualitative data is through thematic coding. Thematic coding allows us to move beyond the visualization of word or phrase frequency, like in a word cloud, and start to examine attributes and stories that emerge from the data. The New York Times has some of the best interactive qualitative visuals out there. Check out this example. It’s clickable and lets you take a deeper dive into the candidates you are most interested in. These are awesome and beautiful but the reality is that most of us do not have the budget or expertise to build something like this. What if I told you I found a couple of solutions? I am going to show you how to build interactive heat maps in programs I guarantee you have on your computer right now (PowerPoint, Excel, and free Adobe).

Static heat maps are super easy to make. Usually, you have your themes across the top of the table in columns. Then, you have respondent interviews in each row. You then figure out a way to code each interview per theme. You could do something like: “met/partially met/not met” or “Important/kind of important/not important.” In the example I am going to showcase today, I used “love it/ehhh/yuck.”

Let’s say I did some interviews with researchers asking them about ways they visualize qualitative data. I asked about different chart types and coded their responses on whether they love it, feel ehhh, or if it is yuck.
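For anyone who keeps their coding in a script rather than a spreadsheet, the themes-across-the-top, respondents-down-the-side layout can be sketched in a few lines of Python. This is only a rough illustration: the researcher names, chart types, codes, and the `tally_by_theme` and `render_text_heat_map` helpers are all made up for the example and are not the interview data behind this post.

```python
from collections import Counter

# Hypothetical data: themes, respondents, and codes are invented for
# illustration and are not the interview results from this post.
THEMES = ["Heat maps", "Word clouds", "Timelines"]
SYMBOLS = {"love it": "#", "ehhh": "+", "yuck": "."}  # one mark per code

# One row per respondent, one coded value per theme (column).
coded = {
    "Researcher A": {"Heat maps": "love it", "Word clouds": "yuck", "Timelines": "ehhh"},
    "Researcher B": {"Heat maps": "love it", "Word clouds": "ehhh", "Timelines": "ehhh"},
    "Researcher C": {"Heat maps": "ehhh", "Word clouds": "yuck", "Timelines": "love it"},
}


def tally_by_theme(coded, themes):
    """Count how many respondents gave each code to each theme."""
    return {theme: Counter(row[theme] for row in coded.values()) for theme in themes}


def render_text_heat_map(coded, themes, symbols):
    """Return one text line per respondent, with one symbol per theme."""
    return [
        f"{name}: " + " ".join(symbols[row[theme]] for theme in themes)
        for name, row in coded.items()
    ]


tallies = tally_by_theme(coded, THEMES)
for line in render_text_heat_map(coded, THEMES, SYMBOLS):
    print(line)
print("Word clouds coded yuck:", tallies["Word clouds"]["yuck"])
```

The grid itself is just a lookup from respondent and theme to a code, which is why the same table can later be colored in PowerPoint, Excel, or a PDF.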

This static heat map does a good job of taking all that complex qualitative data and simmering it down, but it’s never enough. It boils all my rich, meaningful data into three vague buckets. I find myself wanting more: why do they dislike it?  What makes a certain chart type so great? Of course, the answers to these questions can usually be found in the text-heavy report that accompanies this visual. Right now, I am not feeling very motivated to read the report. We need to do a better job at drawing our audience in by engaging them with the raw, meaningful data that underlies this visual. We need some of the New York Times hipness. I am going to show you three ways to make a heat map interactive, where background quotes behind each color-coded cell pop up. I am going to do this in PowerPoint, Excel, and Adobe Acrobat (the free version). All three carry their own unique advantages and disadvantages. Note: I used the same design in all three mediums. This is where we are headed:

PowerPoint Interactive Heat Map

  1. Design the structure of your heat map, all greyed out.

Do not feel intimidated by this design. In PowerPoint this is made with textboxes and square shapes. It is just a bunch of squares all aligned. When making this, the align tool in PowerPoint will be your best friend. If you are not familiar with that tool, then check this out and your mind will be blown (no more endless nudging).

  2. Once you have your basic structure down, we need to name our legend objects. To do this, start by clicking on your “Love it” square. Go to the Format tab and under the Arrange section, click on Selection pane. In the window that pops up, the rectangle shape selected will be highlighted saying “Rectangle #”.

Double click on the name “Rectangle 69” and rename it “Theme 1” or “Love it.” You can name this whatever you want, but you just need to remember the name for a later step. Repeat this step for all legend squares.

  3. You want to make sure all those grey boxes are grouped into one object. Select the entire object (all grey squares), copy, and paste it. Align it exactly on top of the other grey squares, so when looking at it, you cannot tell there are squares piled on top of one another. Ungroup the grey ones on the top. First start with the Love it squares. Color code all of them dark blue. Insert a textbox with an active title at the top. Make the background of the title text box white.

Now, select the title and dark blue “Love it” squares and group them.

  4. Next, you will do the same thing for the Ehhh theme. You must insert another textbox title overtop of the Love it title you just made. Make sure the background of the Ehhh title text box is white so that it completely covers the Love it title. Group this text box title and the medium blue squares together.

  5. Do the same thing for the last light blue group. Note: make sure the textboxes for the titles all have a white background and are the exact same size. They should be layered on top of one another.

  6. Now, click on the dark blue Love it squares/title group and bring it to the front. Once it is to the front, while selected, go to the Animations tab. Click “Appear” under Animation. Then under the Advanced animation, click on Trigger, “on click of”, and select “Love it/theme1.”
  7. Then, while this group is still selected, you will select “Add animation” and click on “Disappear.” Then under the advanced animation, click on Trigger, “on click of”, and select “Love it/theme1.”  That tells PowerPoint that when the “love it” legend square is clicked, the Love it heat map group will appear, and when it is clicked again it will disappear.
  8. Repeat steps 6-7 for Ehhh and Yuck. Make sure you link them to the correct trigger button.
  9. Test this out! Go into presentation mode and follow the instructions. Click on the theme legend Love it. It should do this:

Each time you click on a legend, it gets added to the heat map. You can unselect it by clicking on the legend button again.

  10. Now, we need to add the quotes in the background of each square. To do this, we are going to be using the hyperlinks feature in PowerPoint. These steps will need to be repeated for each square inside the heat map:
    Right click on just one square. Go to Link- Insert link. This box pops up:

Make sure “place in this document” is selected under link to on the left. Then select the slide where your heat map lives (probably slide 1). Once that is selected, click on the “ScreenTip…” button in the upper right-hand corner. Type your quote here. (“I love theme maps so much, they help organize all my data and make it easier to comprehend and make meaning.”)
Now, when you are in presentation mode, you hover your mouse over that square and the quote pops up- yay!

  11. There are a couple more things I recommend doing to make this document more user-friendly. As you can tell, the only way to make this interactivity work is if you are in presentation mode. When you are in presentation mode, clicking on the three legend buttons works well, but if you accidentally click on one of the heat map squares, it goes to a black “end slideshow” screen. We don’t want that to happen. We want people to be able to play around in here. To make the screen not go black, go to the Transitions tab at the top. Over on the right under Timing, find Advance Slide, where there is a check by “On Mouse Click.” Unselect that.

There are a couple of other things that make the usability of this file challenging: 1) people can edit this. When they first open the document, it is in the edit window, not presentation mode. And 2) It is annoying that you have to instruct people to put it in presentation mode. Some people (non-techy folks) might get confused. We can avoid these two issues by saving this as a different file type. Only do this step when you are 100% done editing. Go to file and save as. Under the file type dropdown, change it to “PowerPoint 97-2003 Show (*.pps)”. This will save it as a presentation-only document. The moment your client opens the file, it will be in that interactive mode and they will be unable to make edits.

Cool, right!?  I will give you a moment to get over the wave of awesome that I hope you are feeling right now. Let’s talk pros and cons:

Pros:

  • We can make buttons that show/hide parts of the heat map
  • Our titles are interactive depending on the group of the heat map we show
  • Quotes are easy to add as hover text

Cons:

  • The quotes are kind of boring. You cannot edit font, typeface, or color. Also, you cannot add photos next to the quotes.
  • The hip interactive features do not transfer when you PDF the file

The best time to use this method is if you are building a visual 1-2-page document, dashboard, or infographic and you want to include an interactive heat map. Yes, the final product will not PDF, but you can save it in the secret .pps way to make it un-editable and immediately jump to the presentation/interactive mode. This technique can easily be paired with static text, static visuals, or even other interactive visuals using the same technique.

Excel Interactive Heat Map

  1. Design your heat map in Excel. This is the same design as the PowerPoint one except the matrix is not made up of square shapes, it is composed of cells. Edit the height and width of your cells to get them the shape you want. Add a thick white border around the cells for the matrix. Ta da! A nice heat map.

  2. Now, design a quote+picture combo. I designed these in PowerPoint. There should be an image/quote per matrix square. Here is my Love it example:

  3. Back to Excel, you are going to click on the cell where you want to insert a quote. With the cell selected, go to the Review tab. Under Comments, select “New Comment.” A light-yellow box pops up. Click in the comment box. Then, move your cursor to the border of the box until it looks like crossing arrows, then right-click. You should see an option for format comment.
  4. First, navigate to colors and lines. Change the border color to none. Under fill color, choose fill effects. In the new window, click on picture, select picture, and then find the one you saved in your file. Make sure you check the lock aspect ratio at the bottom of the window so the picture doesn’t warp.
  5. Now, with the comment box selected, drag the corners of the comment box to your desired size.
  6. As you can see it probably says, “Your Name:” so let’s get rid of that. Click on the cell where the comment is, go to Review and edit comment. Now just double click in there and delete all the text.
  7. Now, you’re done! Repeat these steps for every cell. When your mouse hovers over that cell, this is what it looks like (when I hover over Heat maps,a):

This Excel interactive heat map is fun and versatile!

Pros:

  • Quotes are easy to add as hover text and they are visually engaging. We can add color, font, and typeface to match our branding and heat map color codes. We can also add photos.

Cons:

  • The comments do not transfer to PDF.
  • There are ugly red triangles in the upper right-hand corner of every cell with a comment. If it bothers you a lot (like it does me) you could manually insert little colored squares overtop of them in the upper right-hand corner of the cell.

The best time to use this technique is when you want to share this as a standalone visual. You can lock the Excel sheet so no one can edit and have people explore your data and get to know the themes on a deeper level. This is also an easy addition to a dashboard created in Excel.

PDF Interactive Heat Map

When creating a PDF interactive heat map, start by getting it all designed and 100% ready to go. I made the version in PowerPoint. Then, just save as a PDF.

  1. Open the heat map PDF. Make sure you have all the quotes for each square in the matrix ready to copy/paste into the PDF. At the top of the tool bar, you will see the “add sticky note” button. Click on that.

  2. Click in the middle of the matrix square where you want to place a quote. When the comment box opens, right-click and select Properties. For the icon, make the color white and the opacity 0%.
  3. Under the general tab, for author remove your name and leave it blank. Select ok. Now, paste your quote in the comment box and click post. You can edit the text in Word to be whatever font/size you want and it will keep the font when you paste.
  4. Now, right-click on the comment box again and lock it.
  5. Repeat these steps for all the squares in the matrix. When finished, you should be able to hover over the square and the comment will pop up!
  6. Done! This is a simple, easy way to embed raw data quotes in your finalized PDF deliverable.

The PDF interactive version is the one with the least frills. You can’t add pictures or interactive buttons, but since most of our deliverables are sent to our clients in PDF versions, this technique is extremely applicable.

Interactive heat maps both give us a visual synopsis of qualitative themes and allow us to highlight our raw data without bogging down the visual. We get more depth and interactivity. Try these formats out and let us know how it goes.

We have video instructions on this process in our Academy and Graph Guides programs. Learn it in Tableau or R, too!


Learn in the Academy!

You can find step-by-step instructions on how to make 60+ awesome visuals in my Evergreen Data Visualization Academy.

Video tutorials, worksheets, templates, fun, and a big-hearted super-supportive community. Learn Excel, Tableau, R or all three. Come join us.

Enrollment opens to a limited number of students only twice a year. Our next enrollment window opens April 1. Get on the wait list for access a week earlier than everyone else!

Master Dataviz with Graph Guides!

Our newest program, Graph Guides, is a custom-built, year-long sprint through 50 Academy tutorials.

When you enroll, we’ll assess your current data viz skill set, build you a customized learning path, and hold your hand as you blaze your way to new talents.

We open enrollment to 12 students at a time and only twice a year. Get on the waitlist for early access to our next enrollment window.

The 1-3-25 Reporting Model

It’s time to talk about how a highly visual, well-formatted recommendations page doesn’t have much impact when it is buried on page 104. This is how we make reporting less cumbersome, particularly in a digital reporting era.

Of course your reporting will probably include a slidedeck. I mean, you could totally give a talk with no slides. People would look at you. And you would be awesome. But most of the time we have a slideshow, designed with the principles I’ve been discussing in this blog, all my books, and in every workshop for years. The idea of the presentation is generally to spark conversation and get people interested in learning more about your ideas, Rockstar.

So what next? Now that we have piqued their interest, we – what? – toss them a 300 page report and wish them luck? Sounds like a terrible way to foster engagement.

Instead of giving people a new doorstop, we can extend their engagement with a handout. Something short and sweet, like the stuff I discuss in my new book. In fact, this is where the 1-3-25 reporting model comes in handy. The 1-3-25 model suggests that our reporting include a 1 page handout, a 3 page executive summary, and a 25 page report. In each of these layers, readers gain more and more detail. They can stop anytime, having already gotten the high points from us. But it provides a scaffolding toward learning, in which each step helps the reader learn a bit more without being completely overwhelming.

This example comes from an Australian University research department that requires all researchers to publish by this model and makes templates to fit it. Smart! (See their package at http://rsph.anu.edu.au/research/projects/2015-extension-overcoming-access-and-equity-problems-relating-rural-and-remote-phc)

Of course it’s hard to squeeze your work into just 25 pages, especially when you include graphics and data visualizations. So you’ll need an appendix and this is where you can put things like your logic model, methodology, and p values. And it can even be a separate document that you just link to from the others.

So let’s talk about that 25 pages and what that’s going to look like.

We normally go about structuring our reports (and presentations and posters) like this:

Background
Literature Review
Methodology
Discussion
Findings
Conclusion

It feels serious and logical and rigorous. Does it look familiar? Probably so, because it’s the basic format for a journal article. It’s just that a journal article is not the same dissemination forum as the work many of us do, where our charge is focused on being useful to decision makers and we are paid to provide them actionable information.

Jane Davidson wrote a life-changing article on this matter, where she reorients us toward truly user-oriented reporting, in which we do not make the reader wait until page 104 to get to the good stuff. Today’s readers just don’t have the patience for it. They might flip through to glean highlights but few read, word for word, something so long and tedious. In fact, there’s a hashtag just for this scenario: #TLDR, which means Too Long, Didn’t Read.

The revolution in reporting is simple: Arrange the sections of the report in the reverse order. Report the findings and conclusions first. That’s what people came to learn, so give it to them. If they are satisfied, you are done! If they have questions, you have explanations, because that’s your discussion, methodology, literature review, and background. Reporting in reverse values their time. It means the C-Suite members of your audience can go on with making strategic decisions with your findings and the few statisticians in the room can hang out til the end and talk nerdy with you. Reporting in reverse puts the audience first.

This blog post is an excerpt from my new book, Presenting Data Effectively, 2nd edition. The second edition of this bestseller is in full color and includes much more on reporting for a digital reading culture. Order now

The Wrong Kind of White Space

Long-ish reports are probably never going away entirely, so let’s make them suitable for a digital reading age.

In the olden days, when we printed reports, they often had extra blank pages at the front and back. It probably gave printed materials a sense of refinement or maybe it was used to build anticipation. But in PDFs, this doesn’t work out well. What happens when you open a PDF, start scrolling, and encounter blank pages? You probably think, oh this thing is still loading. So you go check email. And you never come back!

Those folks who choose to print also get annoyed by having to waste paper on blank pages. That’s some seriously unnecessary white space.

My clients in North Dakota had been adding way too much white space to the front of their report (*HAD* been – their new stuff is better!). First there’s a cover page. Ok.
Then a blank page. No.
Then another cover page. No.
Then some copyright information that could be a sidebar or a part of the following page, which is the Table of Contents (Sure.).
Then there’s a page of decoration. No. This could be a cool sidebar elsewhere in the report.
Then a two page introduction that could be condensed to one page.
The real content doesn’t start until page 9, out of a 30 page report.
No one wants to scroll past all of that dead weight to get to the good stuff.

The end matter of the report is equally wasteful. While a Resources page is nice, the References and Acknowledgements and Future Cancer Care pages could all be sidebars elsewhere. The blank page has to go. The contact information on the final page could also become a sidebar. In a digital reading culture, we have to eliminate all this extra white space.

Same rule applies for those blank pages in the middle of a report. They are usually inserted, like in books, so that new chapters start on the right hand side when a report is book bound. But who binds reports anymore?? Even worse is when those blank pages say things like “This page intentionally left blank” which means I can’t even reuse the paper. Annoying your readers does not turn them into fans or advocates.

Let’s start saving time, attention, trees, and sanity.

This excerpt comes from Chapter 5 of my latest book, Presenting Data Effectively, 2nd edition, out now! In the new edition, I discuss all kinds of easy-to-implement ways to talk about data in reports, slides, and graphs. Order now.

Adding Confidence Intervals to a Dot Plot

Evergreen Data Visualization Academy member Dana McGuire recently wrote me to ask about her dot plot. She said, “Would there be a way to show the bar or confidence interval somewhere? I have gotten positive feedback on the look of the graph overall, but it is a scientific conference with recommendations to add error bars.”

As you know, the decision to add things like confidence intervals or standard deviations to a graph should be carefully considered. It’s audience-dependent. And Dana’s right – they are probably needed for a scientific conference. So here’s how to do it.

First, make a dot plot in Excel.

Then, click in the graph so it is active. You’ll see a tab called Chart Tools. In the Design tab, look alllll the way over on the left for a button called Add Chart Element. Choose Error Bars from that list. If you are in an older version of Excel, you won’t see this. Instead, you’ll see a tab between Design and Format that’s called Layout. In that tab, you’ll see the button to add Error Bars.

You’d like to think that’s it, right? So easy. However, since we made the dot plot based on Excel’s scatterplot, which has x and y coordinates, Excel added error bars in both directions. Yuck. Just click on the vertical ones and hit the delete key on your keyboard. We don’t need those.

The leftover horizontal error bars remain. We could stop here. But by default those error bars are black. They contrast a lot against the white chart background, making them leap out, more so than the dots themselves. And in the order of importance here, the error bars aren’t #1. So let’s format them.

In the Format Error Bars box that popped up for you, click the radio button by No Cap – that’s just extra clutter we can eliminate.

It would also be a good idea to lessen the high contrast. Click on the paint bucket icon (or, in older versions of Excel, look for Line Color). In the Line menu, choose a gray and make it very thick, like 15 pt. It’ll look more like a gray band, hanging out in the background, where it belongs. Some of the gray bands are going to overlap, since the values are so close together. I decided to make the gray bands 30% transparent, so I can see where one ends and another begins. Adjust transparency in the same menu as the line color.

Personally, I had trouble clicking on those little buggers. The shortcut trick is to right-click on anything and choose the Format option. In the box that pops up, you can work with the dropdown menu to select and modify anything in the graph.

In Dana’s original graph, the bands were a bit hard to see because the dots were so large. The little smidge sticking out would probably be ok but if you want to see more of the confidence interval, make the dots smaller, like 10pt, and use an x axis.

Indeed, once the x axis is in there, it’s pretty easy to see that we don’t actually have to start the graph at zero. These are dots, after all. They represent the data by their position. So the axis can begin at something closer to the actual data. In the graph below, I adjusted it to 30, making the confidence intervals spread out a bit.

Now Dana has options! Rad!

And how did Dana get these redesign options from me? Well, she is enrolled in the Evergreen Data Visualization Academy, where I hold monthly office hours webinars in which people submit data ahead of time and I walk through their options and solutions. Just one of the many benefits of joining the Academy or Graph Guides program.


Learn in the Academy!

You can find step-by-step instructions on how to make 60+ awesome visuals in my Evergreen Data Visualization Academy.

Video tutorials, worksheets, templates, fun, and a big-hearted super-supportive community. Learn Excel, Tableau, R or all three. Come join us.

Enrollment opens to a limited number of students only twice a year. Our next enrollment window opens April 1. Get on the wait list for access a week earlier than everyone else!

Master Dataviz with Graph Guides!

Our newest program, Graph Guides, is a custom-built, year-long sprint through 50 Academy tutorials.

When you enroll, we’ll assess your current data viz skill set, build you a customized learning path, and hold your hand as you blaze your way to new talents.

We open enrollment to 12 students at a time and only twice a year. Get on the waitlist for early access to our next enrollment window.

How to Show Ranking Data in Excel

Danielle, a member of my Evergreen Data Visualization Academy, submitted a question to our monthly Office Hours webinar about how to show ranking data. She sent this survey question:

There are eight categories below.  Rank each item to continue to the next page.  Rank the following services in order of preference (most preferred item on top), where 1=most preferred to 8= least preferred.

Service A __
Service B __
Service C __
Service D __
Service E __
Service F __
Service G __
Service H __

and she asked:

I use Qualtrics, and right now this survey has collected approximately 2,900 responses.  I will be asked to provide overall results for this question to the stakeholder. How can I show that overall, people assigned a specific rank to a specific service?

Ranking data can be tricky to visualize because it is, but also isn’t, parts of a whole. The data will likely arrive from Qualtrics in a table, where each row sums to 2,900 (assuming a perfect world with no missing data).

Except that Qualtrics will probably sort the data in the order the services were listed on the survey, so I re-sorted them here to run from least to greatest on the #1 ranking.

The solution to knowing what type of visual is best is to think about what your audience will want to see. Those stakeholders will come at you wanting to know which service was ranked highest. So a simple bar chart, ordered greatest to least (that’s why I messed with the table) will be the clearest visual. Note that I’m only graphing the first two columns in the table – just the service names and how often they were ranked #1.

Some stakeholders might want to know a little more – like top 3 ranking – but a bar chart is still probably the simplest solution here. (And once you settle on bar chart, you can always branch out to a dot plot or a lollipop graph).
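If your data arrives as raw responses rather than a finished table, the tally behind these bar charts can be sketched in a few lines of Python. This is a minimal sketch, not what Qualtrics exports: the `responses` list, the three-service example, and the function names `rank_counts` and `top_n_totals` are all made up for illustration.

```python
from collections import Counter

# Hypothetical raw responses: each dict maps a service to the rank
# one respondent gave it (1 = most preferred).
responses = [
    {"Service A": 1, "Service B": 2, "Service C": 3},
    {"Service A": 2, "Service B": 1, "Service C": 3},
    {"Service A": 1, "Service B": 3, "Service C": 2},
]


def rank_counts(responses):
    """Count how often each service received each rank."""
    counts = {}
    for response in responses:
        for service, rank in response.items():
            counts.setdefault(service, Counter())[rank] += 1
    return counts


def top_n_totals(counts, n=1):
    """Total times each service landed in the top n ranks, greatest first."""
    totals = {
        service: sum(ranks[r] for r in range(1, n + 1))
        for service, ranks in counts.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


counts = rank_counts(responses)
print(top_n_totals(counts, n=1))  # how often each service was ranked #1
print(top_n_totals(counts, n=2))  # how often each service made the top 2
```

Setting `n=1` gives the "ranked #1" bars and a larger `n` gives the "top 3" style view, already sorted greatest to least.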

This is about as far as you can take the data, given the one question we see here on a single survey. But if you were to ask this question of the same group of people over time, you’d be able to tell a story about the changes in the highest ranked services and a bar chart would not work any longer.

I made a new table for this ranking data, intended to show the ultimate rank in each year (not the total number of survey responses or really anything from the original table). But if you look carefully, I’m not showing the actual rank – I’m showing the opposite. Service E was the lowest rank in the graph above (let’s say that was 2014 data) and in the table below, Service E is listed as rank #1. This makes sense once we graph the data.

I just highlighted the table and inserted a simple line graph. Excel generates a line graph with a y-axis that runs from 0 at the bottom to, in this case, 9 at the top. So by reversing the ultimate rank in the table, Service E appears at the bottom of the graph in 2014.

Then I modified the y-axis so it started at 1, my “lowest” rank, and stopped at 8, my “highest” rank. Once it was set, I deleted the y-axis scale and labeled both ends of each line with the service name. For more on how to do that, see my post on Directly Labeling in Excel. Essentially, we’re just working from a regular line graph. The real work happens in thinking through how the table should be set up to show up appropriately in the line graph – and in this case, it’s entering the ranks in their reverse.
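The reversal itself is simple arithmetic: with 8 services, the plotted value is 9 minus the actual rank, so rank 8 becomes 1 and rank 1 becomes 8. Here is a minimal Python sketch of that step, assuming you have the final ranks per year in hand. The `reverse_ranks` function and the sample numbers are hypothetical, not Danielle’s data.

```python
def reverse_ranks(ranks_by_year, n_items):
    """Flip ranks so the best rank (1) gets the highest plotted value."""
    return {
        year: {item: n_items + 1 - rank for item, rank in ranks.items()}
        for year, ranks in ranks_by_year.items()
    }


# Hypothetical final ranks for three services (1 = most preferred).
actual = {
    2014: {"Service A": 1, "Service B": 2, "Service C": 3},
    2015: {"Service A": 2, "Service B": 1, "Service C": 3},
}

plotted = reverse_ranks(actual, n_items=3)
print(plotted[2014])  # the top-ranked service now has the largest value
```

The flipped values are what go into the table you graph, so the most preferred service lands at the top of the line chart.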

In our Office Hours call, I talked through these options and demonstrated how to make this graph. Then I sent the file back to Danielle so she had ready-made visuals all set to go as soon as her real data comes in from Qualtrics. That’s the kind of personal coaching you get when you join the Academy.



Guest Post: Posters – They’re Not Just for Conferences Anymore!

SE Note: I almost never make posters but I know they are a hot reporting tool for many of you, so I asked poster veteran Kylie Hutchinson to share her secrets.

Posters are an important, but often overlooked, dissemination tool for visually communicating your results. Traditionally, we think of posters as an academic thing, but they’re not just for conferences anymore! An effectively designed poster can be very “sticky” compared to other forms of reporting. While a fifty-page report is sitting on a shelf somewhere collecting dust, a poster can hang around an organization’s lunch room or hallway for a long time, continuing to engage stakeholders and disseminate your key messages.

The first time I participated in a conference poster session was a disaster. I had completed an evaluation for an HIV organization and someone on staff wanted to present the results at a health care conference. They found a graphic designer to prepare a poster and when I saw it I thought, “Wow, that’s pretty”. I decided to submit it myself to the American Evaluation Association conference that same year. At the appointed time, I proudly tacked my poster to the wall and proceeded to stand there for the next two hours like an awkward wallflower at a teen dance. Nobody stopped to look at it, and no one talked to me. I felt like a real loser.

Kylie’s first disastrous experience with a conference poster. How would you improve upon this example? Add your ideas to the comments below.

It was a disappointing experience, but the following year I was (ironically) asked to be a poster judge for the same conference. I quickly learned what I wished I’d known the year before. Here are a few quick tips I’ve summarized from other sources on how to rock your poster presentation and reach the widest audience possible.

Design

  1. Organize your content into sections with clear headings that help orient the reader quickly. Consider using questions as headers instead of the traditional Introduction, Methods, Results, etc.
  2. Readers tend to read from top to bottom, so use a column format to vertically structure the flow of your sections. After the title (top and centered), the most important area for reader attention is the top left corner.
  3. Keep the design neat and clutter free. Avoid lengthy text paragraphs and aim for 40% white space. Avoid placing borders around text boxes and images which can interrupt the flow.
  4. Opt for dark colored letters on a neutral or light background for better readability. Avoid bright colored backgrounds.
  5. Limit your use of colors to two or three. If possible, choose colors that are related to your subject area.
  6. Use a font size that is large enough to read from 5-6 feet away. Use a sans serif font for headings and a serif font for body text.
  7. Capitalize Each Word of your title instead of ALL CAPS for better readability.
  8. Do a mock up by printing out each section separately, then lay them out on the floor or kitchen table. Move them around to play with different formats.

Images

  1. Use charts, illustrations, and images to break up large sections of text.
  2. Pick relevant and meaningful images that will help to quickly communicate your subject matter. Your own photos are preferable to stock photos, provided they are high quality.
  3. Crop images to include only the most important content.
  4. Choose simple and bold illustrations rather than finely detailed ones. And because you’re reading Stephanie’s blog, I don’t need to tell you how to format your charts!

Content

  1. Tell a story about your work and why it matters as succinctly as possible: what you did, what you learned, and what you recommend going forward. The poster should be self-explanatory to someone reading it on their own.
  2. Turn off your computer and think of three to five key messages that you wish to convey. Use these as the basis for your content.
  3. Ensure both the question you addressed and your conclusions are stated clearly.
  4. Avoid excessive details on the methods you used, unless that is the focus of your topic.
  5. Prepare a more detailed one or two-page summary handout as a take-away. Include your contact information.

Here’s a better example of a poster that’s also a tip sheet.

Strong, D. R. (2005). Designing communications for a poster fair. Pennsylvania State University. Retrieved from: http://www.personal.psu.edu/drs18/postershow/. Used with permission.

If you follow these tips you’ll be well on your way to being a poster rock star. But don’t forget to get it professionally printed! You don’t want to embarrass yourself by pasting together a series of 8 x 11 sheets on Bristol board. There are reliable companies that will print your poster online and ship it directly to your conference location so you don’t have to travel with that awkward tube.

If you’re curious to learn more about effective poster presentations, check out some of the resources below.

Kylie Hutchinson is principal consultant with Community Solutions Planning & Evaluation and the author of Survive and Thrive: Three Steps to Securing Your Program’s Sustainability. She is currently writing two books: one on effective evaluation reporting and another on learning from evaluation failures.

Sources:

Strong, D. R. (2005). Designing communications for a poster fair. Pennsylvania State University. Retrieved from: http://www.personal.psu.edu/drs18/postershow/.

Evergreen, S. (n.d.). Potent presentations initiative: Guidelines for posters. American Evaluation Association. Retrieved from: http://www.eval.org/page/p2i-tools.

Hess, G., & Liegel, L. (2008). Creating effective poster presentations. Retrieved from: https://projects.ncsu.edu/project/posters/documents/QuickReferenceV3.pdf.

Where to Start and End Your Y-Axis Scale

The Y-Axis Debate is one of the most hotly discussed among cool data nerds, like me and my friends. Going out for drinks with me is either a blast or a bore, depending on your nerd level. So let me clarify the parameters of the debate, including where nerds mainly agree, where they don’t, and what I advise. This post is an update to one I wrote a long time ago and my thinking has evolved since then.

Data viz nerds agree bar charts must start at zero.

The general idea is that a viewer should be able to use a ruler to measure the pieces of your visualization and find that the measurements are proportionate to the data they represent.

In the case of bar charts, this means that the y-axis must always start at zero.

The bars in a bar chart encode the data by their length, so if we truncate the length by starting the axis at something other than zero, we distort the visual in a bad way. My friend Chris Lysy calls this the “Cable News Axis” because it’s so common in TV news programming.
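To put a number on that distortion, compare the ratio of the drawn bar lengths with the ratio of the values themselves. A minimal sketch with made-up numbers (the `apparent_ratio` function and the values 55, 50, and 45 are only for illustration):

```python
def apparent_ratio(a, b, axis_start=0):
    """Ratio of the drawn bar lengths for values a and b."""
    return (a - axis_start) / (b - axis_start)


# Hypothetical values: 55 vs. 50.
honest = apparent_ratio(55, 50)                     # axis starts at zero
truncated = apparent_ratio(55, 50, axis_start=45)   # axis starts at 45

print(f"Zero-based axis: first bar is {honest:.1f}x the second")
print(f"Truncated axis:  first bar is {truncated:.1f}x the second")
```

With the axis starting at 45, a 10% difference is drawn as a bar twice as long, which is exactly what the ruler test would catch.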

Data nerds don’t always agree that other graph types should have to start at zero.

Outside of bar charts, whether the y-axis must start at zero is still a matter of debate. There are cases where it wouldn’t make any sense.

If zero is not in the realm of possible data points, perhaps it doesn’t need to be included in the y-axis

A visualization of stock market activity is a great example. If the y-axis started at zero, the visual would look like a flat line. We wouldn’t see any variation, and the visual would become meaningless for us. But (fingers crossed) zero isn’t a possible data point in a data set for the stock market, so there’s no real justification for starting the axis at zero.

Other than for bar charts, I advocate for a y-axis that is based on something reasonable for your data. Maybe the minimum of the axis is your historically lowest point. Maybe the minimum should be the point at which you’d have to alert your superiors. Maybe the minimum is the trigger point where your team has decided a different course of action is needed. Whatever you pick, just pick. Make it meaningful and intentional. Not something the software automatically decides for you (though that’s a place to start your thought process).

Data nerds don’t always agree where the scale should end.

There are some who think that parts-of-a-whole data, for example, must always run on a scale that goes all the way to 100%.

I think this squishes the data and makes for an awkward graph, where we can’t fully see what’s happening. Similar to my guidance for where a graph should start, it isn’t likely that any of these bars will ever reach 100% so, in my latest thinking about axis scales, the scale doesn’t have to run to 100%.

We can actually see the data more clearly if we choose an axis that is closer to where our real data ends.

While this does get our data into full view, it might leave out parts of the story. These survey response options (not the data, which I totally made up) come from Human Rights Campaign. Let’s say, just as a hypothetical case, they were interested in recruiting some people who said No into those who said No but I identify as an Ally. Let’s say they know they won’t change the minds of everyone in the No group but they set a goal to grow the percent reporting as Allies to 75%. That should become the new maximum for this scale.

And preferably, let’s label the goal as such so that our reasoning is evident. The point is that you should choose a maximum for your scale that makes sense. Maybe the maximum is your goal or your most successful campaign. This way, the axis itself becomes part of the story you need to tell about your data.
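If you build charts in code rather than in Excel, one way to make those axis choices intentional is to write them down as explicit parameters instead of accepting the defaults. This is a rough sketch under my own assumptions: the `axis_bounds` function, its parameters, and the sample percentages are hypothetical, and it assumes non-negative values.

```python
def axis_bounds(values, floor=None, ceiling=None, is_bar_chart=False):
    """Return (minimum, maximum) for an axis, chosen on purpose.

    floor and ceiling are meaningful reference points, such as an
    alert threshold or a goal. Bar charts always start at zero.
    Assumes all values are non-negative.
    """
    if is_bar_chart:
        low = 0
    else:
        low = floor if floor is not None else min(values)
    high = ceiling if ceiling is not None else max(values)
    if min(values) < low or max(values) > high:
        raise ValueError("Chosen axis bounds would cut off some of the data.")
    return low, high


# Made-up percentages with a 75% goal as the top of a bar chart scale.
print(axis_bounds([12, 35, 53], ceiling=75, is_bar_chart=True))

# Made-up dot plot values with a floor of 30 instead of zero.
print(axis_bounds([38, 42, 47], floor=30))
```

The check at the end is there so that a meaningful-sounding floor or goal never silently hides a data point.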

This advice and more like it are part of the second edition of Presenting Data Effectively, out June 2017 and available for preorder now.

Guest Post: Visualizing Regression Effectively

Updated Note from Stephanie: This blog post generated a lot of discussion. Some of that is in the comments here, some of that has been deleted, some of it came from Twitter and via my inbox. Be sure to read the comments to get a sense of the critique. At the bottom of this post, before the comments, I’ve provided some of my reasoning because I think context helps and to explain why I’ve had to delete comments (Hint: Threats!)

Note from Stephanie: I outlined a few ways to show regression data in my latest book but they all avoid the regression table itself. This guest post from William Faulkner, João Martinho, and Heather Muntzer illustrates how to improve the simple table and how to take that data even further into something that doesn’t require a PhD to interpret.

The world is going to hell in a handbag, and it’s because data viz people haven’t stepped up to the plate.

There’s a dark horse of data viz hidden in plain sight, which has for decades made a mess of one of humanity’s most crucial quantitative tools. This villain is the regression table.

The world runs on regressions.

How many times have you heard “studies show that [blah],” or “it turns out [blah] leads to [blah]”? More often than not, these ‘facts’ are (over)simplified interpretations of a regression. Regressions are THE most common statistical way to determine whether there’s a relationship between two things – like doing yoga and wearing tight pants, or, as we’ll see in a sec, a person’s race and likelihood of being shot by the police.

People interpret the results of regressions using regression tables (and little else)

A ton of super important decisions get made on the basis of simple statements like “studies show you can reduce [blah] by [blah]%.” And these invariably come from a regression table, which usually looks something like this example analysis of 1974 cars, testing whether those with automatic or manual transmissions are more efficient:

(R user? DIY kit available here)

Regression tables are TERRIBLE visualization tools. The WORST.

Nobody wants to look at that thing! Are you kidding? And worse, even if you’re a quantitative genius really interested in the results, it’s STILL hard to intuit what’s going on.

For The Love Of Humanity, Let’s Fix This.

WARNING: This middle section is for the nerds. If you don’t run regressions yourself, feel free to skip down to Section III

The Tame Tweaks

Even without going wild, we can just stop being so careless. With just a few simple tweaks, we can go from this:

To this:

We’re not claiming perfection, but at least we’re not being as cruel to our audience. Seems a lot easier now to see that the automatic-manual distinction is not as important for efficiency when we account for weight and horsepower.

Let’s Go Nuts

Think outside the box (ahem, table). When it comes to regressions, maybe we can just graph the coefficients?

How changing the horsepower, weight, and transmission type of the average 1974 car seems to affect its mileage:

Again, not perfect. But it’s a start towards diagrams that intuitively show what we really care about in most cases:

  • Size of coefficients
  • Uncertainty of coefficients (confidence intervals and/or statistical significance)
  • Explanatory power of the model overall
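For the first two items on that list, the numbers behind a coefficient chart can be pulled from any regression table that reports coefficients and standard errors. Here is a minimal Python sketch, assuming the common normal approximation of roughly 1.96 standard errors for a 95% interval (small samples would call for a t-based multiplier instead). The predictor names and values are made up and are not results from the car analysis above.

```python
def confidence_interval(coef, std_err, z=1.96):
    """Approximate 95% interval: coefficient plus or minus z standard errors."""
    margin = z * std_err
    return coef - margin, coef + margin


def crosses_zero(interval):
    """True when the interval includes zero (no clear direction of effect)."""
    low, high = interval
    return low <= 0 <= high


# Made-up (coefficient, standard error) pairs for illustration only.
estimates = {
    "Predictor 1": (-3.0, 0.5),
    "Predictor 2": (1.0, 1.5),
}

for name, (coef, se) in estimates.items():
    low, high = confidence_interval(coef, se)
    note = "crosses zero" if crosses_zero((low, high)) else "clear of zero"
    print(f"{name}: {coef:+.2f} [{low:+.2f}, {high:+.2f}] {note}")
```

Each row gives you a dot (the coefficient) and a band (the interval) to plot, and an interval that crosses zero is the visual stand-in for a missing asterisk.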

And that’s it!

Because It Is Important.

Last year, Harvard professor Dr. Fryer released a working paper inspiring some controversial headlines.

Did the paper really claim that blacks were 23% less likely than whites to be shot during an encounter with police? The whole hullabaloo boils down to – you guessed it – a regression table which is, as per usual, practically indecipherable:

Let’s try our tweaks from before:

Add a possible title, like:

As we factor in other variables, such as whether the suspect had a weapon, whether the bias is towards blacks or whites becomes a lot less clear and we can’t be nearly as precise about the amount of bias.

That’s better. Not perfect, but better. Turns out as we consider more and more aspects of the encounter, that strong bias towards police shooting white suspects gets a lot more muddled.

And It’s Not Even That Hard.

But you’re right. Who cares about nuance? In a world which constantly steamrolls detail in the name of thumbs-up or thumbs-down now-and-forever conclusions, who’s got time to worry about subtleties?

We’d like to think some people do. We’d like to think oh-so-many-more would take interest were it not for these bristling anathemas – regression tables. Regression analysis and data viz experts, let’s give folks a chance:

  • Be nice to your audience. If you put a regular, white, asterisk-splattered regression table in front of them, that’s inconsiderate. So don’t. Instead,
    • Use accessible labels, translate jargon
    • Take out extra decimals
    • Use color, shading, and transparency to express the key info in multiple ways.
  • Consider what’s important about the analysis – this means both the finding itself and the degree of uncertainty surrounding it! Tools like heatmaps and the coefficient charts above help to put all that detail out there in a quickly digestible format.

Big problem. Reeeeeasonably easy solutions. Now the fate of the world is up to you. No pressure.

Authors:

William Faulkner – Director, Flux RME

João Martinho – Evaluation Specialist, C&A Foundation

Heather Muntzer – Independent Designer

PS: We asked for data from the Harvard team to replicate this study and produce even better visualizations. Despite a few kind replies, they never got around to sharing it.

PPS: More materials from this project are available in this Google Drive folder.

PPPS: Questions? Email William.

Updated Editor’s Note:

I’m updating my editor’s note on this post because of how laughably out of hand things have gotten. Statisticians are really unnerved by some of the wording used in this guest post. It’s ok to disagree. I welcome those discussions and comments because they help everyone keep evolving their thinking.

But I’m heavily screening all comments posted to this thread from this point forward because now I’m getting threats like this one:

If you do not bring that uninformative guest post down, or fix it, I will bring this to the attention of the media and my fellow colleagues at ASA. Trust me, you do not want that kind of attention.

Um what? Report me to the American Statistical Association? LOL

More than one person took issue not with the content of the blog itself, but with the way that I asked the critics to improve upon what they didn’t like. One person wrote:

I am profoundly upset with your and Faulkner’s reaction to the comments.

Rule #1, as I’ve stated before, is that this is my blog. I’ll say what I want. I’ll outright and without apology delete any comments that attempt to tell me how to handle commenters or whether to pull a post. If you feel so strongly that it is bad, don’t read it. Start your own blog.

That doesn’t mean I defend errors. It means that I’m ok with mistakes – I’ve made plenty of my own – especially when they foster good discussion. But good discussion means generation of new ideas. Only one statistician in all of this mix has agreed to make a better attempt – in a few weeks.

Yes, of course, I’m asking critics to do better. Being an armchair critic is easy. To paraphrase Brené Brown, if you aren’t in the arena with me – actively trying to make things better by putting forth efforts that could be wrong or critiqued – I’m not interested in your opinion.

Several commenters questioned my intelligence. And then one guy (they were almost all white guys) said:

People are just trying to help you.

LOL you dudes are so funny. Insulting my intelligence is not help. 

It might help, actually, to understand a bit more about me and the guest post authors. I have a PhD in interdisciplinary research and evaluation. I met the guest authors at an evaluation conference. (Did you see that the lead author’s name is William Faulkner?? How can you NOT have tequila with this guy?) If you don’t know what evaluation is, it’ll be good for me to explain it to you because there’s a pretty big difference between conducting pure statistics and evaluation.

Evaluators are like researchers in that we seek to generate knowledge but we conduct our studies for real organizations who are trying to learn whether they’ve made an impact with their work, or whether new strategies could help them be more efficient. We use anything from observation to a randomized controlled trial to get at the data. Our methods often have to be creative, since we are collecting data from actual humans, not in clinical settings. Our analyses are always rigorous. And we have to generate explanations of those analyses for real human decision-makers, in time for them to actually make use of them.

Our audience is real life, not a journal. Those explanations can be very challenging to compose. It can be difficult to balance statistical jargon fidelity with the need to speak in a plain language for the understanding and action-taking on the table with our clients. Will and team made one of the first attempts I’ve ever seen at making regression more digestible for people. Of course the first attempt will never be perfect. But kudos to them for giving it a shot, instead of just running some stats and wondering why the audience doesn’t get it (or worse, questioning the audience’s intelligence).

Twice as many people sent love and support for this post as those statisticians who got furious. And that’s because people are hungry and eager for something better than the way the stats people have been doing it. So keep building. Ever forward, friends.

Oh and please please PLEASE report me to the American Statistical Association! I’ve seen some slidedecks from their conferences over the years. You could use my help.
