Card-sorting

From text to themes

T. Zimmermann    Microsoft Research, Redmond, WA, United States

Abstract

Suppose you just ran a survey with the question “What problems are stopping us from meeting our deadlines?” Since you didn’t know the problems in advance, you asked participants to simply write the problems into a text box. This type of question is called an open-ended question, in contrast to closed-ended questions, which limit participants to a list of predefined answer choices. Your survey was very successful and you received hundreds of responses. Now you want to make sense of the data. How can you do this in a systematic way? In this essay, I will show you how card sorting can infer themes from text responses. You will learn everything you need to know for your first card sort.

Keywords

Card sorting; Open card sorts; Closed card sorts; Hybrid card sorts; Surveys; Questionnaires; Text Analysis; Qualitative Analysis

Suppose you just ran a survey with the question “What problems are stopping us from meeting our deadlines?” Since you didn’t know the problems in advance, you asked participants to simply write the problems into a text box. This type of question is called an open-ended question, in contrast to closed-ended questions, which limit participants to a list of predefined answer choices. Your survey was very successful and you received hundreds of responses. Now you want to make sense of the data. How can you do this in a systematic way?

I’ve frequently used card sorting to derive themes from text. Card sorting is widely used to create mental models and derive taxonomies from data, and to deduce a higher level of abstraction and identify common themes. The idea is simple: print text on index cards and sort them into groups that correspond to themes. This has several advantages compared to, say, annotating text in Excel. With physical cards it’s easy to split and merge groups or simply read the cards of a group. You also build up spatial memory of the groups, which helps you remember them.

Over the past years, I have used card sorting in dozens of projects for the analysis of survey data and software artifacts. Here are two examples:

 Andrew Begel and I used card sorting to analyze the responses to a survey question “Please list up to five questions you would like (a team of data scientists who specialize in studying how software is developed) to answer.” We received 679 questions, which we sorted into 12 themes such as Bug Measurements, Development Practices, and Productivity. We then distilled the responses (raw questions) into descriptive questions, which led to a final catalog of 145 questions.

 Silvia Breu, Rahul Premraj, Jonathan Sillito, and I used card sorting to identify frequently asked questions in bug reports. We extracted 947 questions from a random sample of 600 bug reports, which we then sorted into eight themes such as Missing Information, Triaging, and Status Inquiry. For each question we had additional statistics such as when the question was asked, whether it was answered, and how long the response took. This allowed us to combine the card sort with a quantitative analysis.

There are two basic types of card sorts. Open card sorts have no predefined groups; the groups emerge and evolve during the sorting process. Closed card sorts have predefined groups; they are typically used when the themes are known in advance. In some cases, hybrid card sorts can work well too: you start with a representative sample of cards to identify themes, and then later sort the remaining cards into those themes. Most times I’ve used open card sorts.

Card sorting has three phases: in the preparation phase, we create cards for each response; in the execution phase, cards are sorted into meaningful groups with a descriptive title; finally, in the analysis phase, abstract hierarchies are formed in order to deduce general categories and themes.

Preparation Phase

To create the cards, I use the Mail Merge feature of Microsoft Word. The input is simply an Excel spreadsheet with one row per card and all the relevant information separated into columns (with column headers). Word allows you to customize the layout of the cards by using the Mail Merge fields.

It is important to have one thought/comment per row. For surveys, the best approach is to design the survey in a way that allows different thoughts to be entered separately. For example, instead of having one text box for all five responses, Andrew Begel and I provided one text box for each response: “Please enter your first response.” TEXTBOX. “Please enter your second response.” TEXTBOX. And so on. Unfortunately, it is not always possible or practical to design surveys in such a way. In that case, you might have to split responses manually (sometimes with support from your survey tool), ideally before you print your cards. For large documents like webpages, interview transcripts, or PDF files, qualitative coding tools such as ATLAS.ti or QDA Miner (which has a free Lite version) can be used to extract the relevant pieces of text.

On each card, I print demographics, if available, and always a card identifier to uniquely label the card. The demographics provide you with better context during sorting. The identifier allows you to combine the results of the card sort with additional data, which can enable additional analysis later on (see the Analysis Phase for an example). I simply number the cards to get my identifiers, e.g., 100, 101, 102. I typically start at higher numbers (like 100 or 1000) so that all identifiers have the same number of digits; I found that this makes it easier to enter card numbers into the computer later on.
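The numbering scheme can also be sketched in a few lines of Python. This is only an illustration, not part of the Excel and Word workflow described here; the function name `assign_card_ids` and its `start` parameter are hypothetical. The sketch assumes you have the responses as a list of strings and want consecutive identifiers that all have the same number of digits.

```python
def assign_card_ids(responses, start=100):
    """Pair each response with a consecutive ID of uniform width.

    `start` is an assumed default; if the last ID would need more digits
    than the first one, the start is bumped to the next power of ten.
    """
    if not responses:
        return []
    first = start
    while len(str(first)) != len(str(first + len(responses) - 1)):
        first = 10 ** len(str(first))
    return [(first + i, text) for i, text in enumerate(responses)]


# Three responses keep the requested start; 950 responses move to 1000.
cards = assign_card_ids(["slow builds", "unclear requirements", "too many meetings"])
many_cards = assign_card_ids(["placeholder response"] * 950)
```

With 950 responses, starting at 100 would end at 1049, so the sketch starts at 1000 instead to keep every identifier at four digits.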

I usually print cards 4-up on a letter page. Print the text in a large font (at least 20 point); the larger the better, and the easier it is to read. After the mail merge, you can manually reduce the font size for any cards that don’t fit. A trick I recently discovered is to sort the Excel sheet by response length (put longer responses in the first rows to prevent Excel from cutting off text). That way, you only have to go through the first cards to reduce the font size, or you could split the responses into two or more sets (e.g., one set for short responses in a large font, one set for long responses in a smaller font).
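If the responses are available outside Excel, the same trick can be sketched in Python. This is an illustration under assumptions: cards are `(identifier, text)` pairs, and the 200-character threshold is a made-up value that you would tune to your card layout and font size.

```python
def split_by_length(cards, max_chars=200):
    """Sort cards longest-first and split them into long and short sets.

    `cards` is a list of (card_id, text) pairs. `max_chars` is an assumed
    cutoff for what still fits on a card in the large font.
    """
    ordered = sorted(cards, key=lambda card: len(card[1]), reverse=True)
    long_cards = [card for card in ordered if len(card[1]) > max_chars]
    short_cards = [card for card in ordered if len(card[1]) <= max_chars]
    return long_cards, short_cards


sample_cards = [
    (100, "Slow builds."),
    (101, "x" * 250),  # stands in for a very long response
    (102, "Requirements change late in the cycle."),
]
long_set, short_set = split_by_length(sample_cards)
```

The long set can then go through a mail merge with a smaller font, and the short set through one with a large font.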

Execution Phase

Get a large room for your card sort. Besides the cards, bring pens, sticky notes, markers, envelopes, and rubber bands. Use the pens and sticky notes to create descriptive titles once you have several cards in a group. Use the markers to highlight important text, for example, when the text is very long. Plan about two hours for a card-sorting session, but not more than three hours. Longer sessions give you brain freeze. If you have a large number of cards you will need multiple sessions. Don’t forget to take breaks.

The first card is the easiest one to sort. It always starts its own group. For the second card, decide if it is similar to the first group. If it is similar, put it in the first group; if not, start a new group. Repeat for each of the following cards: decide if it fits an existing group, and if not, start a new group. Keep going until you run out of cards.

Don’t overthink where to put a card. If it’s difficult to decide where to put a card, put it at the end of the pile and come back to it later. The groups are not fixed and can change during the course of the card sort. It is fair game to split, merge, or even re-sort groups. For example, when a group gets very big, you might want to revisit the cards and see if there are important subthemes. While sorting, you might discover some cards that make no sense or are off-topic; I put those in a special Discard group.

Card sorting in teams can help you derive better themes and sort a larger number of cards. Jointly sort the first few cards with the entire team (calibration phase), then divide up the cards, but still communicate while sorting. Let others know when you start a new group. When you have an ambiguous card, read the card out loud and ask others for feedback. Avoid having more than four or five people for a card sort. Too many cooks spoil the broth. Always have one team member lead the card sort.

Don’t forget to take photos of your card sort. That will help you to later reconstruct the spatial layout of the cards. Photos are also great to impress friends or to explain the process in a presentation. Please refer to Fig. 1.

Fig. 1 A card sorting session.

After you are done with the card sort, use the envelopes and rubber bands to store the cards. I put the sticky note to the side of the top card and then pile up the groups. Have the sticky note stick out a little from the top card. That helps you to later separate the pile of cards into the individual groups.

Analysis Phase

After you are done with the card sort, it’s a good idea to go through the cards one more time to check for consistency within the groups. It’s still okay to move cards around, though at some point you want to freeze the groups.

As the last step, take a look at the groups and see if you can deduce more general categories and themes, especially if you have many groups. In the card sort of data science questions, Andrew Begel and I ended up with 60 groups… too many to make sense of the data! We then used affinity diagrams to find general themes. The result was a hierarchy of themes and categories. Each group from the card sort was a category. We then combined categories into themes. For example, the groups “Productivity Measures,” “Measuring the Individual,” “Impact on Productivity from Build and Process Tools,” and “Tradeoffs between Vendors and Full Time Employees” all became part of the theme “Productivity.”
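Once the hierarchy is entered into the computer, it can be kept as a simple mapping. The sketch below is an assumption about how one might store it in Python; it uses only the “Productivity” example from the text, and the helper name `invert_hierarchy` is hypothetical. Inverting the mapping gives a lookup from each category to its theme and flags categories that were accidentally assigned twice.

```python
# Theme -> categories, using the "Productivity" example from the text.
theme_to_categories = {
    "Productivity": [
        "Productivity Measures",
        "Measuring the Individual",
        "Impact on Productivity from Build and Process Tools",
        "Tradeoffs between Vendors and Full Time Employees",
    ],
}


def invert_hierarchy(hierarchy):
    """Build a category -> theme lookup and reject double assignments."""
    category_to_theme = {}
    for theme, categories in hierarchy.items():
        for category in categories:
            if category in category_to_theme:
                raise ValueError(f"{category!r} is assigned to more than one theme")
            category_to_theme[category] = theme
    return category_to_theme


category_to_theme = invert_hierarchy(theme_to_categories)
```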

If you have extra data, you can do a quantitative analysis of the themes. For example, for the card sort of bug report questions, we had the response time for each question. With the identifier, we could tie the response time back to the different themes and check for differences. Before you can do the analysis, you need to enter the results of the card sort into the computer. Always enter card numbers in pairs: one person reads out the numbers, the other types them in. Both check for errors. To combine card sort categories with the original data, I use the VLOOKUP function in Excel.
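The same lookup can be sketched outside Excel. The Python below is a minimal illustration, not the method used in the studies above: the function name, the `card_id` and `response_hours` field names, and all numbers are hypothetical. It joins the entered card sort results with the original data by card identifier, computes the median response time per theme, and lists identifiers that have no theme so that typing errors surface early.

```python
from collections import defaultdict
from statistics import median


def summarize_by_theme(sort_results, original_rows):
    """Join card sort results with original data on the card identifier.

    sort_results:  dict mapping card_id -> theme (entered after the sort)
    original_rows: list of dicts with 'card_id' and 'response_hours'
    Returns (median response time per theme, card IDs without a theme).
    """
    times_by_theme = defaultdict(list)
    unmatched = []
    for row in original_rows:
        theme = sort_results.get(row["card_id"])
        if theme is None:
            unmatched.append(row["card_id"])  # likely a data entry error
        else:
            times_by_theme[theme].append(row["response_hours"])
    summary = {theme: median(times) for theme, times in times_by_theme.items()}
    return summary, unmatched


# Made-up data for illustration only.
sort_results = {100: "Missing Information", 101: "Triaging", 102: "Missing Information"}
original_rows = [
    {"card_id": 100, "response_hours": 4.0},
    {"card_id": 101, "response_hours": 10.0},
    {"card_id": 102, "response_hours": 8.0},
    {"card_id": 103, "response_hours": 1.0},
]
summary, unmatched = summarize_by_theme(sort_results, original_rows)
```

Reporting unmatched identifiers complements the paired data entry: a card number that was mistyped shows up here instead of silently dropping out of the analysis.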

Last, a word of caution. A common mistake is to give more importance to themes that have more cards; however, quantifying inherently qualitative data is dangerous. Here’s a simple example for the question from the beginning “What problems are stopping us from meeting our deadlines?” If someone does not mention the slow build system as a problem, it does not automatically mean that they are happy with the build system. They might not have recently worked with the build system or they just did not think of it when they took the survey.

That’s all you need to get started with your first card sort. Good luck!

References

[1] Begel A., Zimmermann T. Analyze this! 145 questions for data scientists in software engineering. In: Proceedings of the 36th international conference on software engineering (ICSE 2014). New York, NY: ACM; 2014:12–23.

[2] Breu S., Premraj R., Sillito J., Zimmermann T. Information needs in bug reports: improving cooperation between developers and users. In: Proceedings of the 2010 ACM conference on computer supported cooperative work (CSCW 2010). New York, NY: ACM; 2010:301–310.

[3] Mail merge for email, http://office.microsoft.com/en-us/word-help/use-word-mail-merge-for-email-HA102809788.aspx.

[4] Affinity diagram, https://en.wikipedia.org/wiki/Affinity_diagram.
