Figure 16-5: Workflow for combining parallel and serial microtasks.
Chapter 16
Combining Microtasks and Preparing Workflow
In This Chapter
Understanding parallel and serial tasks
Checking for errors
Bringing together parallel and serial tasks
Making decisions in workflow
Microtasking takes some effort, there’s no doubt about it. You have to design your tasks, upload them to a microtask market, let the crowd do its work, check the results and then download the work that meets your standards. (See Chapter 8 for a discussion of how to microtask.) With all this to do, you could be forgiven for wanting to keep your microtasking job as simple as possible. After all, why would you want to make it more complicated? Well, of course you should keep your microtasks as simple as possible, but sometimes the very simplest microtasks aren’t the best. Simple microtasks may be easy to design and easy to start, but they can only do simple things.
Checking the results of microtasks can take real effort. You have to review each task to be sure that the work’s done properly. When you’ve uploaded a large number of tasks, you quickly learn that checking the results of each microtask takes a lot of time and isn’t always feasible. Microtasks have a way of expanding. You start by doing 10, but soon you’re doing 100 tasks, and before you know it you’re doing 1,000. You can probably check 10 microtasks and maybe even check 100, but when you’re running 1,000, you can’t check them all.
To check for errors in microtasks or to get microtasks to do complicated things, you need a process called workflow, where the work flows from one task to the next. Workflow is a very powerful tool. With it, you can get microtasks to do things that you might not have imagined were possible (see the sidebar, Using microtasks to see the big picture).
Workflow can be challenging to create. You have to design and test many different microtasks and then work out how to combine them. To help, in this chapter you find out about the basic ideas of workflow. You discover new ways of dividing complicated jobs into small tasks, and clever methods of combining the results. At the end of the chapter, I look at how you might be able to use workflow to handle real problems in crowdsourcing.
Discerning the Difference between Parallel and Serial Microtasks
To understand the nature of workflow, you first need to understand the different kinds of microtask. (If you want to know the basics of microtasking, head first to Chapter 8.) Microtasks come in two principal varieties: parallel tasks and serial tasks. (Technically, a third variety exists called a recursive task, but it’s yet to find any practical application, so I won’t go into that one here.) In addition to these two approaches, you can combine them in many different ways (see the section Combining parallel and serial tasks later in this chapter).
Doing the job all at once: Parallel tasks
Parallel microtasks are the simplest form of microtask. You start with a large problem that you can divide into tiny tasks which you give to the crowd. For these tasks, everyone in the crowd gets the same set of instructions. Each crowd member is doing the same thing but with different data. Each member is trying to transcribe a different business card or characterise a different tweet or recognise faces in a different photograph. One set of instructions. Many people. Different data. No concern about which data is processed first and which is processed last. These things describe parallel microtasks.
Figure 16-1 shows how parallel microtasks work. You have a single job that you can divide into identical tasks. You give the tasks to the crowd. The crowd completes the tasks and returns them to you. You reassemble the tasks into a complete job.
Figure 16-1: Parallel microtasks.
Each audio file contains some standard information: the date and time of the shift, the name of the guard, the location of the property, the state of the property when the guard arrived, anything unusual that happened during the hours of the shift, and the state of the property at the end of the shift. That information’s followed by a list of any events and details that the guard believes to be important: a door left open, a person knocking on the door, a box of files left on the floor.
The simplest way of handling these transcription tasks is to make them into 1,000 parallel microtasks. You upload the audio files to a crowdsourcing platform and let the crowd transcribe the six elements of information from each one. You don’t care which file is transcribed first and which is transcribed last. Eventually, they will all be done, and you can then use the results.
Putting one thing after another: Serial tasks
Serial microtasking tasks are like the stages of an assembly line. You take a complicated task and break it into little steps that you perform in order – in series. You begin by asking the crowd to do the first step. Next, you ask it to do the second, and then the third, and so on until the crowd finishes all the steps. Figure 16-2 shows the steps of a serial task, a task that performs six independent steps on a single object.
Figure 16-2: A single serial task.
1. Transcribe the date and time of the shift.
2. Transcribe the name of the guard.
3. Transcribe the location of the property.
4. Transcribe the state of the property at the start of the shift.
5. Transcribe the events that happened during the shift.
6. Transcribe the state of the property at the end of the shift.
You can create various different kinds of serial microtask to transcribe these audio files. In one way of transcribing files with serial microtasks, you first take one audio file, give it to the crowd and ask a crowdworker to transcribe the date and time of the shift. Secondly, you give that same audio file to another worker and ask that worker to transcribe the name of the guard. Next, you give the file to a third crowdworker, who you ask to transcribe the location of the property. You continue this process through the fourth, fifth and sixth steps.
Do you see what you’ve done there? You’ve divided the transcription job into six new microtasks, but you’ve not accrued any actual benefits from doing the work in serial. If anything, you’ve made your life more complicated. Whereas you once had to manage just one kind of microtask, you now have to manage six.
However, you can tweak these six microtasks just a little bit and get tasks that are crowd corrected. You get a crowd-corrected microtask by asking each worker to both do something new and to check all the work that’s been done before. If the worker sees a mistake in the earlier work, she should correct it. Making this change gives you six new microtasks:
1. Transcribe the date and time of the shift.
2. Transcribe the name of the guard, check the results from step 1 and correct any mistakes.
3. Transcribe the location of the property, check the results from steps 1 and 2 and correct any mistakes.
4. Transcribe the state of the property at the start of the shift, check the results from steps 1, 2 and 3 and correct any mistakes.
5. Transcribe the events that happened during the shift, check the results from steps 1, 2, 3 and 4 and correct any mistakes.
6. Transcribe the state of the property at the end of the shift, check all the information already transcribed and correct any mistakes.
These new tasks have to be done in a fixed order. A crowdworker can’t do the second task until the first task is complete, or the third task until the second is done. Furthermore, each task requires a little more work than its predecessors. The first task doesn’t require any checking of previous work, whereas the second task requires a check of the work of the first task as well as doing some transcription, and the third task requires a check of the work of the first and second tasks as well as doing a little transcription. (Figure 16-3 shows how this job is divided into serial tasks for each audio file.)
Figure 16-3: Serial tasks with the crowd.
The crowd doesn’t see serial microtasks from the same perspective as you do. In fact, the workers may not be aware that they’re doing serial microtasks. For them, each step in the process is a microtask. Because they can see several steps, workers understand that you’re offering several different kinds of microtask. However, they may not be aware that the different microtasks are connected.
Some members of the crowd may look at each microtask and try each one. Others may look at only one and concentrate on it. Each member of the crowd chooses which of the steps she’d like to do, but the crowdsourcing platform ensures that the different steps are done in the proper order. No one attempts to do step 2 before step 1’s been completed.
Minimising Error
Workflow is a powerful tool where you take the results from one microtask and combine them with the output of a second, third or even a fourth microtask to produce a complicated result. You can use workflow for many different reasons, but it’s an especially useful way of minimising the effects of errors, mistakes, goofs, typos, misunderstandings and even sabotage in microtasks.
Appreciating the value of serial tasks
At first, serial tasks may not look too appealing. They tend to be more expensive than parallel tasks, take more time and be more trouble.
So, you may ask why do serial tasks at all. The answer is that you choose serial tasks because they offer a better way than parallel tasks of finding errors in the work. Sure, they take more time, more money and more effort, but they product more reliable results.
Among the six workers processing each audio file, for example, each worker both adds a little bit of information and checks previous workers’ results. If one worker spots an error, she can correct the mistake before she sends it forward to the next step. When the job reaches the last step, it’s been checked by each of the workers along the way.
Duplicating parallel tasks
Despite the benefits they bring, serial tasks can still seem like a lot of work. Because of this, many people instead try to weed out errors in microtasks by using another technique – duplicating parallel tasks and checking the results.
Duplicating parallel tasks does have the benefit of being a simpler alternative to doing serial tasks, but it does have drawbacks of its own. It doesn’t guarantee perfect results, for one. Two crowdworkers can agree and still be wrong. Think about it for a moment. The audio file might have a voice that speaks with an unfamiliar accent or uses a word in an unusual way. When they listen to the file, both workers may misunderstand the words in the same way. They agree, but both are wrong. (For more on this blunder, see Chapter 23.)
Duplication also involves more work than you may at first think. If you transcribe 1,000 audio files with parallel workflow and duplicate all the microtasks, you expect to do at least 2,000 individual microtasks. The exact number you do depends on the percentage of mistakes you see in the work. If 10 per cent of the transcriptions contain errors, you expect to do roughly 2,222 microtasks. (Of the original 2,000 transcriptions, 200 have an error and don’t match. Of those 200, you find 20 with an error when they’re re-transcribed. Of those 20, you find a final 2 with errors in the transcription.)
Working through an Example: Devising Workflow and Making Decisions in Mechanical Turk
When you use workflow in microtasking, you take a large job and break it into a combination of parallel tasks, serial tasks and computer programs. You use workflow in order to bring the power of human intelligence to large, complex jobs.
No matter whether you use parallel or serial microtasks – or a combination of both – an important component of workflow is decision making. When you make a decision in a workflow, you review the results of one microtask with a computer program and automatically decide which microtask should be done next.
Of course, this process may continue for a long time. If no two of the three results agree, the workflow creates a fourth task and again sends it to the crowd. The workflow continues this process until it finds that the results of two tasks agree.
Figure 16-4 gives you a diagram that illustrates this workflow. It shows how the workflow sends the tasks to the crowd, makes a decision and sends the final result to you.
Figure 16-4: Workflow decision making.
See Chapter 8 for an introduction to Amazon’s Mechanical Turk. Mechanical Turk is a general-purpose microtasking crowdsourcing platform that enables you to create your own workflow and to create rules to govern your decision-making process. You do this by writing a program to run on your own computer. That program sends commands to Mechanical Turk to create microtasks, send those microtasks to the market and recover the results of the microtasks after the crowd has done its work.
To create a workflow in Mechanical Turk, your first step is to go to the Mechanical Turk website (www.mturk.com) to create a requestor account. This enables you to post microtasks and to deposit funds that cover the costs of your tasks and the Mechanical Turk fees. When you've done that, you can return to your own computer and create your workflow.
When you create workflow for Mechanical Turk or any microtask platform, you regularly use five commands. Those five commands create microtasks, send the tasks to the microtask market (in this example, Mechanical Turk), review the results of those tasks, and accept or reject them. In Mechanical Turk, those five commands are:
CreateHIT
: creates a single microtask
GetReviewableHITs
: retrieves the microtasks that have been done by the crowd
ApproveAssignment
: approves a microtask and pays the worker
RejectAssignment
: rejects the microtask and doesn't pay the worker
GetAssignmentforHIT
: gets the results for a microtask
Starting with parallel tasks
To create a single microtask, parallel or serial, you use the command CreateHIT
when writing your program. To create 1,000 parallel microtasks for the audio transcription job example (see the earlier section Doing the job all at once: Parallel tasks), you invoke the CreateHIT
command 1,000 times and give it a different audio file each time.
With the CreateHIT
command, you can easily duplicate your microtasks and have multiple crowdworkers transcribe your file. When you invoke the command, you simply tell it that you want to duplicate the work, and specify the number of times you want it duplicated. Again, following the audio file example, you tell Mechanical Turk that you want each of the 1,000 microtasks to be done twice.
The programming commands GetReviewableHITs
and GetAssignmentforHIT
give you a way to compare the results of duplicate microtasks. The first command gives you a list of microtasks that are completed. The second command gives you the results for each microtask. You can then compare the results of the two duplicate tasks. If the results are the same, you can decide that the tasks have been done well. If they differ, you can create a new microtask, again using CreateHIT
, to resolve the difference.
Advancing to serial tasks
You can perform serial tasks in Mechanical Turk, although you need to do a little extra work. The audio file transcription example consists of six different tasks in its serial form. Each task transcribes a new part of the file. After the first task, each subsequent task asks the crowdworker to review and correct the work that’s been done already. (You can see the list of tasks in the section Putting one thing after another: Serial tasks.)
Using CreateHIT
, you can create 1,000 microtasks that do the first step of the serial process, and then give these tasks a name such as Step 1. When the Step 1 tasks are done, you can recover the results using GetReviewableHITs
and GetAssignmentforHIT
. With those results, you create the next step of the process. Using CreateHIT, you create 1,000 new tasks called Step 2. When those tasks are done you proceed to create microtasks for the next steps of the process until you've completed all six steps.
Combining parallel and serial tasks
When you’ve got to grips with handling both serial and parallel microtasks with workflow, you can combine them to create a highly reliable way of doing work. For example, you can take each audio file in the transcription job example, transcribe it with both a parallel microtask and six serial microtasks, and compare the results of the two approaches. If the results from the two different approaches agree, you can be certain that the results of the microtasking are reliable. Figure 16-5 shows the workflow for this process.
Figure 16-5: Workflow for combining parallel and serial microtasks.
This form of microtasking is more reliable than the simple approach that I describe in Chapter 8, simply because the different crowdworkers are unlikely to make the same mistakes.
Going for Gold: The Many Benefits of Workflow
Still, workflow turns microtasking into a very powerful form of crowdsourcing. It enables you to combine different kinds of skills and different approaches to work on a single job, to improve the ability of the crowd to tag images or transcribe text, to improve the quality of data searches, and even to edit articles and reports. It also enables you to improve the reliability of work through a process known as the Gold Standard.
The Gold Standard evaluates the quality of work being done as the crowd handles your microtasks. When you adopt the Gold Standard, you create special microtasks that have known answers and give them to the workers in the middle of your existing job. The workers don’t know that these microtasks are any different from the others or in any way special. They just take on these tasks like any other task and are paid for the work just the same. Therefore, by using these tasks you should be able to capture the reliability of each worker.
If the members of the crowd do the Gold Standard tasks correctly, you can be confident that they are doing all of your work well.
If the crowd isn’t doing the Gold Standard tasks properly, you should feel concerned that the results of the work being produced aren’t reliable. From looking at the results of these specific tasks, you may be able to identify some workers who aren’t producing good results and exclude them from your job.
Mechanical Turk makes it easy for you to implement Gold Standard questions. When you use the CreateHIT
command, you can specify that you're creating Gold Standard questions by using what Mechanical Turk calls an Assignment Review Policy. When it uses this policy, Mechanical Turk compares the correct answers to the question with those given by the crowd and marks those that are wrong. Using workflow, you can then exclude unreliable workers form your jobs.
Many of the other companies that offer crowdsourcing services also use Gold Standard questions to improve the reliability of the results.
18.116.63.191