Mechanical Turk Tutorial

Amazon Mechanical Turk is a website where you can post short tasks and have workers perform them quickly for small sums of money. It is ideal for running short psychology experiments, since it allows large amounts of data to be collected quickly and easily. While Turk is a slightly less controlled environment than the lab, running studies on it has a number of advantages. In particular, it allows experimenters to (1) collect enough data to ensure their studies are not underpowered, as psychology studies chronically are; (2) collect data from a sample that is much more diverse than local college undergraduates and much more representative of the US at large; and (3) easily replicate studies before publishing them to ensure that effects are real and effect sizes aren't inflated.

I have personally run many studies on Mechanical Turk (for example, Brady & Alvarez, 2011, Psychological Science; Brady & Tenenbaum, 2013, Psychological Review; Brady & Alvarez, 2015, JEP:LMC). To help others run such studies, I've put together a series of tutorials that teach experimental psychologists to code Turk studies using HTML, CSS and Javascript (with jQuery).

I've run this tutorial twice, and the videos from both sessions are presented here (Northwestern, Harvard). The most up-to-date version of these videos is from a tutorial given at Northwestern University. Northwestern generously had these professionally filmed and hosts them on their website so others can learn from them.

These tutorials are really only a start. There are many different techniques for coding experiments and these tutorials only introduce one of these techniques. At the bottom of the Northwestern tutorials, I've also included some examples of other, more advanced ways to run experiments, including code -- feel free to adapt for your purposes.

Tutorials from Northwestern Univ.

Video 1: Running experiments on Mechanical Turk

The first video from this set of tutorials is an introduction to Mechanical Turk, aimed at PIs and providing an overview of Turk for all researchers. It is approximately one hour long and describes the kinds of studies one can run on Turk, in addition to the pros and cons of Turk research and the general principles of running experiments on the web.

The abstract of this talk was as follows:

Mechanical Turk is a service provided by Amazon that enables anybody to post short tasks and have participants from around the United States or the world complete them in exchange for payment. Workers can browse a huge set of tasks and choose which to complete. Only a small proportion of tasks on Turk are cognitive science experiments: the vast majority are short, simple tasks for business purposes (e.g., translating or transcribing speech, or screening for inappropriate images).

In this talk, I'll give a brief overview of using Mechanical Turk as a large-scale subject pool for research. I'll review the pros and cons of Turk as a participant sample and as a platform for experiments. I'll review what we know about the demographics of Turk users and what we know about the validity of Turk for psychology experiments. I'll then explore the technical and practical limitations of running online experiments, and give a variety of examples of Turk studies, both from my own work (e.g., Brady & Alvarez, 2011, Psych. Science; Brady & Tenenbaum, 2013, Psych. Review) and from others.

Note: if you're running an ad-blocker, it will sometimes block the videos on this page; you may need to turn it off.

View larger video here.

The code from the quickly-coded experiment in this talk (grey vs. gray) is as follows:

<script>
// Randomly assign each participant to one of the two spellings
if (Math.random()<0.5) {
 document.write('<p>Which of these is the best example of the color gray?</p>');
 document.write('<input type="hidden" name="whichGray" value="gray">');
} else {
 document.write('<p>Which of these is the best example of the color grey?</p>');
 document.write('<input type="hidden" name="whichGray" value="grey">');
}
</script>
<p><label><input name="darknessVal" type="radio" value="0" /><span style="background-color: rgb(30,30,30); width: 50px; display:inline-block">&nbsp;</span></label></p>
<p><label><input name="darknessVal" type="radio" value="1" /><span style="background-color: rgb(50,50,50); width: 50px; display:inline-block">&nbsp;</span></label></p>
<p><label><input name="darknessVal" type="radio" value="2" /><span style="background-color: rgb(70,70,70); width: 50px; display:inline-block">&nbsp;</span></label></p>
<p><label><input name="darknessVal" type="radio" value="3" /><span style="background-color: rgb(90,90,90); width: 50px; display:inline-block">&nbsp;</span></label></p>
<p><label><input name="darknessVal" type="radio" value="4" /><span style="background-color: rgb(110,110,110); width: 50px; display:inline-block">&nbsp;</span></label></p>
<p><label><input name="darknessVal" type="radio" value="5" /><span style="background-color: rgb(130,130,130); width: 50px; display:inline-block">&nbsp;</span></label></p>
<p><label><input name="darknessVal" type="radio" value="6" /><span style="background-color: rgb(150,150,150); width: 50px; display:inline-block">&nbsp;</span></label></p>
<p><label><input name="darknessVal" type="radio" value="7" /><span style="background-color: rgb(170,170,170); width: 50px; display:inline-block">&nbsp;</span></label></p>
<p><label><input name="darknessVal" type="radio" value="8" /><span style="background-color: rgb(190,190,190); width: 50px; display:inline-block">&nbsp;</span></label></p>
<p><label><input name="darknessVal" type="radio" value="9" /><span style="background-color: rgb(210,210,210); width: 50px; display:inline-block">&nbsp;</span></label></p>
<p><label><input name="darknessVal" type="radio" value="10" /><span style="background-color: rgb(230,230,230); width: 50px; display:inline-block">&nbsp;</span></label></p>

The data from this "experiment" is available here (with Turk worker IDs removed for anonymity).
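
The random split in the snippet above comes down to a single comparison against Math.random(). As a minimal sketch (the names pickSpelling and simulateSplit, and the optional rng argument, are illustrative additions and not part of the original HIT), the same logic can be pulled into a function that is easy to check outside the browser:

```javascript
// Hypothetical helper: returns "gray" or "grey" with equal probability.
// rng is any function returning a number in [0, 1); defaults to Math.random.
function pickSpelling(rng) {
  rng = rng || Math.random;
  return rng() < 0.5 ? "gray" : "grey";
}

// Simulate many participants to see how even the split is.
function simulateSplit(n) {
  var counts = { gray: 0, grey: 0 };
  for (var i = 0; i < n; i++) {
    counts[pickSpelling()]++;
  }
  return counts;
}

console.log(simulateSplit(1000));
```

With simple randomization like this, the two groups will rarely be exactly equal in size, which matters little for the large samples Turk makes possible.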

Video 2: Programming experiments for Mechanical Turk

The second video is a tutorial session where I taught participants to code a simple experiment, involving the following principles:

  • Counterbalancing the order of trials
  • Brief presentations of stimuli
  • Recording reaction times from mouse clicks and key presses
  • Saving data using hidden inputs
  • Debugging Javascript using the inspector
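
To make the reaction-time item in this list concrete, here is a minimal sketch of the underlying idea: note the time when the trial starts and subtract it from the time of the response. The helper name makeTrialTimer and the injectable clock are assumptions made for illustration; the session code itself simply compares two new Date() values.

```javascript
// Hypothetical helper: builds a timer around any clock function.
// now() must return the current time in milliseconds.
function makeTrialTimer(now) {
  var startTime = null;
  return {
    // Call when the stimulus appears.
    start: function () { startTime = now(); },
    // Milliseconds elapsed since start(); null if start() was never called.
    stop: function () {
      return startTime === null ? null : now() - startTime;
    }
  };
}

// In a browser you might pass Date.now as the clock.
var timer = makeTrialTimer(Date.now);
timer.start();
var rt = timer.stop();
console.log("elapsed ms:", rt);
```

In a Turk HIT, the number returned by stop() is what would be written into a hidden input so that it is submitted along with the response.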

The abstract of this talk was as follows:

In this tutorial, I'll go into the details of how to run studies on Amazon's Mechanical Turk. I'll start with a brief overview of Turk; review and take questions on what is and isn't technically possible; and discuss what experimental designs should and should not be used on Turk. We will then begin a hands-on tutorial, where we collectively implement, gather data on, and analyze the data from a simple experiment (e.g., getting folks to rank the size of a series of objects). Once we've done that, we will implement a more complex experiment, including counterbalancing/randomizing, showing stimuli for only brief durations and other advanced techniques.

The technologies used to code experiments on Turk are the same ones used for webpages: HTML for structure; CSS ('cascading style sheets') for display and positioning elements of the page; and Javascript for manipulating elements of the page and controlling and changing the HTML and CSS in response to user input. In this tutorial, you'll learn a bit about each of these languages and how they interact.

Please bring a laptop, preferably with Google Chrome installed. In addition, please set up a Mechanical Turk requester account in advance (this can be slow and require additional verification for international folks, so shouldn't be left for the day before).

This video lasts approximately 3.5 hours, most of the time spent coding and teaching people to code experiments. This is the video to watch if you want hands-on experience with coding a simple experiment for Mechanical Turk. Professor Steve Franconeri is the handsome fellow who introduced me. Video:

View larger video here.

A link to the Google Doc we reference throughout the experiment (with all the pasting I did preserved in it) is

The final code from this session is as follows:

<script src=""></script>
<style>
.trialDiv {
  border: 2px solid gray;
  padding: 20px;
  width: 200px;
  margin: 0 auto;
  display: none;
}

body {
  font-family: Arial, Helvetica;
  font-size: 14pt;
}

#submitButton {
  display: none;
}

#instructions {
  width: 300px;
  margin: 0 auto;
  text-align: center;
  margin-top: 100px;
}

#startExperiment {
  color: rgb(200,200,200);
  text-decoration: none;
}

#startExperiment:hover {
  color: white;
}

#startExperimentButton {
  width: 200px;
  text-align: center;
  border: 4px outset gray;
  background: gray;
  padding: 5px;
  margin: 0 auto;
}
</style>

<div id="instructions">
<p>Judge whether this object is bigger or smaller than a <b>shoebox</b>.</p>
<div id="startExperimentButton">
<a href="#" id="startExperiment">Start Experiment</a>
</div>
</div>

<div id="trial1" class="trialDiv">
<p><img src="" width="100" id="trialImage1"></p>
<p><input type="radio" class="responseButton" name="question1" value="-1"> smaller</p>
<p><input type="radio" class="responseButton" name="question1" value="1"> bigger</p>
<input type="hidden" name="question1RT" value="0" id="reactionTime1">
</div>

<div id="trial2" class='trialDiv'>
<p><img src="" width="100" id="trialImage2"></p>
<p><input type="radio" class="responseButton" name="question2" value="-1"> smaller</p>
<p><input type="radio" class="responseButton" name="question2" value="1"> bigger</p>
<input type="hidden" name="question2RT" value="0" id="reactionTime2">
</div>

<div id="trial3" class='trialDiv'>
<p><img src="" width="100" id="trialImage3"></p>
<p><input type="radio" class="responseButton" name="question3" value="-1"> smaller</p>
<p><input type="radio" class="responseButton" name="question3" value="1"> bigger</p>
<input type="hidden" name="question3RT" value="0" id="reactionTime3">
</div>

<div id="trial4" class='trialDiv'>
<p><img src="" width="100" id="trialImage4"></p>
<p><input type="radio" class="responseButton" name="question4" value="-1"> smaller</p>
<p><input type="radio" class="responseButton" name="question4" value="1"> bigger</p>
<input type="hidden" name="question4RT" value="0" id="reactionTime4">
</div>

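<script>
/* Lines marked "reconstructed" were missing from this copy of the code
   and are a best guess at what the session ended with. */

/* Leftover test handler from the session (body not preserved) */
$("#testDiv").click(function() {
});

var trialOrder = shuffle([4,3,2,1]);
var curTrial = 0;
var nTrials = 4;
var startTrialTime;

/* Save the reaction time for the current trial, then move on */
function trialIsOver() {
  var curTime = new Date();
  var rt = curTime - startTrialTime;
  $('#reactionTime' + trialOrder[curTrial]).val(rt);

  $('#trial' + trialOrder[curTrial]).hide();
  $(document).unbind("keydown.steve"); // reconstructed: remove this trial's key handler
  curTrial++; // reconstructed
  if (curTrial >= nTrials) {
    $('#submitButton').show(); // reconstructed: all trials are done
  } else {
    // RT is timed from here, so it includes the 500 ms delay in showTrial()
    startTrialTime = new Date();
    showTrial(trialOrder[curTrial]); // reconstructed
  }
}

/* Show a trial after a 500 ms blank, then hide its image after 200 ms */
function showTrial(whichImage) {
  setTimeout(function() {
    $('#trial' + whichImage).show();
    setTimeout(function() {
      $('#trialImage' + whichImage).hide();
    }, 200);

    $(document).bind("keydown.steve", function(event) {
      if (event.which == 83) { // 's' key
        console.log("pressed small");
      }
      if (event.which == 66) { // 'b' key
        console.log("pressed big");
      }
    });
  }, 500);
}

/* Mouse-tracking handler from the session (body not preserved) */
$("#trial4").bind("mousemove", function(e){
});

function showFirstTrial() {
  $('#instructions').hide(); // reconstructed
  startTrialTime = new Date();
  showTrial(trialOrder[curTrial]); // reconstructed
}

/* Fisher-Yates shuffle */
function shuffle(o){
    for(var j, x, i = o.length; i; j = Math.floor(Math.random() * i), x = o[--i], o[i] = o[j], o[j] = x);
    return o;
}

/* Wait for clicks */
$('#startExperiment').click(function(event) { // reconstructed
  event.preventDefault();
  showFirstTrial();
});
$('.responseButton').click(trialIsOver); // reconstructed
</script>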

<!-- do not put this on Turk -->
<input type="submit" id="submitButton">
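
The one-line shuffle() in the session code is compact but hard to read. As a sketch of what it does, here is the same Fisher-Yates algorithm written out step by step. The name shuffleReadable and the optional rng argument are illustrative additions, not part of the session code:

```javascript
// Readable Fisher-Yates shuffle: shuffles the array in place and returns it.
// rng is an optional stand-in for Math.random (added here for testing).
function shuffleReadable(items, rng) {
  rng = rng || Math.random;
  // Walk from the last position down, swapping each element with a
  // randomly chosen element at or before it.
  for (var i = items.length - 1; i > 0; i--) {
    var j = Math.floor(rng() * (i + 1)); // random index in 0..i
    var tmp = items[i];
    items[i] = items[j];
    items[j] = tmp;
  }
  return items;
}

console.log(shuffleReadable([1, 2, 3, 4]));
```

Because every ordering is equally likely, shuffling the trial numbers like this randomizes trial order independently for each participant.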

Please note that the technique used in this video -- in particular, entering the experiment directly into the Turk code box and using <form> fields (like <input type='hidden'>) to save the data -- is only an "entry-level" view of how to use Turk. If you look at some of my example experiments, you will see that they do not use this format at all. Instead, they are external webpages (e.g., the Month-Number HIT described in the first video) and do not save their data to Turk at all. If you view-source on that HIT, you will see that it actually saves the data in a different way, directly to my web server:

/* Save the data to the server (choose what to save) */
function SaveData() {
  var totalTime = (new Date()) - startExpTime;
  var newDate = new Date();
  var curID = (IsOnTurk())? GetAssignmentId() : prompt("Doesn't look like you " + 
  "are on Turk, so you're probably testing. Enter an ID to save your data with:", "id");
  var d = { 
    "curID": curID,
    "workerID": GetWorkerId(),
    // The date part of this line was lost in this copy; toDateString() is a
    // standard stand-in. timeNow() is a custom Date helper the HIT must define.
    "curTime": newDate.toDateString() + " @ " + newDate.timeNow(),
    "userAgent": navigator.userAgent,
    "windowWidth": $(window).width(),
    "windowHeight": $(window).height(),
    "screenWidth": screen.width,
    "screenHeight": screen.height,   
    "totalTime": totalTime,
    "comments": $('#comments').val(),
    "birthday": $('#date').val() + "," + $('#month').val() + "," + $('#year').val(),
    "trialStruct": trialStruct
  };
  SendToServer(curID, d);
}

/* Send the data to the server as JSON: */
function SendToServer(id, curData) {
  var dataToServer = {
    'id': id,
    'experimentName': 'MonthType',
    'curData': JSON.stringify(curData)
  };
  // The URL of the save script is omitted here; fill in your own.
  $.post("", dataToServer,
    function(data) { 
      // Success: the data is saved, so the HIT can now be submitted to Turk
    }
  ).fail(function(data) { 
    // Failure: the data was not saved; tell the participant or retry
  });
}

There are a few things to note about this code: First of all, it depends (probably not in a good way!) on a few facts about how the rest of the HIT is designed:

  • It expects a variable called startExpTime to have been set to a new Date() when the experiment started
  • It expects functions called GetAssignmentId() and GetWorkerId() to exist (these are available by copying them from:
  • It expects a hidden input element with the id of assignmentId to exist -- you need to create this element and set its value yourself if you post the HIT as a separate webpage (an "external question HIT"; see Amazon's help on this, or see below for how to post such a HIT)
  • It expects that, by the time the SaveData() function is called, participants have entered information about their birthday, some comments, and all the relevant data is saved in a variable called trialStruct (you can see this is how it is stored by looking at the Javascript that shows the trials)
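
For readers who want to see what helpers like GetAssignmentId() and GetWorkerId() have to do, here is a minimal sketch. It is not the code from the helper file mentioned above; the names getTurkParams, isPreview and isOnTurk are hypothetical. It relies on the fact that Turk appends assignmentId, hitId, workerId and turkSubmitTo to the URL of an external HIT, and uses the placeholder ASSIGNMENT_ID_NOT_AVAILABLE while a worker is only previewing.

```javascript
// Hypothetical helper: pull the values Turk appends to an external HIT's URL.
// search is the query string, e.g. window.location.search in a browser.
function getTurkParams(search) {
  var params = new URLSearchParams(search);
  return {
    assignmentId: params.get("assignmentId"),
    hitId: params.get("hitId"),
    workerId: params.get("workerId"),
    turkSubmitTo: params.get("turkSubmitTo")
  };
}

// Turk uses this placeholder assignmentId while a worker is only previewing.
function isPreview(turkParams) {
  return turkParams.assignmentId === "ASSIGNMENT_ID_NOT_AVAILABLE";
}

// Treat the page as "on Turk" only if a real assignmentId was passed in.
function isOnTurk(turkParams) {
  return turkParams.assignmentId !== null && !isPreview(turkParams);
}

var example = getTurkParams(
  "?assignmentId=ABC123&hitId=HIT1&workerId=W1" +
  "&turkSubmitTo=https%3A%2F%2Fwww.mturk.com"
);
console.log(example.assignmentId, isOnTurk(example));
```

When the page is opened directly for testing, none of these parameters are present, which is the case the prompt() fallback in SaveData() handles.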

In addition, the HIT, as designed, sends the data by POST-ing it to my server -- notice that it calls a Perl file there. If you adapt this technique, you'll need your own server (one that supports https, the secure protocol) and your own copy of a file that saves the data. The contents of that file are included here:

#!/usr/bin/perl
use CGI qw(:standard);
use CGI::Carp qw(warningsToBrowser fatalsToBrowser);  # Make sure we can see any errors

# Structure on the server:
#  This file:
#  Data gets saved in:
#  Each subject gets saved using their id:
# Note that this means that anybody can load this file directly and
# create files on your server. E.g., if somebody goes to the URL
# that will create a file on my server in the directory tim. So it isn't the
# safest thing in the world. But I don't have any problems, and they can
# only create text files.
# How to make this file work:
#  You'll probably need to change the top line to be the address of perl
#  on your server (it might be /usr/bin/perl or /usr/local/bin/perl).
#  You'll also need to chmod this file to 755.

print header;
my $cgi = CGI->new;

# Parse out the three fields we know the experiments always send us,
# the id, the experimentName, and the data (curData). Save the data
# in a file called {id}.txt in folder {experimentName}.
my $id = $cgi->param('id');
my $experimentName = $cgi->param('experimentName');
my $curData = $cgi->param('curData');

mkdir "data/${experimentName}", 0777 unless -d "data/${experimentName}";
open(FID, '>', "data/${experimentName}/${id}.txt") or die "Could not open file: $!";
print FID $curData;
close(FID);
print "done";

Please read the comment carefully -- if you put this on your own server, it opens up the possibility for anybody to create files on your server. Don't put this on your server without knowing what you're doing. In particular, you may want to use a .htaccess file that only allows this file to be called from your own server, not from an external server. If Perl is not your thing, the same thing can be achieved in Python:

#!/usr/bin/python
import cgi
import os
import sys
sys.stderr = sys.stdout # Make sure we can see any errors

# Structure on the server:
#  This file:
#  Data gets saved in:
#  Each subject gets saved using their id:
# Note that this means that anybody can load this file directly and
# create files on your server. E.g., if somebody goes to the URL
# that will create a file on my server in the directory tim. So it isn't the
# safest thing in the world. But I don't have any problems, and they can
# only create text files.
# How to make this file work:
#  You'll probably need to change the top line to be the address of python
#  on your server (it might be /usr/bin/python or /usr/local/bin/python).
#  You'll also need to chmod this file to 755.

try:
  # Read the id from the JSON and use as the filename:
  fs = cgi.FieldStorage()
  folder = "data/" + fs["experimentName"].value
  if not os.path.exists(folder):
    os.makedirs(folder)
  saveFile = open(folder + "/" + fs["id"].value + ".txt", "w")
  # Now read the field curData, which we will assume is a string
  # and print it to this file:
  saveFile.write(fs["curData"].value)
  # Close the file and tell jQuery all went well:
  saveFile.close()
  print("Status: 200 OK")
  print("Content-type: text/plain")
  print("")  # a blank line ends the headers
  print(fs["id"].value + " saved")
except Exception:
  # Tell jQuery something went wrong
  print("Status: 400 Bad Request")
  print("Content-type: text/plain")
  print("")
  print("Error")

This Python version has all the same caveats as the Perl version. Don't put this on your server without knowing what you're doing. In particular, you may want to use a .htaccess file that only allows this file to be called from your own server, not from an external server.
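
One way to reduce the risk described above is to check the id and experimentName before using them to build a file path. The sketch below shows the idea in JavaScript, to stay in one language for the examples on this page; the same regular-expression check carries over directly to the Perl or Python script. The function names and the 64-character limit are assumptions chosen for illustration.

```javascript
// Hypothetical check: accept only short names made of letters, digits,
// underscores and hyphens, so values like "../x" or "a/b" are rejected.
function isSafeName(name) {
  return typeof name === "string" && /^[A-Za-z0-9_-]{1,64}$/.test(name);
}

// Build the relative save path, or return null if either name is unsafe.
function buildSavePath(experimentName, id) {
  if (!isSafeName(experimentName) || !isSafeName(id)) {
    return null;
  }
  return "data/" + experimentName + "/" + id + ".txt";
}

console.log(buildSavePath("MonthType", "ABC123")); // data/MonthType/ABC123.txt
console.log(buildSavePath("../etc", "passwd"));    // null
```

This does not stop strangers from creating files, but it does keep every file inside the data folder.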

These scripts both save the data in a JSON-style file: this is a file format designed to mimic the way Javascript writes arrays and objects. Most languages you use to analyze data probably have their own way of reading JSON formatted files. For example, in MATLAB, the function loadjson(), from the File Exchange, works.
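
As a small illustration of how easily such a file can be read back, here is a sketch in Javascript itself. The field names (curID, totalTime, trialStruct, rt, and so on) follow the code shown on this page, but the values are made up for the example and summarizeSubject is a hypothetical helper.

```javascript
// Example contents of one subject's saved file (made-up values).
var fileContents = JSON.stringify({
  curID: "test-id",
  totalTime: 95000,
  comments: "",
  trialStruct: [
    { stimulusName: "january", responseTyped: "1", rt: 820 },
    { stimulusName: "march", responseTyped: "3", rt: 640 }
  ]
});

// Parse the saved text back into an object and compute simple summaries.
function summarizeSubject(jsonText) {
  var d = JSON.parse(jsonText);
  var rts = d.trialStruct.map(function (trial) { return trial.rt; });
  var total = rts.reduce(function (sum, rt) { return sum + rt; }, 0);
  return {
    id: d.curID,
    nTrials: rts.length,
    meanRT: rts.length ? total / rts.length : null, // in milliseconds
    minutes: d.totalTime / 1000 / 60
  };
}

console.log(summarizeSubject(fileContents));
```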

Thus, my analysis pipeline, for example, uses a MATLAB function that automatically downloads all the new data from my server and converts it to MATLAB format before analyzing it:

function Process()
  % Download any new subjects from the server:
  dataFolder = 'Data';
  experimentName = 'MonthNumber';
  FetchFromServer(experimentName, dataFolder);
  % Load each subject in turn:
  subjects = dir(fullfile(dataFolder, '*.mat')); 
  for i=1:length(subjects)
    % Load the data: 
    curD = load(fullfile(dataFolder, subjects(i).name));
    data{i} = curD.d;
    % Print the subject assignment ID and their comments, if any:
    fprintf('(%d) %s %s\n', i, subjects(i).name, data{i}.comments);
    % How long did it take them to finish the HIT?
    totalTime(i) = data{i}.totalTime/1000/60;
    % Process the data for this subject:
    responses = arrayfun(@(x){x.responseTyped}, data{i}.trialStruct);
    stimName = arrayfun(@(x){x.stimulusName}, data{i}.trialStruct);
    rt_sub = arrayfun(@(x)(x.rt), data{i}.trialStruct);
    rt_sub_start = arrayfun(@(x)(x.startType), data{i}.trialStruct);
  end

  % How long did it take subjects to finish the task, including instructions/breaks?
  [n, x] = hist(totalTime); % reconstructed: bin the completion times
  bar(x,n,'FaceColor',[.5 .5 .5]);
  xlabel('Minutes to complete task');

  % Plot data, etc...	

% Download data from the server and convert it to .mat:
% ----------------------------------------------------------------
function FetchFromServer(expName, dataDir)
  % This function depends on your server allowing directory
  % listings in these folders, e.g., if you go to 
  % you'll see the list of files there. You can usually configure
  % this separately for different directories, depending on your
  % server.
  % One other trick here is that you can create an 'Exclude'
  % folder inside your data folder, and put subjects in there
  % (if, let's say, their file didn't save correctly). You need
  % this because the script will just redownload the files over
  % again if you delete or move a subject's file. So the only way
  % to keep it from doing that is to put it into the Exclude folder.
  serverUrl = ['' expName '/'];
  dirInfo = urlread(serverUrl);
  k = strfind(dirInfo, 'Parent Directory');
  [~, ~, ~, ~, uris] = regexp(dirInfo(k:end), '<a href="(.*?).txt">');
  for i=1:length(uris)
    fName = [uris{i}{1} '.txt'];
    % If we haven't downloaded this file yet:
    if ~exist(fullfile(dataDir, fName), 'file') && ...
        ~exist(fullfile(dataDir, 'Exclude', fName), 'file')
      % Download file:
      urlwrite([serverUrl fName], fullfile(dataDir, fName));
      % Create mat file:
      matFile = fullfile(dataDir, [fName(1:end-4) '.mat']);
      if ~exist(matFile, 'file')
        fprintf('Creating .mat file for %s\n', fName(1:end-4));
        d = loadjson(fileread(fullfile(dataDir, fName)));
        iSaveD(matFile, d);
      end
    end
  end

% ----------------------------------------------------------------
function iSaveD(fname, d) %#ok<INUSD>
  save(fname, 'd');

So the first more advanced technique for using Turk is to save the data to your own server, rather than using form fields to save it to Turk. There is another more advanced aspect, mentioned above, that my HITs all use -- they are all hosted on my own webpage and posted as an external HIT. Posting an external HIT -- that is, hosting the HIT on your own server, but having it be shown as a HIT on MTurk the same as any other -- is pretty straightforward from Python using boto. Here's a sample of how you'd take a webpage (say, the month number example from above) and post it as a HIT:

from boto.mturk.connection import MTurkConnection
from boto.mturk.question import ExternalQuestion
import boto.mturk.qualification as mtqu
from dateutil.parser import *

ACCESS_ID = 'YourTurkAccessID'
SECRET_KEY = 'YourTurkSecretKey'
#HOST = '' # Use this to post to the sandbox instead
HOST = ''

def PostHits():
  mtc = MTurkConnection(aws_access_key_id=ACCESS_ID,
                        aws_secret_access_key=SECRET_KEY,
                        host=HOST)
  q = ExternalQuestion(external_url = "", frame_height=675)
  keywords = ['memory', 'psychology', 'game', 'fun', 'experiment', 'research']
  title = 'Type the name of the months and days of the week quickly!'
  experimentName = 'MonthNumberGame'
  description = 'Play a short 2 minute game where you have to know the numbers that go with months and days.'
  pay = 0.50
  qualifications = mtqu.Qualifications()
  qualifications.add(mtqu.PercentAssignmentsApprovedRequirement('GreaterThanOrEqualTo', 90))
  qualifications.add(mtqu.LocaleRequirement("EqualTo", "US"))

  # The arguments between lifetime and duration were missing from this copy;
  # they are filled in from the variables defined above. max_assignments is
  # left at boto's default; pass it explicitly to collect more subjects.
  theHIT = mtc.create_hit(question=q,
                          lifetime=10 * 60 * 60, # 10 hours
                          title=title,
                          description=description,
                          keywords=keywords,
                          qualifications=qualifications,
                          reward=pay,
                          duration=120 * 60, # 120 minutes
                          approval_delay=5 * 60 * 60, # 5 hours
                          annotation=experimentName)

  assert(theHIT.status == True)
  print(theHIT)
  print(theHIT[0].HITId)


To use this API, you'll need to get an access ID and secret key from Amazon that are connected to your Amazon account. Instructions for this are available here.

There are some requirements to making a webpage that can serve as an external HIT. In particular, the basic format you'll want is below:

<script src=""></script>

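<form id="turkSubmit" action="" method="post">

<!-- Your experiment goes here -->

<input type="hidden" name="assignmentId" id="assignmentId" value="">
<input type="submit" value="Submit" name="submitButton" id="submitButton" >
</form>

<script>
// Copy the assignmentId that Turk passes in the URL into the hidden input
document.getElementById('assignmentId').value = GetAssignmentId();
</script>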


The critical elements here are that you need to have a <form> whose action is Turk's externalSubmit URL, and that you need to, somewhere in your Javascript code, set the value of the assignmentId hidden input element to the assignmentId that Turk passes to your page when it is run as an external HIT. (In this case, I just show how to do this using the code from the TimTurkTools.js file, which contains a function called GetAssignmentId(), but you could write this yourself more directly.)
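
Because the sandbox and the live site use different hosts, one option is to build the form's action from the turkSubmitTo value that Turk passes in the page URL, rather than hard-coding it. A minimal sketch, assuming a hypothetical helper named externalSubmitUrl:

```javascript
// Hypothetical helper: build the form action from the turkSubmitTo value
// that Turk passes in the page URL (sandbox and production hosts differ).
function externalSubmitUrl(turkSubmitTo) {
  // Drop any trailing slash so we do not end up with "//mturk/...".
  var base = turkSubmitTo.replace(/\/+$/, "");
  return base + "/mturk/externalSubmit";
}

console.log(externalSubmitUrl("https://www.mturk.com"));
console.log(externalSubmitUrl("https://workersandbox.mturk.com/"));
```

In the page itself, the result would be assigned to the action attribute of the turkSubmit form before the form is submitted.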

This is a brief overview of more advanced techniques, including saving to your own server, automatically downloading these files in MATLAB, and posting and using external HITs, but I hope the included code is useful to people who wish to take on these more advanced topics. (Note that now we are using HTML, CSS, Javascript, Python, Perl and MATLAB as part of our pipeline -- sorry about that. You could condense Python/Perl/MATLAB all to just Python, if you wanted! I just use whatever languages seem most convenient to me at the time).

Tutorials from November 2012 (Harvard Univ.)

These older tutorials were recorded in the Harvard Vision Lab, and so the audience in the tutorials is made up of vision scientists, who typically code their studies in MATLAB with the Psychtoolbox. At some points in the tutorials I make reference to MATLAB to explain concepts, but MATLAB skills are not a prerequisite for understanding the tutorial. These tutorials were very informally conducted and filmed, and so do not have the production quality of the talks from Northwestern. In addition, they are largely redundant with what is presented in the Northwestern tutorials. However, since they might still be useful, I have left them here for posterity.

Each tutorial is about 2 hours of video, plus slides and code. The slides are not very clearly visible in the first video, but both the slides and the code are available for download so you can follow along. The videos from Day 2 and 3 use screen capture so it is easier to follow along. The videos were recorded while I was teaching the Vision Lab members, and so there are sometimes questions from the audience that cannot be heard, or pauses during which I was walking around the room. However, I continue to share them in hopes they might be useful to a broader audience.

Day 1 focuses on the following questions:
  • What is Mechanical Turk?
  • Who are the Turkers?
  • How much should I pay?
  • Are Turk workers good experimental subjects?
  • How to set up a Turk HIT
  • The Mechanical Turk sandbox
  • Choosing HIT properties
  • Introduction to HTML
  • Parsing Turk output in MATLAB and Excel

Watch the video:


Day 2 focuses on the following techniques:
  • Breaking each trial into a separate page, using Javascript
  • Preventing advancing to the next page without answering the current question
  • Randomizing trial order
  • Making it prettier using CSS
  • Hiding the Submit button until they complete the HIT
  • Preventing completion of the HIT in preview mode
  • Blocking people who have done similar HITs before

Watch the video:


Before Day 3, you can do the "homework". The homework was simply to change the HIT from the end of Day 2 to allow the scale to go from 1-7 rather than 1-5, and to add two new pictures to the set so there are 6 trials rather than 4. Day 3 focuses on coding an actual working memory experiment. It uses the following techniques:
  • Hiding and showing displays at certain times (e.g., using timers)
  • Using the keyboard to record responses.
  • Recording response times.
  • Using hidden input fields to store data.
  • Including CSS in your HTML.

Watch the video: