How Does Question Difficulty Order Affect Evaluations of Test Performance? (Part 2)

May 27, 2018 Yana Weinstein


Last week, I wrote a blog post that ended with a data prediction cliffhanger. I asked readers to predict how question difficulty order on a test might affect students’ evaluations of their own performance on that test. More specifically, I asked whether having difficult questions at the beginning or at the end of a test would lead to decreased estimates of performance on the test. Would students find a test overall more difficult if it started or ended with difficult questions? At the time of writing, the responses from over 100 respondents are split almost exactly 50/50. I’m relieved, because it might have been a bit less interesting to write a blog post about an obvious result!

To investigate the effect of question difficulty order on evaluations of performance, we first needed a large set of normed questions. By “normed,” I mean that we needed to know the average difficulty of each question, so that we could play around with question difficulty order.

I found a set of trivia questions that had been normed in 1980 (1). Of course, people’s knowledge of things like history and geography had changed from 1980 to 2010, so I had to pilot the questions again to get a more accurate measure of question difficulty. On one end of the difficulty scale, we have questions such as “The general named Hannibal was from what city?”, and on the other we have questions such as “What was the name of Tarzan’s girlfriend?” Having established question difficulty, it was time to start playing with question order!

For the basic manipulation, we took one set of 50 questions and simply arranged them in either easy-hard or hard-easy order (for a between-subjects design, where a different group of participants answered the questions in each of the two orders) (2). Or, for a within-subjects design, we used two sets of 50 questions, matched for difficulty, so the same group of students answered one set in easy-hard order and the other in hard-easy order (of course, the order of these question sets themselves was counterbalanced between participants!) (3).
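For readers who like to see a design spelled out, here is a minimal sketch of the ordering manipulation in Python. It is an illustration under stated assumptions, not the code used in the studies: the `Question` class, the `proportion_correct` field (standing in for the normed difficulty), the function names, and the particular counterbalancing scheme (rotating both which set comes first and which order comes first) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    # Hypothetical norming value: share of pilot participants who answered correctly.
    proportion_correct: float


def order_questions(questions, order):
    """Return the questions sorted by normed difficulty in the requested order."""
    if order not in ("easy-hard", "hard-easy"):
        raise ValueError(f"unknown order: {order!r}")
    # A higher proportion correct means an easier question,
    # so easy-hard sorts in descending order of proportion_correct.
    return sorted(
        questions,
        key=lambda q: q.proportion_correct,
        reverse=(order == "easy-hard"),
    )


def within_subjects_schedule(participant_index, set_a, set_b):
    """Assumed counterbalancing: cycle through the four combinations of
    (which set comes first) x (which difficulty order comes first)."""
    if participant_index % 2 == 0:
        first, second = set_a, set_b
    else:
        first, second = set_b, set_a
    if (participant_index // 2) % 2 == 0:
        orders = ("easy-hard", "hard-easy")
    else:
        orders = ("hard-easy", "easy-hard")
    return [
        (orders[0], order_questions(first, orders[0])),
        (orders[1], order_questions(second, orders[1])),
    ]
```

In a between-subjects version, each participant would simply receive `order_questions(questions, order)` for the one order assigned to their group.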

Note: Our early experiments included a “random order” condition in addition to the “easy-hard” and “hard-easy” question order conditions (3). However, I soon noticed that performance evaluations in this condition floated around capriciously between those made in the easy-hard and hard-easy conditions, and it was unclear how to interpret this. So, we focused instead on chasing the difference between those two extreme conditions.

Before looking at the effect of question order on evaluations of performance, we must ask whether question order had any effect on performance itself. One might have reasonably predicted that having difficult questions at the beginning of a quiz would cause students to give up more easily. However, this doesn’t happen in any of my data: performance is always on par between easy-hard and hard-easy conditions (2), (3), (4).

What differs are the performance evaluations. (Finally, I’m about to tell you what happens!) When a test starts with difficult questions, students are more pessimistic about their test performance than when that exact same test is reversed, starting with easy questions. The way we explain the effect is that during the first few questions, students form an impression of the test – just like people form impressions of another person when they hear that they possess a set of traits (5).

Data from Weinstein & Roediger (2012)

In a future blog post, I will tell you about a set of studies in which we tried to make this bias go away (4).

References:

(1) Nelson, T. O., & Narens, L. (1980). Norms of 300 general-information questions: Accuracy of recall, latency of recall, and feeling-of-knowing ratings. Journal of Verbal Learning and Verbal Behavior, 19, 338-368.

(2) Weinstein, Y., & Roediger, H. L. (2010). Retrospective bias in test performance: Providing easy items at the beginning of a test makes students believe they did better on it. Memory & Cognition, 38, 366-376.

(3) Weinstein, Y., & Roediger, H. L. (2012). The effect of question order on evaluations of test performance: How does the bias evolve? Memory & Cognition, 40, 727-735.

(4) Bard, G., & Weinstein, Y. (2017). The effect of question order on evaluations of test performance: Can the bias dissolve? Quarterly Journal of Experimental Psychology, 70, 2130-2140.

(5) Anderson, N. H. (1965). Averaging versus adding as a stimulus-combination rule in impression formation. Journal of Experimental Psychology, 70, 394-400.