Log Confidence Problems
This page discusses a particular type of question that might be
included on some of my exams. I call this a "log confidence"
problem. Please make sure to read the following discussion of
this type of problem so that you will know how to deal with it
correctly during the exam.
Structure and grading of the problem:
The problem will be similar to a multiple choice problem, in that there
will be a question and several possible responses. But, instead
of simply picking one of those possible responses, students will
indicate next to each response their level of confidence that this is
the right answer, as a percentage. If you believe that a response
is probably the right answer, you will put a large number like 80 or
90. If you believe that a response is probably not the right
answer, you will put a small number like 5 or 10. For practical
reasons, these confidence numbers will be required to be integers
between 1 and 99, and of course they must add up to 100 exactly.
The number of points that you will be awarded for your choices will be
based on the log of the confidence you place in the response which is
actually the correct answer. Specifically, the points awarded
will be computed with the formula f = a log(x) + b, where x is the
confidence you indicated for the correct response, and a and b are
constants. (NB: unless another base is explicitly written, all logs
should be assumed to be natural logs.) All scores will be rounded to
the nearest tenth of a point.
For example, consider the following scenario:
Q: What is the largest prime number that is less than 1000?
| Confidence | Response |
|------------|----------|
| 5          | A. 237   |
| 40         | B. 991   |
| 50         | C. 997   |
| 5          | D. 999   |
The student here has indicated that he is fairly sure that A. and D.
are not the right answers. Between the two remaining he does not
have much certainty, but he thinks C. is slightly more likely.
The correct answer is C., and thus the student's score on this problem would be: a log(50) + b
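To make the scoring rule concrete, here is a minimal Python sketch of the formula applied to the example above. The function name `score` is hypothetical, and the constants a = 4 and b = -9 are an assumption borrowed from the typical values discussed under "Points to keep in mind"; the example itself leaves a and b unspecified.

```python
import math

def score(confidence, a=4.0, b=-9.0):
    """Points for the confidence (integer percent, 1..99) placed on the correct response.

    a and b are assumed values; the actual constants are set per problem.
    """
    if not 1 <= confidence <= 99:
        raise ValueError("confidence must be between 1 and 99")
    # Natural log, then round to the nearest tenth of a point.
    return round(a * math.log(confidence) + b, 1)

# Confidences from the example; C is the correct response.
answers = {"A": 5, "B": 40, "C": 50, "D": 5}
example_score = score(answers["C"])
print(example_score)
```

Only the number written next to the correct response enters the score; the other three numbers matter because they limit how much confidence is left over for it.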
Motivation for this structure
Why the log, you ask? The answer to this question involves considering strategies.
In general, suppose that the student's true confidence levels for the
four responses are p_1, p_2, p_3, p_4, that the student's four
indicated answers are x_1, x_2, x_3, x_4, and that the formula for the
points awarded is f(x_i), where the ith response is the correct
answer. Then the expected number of points that the student will
receive is represented by the formula
E = p_1f(x_1) + p_2f(x_2) + p_3f(x_3) + p_4f(x_4)
We will assume here that students answer questions (that is, choose the
x values) with the goal of maximizing this expected number of
points.
Suppose for example that the student knew in advance that the score on
the problem would be exactly the indicated number, with no log
taken. If that were the case, then we would have
E = p_1 x_1 + p_2 x_2 + p_3 x_3 + p_4 x_4
Given this, what values of the x variables should the student indicate
in order to maximize the expected number of points? Viewing E as a
function of the x variables, with the constraints that all of the
variables are non-negative and that their sum must equal 100, it is not
hard to show that the largest expected value is achieved by putting
zero for every response except the one considered most likely, and 100
for that one. That is -- it is to the student's advantage in that case
to misrepresent his/her true levels of confidence in order to maximize
the expected score on the problem.
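A small numerical check of this claim, as a sketch: the true confidences below are made-up illustration values, and `expected_linear` is a hypothetical helper, not part of any grading procedure.

```python
def expected_linear(p, x):
    """Expected points if the score were simply the number written on the correct response.

    p: true confidences in percent (summing to 100)
    x: indicated numbers (summing to 100)
    """
    return sum(pi / 100 * xi for pi, xi in zip(p, x))

p = [5, 40, 50, 5]          # hypothetical true confidences
honest = [5, 40, 50, 5]     # writing down the true confidences
all_in = [0, 0, 100, 0]     # everything on the most likely response

honest_E = expected_linear(p, honest)   # 0.25 + 16 + 25 + 0.25
all_in_E = expected_linear(p, all_in)   # 0.5 * 100
print(honest_E, all_in_E)
```

With these numbers the honest answer expects 41.5 points while the all-in answer expects 50, so under linear scoring the student is rewarded for overstating confidence.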
For any given scoring function (f), one can view this as a maximization
problem for the student, where the student will want to maximize E, and
will choose the values of the x variables that give the greatest value
of E. One approach to solving this maximization problem involves
using Lagrange multipliers (which you will recall from Math 103).
Surely, the student would be well advised to choose as answers whatever
values of the x's will maximize the expected number of points E, and
not necessarily the actual confidence numbers.
So, how do we remove this sort of strategy from the problem, with the
understanding that students will always do what is in their best
interests?
It is not hard to show that if the scoring function is f(x) = a log(x)
+ b with a > 0, then the maximum value of E is achieved when x_1 = p_1,
x_2 = p_2, x_3 = p_3, x_4 = p_4. That is, for any scoring function of
this form, the most beneficial strategy for the student is to represent
the true confidence levels as accurately as possible. In other words,
there is no strategy other than trying to understand the problem as
well as possible.
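For readers who want the "not hard to show" step spelled out, here is a sketch of the Lagrange multiplier calculation. It assumes a > 0, writes the true confidences p_i as percentages that sum to 100 (as the indicated answers x_i do, which only rescales E by a constant factor), and ignores the integer and 1-to-99 restrictions. The multiplier symbol \lambda is introduced here for illustration.

```latex
\begin{aligned}
\mathcal{L} &= \sum_{i=1}^{4} p_i \bigl(a \log x_i + b\bigr)
               - \lambda \Bigl(\sum_{i=1}^{4} x_i - 100\Bigr) \\
\frac{\partial \mathcal{L}}{\partial x_i} &= \frac{a\,p_i}{x_i} - \lambda = 0
  \quad\Longrightarrow\quad x_i = \frac{a\,p_i}{\lambda} \\
\sum_{i=1}^{4} x_i = 100
  &\quad\Longrightarrow\quad \frac{a}{\lambda}\sum_{i=1}^{4} p_i = \frac{100\,a}{\lambda} = 100
  \quad\Longrightarrow\quad \lambda = a \\
  &\quad\Longrightarrow\quad x_i = p_i
\end{aligned}
```

Because log is concave and a > 0, E is concave in the x_i, so this stationary point is a maximum. The constant b drops out of the derivative, which is why it has no effect on the optimal answers.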
This sort of grading on log confidence problems completely removes strategy from the problem.
What remains for the student is simply to weigh his/her confidence in
the different possible responses as accurately as possible.
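The same conclusion can be checked by brute force over every allowed integer answer. This is a sketch only: the true confidences are made-up illustration values, the constants a = 4 and b = -9 are assumed, and the helper names are hypothetical.

```python
import math
from itertools import combinations

def expected_score(p, x, a=4.0, b=-9.0):
    """Expected points for true confidences p and indicated answers x (both in percent)."""
    return sum(pi / 100 * (a * math.log(xi) + b) for pi, xi in zip(p, x))

def best_answer(p, total=100):
    """Search every way to write four integers >= 1 that sum to `total`."""
    best, best_E = None, -math.inf
    # Choosing 3 cut points among 1..total-1 splits `total` into 4 positive parts.
    for c1, c2, c3 in combinations(range(1, total), 3):
        x = (c1, c2 - c1, c3 - c2, total - c3)
        E = expected_score(p, x)
        if E > best_E:
            best, best_E = x, E
    return best

p = (5, 40, 50, 5)       # hypothetical true confidences
print(best_answer(p))    # the search lands on the true confidences themselves
```

The search covers all allowed answers for a four-response problem, and no misreported answer beats the honest one.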
Points to keep in mind
- One can never be truly 100% confident about an answer -- we are
all fallible. So, confidence levels of 0 and 100 are not
reasonable answers to a question of this sort. Of course, we also
know that log is not defined for 0, so we simply cannot allow answers
of 0. This is why all indicated confidence numbers must be at
least 1.
- If there are four possible responses, then because of the previous
point the highest possible confidence is 97, since each of the other
three responses must receive at least 1. Similarly, with n possible
responses the highest possible confidence is 100 - (n - 1).
- If the indicated confidence answers add up to more than 100, then
of course this would give the student an unfair advantage over other
students. In order to deal with this inequity, any "surplus
confidence" will be removed from the confidence answer for the correct
response. Thus, it would be very much NOT in a student's
interests to indicate numbers that add up to more than 100.
- Similarly, if a student does give a confidence of 0 for one of
the responses, again this student has an advantage in distributing that
confidence to the other responses. Again, rectifying that will be
achieved by removing value from the confidence answer for the correct
response.
- If the problem is left completely blank, it will be scored
assuming confidence answers evenly distributed among the possible
responses.
- If confidence answers are missing for some of the responses, the
remaining confidence (not indicated for another response) will be
evenly distributed among those possible responses.
- If we use f(x) = 1 log(x) + 0 for a problem with 4 possible
responses, then students 97% confident in the right answer would get
4.57 points, while students only 25% confident in the right answer
would get 3.22 points. In other words, students who understand
the problem thoroughly get only slightly more points than students who
fully realize they have no understanding at all. Ultimately this
is just a problem of perspective, since this is a curved class.
But, the ratios can be brought more into line by introducing a negative
value of b.
For example, if we use f(x) = 4 log(x) - 9, these scores become 9.3
versus 3.9. The stronger student then gets more points in a
proportion similar to the grading of typical essay type problems.
Given that this brings the grading more in line with the other problems
on the exam, values of a and b such as these will be typical for this
type of problem.
Note, this sort of scaling does not affect the student's optimal
strategy on the problem. It is still in the student's best
interests to indicate confidence accurately for each of the responses.
Note also that, as this is a curved class, the "-9" in the above
example scoring function does not actually make any real difference in
the grading for the class. For example, we saw that the more
confident student above received 5.4 more points than the unconfident
student; if the scoring function had instead been simply f(x) = 4
log(x), each student would get an additional 9 points, but it would
remain that the more confident student would receive 5.4 more points
than the unconfident student. Students should not attach any
significance specifically to the negativity itself.
- Note, using the scoring function above, a student who indicates a
confidence of 1% for the response that turns out to be correct would
get a score of -9! This is quite a bit lower than the 3.9 points
awarded even to the student who knows nothing about the problem and
puts 25% for all four responses!
There are different ways to look at this observation. Most
importantly, it is a lesson in how to gauge and accurately represent
your own level of confidence about something -- that is, if you are at
all concerned about the possibility that this score of -9 might end up
happening, then maybe you actually are not as confident as you thought
that this is the wrong response to the question. Remember, in
order to optimize your expected number of points, you have to represent
your confidence ACCURATELY, and if you are inaccurate in that
representation in any way, over or under, you lower your expected score.
Second, strategically, note that it is not such a bad thing to
"confess" to a little bit of uncertainty. For example, suppose
for a question with four responses you put 10, 10, 10, 70; your
possible scores with the above scoring function (f(x) = 4 log(x) - 9)
are 0.2, 0.2, 0.2, 8.0. On the other hand, suppose you put 1, 1,
1, 97; then your possible scores are -9, -9, -9, 9.3. If you are
really, really confident that the fourth response is the right answer,
then the latter gives you an extra 1.3 points. On the other hand,
if you are only mostly sure that the fourth is the right response, and
have some reasonable uncertainty about the other responses, it is in
your interests to go with the former... you still expect to get
most of the points, but you protect yourself from what you have to
admit to yourself is a very real possibility that you might end up with a score of -9.
From a third point of view, there has to be a danger in indicating a
high confidence about one of the responses, otherwise there would be no
reason not to do it. That danger comes in the form of the low
scores associated with the low confidences you would correspondingly
have to put for the other responses.
Again -- it is all about being ACCURATE and HONEST in your
representation of your confidences, if you want to do what is in your
own best interests. Said differently -- strategize though you
might, the Lagrange calculation shows that you maximize your expected
number of points by representing your confidences ACCURATELY.
- Regular multiple choice problems are loaded with randomness,
because a student with some knowledge but some uncertainty takes a risk
by choosing only a single answer. These log confidence problems
remove that randomness by allowing students to distribute their
confidence appropriately. But -- if you do not represent your
confidence accurately, you re-introduce that randomness, in addition to
lowering your expected number of points. If you make this choice, then you must live with the consequences.
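The comparison made in the points above between answering 10, 10, 10, 70 and 1, 1, 1, 97 can be checked numerically. The following is a sketch using the example scoring function f(x) = 4 log(x) - 9. The helper names are hypothetical, and it assumes, for simplicity, that whatever doubt you have is spread evenly over the first three responses.

```python
import math

def f(x, a=4.0, b=-9.0):
    """Example scoring function from the text, before rounding."""
    return a * math.log(x) + b

def expected(p_fourth, answers):
    """Expected points when the fourth response is correct with probability p_fourth
    and the remaining probability is split evenly over the first three (an assumption)."""
    p_other = (1.0 - p_fourth) / 3
    probs = [p_other, p_other, p_other, p_fourth]
    return sum(p * f(x) for p, x in zip(probs, answers))

cautious = [10, 10, 10, 70]
bold = [1, 1, 1, 97]

for p in (0.70, 0.85, 0.95):
    print(p, round(expected(p, cautious), 2), round(expected(p, bold), 2))

# True confidence at which the two answers give the same expected score:
# p * log(97/70) = (1 - p) * log(10), solved for p.
break_even = math.log(10) / (math.log(10) + math.log(97 / 70))
print(round(break_even, 3))
```

Under these assumptions the bold answer only pays off once your true confidence in the fourth response is above roughly 88%; below that, the cautious answer has the higher expected score. In either case, the answer that maximizes the expected score is the one that matches your true confidences.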