Ass[backwards]essment in Higher Ed

About a week ago, in a New York Times article entitled “Deep in the Heart of Texas,” professor and provocateur Stanley Fish lambasted the Texas Public Policy Foundation and Texas Governor Rick Perry for proposing that the evaluation of faculty should move to a more “consumerist” (Fish calls it “mercenary”) model. The proposal would require college and university faculty to “contract” with their students, and it promises to reward faculty who meet their contracts’ terms with bonuses of as much as $10,000. Who decides whether the conditions of the contract have been met? Why, the student-customers, of course. And how do they decide this? By filling out teacher evaluations at the end of their contract’s term… er, I mean, at the end of the course.

I’ve never met a colleague, in any discipline of higher education, who unreservedly supports student evaluations. The reasons for most faculty’s dislike of them vary widely– some think evals are badly designed, some think they’re weighted too heavily (or, less often, not heavily enough), some think the evaluation rubrics reward and punish the wrong things, some have just been burned too badly, too many times, by negative evals– but almost everyone agrees with Stanley Fish that there is a fundamental misconception at work in what the results of student evaluations are assumed to indicate. The misguided “consumer contract” model of the classroom figures faculty as providing a service to their customers (students). Correspondingly, this model figures student evaluations as a kind of “customer satisfaction” gauge. The idea at work in this model is that students have a right to receive something in exchange for their tuition money (and, presumably, their effort and time), so student evaluations are a way of holding faculty accountable for upholding their end of the contract. Now, it most certainly is the case that student evaluations do hold (at least untenured) faculty accountable, and there are many good reasons to advocate faculty accountability. But what are they being held accountable to? Fish claims:

… what they will be accountable to are not professional standards but the preferences of their students, who, in advance of being instructed, are presumed to be authorities on how best they should be taught.

That’s no small complaint. In addition to the very obvious problem of students perhaps not being the most authoritative or reliable judges of the merit of their instruction– at least not at the time that they submit evaluations– there are many other suspect variables at play in student evaluations. Students who perform poorly in a course tend to evaluate their professor’s skill (disproportionately) more harshly, just as students who perform well tend to (disproportionately) laud their professor’s role in their achievements. Similarly, students tend to rate professors higher in courses where they already have an interest or investment, and lower for courses that are “required,” outside of their major, or outside of their strong skill sets. Courses held early in the morning consistently receive lower evaluations than classes held in “prime time,” that is, between 10am and 2pm. In the humanities, writing-intensive courses score lower. In the social and natural sciences, courses with labs score lower. And that’s not even to mention the host of other, more ambiguous and difficult-to-locate prejudices that factor into student evaluations, like the fact that young female professors and non-white professors consistently receive lower evaluations, harsher criticisms and, frankly, more abuse from students. Of course, that is not to say that student evaluations are without merit or useful information, but only to say that, well, they’re not all they’re cracked up to be.

I am not opposed to student evaluations in principle. I think they offer an interesting, even if not always entirely fair and balanced, picture of the quality of instruction in any particular course. And in the case of my own students’ evaluations, I have often found indications of areas in which I needed to improve– from requiring less (or more) reading per week, to controlling dominant students better in class discussions, to talking more slowly, to providing more feedback or responding to emails faster– and I do my best to weigh students’ satisfaction or dissatisfaction with my course against my own standards (and my discipline’s standards) for what the course requires. But I do agree with Fish and many of the people who responded to his follow-up article that making student evaluations the sine qua non of judging faculty merit is grossly misguided. And to do so in such a grossly “consumerist” way (as Texas is proposing) is a recipe for disaster.

The truth is, there already is something like a “contract” in every classroom. It’s called a syllabus. Faculty are held accountable to the criteria laid out in that document, and although it is not technically a legal document, it is a very close analogue. As opponents of tenure will undoubtedly object, there are very few consequences for tenured faculty who breach the limits of their syllabi, but untenured faculty are most certainly held to the letter of the law there. And, presumably at least, faculty who regularly disregard their syllabi and disrespect their students are not granted tenure. So, maybe there are a few rogue tenured faculty out there who are just phoning it in without any repercussions, but that hardly seems to warrant compromising the integrity of academia by imposing the consumerist model upon it.

There is just too much to lose by forcing faculty into some dumbed-down version of a fawning wait-staff. The merit of a college course cannot be reduced to the equivalent of some Facebook thumbs-up “like” image. Student evaluations are important, even uniquely valuable, elements in the broad measure of a faculty member’s contributions, but they cannot be the first principle of that measurement. It’s like asking a batter who has just struck out (or been walked) to judge the merit of the home-plate umpire. As the British painter and historian Benjamin Haydon once said: “Fortunately for serious minds, a bias recognized is a bias sterilized.” When it comes to the role of student evaluations in faculty assessment, more serious minds are needed.

6 comments on “Ass[backwards]essment in Higher Ed”

  1. Curry O'Day says:

    Right on. I always tried to be fair in my evaluations, but students are simply not in a position to fairly judge their professors…at least not until they have graduated and done something with their education. One who thinks student evals should be the sole measure of a professor's ability need only go to ratemyprofessor.com to understand why that is a horrible idea.

    I don't visit ratemyprofessor.com, mainly because I believe that anonymous evaluation is a very cowardly way to criticize someone. One should be willing to take responsibility for his or her statements. I understand that there is a certain fear of repercussion behind the anonymity of official student evaluations, but I think they would be better if the students' names were tied to them. That way a student would really have to mean what he or she says, instead of knowing that he or she can be unfairly critical without ever taking responsibility for it.

  2. anotherpanacea says:

    I think you get it right in distinguishing the benefit we as teachers can garner from anonymous *comments* from the value of anonymous *grading*.

    One thing I'd like to try next semester: weekly comment sheets/evaluations. When my mother went back to school (in her fifties), she had a philosophy professor who used this system: comment sheets available throughout the semester for suggestions or requests. I'm hoping to implement a similar system using anonymous Blackboard polls.

  3. Art Carden says:

    Quick thoughts:

    1. There's a kernel of a good idea in here–we're here for and accountable to the students, who are spending a ton of somebody's money for our services–but if this goes through, execution will probably be a disaster. Top-down central planning has been a disaster pretty much everywhere it has been tried; why would we think it will work for faculty evaluations?

    2. There are much easier ways to create a measure of instructor quality from existing data while controlling for most confounding effects (course, department, time of day, race & gender of the professor, class GPA, etc.); a rough sketch of this kind of adjustment follows this list.

    3. For that matter, there are probably better (but harder) ways to evaluate whether students are learning anything in a particular instructor's class: in a nutshell, give everyone in (say) Doctor J's classes an econ exam and then measure the difference in scores between people with no econ training and people who took econ 100 from me. Endogeneity and selection effects–people who really wanted to take econ are likely to do better no matter what–can be addressed using data from the student registration process.

    4. An anecdote: when my advisor was dept chair at U. Washington in the 1970s, he was looking for a better way to evaluate faculty. He polled alumni and found, to his surprise, that they said the best professor they had was the one who consistently got the worst evaluations.

    5. Finally, here's a paper on professorial quality from the most recent JPE: Carrell and West's “Does Professor Quality Matter? Evidence from Random Assignment of Students to Professors.”
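
    To make point 2 concrete, here is a minimal sketch of that kind of adjustment, run on simulated data (every column name, coefficient, and instructor here is hypothetical, not drawn from any real evaluation system): fit an OLS regression of evaluation scores on instructor fixed effects plus the measured confounders, and rank instructors on the fixed effects rather than on raw averages.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        # Simulated course sections: evaluation scores driven by instructor
        # quality plus the confounders named above (time of day, required
        # course, class GPA). All names and magnitudes are made up.
        rng = np.random.default_rng(0)
        n = 300
        df = pd.DataFrame({
            "instructor": rng.choice(list("ABCD"), size=n),
            "morning":    rng.integers(0, 2, size=n),  # 1 if class meets before 10am
            "required":   rng.integers(0, 2, size=n),  # 1 if required/gen-ed course
            "class_gpa":  rng.normal(3.0, 0.3, size=n),
        })
        quality = df["instructor"].map({"A": 0.0, "B": 0.3, "C": -0.2, "D": 0.5})
        df["eval_score"] = (
            3.5 + quality
            - 0.3 * df["morning"]             # morning-class penalty
            - 0.4 * df["required"]            # required-course penalty
            + 0.5 * (df["class_gpa"] - 3.0)   # grade-satisfaction effect
            + rng.normal(0, 0.3, size=n)
        )

        # OLS with instructor fixed effects; the C(instructor) coefficients
        # are evaluation scores purged (imperfectly) of the measured
        # confounders. Unmeasured biases, of course, remain.
        model = smf.ols(
            "eval_score ~ C(instructor) + morning + required + class_gpa",
            data=df,
        ).fit()
        print(model.params.filter(like="instructor"))

    The same machinery would extend to point 3: regress exam scores on a took-the-course indicator plus controls built from registration data, so that the estimate captures value added rather than satisfaction.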

  4. Art Carden says:

    Re: anotherpanacea:

    Great idea. I just got back from teaching at a summer seminar where we received evaluations immediately after our lectures. This helped immensely.

  5. anotherpanacea says:

    Art, I don't know you, but I've spent some time with that Carrell/West paper, and I wonder what you think about the generalizability of results from the Air Force Academy.

    It's certainly an intuitive result, but I worry about the way the results seem custom-tailored to confirm expectations. I like my natural experiments a little bit more contrarian.

  6. Art Carden says:

    @anotherpanacea re: Carrell/West: guarded skepticism. I haven't spent much time with the paper–I'm mostly familiar with the secondary discussion in the econ blogosphere–but I think it's an interesting result that deserves replication in other settings and contexts. I know from my own work that the first, second, third, and fourth published papers on an issue are rarely the last word.
