Designing Systems to Limit the Impact of Bias

In March, we wrote about how coaches and evaluators must identify, then set aside, their biases in order to evaluate and develop teacher practice fairly and accurately. This critical work by individuals is powerful, but it is not the only tool available to leaders and policymakers for reducing bias: specific design elements within performance management systems can also reduce opportunities for bias to affect results.

Drawing upon findings from the MET Study and our practical experience with school systems across the country, we recommend three design features to limit biases: 

  1. Balance with multiple measures: Evaluations that rely on a single input are not only more vulnerable to year-over-year volatility, but they also present the greatest opportunity for evaluator bias to affect outcomes. Consider, for example, an evaluation system with one input: ratings on a common teaching rubric. As evaluators work to determine ratings, their biases can (and will) creep in. Without a second or third objective measure to balance the overall evaluation, the teacher’s rating is subject to the influence of these biases with no checks or balances.
  2. Utilize multiple observers/evaluators: As with multiple measures, adding observers/evaluators increases statistical reliability (that is, the overall consistency of a measure). The MET Study authors write: “When the same administrator observes a second lesson, reliability increases from .51 to .58, but when the second lesson is observed by a different administrator from the same school, reliability increases more than twice as much, from .51 to .67.” While districts must wrestle with how best to allocate limited time and resources, there is great benefit to having a second set of eyes on a teacher’s performance.
  3. Use outcomes-based tools: Many common evaluation instruments measure inputs–the actions taken toward a goal–rather than assessing the desired outcomes. Unfortunately, measuring inputs can yield false positives: a teacher can try a specific approach and receive high marks, even if the strategy fails to produce appropriate student learning. Instead, strong evaluation systems use outcomes-based tools and rubrics that identify observable and measurable behaviors. For example, we’re big fans of the Core Teaching Rubric, which describes in specific language what students are saying and doing at different levels of performance–rather than teacher actions–so that observers can evaluate performance based on student experience and learning. When observers focus on student outcomes, there is less risk that they will insert their own biases or preferences for a particular instructional strategy. An outcomes-based tool keeps their attention on what and how students are learning, which is what matters most.
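To make the reliability figures quoted in the second recommendation concrete, here is a minimal sketch of the arithmetic behind "more than twice as much." The three reliability values come from the MET Study quote above; the variable and function names are our own and purely illustrative.

```python
# Reliability values as quoted from the MET Study passage above.
baseline = 0.51            # one observed lesson
same_observer = 0.58       # second lesson, same administrator
different_observer = 0.67  # second lesson, different administrator, same school


def reliability_gain(before, after):
    """Return the increase in reliability between two observation designs."""
    return after - before


gain_same = reliability_gain(baseline, same_observer)
gain_different = reliability_gain(baseline, different_observer)

# How many times larger is the gain from adding a different observer?
ratio = gain_different / gain_same

print(f"Gain with the same observer:   {gain_same:.2f}")
print(f"Gain with a different observer: {gain_different:.2f}")
print(f"Ratio of the two gains:         {ratio:.2f}")
```

The gain is .07 with the same administrator and .16 with a different one, so the second set of eyes adds roughly 2.3 times as much reliability as a repeat visit by the same observer.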

While these structural changes alone will not eliminate bias, they can make a big difference. What have you seen succeed in reducing bias in performance management? Let us know in the comments!