Maybe Annual Performance Reviews Are Really Better?

Bear with me… this is kind of a long post, but IMHO it can have a big impact on how you make performance decisions.

There is a lot of chatter on the interwebs about the death of traditional performance reviews.  Many HR pros and consultants are saying we’ve outgrown the process and are suggesting that social media thinking, real-time data, and even recognition events as performance touchstones (I wrote about this here) should be the basis for evaluating performance.

In many cases the argument is that performance is a moving target: change happens too quickly, so the performance of 8 months ago may have little bearing on the performance required today or in the near future.  They talk about how one-sided reviews are and how 360 reviews create potential problems when they are “anonymous.”  The argument is made that having one manager control the process injects biases and problems.

But a couple of articles are giving me pause when thinking about how to make better performance decisions.

First – Signal VS Noise

There is a big difference between information and data.  Data is what it is.  There is no real value in data.  Data becomes valuable when it becomes information.  The problem is you never know when that happens.  Data is noise.  Information is signal.

I direct you to an article on the blog Elezea posted back in May about being careful who you listen to.  In that post the author brings in material from Nassim Taleb (The Black Swan, Fooled by Randomness – two MUST reads, btw) and references a post about Taleb’s new book (Antifragile: Things That Gain from Disorder).  Paraphrased and edited from that post:

The more frequently you look at data, the more noise you are disproportionally likely to get (rather than the valuable part called the signal); hence the higher the noise to signal ratio…

…Say you look at information on a yearly basis, for stock prices or the fertilizer sales of your father-in-law’s factory, or inflation numbers in Vladivostok. Assume further that for what you are observing, at the yearly frequency the ratio of signal to noise is about one to one (say half noise, half signal). It means that about half of the changes are real improvements or degradations; the other half comes from randomness.

But if you look at the very same data on a daily basis, the composition would change to 95% noise, 5% signal. And if you observe data on an hourly basis, as people immersed in the news and markets price variations do, the split becomes 99.5% noise to .5% signal. That is two hundred times more noise than signal which is why anyone who listens to news (except when very, very significant events take place) is one step below sucker.

Think about that for a minute.

The more we look at the measure of something (especially something with qualitative content vs. purely quantitative outcomes) the more noise we inject into the stream and the more we hide the signal.  This seems to me to have a profound impact on the idea of doing away with an annual review of performance to make way for a more “real time” view – and “real time” intervention.

Think about how you manage your investments.  Do you buy/sell each day?  Do you watch the numbers hour by hour, minute by minute?  Probably not.  You probably have an innate understanding that the market is volatile and you’re more interested in long-term results vs. short term gains.  You know it doesn’t make sense to judge the value of an investment based on a few data points within a small window of time.
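Taleb’s percentages fall out of fairly simple arithmetic. Here is a minimal Python sketch (my own toy model, not Taleb’s exact math) assuming the true signal follows a random walk, so its change over a short interval shrinks like the square root of the interval, while the size of the observation noise stays constant:

```python
import math

def noise_share(observations_per_year: float) -> float:
    """Share of each observed change that is noise, under a toy model:
    the true signal is a random walk (its change over an interval scales
    with sqrt(interval)) while observation noise has a fixed size,
    calibrated so yearly observations are half signal, half noise."""
    signal = math.sqrt(1.0 / observations_per_year)  # signal change shrinks with shorter intervals
    noise = 1.0                                      # noise does not
    return noise / (signal + noise)

for label, freq in [("yearly", 1), ("weekly", 52), ("daily", 365), ("hourly", 365 * 24)]:
    print(f"{label:>7}: {noise_share(freq):.1%} noise")
```

Under those assumptions, yearly observations come out 50% noise, daily about 95%, hourly roughly 99% – close to Taleb’s numbers, and the direction is the point: sample more often and you mostly collect noise.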

But let’s add a little something else to the mix.

Information That Doesn’t Matter

In addition to the simple problem of signal-to-noise ratio is the problem of information that we think adds value but doesn’t.

A very interesting article on FastCompany ran back in August about an experiment on deciding whether to give someone a loan.  (I suggest you could substitute giving someone a loan with the hiring process and put in some other variables.)

To shorten the discussion a bit here is my synopsis of the experiment:

Participants were divided into two groups. Group 1 was asked to decide on a loan for

a recent college graduate with a stable, well-paying job and a solid credit history but during the routine credit check they discover that for the last three months the applicant has not paid a $5,000 debt to his charge card account.

Group 2 was given the same info with a slight twist:  instead of learning the exact amount of the student's debt, they were told that the size of the debt was unclear – either $5,000 or $25,000. Group 2 participants could decide to approve or reject the applicant immediately, or they could delay their decision until more information was available, clarifying how much the student really owed. Not surprisingly, most Group 2 participants chose to wait until they knew the real debt number.  After a bit of a wait the experimenters told Group 2 it was $5,000.

So – for all intents and purposes the two groups got EXACTLY the same info but one had to ask for it and wait while the first group got it all in one fell swoop.

The results are pretty amazing…

  • 71% of Group 1 participants rejected the applicant
  • Only 21% of Group 2 participants rejected the applicant

What changed between Group 1 and Group 2?  Why is $5,000 in debt enough to bounce an applicant 3/4ths of the time when it’s known up front and yet it only mattered in about 1/5th of the instances when the information was requested later?

Put that in performance appraisal language – or even hiring language.  Do we tend to make different decisions when we have more data?  Or when we think we’ve done a better job by seeking more data?  Maybe.

I don’t think the experiment was as much about the idea of “gather more data” as it was about the issue of “contrast.”  The 2nd group was given a range from $5,000 to $25,000 – when they heard the number was $5,000 they subconsciously thought – “heck that’s not so much when you compare it to $25,000 so it’s not that big a deal.”

Either way – if $5,000 isn’t that much compared to $25,000, it shouldn’t have been that much compared to nothing, since both groups ultimately knew it was $5,000.  The addition of the $25,000 number really didn’t change the “data.”  $5,000 is $5,000 – and a loan risk is a loan risk.  Why should it matter that it was compared to a fictional number?

The real issue here is whether $5,000 is the right number regardless of what that number “could” have been.

A Long Road/Read to Get Here:  More Often and More Isn’t Always Better

My thinking as I read these two articles is this:

  1. Annual performance reviews require that we collect data over a longer period of time – potentially allowing the data points to smooth out for those who are pretty solid, stable performers.  It also allows us to see really big changes and deviations better, because we’re looking at more data over a period of time and actually see the “average” versus two data points in a small window.
  2. We make different decisions when things are put into a “context” or have been framed.  Sometimes that is good and sometimes that is bad.  It’s good when we have valid information within a valid context.  It’s bad when the information has no connection or validity.  Think about whether writing a proposal in 20 minutes is a good thing or a bad thing.  Would you say it was good if you were initially told the proposal was written somewhere between 20 minutes and 20 hours?  If you were then told it was 20 minutes, would you think 20 minutes was good?  How do you know?  Which of those numbers is a good number?  If you know 20 minutes is good, then the 20 hours is really irrelevant.
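To make the smoothing point concrete, here is a small illustration with hypothetical monthly scores (numbers made up purely for illustration): a steady performer with one bad month looks very different depending on whether you average the whole year or react to the latest data point.

```python
# Hypothetical monthly performance scores (0-10) for a steady performer:
# mostly around 7, with one bad month (the "typo-laden presentation").
scores = [7, 8, 7, 6, 7, 8, 7, 7, 8, 7, 7, 3]

yearly_view = sum(scores) / len(scores)   # annual review: the whole year
recent_view = scores[-1]                  # real-time reaction: latest data point

print(f"annual average     : {yearly_view:.1f}")   # 6.8 - solid performer
print(f"most recent signal : {recent_view}")       # 3 - looks like a problem
```

Same employee, same data – the annual average says “solid performer,” the last data point says “problem,” and only the longer window tells you which one is signal.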

As I read those two articles, all I could think of was the poor employee being judged often and quickly on bits of data, with and without context, and I wondered whether they would be better served if their manager had spent more time waiting for signal and seeking context rather than reacting to the company social feed and last week’s typo-laden presentation.

Should we consider the annual performance review again?

I’ll leave you with this quote from Nassim Taleb…

In business and economic decision-making, data causes severe side effects —data is now plentiful thanks to connectivity; and the share of spuriousness in the data increases as one gets more immersed into it. A not well discussed property of data: it is toxic in large quantities —even in moderate quantities.

Are we creating more bad decisions because we think we’re getting better information more often?


FOT Background Check

Paul Hebert
Paul Hebert is the Senior Director of Solutions Architecture at Creative Group Inc and a writer, speaker and consultant. Paul focuses on influencing behaviors and driving business results through employees, channel partners and consumers. Over the course of his career, Paul has worked closely with clients to design influence, marketing, motivation, incentive, loyalty, recognition and reward programs to increase effectiveness and reduce costs. Paul is a recognized authority on incentives and performance motivation. Want to know what’s going to motivate your people to perform at their best and impact the bottom line? Want to know whether your service award program really means anything at all? Curious what psychological principles drive sales behavior? Paul’s your guy… unless you fervently bow down to Maslow. Check out his personal blog at "What Is Paul Thinking?" when you're tired of his FOT rants.


  1. Mary says:

    May also explain why certain managers “ding” a person more heavily on their annual performance review for any missteps or errors made in the last quarter of the performance year, or even in the month or so immediately beyond the performance year end. It’s as if they start the year out with a full year’s perspective, but then in the last critical few weeks, as their anxiety levels rise and the “must have something to write” factor kicks in, they suddenly start scrutinizing performance on a daily basis and nitpicking any issues during that time frame as reflective of the whole performance. Everyone in my department dreads making the slightest mistake or error in February or March, because even if they have walked on water from April through December and been an excellent high-performer, any slight mistake during the last two months of the performance year will trigger that mediocre “average” review rating. Sad, isn’t it?

    • Paul Hebert says:

      It is unfortunate but humans do put more weight on more recent information – even if that info is an anomaly to the average data. It takes a bit of work to stop ourselves from doing it – but knowing we have that kind of decision bias helps us prevent it from happening.

  2. Jon Windust says:

    Really good post.

    Why not combine the two ideas? Keep a regular performance diary where social recognition, records of one-on-ones, etc are stored. Then include that as part of the annual review.

    • Paul Hebert says:

      Jon – good idea… at least that way you can balance the new with the old and get a more complete picture. The key is to not put more weight on the most recent activity but to look at it over time and see what the “average” is.
