
TUESDAY TRIBUTES
06/17/2014
Wine Reviewing Produces Only Random Results—Yes or No?

By Charles Olken

Let’s summarize: There has recently been a lot of talk about randomness in blind tasting—and it has come from friends of mine, most of whom have missed the point in my humble opinion.

Okay, that is what today’s blog is in a nutshell: I am about to argue with my friends. Sorry about that. Most inappropriate. Who the hell do I think I am?

Color me an unapologetic believer in blind tasting. The core arguments against are twofold:

--Reviewers think that their blind tasting results are immutably correct because they have been done blind, and they are wrong.

--The more a reviewer tastes the same wine, the wider the variance will be and ultimately randomness will be the result.

The first point is hardly worth debating. No one I know who does blind tasting thinks they are infallible or that their results are so “correct” that they cannot be wrong. Frankly, not tasting blind may lead to more consistent results with known wines. Can you imagine Robert Parker or Steve Heimoff tasting Harlan or Colgin or Bond and winding up with a random set of results? So, we can dismiss the first argument as just that. Either you believe in blind tasting or you do not. Either you believe that knowing the wine being tasted makes that critic less likely to be influenced by the name and vintage or you believe that not knowing makes the critic less likely to be influenced.

But the second point is the one that bugs me. Whether you agree with the judgments that we publish here at CGCW or not, it is pretty much the case that we are not horribly disconnected from the judgments of other reviewers. Sure, there are the occasional big disagreements, usually not about the elements in a wine but about how likeable they are or are not.

The suggestion that a given taster’s results would amount to nothing more than random if the same wine were tasted many times over in blind tastings is, on its face, a bit of a stretch given the degree of correlation among the most experienced reviewers. Those correlations are far closer than random, and that is an observable fact if one cares to read a dozen reviews of the same wine.

What it really comes down to is the value of wine reviews. Do they have validity in the first place? When Earl Singer and I started Connoisseurs’ Guide back in 1974, we were a couple of involved consumers who wanted to produce a particular style of review—long, carefully crafted, studied—because that was what we wanted as consumers. It is our style, more than anything else, that set us apart from Jim Laube or Steve Heimoff or most other experienced reviewers.

We knew we were going to be judged on style, but, more significantly, we knew we were going to be judged on how closely our reviews came to describing the experiences of our readers when they pulled corks on wines we had reviewed. It is that latter point on which all evaluative publications stand or fall. I would be wrong if I suggested that we get it right all the time or that every reader loved us so much that we had a perfect rate of renewals. But, we have been at this stand for four decades and we still have a batch of followers. That did not happen because our reviews were random.

And this is really not about us anyhow. The debate, boiled down to its essence, is whether blind tasting has value or not; whether the written opinions have a reasonable amount of validity or are nothing more than some random junk thrown on the wall. On that point, it is the readers who decide, not the arguers pro and con.



Comments

Randomness
by Patrick
Posted on:6/17/2014 1:01:20 PM

I agree with you that randomness won't be the result of repeated tastings by an expert. But I once read on Steve's blog (I think it was) that he said that his scores could vary by as much as 4 points, based on the conditions of the tasting and the order of wines in it. Do you agree with him on that?  Thx.

Variance
by Charlie Olken
Posted on:6/17/2014 4:37:50 PM

Patrick, the operative word here is "could". 

There are places in the 100 point system the way it is used today where that could happen and places where it is unlikely.

That, to me, is the biggest problem with the 100-point system. It is not a straightline system but one of relative relationships. Or, perhaps I should say, "It has become".

Certainly, in the upper 90 range, when you start talking about the differences between 94 and 98, especially for those who use all 100 points, there is less significant qualitative difference, in my opinion, than say 83 and 87.

In fact, there is really not a perfect system anywhere. Some reviewers choose to use few words, and some who use few words essentially engage in silly talk anyhow.

We like to think that the long reviews in CGCW are what is required because that is what we wanted as consumers when we started our publication. Those who read other critics are apparently happy with less specificity, and indeed with words that at times tell no story.

Could there be four point swings? Yes, of course. We always taste every wine getting 90 points or more a second time and any wine getting at the low end of the scoring range. That extra step does not always eliminate, and indeed, cannot eliminate, the occasional wide variance, but, for us, at least, when that happens, we taste a third time.

Perfection is not possible, and no reviewer is going to taste a dozen times. The wineries would not supply the wine and the publication would go bankrupt if it bought all that wine, let alone finding the time to taste thousands of wines a dozen times.

But, the suggestion that randomness will result if a wine is tasted a dozen times, as was alleged elsewhere, is simply counterintuitive.

Thanks for asking. I am sure I have not answered in the specificity you might like, but I don't think there is a finite answer to the question--except "yes, it could at times".

Awwwwright....
by TomHill
Posted on:6/18/2014 9:34:35 AM

Awwwwright, Charlie...sucked me in again.

   I think Heimoff's original choice of words was...just plain wrong. Though comparing wine critics to blind monkeys was not too farfetched!!!  :-)

When he used the term "randomness", I think what he really meant to use was "replicable". Randomness smells to high heaven of statistics...a subject I'd prefer to avoid.

So...let's go w/ "replicable" and the 100-pt scale, be it Parker's, WineSpectator's, or CGCW's 100-pt scales. The implication is that a wine that is scored a 96 is distinctly better than a wine that is scored a 95. I, for one, don't believe that for one minute. And I doubt that you do either, Charlie. But that's what the 100-pt scale implies to most folks I think.

    What Heimoff was driving at was "replicable" I think. Say you taste a DryCreekVnyd Heritage Zin in a flight of 20 Zins. You give it an 89. Next day, in a flight of another 20 Zins, you give that wine a score again. Will it be an 89?? Maybe....but most likely not. And repeat that exercise again the next day...and then the next. The likelihood of an 89 score on all those days is pretty slim is my guess. That 89 score is not "replicable". My guess is that you'd come pretty close to being replicable...an 87 one day...maybe a 90 another day. But doubt you'd give it a 95 one day and an 80 another day. That, I think is the point Heimoff was trying to make.

   Back in the olden days, when UC/Davis was developing their 20-pt scale, they chose a 20-pt scale because their studies indicated that tasters could only discern about 20 gradations of quality. So that's what they chose to use for their scientific studies. The UC/Davis scale is pretty precise...you have to be trained in its usage. It is not a "hedonic" scale...but a science-based "quality"  scale.  Exactly what they needed for their studies.

   But the 100-pt scale is not a "quality" scale. It is a "hedonic" scale...a scale based on how much the taster likes the wine.  And "hedonic" scales are far/far less precise than "quality" scales. They are totally unrelated. Why...even a blind monkey can use them!! :-) But to think that a taster can discern 100 degrees of "likability"  is utter nonsense. But that's exactly what the 100-pt scale implies...at least to me, anyway.

   However, we're stuck w/ the 100-pt scale and it's not gonna go away in our lifetimes, Charlie. Personally, I always liked the 3-meadow muffin scoring scale myself. 'Twas good enough for me.

   Curious...have you ever tasted a 51-pt wine, Charlie?? Maybe an old Pesenti?? Or a Coturri that had gone south???

   More on blind tasting later on.

Tom

 

PS: And it's totally all right to argue w/ friends. None of us have an inside track to knowledge or enlightenment.

Forget...
by TomHill
Posted on:6/18/2014 9:39:50 AM

Forgot to mention that Amerine & Roessler's very good book has a section discussing quality and hedonic scoring systems.

 

Thanks
by Patrick
Posted on:6/18/2014 9:48:21 AM

Thank you Charlie for a long & thoughtful answer. You are still vindicating my first purchase of CGCW back in, what was it, 1975?

No Subject
by Dan Fishman
Posted on:6/18/2014 9:51:42 AM

Re: replicability.

To say that you wouldn't get an 89 every time you rate the same wine does not mean the score is not replicable, it just means the measurement is not 100% precise - this is not really a problem though, unless you really believe the score is infallible.  If you measure, say the circumference of a circular object over and over with a ruler, you are going to get different results because measuring a round object with a straight one is difficult to do precisely.  But if you average all those results you are going to get a number that is pretty close to the actual answer (unless there is some bias in the way you are measuring -- analogous to, e.g., some critics' (alleged) bias to big wines).  So is that replicable or not?  I think most would say yes, it's just not 100% precise.

When Is Random Not Random
by Charlie Olken
Posted on:6/18/2014 9:56:23 AM

Hi Tom--

It may be that Steve H. simply used the wrong word, and I do not mind giving him the benefit of the doubt. He is a good guy even if he has gone over to the dark side :-} and is now working for Kendall-Jackson.

But, he is also a man of words and he is not often given to totally misusing them, so he has to live with random until he takes it back. I did give him that chance and he did not take it.

Be that as it may, I agree with you that specific replicability within the 100-point scale is not likely over a set of a dozen trials with the same wine in different tastings, let alone with the same dozen wines arranged differently each day for a couple of weeks.

But, all of us who use the 100-point scale do not dispute that. It is not a finite scale with an immutable score. This is not science and no taster worthy of his cork-puller would say so. Thus, the rating given a wine is meant to be two things: how we experienced the wine on the given day and a shorthand notation for the more involved descriptions provided.

And just to make things even fuzzier, not only would the point scores have some variability, but so would the words chosen.

And, no Tom, we have never given a score below 70 even though we have been tempted. The system as currently used simply does not do that, and I, for one, do not think that we ought to equate those numbers directly to grades in grammar school (does anybody still call it grammar school?).

Amerine
by Charlie Olken
Posted on:6/18/2014 10:02:43 AM

The Amerine and Roessler treatise on the subject of numbers is quite informative and has been part of my tasting DNA from the first time (of many) that I read it.

But, it is also a bit of hooey in my humble opinion. It tries to make science and statistics out of wine tasting. And the suggestion that there are no discernible qualitative differences among wines that may not have three-sigma limit differences simply belies my experience.

Scale
by Bob Behlendorf
Posted on:6/18/2014 10:46:31 AM

Perhaps the 100 point scale is really a 20 - 25 point scale, since the number of wines reviewed in most of the published reviews that rank below 80 is rather sparse. Since wine evaluation by professionals and amateurs alike is so subjective, I would submit that it would be more helpful to us consumers simply to rate wines on a Like-Dislike scale, something akin to the CGCW puffs, etc. Simply put, we either like a wine or we don't, and the various gradations of like - e.g. 95, 96, 97, etc. - are somewhat distracting to what should be an elegant, pleasurable experience.

A
by TomHill
Posted on:6/18/2014 10:53:42 AM

The Amerine & Roessler book is also part of my DNA...though the statistics section of Roessler's can be a bit heavy going.

   I guess I wouldn't dismiss their (UC/Davis) work as hooey, though. It is supposed to be a "quality" scale. The presumption is that since we humans prefer "quality" in all things, their 20-pt scores should indicate how much we like the wine. Not necessarily the case.

   When you & I started reading wine stuff, most scores were based on the Davis 20-pt scale. It was almost always used by folks that had no training in its use...other than reading what the Davis 20-pt scale was. It was, more accurately, a 20-pt hedonic scale...masking itself as a 20-pt Davis scale score.

   I think it was a useful tool for the UC/Davis researchers. If the subject of your study was, say: "The effect of vine yields on NapaVlly Ribolla Harvested on Sept 20 on the Quality of Wine"; then I think it worked just fine. If that study would tell me which Ribolla I would like best...probably wouldn't be of much help. I'd rather rely on a CGCW 3-meadow muffin score for making that determination.

Tom

Blind Tastings
by TomHill
Posted on:6/18/2014 11:50:50 AM

Charlie,

   I couldn't agree more on your first point. Knowing that a wine critic/reviewer tasted the wines blind does NOT (ignoring the subject of the precision of their scores) make them "immutably correct". All it does is confer on their review an air of objectivity. It removes the bias that review might have if they knew what the wine was or who the winemaker is. You, as well as I, know that a Geyserville tasted out on the deck overlooking MonteBello Ridge with PaulDraper is gonna taste pretty darn good. Better than if you tasted it in blind circumstances around the table in your reviewing venue. I would much prefer my reviews be based on that latter venue.

   I am curious, though. Suppose you're tasting a group of 20 Zins blind. You (or someone) presumably know what wines you are tasting?? So...the "Draper perfume" in a Ridge Zin is pretty distinct, I think. What do you do when you taste one of the wines and immediately identify it as a Ridge LyttonSprings? How do you keep the memories of sitting out on the deck w/ PaulDraper from influencing your review? Or does that never happen??

   I presume you group your wines in your blind tastings in some sort of peer groups. Suppose you are tasting a set of SauvBlancs. You get one SB that has very little varietal character and has a bit of a tannic bite. So you give it a poor score (precise or imprecise) for lacking varietal character and not being as good as the peers in the group. Then when you unveil the wines, you see that it is a CowanCllrs Isa SauvBlanc, made with a fair amount of skin contact (which tends to destroy varietal character) and gives it a bit of tannic bite. That's exactly what you'd expect the wine to taste like if you knew what it was (i.e. not tasted blind). Are you allowed to go back and modify your review based on that (non-blind) knowledge?? Or do you still stick to your guns that it's a crappy wine (it is not..and ages into something pretty interesting)?? Or would you never put a skin-contact SauvBlanc in w/ a group of conventional SauvBlanc peers?? Sometimes they are made w/ limited skin-contact, or sometimes the winemaker doesn't reveal that information. So that may not always be the case.

   I guess what I'm asking is what do you do w/ an anomalous wine like a skin-contact SauvBlanc and the wine is an obvious outlier in the group you're reviewing? Or do you just not choose to review such wines?

   My point is that tasting wines blind can confer on the review a certain air of objectivity (which I presume most wine critics seek), but that blind tasting can remove some precision in the review that non-blind tasting would allow you to have. Alas, you can't have it both ways.

Tom

 

The Perils of Paul
by Charles E. Olken
Posted on:6/18/2014 1:19:49 PM

The short answer is "yes".

The basic answer is that reviewers, including us, either have broad understandings of and appreciations for the possibilities of a variety or we do not.

Some writers, like Parker and Bonne, have what are thought to be narrow views of the world. Some of us do not. But, there are certain things we think are more or less immutable--varietal character, balance, depth, nuance, range. And even there, each person interprets those seeming immutables differently.

Do we sometimes recognize wines in blind tastings? Well, let me put it this way. Sure we do, and we are often wrong, but a weak Geyserville is still a weak Geyserville. And a very good wine that turns out to cost $10 is still a good wine.

How many points is tasting on the deck with Paul worth when we have tasted Ridge wines blind? Good question. Mostly, none, and certainly none at the upper end of the range because we taste all those wines twice, and the range of ratings is determined solely by the wines. In fact, I try not to recognize wines for that very reason.

That means that I am not very good at the parlor trick of identifying a wine blind in the "Here, taste this and tell me what it is" challenge. That said, I can tell a Ridge MB Cab from a Heitz Martha's, but those are very particular wines.

Oh, and one more point. I do pick the wines for the tastings, but after all this time, I no longer care about what they are other than that I hope to put together reasonable groups of peers for comparison. They are just lettered bottles wrapped in aluminum foil, and let the best bottle show itself. In fact, that is the real beauty of what we do. We taste so many wines that we are able, ultimately, to cellar the ones we love--and we are not devotees of a wine or winery before the fact.

 

...Random Results
by Thomas Taylor
Posted on:6/26/2014 9:46:14 AM

I'm hesitant to comment after the esteemed group above, but to a point that I didn't see raised - I've subscribed to CGCW for most of the past 30 years, and I keep on keeping on because I've fairly well come to know what to expect by reading the descriptive information included with the overall score, and because my expectations have usually been affirmed with the minuscule subgroup of the wines actually bought and enjoyed. Thank you Charlie and Steve, for calling them like you see them and for maintaining your good eyesight.

Mr. Taylor
by Charlie Olken
Posted on:6/27/2014 5:49:05 AM

Mr. Taylor--

It's obvious to me that you need to comment more often.

Thanks for the kind words.

Charlie
