Tales of the Rampant Coyote

Adventures in Indie Gaming!

The Ratings Game of Rating Games

Posted by Rampant Coyote on April 22, 2013

Ye olde day job has me once more on the road, although this time still in the U.S. Hopefully this time my trip to Savannah, Georgia will be a little more relaxed – 16-hour workdays last time got to be a little much, and I wasn’t QUITE as prepared with the blog posts as I should have been. I don’t have all my posts for the week “canned,” and I may not be super-responsive to the comments, depending on schedule and the quality of the hotel Internet connection.

That being out of the way…

Sometimes I get mistaken for a “games journalist.” I’m really not. I don’t have the training. I really don’t do game reviews here – there’d be a serious problem with conflict-of-interest and stuff. I do like recommending games I enjoy, and I love to talk about games. It’s what gamers do. We talk about what we love. And I love video games. At least, I love many video games. I have ever since the first time I walked into an arcade and started dumping quarters into Defender, Space Invaders, and Starfire. Or saw a friend’s massive printout of his experience playing the Colossal Cave Adventure. I was enamored. I wanted to talk about it. I talked with friends. I read what few magazines there were on the subject. We compared and contrasted. We argued. We swapped stories.

This isn’t games journalism. This is being a gamer and communicating.

Flash forward thirty years, and we’ve got problems like this one with Metacritic. Sadly, there’s no two ways around it – no matter what system is put in place, when there’s money or power on the line, the system will be gamed. Hard.

I spent a couple of years working for a company that did multi-level marketing (I guess the preferred term now is “Network Marketing”) – and I was astonished. Every loophole, every idiosyncrasy would be exploited. Big time. You’d have distributors paying other distributors to *NOT* work to maximize their downline (“If I can skip you, I can make $46,000 this month, otherwise I only make $38,000 this month — do nothing and I’ll pay you one quarter of the difference.”)  That was counter-productive to everybody, except it maximized the gains of the distributor(s) at the top of the chain — and it was the tip of the iceberg. From government laws & regulations, to business agreements, to standardized test scores, to getting your game approved on XBLIG, to annual employment reviews – any system will eventually get “gamed,” and the more that is at stake, the more it will be gamed, and the spirit of the law will be violated to take advantage of the inevitable limitations of the letter.

So what really sounds like a great idea – publishers being concerned not just with being able to milk their audience for millions of sales, but also being concerned about the ultimate quality of their products – gets twisted into a system of vague pressure and politics.

I guess I also don’t understand why Metacritic doesn’t do a simpler Rotten Tomatoes type of system where the number represents the percentage of critics who gave a favorable review.

And that goes back to the complex numerical scores so many review sites use. What’s the difference between an 84% and an 85%, anyway? Or an 8.0 versus an 8.1? Some sites and magazines historically provided some kind of mathematical weighting system for calculating the score, but it seems to me like an exercise in deception – to pretend that the review score is purely objective, when no such thing is possible. I don’t know that I’d want it to be possible! I don’t think games SHOULD be something that can be entirely objectively scored.
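To make that contrast a little more concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the sample scores, the function names, and especially the 60-point "favorable" cutoff. It is not how Metacritic or Rotten Tomatoes actually compute their numbers, just the general idea of an averaged score versus a simple tally of favorable reviews.

```python
from statistics import mean


def average_score(scores):
    """Plain average of raw review scores on a 0-100 scale."""
    return mean(scores)


def percent_favorable(scores, threshold=60):
    """Percentage of reviews at or above a cutoff.

    The threshold of 60 is a hypothetical value chosen for this sketch.
    """
    favorable = sum(1 for s in scores if s >= threshold)
    return 100 * favorable / len(scores)


# Two invented games whose individual reviews differ by a point or two.
game_a = [84, 85, 90, 55, 70]
game_b = [85, 84, 91, 50, 72]

for name, scores in (("Game A", game_a), ("Game B", game_b)):
    print(name, average_score(scores), percent_favorable(scores))
```

With these invented numbers the averages come out a few tenths of a point apart, which looks like a meaningful difference but isn't, while the favorable tally is the same for both games: four out of five reviewers liked each one.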

Really – while there are certainly qualitative elements that can be pretty easily compared with each other to get something approximating empirical, non-subjective truth, games are such complex, diverse beasts that they really shouldn’t have that much in common to compare.

This is, in part, why there’s been such an emphasis on graphics quality over the decades. That’s something most games have in common – pictures. How nice are these pictures? For a while that worked okay, because it was really a combination of technology, artist skill, and programmer skill. And since technology kept improving, it was always possible to push that frontier. But nowadays, it ends up being a combination of what game has the most mind-blowing special effects and the fewest graphical anomalies. Is that really how we want to grade games?

And dinging a game on account of bugs? I totally understand when it’s a case of the bugs truly detracting from the experience. But if you recognize that bugs are simply a naturally occurring phenomenon resulting from complex software (something too few gamers seem to be able to grasp), then you’ll realize that being too quick to pick up the torches and pitchforks over *GASP* a character’s hand clipping into his hat in an animation provides a perverse reward for those games that DON’T TRY TO DO VERY MUCH. If you keep your interactions really dumbed-down and simple, easily enumerated, and easily testable, you’ll start with a lot fewer bugs, and they’ll be much easier to find and fix in development.

So even these semi-“objective” criteria really include a ton of subjective tolerances. Are you more willing to accept imperfections in exchange for greater depth and complexity? Are you willing to accept that a game taking risks with new mechanics is going to get some things wrong, compared to a game that plays it safe by honing the tried-and-true to a blinding shine?

And then we get into an area where the similarities and easily-compared elements are lost, and we’re comparing a game dealing with (for example) a theme of innocence lost and eventual redemption with another that is about… killing zombies in as graphic a manner as possible. How do you compare these two? Any commonalities will be grossly simplistic and cosmetic, and any deeper comparison might be like grading a fish on its ability to climb trees. I mean, if you ignore the superficial stuff like photorealistic graphics and quality of voice-overs, would a AAA company REALLY want its game compared to Minecraft?

I could argue that much of the lack of progress of the medium the last couple of decades could be attributable to shoddy reviews (and the influence they hold over the industry), but that might be making a bold claim that the chicken came before the egg. All I know is that it seems to have gotten us into a vicious cycle. The problem predated Metacritic. Can you really fault a site that does its best to preserve the detail that the industry claims is an honest-to-goodness part of the source data?

So what this all boils down to is my belief that not only are game review *scores* a steaming pile of excrement, but our efforts to treat them as legitimate, objective data are what have hobbled us as an industry.

That’s not to say the reviews themselves are a problem – the text of a review can be extremely valuable not only to a prospective buyer but to anybody trying to scope out the length and breadth of the industry. It’s just that the scores – which I will readily admit to being hypocritical about and checking often before reading the review itself – often present an illusion that I feel can be damaging to the medium and industry in the long run.

Really, what more do we need besides thumbs-up and thumbs-down? Maybe a thumb-sideways. Maybe an extra super-thumbs up for those games that really amaze. I don’t know that we really need an “extra bad” rating, for that matter – I know I’d not use it very often. Bad is bad.

I think that for the good of the entire industry, we really just need to wake up, embrace our subjectivity, and keep our review scores simple!

 


Filed Under: General - Comments: 7



  • reader said,

    Most of the online review systems suffer from a “trust agility” issue.

    First of all, I don’t trust random sources for my reviews, much less aggregated random sources like MetaCritic.

    If I had 48 hours in a day, I would be coding equivalents of RottenTomatoes, MetaCritic, Yelp etc. where one can choose whose reviews and opinions to trust, and also change them as time goes by.

  • Xian said,

    I tend to look at gameplay and video reviews more these days than printed or web reviews. At least I can see what it looks like and get a feel for the game. That isn’t always perfect though – I have seen some very good looking racing games that had abysmal controls once I played them hands-on. In that case, a demo might be the best way to test the waters.

    Eventually it all boils down to that particular reviewer’s opinion, which might not necessarily match mine. Some people I have learned to trust in their reviews – Myrthos and Corwin at RPGWatch for instance. Scorpia doesn’t do reviews these days, but she was another that I knew would tell it like it is.

  • Anon said,

    About the often-used “percent rating” system (0-99% or 0.0 to 9.9):
    A German PC game magazine once tried to simplify it to 5 or 6 stars with the maximum being the best rating. According to their chief editor the result was a massive decline in sales as the readers didn’t adopt the new system. They reverted back to their traditional system and sales increased.

    Apparently, many readers want a detailed score because it may *suggest* a more precise judgement and a better comparison for making a buying decision.

    Personally, I can’t follow this as it doesn’t matter to me if a game has 7 or 8 points or even 77% or 81%.
    In fact I did have fun with many games that had lower scores than “80%” which seems to be a magical barrier in the eyes of many German gamers. Everything below that is just “mediocre” while everything over it is at least “good”.

    Also: Bioshock Infinite has a score of 95 on metacritic.com. The first Bioshock game was also very highly rated in many places – but I just can’t warm up to that franchise, even though I occasionally play a first-person-shooter.

    On the other hand, the games I strongly favor over the Bioshock series – Deus Ex (the original and Human Revolution), Thief 1/2, Dishonored – all have lower scores on metacritic. Granted, they are still at the upper end of the scale with a rating of 87 to 92, but they offer me what Bioshock doesn’t: More freedom in movement and actions.

  • McTeddy said,

    I can only trust reviewers that I know. Not literally know… but that are open about their biases and I know their personality.

    Why? Because saying that a game is 5 star… 90%… great DOESN’T MEAN CRAP. It’s all subjective and based on what each person is looking for in a game. Most of my personal 90%+ games are rated 70 on an average site.

    The modern method of basing the reviews on technical aspects is silly and hurts gaming as a whole. Games should be judged on what they bring to the table rather than what a big-budget competitor brought last week. Modern reviewing is a major reason for the uni-genre problem that we complain about these days.

    Though… I will also say “SHAME ON YOU FANBOYS!” for situations like Jim Sterling’s review of Deadly Premonition and Yahtzee’s review of Super Smash Brothers. Both reviews were fair and fully open about their bias… yet people spammed them with hate mail and death threats because their opinions didn’t match the majority.

    That is just sad.

  • Corwin said,

    The issue with ANY score is that people interpret it differently. At the Watch, we give a game a score out of 5, but it is not really meant to be numerical. We have a lengthy explanation of what each score means, but rarely do people bother to read that explanation. I remember giving a game a score of 5 and was immediately hit with a barrage of criticism because I’d said the game wasn’t perfect, but I’d given it a perfect score. If those critics had bothered to read what a score of 5 means for our reviews, they would have noticed that such a score DOES NOT mean the game is perfect. There is no such thing as a perfect game, so does that mean a score of 5 should never be given? I like to think of it as a 5 tier system, rather than a numeric result. I grade the game based on the criteria for each tier. That, to me, seems fair.

  • Felix said,

    Even 5-star rating systems can be tricky. What exactly does each number of stars mean? I developed my own set of meanings which I try to use consistently, but that makes aggregation useless.

    It was even harder when I was a judge in the French Interactive Fiction Competition once, and we were asked to grade from 1 to 10 (in half a point increments? I don’t recall). So I devised a grading system, which I disclosed in the reviews. Ended up giving most games lower grades than I intuitively felt they deserved, and apologized for it, too. Didn’t quite help. 😛

    Ultimately, @reader might have a point there. The IFDB has just such a system whereby I can mark a reviewer as trusted (or the opposite), and their reviews will bubble up as I browse the website.

  • Xenovore said,

    Of course it’s mostly subjective. For one thing, if reviewers are anything like the rest of us, they have preferred game genres. But they don’t just review games in their preferred genres, they review others as well. We can’t expect them to be completely objective either way; the tendency is to find less fault with what you like, and find more fault with what you don’t like.

    Also they won’t necessarily understand the nuances of a genre they don’t usually play. E.g. if we’re talking about an FPS game… I play FPS games quite a bit, so I could tell you something like “Yeah, they nailed the feel of the weapons and the AI has decent tactics”; but someone that mostly plays RPGs might say “The story was kinda stupid”, i.e. they won’t “get” it. I’ve read plenty of game reviews where the reviewer clearly didn’t get it; it was obvious that he didn’t normally play the type of game he was reviewing. No way I can trust that review, no matter what score they give.

    So… As mentioned above, video reviews can be far more meaningful since we actually see the game-play; we need more of those. Also, we need more demos (or even the old shareware paradigm, where you get a complete piece of the game) so we can actually try the games out.
