Software picks out fake online reviews


July 26, 2011

Newly-developed software has been shown to pick out deceptive online reviews with almost 90 percent accuracy (Photo: Gizmag)

One of the great things about the internet is that everyday people can share what they know with the entire world, so if they've had a particularly good or bad experience with a business or product, they can notify everyone via customer review websites. The flip-side, however, is that business owners can plant fake reviews on those same sites that either praise their own business or slam their competition. Well, confused consumers can now take heart - researchers from Cornell University have developed software that is able to identify phony reviews with close to 90 percent accuracy.

The Cornell team asked a group of people to deliberately write a total of 400 fraudulent positive reviews of 20 Chicago hotels. These were combined with the same number of genuinely positive reviews, then submitted to a panel of three human judges. When asked to identify which reviews were spam, the judges scored no better than if they had randomly guessed.

According to Myle Ott, a Cornell doctoral candidate in computer science, humans are affected by a "truth bias," in which they assume that everything they read is true unless presented with evidence to the contrary. When that happens, they then overcompensate, and assume that more of what they read is untrue than is actually the case.

After the human trials, the researchers then applied statistical machine learning algorithms to the reviews, to see what was unique to both the genuine and fraudulent examples. It turns out that the fake ones used a lot of scene-setting language, such as "vacation," "business" or "my husband." The genuine ones, on the other hand, tended to focus more on specific words relating to the hotel, such as "bathroom," "check-in" and "price."

The two groups of writers also differed in their use of specific keywords and punctuation, and how much they referred to themselves. As had already been found in other studies of imaginative versus informative writing, it was additionally determined that the spam reviews contained more verbs, while the honest ones contained more nouns.

Based on a subset of the 800 reviews, the team created a fake-review-detecting algorithm. When used in a way that combined the analysis of keywords and word combinations, that algorithm was able to identify deceptive reviews in the entire database with 89.8 percent accuracy.
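
To make "keywords and word combinations" concrete, here is a minimal sketch of a text classifier that scores a review on both single words and adjacent word pairs. It is not the Cornell team's software: the article does not say which classifier was used, so a small naive Bayes model stands in for it, and the four training sentences are invented purely for illustration (they borrow the scene-setting and hotel-specific words quoted above). The names features, NaiveBayes and TRAIN are hypothetical.

```python
import math
import re
from collections import Counter


def features(text):
    """Return single words plus adjacent word pairs from a review."""
    words = re.findall(r"[a-z'-]+", text.lower())
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (a stand-in model)."""

    def __init__(self):
        self.counts = {}        # label -> Counter of feature occurrences
        self.docs = Counter()   # label -> number of training reviews
        self.vocab = set()      # every feature seen during training

    def train(self, text, label):
        feats = features(text)
        self.counts.setdefault(label, Counter()).update(feats)
        self.docs[label] += 1
        self.vocab.update(feats)

    def score(self, text, label):
        """Log-probability of the label given the review, up to a constant."""
        total = sum(self.counts[label].values())
        logp = math.log(self.docs[label] / sum(self.docs.values()))
        for feat in features(text):
            seen = self.counts[label][feat]
            logp += math.log((seen + 1) / (total + len(self.vocab)))
        return logp

    def predict(self, text):
        return max(self.counts, key=lambda label: self.score(text, label))


# Invented toy examples, NOT data from the study.
TRAIN = [
    ("My husband and I stayed here on vacation and it was amazing", "deceptive"),
    ("I was in Chicago on business and my experience was wonderful", "deceptive"),
    ("The bathroom was small but the price was fair", "truthful"),
    ("Check-in was quick and the bathroom was clean", "truthful"),
]

model = NaiveBayes()
for review, label in TRAIN:
    model.train(review, label)

print(model.predict("My husband loved our vacation"))
print(model.predict("The price was fair and check-in was quick"))
```

With only four training sentences this proves nothing about real reviews; it only shows the mechanics of combining word and word-pair counts into a single decision. A usable detector would need a large labelled set like the 800 reviews described above, with some of them held back for testing.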

So far, the software is only useful for processing hotel reviews, and Chicago hotel reviews at that. The Cornell team is hoping, however, that similar algorithms could be developed for reviews of a wider range of goods and services.

"Ultimately, cutting down on deception helps everyone," said Ott. "Customers need to be able to trust the reviews they read, and sellers need feedback on how best to improve their services."

About the Author
Ben Coxworth
An experienced freelance writer, videographer and television producer, Ben's interest in all forms of innovation is particularly fanatical when it comes to human-powered transportation, film-making gear, environmentally-friendly technologies and anything that's designed to go underwater. He lives in Edmonton, Alberta, where he spends a lot of time going over the handlebars of his mountain bike, hanging out in off-leash parks, and wishing the Pacific Ocean wasn't so far away.

So then the fakers just have to adjust their writing style. Easily defeated!

Denis Klanac

So ... when might this be available as an 'app'?


As someone who works in the hotel industry, I can tell you with a lot of confidence that websites such as Trip Advisor are nearly entirely filled with false reviews, either from competitors wishing to trash the competition or from operators writing their own more positive reviews. The whole system is a total sham and this piece of software will not come close to fixing it. In my mind there is no doubt that it is time for government to step in and reform the entire sector, as it is simply a con, pure and simple.


This reminds me of spammed YouTube channels with guys figuring all sorts of imagined stories just so people go to their .cc sites and 'watch the movies for free'. And the IMDB movie reviews.. The general rule of thumb is that the first reviews (especially on a small budget) are the fake ones from family, friends, etc. Actually, if this algorithm won't work, a great vacancy for people that see through all the bs might open. Come to think about it - not just 1..))

Renārs Grebežs

I'm delighted by this advance in validating reviews and I hope very soon it can be adapted to other industries and products. As a fairly regular on-line buyer I tend to look at reviews and while I try to 'sift' them to spot the dubious ones, I expect that, like the people mentioned in your article, I'm probably not good at it.

Phoney reviews, whether by customers or biased journalists, on-line and in magazines, are a waste of everybody's time and a serious disservice to customers, wronged suppliers and ultimately to the credibility of whatever forum is involved. So good luck to the people at Cornell, and may they soon, very soon, have versions of their system suitable for many more situations.


Seems to me a more scientific way to go about this would be comparing the I.P. address of origin to the author somehow, and not relying so much on "how" the article is written. It seems that with Cornell's system there is a huge margin for error, but I do not know for sure, obviously. I'm just skeptical.

Michael Shewell

This study is badly done. To limit the study to 400 individuals located in Chicago to write about Chicago hotels makes the information specific to Chicago and maybe not even then. If the 400 consist of 180 males and of these 20 are in the 20-30 year age group, and of these 20 you have 10 that graduated from college, then you do not even have a statistically valid sample of possible business marketing types who would be called upon to write a fake review for their hotel employer.

If someone in Chicago writes about going with their husband to a hotel in Chicago on their vacation obviously they are lying. If someone in Chicago writes about a hotel in Cozumel and mentions going with their husband on a vacation they are more likely to be presenting real information.

This study tells me far more about the incompetency of the study designers than it does about false reviews posted on the internet. Even with reviews where people are attempting to be honest their personal bias enters into the information that is provided and that which is omitted. One person may like the fact that room service came in while they were out and turned down the bed and another person may complain that staff comes into their room without permission while they are away. Two people can eat the same food and the person from Connecticut may find it too spicy and the person from California may find it too bland and neither of these negative reviews would be false.


If you believe Shills can be eliminated - you're dreaming. Shills have been around since the beginning of time. Nobody has ever alleviated shills. Nobody will.

Shills and Cockroaches are equally indestructible.


At least the ball is rolling. It's about time something is done about the B.S. reviews online. I say more power to 'em. If the issue of "false" reviews isn't addressed it will be even more like the "wild west" online than the personal oasis it's supposed to be.

Jason Elizondo


"To limit the study to 400 individuals 'LOCATED IN CHICAGO' to write about Chicago hotels"

This is not what was said. Read it again.

Ian Colley


@ Denis Klanac That is about as useful as telling actors to just act more like real people (in realist productions).

Ali Kim

I travelled all over Europe and used reviews of hotels extensively. I found that the reviews that I could rely upon were the ones that seemed to list good and bad points as well as geographic information, e.g. such and such a hotel is within walking distance of a major museum. After reading them for a while it seemed that there was a "feel" to those reviews that were accurate, and fakes tended to give themselves away by comments that did not seem to ring true compared to other reviews for the same hotel. It turned out that hotels selected through reviews fitted the reviews very accurately, so I guess it is possible to detect real and false reviews. However, I have no idea how you could distil the "feel" I or anyone else got for reviews. While looking at multiple reviews for hotels worked for me, it did fail me for a train journey. I could not find more than a couple of reviews for a long high speed French train overnight ride and both were very positive. They were also completely divorced from the real journey, which was appalling. It certainly would be a huge advantage to know what is genuine and what is a fake but I can't even begin to think how anyone would implement it. Let's say the Cornell team comes up with an algorithm that is 99% accurate. Then what? If they posted "this review is a fake" they'd likely find they would be sued by the hotel etc if the review was positive and they refuted it. And just where would you post the review of the review?


We started this research a few years ago. It was first funded by Microsoft and then by Google. We used the content and also behaviors of reviewers to detect fake reviews and reviewers. But this is a VERY hard problem. The tough part is it is very difficult to verify the results in the real world. See this page:


Thank you for telling everyone how the software finds fakes through "scene setting" and the use of verbs and nouns.

The bad news is anyone thinking of posting a fake review is aware of these issues as well and can and will act accordingly.


So where can we the public get our hands on this software?

Gary Greenwood

Very nice advance. Humans are fantastically adaptive; all these trends mentioned about honest vs. fake will be learned by fake reviewers (if this software ever gets implemented), and in no time fake reviews will dress in honest dresses while customers keep paying licences for obsolete software that needs continuous upgrades. There are better ways than just algorithmic software :) but I will leave that for the knowledgeable ...

jaison Sibley

I think it is easy to tell if a place has fake reviews. Usually if a place is good or bad, the reviews will be mostly on one end of the spectrum. If there are many fakes, then there will be many conflicting reviews on both ends of the spectrum.
