Software picks out fake online reviews

July 26, 2011

Newly developed software has been shown to pick out deceptive online reviews with almost 90 percent accuracy (Photo: Gizmag)

One of the great things about the internet is that everyday people can share what they know with the entire world, so if they've had a particularly good or bad experience with a business or product, they can notify everyone via customer review websites. The flip side, however, is that business owners can plant fake reviews on those same sites that either praise their own business or slam their competition. Well, confused consumers can now take heart: researchers from Cornell University have developed software that is able to identify phony reviews with close to 90 percent accuracy.

The Cornell team asked a group of people to deliberately write a total of 400 fraudulent positive reviews of 20 Chicago hotels. These were combined with the same number of genuinely positive reviews, then submitted to a panel of three human judges. When asked to identify which reviews were spam, the judges scored no better than if they had randomly guessed.

According to Myle Ott, a Cornell doctoral candidate in computer science, humans are affected by a "truth bias," in which they assume that everything they read is true unless presented with evidence to the contrary. Once they are presented with such evidence, they overcompensate and assume that more of what they read is untrue than is actually the case.

After the human trials, the researchers applied statistical machine learning algorithms to the reviews, to see what distinguished the genuine examples from the fraudulent ones. It turns out that the fake ones used a lot of scene-setting language, such as "vacation," "business" or "my husband." The genuine ones, on the other hand, tended to focus more on specific words relating to the hotel, such as "bathroom," "check-in" and "price."
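
As a rough illustration of how word usage can separate the two groups, the sketch below scores each word by a smoothed log-odds ratio between a "fake" and a "genuine" set of reviews. The toy reviews, the function names and the choice of log-odds scoring are all assumptions made for this example; the article does not say which statistics the Cornell team actually used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a review and split it into words (hyphenated words kept whole)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def log_odds(fake_reviews, real_reviews):
    """Map each word to its add-one-smoothed log-odds of fake vs. genuine use.

    Positive scores lean towards the fake set, negative towards the genuine set.
    """
    fake = Counter(w for r in fake_reviews for w in tokenize(r))
    real = Counter(w for r in real_reviews for w in tokenize(r))
    vocab = set(fake) | set(real)
    fake_total = sum(fake.values()) + len(vocab)
    real_total = sum(real.values()) + len(vocab)
    return {
        w: math.log((fake[w] + 1) / fake_total)
        - math.log((real[w] + 1) / real_total)
        for w in vocab
    }


# Invented toy reviews (assumption: NOT data from the Cornell study).
FAKE = [
    "My husband and I loved our vacation here",
    "Perfect for a business trip or a vacation with my husband",
]
REAL = [
    "The bathroom was small but the price was fair",
    "Check-in was slow and the bathroom was clean",
]

scores = log_odds(FAKE, REAL)
ranked = sorted(scores, key=scores.get)
print("leans genuine:", ranked[:3])
print("leans fake:", ranked[-3:])
```

With only four invented reviews the ranking means little; the point is just that scene-setting words and hotel-specific words end up at opposite ends of the scale.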

The two groups of writers also differed in their use of specific keywords and punctuation, and how much they referred to themselves. As had already been found in other studies of imaginative versus informative writing, it was additionally determined that the spam reviews contained more verbs, while the honest ones contained more nouns.

Based on a subset of the 800 reviews, the team created a fake-review-detecting algorithm. When used in a way that combined the analysis of keywords and word combinations, that algorithm was able to identify deceptive reviews in the entire database with 89.8 percent accuracy.
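
To give a feel for what combining "keywords and word combinations" might look like, here is a minimal sketch that trains a classifier on single words plus adjacent word pairs. It uses a simple Naive Bayes model as a stand-in; the article does not name the algorithm the Cornell team used, and the class name, labels and toy reviews below are assumptions made for the example.

```python
import math
import re
from collections import Counter


def features(text):
    """Return single words plus adjacent word pairs from a review."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    pairs = [a + " " + b for a, b in zip(words, words[1:])]
    return words + pairs


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.docs = Counter(labels)  # number of training reviews per label
        self.counts = {}             # label -> feature counts
        for text, label in zip(texts, labels):
            self.counts.setdefault(label, Counter()).update(features(text))
        self.vocab = {f for c in self.counts.values() for f in c}
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.counts.items():
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.docs[label] / total_docs)  # class prior
            for f in features(text):
                if f in self.vocab:  # ignore features never seen in training
                    score += math.log((counts[f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of reviews whose predicted label matches the given label."""
    hits = sum(model.predict(t) == l for t, l in zip(texts, labels))
    return hits / len(texts)


# Invented toy reviews (assumption: NOT data from the Cornell study).
train_texts = [
    "My husband and I loved our vacation here",
    "Perfect for a business trip or a vacation with my husband",
    "The bathroom was small but the price was fair",
    "Check-in was slow and the bathroom was clean",
]
train_labels = ["deceptive", "deceptive", "genuine", "genuine"]
model = NaiveBayes().fit(train_texts, train_labels)

test_texts = [
    "A wonderful vacation with my husband",
    "The bathroom was clean and the price was fair",
]
test_labels = ["deceptive", "genuine"]
print(accuracy(model, test_texts, test_labels))
```

Scoring reviews that were held out of training, as the last lines do, is what makes an accuracy figure meaningful; a model scored only on the reviews it learned from would look better than it really is.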

So far, the software is only useful for processing hotel reviews, and Chicago hotel reviews at that. The Cornell team is hoping, however, that similar algorithms could be developed for reviews of a wider range of goods and services.

"Ultimately, cutting down on deception helps everyone," said Ott. "Customers need to be able to trust the reviews they read, and sellers need feedback on how best to improve their services."

17 comments

DixonAgee July 26, 2011 05:31 PM

so ... when might this be available as an 'app'?

newsontim July 27, 2011 12:56 AM

As someone who works in the hotel industry I can tell you with a lot of confidence that websites such as Trip Advisor are nearly entirely filled with false reviews, either from competitors wishing to trash the competition or from operators writing their own more positive reviews. The whole system is a total sham and this piece of software will not come close to fixing it. In my mind there is no doubt that it is time for government to step in and reform the entire sector, as it is simply a con, pure and simple.

Renārs Grebežs July 27, 2011 08:50 AM

This reminds me of spammed youtube channels with guys figuring all sorts of imagined stories just so people go to their .cc sites and 'watch the movies for free'. And the IMDB movie reviews.. The general rule of thumb is that the first reviews (especially on a small budget) are the fake ones from family, friends, etc. Actually, if this algorithm won't work, a great vacancy for people that see through all the bs might open. Come to think about it - not just 1..))

Alien July 27, 2011 12:38 PM

I'm delighted by this advance in validating reviews and I hope very soon it can be adapted to other industries and products. As a fairly regular on-line buyer I tend to look at reviews and while I try to 'sift' them to spot the dubious ones, I expect that, like the people mentioned in your article, I'm probably not good at it.
Phoney reviews, whether by customers or biased journalists, on-line and in magazines, are a waste of everybody's time and a serious disservice to customers, wronged suppliers and ultimately to the credibility of whatever forum is involved. So good luck to the people at Cornell and may they soon, very soon, have versions of their system suitable for many more situations.

Michael Shewell July 27, 2011 01:24 PM

Seems to me a more scientific way to go about this would be comparing the I.P. address of origin to the author somehow and not so much on "how" the article is written. Seems as with Cornell's system, there is a huge margin for error there but I do not know for sure, obviously. I'm just skeptical.

Calson July 27, 2011 02:19 PM

This study is badly done. To limit the study to 400 individuals located in Chicago to write about Chicago hotels makes the information specific to Chicago and maybe not even then. If the 400 consist of 180 males and of these 20 are in the 20-30 year age group, and of these 20 you have 10 that graduated from college, then you do not even have a statistically valid sample of possible business marketing types who would be called upon to write a fake review for their hotel employer.
If someone in Chicago writes about going with their husband to a hotel in Chicago on their vacation obviously they are lying. If someone in Chicago writes about a hotel in Cozumel and mentions going with their husband on a vacation they are more likely to be presenting real information.
This study tells me far more about the incompetency of the study designers than it does about false reviews posted on the internet. Even with reviews where people are attempting to be honest their personal bias enters into the information that is provided and that which is omitted. One person may like the fact that room service came in while they were out and turned down the bed and another person may complain that staff comes into their room without permission while they are away. Two people can eat the same food and the person from Connecticut may find it too spicy and the person from California may find it too bland and neither of these negative reviews would be false.

Summilux July 27, 2011 02:46 PM

If you believe Shills can be eliminated - you're dreaming. Shills have been around since the beginning of time. Nobody has ever alleviated shills. Nobody will.
Shills and Cockroaches are equally indestructible.

Jason Elizondo July 27, 2011 03:41 PM

At least the ball is rolling. It's about time something is done about the B.S. reviews online. I say more power to 'em. If the issue of "false" reviews isn't addressed it will be even more like the "wild west" online than the personal oasis it's supposed to be.

TexByrnes July 27, 2011 06:31 PM

@Calson.
"To limit the study to 400 individuals 'LOCATED IN CHICAGO' to write about Chicago hotels"
This is not what was said. Read it again. Ian Colley.

Denis Klanac July 27, 2011 09:40 PM

So then the fakers just have to adjust their writing style. Easily defeated!
