Online review assessment using multiple sources

US 10,089,660 B2
Filed: 09/09/2015
Issued: 10/02/2018
Est. Priority Date: 09/09/2014
Status: Active Grant

Abstract

Multiple sources of reviews for the same product or service (e.g., hotels, restaurants, clinics, hair salons, etc.) are utilized to provide a trustworthiness score. Such a score can clearly identify hotels with evidence of review manipulation, omission and fakery and provide the user with a comprehensive understanding of the reviews of a product or establishment. Three types of information are used in computing the score: spatial, temporal and network or graph-based. The information is blended to produce a representative set of features that can reliably produce the trustworthiness score. The invention is self-adapting to new reviews and sites. The invention also includes a validation mechanism by crowd-sourcing and fake review generation to ensure reliability and trustworthiness of the scoring.

20 Claims

1. A method for assessing the trustworthiness of a plurality of reviews directed to a provider, comprising the steps:
- providing a first database of reviews;
- generating a second database including reviewer-centric features involving inconsistent and unlikely reviews by a reviewer, provider-centric features involving inconsistent and unlikely reviews by a provider, and review-centric features involving contextual characteristics of each review, each of the features capturing temporal, spatial, contextual and graphical characteristics from one or more different review hosting sites;
- determining by a processor three outlier scores for the provider based on the features using each of a global density-based approach, a local outlier factor approach, and a hierarchical cluster-based approach;
- normalizing by the processor the three outlier scores to conform to a numerical scale;
- combining by the processor the scaled outlier scores into a combined outlier score;
- producing by the processor an outlier probability value using the combined outlier score;
- converting by the processor the outlier probability value to a trustworthiness score, a greater trustworthiness score indicating a greater level of suspiciousness of the reviews of the provider; and
- outputting by the processor the trustworthiness score.
- Dependent claims (2 to 19):
  - 2. The method according to claim 1, wherein the reviewer-centric features comprise a maximum number of reviews within a particular location, a maximum number of reviews within a particular time period, and a maximum number of reviews in common with each other based on the provider reviewed, dates for each review, and a clique size.
  - 3. The method according to claim 1, wherein the provider-centric features comprise one or more selected from the group of:
    - bursts, temporal bursts, oscillations, correlation of ratings between hosting sites, and correlation within the same hosting site.
  - 4. The method according to claim 1, wherein the review-centric features comprise a count of empty reviews and one or more reviews with a proportion of same text.
  - 5. The method according to claim 1, wherein the global density-based approach computes one of the three outlier scores by locating one or more points outside a core connected cluster.
  - 6. The method according to claim 1, wherein the local outlier factor approach computes one of the three outlier scores by averaging ratios of distances to one or more points and to each of its nearest neighbors.
  - 7. The method according to claim 1, wherein the hierarchical cluster-based approach computes one of the three outlier scores by identifying one or more points outside clusters at a certain level of a hierarchy.
  - 8. The method according to claim 1, wherein the normalizing step is performed using logarithmic inversion and gaussian scaling.
  - 9. The method according to claim 1, wherein the converting step is performed using linear inversion.
  - 10. The method according to claim 1 further comprising the steps of:
    - comparing by the processor names of one or more providers using a number of letters in common to determine a similarity value;
    - comparing by the processor a physical location of each of the one or more providers using geocoding to determine a distance value;
    - determining by the processor a maximum distance value and a minimum similarity value; and
    - identifying by the processor the one or more providers with the distance value less than the maximum distance value and the similarity value greater than the minimum similarity value.
  - 11. The method according to claim 10, wherein the similarity value is computed according to the step of dividing the number of letters in common by a number of letters of a provider of the one or more providers with the longest name.
  - 12. The method according to claim 10, wherein the distance value is computed according to the steps of:
    - translating by the processor the physical location into a latitude coordinate and a longitude coordinate;
    - converting by the processor the latitude coordinate and the longitude coordinate into spherical coordinates in radians;
    - computing by the processor an arc length from the spherical coordinates in radians;
    - multiplying by the processor the arc length by a radius of earth in miles; and
    - providing by the processor a distance value in miles between two provider locations.
  - 13. The method according to claim 5, wherein the global density approach comprises the steps of:
    - determining by the processor if a point representing a provider is a core point if it has a threshold number of points within a distance in a Euclidean space;
    - finding by the processor all core points within a predefined neighborhood from one to another to define a core cluster; and
    - denoting by the processor each point that is not in the core cluster as an outlier.
  - 14. The method according to claim 6, wherein the local outlier factor approach is defined as:
  - 15. The method of claim 14, wherein the local reachability distance of x is defined as:
  - 16. The method according to claim 7, wherein the hierarchical cluster-based approach comprises the steps of:
    - evaluating by the processor two clusters based on a minimum distance between any pair of points that form the two clusters; and
    - merging by the processor the closest two clusters.
  - 17. The method according to claim 8, further comprising the step of performing by the processor a log transform defined as:
  - 18. The method according to claim 1, wherein the probability outlier value P(x) is defined as:
  - 19. The method according to claim 1, wherein the trustworthiness score TV(x) is defined as:
    - TV(x) = 1 − P(x), wherein P(x) is the probability outlier value.
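
Claims 10 to 12 describe matching the same provider across hosting sites by name similarity and geographic distance. The following is a minimal Python sketch of that matching, not the patented implementation. It rests on several assumptions the claims leave open: "letters in common" is counted case-insensitively and with multiplicity, geocoding has already produced latitude and longitude in degrees, the arc length is computed with the spherical law of cosines, the Earth radius is taken as 3958.8 miles, and the thresholds 0.5 miles and 0.8 are purely illustrative. All function and parameter names are hypothetical.

```python
import math
from collections import Counter

# Assumption: mean Earth radius in miles; the claims do not fix a value.
EARTH_RADIUS_MILES = 3958.8


def name_similarity(name_a: str, name_b: str) -> float:
    """Letters in common divided by the letter count of the longer name (claim 11)."""
    # Assumption: ignore case and non-letters, count repeated letters separately.
    letters_a = Counter(ch for ch in name_a.lower() if ch.isalpha())
    letters_b = Counter(ch for ch in name_b.lower() if ch.isalpha())
    common = sum((letters_a & letters_b).values())  # multiset intersection
    longest = max(sum(letters_a.values()), sum(letters_b.values()))
    return common / longest if longest else 0.0


def distance_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Degrees -> radians -> arc length on the unit sphere -> miles (claim 12)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    # Spherical law of cosines; clamp to [-1, 1] to absorb rounding error.
    cos_arc = (math.sin(phi1) * math.sin(phi2)
               + math.cos(phi1) * math.cos(phi2) * math.cos(dlam))
    arc = math.acos(max(-1.0, min(1.0, cos_arc)))
    return arc * EARTH_RADIUS_MILES


def same_provider(a: dict, b: dict,
                  max_distance: float = 0.5,
                  min_similarity: float = 0.8) -> bool:
    """Claim 10 test: close enough on the map and similar enough in name."""
    sim = name_similarity(a["name"], b["name"])
    dist = distance_miles(a["lat"], a["lon"], b["lat"], b["lon"])
    return dist < max_distance and sim > min_similarity
```

For example, two listings named "Grand Hotel" and "grand hotel" at the same coordinates would be treated as one provider, while the same names one degree of latitude apart (roughly 69 miles) would not.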

20. A method for assessing the trustworthiness of a plurality of reviews directed to a provider, comprising the steps:
- providing a first database of reviews;
- generating a second database comprising reviewer-centric features involving inconsistent and unlikely reviews by a reviewer, provider-centric features involving inconsistent and unlikely reviews by a provider, and review-centric features involving contextual characteristics of each review, each of the features capturing temporal, spatial, contextual and graphical characteristics from one or more different review hosting sites;
- using by a processor the reviewer-centric features, the provider-centric features, and the review-centric features to determine outlier scores from each of a global density-based approach, a local outlier factor approach, and a hierarchical cluster-based approach;
- normalizing by the processor the outlier scores to conform to a numerical scale;
- combining by the processor the scaled outlier scores into a combined outlier score;
- producing by the processor an outlier probability value using the combined outlier score, wherein the probability outlier value P(x) is defined as;
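
The global density-based approach named in claims 1 and 20 is spelled out step by step in claim 13. Those steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: each provider is a tuple of numeric features, a point's own position does not count toward its neighbor threshold, and "the core cluster" is taken to be the largest connected group of core points. The names `density_outliers`, `eps`, and `min_pts` are hypothetical.

```python
import math


def density_outliers(points, eps, min_pts):
    """Return indices of points that fall outside the core cluster."""
    n = len(points)
    # Neighbors of each point within Euclidean distance eps, excluding itself.
    neighbors = [
        [j for j in range(n)
         if j != i and math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # Step 1: a point is a core point if it has at least min_pts neighbors.
    core = {i for i in range(n) if len(neighbors[i]) >= min_pts}

    # Step 2: link core points lying within eps of one another
    # (connected components found by depth-first search).
    clusters, seen = [], set()
    for start in sorted(core):
        if start in seen:
            continue
        component, stack = set(), [start]
        seen.add(start)
        while stack:
            i = stack.pop()
            component.add(i)
            for j in neighbors[i]:
                if j in core and j not in seen:
                    seen.add(j)
                    stack.append(j)
        clusters.append(component)

    # Assumption: the largest component is "the core cluster".
    core_cluster = max(clusters, key=len) if clusters else set()

    # Step 3: every point not in the core cluster is denoted an outlier.
    return [i for i in range(n) if i not in core_cluster]


providers = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(density_outliers(providers, eps=1.5, min_pts=2))  # prints [4]
```

Note that under this reading a point near the cluster that is not itself a core point is also reported as an outlier, which differs from the border-point handling in standard DBSCAN.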


Current Assignee: UNM Rainforest Innovations (f/k/a STC.UNM) (The University of New Mexico)
Original Assignee: UNM Rainforest Innovations (f/k/a STC.UNM) (The University of New Mexico)
Inventors: Luan, Shuang; Mueen, Abdullah; Faloutsos, Michalis; Minnich, Amanda J.
Primary Examiner(s): Aspinwall, Evan S
Application Number: US14/849,227
Publication Number: US 20160070709A1
Time in Patent Office: 1,119 Days
Field of Search: 707728
US Class Current:
CPC Class Codes: G06Q 30/0282 Rating or review of busines...
