Online review assessment using multiple sources
First Claim
1. A method for assessing the trustworthiness of a plurality of reviews directed to a provider, comprising the steps:
providing a first database of reviews;
generating a second database including reviewer-centric features involving inconsistent and unlikely reviews by a reviewer, provider-centric features involving inconsistent and unlikely reviews by a provider, and review-centric features involving contextual characteristics of each review, each of the features capturing temporal, spatial, contextual and graphical characteristics from one or more different review hosting sites;
determining by a processor three outlier scores for the provider based on the features using each of a global density-based approach, a local outlier factor approach, and a hierarchical cluster-based approach;
normalizing by the processor the three outlier scores to conform to a numerical scale;
combining by the processor the scaled outlier scores into a combined outlier score;
producing by the processor an outlier probability value using the combined outlier score;
converting by the processor the outlier probability value to a trustworthiness score, a greater trustworthiness score indicating a greater level of suspiciousness of the reviews of the provider; and
outputting by the processor the trustworthiness score.
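The normalizing, combining, probability, and conversion steps recited above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the claim does not fix the formulas, so min-max scaling, a plain average as the combiner, Gaussian (error-function) scaling for the probability, and a 0 to 100 output scale are all assumed choices. The function names and the sample raw scores are hypothetical, and the three detectors themselves are not implemented here.

```python
import math
from statistics import mean, pstdev


def min_max(scores):
    """Rescale raw scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combine(*scaled_lists):
    """Average the scaled scores provider by provider (assumed combiner)."""
    return [mean(vals) for vals in zip(*scaled_lists)]


def outlier_probability(combined):
    """Gaussian scaling (assumed): max(0, erf((s - mu) / (sigma * sqrt(2))))."""
    mu, sigma = mean(combined), pstdev(combined)
    if sigma == 0:
        return [0.0 for _ in combined]
    return [max(0.0, math.erf((s - mu) / (sigma * math.sqrt(2))))
            for s in combined]


def trust_scores(probabilities, scale=100.0):
    """Map probabilities to a score; per the claim, higher = more suspicious."""
    return [round(scale * p, 1) for p in probabilities]


# Hypothetical raw outlier scores for four providers, one list per detector.
global_density = [0.2, 0.3, 0.25, 2.0]
lof = [1.0, 1.1, 0.9, 3.5]
cluster = [5.0, 6.0, 5.5, 40.0]

combined = combine(min_max(global_density), min_max(lof), min_max(cluster))
probs = outlier_probability(combined)
scores = trust_scores(probs)
print(scores)
```

In this made-up data the fourth provider is extreme under all three detectors, so it receives by far the highest score while the other three fall at or near zero. Averaging is only one way to combine the scaled scores; a maximum or a weighted sum would fit the same claim language.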
Abstract
Multiple sources of reviews for the same product or service (e.g., hotels, restaurants, clinics, hair salons) are utilized to provide a trustworthiness score. Such a score can clearly identify hotels with evidence of review manipulation, omission and fakery and provide the user with a comprehensive understanding of the reviews of a product or establishment. Three types of information are used in computing the score: spatial, temporal, and network or graph-based. The information is blended to produce a representative set of features that can reliably produce the trustworthiness score. The invention is self-adapting to new reviews and sites. The invention also includes a validation mechanism, based on crowd-sourcing and fake review generation, to ensure the reliability and trustworthiness of the scoring.
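As an illustration of how spatial and temporal information might be blended into a single reviewer-centric feature, the sketch below computes the highest travel speed implied by one reviewer's consecutive reviews across sites. This particular feature is a hypothetical example and is not taken from the patent; the function names, the one-minute floor on elapsed time, and the sample coordinates are all assumptions.

```python
import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def max_implied_speed_kmh(reviews):
    """reviews: list of (timestamp, lat, lon) for one reviewer, pooled across sites.

    Returns the largest speed needed to get from one reviewed provider
    to the next in time order. Elapsed time is floored at one minute
    (an assumption) so simultaneous reviews do not divide by zero.
    """
    ordered = sorted(reviews, key=lambda rv: rv[0])
    fastest = 0.0
    for (t1, la1, lo1), (t2, la2, lo2) in zip(ordered, ordered[1:]):
        hours = max((t2 - t1).total_seconds() / 3600.0, 1.0 / 60.0)
        fastest = max(fastest, haversine_km(la1, lo1, la2, lo2) / hours)
    return fastest


# Hypothetical reviewers: one posts from New York and Los Angeles two hours apart,
# the other posts twice within the same city two days apart.
suspicious = [
    (datetime(2020, 1, 1, 12, 0), 40.71, -74.01),
    (datetime(2020, 1, 1, 14, 0), 34.05, -118.24),
]
ordinary = [
    (datetime(2020, 1, 1, 12, 0), 40.71, -74.01),
    (datetime(2020, 1, 3, 12, 0), 40.75, -73.99),
]
fast = max_implied_speed_kmh(suspicious)
slow = max_implied_speed_kmh(ordinary)
print(round(fast), round(slow, 2))
```

A value well above airliner speed suggests the reviews are unlikely to reflect real visits. A feature of this kind would be one column among many in the feature database that feeds the outlier detectors.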
25 Citations
20 Claims
1. A method for assessing the trustworthiness of a plurality of reviews directed to a provider, comprising the steps recited under First Claim above. Dependent claims: 2-19.
20. A method for assessing the trustworthiness of a plurality of reviews directed to a provider, comprising the steps:
providing a first database of reviews;
generating a second database comprising reviewer-centric features involving inconsistent and unlikely reviews by a reviewer, provider-centric features involving inconsistent and unlikely reviews by a provider, and review-centric features involving contextual characteristics of each review, each of the features capturing temporal, spatial, contextual and graphical characteristics from one or more different review hosting sites;
using by a processor the reviewer-centric features, the provider-centric features, and the review-centric features to determine outlier scores from each of a global density-based approach, a local outlier factor approach, and a hierarchical cluster-based approach;
normalizing by the processor the outlier scores to conform to a numerical scale;
combining by the processor the scaled outlier scores into a combined outlier score;
producing by the processor an outlier probability value using the combined outlier score, wherein the probability outlier value P(x) is defined as;
Specification