System and methods for automatically detecting deceptive content

US 10,642,975 B2
Filed: 10/18/2012
Issued: 05/05/2020
Est. Priority Date: 10/19/2011
Status: Active Grant

Abstract

Systems and methods for detecting deceptive opinion spam. Certain embodiments include a classifier with improved accuracy for detecting deceptive opinion entries. A feature analysis of learned models reveals a relationship between deceptive opinions and imaginative writing. By modeling deception in a generative framework, the prevalence of deception in two popular online review communities may be determined. Deceptive opinion spam is a rapidly growing and widespread problem, especially in review communities with minimal posting requirements.

6 Claims

1. A method for classifying textual opinion information as truthful or deceptive, comprising the steps of:
- communicating with a processor a first source of opinion information via a communication interface connecting a user system to a network to collect opinion information, wherein the opinion information consists of at least one set of known deceptive opinion information and at least one set of known truthful opinion information forming an initial dataset, the communication interface being a wired communication interface or a wireless communication interface;
- storing by the processor the opinion information in a main memory of the user system;
- analyzing separately by the processor each of the set of known deceptive opinion information and the set of known truthful opinion information of the opinion information to determine features associated with each set in the initial dataset, wherein the machine-analysis comprises a genre identification approach that reviews each part of speech of the opinion information;
- automatically generating a model based on the analyzing step in which a first set of features comprising nouns, adjectives, prepositions, determiners, coordinating conjunctions are associated with the set of known deceptive opinion information and a second set of features comprising verbs, adverbs, pronouns and pre-determiners are associated with the set of known truthful opinion information;
- receiving by the processor an online review of a product or a service;
- applying by the processor the model to the online review, wherein the processor identifies text of the online review as one or more nouns, adjectives, prepositions, determiners, coordinating conjunctions, verbs, adverbs, pronouns and pre-determiners;
- calculating by the processor a first number of features of the first set in the text and a second number of features of the second set in the text; and
- categorizing by the processor the online review as deceptive when the first number is greater than the second number and categorizing the online review as truthful when the second number is greater than the first number.

2. The method of claim 1 further comprising the step of:
- displaying on a display device a label of one or more online reviews.

3. The method of claim 2, wherein the label is the word “deceptive”.

4. The method of claim 2, wherein the label is the word “truthful”.

5. The method of claim 3 further comprising the step of:
- removing by the processor the one or more online reviews labeled as “deceptive”.

6. The method of claim 1 further comprising:
- displaying on a display device statistics about prevalence of deceptive opinion information.
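
The counting and categorizing steps of claim 1, together with the labeling, removal, and prevalence steps of claims 2 to 6, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation. The input is assumed to be already tagged with Penn Treebank part-of-speech tags (the claims do not name a tagset), and the names `DECEPTIVE_TAGS`, `TRUTHFUL_TAGS`, `categorize`, `remove_deceptive`, and `prevalence` are hypothetical. The claims do not say what happens when the two counts are equal, so the sketch returns "undetermined" in that case. The prevalence figure here is a plain fraction of labeled reviews and does not reproduce the generative framework mentioned in the abstract.

```python
# Assumption: tokens arrive as (word, tag) pairs using Penn Treebank tags.
# The claims do not specify a tagset; this grouping is illustrative only.
DECEPTIVE_TAGS = {
    "NN", "NNS", "NNP", "NNPS",  # nouns
    "JJ", "JJR", "JJS",          # adjectives
    "IN",                        # prepositions
    "DT",                        # determiners
    "CC",                        # coordinating conjunctions
}
TRUTHFUL_TAGS = {
    "VB", "VBD", "VBG", "VBN", "VBP", "VBZ",  # verbs
    "RB", "RBR", "RBS",                       # adverbs
    "PRP", "PRP$",                            # pronouns
    "PDT",                                    # pre-determiners
}


def count_features(tagged_tokens):
    """Return (first_number, second_number) as described in claim 1."""
    first = sum(1 for _, tag in tagged_tokens if tag in DECEPTIVE_TAGS)
    second = sum(1 for _, tag in tagged_tokens if tag in TRUTHFUL_TAGS)
    return first, second


def categorize(tagged_tokens):
    """Label a review by comparing the two feature counts."""
    first, second = count_features(tagged_tokens)
    if first > second:
        return "deceptive"
    if second > first:
        return "truthful"
    # Assumption: the claims are silent on ties.
    return "undetermined"


def remove_deceptive(reviews):
    """Drop reviews labeled "deceptive" (compare claim 5)."""
    return [r for r in reviews if categorize(r) != "deceptive"]


def prevalence(reviews):
    """Fraction of reviews labeled "deceptive" (a naive statistic, compare claim 6)."""
    if not reviews:
        return 0.0
    labels = [categorize(r) for r in reviews]
    return labels.count("deceptive") / len(labels)


# Hand-tagged example reviews (tags supplied by hand for illustration).
review_a = [("The", "DT"), ("hotel", "NN"), ("was", "VBD"), ("a", "DT"),
            ("great", "JJ"), ("place", "NN"), ("in", "IN"), ("the", "DT"),
            ("city", "NN")]
review_b = [("We", "PRP"), ("really", "RB"), ("enjoyed", "VBD"), ("it", "PRP")]

if __name__ == "__main__":
    for review in (review_a, review_b):
        print(count_features(review), categorize(review))
    print(prevalence([review_a, review_b]))
```

In practice the tags would come from a part-of-speech tagger rather than being written by hand; any tagger that emits Penn Treebank tags could feed `categorize` without changes to the counting logic.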

Current Assignee: Cornell University
Original Assignee: Cornell University
Inventors: Ott, Myle; Choi, Yejin; Cardie, Claire; Hancock, Jeffrey
Primary Examiner(s): Khuu, Hien D
Application Number: US14/352,350
Publication Number: US 20140304814A1
Time in Patent Office: 2,756 Days
Field of Search: None
CPC Class Codes:
G06F 21/552   involving long-term monitor...
G06F 2221/034   Test or assess a computer o...
G06F 40/253   Grammatical analysis; Style...
