Method and system for the automatic recognition of deceptive language

US 7,853,445 B2
Filed: 12/08/2005
Issued: 12/14/2010
Est. Priority Date: 12/10/2004
Status: Expired due to Fees

Abstract

A system for identifying deception within a text includes a processor for receiving and processing a text file. The processor includes a deception indicator tag analyzer for inserting into the text file at least one deception indicator tag that identifies a potentially deceptive word or phrase within the text file, and an interpreter for interpreting the at least one deception indicator tag to determine a distribution of potentially deceptive words or phrases within the text file and generating deception likelihood data based upon the density or distribution of potentially deceptive words or phrases within the text file.

A method for identifying deception within a text includes the steps of receiving a first text to be analyzed, normalizing the first text to produce a normalized text, inserting into the normalized text at least one part-of-speech tag that identifies a part of speech of a word associated with the part-of-speech tag, inserting into the normalized text at least one syntactic label that identifies a linguistic construction of one or more words associated with the syntactic label, inserting into the normalized text at least one deception indicator tag that identifies a potentially deceptive word or phrase within the normalized text, interpreting the at least one deception indicator tag to determine a distribution of potentially deceptive words or phrases within the normalized text, and generating deception likelihood data based upon the density or frequency of distribution of potentially deceptive words or phrases within the normalized text.
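The ordering of the method steps in the abstract (normalize, add part-of-speech tags, add syntactic labels, add deception indicator tags) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the toy lexicons, the tag names (`HEDGE`, `NEG`), the `word/POS/chunk<DI:...>` rendering, and all function names are hypothetical and are not taken from the patent.

```python
import re

# Toy lexicons: hypothetical, not taken from the patent.
POS_LEXICON = {"i": "PRP", "did": "VBD", "not": "RB", "really": "RB",
               "see": "VB", "him": "PRP"}
CHUNK_BY_POS = {"PRP": "NP", "VBD": "VP", "VB": "VP", "RB": "ADVP"}
DECEPTION_CUES = {"really": "HEDGE", "not": "NEG"}


def normalize(text):
    """Lowercase the text and keep only word tokens (a stand-in for normalization)."""
    return re.findall(r"[a-z']+", text.lower())


def add_pos_tags(tokens):
    """Attach a part-of-speech tag to each token; unknown words get 'UNK'."""
    return [{"word": t, "pos": POS_LEXICON.get(t, "UNK")} for t in tokens]


def add_syntactic_labels(items):
    """Attach a coarse syntactic label derived from the part-of-speech tag."""
    for item in items:
        item["chunk"] = CHUNK_BY_POS.get(item["pos"], "O")
    return items


def add_deception_tags(items):
    """Attach a deception indicator tag where the word is a known cue, else None."""
    for item in items:
        item["di"] = DECEPTION_CUES.get(item["word"])
    return items


def render(items):
    """Write the tags inline, next to the word they belong to."""
    parts = []
    for item in items:
        piece = f"{item['word']}/{item['pos']}/{item['chunk']}"
        if item["di"]:
            piece += f"<DI:{item['di']}>"
        parts.append(piece)
    return " ".join(parts)


def analyze(text):
    """Run the four stages in the order the abstract lists them."""
    items = add_deception_tags(add_syntactic_labels(add_pos_tags(normalize(text))))
    return render(items)


print(analyze("I did NOT really see him."))
```

The stages are kept as separate functions so that each one only consumes what the previous stage produced, which mirrors the way the abstract feeds the tagged, labeled text to the deception indicator tag analyzer.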

22 Claims

1. A system for identifying deception within a text, comprising:
- a processor for storing and processing a text file containing statements from a particular person whose credibility is being weighed as to verifiable propositions included in the text; and
  
  a memory;
  
  a deception indicator tag analyzer stored in memory and executing on the processor for inserting into the stored text file at least one deception indicator tag that identifies a potentially deceptive word or phrase at its location within the text file, and

  an interpreter stored in memory and executing on the processor for

  (a) interpreting the at least one deception indicator tag to determine a distribution of potentially deceptive words or phrases within the text file and for computing and storing for user review deception likelihood data based upon the distribution of potentially deceptive words or phrases within the text file, said deception likelihood data including a calculated distribution proximity metric for a plurality of words or phrases in the text file based upon the proximity of a word or phrase to the at least one deception indicator tag; and
  
  (b) marking words in the text file with differentiating indicia showing the proximity level calculated, to identify areas of the text file more likely to involve deception.
- - 2. A system according to claim 1, wherein the interpreter inserts in the text file the calculated proximity metric for each word or phrase to identify areas of the text file that are likely or unlikely to be deceptive.
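Claims 1 and 2 recite a proximity metric computed for words based on their proximity to a deception indicator tag, and inserted into the text. The claims do not give a formula, so the sketch below assumes one: the score of a word is `1 / (1 + d)`, where `d` is the distance in words to the nearest tagged word. The function names and the `word[score]` output format are likewise hypothetical.

```python
def proximity_scores(flags):
    """Score each position by closeness to the nearest tagged word.

    flags: list of bool, True where a word carries a deception indicator tag.
    Assumed formula (not from the patent): 1 / (1 + distance to nearest tag).
    """
    tagged = [i for i, flagged in enumerate(flags) if flagged]
    if not tagged:
        return [0.0] * len(flags)
    scores = []
    for i in range(len(flags)):
        nearest = min(abs(i - j) for j in tagged)
        scores.append(1.0 / (1 + nearest))
    return scores


def annotate(words, flags):
    """Insert the score after each word, as a stand-in for writing it into the text file."""
    scores = proximity_scores(flags)
    return " ".join(f"{w}[{s:.2f}]" for w, s in zip(words, scores))


words = ["i", "did", "not", "see", "him"]
flags = [False, False, True, False, False]
print(annotate(words, flags))  # i[0.33] did[0.50] not[1.00] see[0.50] him[0.33]
```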

3. A system for identifying deception within a text, comprising:
- a processor for storing and processing a text file containing statements from a particular person whose credibility is being weighed as to verifiable propositions included in the text; and
  
  a memory;
  
  a deception indicator tag analyzer stored in memory and executing on the processor for inserting into the stored text file at least one deception indicator tag that identifies a potentially deceptive word or phrase at its location within the text file, and

  an interpreter stored in memory and executing on the processor for interpreting the at least one deception indicator tag to determine a distribution of potentially deceptive words or phrases within the text file and for computing and storing for user review deception likelihood data based upon the distribution of potentially deceptive words or phrases within the text file, said deception likelihood data including a calculated distribution proximity metric for a plurality of words or phrases in the text file based upon the proximity of a word or phrase to the at least one deception indicator tag, the proximity metric comprising a moving average metric for the plurality of words and phrases in the text file based upon the proximity metric of the word or phrase, wherein the moving average metric comprises a portion of the deception likelihood data and said interpreter inserts in the text file the proximity metric for the plurality of words and phrases to identify areas of the text file that are likely or unlikely to be deceptive.
- - 4. A system according to claim 3, further comprising a display communicating with the interpreter executing on a processor for displaying the deception likelihood data within the text and in association with the at least one deception indicator tag according to one or more levels of likely deception.
  - 5. A system according to claim 3, wherein the processor further comprises
    - a receiver executing on a processor for receiving a first text file to be analyzed;

      a component executing on a processor for normalizing the first text file to produce a normalized text;

      a component executing on a processor for inserting into the normalized text file at least one part-of-speech tag that identifies a part of speech of a word associated with the part-of-speech tag; and

      a component executing on a processor for inserting into the normalized text file at least one syntactic label that identifies a linguistic construction of one or more words associated with the syntactic label,

      wherein the normalized text file including the at least one part-of-speech tag and the at least one syntactic label is provided to the deception indicator tag analyzer.
  - 6. A system according to claim 5, wherein the deception indicator tag analyzer executing on a processor inserts the deception indicator tag into the normalized text file based upon words or phrases in the normalized text, part-of-speech tags inserted into the normalized text file, and syntactic labels inserted in the normalized text file.
  - 7. A system according to claim 6, wherein the deception indicator tags are associated with a defined word or phrase found in a text file.
  - 8. A system according to claim 6, wherein the deception indicator tags are associated with a defined word or phrase when used in a defined linguistic context found in a text file.
  - 9. A system according to claim 3, wherein the calculation of the moving average metric for each word or phrase in the text file may be adjusted by a user of the system to focus the deception likelihood data within a text window length as specified in a configuration file.
  - 10. A system according to claim 3, wherein the moving average metric associated with each word or phrase within the text file is used to determine a level of potential deception likelihood for the associated word or phrase.
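Claims 3, 9 and 10 describe a moving average metric whose window length comes from a configuration file and which is used to assign a level of potential deception likelihood. A minimal sketch of that idea follows. The INI layout, the section and key names (`interpreter`, `window_length`), the centered window, and the two thresholds are all assumptions chosen for illustration; the claims do not specify them.

```python
import configparser

# Hypothetical configuration contents; the patent does not give a file format.
CONFIG_TEXT = """
[interpreter]
window_length = 3
"""


def load_window_length(config_text):
    """Read the window length from INI-style configuration text."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.getint("interpreter", "window_length")


def moving_average(values, window):
    """Centered moving average; the window is truncated at both ends of the text."""
    half = window // 2
    averaged = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        averaged.append(sum(values[lo:hi]) / (hi - lo))
    return averaged


def level(score, thresholds=(0.25, 0.5)):
    """Map a score to 0, 1 or 2 by counting how many thresholds it reaches."""
    return sum(score >= t for t in thresholds)


window = load_window_length(CONFIG_TEXT)
per_word = [0, 0, 1, 1, 0, 0]  # 1 marks a word carrying a deception indicator tag
smoothed = moving_average(per_word, window)
print([level(s) for s in smoothed])  # [0, 1, 2, 2, 1, 0]
```

A larger `window_length` spreads each tag's influence over more neighboring words, which is one way to read the "text window length" adjustment in claim 9.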

11. A method performed by a programmed processor for identifying deception within a text, comprising the steps of:
- receiving by the processor a first text to be analyzed containing statements from a particular person whose credibility is being weighed as to verifiable propositions included in the text;
  
  normalizing the first text by the processor to produce a normalized text;
  
  inserting into the normalized text by the processor at least one part-of-speech tag that identifies a part of speech of a word associated with the part-of-speech tag;
  
  inserting into the normalized text by the processor at least one syntactic label that identifies a linguistic construction of one or more words associated with the syntactic label;
  
  responsive to a deception tag analyzer that analyzes the normalized text and identifies potentially deceptive words and phrases, inserting into the normalized text by the processor at least one deception indicator tag that identifies a potentially deceptive word or phrase indicating a non-truthful statement at its location within the normalized text; and
  
  interpreting the at least one deception indicator tag by (a) generating, by the processor computing and storing for user review, deception likelihood data based upon the distribution of potentially deceptive words or phrases within the normalized text, said deception likelihood data including a calculated distribution proximity metric for a plurality of words or phrases in the text file based upon the proximity of a word or phrase to the at least one deception indicator tag, and (b) marking words in the text file with differentiating indicia showing the proximity level calculated, to identify areas of the text file more likely to involve deception.
- - 12. A method according to claim 11, wherein the step of interpreting the at least one deception indicator tag comprises the step of:
    - inserting in the text the calculated proximity metric for each word or phrase in the text to identify areas of the text file that are likely or unlikely to be deceptive.
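Step (b) of claim 11 marks words with differentiating indicia that show the calculated proximity level. The claim does not say what the indicia are, so the sketch below assumes one possible rendering: each word is wrapped in an HTML `span` whose class name encodes a level from 0 to 2. The class names, the thresholds, and the function name `to_html` are hypothetical.

```python
from html import escape


def to_html(words, scores, thresholds=(0.4, 0.8)):
    """Wrap each word in a span whose class reflects its proximity level.

    Level 0, 1 or 2 is the number of thresholds the score reaches (assumed scheme).
    """
    spans = []
    for word, score in zip(words, scores):
        lvl = sum(score >= t for t in thresholds)
        spans.append(f'<span class="di-level-{lvl}">{escape(word)}</span>')
    return " ".join(spans)


print(to_html(["did", "not", "see"], [0.5, 1.0, 0.3]))
```

A stylesheet could then color `di-level-2` more strongly than `di-level-1`, so that a reviewer sees at a glance which stretches of the text sit closest to deception indicator tags.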

13. A method performed by a programmed processor for identifying deception within a text, comprising the steps of:
- receiving by the processor a first text to be analyzed containing statements from a particular person whose credibility is being weighed as to verifiable propositions included in the text;
  
  normalizing the first text by the processor to produce a normalized text;
  
  inserting into the normalized text by the processor at least one part-of-speech tag that identifies a part of speech of a word associated with the part-of-speech tag;
  
  inserting into the normalized text by the processor at least one syntactic label that identifies a linguistic construction of one or more words associated with the syntactic label;
  
  responsive to a deception tag analyzer that analyzes the normalized text and identifies potentially deceptive words and phrases, inserting into the normalized text by the processor at least one deception indicator tag that identifies a potentially deceptive word or phrase indicating a non-truthful statement at its location within the normalized text; and
  
  interpreting the at least one deception indicator tag by generating, by the processor computing and storing for user review, deception likelihood data based upon the distribution of potentially deceptive words or phrases within the normalized text, said deception likelihood data including a calculated distribution proximity metric for a plurality of words or phrases in the text file based upon the proximity of a word or phrase to the at least one deception indicator tag, wherein the step of interpreting the at least one deception indicator tag further comprises the steps of;
  
  calculating a moving average metric for the plurality of words or phrases in the text file based upon the proximity metric of the word or phrase, wherein the moving average metric comprises a portion of the deception likelihood data and inserting in the text the calculated proximity metric for the plurality of words or phrases in the text to identify areas of the text file that are likely or unlikely to be deceptive.
- - 14. A method according to claim 13, further comprising the step of displaying the deception likelihood data within the text and in association with the at least one deception indicator tag according to one or more levels of likely deception.
  - 15. A method according to claim 13, wherein the deception indicator tag analyzer inserts the deception indicator tag into the normalized text based upon words or phrases in the normalized text, part-of-speech tags inserted into the normalized text, and syntactic labels inserted in the normalized text.
  - 16. A method according to claim 15, wherein the deception indicator tags are associated with a defined word or phrase found in a text file.
  - 17. A method according to claim 15, wherein the deception indicator tags are associated with a defined word or phrase when used in a defined linguistic context found in a text file.
  - 18. A method according to claim 13, wherein the calculation of the moving average metric for each word or phrase in the text file may be adjusted by a user of the system to focus the deception likelihood data within a text window length as specified in a configuration file.
  - 19. A method according to claim 13, wherein the moving average metric associated with each word or phrase within the text file is used to determine a level of potential deception likelihood for the associated word or phrase.
  - 20. A method according to claim 13, wherein the step of receiving a first text to be analyzed comprises receiving a live feed from a real-time transcription of a person's utterances and the deception likelihood data is generated in real time.
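Claim 20 applies the method to a live transcription feed, with the likelihood data generated in real time. One way this could work is sketched below under stated assumptions: since later words are not yet available, the moving average is taken over a trailing window of the most recent words rather than a centered one. The cue list, the window size, and the generator name `stream_scores` are hypothetical.

```python
from collections import deque

# Hypothetical cue words; the patent's actual indicator inventory is not given here.
CUES = {"honestly", "really", "maybe"}


def stream_scores(words, window=4):
    """Yield (word, score) as each word arrives.

    The score is the share of cue words among the last `window` words seen,
    a trailing moving average that never looks ahead in the feed.
    """
    recent = deque(maxlen=window)
    for word in words:
        recent.append(1 if word.lower() in CUES else 0)
        yield word, sum(recent) / len(recent)


feed = iter(["I", "honestly", "never", "really", "went", "there"])
for word, score in stream_scores(feed):
    print(f"{word}\t{score:.2f}")
```

Because `stream_scores` is a generator, it accepts any iterable, including one that blocks while waiting for the next transcribed word.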

21. An article of manufacture comprising:
- a computer readable non-transitory storage medium for identifying deception within a text containing statements from a particular person whose credibility is being weighed as to verifiable propositions included in the text, wherein the program code directs a computer to perform a method comprising the steps of;
  
  controlling a deception indicator tag analyzer for inserting into the text file at least one deception indicator tag that identifies a potentially deceptive word or phrase at its location within the text file, and

  controlling an interpreter for interpreting the at least one deception indicator tag to determine a distribution of potentially deceptive words or phrases within the text file and for computing and storing for user review deception likelihood data based upon the distribution of potentially deceptive words or phrases within the text file, said deception likelihood data including a calculated distribution proximity metric for a plurality of words or phrases in the text file based upon the proximity of a word or phrase to the at least one deception indicator tag, the proximity metric comprising a moving average metric for the plurality of words or phrases in the text file based upon the proximity metric of a word or phrase, wherein the moving average metric comprises a portion of the deception likelihood data and said interpreter inserts in the text file the proximity metric for the plurality of words or phrases to identify areas of the text file that are likely or unlikely to be deceptive.
- - 22. An article of manufacture according to claim 21, further comprising program code for:
    - receiving a first text to be analyzed;
      
      normalizing the first text to produce a normalized text;
      
      inserting into the normalized text at least one part-of-speech tag that identifies a part of speech of a word associated with the part-of-speech tag; and
      
      inserting into the normalized text at least one syntactic label that identifies a linguistic construction of one or more words associated with the syntactic label;
      
      and wherein the program code for the deception indicator tag analyzer inserts into the normalized text at least one deception indicator tag that identifies a potentially deceptive word or phrase within the normalized text, and the program code for the interpreter interprets the at least one deception indicator tag to determine a distribution of potentially deceptive words or phrases within the normalized text and generates deception likelihood data based upon the distribution of potentially deceptive word or phrases within the normalized text.
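Several claims (6 to 8, 15 to 17, and 22) have the deception indicator tag analyzer work on text that already carries part-of-speech tags and syntactic labels, and distinguish a tag tied to a defined word from a tag tied to a defined word in a defined linguistic context. The sketch below shows one way such rules might be represented. The rule set, the tag names, and the use of a single required part-of-speech tag as the "context" are assumptions for illustration only.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    word: str
    tag: str
    required_pos: Optional[str] = None  # None means the rule ignores context


# Hypothetical rules: "honestly" is always tagged; "like" only when used as an adverb.
RULES = [
    Rule("honestly", "QUALIFIER"),
    Rule("like", "HEDGE", required_pos="RB"),
]


def tag_tokens(tokens):
    """Return (word, pos, deception_tag_or_None) for each (word, pos) input pair."""
    tagged = []
    for word, pos in tokens:
        found = None
        for rule in RULES:
            if rule.word == word and rule.required_pos in (None, pos):
                found = rule.tag
                break
        tagged.append((word, pos, found))
    return tagged


# "like" as a verb is left alone; "like" as an adverb is tagged.
print(tag_tokens([("i", "PRP"), ("like", "VBP"), ("him", "PRP")]))
print(tag_tokens([("it", "PRP"), ("was", "VBD"), ("like", "RB"), ("dark", "JJ")]))
```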

Current Assignee: Deception Discovery Technologies LLC
Original Assignee: Deception Discovery Technologies LLC
Inventors: Schonwetter, Michael J., Bachenko, Joan C.
Primary Examiner(s): Wozniak; James S
Assistant Examiner(s): Baker; Matthew H
Application Number: US11/297,803
Publication Number: US 20070010993A1
Time in Patent Office: 1,832 Days
Field of Search: 704/1-10
US Class Current: 704/9

CPC Class Codes

G06F 40/253   Grammatical analysis; Style...

G06F 40/268   Morphological analysis

G06F 40/279   Recognition of textual enti...

G06F 40/30   Semantic analysis
