NATURAL LANGUAGE PROCESSING AND STATISTICAL TECHNIQUES BASED METHODS FOR COMBINING AND COMPARING SYSTEM DATA

US 20170213222A1
Filed: 04/06/2017
Published: 07/27/2017
Est. Priority Date: 09/19/2013
Status: Abandoned Application

Abstract

Methods and systems are provided for automatically comparing, combining, and fusing vehicle data. First data is obtained pertaining to a first plurality of vehicles. Second data is obtained pertaining to a second plurality of vehicles. One or both of the first data and the second data include abbreviated terms. The abbreviated terms are disambiguated at least in part by identifying, from a domain ontology stored in a memory, respective basewords that are associated with each of the abbreviated terms, filtering the basewords, performing a set intersection of the basewords, and calculating posterior probabilities for the basewords based at least in part on the filtering and the set intersection. The first data and the second data are combined, via a processor, based on semantic and syntactic similarity between respective data elements of the first data and the second data and the disambiguating of the abbreviated terms.
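As a rough illustration of the disambiguation steps summarized above, the following Python sketch looks up candidate basewords for an abbreviation, filters them, intersects each candidate's known context words with the words of the record, and scores the survivors with a simple Bayes-style posterior. The ontology contents, the context counts, the `disambiguate` function, the filtering rule, and the probability model are all assumptions made for illustration; the abstract does not specify any of them.

```python
from collections import Counter

# Hypothetical domain ontology: abbreviation -> candidate basewords.
ONTOLOGY = {
    "TPS": ["throttle position sensor", "tire pressure sensor"],
}

# Hypothetical counts of context words seen alongside each baseword,
# standing in for statistics gathered from earlier records.
CONTEXT_COUNTS = {
    "throttle position sensor": Counter({"engine": 8, "stall": 5, "idle": 4}),
    "tire pressure sensor": Counter({"tire": 9, "wheel": 7, "warning": 3}),
}


def disambiguate(abbreviation, context_words):
    """Return {baseword: posterior probability} for one abbreviated term."""
    # Step 1: identify candidate basewords from the ontology.
    candidates = ONTOLOGY.get(abbreviation, [])

    # Step 2 (assumed filter): drop candidates with no context statistics.
    candidates = [b for b in candidates if CONTEXT_COUNTS.get(b)]

    # Step 3: set intersection of the record's words with each
    # candidate's known context words; keep candidates that overlap.
    context = set(context_words)
    shared = {b: context & set(CONTEXT_COUNTS[b]) for b in candidates}
    candidates = [b for b in candidates if shared[b]]
    if not candidates:
        return {}

    # Step 4: posterior proportional to prior times the likelihood of
    # the shared words, then normalized over the surviving candidates.
    total = sum(sum(CONTEXT_COUNTS[b].values()) for b in candidates)
    scores = {}
    for b in candidates:
        counts = CONTEXT_COUNTS[b]
        n = sum(counts.values())
        score = n / total  # prior from relative frequency
        for word in shared[b]:
            score *= counts[word] / n  # likelihood of each shared word
        scores[b] = score
    norm = sum(scores.values())
    return {b: s / norm for b, s in scores.items()}
```

For example, `disambiguate("TPS", ["engine", "stall", "rough"])` keeps only the throttle-related baseword, because the tire-related candidate shares no context words with the record.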

20 Claims

1. A method comprising:
- obtaining first data comprising data elements pertaining to a first plurality of vehicles;
  
  obtaining second data comprising data elements pertaining to a second plurality of vehicles, wherein one or both of the first data and the second data include one or more abbreviated terms;
  
  disambiguating the abbreviated terms at least in part by:
  
  identifying, from a domain ontology stored in a memory, respective basewords that are associated with each of the abbreviated terms;
  
  filtering the basewords;
  
  performing a set intersection of the basewords; and
  
  calculating posterior probabilities for the basewords based at least in part on the filtering and the set intersection; and
  
  combining the first data and the second data, via a processor, based on semantic and syntactic similarity between respective data elements of the first data and the second data and the disambiguating of the abbreviated terms.
  - 2. The method of claim 1, wherein:
    - the first data comprises design failure mode and effects analysis (DFMEA) data that is generated using vehicle warranty claims; and
      
      the second data comprises vehicle field data.
  - 3. The method of claim 2, further comprising:
    - determining whether any particular failure modes have resulted in multiple warranty claims for the vehicle, based on the DFMEA data and the vehicle field data; and
      
      updating the DFMEA data based on the multiple warranty claims for the vehicle caused by the particular failure modes.
  - 4. The method of claim 2, wherein:
    - the DFMEA data includes the one or more abbreviated terms;
      
      the step of disambiguating the abbreviated terms comprises disambiguating the abbreviated terms in the DFMEA data at least in part by:
      
      identifying, from a domain ontology stored in a memory, respective basewords that are associated with each of the abbreviated terms of the DFMEA data;
      
      filtering the basewords;
      
      performing a set intersection of the basewords; and
      
      calculating posterior probabilities for the basewords based at least in part on the filtering and the set intersection; and
      
      combining the first data and the second data, via a processor, based on syntactic similarity between respective data elements of the first data and the second data and the disambiguating of the abbreviated terms of the DFMEA data.
  - 5. The method of claim 2, wherein:
    - the vehicle warranty data includes the one or more abbreviated terms;
      
      the step of disambiguating the abbreviated terms comprises disambiguating the abbreviated terms in the vehicle warranty data at least in part by:
      
      identifying, from a domain ontology stored in a memory, respective basewords that are associated with each of the abbreviated terms of the vehicle warranty data;
      
      filtering the basewords;
      
      performing a set intersection of the basewords; and
      
      calculating posterior probabilities for the basewords based at least in part on the filtering and the set intersection; and
      
      combining the first data and the second data, via a processor, based on semantic and syntactic similarity between respective data elements of the first data and the second data and the disambiguating of the abbreviated terms of the vehicle warranty data.
  - 6. The method of claim 1, wherein the step of combining the first data and the second data comprises:
    - calculating, via the processor, a measure of syntactic similarity pertaining to respective data elements of the first data and the second data, based at least in part on the and the disambiguation of the abbreviated terms; and
      
      determining, via the processor, that the respective data elements of the first data and the second data are related to one another based on the calculated measure of the semantic and syntactic similarity.
  - 7. The method of claim 6, wherein the step of calculating the measure of the semantic and syntactic similarity comprises calculating, via the processor, the measure of semantic and syntactic similarity between terms associated with vehicle symptoms derived from the respective data elements of the first data and the second data, based at least in part on the and the disambiguation of the abbreviated terms.
  - 8. The method of claim 6, wherein:
    - the step of calculating the measure of the syntactic similarity comprises calculating, via the processor, a Jaccard Distance between terms derived from the respective data elements of the first data and the second data, based at least in part on the and the disambiguation of the abbreviated terms; and
      
      the step of determining that the respective data elements are related comprises determining, via the processor, that the respective data elements of the first data and the second data are related if the Jaccard Distance exceeds a predetermined threshold.
  - 9. The method of claim 8, wherein the step of determining that the respective data elements are related comprises:
    - determining, via the processor, that the respective data elements of the first data and the second data are synonymous if the Jaccard Distance exceeds the predetermined threshold.
  - 10. The method of claim 8, wherein:
    - the respective data elements of the first data and the second data comprise strings representing vehicle parts, vehicle systems, and vehicle actions; and
      
      the step of calculating the Jaccard Distance comprises calculating, via the processor, the Jaccard Distance between the respective strings of the respective data elements of the first data and the second data, based at least in part on the and the disambiguation of the abbreviated terms.
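Claims 8 to 10 recite a Jaccard Distance compared against a predetermined threshold. A minimal Python sketch follows. It assumes whitespace tokenization, a hypothetical `expansions` mapping that holds already-disambiguated abbreviations, and a threshold of 0.5, none of which come from the claims. It also assumes the claimed score is the overlap ratio |A ∩ B| / |A ∪ B| (commonly called the Jaccard index), so that a higher score means more similar strings; the conventional Jaccard distance is one minus that ratio and is included for comparison.

```python
def tokens(text, expansions=None):
    """Lowercase, split on whitespace, and expand known abbreviations."""
    expansions = expansions or {}
    out = set()
    for word in text.lower().split():
        # Replace an abbreviation with the words of its baseword, if known.
        out.update(expansions.get(word, word).split())
    return out


def jaccard_index(a, b, expansions=None):
    """Overlap ratio of the two token sets, between 0.0 and 1.0."""
    ta, tb = tokens(a, expansions), tokens(b, expansions)
    if not ta and not tb:
        return 1.0  # two empty strings are treated as identical
    return len(ta & tb) / len(ta | tb)


def jaccard_distance(a, b, expansions=None):
    """Conventional Jaccard distance: one minus the overlap ratio."""
    return 1.0 - jaccard_index(a, b, expansions)


def are_related(a, b, threshold=0.5, expansions=None):
    """Assumed reading of the claims: related if the overlap exceeds the threshold."""
    return jaccard_index(a, b, expansions) > threshold
```

With the hypothetical mapping `{"ecm": "engine control module"}`, the strings "ECM fault" and "engine control module fault" produce identical token sets, which shows how disambiguation can change the similarity score.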

11. A method comprising:
- obtaining first data comprising data elements pertaining to a first plurality of vehicles, the first data comprising design failure mode and effects analysis (DFMEA) data that is generated using vehicle warranty claims;
  
  obtaining second data comprising data elements pertaining to a second plurality of vehicles, the second data comprising vehicle field data;
  
  combining the DFMEA data and the vehicle field data, based on syntactic similarity between respective data elements of the DFMEA data and the vehicle field data;
  
  determining whether any particular failure modes have resulted in multiple warranty claims for the vehicle, based on the DFMEA data and the vehicle field data; and
  
  updating the DFMEA data based on the multiple warranty claims for the vehicle caused by the particular failure modes.
  - 12. The method of claim 11, wherein the DFMEA data, the warranty data, or both, include one or more abbreviated terms, and the process further comprises:
    - disambiguating the abbreviated terms at least in part by:
      
      identifying, from a domain ontology stored in a memory, respective basewords that are associated with each of the abbreviated terms;
      
      filtering the basewords;
      
      performing a set intersection of the basewords; and
      
      calculating posterior probabilities for the basewords based at least in part on the filtering and the set intersection;
      
      wherein the step of combining the DFMEA data and the vehicle field data comprises combining the DFMEA data and the vehicle field data based on syntactic similarity between respective data elements of the DFMEA data and the vehicle field data and the disambiguating of the abbreviated terms.
  - 13. The method of claim 11, wherein the DFMEA data includes the one or more abbreviated terms, and the process further comprises:
    - disambiguating the abbreviated terms of the DFMEA data at least in part by:
      
      identifying, from a domain ontology stored in a memory, respective basewords that are associated with each of the abbreviated terms of the DFMEA data;
      
      filtering the basewords;
      
      performing a set intersection of the basewords; and
      
      calculating posterior probabilities for the basewords based at least in part on the filtering and the set intersection;
      
      wherein the step of combining the DFMEA data and the vehicle field data comprises combining the DFMEA data and the vehicle field data based on semantic and syntactic similarity between respective data elements of the DFMEA data and the vehicle field data and the disambiguating of the abbreviated terms of the DFMEA data.
  - 14. The method of claim 11, wherein the vehicle warranty data includes the one or more abbreviated terms, and the process further comprises:
    - disambiguating the abbreviated terms of the vehicle warranty data at least in part by:
      
      identifying, from a domain ontology stored in a memory, respective basewords that are associated with each of the abbreviated terms of the vehicle warranty data;
      
      filtering the basewords;
      
      performing a set intersection of the basewords; and
      
      calculating posterior probabilities for the basewords based at least in part on the filtering and the set intersection;
      
      wherein the step of combining the DFMEA data and the vehicle field data comprises combining the DFMEA data and the vehicle field data based on syntactic similarity between respective data elements of the DFMEA data and the vehicle field data and the disambiguating of the abbreviated terms of the vehicle warranty data.
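The counting and updating steps recited in claim 11 might look like the following Python sketch. The record layout (a `failure_mode` key on each field record, an `occurrence_count` field on each DFMEA entry), the `update_dfmea` function, and the reading of "multiple" as two or more claims are assumptions for illustration; the claims do not define a data format. The sketch also assumes the field records have already been matched to DFMEA failure modes, for example by a similarity comparison.

```python
from collections import Counter


def update_dfmea(dfmea, field_records, min_claims=2):
    """Return a copy of dfmea annotated with counts for recurring failure modes.

    dfmea: {failure_mode: {...entry fields...}}  (hypothetical layout)
    field_records: list of {"failure_mode": str}  (hypothetical layout)
    """
    # Count field records per failure mode that the DFMEA already lists.
    counts = Counter(
        record["failure_mode"]
        for record in field_records
        if record["failure_mode"] in dfmea
    )

    # Copy each entry so the caller's DFMEA data is left untouched.
    updated = {mode: dict(entry) for mode, entry in dfmea.items()}

    # Record the count only for failure modes with multiple claims.
    for mode, n in counts.items():
        if n >= min_claims:
            updated[mode]["occurrence_count"] = n
    return updated
```

Returning a copy rather than modifying the input is a design choice of this sketch, made so the original DFMEA data can still be compared against the updated version.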

15. A system comprising:
- a memory storing:
  
  first data comprising data elements pertaining to a first plurality of vehicles;
  
  second data comprising data elements pertaining to a second plurality of vehicles wherein one or both of the first data and the second data include one or more abbreviated terms; and
  
  a processor coupled to the memory and configured to at least facilitate:
  
  disambiguating the abbreviated terms at least in part by:
  
  identifying, from a domain ontology stored in a memory, respective basewords that are associated with each of the abbreviated terms;
  
  filtering the basewords;
  
  performing a set intersection of the basewords; and
  
  calculating posterior probabilities for the basewords based at least in part on the filtering and the set intersection; and
  
  combining the first data and the second data, via a processor, based on syntactic similarity between respective data elements of the first data and the second data and the disambiguating of the abbreviated terms.
  - 16. The system of claim 15, wherein the processor is further configured to:
    - calculate a measure of semantic and syntactic similarity between respective data elements of the first data and the second data, based at least in part on the and the disambiguation of the abbreviated terms; and
      
      determine that the respective data elements of the first data and the second data are related to one another based on the calculated measure of the semantic and syntactic similarity.
  - 17. The system of claim 16, wherein the processor is further configured to:
    - calculate a Jaccard Distance between respective data elements of the first data and the second data, based at least in part on the and the disambiguation of the abbreviated terms; and
      
      determine that the respective data elements of the first data and the second data are related if the Jaccard Distance exceeds a predetermined threshold.
  - 18. The system of claim 17, wherein:
    - the respective data elements of the first data and the second data comprise strings representing vehicle parts, vehicle systems, and vehicle actions; and
      
      the processor is further configured to calculate the Jaccard Distance between the respective strings of the respective data elements of the first data and the second data, based at least in part on the and the disambiguation of the abbreviated terms.
  - 19. The system of claim 15, wherein:
    - the first data comprises design failure mode and effects analysis (DFMEA) data that is generated using vehicle warranty claims; and
      
      the second data comprises vehicle field data.
  - 20. The system of claim 19, wherein the processor is configured to at least facilitate:
    - determining whether any particular failure modes have resulted in multiple warranty claims for the vehicle, based on the DFMEA data and the vehicle field data; and
      
      combining the first data and the second data, via a processor, based on syntactic similarity between respective data elements of the first data and the second data and the disambiguating of the abbreviated terms.

Current Assignee
GM Global Technology Operations LLC (General Motors Company)
Original Assignee
GM Global Technology Operations LLC (General Motors Company)
Inventors
DE, SOUMEN; RAJPATHAK, DNYANESH; PERANANDAM, PRAKASH M.; DONNDELINGER, JOSEPH A.; CAFEO, JOHN A.; BANDYOPADHYAY, PULAK

Application Number

US15/481,205
Publication Number

US 20170213222A1
CPC Class Codes

G06F 16/284   Relational databases

G06F 17/16   Matrix or vector computatio...

G06F 17/18   for evaluating statistical ...

G06F 40/216   using statistical methods

G06F 40/30   Semantic analysis

G06Q 30/012   Providing warranty services
