Method and device for determining and outputting the similarity between two data strings

US 7,689,638 B2
Filed: 11/28/2002
Issued: 03/30/2010
Est. Priority Date: 11/28/2002
Status: Active Grant


Abstract

The present invention discloses a method and device for determining and outputting a similarity measure between two data strings, each data string comprising data entities. The method comprises receiving a first data string and receiving a second data string, and is characterized by: determining consecutively following data entities in the first data string; determining the relative positions of the consecutively following data entities in the first data string; determining similar data entities with the same order in the second data string; determining the relative positions of the determined data entities in the second data string; determining a matching measure by determining how far the relative positions of data entities in the second data string match the relative positions of consecutively following data entities in the first data string; and outputting a similarity measure which corresponds to the matching measure of at least one comparison result.
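The steps in the abstract can be illustrated with a minimal Python sketch. It is one possible reading, not the patented implementation: the function names `label_entities` and `matching_measure` are hypothetical, and the scoring rule (the fraction of consecutive pairs from the first string that are still directly adjacent, in the same order, in the second string) is an assumption, since the abstract does not fix a formula.

```python
from collections import defaultdict


def label_entities(entities):
    """Number equal entities by occurrence: a, b, a -> (a, 1), (b, 1), (a, 2)."""
    counts = defaultdict(int)
    labels = []
    for entity in entities:
        counts[entity] += 1
        labels.append((entity, counts[entity]))
    return labels


def matching_measure(first, second):
    """Assumed scoring rule: share of consecutive pairs of `first` that are
    still directly adjacent, in the same order, in `second`."""
    first_labels = label_entities(first)
    # Position of every labeled entity in the second string.
    second_pos = {label: i for i, label in enumerate(label_entities(second))}
    pairs = list(zip(first_labels, first_labels[1:]))
    if not pairs:
        return 0.0
    hits = 0
    for a, b in pairs:
        if a in second_pos and b in second_pos:
            # In `first` the pair is one position apart; require the same here.
            if second_pos[b] - second_pos[a] == 1:
                hits += 1
    return hits / len(pairs)


if __name__ == "__main__":
    print(matching_measure("abcd", "abcd"))
    print(matching_measure("abcd", "abxcd"))
```

Under this assumed rule an identical string scores 1.0, and an inserted foreign entity lowers the score only for the pair it splits.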

16 Claims

1. A method comprising:
- receiving a first data string in an electronic component,
- receiving a second data string in said electronic component,
- determining pairs of consecutively following data entities in said first data string in a processing unit,
- determining the relative positions of said pairs of consecutively following data entities in said first data string in said processing unit,
- allocating a position label to each of said data entities in the first data string in said processing unit,
- numbering same data entities according to their relative position in accordance with the position label in said processing unit,
- determining similar data entities with the same order in said second data string in said processing unit,
- determining the relative positions of said determined data entities in said second data string in said processing unit,
- determining a matching measure by determining how far the relative positions of data entities in said second data string match with the relative positions of consecutively following data entities in said first data string in said processing unit, and
- determining a similarity measure which corresponds to the matching measure of at least one comparison result in said processing unit,
- repeating said determination of said similarity measure with a number of received second data strings in said processing unit, and
- outputting by an interface said determined similarity measures for said data strings according to the amount of similarity to said first data string,
- wherein said first data string of entities and said second data string of entities are data strings relating to one of associative text string, genome analysis, speech recognition, and musical melody.
  - 2. The method according to claim 1, further comprising:
    - determining at least one error limit for at least one of said entities, and
    - considering said at least one error limit during said determination of said matching measure.
  - 3. The method according to claim 1, further comprising:
    - determining a first distance between said two data entities of consecutively following data entities in said first data string,
    - determining a second distance of said two data entities determined in said second data string,
    - determining a difference between said first and second distances, and
    - considering said difference during said determination of said matching measure.
  - 4. The method according to claim 1, further comprising:
    - storing said second string together with said similarity measure.
  - 5. The method according to claim 1, further comprising:
    - determining a threshold for said similarity measure, and
    - outputting said second string, if said determined similarity measure at least equals said threshold.
  - 6. The method according to claim 1, further comprising:
    - analyzing the first string for entities not present in the first string, and
    - suppressing in the second string all said entities not present in said first string.
  - 7. The method according to claim 6, further comprising:
    - determining the number of entities that are present in the second string, but are not present in the first string, as a second similarity measure.
  - 8. The method according to claim 7, further comprising:
    - determining a section within said second string comprising at least the same number of entities that are simultaneously present in both strings.
  - 9. A computer readable medium stored with code, which when executed by a computer, performs the method of claim 1.
  - 10. The method according to claim 1, wherein the first data string and the second data string are pieces of text.
  - 11. The method according to claim 1, wherein the first data string and the second data string are each a sequence of musical notes.
  - 12. The method according to claim 1, wherein the first data string and the second data string are sequences of deoxyribonucleic acid.
  - 13. The method according to claim 1, wherein the first data string and the second data string are each phonetic sounds.
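As a rough illustration of dependent claims 2, 3, 5, 6 and 7, the following Python sketch shows how an error limit, a distance difference, a threshold and the suppression of foreign entities might fit together. All names (`suppress_foreign`, `tolerant_measure`, `passes_threshold`) are hypothetical, and treating the error limit as a single tolerance on the distance difference is an assumption; the claims do not prescribe this form.

```python
from collections import defaultdict


def suppress_foreign(first, second):
    """Claims 6 and 7 (one reading): drop entities of `second` that never occur
    in `first`, and report how many were dropped as a second measure."""
    allowed = set(first)
    kept = [e for e in second if e in allowed]
    return kept, len(second) - len(kept)


def _labeled_positions(entities):
    """Map (entity, occurrence number) -> position, in order of appearance."""
    counts = defaultdict(int)
    positions = {}
    for i, entity in enumerate(entities):
        counts[entity] += 1
        positions[(entity, counts[entity])] = i
    return positions


def tolerant_measure(first, second, error_limit=0):
    """Claims 2 and 3 (one reading): a consecutive pair of `first` counts as
    matched when its distance in `second` differs by at most `error_limit`."""
    p1 = _labeled_positions(first)
    p2 = _labeled_positions(second)
    labels = list(p1)  # dicts keep insertion order, i.e. string order
    pairs = list(zip(labels, labels[1:]))
    if not pairs:
        return 0.0
    hits = 0
    for a, b in pairs:
        if a in p2 and b in p2:
            first_distance = p1[b] - p1[a]   # 1 for consecutive entities
            second_distance = p2[b] - p2[a]
            if abs(second_distance - first_distance) <= error_limit:
                hits += 1
    return hits / len(pairs)


def passes_threshold(measure, threshold):
    """Claim 5: output the second string only if the measure reaches the threshold."""
    return measure >= threshold


if __name__ == "__main__":
    kept, foreign = suppress_foreign("abcd", "abxcd")
    print(kept, foreign)
    print(tolerant_measure("abcd", "abxcd", error_limit=1))
```

In this sketch, suppressing the foreign entity and widening the error limit are two alternative ways to let "abxcd" match "abcd" fully; which one a real system would prefer is a design choice the claims leave open.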

14. An electronic device comprising:
- a component configured to receive a first data string of entities and a second data string of entities, said first data string of entities and said second data string of entities being data strings relating to one of associative text string, genome analysis, speech recognition, and musical melody,
- a processing unit configured to
  - determine pairs of consecutively following data entities in said first data string,
  - determine the relative positions of said pairs of consecutively following data entities in said first data string,
  - allocate a position label to each of said data entities in the first data string,
  - number same data entities according to their relative position in accordance with the position label;
  - determine similar data entities with the same order in said second data string,
  - determine the relative positions of said determined data entities in said second data string, and
  - determine a matching measure by determining how far the relative positions of data entities in said second data string match with the relative positions of consecutively following data entities in said first data string, and
  - repeat said determination of said similarity measure with a number of received second data strings, and
- an interface configured to output a similarity measure for said second data string and said number of second data strings according to the amount of similarity to said first data string.
  - 15. An electronic device according to claim 14, further comprising a storage configured to store received strings and said determined similarity measures.
  - 16. The electronic device according to claim 14, wherein the electronic device is a mobile terminal device.
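The repeat-and-output behaviour of claim 14, together with the storage of claim 15, could look roughly like the sketch below. The names `rank_candidates` and `adjacent_pair_score` are hypothetical, and `adjacent_pair_score` is only a simple stand-in measure chosen for brevity (it ignores the occurrence numbering the claims describe); any similarity function with the same signature could be passed in instead.

```python
def adjacent_pair_score(first, second):
    """Stand-in measure (an assumption): share of adjacent pairs of `first`
    that also occur as adjacent pairs in `second`."""
    pairs = list(zip(first, first[1:]))
    if not pairs:
        return 0.0
    second_pairs = set(zip(second, second[1:]))
    return sum(pair in second_pairs for pair in pairs) / len(pairs)


def rank_candidates(first, candidates, measure=adjacent_pair_score):
    """Score every received second string against `first` and return
    (score, string) tuples ordered from most to least similar."""
    scored = [(measure(first, candidate), candidate) for candidate in candidates]
    # Sort on the score only; ties keep their original order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


if __name__ == "__main__":
    # The returned list doubles as the stored strings plus measures of claim 15.
    for score, candidate in rank_candidates("abcd", ["dcba", "abcd", "abxd"]):
        print(f"{score:.2f}  {candidate}")
```

Returning the scores alongside the strings keeps the output step separate from the scoring step, so the same list can be stored, filtered by a threshold, or displayed.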

Current Assignee: WSOU Investments, LLC (WSOU Holdings, LLC)
Original Assignee: Nokia Corporation
Inventors: Theimer, Wolfgang; Ross, Andree
Primary Examiner(s): Ngo, Chuong D
Application Number: US10/534,007
Publication Number: US 20060117228A1
Time in Patent Office: 2,679 Days
Field of Search: 708/422
US Class Current: 708/422
CPC Class Codes: G06F 16/90344 by using string matching te...
