Hybrid comparison for Unicode text strings consisting primarily of ASCII characters

  • US 10,540,425 B2
  • Filed: 06/18/2019
  • Issued: 01/21/2020
  • Est. Priority Date: 11/06/2016
  • Status: Active Grant
First Claim

1. A method of comparing text strings having Unicode encoding, comprising:

  • at a computer having one or more processors, and memory storing one or more programs configured for execution by the one or more processors:

    receiving a first text string S=s1 s2 . . . sn having Unicode encoding and a second text string T=t1 t2 . . . tm having Unicode encoding, wherein n and m are positive integers, and s1, s2, . . . , sn and t1, t2, . . . , tm are Unicode characters;

    computing, for the first text string S, a first string weight ƒ(S) according to a weight function ƒ, computed according to:

    when it is determined that S consists entirely of ASCII characters, ƒ(S)=S;

    when it is determined that S consists of ASCII characters and one or more accented ASCII characters that are replaceable by corresponding ASCII characters, ƒ(S)=g(s1) g(s2) . . . g(sn), wherein g(si)=si when si is an ASCII character and g(si)=si′ when si is an accented ASCII character that is replaceable by the corresponding ASCII character si′; and

    when S includes one or more non-replaceable non-ASCII characters, the first string weight ƒ(S) is a concatenation of an ASCII weight prefix ƒA(S) and a Unicode weight suffix ƒU(S);

    computing, a second string weight ƒ(T), for the second text string T, according to the weight function ƒ; and

    determining whether the first text string and the second text string are equal by comparing the first string weight to the second string weight.
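
The three-way weight function recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how "replaceable" accented characters are identified or how the prefix ƒA(S) and suffix ƒU(S) are built, so the choices below are assumptions made for illustration only. Here a character counts as replaceable when its NFD decomposition is a single ASCII base character plus combining marks, the ASCII prefix holds the folded characters with a hypothetical placeholder (`PLACEHOLDER`) in each non-replaceable position, and the Unicode suffix holds the non-replaceable characters in their original order. The names `fold_char`, `weight`, and `strings_equal` are likewise hypothetical.

```python
import unicodedata

# Assumption: a control character marks each non-replaceable position
# in the ASCII prefix. The claim does not specify any such placeholder.
PLACEHOLDER = "\x1a"


def fold_char(ch):
    """Return the ASCII replacement for ch, or None if there is none.

    Assumption: "replaceable" means the NFD decomposition of ch is one
    ASCII base character followed only by combining marks.
    """
    if ch.isascii():
        return ch
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    if len(base) == 1 and base.isascii():
        return base
    return None


def weight(s):
    """Compute an illustrative string weight following the claim's three cases."""
    # Case 1: S consists entirely of ASCII characters, so the weight is S.
    if s.isascii():
        return s
    folded = [fold_char(ch) for ch in s]
    # Case 2: every non-ASCII character is replaceable, so the weight is
    # g(s1) g(s2) ... g(sn).
    if all(f is not None for f in folded):
        return "".join(folded)
    # Case 3: at least one non-replaceable character, so the weight is an
    # ASCII prefix concatenated with a Unicode suffix.
    prefix = "".join(f if f is not None else PLACEHOLDER for f in folded)
    suffix = "".join(ch for ch, f in zip(s, folded) if f is None)
    return prefix + suffix


def strings_equal(s, t):
    """Two strings are treated as equal when their weights are equal."""
    return weight(s) == weight(t)
```

Under these assumptions, `strings_equal("résumé", "resume")` holds because both weights are the ASCII string `resume`, while `strings_equal("日本", "本日")` does not because the Unicode suffixes differ. Because the prefix is pure ASCII and the suffix is pure non-ASCII in this sketch, the boundary between them is unambiguous without a separator, and a case-3 weight can never collide with a case-1 or case-2 weight.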
