Hybrid Comparison for Unicode Text Strings Consisting Primarily of ASCII Characters
Abstract
A method compares text strings having Unicode encoding. The method receives a first string S=s1s2 . . . sn and a second string T=t1t2 . . . tm, where s1, s2, . . . , sn and t1, t2, . . . , tm are Unicode characters. The method computes a first string weight for the first string S according to a weight function ƒ. When S consists of ASCII characters, ƒ(S)=S. When S consists of ASCII characters and some accented ASCII characters that are replaceable by ASCII characters, ƒ(S)=g(s1)g(s2) . . . g(sn), where g(si)=si when si is an ASCII character and g(si)=s′i when si is an accented ASCII character that is replaceable by the corresponding ASCII character s′i. The method also computes a second string weight for the second text string T. Equality of the strings is tested using the string weights.
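The weight function described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the names `REPLACEMENTS`, `g`, `weight`, and `strings_equal` are hypothetical, the replacement table holds a handful of example entries (the text above does not list which accented characters are replaceable), the input is assumed to use precomposed characters, and the fallback for strings outside the two described cases is a placeholder.

```python
from typing import Optional

# Hypothetical replacement table: accented character -> corresponding ASCII
# character. The entries are illustrative; the source does not enumerate them.
REPLACEMENTS = {
    "\u00e9": "e",  # e with acute
    "\u00e8": "e",  # e with grave
    "\u00fc": "u",  # u with diaeresis
    "\u00f1": "n",  # n with tilde
    "\u00e7": "c",  # c with cedilla
}


def g(ch: str) -> str:
    """Per-character map: ASCII stays unchanged, a replaceable accented
    character becomes its ASCII counterpart (raises KeyError otherwise)."""
    return ch if ch.isascii() else REPLACEMENTS[ch]


def weight(s: str) -> Optional[str]:
    """Return f(S) for the two cases in the abstract, or None when S
    contains a character that is neither ASCII nor in the table."""
    if s.isascii():
        return s  # case 1: f(S) = S
    if all(ch.isascii() or ch in REPLACEMENTS for ch in s):
        return "".join(g(ch) for ch in s)  # case 2: f(S) = g(s1)g(s2)...g(sn)
    return None  # outside the two cases described above


def strings_equal(s: str, t: str) -> bool:
    """Test equality by comparing string weights."""
    ws, wt = weight(s), weight(t)
    if ws is None or wt is None:
        # Assumption: plain code-point comparison as a placeholder fallback.
        return s == t
    return ws == wt
```

With this table, `strings_equal("r\u00e9sum\u00e9", "resume")` holds because both strings have the weight `resume`. The sketch treats letter case as significant, since the abstract mentions only accent replacement.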
20 Claims
1. A method of comparing text strings having Unicode encoding, comprising:
- at a computer having one or more processors, and memory storing one or more programs configured for execution by the one or more processors:
- receiving a first text string S=s1s2 . . . sn having Unicode encoding and a second text string T=t1t2 . . . tm having Unicode encoding, wherein n and m are positive integers, and s1, s2, . . . , sn and t1, t2, . . . , tm are Unicode characters;
- computing, for the first text string S, a first string weight ƒ(S) according to a weight function ƒ, computed according to:
  - when it is determined that S consists entirely of ASCII characters, ƒ(S)=S; and
  - when it is determined that S consists of ASCII characters and one or more accented ASCII characters that are replaceable by corresponding ASCII characters, ƒ(S)=g(s1)g(s2) . . . g(sn), wherein g(si)=si when si is an ASCII character and g(si)=s′i when si is an accented ASCII character that is replaceable by the corresponding ASCII character s′i;
- computing a second string weight ƒ(T), for the second text string T, according to the weight function ƒ; and
- determining whether the first text string and the second text string are equal by comparing the first string weight to the second string weight.

Dependent claims: 2, 3, 4, 5, 6, 7
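For readability, the weight function recited in claim 1 can be restated in piecewise form. This is only a restatement under assumed notation: the set symbols A (the ASCII characters) and R (the accented characters replaceable by an ASCII character) are introduced here and do not appear in the claim.

```latex
f(S) =
\begin{cases}
S, & \text{if } s_i \in A \text{ for all } i, \\
g(s_1)\,g(s_2)\cdots g(s_n), & \text{if } s_i \in A \cup R \text{ for all } i \text{ and some } s_i \in R,
\end{cases}
\qquad
g(s_i) =
\begin{cases}
s_i, & s_i \in A, \\
s'_i, & s_i \in R.
\end{cases}
```

Equality of S and T is then determined by comparing f(S) with f(T). The claim does not define f for strings containing characters outside A and R.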
8. A computing device, comprising:
- one or more processors;
- memory; and
- one or more programs stored in the memory and configured for execution by the one or more processors, the one or more programs comprising instructions for:
  - receiving a first text string S=s1s2 . . . sn having Unicode encoding and a second text string T=t1t2 . . . tm having Unicode encoding, wherein n and m are positive integers, and s1, s2, . . . , sn and t1, t2, . . . , tm are Unicode characters;
  - computing, for the first text string S, a first string weight ƒ(S) according to a weight function ƒ, computed according to:
    - when it is determined that S consists entirely of ASCII characters, ƒ(S)=S; and
    - when it is determined that S consists of ASCII characters and one or more accented ASCII characters that are replaceable by corresponding ASCII characters, ƒ(S)=g(s1)g(s2) . . . g(sn), wherein g(si)=si when si is an ASCII character and g(si)=s′i when si is an accented ASCII character that is replaceable by the corresponding ASCII character s′i;
  - computing a second string weight ƒ(T), for the second text string T, according to the weight function ƒ; and
  - determining whether the first text string and the second text string are equal by comparing the first string weight to the second string weight.

Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitory computer readable storage medium storing one or more programs configured for execution by a computing device having one or more processors and memory, the one or more programs comprising instructions for:
- receiving a first text string S=s1s2 . . . sn having Unicode encoding and a second text string T=t1t2 . . . tm having Unicode encoding, wherein n and m are positive integers, and s1, s2, . . . , sn and t1, t2, . . . , tm are Unicode characters;
- computing, for the first text string S, a first string weight ƒ(S) according to a weight function ƒ, computed according to:
  - when it is determined that S consists entirely of ASCII characters, ƒ(S)=S; and
  - when it is determined that S consists of ASCII characters and one or more accented ASCII characters that are replaceable by corresponding ASCII characters, ƒ(S)=g(s1)g(s2) . . . g(sn), wherein g(si)=si when si is an ASCII character and g(si)=s′i when si is an accented ASCII character that is replaceable by the corresponding ASCII character s′i;
- computing a second string weight ƒ(T), for the second text string T, according to the weight function ƒ; and
- determining whether the first text string and the second text string are equal by comparing the first string weight to the second string weight.

Dependent claims: 16, 17, 18, 19, 20
Specification