FULL TEXT INDEXING IN A DATABASE SYSTEM
Abstract
A method for indexing with redundant information. The method may identify unknown code points for a document in response to an indexing request for the document. The method may further convert the identified unknown code points into a plurality of converted code points. Each set of converted code points of the plurality uses a different codepage. The method may further identify sets of same code points and sets of redundant code points from the plurality of converted code points. The method may build an index based on the sets of same code points and the sets of redundant code points.
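The steps summarized in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the text above does not confirm: the "unknown code points" are modeled as raw bytes whose encoding is not known, the candidate codepages (`cp1252`, `cp1251`, `cp437`) are arbitrary examples, a "same" set is read as a token that decodes identically under every candidate codepage, and a "redundant" set is read as a token whose decodings differ. The function names `classify` and `build_index` are hypothetical, and the patent's actual method may differ.

```python
from collections import defaultdict

# Hypothetical candidate codepages; the source does not name any.
CANDIDATE_CODEPAGES = ("cp1252", "cp1251", "cp437")


def classify(raw, codepages=CANDIDATE_CODEPAGES):
    """Decode each byte token under every codepage and sort it into
    'same' (all decodings agree) or 'redundant' (decodings differ)."""
    same, redundant = [], []
    # Split on ASCII whitespace first so token boundaries are
    # identical for every codepage.
    for token in raw.split():
        variants = {cp: token.decode(cp, errors="replace") for cp in codepages}
        if len(set(variants.values())) == 1:
            same.append(variants[codepages[0]])
        else:
            redundant.append(variants)
    return same, redundant


def build_index(doc_id, raw, index=None):
    """Add one document to an inverted index (term -> set of doc ids).

    'Same' tokens are stored once; 'redundant' tokens are stored once
    per distinct decoding, so a query typed in any candidate codepage
    can still find the document.
    """
    index = defaultdict(set) if index is None else index
    same, redundant = classify(raw)
    for term in same:
        index[term].add(doc_id)
    for variants in redundant:
        for term in set(variants.values()):
            index[term].add(doc_id)
    return index


# Example: an ASCII word followed by six bytes of unknown encoding.
raw_document = b"index \xef\xf0\xe8\xe2\xe5\xf2"
same_tokens, redundant_tokens = classify(raw_document)
doc_index = build_index("doc1", raw_document)
```

In this sketch the ASCII token lands in the "same" group because all three codepages agree on it, while the six high bytes produce a different string under each codepage and are indexed redundantly. The cost is a larger index in exchange for not having to guess the document's encoding at indexing time.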
13 Citations
20 Claims
1. A processor-implemented method for indexing with redundant information, the method comprising:
- identifying, by a processor, a plurality of unknown code points for a document in response to an indexing request for the document;
- converting the identified plurality of unknown code points into a plurality of converted code points, wherein each of the plurality of converted code points uses a different codepage;
- identifying sets of same code points and sets of redundant code points from the plurality of converted code points; and
- building an index based on the identified sets of same code points and the identified sets of redundant code points.
Dependent claims: 2, 3, 4, 5, 6, 7.
8. A computer system for indexing with redundant information, the computer system comprising:
one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage medium, and program instructions stored on at least one of the one or more tangible storage medium for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising:
- identifying, by a processor, plurality of unknown code points for a document in response to an indexing request for the document;
- converting the identified plurality of unknown code points into a plurality of converted code points, wherein each of the plurality of converted code points uses a different codepage;
- identifying sets of same code points and sets of redundant code points from the plurality of converted code points; and
- building an index based on the identified sets of same code points and the identified sets of redundant code points.
Dependent claims: 9, 10, 11, 12, 13, 14.
15. A computer program product for indexing with redundant information, the computer program product comprising:
one or more computer-readable storage medium and program instructions stored on at least one of the one or more tangible storage medium, the program instructions executable by a processor, the program instructions comprising:
- program instructions to identify, by a processor, plurality of unknown code points for a document in response to an indexing request for the document;
- program instructions to convert the identified plurality of unknown code points into a plurality of converted code points, wherein each of the plurality of converted code points uses a different codepage;
- program instructions to identify sets of same code points and sets of redundant code points from the plurality of converted code points; and
- program instructions to build an index based on the identified sets of same code points and the identified sets of redundant code points.
Dependent claims: 16, 17, 18, 19, 20.
Specification