SEARCH-BASED WORD SEGMENTATION METHOD AND DEVICE FOR LANGUAGE WITHOUT WORD BOUNDARY TAG
Abstract
The present invention discloses a search-based word segmentation method and device for a language without a word boundary tag. The inventive method includes the steps of: a. providing at least one search engine with a segment of a text including at least one segment; b. searching for the segment through the at least one search engine, and returning search results; and c. selecting a word segmentation approach for the segment in accordance with at least part of the returned search results. The invention solves the problems of word segmentation for a language without a word boundary tag, and thus overcomes the limitations of the prior art in terms of flexibility, dependence upon dictionary coverage, availability of training corpora, handling of new words, etc.
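Steps a to c can be illustrated with a minimal Python sketch. It is not the patented implementation: `toy_search_engine` is a hypothetical stand-in that returns invented, canned results instead of querying a real search engine, and the majority vote in `select_segmentation` is an assumed selection rule that the abstract does not specify. All function and type names are illustrative.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence, Tuple

# One hit = how the queried characters were split into matched units
# in a single returned search result.
Segmentation = Tuple[str, ...]
SearchEngine = Callable[[str], List[Segmentation]]


def toy_search_engine(segment: str) -> List[Segmentation]:
    """Hypothetical search engine (step b). The canned data is invented."""
    canned = {
        "研究生命": [
            ("研究", "生命"),
            ("研究", "生命"),
            ("研究生", "命"),
        ],
    }
    return canned.get(segment, [])


def select_segmentation(segment: str, hits: Sequence[Segmentation]) -> Segmentation:
    """Step c: choose the split seen most often among the returned hits.

    Majority voting is an assumption made for this sketch.
    """
    # Ignore hits that do not reassemble into the queried segment.
    valid = [hit for hit in hits if "".join(hit) == segment]
    if not valid:
        return (segment,)  # No usable evidence: leave the segment unsplit.
    best, _count = Counter(valid).most_common(1)[0]
    return best


def segment_text(segments: Iterable[str], engines: Sequence[SearchEngine]) -> List[str]:
    """Run steps a to c over every segment of a text."""
    words: List[str] = []
    for segment in segments:
        hits: List[Segmentation] = []
        for engine in engines:
            hits.extend(engine(segment))  # steps a and b: submit and search
        words.extend(select_segmentation(segment, hits))  # step c
    return words


if __name__ == "__main__":
    print(segment_text(["研究生命"], [toy_search_engine]))
```

Because `segment_text` accepts a sequence of engines, results from more than one search engine can be pooled before the selection step, which mirrors the "at least one search engine" wording.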
21 Claims
1. A search-based word segmentation method for a language without a word boundary tag, comprising the steps of:
a. providing at least one search engine with a segment of a text comprising at least one segment;
b. searching for the segment through the at least one search engine, and returning search results; and
c. selecting a word segmentation approach for the segment in accordance with at least part of the returned search results.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A search-based word segmentation device for a language without a word boundary tag, comprising:
at least one search engine, adapted to receive a segment of a text comprising at least one segment, to search in a search network for the segment, and to return search results; and
a word segmentation result generating means, adapted to select a word segmentation approach for the segment in accordance with at least part of the returned search results.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. A computer program product which can be stored on a computer readable storage medium and executed by a computer to perform a search-based word segmentation method for a language without a word boundary tag, wherein said method comprises the steps of:
a. providing at least one search engine with a segment of a text comprising at least one segment;
b. searching for the segment through the at least one search engine, and returning search results; and
c. selecting a word segmentation approach for the segment in accordance with at least part of the returned search results.
Specification