Method and system for identifying sentence boundaries
Abstract
The present invention is directed to systems and methods for identifying the boundaries between sentences in text. Sentences of normalized document feeds or source text are separated by a Bayesian algorithm that determines the boundaries between individual sentences. The algorithm is seeded with rule frequencies developed in a previous training phase, which used a text whose sentence boundaries had been marked.
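As a rough illustration of the approach the abstract describes, the sketch below scores each candidate boundary (a `.`, `!` or `?`) with a naive Bayes model seeded with rule frequencies. The two rules, the frequency table, the abbreviation list, the 0.5 threshold, and every function name are hypothetical stand-ins; the rules and frequencies actually used by the invention are not given in this section.

```python
import math
import re

# Hypothetical rule frequencies, as if produced by an earlier training phase:
# rule -> (times fired at a true boundary, times fired at a non-boundary).
RULE_COUNTS = {
    "next_word_capitalized": (90, 20),
    "prev_token_is_abbreviation": (5, 60),
}
BOUNDARY_TOTAL, NON_BOUNDARY_TOTAL = 100, 80  # assumed training totals
ABBREVIATIONS = {"dr", "mr", "mrs", "inc"}  # illustrative list only


def fired_rules(text, pos):
    """Return the rules that fire for the candidate punctuation at text[pos]."""
    rules = set()
    after = text[pos + 1:].lstrip()
    if after[:1].isupper():
        rules.add("next_word_capitalized")
    before = text[:pos].split()
    if before and before[-1].lower() in ABBREVIATIONS:
        rules.add("prev_token_is_abbreviation")
    return rules


def boundary_probability(text, pos):
    """Posterior P(boundary | rules), assuming the rules are independent."""
    fired = fired_rules(text, pos)
    log_b = math.log(BOUNDARY_TOTAL)
    log_n = math.log(NON_BOUNDARY_TOTAL)
    for rule, (b, n) in RULE_COUNTS.items():
        # Add-one smoothing keeps every probability strictly between 0 and 1.
        p_b = (b + 1) / (BOUNDARY_TOTAL + 2)
        p_n = (n + 1) / (NON_BOUNDARY_TOTAL + 2)
        log_b += math.log(p_b if rule in fired else 1 - p_b)
        log_n += math.log(p_n if rule in fired else 1 - p_n)
    return 1 / (1 + math.exp(log_n - log_b))


def split_sentences(text, threshold=0.5):
    """Cut the text at every candidate whose posterior reaches the threshold."""
    sentences, start = [], 0
    for match in re.finditer(r"[.!?]", text):
        pos = match.start()
        at_end = not text[pos + 1:].strip()  # final punctuation always closes
        if at_end or boundary_probability(text, pos) >= threshold:
            sentences.append(text[start:pos + 1].strip())
            start = pos + 1
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences
```

With these assumed frequencies, the period in "Dr." scores below the threshold because the abbreviation rule fires far more often at non-boundaries, so `split_sentences("Dr. Smith arrived. He sat down.")` keeps "Dr. Smith arrived." together as one sentence.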
6 Claims
1. A method for separating individual sentences in text, comprising:
- using a Bayesian algorithm to analyze the text; and
- classify boundaries of sentences within the text.
Dependent claims: 2, 3

4. A method for separating individual sentences in text, comprising:
- providing at least two rules to classify portions of the text as a sentence or a non-sentence;
- tagging a training text with sentence boundaries; and
- applying a Bayesian analysis to the training text to determine thresholds for marking sentence boundaries, whereby a program is populated with rules.
Dependent claims: 5, 6

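One possible reading of the training steps in claim 4 is sketched below: a tagged training text is scanned, each rule's firing frequency is counted at boundaries and non-boundaries, and Bayes' rule turns those counts into a per-rule probability. The `<S>` tag, the two rules, and all names are hypothetical. How the claimed thresholds are derived from such frequencies is not specified in this section, so the sketch stops at the per-rule posteriors.

```python
from collections import Counter

MARK = "<S>"  # hypothetical tag placed right after each true sentence boundary

# Two illustrative rules (the claim requires at least two); each inspects the
# token before and the token after a candidate period.
RULES = {
    "next_token_capitalized": lambda prev, nxt: nxt[:1].isupper(),
    "prev_token_short": lambda prev, nxt: len(prev.rstrip(".")) <= 3,
}


def train(tagged_text):
    """Count how often each rule fires at boundaries vs. non-boundaries."""
    tokens = tagged_text.split()
    fired = {name: Counter() for name in RULES}
    totals = Counter()
    for i, tok in enumerate(tokens):
        if tok == MARK or not tok.endswith("."):
            continue  # only tokens ending in a period are candidates
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        is_boundary = nxt == MARK
        if is_boundary:  # look past the tag to the real next word
            nxt = tokens[i + 2] if i + 2 < len(tokens) else ""
        totals[is_boundary] += 1
        for name, rule in RULES.items():
            if rule(tok, nxt):
                fired[name][is_boundary] += 1
    return fired, totals


def rule_posteriors(fired, totals):
    """P(boundary | rule fired) via Bayes' rule with add-one smoothing."""
    n = totals[True] + totals[False]
    prior = (totals[True] + 1) / (n + 2)
    posteriors = {}
    for name, counts in fired.items():
        like_b = (counts[True] + 1) / (totals[True] + 2)
        like_n = (counts[False] + 1) / (totals[False] + 2)
        evidence = like_b * prior + like_n * (1 - prior)
        posteriors[name] = like_b * prior / evidence
    return posteriors
```

A call such as `rule_posteriors(*train(tagged))` would return one probability per rule; a rule that fires mostly at non-boundaries (here, a short preceding token such as "Dr." or "Mr.") ends up with a low value.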
Specification