UNSUPERVISED STEMMING SCHEMA LEARNING AND LEXICON ACQUISITION FROM CORPORA
Abstract
Illustrative embodiments provide a computer implemented method, an apparatus, and a computer program product for unsupervised stemming schema learning and lexicon acquisition from corpora. In one illustrative embodiment, the computer implemented method obtains a corpus from corpora, analyzes the corpus to deduce a set of possible stemming schema, and reviews and revises that set to create a pruned set of stemming schema. The computer implemented method further deduces a lexicon from the corpus using the pruned set of stemming schema.
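The four steps named in the abstract (obtain, analyze, prune, deduce) can be illustrated with a minimal Python sketch. This is not the patented procedure, which this section does not detail. It assumes, purely for illustration, that a "stemming schema" is a set of suffixes shared by several stems, and that pruning drops schemas with too few supporting stems or whose suffixes all begin with the same letters (a sign the stem boundary was cut too early). The function names, thresholds, and sample words are all hypothetical.

```python
import os.path
from collections import defaultdict


def candidate_schemas(words, min_stem_len=3):
    """Analyze step (assumed form): group stems by the suffix set seen with them."""
    suffixes_by_stem = defaultdict(set)
    for word in words:
        # Try every split point that leaves a stem of at least min_stem_len.
        for cut in range(min_stem_len, len(word) + 1):
            suffixes_by_stem[word[:cut]].add(word[cut:])
    stems_by_schema = defaultdict(set)
    for stem, suffixes in suffixes_by_stem.items():
        if len(suffixes) > 1:  # a single suffix carries no alternation
            stems_by_schema[frozenset(suffixes)].add(stem)
    return dict(stems_by_schema)


def prune_schemas(stems_by_schema, min_stems=2):
    """Prune step (assumed heuristics): keep well-supported, well-cut schemas."""
    pruned = {}
    for schema, stems in stems_by_schema.items():
        # A shared leading string means the boundary belongs further right.
        shared = os.path.commonprefix(list(schema))
        if len(stems) >= min_stems and not shared:
            pruned[schema] = stems
    return pruned


def deduce_lexicon(pruned):
    """Lexicon step: map each surviving stem to the word forms its schema covers."""
    lexicon = defaultdict(set)
    for schema, stems in pruned.items():
        for stem in stems:
            for suffix in schema:
                lexicon[stem].add(stem + suffix)
    return dict(lexicon)


# Obtain step: a toy corpus stands in for a real one.
corpus = ["walk", "walks", "walked", "talk", "talks", "talked", "jump", "jumps"]
lexicon = deduce_lexicon(prune_schemas(candidate_schemas(corpus)))
for stem in sorted(lexicon):
    print(stem, sorted(lexicon[stem]))
```

On this toy corpus, the only schema that survives pruning is the suffix set {"", "s", "ed"}, supported by the stems "walk" and "talk". The stem "jump" is left out because its suffix set {"", "s"} is supported by only one stem under the assumed threshold.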
18 Claims
1. A computer implemented method for unsupervised stemming schema learning and lexicon acquisition from corpora, the computer implemented method comprising:
obtaining a corpus from a corpora;
analyzing the corpus to deduce a set of possible stemming schema;
reviewing and revising the set of possible stemming schema to create a pruned set of stemming schema; and
deducing a lexicon from the corpus using the pruned set of stemming schema.
(Dependent claims: 2, 3, 4, 5, 6)
7. A data processing system for unsupervised stemming schema learning and lexicon acquisition from corpora, the data processing system comprising:
a bus;
a memory connected to the bus;
a persistent storage connected to the bus, wherein the persistent storage having computer executable program code embodied therein;
a communications unit connected to the bus;
a display connected to the bus; and
a processor unit connected to the bus, wherein the processor unit executes the computer executable program code directing the data processing to:
obtain a corpus from a corpora;
analyze the corpus to deduce a set of possible stemming schema;
review and revise the set of possible stemming schema to create a pruned set of stemming schema; and
deduce a lexicon from the corpus using the pruned set of stemming schema.
(Dependent claims: 8, 9, 10, 11, 12)
13. A computer program product for unsupervised stemming schema learning and lexicon acquisition from corpora, the computer program product comprising a computer usable recordable medium having computer executable program code tangibly embodied thereon, the computer executable program code comprising:
computer executable program code for obtaining a corpus from a corpora;
computer executable program code for analyzing the corpus to deduce a set of possible stemming schema;
computer executable program code for reviewing and revising the set of possible stemming schema to create a pruned set of stemming schema; and
computer executable program code for deducing a lexicon from the corpus using the pruned set of stemming schema.
(Dependent claims: 14, 15, 16, 17, 18)
Specification