Adapter for allowing both online and offline training of a text to text system
Abstract
An adapter for a text to text training. A main corpus is used for training, and a domain specific corpus is used to adapt the main corpus according to the training information in the domain specific corpus. The adaptation is carried out using a technique that may be faster than the main training. The parameter set from the main training is adapted using the domain specific part.
24 Claims
1. A computer implemented method, comprising:
- first carrying out a first generic training using at least one corpus of language information based at least in part on Internet information, using a first generic training operation to obtain a first generic parameter set;
- second carrying out a second domain specific training using a fast train module associated with a domain specific corpus, said fast train module including a second domain specific training operation which operates faster than said first generic training operation, and which is less accurate than said first generic training operation, to obtain a second domain specific parameter set;
- merging said first generic parameter set and said second domain specific parameter set into a merged parameter set, and using said merged parameter set for a text to text operation, wherein said merging comprises a weighted merge between said first generic parameter set and said second domain specific parameter set; and
- using said second domain specific parameter set to adapt said first generic parameter set to carry out said text to text operation, wherein said using comprises using partial information from the first generic training and partial information from the second domain specific training, forming an original table and an override table, and using both said original table and said override table as part of said text to text operation.
Dependent claims: 2 to 14.
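The weighted merge recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify a formula, so the linear interpolation used here is an assumption, as are the representation of a parameter set as a dictionary of phrase-pair probabilities and the names `weighted_merge` and `domain_weight`.

```python
from typing import Dict, Tuple

# Hypothetical representation: (source phrase, target phrase) -> probability.
PhrasePair = Tuple[str, str]
ParameterSet = Dict[PhrasePair, float]


def weighted_merge(
    generic: ParameterSet,
    domain: ParameterSet,
    domain_weight: float = 0.7,
) -> ParameterSet:
    """Linearly interpolate two parameter sets (assumed merge rule).

    Entries missing from one set are treated as 0.0 in that set.
    """
    if not 0.0 <= domain_weight <= 1.0:
        raise ValueError("domain_weight must be between 0 and 1")
    merged: ParameterSet = {}
    for pair in generic.keys() | domain.keys():
        generic_value = generic.get(pair, 0.0)
        domain_value = domain.get(pair, 0.0)
        merged[pair] = (1.0 - domain_weight) * generic_value + domain_weight * domain_value
    return merged


# Illustrative values only; not taken from the patent.
generic_params = {("bank", "banque"): 0.8, ("bank", "rive"): 0.2}
domain_params = {("bank", "rive"): 0.9, ("bank", "berge"): 0.1}
merged_params = weighted_merge(generic_params, domain_params, domain_weight=0.5)
```

With equal weights, the domain specific evidence raises the score of ("bank", "rive") above that of ("bank", "banque"), which is the kind of adaptation the claim describes. A real system would likely tune the weight on held-out domain data.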
15. An apparatus, comprising:
- a first training computer at a first location, carrying out a first generic training using at least one corpus of information based at least in part on Internet information, using a first generic training operation to obtain a first generic parameter set; and
- a second training computer, at a second location, different than the first location, carrying out a second domain specific training using a fast train module associated with a domain specific corpus that has different information than said at least one corpus, said fast train module including a second domain specific training operation which operates faster than said first generic training operation, and which is less accurate than said first generic training operation, to obtain a second domain specific parameter set, and using said first generic parameter set and said second domain specific parameter set together for a text to text operation,
- wherein said second training computer also operates to merge said first generic parameter set and said second domain specific parameter set into a merged parameter set, to use said merged parameter set for said text to text operation, and to carry out a weighted merge between said first generic parameter set and said second domain specific parameter set, and
- wherein said second training computer uses partial information from the first generic training and partial information from the second domain specific training, forms an original table and an override table, and uses both said original table and said override table as part of said text to text operation.
Dependent claims: 16 to 20.
21. An apparatus, comprising:
- a training part including at least one computer, which carries out a first generic training for a text to text operation using at least one corpus of training information based at least in part on Internet information, to obtain a first generic parameter set and at a different time than first generic training, carrying out a second domain specific training using a fast train module associated with a domain specific corpus that has different information than said at least one corpus, said fast train module including a second domain specific training operation which operates faster than said first generic training operation, and which is less accurate than said first generic training operation, to obtain a second domain specific parameter set and using said second domain specific parameter set to adapt said first generic parameter set to create an adapted parameter set, and to use the adapted parameter set for a text to text operation,
- wherein said at least one training computer merges said first generic parameter set and said second domain specific parameter set into a merged parameter set, and uses said merged parameter set for said text to text operation, and carries out a weighted merge between said first generic parameter set and said second domain specific parameter set, and
- wherein said at least one training computer uses partial information from the first generic training and partial information from the second domain specific training, forms an original table and an override table, and uses both said original table and said override table as part of said text to text operation.
Dependent claims: 22 to 24.
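Each independent claim recites forming an original table and an override table and using both in the text to text operation. The claims do not say how the two tables are consulted, so the following is only a sketch under one assumed policy: the override table, built from the domain specific training, is checked first, and the original table from the generic training serves as the fallback. The table layout and the names `lookup` and `best_translation` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: source phrase -> list of (target phrase, score).
Table = Dict[str, List[Tuple[str, float]]]


def lookup(source_phrase: str, original_table: Table, override_table: Table) -> List[Tuple[str, float]]:
    """Return candidates from the override table if present, else from the original table."""
    if source_phrase in override_table:
        return override_table[source_phrase]
    return original_table.get(source_phrase, [])


def best_translation(source_phrase: str, original_table: Table, override_table: Table) -> Optional[str]:
    """Pick the highest-scoring candidate, or None if the phrase is unknown to both tables."""
    candidates = lookup(source_phrase, original_table, override_table)
    if not candidates:
        return None
    return max(candidates, key=lambda candidate: candidate[1])[0]


# Illustrative values only; not taken from the patent.
original_table: Table = {
    "bank": [("banque", 0.8), ("rive", 0.2)],
    "river": [("rivière", 0.9)],
}
override_table: Table = {
    "bank": [("rive", 0.9), ("berge", 0.1)],
}
```

Under this assumed policy the generic table is left unmodified, so the domain adaptation can be added or removed by swapping the override table alone.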
Specification