WORD HASH LANGUAGE MODEL
First Claim
1. A computer-implemented method for selecting a word, the method comprising:
obtaining a word hash vector for each word of a vocabulary;
receiving a first sequence of words;
generating a first sequence of word hash vectors by obtaining a word hash vector for each word of the first sequence of words;
processing the first sequence of word hash vectors with a layer of a neural network language model to compute a first output vector;
quantizing the first output vector to obtain a first output word hash vector;
determining a distance between the first output word hash vector and a first hash vector for a first word in the vocabulary;
selecting the first word from the vocabulary using the distance between the first output word hash vector and the first hash vector for the first word;
generating a second sequence of words using the first sequence of words and the first word;
generating a second sequence of word hash vectors by obtaining a word hash vector for each word of the second sequence of words;
processing the second sequence of word hash vectors with the layer of the neural network language model to compute a second output vector;
quantizing the second output vector to obtain a second output word hash vector;
determining a distance between the second output word hash vector and a second hash vector of a second word in the vocabulary; and
selecting the second word from the vocabulary using the distance between the second output word hash vector and the second hash vector for the second word.
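The two passes recited in the claim (process, quantize, measure distance, select, then repeat with the extended sequence) can be sketched as a short loop. This is only an illustrative sketch under stated assumptions, not the patented implementation: the claim does not specify the quantizer or the distance, so thresholding at 0.5 and Hamming distance are assumed here, and `toy_layer`, `WORD_HASHES`, and every function name are hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Sequence, Tuple

HashVector = Tuple[int, ...]

def quantize(output_vector: Sequence[float], threshold: float = 0.5) -> HashVector:
    """Assumed quantizer: threshold each real-valued element to a bit."""
    return tuple(1 if x >= threshold else 0 for x in output_vector)

def hamming_distance(a: HashVector, b: HashVector) -> int:
    """Assumed distance: number of positions where two hash vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def select_word(output_hash: HashVector, word_hashes: Dict[str, HashVector]) -> str:
    """Pick the vocabulary word whose hash vector is closest to the output hash."""
    return min(word_hashes, key=lambda w: hamming_distance(output_hash, word_hashes[w]))

def generate(words: Sequence[str],
             layer: Callable[[List[HashVector]], Sequence[float]],
             word_hashes: Dict[str, HashVector],
             steps: int = 2) -> List[str]:
    """Repeat: hash the sequence, run the layer, quantize, select, append."""
    sequence = list(words)
    for _ in range(steps):
        hash_sequence = [word_hashes[w] for w in sequence]
        output_vector = layer(hash_sequence)
        output_hash = quantize(output_vector)
        sequence.append(select_word(output_hash, word_hashes))
    return sequence

# Hypothetical 4-bit word hash vectors for a toy vocabulary.
WORD_HASHES: Dict[str, HashVector] = {
    "the": (0, 0, 1, 1),
    "cat": (0, 1, 0, 1),
    "sat": (1, 0, 0, 1),
    "down": (1, 1, 1, 0),
}

def toy_layer(hash_sequence: List[HashVector]) -> List[float]:
    """Stand-in for a trained layer: emits a soft version of the next word's hash."""
    codes = list(WORD_HASHES.values())
    nxt = codes[(codes.index(hash_sequence[-1]) + 1) % len(codes)]
    return [0.1 + 0.8 * bit for bit in nxt]

print(generate(["the"], toy_layer, WORD_HASHES))  # ['the', 'cat', 'sat']
```

The second iteration of the loop corresponds to the "second sequence of words" in the claim: the first selected word is appended to the input before the same layer is applied again.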
Abstract
A language model may be used in a variety of natural language processing tasks, such as speech recognition, machine translation, sentence completion, part-of-speech tagging, parsing, handwriting recognition, or information retrieval. A natural language processing task may use a vocabulary of words, and a word hash vector may be created for each word in the vocabulary. A sequence of input words may be received, and a hash vector may be obtained for each word in the sequence. A language model may process the hash vectors for the sequence of input words to generate an output hash vector that describes words that are likely to follow the sequence of input words. One or more words may then be selected using the output word hash vector and used for a natural language processing task.
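As a rough illustration of the two ends of this pipeline, the sketch below builds a hash vector for each vocabulary word and then selects one or more words closest to an output hash vector. The abstract does not say how the hash vectors are created or compared, so everything here is an assumption made for illustration: bits are taken from a SHA-256 digest, closeness is measured by Hamming distance, and the names `word_hash_vector`, `build_hash_table`, and `nearest_words` are hypothetical.

```python
import hashlib
from typing import Dict, List, Sequence, Tuple

HashVector = Tuple[int, ...]

def word_hash_vector(word: str, num_bits: int = 32) -> HashVector:
    """Assumed construction: take the first num_bits bits of the word's SHA-256 digest."""
    if not 0 < num_bits <= 256:
        raise ValueError("num_bits must be between 1 and 256")
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    bits = [(byte >> shift) & 1 for byte in digest for shift in range(8)]
    return tuple(bits[:num_bits])

def build_hash_table(vocabulary: Sequence[str], num_bits: int = 32) -> Dict[str, HashVector]:
    """Create one word hash vector per vocabulary word."""
    return {word: word_hash_vector(word, num_bits) for word in vocabulary}

def hamming_distance(a: HashVector, b: HashVector) -> int:
    """Count the positions where two hash vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_words(output_hash: HashVector,
                  table: Dict[str, HashVector],
                  k: int = 3) -> List[str]:
    """Return the k vocabulary words whose hash vectors are closest to output_hash."""
    ranked = sorted(table, key=lambda w: hamming_distance(output_hash, table[w]))
    return ranked[:k]

table = build_hash_table(["speech", "translation", "parsing", "retrieval"])
# Querying with a word's own hash vector returns a word at distance zero first.
print(nearest_words(table["parsing"], table, k=2))
```

A digest-based code carries no information about word similarity; it only shows the data flow. A practical system would presumably choose hash vectors so that the distance is meaningful for the task.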
31 Citations
20 Claims
1. A computer-implemented method for selecting a word, the method comprising the steps recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
9. A system for selecting a word, the system comprising:
at least one computer comprising at least one processor and at least one memory, the at least one computer configured to:
compute a word hash vector for each word of a vocabulary of words;
receive a sequence of words;
generate a sequence of word hash vectors by obtaining a word hash vector for each word of the sequence of words;
process the sequence of word hash vectors with a layer of a language model to compute an output vector;
quantize the output vector to obtain an output word hash vector;
determine a distance between the output word hash vector and a first hash vector for a first word in the vocabulary; and
select the first word from the vocabulary using the distance between the output word hash vector and the first hash vector for the first word.
Dependent claims: 10, 11, 12, 13, 14, 15, 16.
17. One or more non-transitory computer-readable media comprising computer executable instructions that, when executed, cause at least one processor to perform actions comprising:
obtaining a word hash vector for each word of a vocabulary;
receiving a sequence of words;
generating a sequence of word hash vectors by obtaining a word hash vector for each word of the sequence of words;
processing the sequence of word hash vectors with a layer of a language model to compute an output vector;
quantizing the output vector to obtain an output word hash vector;
determining a distance between the output word hash vector and a first hash vector for a first word in the vocabulary; and
selecting the first word from the vocabulary using the output word hash vector and a first hash vector for the first word.
Dependent claims: 18, 19, 20.
Specification