Sequence transcription with deep neural networks
First Claim
Patent Images
1. A computer-implemented method comprising:
- training, by one or more computing devices, a neural network implemented by one or more processors of a system to map a plurality of training images received by the neural network into a probabilistic model of sequences comprising P(S|X) by maximizing log P(S|X) on the plurality of training images, wherein the training images respectively contain a sequence of characters having a sequence length that is greater than 1, (and wherein each character in the sequence is a discrete variable having a finite number of multiple possible values,) wherein X represents an input image and S represents an output sequence of characters for the input image, and wherein each character in the sequence is a discrete variable having a finite number of multiple possible values, the neural network and the plurality of training images being configured to assume a predetermined maximum sequence length that is greater than 1, wherein each of the one or more computing devices comprises one or more processors;
receiving, by the one or more computing devices, an image containing characters associated with building numbers;
processing, by the one or more computing devices, with the trained neural network the received image containing characters associated with building numbers; and
generating, by the one or more computing devices, with the trained neural network a predicted sequence of characters based at least in part on the processing of the received image.
2 Assignments
0 Petitions
Accused Products
Abstract
Systems and methods for sequence transcription with neural networks are provided. More particularly, a neural network can be implemented to map a plurality of training images received by the neural network into a probabilistic model of sequences comprising P(S|X) by maximizing log P(S|X) on the plurality of training images. X represents an input image and S represents an output sequence of characters for the input image. The trained neural network can process a received image containing characters associated with building numbers. The trained neural network can generate a predicted sequence of characters by processing the received image.
48 Citations
20 Claims
-
1. A computer-implemented method comprising:
-
training, by one or more computing devices, a neural network implemented by one or more processors of a system to map a plurality of training images received by the neural network into a probabilistic model of sequences comprising P(S|X) by maximizing log P(S|X) on the plurality of training images, wherein the training images respectively contain a sequence of characters having a sequence length that is greater than 1, (and wherein each character in the sequence is a discrete variable having a finite number of multiple possible values,) wherein X represents an input image and S represents an output sequence of characters for the input image, and wherein each character in the sequence is a discrete variable having a finite number of multiple possible values, the neural network and the plurality of training images being configured to assume a predetermined maximum sequence length that is greater than 1, wherein each of the one or more computing devices comprises one or more processors; receiving, by the one or more computing devices, an image containing characters associated with building numbers; processing, by the one or more computing devices, with the trained neural network the received image containing characters associated with building numbers; and generating, by the one or more computing devices, with the trained neural network a predicted sequence of characters based at least in part on the processing of the received image. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. A system comprising:
-
one or more processors; one or more memories; and machine-readable instructions stored in the one or more memories, that upon execution by the one or more processors cause the system to carry out operations comprising; training a neural network implemented by one or more processors of a system to map a plurality of training images received by the neural network into a probabilistic model of sequences comprising P(S|X) by maximizing log P(S|X) on the plurality of training images, wherein X represents an input image and S represents an output sequence of characters for the input image, and wherein each character in the sequence is a discrete variable having a finite number of multiple possible values, the neural network and the plurality of training images being configured to assume a predetermined maximum sequence length that is greater than 1; receiving, by the one or more computing devices, an image containing characters associated with building numbers; processing, by the one or more computing devices, with the trained neural network the received image containing characters associated with building numbers; and generating, by the one or more computing devices, with the trained neural network a predicted sequence of characters based at least in part on the processing of the received image. - View Dependent Claims (12, 13, 14, 15, 16, 17)
-
-
18. A non-transitory computer-readable medium storing instructions that, when executed by one or more computing devices, cause the one or more computing devices to perform operations, the operations comprising:
-
training a neural network implemented by one or more processors of a system to map a plurality of training images received by the neural network into a probabilistic model of sequences comprising P(S|X) by maximizing log P(S|X) on the plurality of training images, wherein X represents an input image and S represents an output sequence of characters for the input image, and wherein each character in the sequence is a discrete variable having a finite number of multiple possible values, the neural network and the plurality of training images being configured to assume a predetermined maximum sequence length that is greater than 1; receiving, by the one or more computing devices, an image containing characters associated with building numbers; processing, by the one or more computing devices, with the trained neural network the received image containing characters associated with building numbers; and generating, by the one or more computing devices, with the trained neural network a predicted sequence of characters based at least in part on the processing of the received image. - View Dependent Claims (19, 20)
-
Specification