Robot apparatus, method and device for recognition of letters or characters, control program and recording medium
First Claim
1. A robot apparatus acting autonomously based on an inner state of the robot apparatus, comprising:
storage means for speech recognition, as a dictionary for speech recognition, having stored therein the relationship of correspondence between a word and the pronunciation information thereof;
word sound expression storage means, as a table for word sound expressions, having stored therein the relationship of correspondence between the word and the word reading expressing letters thereof;
imaging means for photographing an object;
image recognition means for extracting the predetermined patterns of images from the image photographed by said imaging means;
sound collecting means for acquiring the surrounding sound;
speech recognition means for recognizing the speech from the sound collected by said sound collecting means;
reading information generating means for conferring plural word reading expressing letters, inferred from the predetermined patterns of images extracted by said image recognition means, based on said table for word sound expressions, and for generating the pronunciation information corresponding to the reading for each of the plural word reading expressing letters or characters thus conferred; and
storage controlling means for comparing the pronunciation information generated by said reading information generating means to the pronunciation information of the speech recognized by said speech recognition means and newly storing the closest information of pronunciation in said dictionary for speech recognition as being the pronunciation information corresponding to the pattern recognition result extracted by said image recognition means.
Abstract
A plurality of letters or characters inferred from the results of letter/character recognition of an image photographed by a CCD camera (20), a plurality of kana readings inferred from those letters or characters, and the pronunciations corresponding to the kana readings are generated in a pronunciation information generating unit (150). The readings so obtained are matched against the user's pronunciation, acquired by a microphone (23), to select one kana reading and its pronunciation from among the generated candidates.
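The candidate generation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the table contents, the name `candidate_readings`, and the use of romanized readings in place of kana are all hypothetical choices made for the example.

```python
from itertools import product

# Hypothetical table for word sound expressions: each recognized character
# maps to one or more possible readings (romanized here for readability).
READING_TABLE = {
    "山": ["yama", "san"],
    "川": ["kawa", "sen"],
}

def candidate_readings(char_alternatives, table):
    """Expand per-position character alternatives into (text, reading) pairs.

    char_alternatives holds one list of candidate characters per position,
    as a character recognizer might return when it is unsure.
    """
    candidates = []
    for chars in product(*char_alternatives):
        # Skip strings containing a character with no known reading.
        if any(c not in table for c in chars):
            continue
        # Combine every reading of every character, position by position.
        for readings in product(*(table[c] for c in chars)):
            candidates.append(("".join(chars), "".join(readings)))
    return candidates

# Two character positions, one recognized alternative for each.
result = candidate_readings([["山"], ["川"]], READING_TABLE)
print(result)
```

Each pair in `result` is one candidate that would later be matched against the user's spoken pronunciation; with two readings per character, the example yields four candidates.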
17 Claims
1. A robot apparatus acting autonomously based on an inner state of the robot apparatus, comprising:
storage means for speech recognition, as a dictionary for speech recognition, having stored therein the relationship of correspondence between a word and the pronunciation information thereof;
word sound expression storage means, as a table for word sound expressions, having stored therein the relationship of correspondence between the word and the word reading expressing letters thereof;
imaging means for photographing an object;
image recognition means for extracting the predetermined patterns of images from the image photographed by said imaging means;
sound collecting means for acquiring the surrounding sound;
speech recognition means for recognizing the speech from the sound collected by said sound collecting means;
reading information generating means for conferring plural word reading expressing letters, inferred from the predetermined patterns of images extracted by said image recognition means, based on said table for word sound expressions, and for generating the pronunciation information corresponding to the reading for each of the plural word reading expressing letters or characters thus conferred; and
storage controlling means for comparing the pronunciation information generated by said reading information generating means to the pronunciation information of the speech recognized by said speech recognition means and newly storing the closest information of pronunciation in said dictionary for speech recognition as being the pronunciation information corresponding to the pattern recognition result extracted by said image recognition means.
Dependent claims: 2, 3, 4, 5.
6. A letter/character recognition device comprising:
storage means for speech recognition, as a dictionary for speech recognition, having stored therein the relationship of correspondence between a word and the pronunciation information thereof;
word sound expression storage means, as a table for word sound expressions, having stored therein the relationship of correspondence between the word and the word reading expressing letters thereof;
imaging means for photographing an object;
image recognition means for extracting the predetermined patterns of images from the image photographed by said imaging means;
sound collecting means for acquiring the surrounding sound;
speech recognition means for recognizing the speech from the sound collected by said sound collecting means;
reading information generating means for conferring plural word reading expressing letters, inferred from the predetermined patterns of images extracted by said image recognition means, based on said table for word sound expressions, and for generating the pronunciation information for each of the plural word reading expressing letters or characters thus conferred; and
storage controlling means for comparing the pronunciation information generated by said reading information generating means to the speech information of the speech recognized by said speech recognition means and newly storing the closest information of pronunciation in said dictionary for speech recognition as being the pronunciation information corresponding to the pattern recognition result extracted by said image recognition means.
Dependent claims: 7, 8, 9, 10.
11. A letter/character recognition method comprising:
an imaging step of imaging an object;
an image recognition step of extracting predetermined patterns of images from an image photographed by said imaging step;
a sound collecting step of collecting the surrounding sound;
a speech recognition step of recognizing the speech from the sound acquired by said sound collecting step;
a reading information generating step of conferring plural word reading expressing letters, inferred from the predetermined patterns of images extracted by said image recognition step, based on a table for word sound expressions, having stored therein the relationship of correspondence between a word and a sound expressing letter/character for said word, and for generating the pronunciation information for each of the plural word reading expressing letters or characters thus conferred; and
a storage controlling step of comparing the pronunciation information generated by said reading information generating means to the speech information of the speech recognized by said speech recognition step and newly storing the closest information of pronunciation in said dictionary for speech recognition as being the pronunciation information corresponding to the pattern recognition result extracted by said image recognition step.
Dependent claims: 12, 13, 14, 15.
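The storage controlling step of the method claim can be illustrated with a short sketch. The claim does not say how "closest" is measured; the Levenshtein edit distance over phoneme sequences used here is an assumption chosen for illustration, and the names `edit_distance`, `store_closest`, and `speech_dictionary`, as well as the sample phoneme data, are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (here, phoneme lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x from a
                curr[j - 1] + 1,     # insert y into a
                prev[j - 1] + cost,  # substitute x with y, or match
            ))
        prev = curr
    return prev[-1]

def store_closest(word, candidates, heard, dictionary):
    """Store the candidate pronunciation closest to the heard one."""
    best = min(candidates, key=lambda p: edit_distance(p, heard))
    dictionary[word] = best  # newly register the word in the dictionary
    return best

# Hypothetical phoneme sequences for two candidate readings of one word.
candidates = [list("yamakawa"), list("sansen")]
heard = list("yamagawa")  # speech recognition result with one phoneme off
speech_dictionary = {}
chosen = store_closest("山川", candidates, heard, speech_dictionary)
print("".join(chosen))
```

A real system would more likely compare acoustic scores or phoneme lattices than plain strings; the sketch only shows the select-and-store control flow that the claim recites.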
16. A computer readable medium having stored therein a control program for having a robot apparatus execute:
an imaging step of imaging an object;
an image recognition step of extracting predetermined patterns of images from an image photographed by said imaging step;
a sound collecting step of collecting the surrounding sound;
a speech recognition step of recognizing the speech from the sound acquired by said sound collecting step;
a pronunciation information generating step of conferring plural word reading expressing letters, inferred from the predetermined patterns of images extracted by said image recognition step, based on a table for word sound expressions, having stored therein the relationship of correspondence between a word and a sound expressing letter/character for said word, and for generating, for each of the plural word reading expressing letters or characters thus conferred, the pronunciation information; and
a storage step of comparing the pronunciation information generated by said reading information generating means to the speech information of the speech recognized by said speech recognition step and newly storing the closest information of pronunciation in said dictionary for speech recognition as being the pronunciation information corresponding to the pattern recognition result extracted by said image recognition step.
17. A computer readable medium having recorded therein a control program for having a robot apparatus execute:
an imaging step of imaging an object;
an image recognition step of extracting predetermined patterns of images from an image photographed by said imaging step;
a sound collecting step of collecting the surrounding sound;
a speech recognition step of recognizing the speech from the sound acquired by said sound collecting step;
a pronunciation information generating step of conferring plural word reading expressing letters, inferred from the predetermined patterns of images extracted by said image recognition step, based on a table for word sound expressions, having stored therein the relationship of correspondence between a word and a sound expressing letter/character for said word, and for generating the pronunciation information for each of the plural word reading expressing letters or characters thus conferred; and
a storage step of comparing the pronunciation information generated by said pronunciation information generating step to the speech information of the speech recognized by said speech recognition step and newly storing the closest information of pronunciation in said dictionary for speech recognition as being the pronunciation information corresponding to the pattern recognition result extracted by said image recognition step.
Specification