Image manipulation

US 6,133,904 A
Filed: 02/04/1997
Issued: 10/17/2000
Est. Priority Date: 02/09/1996
Status: Expired due to Term

Abstract

An apparatus for manipulating the colour of an image is provided, having a microphone for providing electrical speech signals representative of a user command, a speech recognition unit for recognizing the input speech signal, a command interpreter for interpreting the recognized speech, a graphics package responsive to the command interpreter and a display for displaying the current image being edited. The apparatus accepts other inputs, for example, from a pointing device.
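The abstract describes a chain from recognised speech to a command interpreter to a graphics package. A minimal sketch of the interpreter stage might look like the following. The colour vocabulary, the `ColourCommand` type, and the `interpret` function are hypothetical names chosen for illustration; the record does not say how the interpreter is implemented.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical colour vocabulary: colour names mapped to hue angles in degrees.
COLOUR_HUES = {"red": 0.0, "yellow": 60.0, "green": 120.0, "blue": 240.0}


@dataclass(frozen=True)
class ColourCommand:
    """Instruction handed from the command interpreter to the graphics package."""
    target_hue: float  # desired hue in degrees


def interpret(words: Sequence[str]) -> Optional[ColourCommand]:
    """Turn a recognised word sequence into a colour command, or None if unclear."""
    colours = [w for w in words if w in COLOUR_HUES]
    if len(colours) != 1:
        return None  # no colour word, or an ambiguous request
    return ColourCommand(target_hue=COLOUR_HUES[colours[0]])
```

For example, `interpret(["make", "it", "blue"])` yields a command with a target hue of 240 degrees, while a word sequence with no colour word yields `None`.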

28 Claims

1. An image manipulation apparatus comprising:
- means for reproducing an image;
  
  a speech recognition user interface for allowing a user to input a speech signal comprising a description of a desired change to be made to the reproduced image;
  
  means for interpreting a recognition result output from the speech recognition interface; and
  
  changing means responsive to the interpreting means for changing the colour of one or more parts of the reproduced image in order to effect a manipulation desired by the user;
  
  wherein said description comprises a number of continuously spoken words;
  
  wherein said speech recognition user interface comprises;
  
  a memory for storing a plurality of reference word models, each representative of a word, and for storing a language model which defines sequences of the reference word models which can be matched with the input speech signal, in order to define input speech commands;
  
  matching means for matching the input speech signal with selected sequences of said word models, selected in accordance with the stored language model;
  
  recognition means, responsive to said matching means, for providing a recognition result based upon a likely sequence of reference models that corresponds to an input utterance;
  
  receive means for receiving a new input speech command comprising two or more whole words;
  
  means for generating a word model for each of the words contained within the new input speech command, if they do not already exist; and
  
  means for adapting said language model to incorporate said new input speech command.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. An apparatus according to claim 1, wherein said means for reproducing the image comprises a display device, and wherein the system further comprises means for designating one or more areas of the displayed image to be altered.
  - 3. An apparatus according to claim 2 wherein said designating means is a pointing device which is moved across an area to be changed, and wherein said changing means is arranged to determine the statistical characteristics of the hue values of the pixels traced by said pointing device, and wherein said changing means is arranged to change the hue value of all pixels in the image having a hue value similar to the determined statistics.
  - 4. An apparatus according to claim 3, wherein said changing means determines the average and standard deviation of the pixels traced by said pointing device, and wherein all pixels in the image having a hue value within the range of the determined average plus or minus the determined standard deviation are changed by said changing means.
  - 5. An apparatus according to claim 3, wherein the amount of hue change is dependent upon the hue value of the desired colour change.
  - 6. An apparatus according to claim 1, wherein said means for reproducing the image comprises a printing device.
  - 7. An apparatus according to claim 1, wherein said language model adapting means comprises:
    - means for associating the first word in the new input speech command to the output of a start node of the language model, if it is not already associated thereto;
      
      means for processing all but the last word in the new input speech command comprising;
      
      i) means for creating an intermediate node and for associating the current word in the new input speech command to the input of that node, if the current word is not already connected to the input of an intermediate node; and
      
      ii) means for associating the next word in the new input speech command to the output of the intermediate node that has the current word associated to its input, if it is not already associated thereto; and
      
      means for associating the last word in the new input speech command to the input of an end node of the language model if it is not already associated thereto.
  - 8. A computer having an image manipulation system according to claim 1.
  - 9. A facsimile machine having an image manipulation system according to claim 1.
  - 10. A photocopier having an image manipulation system according to claim 1.
  - 11. A camera having an image manipulation system according to claim 1.
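The hue-based selection in claims 3 and 4 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the image is taken to be a list of rows of `(r, g, b)` tuples with components in 0..1, hue is treated as a plain number in 0..1 (wrap-around at red is ignored), and matching pixels simply have their hue set to `target_hue`, which is one possible reading of the change described in claim 5. The function names are hypothetical.

```python
import colorsys
from statistics import mean, pstdev


def hue_of(rgb):
    """Return the hue (0..1) of an (r, g, b) pixel with components in 0..1."""
    return colorsys.rgb_to_hsv(*rgb)[0]


def recolour_similar(image, traced, target_hue):
    """Re-hue every pixel whose hue lies within mean +/- std of the traced pixels.

    image:  list of rows of (r, g, b) tuples
    traced: list of (row, col) positions visited by the pointing device
    """
    hues = [hue_of(image[r][c]) for r, c in traced]
    avg, std = mean(hues), pstdev(hues)  # statistics of the traced hues (claim 4)
    low, high = avg - std, avg + std
    result = []
    for row in image:
        new_row = []
        for pixel in row:
            h, s, v = colorsys.rgb_to_hsv(*pixel)
            if low <= h <= high:
                # Assumption: keep saturation and value, replace only the hue.
                pixel = colorsys.hsv_to_rgb(target_hue, s, v)
            new_row.append(pixel)
        result.append(new_row)
    return result
```

Note that the selection applies to the whole image, not only to the traced area, so a pixel far from the pointer path is changed if its hue falls inside the range.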

12. A method of manipulating an image comprising the steps of:
- reproducing an image;
  
  using a speech recognition user interface for allowing a user to input a speech signal comprising a description of a desired change to be made to the reproduced image;
  
  interpreting a recognition result output from the speech recognition user interface; and
  
  changing a colour of one or more parts of the reproduced image in response to the interpreting step, in order to effect the colour manipulation desired by a user;
  
  wherein said description comprises a number of continuously spoken words;
  
  wherein said speech recognition user interface performs the steps of;
  
  i) matching the input speech signal with selected sequences of word models, selected in accordance with a stored language model by using a memory storing a plurality of reference word models, each representative of a word, and a language model which defines sequences of the reference word models which can be matched with the input speech signal, in order to define input speech commands; and
  
  ii) providing, in response to the matching step, a recognition result based upon a likely sequence of reference models that corresponds to an input utterance; and
  
  wherein said language model is adaptable by;
  
  (a) receiving a new input speech command comprising two or more whole words;
  
  (b) generating a word model for each of the words contained within the new input speech command, if they do not already exist; and
  
  (c) adapting the language model to incorporate the new input speech command.
- Dependent claims (13, 14, 15, 16, 17, 18)
  - 13. A method according to claim 12, wherein said reproducing step displays said image on a display, and wherein said method further comprises the step of designating one or more areas of the displayed image to be altered.
  - 14. A method according to claim 13, wherein a user moves a pointing device across an area to be changed in order to designate said one or more areas of the displayed image, and wherein said changing step determines statistical characteristics of hue values of pixels traced by said pointing device, and wherein said changing step is arranged to change the hue value of all pixels in the image having a hue value similar to the determined statistical characteristics.
  - 15. A method according to claim 14, wherein said changing step determines an average and standard deviation of pixels traced by said pointing device, and wherein all pixels in the image having a hue value within a range of the determined average plus or minus the determined standard deviation are changed in said changing step.
  - 16. A method according to claim 14, wherein an amount of hue change is dependent upon the hue value of a desired colour change.
  - 17. A method according to claim 12, wherein said reproducing step comprises a step of printing the image.
  - 18. A method according to claim 12, wherein said adapting step comprises the steps of:
    - associating the first word in the new input speech command to the output of a start node of the language model, if it is not already associated thereto;
      
      processing all but the last word in the new input speech command by;
      
      i) creating an intermediate node and associating a current word in the new input speech command to the input of that node, if the current word is not already connected to the input of an intermediate node; and
      
      ii) associating a next word in the new input speech command to an output of the intermediate node that has the current word associated to its input, if the next word is not already associated thereto; and
      
      associating a last word in the new input speech command to an input of an end node of the language model if the last word is not already associated thereto.
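The adapting steps of claims 7 and 18 can be sketched as a small word graph. This is an illustration only: the claims do not specify a data structure, so the `LanguageModel` class, its attribute names, and the `accepts` helper are assumptions. Words act as arcs, each word feeds at most one intermediate node, and the words permitted to follow it leave that node.

```python
class LanguageModel:
    """Toy word graph with a start node, intermediate nodes, and an end node."""

    def __init__(self):
        self.start_out = set()  # words associated with the output of the start node
        self.end_in = set()     # words associated with the input of the end node
        self.node_of = {}       # word -> id of the intermediate node it feeds
        self.node_out = {}      # node id -> words associated with that node's output

    def add_command(self, words):
        """Adapt the model to incorporate a new command of two or more words."""
        if len(words) < 2:
            raise ValueError("a new command must contain two or more whole words")
        self.start_out.add(words[0])  # first word leaves the start node
        for current, following in zip(words, words[1:]):  # all but the last word
            node = self.node_of.get(current)
            if node is None:  # step i): create an intermediate node if needed
                node = len(self.node_out)
                self.node_of[current] = node
                self.node_out[node] = set()
            self.node_out[node].add(following)  # step ii): attach the next word
        self.end_in.add(words[-1])  # last word enters the end node

    def accepts(self, words):
        """Return True if the word sequence is a path from start to end."""
        if not words or words[0] not in self.start_out or words[-1] not in self.end_in:
            return False
        return all(
            following in self.node_out.get(self.node_of.get(current), ())
            for current, following in zip(words, words[1:])
        )
```

Because the sets ignore repeated additions, every "if it is not already associated" condition is satisfied automatically. One consequence of sharing nodes between commands is that the graph can accept sequences that were never taught: after adding "make it red" and "turn it blue", the sequence "make it blue" is also a valid path.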

19. A computer readable medium storing computer executable process steps to allow image manipulation, the process steps comprising the steps of:
- reproducing an image;
  
  using a speech recognition user interface for allowing a user to input a speech signal comprising a description of a desired change to be made to the reproduced image;
  
  interpreting a recognition result output from the speech recognition user interface; and
  
  changing the colour of one or more parts of the reproduced image in response to said interpreting step, in order to effect a manipulation desired by a user;
  
  wherein said description comprises a number of continuously spoken words;
  
  wherein said speech recognition user interface performs the step of;
  
  matching the input speech signal with selected sequences of word models, selected in accordance with a stored language model by using a memory storing a plurality of reference word models, each representative of a word, and a language model which defines sequences of the reference word models which can be matched with the input speech signal, in order to define input speech commands; and
  
  providing, in response to the matching step, a recognition result based upon a likely sequence of reference models that corresponds to an input utterance; and
  
  wherein said language model is adaptable by;
  
  (a) receiving a new input speech command comprising two or more whole words;
  
  (b) generating a word model for each of the words contained within the new input speech command, if they do not already exist; and
  
  (c) adapting the language model to incorporate the new input speech command.
- Dependent claims (20, 21, 22, 23, 24, 25)
  - 20. A computer readable medium according to claim 19, wherein said reproducing step displays said image on a display, and further comprising processing steps designating one or more areas of the displayed image to be altered.
  - 21. A computer readable medium according to claim 20, wherein a user moves a pointing device across an area to be changed in order to designate said one or more areas of the displayed image, and wherein said changing step determines the statistical characteristics of the hue values of the pixels traced by said pointing device, and wherein said changing step is arranged to change the hue value of all pixels in the image having a hue value similar to the determined statistical characteristics.
  - 22. A computer readable medium according to claim 21, wherein said changing step determines an average and standard deviation of pixels traced by said pointing device, and wherein all pixels in the image having a hue value within a range of the determined average plus or minus the determined standard deviation are changed in said changing step.
  - 23. A computer readable medium according to claim 21, wherein an amount of hue change is dependent upon the hue value of a desired colour change.
  - 24. A computer readable medium according to claim 19, wherein said reproducing step comprises a step of printing the image.
  - 25. A computer readable medium according to claim 19, wherein said adapting step comprises the steps of:
    - associating the first word in the new input speech command to an output of a start node of the language model, if it is not already associated thereto;
      
      processing all but the last word in the new input speech command by;
      
      i) creating an intermediate node and associating a current word in the new input speech command to an input of that node, if the current word is not already connected to the input of an intermediate node; and
      
      ii) associating a next word in the new input speech command to the output of the intermediate node that has the current word associated to its input, if the next word is not already associated thereto; and
      
      associating the last word in the new input speech command to the input of an end node of the language model if the last word is not already associated thereto.
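Steps (a) to (c) of claims 12 and 19 amount to a short orchestration: accept a new multi-word command, create word models only for words that lack one, then adapt the language model. The sketch below shows that control flow only. The `learn_command` function and its two callbacks are hypothetical; in particular, a real `make_word_model` would be built from the spoken input rather than from the word label alone, and the claims do not describe how.

```python
from typing import Callable, Dict, List, Sequence


def learn_command(
    words: Sequence[str],
    word_models: Dict[str, object],
    make_word_model: Callable[[str], object],
    adapt_language_model: Callable[[Sequence[str]], None],
) -> List[str]:
    """Add a new spoken command; return the words that needed a new model."""
    # (a) the new command must comprise two or more whole words
    if len(words) < 2:
        raise ValueError("a new command must contain two or more whole words")
    created = []
    # (b) generate a word model only for words that do not already have one
    for word in words:
        if word not in word_models:
            word_models[word] = make_word_model(word)
            created.append(word)
    # (c) adapt the language model to incorporate the new command
    adapt_language_model(words)
    return created
```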

26. An image manipulation apparatus comprising:
- a speech recognition user interface for allowing a user to input a speech signal of a command comprising a number of continuously spoken words;
  
  means for interpreting a recognition result output from the speech recognition interface; and
  
  means responsive to the interpreting means for executing a function corresponding to the command;
  
  wherein said speech recognition user interface uses a memory for storing a plurality of reference word models, each representative of a word, and for storing a language model which defines sequences of the reference word models which can be matched with the input speech signal, in order to define input speech commands and comprises;
  
  matching means for matching the input speech signal with selected sequences of said word models, selected in accordance with the stored language model;
  
  recognition means, responsive to said matching means, for providing a recognition result based upon a likely sequence of reference models that corresponds to an input utterance;
  
  receive means for receiving a new input speech command comprising two or more whole words;
  
  means for generating a word model for each of the words contained within the new input speech command, if they do not already exist; and
  
  means for adapting said language model to incorporate said new input speech command.

27. A method of manipulating an image comprising the steps of:
- using a speech recognition user interface for allowing a user to input a speech signal of a command comprising a number of continuously spoken words;
  
  interpreting a recognition result output from the speech recognition interface; and
  
  executing a function corresponding to the command in response to the interpreting step;
  
  wherein said speech recognition user interface uses a memory for storing a plurality of reference word models, each representative of a word, and for storing a language model which defines sequences of the reference word models which can be matched with the input speech signal, in order to define input speech commands and matches the input speech signal with selected sequences of said word models, selected in accordance with the stored language model; and
  
  provides, in response to said matching step, a recognition result based upon a likely sequence of reference models that corresponds to an input utterance; and
  
  wherein the method further comprises the steps of;
  
  receiving a new input speech command comprising two or more whole words;
  
  generating a word model for each of the words contained within the new input speech command, if they do not already exist; and
  
  adapting said language model to incorporate said new input speech command.

28. A computer readable medium storing computer executable process steps to allow image manipulation, the process steps comprising the steps of:
- using a speech recognition user interface for allowing a user to input a speech signal of a command comprising a number of continuously spoken words;
  
  interpreting a recognition result output from the speech recognition interface; and
  
  executing a function corresponding to the command in response to the interpreting step;
  
  wherein said speech recognition user interface uses a memory for storing a plurality of reference word models, each representative of a word, and for storing a language model which defines sequences of the reference word models which can be matched with the input speech signal, in order to define input speech commands and matches the input speech signal with selected sequences of said word models, selected in accordance with the stored language model; and
  
  provides, in response to said matching step, a recognition result based upon a likely sequence of reference models that corresponds to an input utterance; and
  
  wherein the process steps further comprise the steps of;
  
  receiving a new input speech command comprising two or more whole words;
  
  generating a word model for each of the words contained within the new input speech command, if they do not already exist; and
  
  adapting said language model to incorporate said new input speech command.

Current Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Original Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Inventors: Tzirkel-Hancock, Eli
Primary Examiner(s): Chow, Dennis-Doon
Assistant Examiner(s): AWAD, AMR A
Application Number: US08/794,455
Time in Patent Office: 1,351 Days
Field of Search: 345/150-157, 358/518, 358/520, 358/522, 704/255, 704/256, 704/275
US Class Current: 345/589
CPC Class Codes:
G06F 3/167 Audio in a user interface, ...
G10L 15/26 Speech to text systems G10L...
