Establishing a multimodal personality for a multimodal application

US 8,073,697 B2
Filed: 09/12/2006
Issued: 12/06/2011
Est. Priority Date: 09/12/2006
Status: Active Grant

Abstract

Methods, apparatus, and computer program products are described for establishing a multimodal personality for a multimodal application that include selecting, by the multimodal application, matching vocal and visual demeanors and incorporating, by the multimodal application, the matching vocal and visual demeanors as a multimodal personality into the multimodal application.

149 Citations

20 Claims

1. A method of establishing a multimodal personality for an application that provides vocal and visual output, the method comprising:
- with at least one processor:
  
  selecting, by the application that provides vocal and visual output, matching vocal and visual demeanors; and
  
  incorporating, by the application, the matching vocal and visual demeanors as a multimodal personality into the application by rendering a voice prompt and/or response generated by the application in the vocal demeanor and rendering a visual element generated by the application in the matching visual demeanor, the voice prompt or response being rendered in a voice having an age, gender and/or accent based on the selected vocal demeanor, wherein:
  
  selecting matching vocal and visual demeanors further comprises selecting a vocal demeanor in dependence upon a history of a user's navigation among web sites, the vocal demeanor being selected to match a visual demeanor determined based on at least one property of web pages previously visited, the at least one property comprising one or more of:
  
  text font;
  
  count of words on the web page;
  
  proportion of white space;
  
  ratio of graphics to screen area; or
  
  ratio of text space to graphic space.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method of claim 1 wherein incorporating the matching vocal and visual demeanors as a multimodal personality into the application further comprises:
    - linking one or more markup elements of a markup document of the application to one or more styles of a cascading style sheet;
      
      providing the user interface by providing the markup document as an input to a multimodal browser; and
      
      rendering the user interface with the multimodal browser based on the markup document.
  - 3. The method of claim 1 wherein selecting matching vocal and visual demeanors further comprises selecting a visual demeanor in dependence upon a history of multimodal interactions between the application and a user.
  - 4. The method of claim 1 wherein the visual element is rendered with a background color, text color, text font or placement matching the selected vocal demeanor.
  - 5. The method of claim 1 wherein selecting matching vocal and visual demeanors further comprises selecting a visual demeanor in dependence upon vocal aspects of a history of a user's navigation among multimodal web sites.
  - 6. The method of claim 1 wherein selecting matching vocal and visual demeanors further comprises:
    - retrieving a user profile from storage;
      
      selecting a vocal demeanor in dependence upon the retrieved user profile; and
      
      selecting a visual demeanor in dependence upon the retrieved user profile.
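
As an illustration only, the selection step recited in claim 1 can be sketched in Python. The claims do not specify any thresholds, labels, or data structures, so everything named here (`PageProperties`, the demeanor labels, the cut-off values, and the `VOCAL_FOR_VISUAL` table) is a hypothetical assumption, not the patented implementation. The sketch derives a coarse visual demeanor from two of the listed page properties and then looks up a vocal demeanor (age, gender, accent) intended to match it.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class PageProperties:
    """Properties of one previously visited page (hypothetical structure)."""
    word_count: int
    graphics_to_screen_ratio: float  # 0.0 to 1.0


def visual_demeanor_from_history(history: list[PageProperties]) -> str:
    """Classify a browsing history into a coarse visual demeanor label."""
    if not history:
        return "neutral"
    avg_words = mean(p.word_count for p in history)
    avg_graphics = mean(p.graphics_to_screen_ratio for p in history)
    # Illustrative thresholds only; the claims do not state any values.
    if avg_words > 800 and avg_graphics < 0.2:
        return "formal"
    if avg_graphics > 0.5:
        return "playful"
    return "neutral"


# Hypothetical table pairing each visual demeanor with a vocal demeanor.
VOCAL_FOR_VISUAL = {
    "formal": {"age": "adult", "gender": "female", "accent": "en-GB"},
    "playful": {"age": "young", "gender": "male", "accent": "en-US"},
    "neutral": {"age": "adult", "gender": "neutral", "accent": "en-US"},
}


def select_vocal_demeanor(history: list[PageProperties]) -> dict:
    """Select the vocal demeanor that matches the history-derived visual one."""
    return VOCAL_FOR_VISUAL[visual_demeanor_from_history(history)]
```

A history of text-heavy pages with few graphics would fall into the "formal" bucket under these assumed thresholds, and the application would then render prompts in the voice paired with that label.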

7. Apparatus for establishing a multimodal personality for an application that provides vocal and visual output, the apparatus comprising a computer processor and a computer memory operatively coupled to the computer processor, the computer memory having disposed within it computer program instructions capable of:
- selecting, by the application that provides vocal and visual output, matching vocal and visual demeanors, the selected vocal demeanor and the selected visual demeanor having at least one matching characteristic, the at least one matching characteristic comprising at least one of an age, gender, location, time or application domain; and
  
  incorporating, by the application, the matching vocal and visual demeanors as a multimodal personality into the application, wherein:
  
  selecting the vocal demeanor comprises selecting at least one grammar for use in recognizing vocal inputs; and
  
  selecting matching vocal and visual demeanors further comprises selecting the vocal and visual demeanors in dependence upon visual aspects or vocal aspects of a history of a user's navigation among multimodal web sites, the vocal aspects comprising one or more of:
  
  number of grammars per page;
  
  number of dialogs per page;
  
  dialog intensity; or
  
  number of speech inputs per page; and
  
  the visual aspects comprising one or more of:
  
  text font;
  
  counts of words on web pages of the multimodal web sites;
  
  proportion of white space;
  
  ratio of graphics to screen area; or
  
  ratio of text space to graphic space.
- Dependent claims (8, 9, 10, 11, 12):
  - 8. The apparatus of claim 7 wherein incorporating the matching vocal and visual demeanors as a multimodal personality into the application further comprises linking one or more markup elements of a markup document of the application to one or more styles of a cascading style sheet.
  - 9. The apparatus of claim 7 wherein selecting matching vocal and visual demeanors further comprises selecting a visual demeanor in dependence upon a history of multimodal interactions between the application and a user.
  - 10. The apparatus of claim 7 wherein selecting matching vocal and visual demeanors further comprises selecting a vocal demeanor in dependence upon visual properties of a history of a user's navigation among web sites.
  - 11. The apparatus of claim 7 wherein selecting matching vocal and visual demeanors further comprises selecting a visual demeanor in dependence upon vocal aspects of a history of a user's navigation among multimodal web sites.
  - 12. The apparatus of claim 7, wherein:
    - the computer processor comprises a computer processor of a server;
      
      incorporating the matching vocal and visual demeanors as a multimodal personality into the application comprises generating a markup document and transmitting the markup document to a client over a network; and
      
      the server further comprises a component that performs speech recognition on VOIP signals received from the client over the network.
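
Claim 7 requires the selected vocal and visual demeanors to share at least one matching characteristic (age, gender, location, time, or application domain). One way this might be checked is sketched below in Python. The `Demeanor` class, the trait names, and the "most shared traits wins" tie-breaking are assumptions made for illustration; the claim only requires that at least one characteristic match, and the grammar-selection element of the claim is not modeled here.

```python
from dataclasses import dataclass
from itertools import product

# Characteristics listed in claim 7; the key names are hypothetical.
MATCHABLE = ("age", "gender", "location", "time", "domain")


@dataclass
class Demeanor:
    """A vocal or visual demeanor with descriptive traits (assumed shape)."""
    name: str
    traits: dict[str, str]


def shared_traits(vocal: Demeanor, visual: Demeanor) -> set[str]:
    """Return the characteristics on which the two demeanors agree."""
    return {
        key for key in MATCHABLE
        if key in vocal.traits and vocal.traits[key] == visual.traits.get(key)
    }


def select_matching_pair(vocals: list[Demeanor], visuals: list[Demeanor]):
    """Pick the vocal/visual pair sharing the most characteristics.

    Returns None when no pair shares at least one characteristic.
    """
    best, best_count = None, 0
    for vocal, visual in product(vocals, visuals):
        count = len(shared_traits(vocal, visual))
        if count > best_count:
            best, best_count = (vocal, visual), count
    return best
```

Preferring the pair with the most shared traits is a design choice of this sketch; any pair with a non-empty `shared_traits` result would satisfy the "at least one matching characteristic" wording.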

13. A computer storage medium encoded with a computer program product for establishing a multimodal personality for an application that provides vocal and visual output, the computer program product comprising computer program instructions for, when executed by at least one processor:
- selecting, by the application that provides vocal and visual output, matching vocal and visual demeanors; and
  
  incorporating, by the application, the matching vocal and visual demeanors as a multimodal personality into the application by:
  
  selecting one or more demeanors from a store comprising a plurality of demeanors, the selected one or more demeanors defining the matching vocal and visual demeanors;
  
  linking one or more styles to one or more markup elements of a markup document output by the application based on the selected demeanor; and
  
  rendering, using a multimodal browser, the markup document using the one or more linked styles, the rendering comprising rendering at least one visual aspect and one speech aspect, wherein:
  
  selecting matching vocal and visual demeanors further comprises selecting a visual demeanor in dependence upon vocal aspects of a history of a user's navigation among multimodal web sites, the vocal aspects comprising one or more of:
  
  number of grammars per page;
  
  number of dialogs per page;
  
  dialog intensity; or
  
  number of speech inputs per page.
- Dependent claims (14, 15, 16, 17, 18, 19, 20):
  - 14. The computer storage medium of claim 13 wherein incorporating the matching vocal and visual demeanors as a multimodal personality into the application further comprises linking one or more markup elements of the markup document of the application to one or more styles of a cascading style sheet.
  - 15. The computer storage medium of claim 13 wherein selecting matching vocal and visual demeanors further comprises selecting a visual demeanor in dependence upon a history of multimodal interactions between the application and a user.
  - 16. The computer storage medium of claim 13 wherein selecting matching vocal and visual demeanors further comprises selecting a vocal demeanor in dependence upon visual properties of a history of a user's navigation among web sites.
  - 17. The computer storage medium of claim 13 wherein selecting matching vocal and visual demeanors further comprises selecting a visual demeanor in dependence upon vocal aspects of a history of a user's navigation among multimodal web sites.
  - 18. The computer storage medium of claim 13 wherein selecting matching vocal and visual demeanors further comprises:
    - retrieving a user profile from storage;
      
      selecting a vocal demeanor in dependence upon the retrieved user profile; and
      
      selecting a visual demeanor in dependence upon the retrieved user profile.
  - 19. The computer storage medium of claim 13, wherein the one or more styles specifies at least one of a color or font of the at least one visual aspect.
  - 20. The computer storage medium of claim 19, wherein the one or more styles specifies at least one of a gender or age of the at least one speech aspect.
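
The store-select-link sequence of claim 13, together with the style contents named in claims 19 and 20, might look roughly like the following Python sketch. The store contents, the `prompt` class name, and the function names are hypothetical. The `voice-family` declaration follows the age and gender keywords of the W3C CSS Speech Module; whether a particular multimodal browser honors that property is an assumption, and the claims do not name it.

```python
from html import escape

# Hypothetical demeanor store; each entry bundles visual and vocal styling.
DEMEANOR_STORE = {
    "friendly": {
        "color": "#116644",
        "font": "Verdana, sans-serif",
        "age": "young",
        "gender": "female",
    },
    "reserved": {
        "color": "#222222",
        "font": "Georgia, serif",
        "age": "old",
        "gender": "male",
    },
}


def stylesheet_for(demeanor: dict) -> str:
    """Build one CSS rule carrying both visual and speech styling."""
    rules = [
        f"  color: {demeanor['color']};",
        f"  font-family: {demeanor['font']};",
        # CSS Speech-style generic voice: "<age> <gender>".
        f"  voice-family: {demeanor['age']} {demeanor['gender']};",
    ]
    return ".prompt {\n" + "\n".join(rules) + "\n}"


def markup_document(prompt_text: str, demeanor_name: str) -> str:
    """Link the selected demeanor's styles to a markup element by class."""
    demeanor = DEMEANOR_STORE[demeanor_name]
    return (
        "<html><head><style>\n"
        + stylesheet_for(demeanor)
        + "\n</style></head>\n"
        + f'<body><p class="prompt">{escape(prompt_text)}</p></body></html>'
    )
```

Calling `markup_document("Where would you like to fly?", "friendly")` yields a document whose single paragraph is linked to the `.prompt` rule, so a browser that supports both visual and aural CSS could render the color and font alongside the requested voice.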

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: International Business Machines Corporation, Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Cross, Charles W. Jr.; Pike, Hilary A.
Primary Examiner(s): Smits, Talivaldis Ivars
Assistant Examiner(s): Roberts, Shaun A
Application Number: US11/530,916
Publication Number: US 20080065388A1
Time in Patent Office: 1,911 Days
Field of Search: 704/270, 704/275
US Class Current: 704/270
CPC Class Codes:
G06F 3/167 Audio in a user interface, ...
G10L 13/00 Speech synthesis; Text to s...
