Interactive complex task teaching system that allows for natural language input, recognizes a user's intent, and automatically performs tasks in document object model (DOM) nodes

US 7,983,997 B2
Filed: 11/02/2007
Issued: 07/19/2011
Est. Priority Date: 11/02/2007
Status: Active Grant

First Claim

1. An interactive method for learning and executing executable tasks using language and demonstration inputs from a user, comprising:

a. providing a computational device including a graphical user interface (GUI);

b. providing software running on said computational device and supporting GUI-based interaction with said user;

c. wherein said user performs tasks using the GUI;

d. wherein said software includes a natural dialog-based interface whereby said user can communicate with said software using natural dialog-based language;

e. for each of said executable tasks, recognizing said user's overall intent;

f. for each of said executable tasks, identifying a plurality of steps needed to complete said task;

g. for each of said plurality of steps, identifying and generalizing a step objective;

h. learning to execute each of said steps from demonstrations provided by said user;

i. providing incremental execution and interaction with said user using said natural dialog-based interface;

j. providing a database for storing a task definition for each of said tasks, wherein said task definition includes said steps comprising said task and said step objectives;

k. storing said task definitions in said database;

l. for each of said tasks, learning semantic characterization of said task for later retrieval from said database;

m. retrieving a particular task definition from said database using said semantic characterization;

n. improving said task definition for each of said tasks through practice, with instruction being provided by said user;

o. wherein one of said executable tasks returns a list of results;

p. displaying said list of results in a first configuration of said GUI wherein said GUI displays a list consisting of multiple Document Object Model nodes;

q. wherein said user provides natural language to said software running on said computational device indicating that iteration should be performed;

r. automatically creating a second configuration of said GUI wherein said GUI displays a plurality of cells arranged into columns and rows, with each row representing a single Document Object Model node from said displayed list;

s. wherein said user demonstrates a first task to be performed in a first Document Object Model node; and

t. wherein thereafter said software running on said computational device automatically performs said first task demonstrated by said user in all other Document Object Model nodes and displays a result of said performance in said second configuration of said GUI.
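
Steps o through t can be illustrated with a minimal Python sketch. It is not the patented implementation: the sample result list, the function names (`generalize`, `iterate`, `pick_price`), and the rule of generalizing a demonstration by tag name and `class` attribute are all assumptions made for illustration. It uses the standard-library `xml.dom.minidom` module as a stand-in for a browser DOM.

```python
from xml.dom import minidom

# Hypothetical result list: each <li> plays the role of one DOM node
# shown in the first GUI configuration (step p).
RESULTS_XML = """<ul>
  <li><span class="title">Hotel A</span><span class="price">120</span></li>
  <li><span class="title">Hotel B</span><span class="price">95</span></li>
  <li><span class="title">Hotel C</span><span class="price">140</span></li>
</ul>"""


def text_of(node):
    """Concatenate all text found below a DOM node."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        else:
            parts.append(text_of(child))
    return "".join(parts)


def generalize(clicked):
    """Turn one demonstrated selection into a task reusable on other nodes.

    Assumption: the element the user picked is identified by its tag name
    and class attribute. A real system would need a richer description.
    """
    tag, cls = clicked.tagName, clicked.getAttribute("class")

    def task(node):
        for el in node.getElementsByTagName(tag):
            if el.getAttribute("class") == cls:
                return text_of(el)
        return None

    return task


def iterate(xml_text, demonstrate):
    """Build the row-per-node table of step r and fill it as in step t."""
    doc = minidom.parseString(xml_text)
    nodes = doc.getElementsByTagName("li")
    # Step s: the user demonstrates the task on the first node only.
    task = generalize(demonstrate(nodes[0]))
    # Step t: the learned task is applied to every node; for simplicity the
    # first node is re-processed too. Each row is [row number, result].
    return [[row, task(node)] for row, node in enumerate(nodes, start=1)]


def pick_price(li):
    """Stand-in for the user's demonstration: select the second <span>."""
    return li.getElementsByTagName("span")[1]


table = iterate(RESULTS_XML, pick_price)
```

Here `table` holds one row per list node, which corresponds to the second GUI configuration in which each row represents a single Document Object Model node.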

1 Assignment

0 Petitions

Abstract

A system that allows a user to teach a computational device how to perform complex, repetitive tasks that the user would usually perform through the device's graphical user interface (GUI), which is often, but not necessarily, a web browser. The system includes software running on a user's computational device. The user “teaches” task steps by inputting natural language and demonstrating actions with the GUI. The system uses a semantic ontology and natural language processing to create an explicit representation of the task that is stored on the computer. After a complete task has been taught, the system is able to automatically execute the task in new situations. Because the task is represented in terms of the ontology and the user's intentions, the system is able to adapt to changes in the computer code while still pursuing the objectives taught by the user.
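
The adaptation to code changes described in the abstract can be pictured with a small Python sketch. Everything in it is assumed for illustration: the `SYNONYMS` table stands in for the ontology's lexicon, the two sample pages are invented, and `find_field` is a hypothetical helper. It first tries the element id recorded during the demonstration and, if the page has changed, falls back to a label whose text names the user's objective.

```python
from xml.dom import minidom

# Hypothetical synonym table standing in for the ontology's lexicon.
SYNONYMS = {"destination": {"destination", "to", "arrival city"}}

# Invented pages: the same form before and after a site redesign.
OLD_PAGE = '<form><label for="dest">Destination</label><input id="dest"/></form>'
NEW_PAGE = '<form><label for="f2">To</label><input id="f2"/></form>'


def find_field(doc, recorded_id, objective):
    """Locate the input field that serves the user's stated objective."""
    inputs = doc.getElementsByTagName("input")
    # First try the id observed while the user demonstrated the step.
    for el in inputs:
        if el.getAttribute("id") == recorded_id:
            return el
    # The id is gone: look for a label whose text names the objective
    # (or a synonym) and follow its "for" attribute to the field.
    words = SYNONYMS.get(objective, {objective})
    for label in doc.getElementsByTagName("label"):
        text = "".join(
            c.data for c in label.childNodes if c.nodeType == c.TEXT_NODE
        )
        if text.strip().lower() in words:
            target = label.getAttribute("for")
            for el in inputs:
                if el.getAttribute("id") == target:
                    return el
    return None


old_field = find_field(minidom.parseString(OLD_PAGE), "dest", "destination")
new_field = find_field(minidom.parseString(NEW_PAGE), "dest", "destination")
```

On the old page the recorded id still matches; on the redesigned page the lookup succeeds only because the objective "destination" is matched against the label text "To".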

392 Citations

17 Claims

- - 2. An interactive method for learning and executing as recited in claim 1, wherein the step of recognizing said user's overall intent for each of said executable tasks comprises:
    - a. using an ontology of tasks;
      
      b. characterizing said user's actions within said ontology by observing said user's actions, objects with which said user interacts, as well as natural dialog-based language descriptions of said actions and said objects provided by said user;
      
      c. using algorithms based on natural language grammars and lexicons to map sentences in said dialog to said ontology; and
      
      d. providing an ontology-driven learning of executable steps from observations of said user's actions.
  - 3. An interactive method for learning and executing as recited in claim 1, wherein the step of providing software which includes a natural dialog-based interface whereby said user can communicate with said software using natural dialog-based language comprises:
    - a. describing said tasks and said steps comprising said tasks in natural language;
      
      b. providing a grammar and lexicon with mapping rules to said ontology;
      
      c. using natural language processing techniques to convert said natural language provided by said user into a representation based on the ontology;
      
      d. having said software pose clarifying questions to said user expressed in natural language when needed;
      
      e. summarizing said task steps in natural language as said task steps are learned; and
      
      f. adding to said lexicon as new unknown words appear in said natural dialog-based interface.
  - 4. An interactive method for learning and executing as recited in claim 1, further comprising:
    - a. detecting the start of each new step from language communicated by said user and observed user actions in said GUI;
      
      b. for each new step, identifying the type of action required in said step in said ontology;
      
      c. for each new step, using the language describing said step to identify one or more parameters to be used in said step;
      
      d. for each new step, if the intention of said new step is unclear, querying said user for additional information; and
      
      e. once a step is clearly defined, adding said step to said database.
  - 5. An interactive method for learning and executing as recited in claim 2, further comprising:
    - a. observing actions taken by said user in said GUI;
      
      b. retrieving internal encodings of GUI elements used in said observed user actions;
      
      c. defining a correlation between said descriptions provided by said user and said values entered or selected by said user in said actions;
      
      d. verifying learned patterns and interacting with said user when problems arise.
  - 6. An interactive method for learning and executing as recited in claim 5, wherein said step of verifying learned patterns and interacting with said user when problems arise comprises:
    - a. simulating execution of said learned steps;
      
      b. in the event a problem arises, notifying said user of said problem; and
      
      c. accepting additional examples or descriptions from said user in order to correct said problem.
  - 7. An interactive method for learning and executing as recited in claim 2, further comprising:
    - a. upon receiving an indication from said user that a new task is to be learned, using said user's linguistic description to classify said new task into one of said task-specific ontologies;
      
      b. identifying input and output parameters for said new task according to the way said parameters were described by said user;
      
      c. allowing said user to explicitly describe additional parameters; and
      
      d. querying said user for clarification when said user's intent is not identified.
  - 8. An interactive method for learning and executing as recited in claim 1, further comprising:
    - a. receiving from said user a description of a task to be performed;
      
      b. encoding said described task in terms of task ontology;
      
      c. using said encoding to search said database in order to identify said task within said database;
      
      d. retrieving said task from said database, along with parameters needed to perform said task; and
      
      e. querying said user for values for any of said parameters which need to be specified.
  - 9. An interactive method for learning and executing as recited in claim 1, wherein said step of detecting the presence of iteration comprises identifying lists and tables.
  - 10. An interactive method for learning and executing as recited in claim 1, wherein for each of said plurality of steps taught by said user, said demonstrations of said user and a primary code object which is associated with said demonstrations of said user are correlated.
  - 11. An interactive method for learning and executing as recited in claim 10, further comprising for each of said plurality of steps taught by said user, scanning other code objects in proximity to said primary code object which is associated with a particular said user demonstration to search for the presence of words defining said user's objective or synonyms therefor.
  - 12. An interactive method for learning and executing as recited in claim 11, further comprising:
    - a. for each of said plurality of steps taught by said user, creating a primary link between a particular said user demonstration and said primary code object and also creating a secondary link between said particular user demonstration and said other code objects which relate to said words defining said user's objective or synonyms therefor; and
      
      b. saving said primary and secondary links in said database.
  - 13. An interactive method for learning and executing as recited in claim 2, wherein for each of said plurality of steps taught by said user, said demonstrations of said user and a primary code object which is associated with said demonstrations of said user are correlated.
  - 14. An interactive method for learning and executing as recited in claim 13, further comprising for each of said plurality of steps taught by said user, scanning other code objects in proximity to said primary code object which is associated with a particular said user demonstration to search for the presence of words defining said user's objective or synonyms therefor.
  - 15. An interactive method for learning and executing as recited in claim 14, further comprising:
    - a. for each of said plurality of steps taught by said user, creating a primary link between a particular said user demonstration and said primary code object and also creating a secondary link between said particular user demonstration and said other code objects which relate to said words defining said user's objective or synonyms therefor; and
      
      b. saving said primary and secondary links in said database.
  - 16. An interactive method for learning and executing as recited in claim 3, wherein for each of said plurality of steps taught by said user, said demonstrations of said user and a primary code object which is associated with said demonstrations of said user are correlated.
  - 17. An interactive method for learning and executing as recited in claim 17's parent claim 16, further comprising for each of said plurality of steps taught by said user, scanning other code objects in proximity to said primary code object which is associated with a particular said user demonstration to search for the presence of words defining said user's objective or synonyms therefor.
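
The retrieval flow recited in claims 3 and 8 (map the user's words to ontology concepts, search the database by that encoding, then ask for any missing parameter values) might be sketched as follows. This is a toy illustration under stated assumptions: `LEXICON`, `TASK_DB`, the concept names, and the overlap-based scoring are all invented here and are not taken from the patent.

```python
# Hypothetical lexicon: surface words mapped to ontology concepts.
LEXICON = {
    "buy": "PURCHASE", "purchase": "PURCHASE", "book": "PURCHASE",
    "flight": "FLIGHT", "ticket": "FLIGHT",
    "hotel": "LODGING", "room": "LODGING",
}

# Hypothetical task database: each definition keeps its semantic
# characterization (a set of concepts) and the parameters it needs.
TASK_DB = [
    {"name": "buy-flight", "concepts": {"PURCHASE", "FLIGHT"},
     "params": ["origin", "destination", "date"]},
    {"name": "book-hotel", "concepts": {"PURCHASE", "LODGING"},
     "params": ["city", "date"]},
]


def encode(description):
    """Encode a natural-language description as a set of ontology concepts."""
    words = description.lower().replace(",", " ").split()
    return {LEXICON[w] for w in words if w in LEXICON}


def retrieve(description, known_values):
    """Return the best-matching task and the parameters still to be asked.

    Assumption: similarity is the Jaccard overlap between concept sets.
    """
    concepts = encode(description)

    def score(task):
        union = concepts | task["concepts"]
        return len(concepts & task["concepts"]) / len(union)

    best = max(TASK_DB, key=score)
    if score(best) == 0:
        return None, []  # nothing matched: the system would ask the user
    missing = [p for p in best["params"] if p not in known_values]
    return best, missing


task, to_ask = retrieve("Book me a hotel room", {"city": "Pensacola"})
```

In this run the description maps to the concepts `PURCHASE` and `LODGING`, so the hotel task is retrieved and only the `date` parameter remains to be requested from the user.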

Current Assignee: Florida Institute For Human And Machine Cognition, Inc.
Original Assignee: Florida Institute For Human And Machine Cognition, Inc.
Inventors: Jung, Hyuckchul; Allen, James F.; Galescu, Lucian; Taysom, William; Chambers, Nathanael
Primary Examiner(s): Vincent, David R
Application Number: US11/982,668
Publication Number: US 20090119587A1
Time in Patent Office: 1,355 Days
Field of Search: 706/45-48, 706/62, 706/12, 704/2, 704/9, 704/231, 704/246, 704/251, 707/713, 707/736, 707/769, 707/771, 707/772
US Class Current: 706/12
CPC Class Codes:
G06F 2203/0381   Multimodal input, i.e. inte...
G06F 3/038   Control and interface arran...
G09B 5/00   Electrically-operated educa...
G10L 15/26   Speech to text systems G10L...
