SYSTEM AND METHOD FOR DYNAMIC LEARNING

US 20080221893A1
Filed: 02/29/2008
Published: 09/11/2008
Est. Priority Date: 03/01/2007
Status: Active Grant

Abstract

New language constantly emerges from complex, collaborative human-human interactions such as meetings, for example when a presenter handwrites a new term on a whiteboard while also saying it aloud. The system and method described include devices for receiving various types of human communication activities (e.g., speech, writing, and gestures) presented in a multimodally redundant manner; processors and recognizers for segmenting or parsing, and then recognizing, selected sub-word units such as phonemes and syllables; and alignment, refinement, and integration modules to find a match, or at least an approximate match, to the one or more terms that were presented in the multimodally redundant manner. Once the system has performed a successful integration, one or more terms may be newly enrolled into a database of the system, which permits the system to continuously learn and provide an association for proper names, abbreviations, acronyms, symbols, and other forms of communicated language.
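
The enrollment step mentioned in the abstract can be pictured with a minimal Python sketch. It assumes the database is a simple in-memory mapping from a spelling to a phoneme sequence; the `Vocabulary` class, its `enroll` method, and the example phoneme labels are hypothetical illustrations and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Vocabulary:
    """Hypothetical term store: lower-cased spelling -> phoneme sequence."""
    entries: Dict[str, List[str]] = field(default_factory=dict)

    def enroll(self, spelling: str, pronunciation: List[str]) -> bool:
        """Add a term if it is new; return True only when an entry was created."""
        key = spelling.lower()
        if key in self.entries:
            return False  # already known, nothing to learn
        self.entries[key] = list(pronunciation)
        return True


vocab = Vocabulary()
# Illustrative phoneme labels for the abbreviation "JB" (an assumption).
vocab.enroll("JB", ["jh", "ey", "b", "iy"])
```

A real system would presumably persist the entries and make them available to the recognizers; the sketch only shows the "learn once, reuse later" behavior.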

11 Claims

1. A system for recognizing and evaluating possible relationships between terms expressed during cross-communication activities, the system comprising:
- a memory;
  
  a processor in signal communication with the memory;
  
  a speech recognition system having a speech collection device arranged to receive a speech portion and then transcribe the speech portion to a first set of sub-word textual sequences related to the speech portion;
  
  an ink recognition system having an ink input receiving device configured to receive written input at least contemporaneously while the speech recognition system receives the speech portion, the ink recognition system further configured to identify a second set of sub-word textual sequences related to the written input; and
  
  a multimodal fusion engine in signal communication with the processor, the multimodal fusion engine comprising:
  
  an alignment system having a plurality of grammar-based phoneme recognizers configured to identify a number of phonetically close terms corresponding to a modally redundant term defined by a temporal relationship between the speech portion and the written input, the grammar-based phoneme recognizers operable to generate a first-pass alignment matrix in which the first set of sub-word textual sequences related to the speech portion are selectively aligned with the second set of sub-word sequences related to the written input;
  
  a refinement system in communication with the alignment system for dynamically modeling the first and second sub-word sequences captured in the alignment matrix by identifying a desired path within the alignment matrix and then modifying the desired path based on temporal boundaries associated with the speech portion and the written input; and
  
  an integration system in communication with the refinement system to select a desired term that is estimated to be a best-fit to the modally redundant term, the integration system configured to generate a normalized match score based on information received at least from the alignment system and the refinement system.
  - 2. The system of claim 1, wherein the speech collection device includes at least one microphone.
  - 3. The system of claim 1, wherein the temporal relationship includes a multimodal redundant relationship having a detected temporal boundary.
  - 4. The system of claim 1, wherein the written input includes alphanumeric characters and non-alphanumeric symbols.
  - 5. The system of claim 1, wherein the non-alphanumeric symbols include Unicode symbols.
  - 6. The system of claim 1, wherein the alignment system includes a salience-weighted articulatory-feature comparison module for generating a table having pairs of hypothesized phonemes determined from at least one articulatory feature detected by the speech recognition system.
  - 7. The system of claim 1, wherein the written symbols include pictorial and graphical sketches.
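
To make the alignment matrix and normalized match score of claim 1 more concrete, the following Python sketch aligns two phoneme sequences with a plain edit-distance matrix. This is a simplification chosen for illustration: it swaps in ordinary Levenshtein alignment for the claimed grammar-based phoneme recognizers and does not model the refinement step. The function names and the phoneme labels are assumptions, not terms from the patent.

```python
from typing import List, Sequence


def alignment_matrix(speech: Sequence[str], ink: Sequence[str]) -> List[List[int]]:
    """Fill an edit-distance matrix between two phoneme sequences."""
    rows, cols = len(speech) + 1, len(ink) + 1
    m = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        m[i][0] = i  # cost of deleting i speech phonemes
    for j in range(cols):
        m[0][j] = j  # cost of inserting j ink-derived phonemes
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = 0 if speech[i - 1] == ink[j - 1] else 1
            m[i][j] = min(
                m[i - 1][j] + 1,                 # deletion
                m[i][j - 1] + 1,                 # insertion
                m[i - 1][j - 1] + substitution,  # match or substitution
            )
    return m


def normalized_match_score(speech: Sequence[str], ink: Sequence[str]) -> float:
    """Map edit distance to [0, 1], where 1.0 means identical sequences."""
    if not speech and not ink:
        return 1.0
    distance = alignment_matrix(speech, ink)[-1][-1]
    return 1.0 - distance / max(len(speech), len(ink))


# Hypothetical sub-word sequences from the speech and ink recognizers.
spoken = ["f", "r", "eh", "d"]
written = ["f", "r", "ih", "d"]
score = normalized_match_score(spoken, written)  # one substitution in four
```

Dividing by the longer sequence length keeps the score comparable across terms of different lengths, which is one simple way to obtain a normalized value.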

8. A method for recognizing and evaluating possible relationships between terms expressed during multiple communication modes, the method comprising:
- detecting at least two modes of communication selected from the group consisting of speech, writing, and physical gestures;
  
  receiving at least two of the modes of communication within a memory of a computational processing system;
  
  determining a time period between a first communication mode and a second communication mode;
  
  aligning a selected feature of the first communication mode with a selected feature of the second communication mode;
  
  generating a group of hypothesized redundant terms based on the time period and based on the selected features of the first and second communication modes;
  
  reducing a number of the hypothesized redundant terms to populate a matrix of possibly related sub-word units from which a best-fit term is to be selected; and
  
  selecting the best-fit term based at least in part on a multimodal redundancy between the first communication mode and the second communication mode.
  - 9. The method of claim 8, further comprising:
    - reducing the number of the hypothesized redundant terms through alignment, refinement, and integration processes.
  - 10. The method of claim 8, further comprising:
    - dynamically enrolling the best-fit term into a database accessible by the computational processing system.
  - 11. The method of claim 8, wherein reducing the number of the hypothesized redundant terms includes generating a table of salience-weighted articulatory-features for comparing at least the speech to the writing communication.
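
The time-period and selection steps of claim 8 can be sketched in Python as follows. The sketch assumes each communication event carries a start and end time in seconds and that each hypothesized term already has a match score; the `time_gap` and `select_best_fit` helpers and the 2.0-second `max_gap` default are hypothetical choices for illustration, not values from the patent.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds)


def time_gap(a: Interval, b: Interval) -> float:
    """Seconds separating two intervals; 0.0 when they overlap."""
    return max(0.0, max(a[0], b[0]) - min(a[1], b[1]))


def select_best_fit(
    ink_time: Interval,
    candidates: List[Tuple[str, Interval, float]],
    max_gap: float = 2.0,  # assumed redundancy window, not from the patent
) -> Optional[str]:
    """Keep spoken candidates near the ink event, then pick the top score."""
    nearby = [
        (score, term)
        for term, span, score in candidates
        if time_gap(ink_time, span) <= max_gap
    ]
    if not nearby:
        return None
    return max(nearby)[1]


# Hypothetical (term, speech interval, match score) hypotheses.
hypotheses = [
    ("fred", (0.0, 1.0), 0.90),
    ("thread", (10.0, 11.0), 0.95),  # high score but far from the ink event
    ("bread", (0.5, 1.5), 0.60),
]
best = select_best_fit((1.2, 2.0), hypotheses)
```

Filtering on the time gap before ranking reflects the idea that a term is only treated as redundant when the two modes occur close together in time.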

Current Assignee: Adapx Incorporated
Original Assignee: Adapx Incorporated
Inventors: Kaiser, Edward C.
Granted Patent: US 8,457,959 B2
US Class Current: 704/257
CPC Class Codes:
G09B 19/04 Speaking with audible prese...
G10L 15/24 Speech recognition using no...
