System and method of extracting clauses for spoken language understanding

US 9,484,020 B2
Filed: 08/26/2014
Issued: 11/01/2016
Est. Priority Date: 12/24/2002
Status: Expired due to Term

Abstract

A clausifier for extracting clauses for spoken language understanding is disclosed. The method relates to generating a set of clauses from speech utterance text and comprises inserting at least one boundary tag in speech utterance text related to sentence boundaries, inserting at least one edit tag indicating a portion of the speech utterance text to remove, and inserting at least one conjunction tag within the speech utterance text. The result is a set of clauses that may be identified within the speech utterance text according to the inserted at least one boundary tag, at least one edit tag and at least one conjunction tag. The disclosed clausifier comprises a sentence boundary classifier, an edit detector classifier, and a conjunction detector classifier. The clausifier may comprise a single classifier or a plurality of classifiers to perform the steps of identifying sentence boundaries, editing text, and identifying conjunctions within the text.
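The last step described in the abstract, reading clauses off text that already carries boundary, edit, and conjunction tags, can be sketched as follows. This is a minimal illustration, not the patented implementation: the tag spellings (`<s>`, `<edit>…</edit>`, `<c>…</c>`) and the function name `clauses_from_tagged` are hypothetical, and the choice to drop the tagged conjunction itself when splitting is an assumption made for this example.

```python
import re
from typing import List

# Hypothetical tag inventory, chosen for this sketch only.
BOUNDARY = "<s>"  # marks a sentence boundary


def clauses_from_tagged(tagged: str) -> List[str]:
    """Turn tagged utterance text into a list of clauses."""
    # 1. Remove every span wrapped in edit tags.
    text = re.sub(r"<edit>.*?</edit>", " ", tagged)
    # 2. Treat a tagged conjunction as a clause break (assumption:
    #    the conjunction word itself is discarded).
    text = re.sub(r"<c>.*?</c>", f" {BOUNDARY} ", text)
    # 3. Split at boundary tags and normalise whitespace.
    parts = (" ".join(p.split()) for p in text.split(BOUNDARY))
    return [p for p in parts if p]


tagged = "yes <s> i <edit>i</edit> got the bill <c>and</c> it looks wrong <s>"
print(clauses_from_tagged(tagged))
```

In this sketch the three kinds of tags are consumed in a fixed order (edits, then conjunctions, then boundaries), so an edited-out span can never contribute a spurious clause break.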

20 Claims

1. A method comprising:
- annotating data by inserting, via a processor and via a discriminative classification approach independent of using n-grams, boundary tags at boundaries in a speech utterance text based on weighted examples, wherein higher weights indicate more difficult examples, to yield annotated data; and
- iteratively repeating the annotating of the data, where each successive iteration has a longer turn than an immediately preceding iteration and each successive iteration is used to retrain a model associated with the discriminative classification approach.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, wherein the boundary tags comprise one of a phrase boundary tag, a sentence boundary tag, and a paragraph boundary tag.
  - 3. The method of claim 1, further comprising inserting an edit tag in the annotated data.
  - 4. The method of claim 3, wherein the edit tag identifies a portion of the speech utterance text to be removed based on repeated words which do not contribute to language understanding.
  - 5. The method of claim 1, further comprising inserting conjunction tags within the unedited text which identify, without relying on punctuation cues, coordinating conjunctions selected from a list.
  - 6. The method of claim 5, wherein the list comprises {and, but, for, nor, or, so, yet}.
  - 7. The method of claim 1, wherein the annotating of the data further comprises identifying clauses within the speech utterance text based on the boundary tags.
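The edit and conjunction tagging recited in claims 3 through 6 can be illustrated with a deliberately simple rule-based sketch. The claims describe tags produced by a classification approach; the code below swaps in plain rules (mark the earlier copy of an immediately repeated word, look up tokens in the claim 6 list), so it only shows what the tags mean, not how the patent decides where to place them. The tag spellings and the function name are hypothetical, and tokens are assumed to be lowercase.

```python
from typing import List

# The list recited in claim 6.
COORDINATING_CONJUNCTIONS = {"and", "but", "for", "nor", "or", "so", "yet"}


def tag_edits_and_conjunctions(tokens: List[str]) -> List[str]:
    """Wrap repeated words in <edit> tags and listed conjunctions in <c> tags."""
    out = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok == nxt:
            # Immediate repetition: mark the earlier copy for removal.
            out.append(f"<edit>{tok}</edit>")
        elif tok in COORDINATING_CONJUNCTIONS:
            # No punctuation is consulted, only the token itself.
            out.append(f"<c>{tok}</c>")
        else:
            out.append(tok)
    return out


print(" ".join(tag_edits_and_conjunctions("i i want to pay and and cancel".split())))
```

A pure list lookup over-tags words such as "for" and "so" when they are not acting as coordinating conjunctions, which is one reason a trained classifier is a better fit than this rule.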

8. A system comprising:
- a processor; and
- a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
  - annotating data by inserting, via a discriminative classification approach independent of using n-grams, boundary tags at boundaries in a speech utterance text based on weighted examples, wherein higher weights indicate more difficult examples, to yield annotated data; and
  - iteratively repeating the annotating of the data, where each successive iteration has a longer turn than an immediately preceding iteration and each successive iteration is used to retrain a model associated with the discriminative classification approach.
- Dependent claims (9, 10, 11, 12, 13, 14):
  - 9. The system of claim 8, wherein the boundary tags comprise one of a phrase boundary tag, a sentence boundary tag, and a paragraph boundary tag.
  - 10. The system of claim 8, the computer-readable storage medium having additional instructions stored which result in operations comprising inserting an edit tag in the annotated data.
  - 11. The system of claim 10, wherein the edit tag identifies a portion of the speech utterance text to be removed based on repeated words which do not contribute to language understanding.
  - 12. The system of claim 8, the computer-readable storage medium having additional instructions stored which result in operations comprising inserting conjunction tags within the unedited text which identify, without relying on punctuation cues, coordinating conjunctions selected from a list.
  - 13. The system of claim 12, wherein the list comprises {and, but, for, nor, or, so, yet}.
  - 14. The system of claim 8, wherein the annotating of the data further comprises identifying clauses within the speech utterance text based on the boundary tags.

15. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
- annotating data by inserting, via a discriminative classification approach independent of using n-grams, boundary tags at boundaries in a speech utterance text based on weighted examples, wherein higher weights indicate more difficult examples, to yield annotated data; and
- iteratively repeating the annotating of the data, where each successive iteration has a longer turn than an immediately preceding iteration and each successive iteration is used to retrain a model associated with the discriminative classification approach.
- Dependent claims (16, 17, 18, 19, 20):
  - 16. The computer-readable storage device of claim 15, wherein the boundary tags comprise one of a phrase boundary tag, a sentence boundary tag, and a paragraph boundary tag.
  - 17. The computer-readable storage device of claim 15, the computer-readable storage medium having additional instructions stored which result in operations comprising inserting an edit tag in the annotated data.
  - 18. The computer-readable storage device of claim 17, wherein the edit tag identifies a portion of the speech utterance text to be removed based on repeated words which do not contribute to language understanding.
  - 19. The computer-readable storage device of claim 15, the computer-readable storage medium having additional instructions stored which result in operations comprising inserting conjunction tags within the unedited text which identify, without relying on punctuation cues, coordinating conjunctions selected from a list.
  - 20. The computer-readable storage device of claim 19, wherein the list comprises {and, but, for, nor, or, so, yet}.
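The independent claims (1, 8, and 15) recite training on weighted examples in which higher weights indicate more difficult examples, with the model retrained on each iteration. The claims as listed here do not name an algorithm. As one well-known discriminative method with that property, the sketch below uses AdaBoost with single-feature decision stumps; this is a substitution for illustration, not a statement of what the patent implements. The binary features, the toy data, and the function names are all hypothetical, and the "longer turn" aspect of the claims is not modelled.

```python
import math
from typing import List, Tuple

Stump = Tuple[int, int, float]  # (feature index, polarity, vote weight alpha)


def stump_predict(x: List[int], j: int, pol: int) -> int:
    """Predict pol when binary feature j is set, otherwise -pol."""
    return pol if x[j] else -pol


def adaboost(X: List[List[int]], y: List[int], rounds: int):
    """X: binary feature vectors; y: labels in {-1, +1} (+1 = boundary here)."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform example weights
    model: List[Stump] = []
    for _ in range(rounds):
        # Retrain: pick the stump with the lowest weighted error.
        best = None
        for j in range(len(X[0])):
            for pol in (1, -1):
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump_predict(xi, j, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, pol)
        err, j, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((j, pol, alpha))
        # Reweight: misclassified (harder) examples gain weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, j, pol))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model, w


def predict(model: List[Stump], x: List[int]) -> int:
    score = sum(alpha * stump_predict(x, j, pol) for j, pol, alpha in model)
    return 1 if score >= 0 else -1


# Toy data: two hypothetical binary cues per candidate boundary position.
X = [[1, 0], [1, 0], [0, 0], [0, 1]]
y = [1, 1, -1, 1]
model, weights = adaboost(X, y, rounds=1)
print(weights)
```

After one round on this toy data, the only example the first stump gets wrong (the last one) carries the largest weight, so the next round's stump is chosen mainly by how well it handles that example.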

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Bangalore, Srinivas; Gupta, Narendra K.; Rahim, Mazin G.
Primary Examiner(s): SPOONER, LAMONT M
Application Number: US14/468,442
Publication Number: US 20150025886A1
Time in Patent Office: 798 Days
Field of Search: 704/1, 704/9, 704/10
US Class Current: 1/1
CPC Class Codes:
G06F 40/211   Syntactic parsing, e.g. bas...
G06F 40/284   Lexical analysis, e.g. toke...
G10L 15/063   Training
G10L 2015/0635   updating or merging of old ...
