System and method of extracting clauses for spoken language understanding

US 9,703,769 B2
Filed: 10/07/2015
Issued: 07/11/2017
Est. Priority Date: 12/24/2002
Status: Expired due to Term

4 Assignments

0 Petitions

Abstract

A clausifier and method of extracting clauses for spoken language understanding are disclosed. The method relates to generating a set of clauses from speech utterance text and comprises inserting at least one boundary tag in speech utterance text related to sentence boundaries, inserting at least one edit tag indicating a portion of the speech utterance text to remove, and inserting at least one conjunction tag within the speech utterance text. The result is a set of clauses that may be identified within the speech utterance text according to the inserted at least one boundary tag, at least one edit tag and at least one conjunction tag. The disclosed clausifier comprises a sentence boundary classifier, an edit detector classifier, and a conjunction detector classifier. The clausifier may comprise a single classifier or a plurality of classifiers to perform the steps of identifying sentence boundaries, editing text, and identifying conjunctions within the text.
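The three kinds of tags described in the abstract can be illustrated with a minimal sketch. The concrete tag spellings here are assumptions made for illustration only and are not taken from the patent: `<s>` stands for a sentence boundary tag, `<edit>` and `</edit>` delimit a portion to remove, and `<cN>` is a conjunction tag whose span N says how many words to its left make up the conjunction. Dropping the conjunction itself from the output is likewise an assumption.

```python
import re

# Hypothetical tag format: <cN> means the conjunction covers the N words to its left.
CONJ_TAG = re.compile(r"<c(\d+)>")

def extract_clauses(tokens):
    """Split a tagged token sequence into clauses (illustrative sketch only)."""
    clauses, current = [], []
    in_edit = False

    def flush():
        # Close the clause being built, if it has any words.
        if current:
            clauses.append(" ".join(current))
            current.clear()

    for tok in tokens:
        if tok == "<edit>":
            in_edit = True            # start of a portion marked for removal
        elif tok == "</edit>":
            in_edit = False           # end of the removed portion
        elif in_edit:
            continue                  # skip words inside the edit region
        elif tok == "<s>":
            flush()                   # a boundary tag ends the current clause
        elif (m := CONJ_TAG.fullmatch(tok)):
            span = int(m.group(1))
            if span:                  # guard: a span of 0 must not delete anything
                del current[-span:]   # remove the conjunction words themselves
            flush()                   # the conjunction separates two clauses
        else:
            current.append(tok)
    flush()
    return clauses

tokens = ("i want to <edit> to uh </edit> pay my bill and then <c2> "
          "check my balance <s> thanks <s>").split()
clauses = extract_clauses(tokens)
# clauses == ["i want to pay my bill", "check my balance", "thanks"]
```

In this sketch the tags are assumed to be already present in the text; how they are predicted is a separate classification step.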

29 Citations

17 Claims

1. A method comprising:
- inserting, via a discriminative classification approach, boundary tags into speech utterance text, the boundary tags identifying boundaries selected from a group comprising phrase boundaries, sentence boundaries, and paragraph boundaries, wherein the discriminative classification approach utilizes syntactic features before and after each word being tagged, to yield boundary marked speech utterance text and unedited text;
  
  identifying, via a processor, a coordinating conjunction within the unedited text based on a conjunction tag, wherein the conjunction tag comprises conjunction span information indicating how many words to the left of the conjunction tag a corresponding conjunction includes; and
  
  identifying clauses in the speech utterance text based on the boundary marked speech utterance text and the coordinating conjunction.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method of claim 1, further comprising inserting an edit tag within the speech utterance text to indicate a portion of the boundary marked speech utterance text for removal.
  - 3. The method of claim 1, further comprising inserting the conjunction tag within the boundary marked speech utterance text.
  - 4. The method of claim 1, wherein a different classifier performs each step of the method.
  - 5. The method of claim 1, wherein a single classifier performs the inserting of the boundary tags, the identifying of the coordinating conjunction, and the identifying of the clauses in the speech utterance text.
  - 6. The method of claim 1, wherein a plurality of classifiers perform the inserting of the boundary tags, the identifying of the coordinating conjunction, and the identifying of the clauses in the speech utterance text.
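Claim 1 recites a discriminative classification approach that uses syntactic features before and after each word being tagged. As a rough illustration only, the sketch below builds such a context window for one word. Using part-of-speech tags as the syntactic feature, the window width of 2, the `<pad>` marker, and all feature names are assumptions for this example and are not specified by the claims.

```python
def window_features(words, pos_tags, i, width=2):
    """Collect features around word i for a boundary classifier (hypothetical sketch).

    words and pos_tags are parallel lists; pos_tags is assumed to come from
    any part-of-speech tagger.
    """
    feats = {"word": words[i], "pos": pos_tags[i]}
    for k in range(1, width + 1):
        left, right = i - k, i + k
        # Syntactic context before the word, padded at the start of the utterance.
        feats[f"pos-{k}"] = pos_tags[left] if left >= 0 else "<pad>"
        # Syntactic context after the word, padded at the end of the utterance.
        feats[f"pos+{k}"] = pos_tags[right] if right < len(pos_tags) else "<pad>"
    return feats

words = ["pay", "my", "bill", "and", "check"]
pos_tags = ["VB", "PRP$", "NN", "CC", "VB"]
feats = window_features(words, pos_tags, 2)
# feats == {"word": "bill", "pos": "NN",
#           "pos-1": "PRP$", "pos+1": "CC", "pos-2": "VB", "pos+2": "VB"}
```

A feature dictionary of this kind could be passed to any discriminative classifier that decides, per word, whether a boundary tag follows; the choice of classifier is left open here.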

7. A system comprising:
- a processor; and
  
  a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
  
  inserting, via a discriminative classification approach, boundary tags into speech utterance text, the boundary tags identifying boundaries selected from a group comprising phrase boundaries, sentence boundaries, and paragraph boundaries, wherein the discriminative classification approach utilizes syntactic features before and after each word being tagged, to yield boundary marked speech utterance text and unedited text;
  
  identifying a coordinating conjunction within the unedited text based on a conjunction tag, wherein the conjunction tag comprises conjunction span information indicating how many words to the left of the conjunction tag a corresponding conjunction includes; and
  
  identifying clauses in the speech utterance text based on the boundary marked speech utterance text and the coordinating conjunction.
- Dependent claims (8, 9, 10, 11, 12):
  - 8. The system of claim 7, the computer-readable storage medium having additional instructions stored which, when executed by the processor, cause the processor to perform operations comprising inserting an edit tag within the speech utterance text to indicate a portion of the boundary marked speech utterance text for removal.
  - 9. The system of claim 7, the computer-readable storage medium having additional instructions stored which, when executed by the processor, cause the processor to perform operations comprising inserting the conjunction tag within the boundary marked speech utterance text.
  - 10. The system of claim 7, wherein a different classifier performs each step of the operations.
  - 11. The system of claim 7, wherein a single classifier performs the inserting of the boundary tags, the identifying of the coordinating conjunction, and the identifying of the clauses in the speech utterance text.
  - 12. The system of claim 7, wherein a plurality of classifiers perform the inserting of the boundary tags, the identifying of the coordinating conjunction, and the identifying of the clauses in the speech utterance text.

13. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
- inserting, via a discriminative classification approach, boundary tags into speech utterance text, the boundary tags identifying boundaries selected from a group comprising phrase boundaries, sentence boundaries, and paragraph boundaries, wherein the discriminative classification approach utilizes syntactic features before and after each word being tagged, to yield boundary marked speech utterance text and unedited text;
  
  identifying a coordinating conjunction within the unedited text based on a conjunction tag, wherein the conjunction tag comprises conjunction span information indicating how many words to the left of the conjunction tag a corresponding conjunction includes; and
  
  identifying clauses in the speech utterance text based on the boundary marked speech utterance text and the coordinating conjunction.
- Dependent claims (14, 15, 16, 17):
  - 14. The computer-readable storage device of claim 13, having additional instructions stored which, when executed by the computing device, cause the computing device to perform operations comprising inserting an edit tag within the speech utterance text to indicate a portion of the boundary marked speech utterance text for removal.
  - 15. The computer-readable storage device of claim 13, having additional instructions stored which, when executed by the computing device, cause the computing device to perform operations comprising inserting the conjunction tag within the boundary marked speech utterance text.
  - 16. The computer-readable storage device of claim 13, wherein a different classifier performs each step of the operations.
  - 17. The computer-readable storage device of claim 13, wherein a single classifier performs the inserting of the boundary tags, the identifying of the coordinating conjunction, and the identifying of the clauses in the speech utterance text.

Current Assignee
Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee
Nuance Communications, Inc. (Microsoft Corporation)
Inventors
Gupta, Narendra K.; Bangalore, Srinivas; Gilbert, Mazin
Primary Examiner(s)
SPOONER, LAMONT M

Application Number
US14/877,272
Publication Number
US 20160026618A1
Time in Patent Office
643 Days
Field of Search
704/1, 704/9, 704/10
US Class Current
CPC Class Codes

G06F 40/117   Tagging; Marking up details...

G06F 40/20   Natural language analysis s...

G06F 40/205   Parsing

G06F 40/289   Phrasal analysis, e.g. fini...

G10L 15/05   Word boundary detection

G10L 15/26   Speech to text systems G10L...

G10L 2015/025   Phonemes, fenemes or fenone...

G10L 2015/081   Search algorithms, e.g. Bau...

G10L 2015/088   Word spotting
