Method and apparatus for managing information

US 5,526,407 A
Filed: 03/17/1994
Issued: 06/11/1996
Est. Priority Date: 09/30/1991
Status: Expired due to Fees

Abstract

A method and apparatus for recording, categorizing, organizing, managing and retrieving speech information obtains a speech stream; stores the speech stream in at least a temporary storage; provides a visual representation of portions of the speech stream to the user; categorizes portions of the speech stream, with or without the aid of the visual representation, by user command and/or by automatic recognition of speech qualities; stores, in at least a temporary storage, a structure which represents the categorized portions of the speech stream; and selectively retrieves one or more of the categorized portions of the speech stream.
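
As a rough illustration of the last two steps of the abstract (storing a structure that represents the categorized portions, then selectively retrieving them), the following Python sketch may help. The names `Portion`, `SpeechIndex`, `categorize`, and `retrieve`, and the use of offsets in seconds to delimit a portion, are assumptions made for this example; the patent does not prescribe any particular data layout.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Portion:
    """A categorized span of the stored speech stream (times in seconds)."""
    start: float
    end: float
    categories: frozenset


@dataclass
class SpeechIndex:
    """Hypothetical structure representing the categorized portions."""
    portions: list = field(default_factory=list)

    def categorize(self, start, end, *categories):
        # Record a categorization, whether it came from a user command
        # or from an automatic recognizer.
        if end <= start:
            raise ValueError("a portion must have positive duration")
        self.portions.append(Portion(start, end, frozenset(categories)))

    def retrieve(self, category):
        # Selectively retrieve the portions tagged with one category,
        # returned in time order.
        hits = [p for p in self.portions if category in p.categories]
        return sorted(hits, key=lambda p: p.start)


index = SpeechIndex()
index.categorize(12.0, 20.5, "action item")
index.categorize(0.0, 4.2, "greeting")
index.categorize(30.0, 41.0, "action item", "deadline")
action_items = [(p.start, p.end) for p in index.retrieve("action item")]
```

Here the index holds only time offsets and category labels, so the audio itself can stay in whatever temporary or permanent storage holds the speech stream.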

57 Claims

1. A method for recording, categorizing, organizing, managing and retrieving speech information, said method comprising,
- a. obtaining a speech stream,
- b. storing the speech stream in at least a temporary storage,
- c. extracting multiple, selected features from the speech stream, wherein the multiple features include the speaker's identity or location, duration of speech phrases, and pauses in speaking,
- d. constructing a visual representation of the selected features of the speech stream,
- e. providing the visual representation to a user,
- f. categorizing portions of the speech stream, with or without the aid of the representation, by at least one of the following categorization techniques:
  - user command and,
  - automatic recognition of speech qualities, including tempo, fundamental pitch, and phonemes, and
- g. storing, in at least a temporary storage, data structure which represents the categorized portions of the speech stream.
- Dependent claims (2 to 25):
  - 2. The invention defined in claim 1 including directing the speech stream, as initially obtained, to a permanent storage.
  - 3. The invention defined in claim 1 including selectively retrieving one or more of the categorized portions of the speech stream.
  - 4. The invention defined in claim 1 including controlling, under user control, display format of the representation for display of categories of particular interest.
  - 5. The invention defined in claim 1 wherein the visual representation of the speech stream and the storage of the speech stream in at least a temporary storage enable the categorizing of the portions of the speech stream to be done by a user at a time subsequent to an initial obtaining of the speech stream including at a time which occurs later than the initial obtaining of the speech stream.
  - 6. The invention defined in claim 1 wherein the categorization is done by reference only to the visual representation without the need to actually listen to the speech itself.
  - 7. The invention defined in claim 1 wherein the visual representation is employed by a user to select the portion of the speech to be retrieved.
  - 8. The invention defined in claim 1 wherein the categorization determines which portions of the speech stream are saved in permanent storage.
  - 9. The invention defined in claim 1 wherein the visual representation shows patterns of the speech that occur over a period of time during the obtaining of the speech stream.
  - 10. The invention defined in claim 1 which includes forming as part of the visual representation a document which includes category headings and wherein selected categorized portions of one or more speech streams are incorporated in the document, being located under a respective category heading of the document.
  - 11. The invention defined in claim 1 wherein the visual representation includes overlays indicating a particular categorization applied to a particular portion of the speech stream.
  - 12. The invention defined in claim 1 including marking the visual representation to select portions of the speech for further processing.
  - 13. The invention defined in claim 12 wherein the further processing includes preparation of speech for voice mail.
  - 14. The invention defined in claim 12 wherein the further processing includes at least one of the following:
    - selection of speech for noting on a calendar, and
    - selection of speech for updating a schedule.
  - 15. The invention defined in claim 12 wherein the further processing includes the provision of alarms for automatically reminding the user of alarm events.
  - 16. The invention defined in claim 1 wherein the categorizing includes the step of integrating of reference notes, including both manual and programmed notes, within the stored data structure of the speech stream.
  - 17. The invention defined in claim 16 wherein the integrating of the notes occurs concurrently with obtaining the speech stream.
  - 18. The invention defined in claim 16 wherein the integrating of notes occurs after the speech stream is obtained.
  - 19. The method defined in claim 1 wherein the categorizing includes automatically detecting and recording and visually displaying the speaker's identity, pauses, non-speech sounds, emphasis, laughter, or pre-selected key words as pre-programmed by a user.
  - 20. The invention defined in claim 1 wherein the speech stream comes from a telephone call.
  - 21. The invention defined in claim 20 wherein the categorization includes categorizing by caller identity, date of telephone call, number called, time of the telephone call, and duration of the telephone call.
  - 22. The invention defined in claim 1 wherein the thresholds of automatic categorization are under user control.
  - 23. The invention defined in claim 1 which includes selectively retrieving categorized portions of the speech stream in any desired order for subsequent processing including audio play back and transcription, and wherein the selectively retrieving comprises both including and excluding by category.
  - 24. The invention defined in claim 23 wherein the excluding by category comprises excluding pauses and non-speech sounds to thereby reduce the amount of time required for the selective retrieval and to improve the clarity and understanding of the retrieved categorization portions of the speech stream.
  - 25. The invention defined in claim 1 wherein the selectively retrieving includes initially retrieving only every n^th utterance, as demarcated by detected speech pauses, in order to speed up searching and replaying.
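
Claims 23 to 25 describe retrieval that includes or excludes portions by category, drops pauses, and initially returns only every n^th utterance as demarcated by detected pauses. The sketch below is one possible reading of those steps, not the patented implementation. It assumes pauses are detected by comparing per-frame energy against a threshold, which the claims do not specify; the function names `segment_by_pauses`, `exclude_categories`, and `every_nth`, and the parameters `threshold` and `min_pause_frames`, are hypothetical.

```python
def segment_by_pauses(energies, threshold, min_pause_frames):
    """Split per-frame energies into (start, end, label) segments.

    Frames below `threshold` are silent. A silent run counts as a "pause"
    only if it lasts at least `min_pause_frames` frames; shorter silences
    are absorbed into the surrounding speech. `end` is exclusive.
    """
    # First pass: collect maximal runs of silent / non-silent frames.
    runs = []
    for i, energy in enumerate(energies):
        silent = energy < threshold
        if runs and runs[-1][2] == silent:
            runs[-1][1] = i + 1
        else:
            runs.append([i, i + 1, silent])
    # Second pass: relabel short silences as speech and merge neighbours.
    segments = []
    for start, end, silent in runs:
        is_pause = silent and end - start >= min_pause_frames
        label = "pause" if is_pause else "speech"
        if segments and segments[-1][2] == label:
            segments[-1] = (segments[-1][0], end, label)
        else:
            segments.append((start, end, label))
    return segments


def exclude_categories(segments, excluded):
    # Excluding by category (claim 24): drop e.g. pauses before playback.
    return [s for s in segments if s[2] not in excluded]


def every_nth(utterances, n):
    # Claim 25: keep only every n-th utterance to speed up skimming.
    if n < 1:
        raise ValueError("n must be at least 1")
    return utterances[::n]


energies = [0.9, 0.8, 0.0, 0.7, 0.0, 0.0, 0.0, 0.6, 0.5, 0.0, 0.0, 0.0, 0.9]
segments = segment_by_pauses(energies, threshold=0.1, min_pause_frames=3)
utterances = exclude_categories(segments, {"pause"})
preview = every_nth(utterances, 2)
```

With these inputs the single silent frame at index 2 is too short to count as a pause, so the stream splits into three utterances separated by two pauses, and `preview` keeps the first and third utterance.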

26. A method for recording, categorizing, organizing, managing and retrieving speech information transmitted by telephone, said method comprising,
- a. obtaining a speech stream from a telephone connection,
- b. storing the speech stream in at least a temporary storage,
- c. extracting multiple, selected features from the speech stream, wherein the multiple features include the speaker's identity or location, duration of speech phrases, and pauses in speaking,
- d. categorizing portions of the speech stream by user command or by automatic recognition of speech qualities, including tempo, fundamental pitch, and phonemes, and wherein the categorizing portions of the speech stream includes categorizing the speaker by indicating which end of the telephone connection the speech is coming from,
- e. storing, in at least a temporary storage, data structure which represents the categorized portions of the speech stream, and
- f. selectively retrieving one or more of the categorized portions of the speech stream.

27. A method of recording speech, said method comprising,
- capturing the speech,
- storing the captured speech in a temporary storage,
- extracting multiple, selected features from the speech stream, wherein the multiple features include the speaker's location, duration of speech phrases, and pauses in speaking,
- representing selected, extracted features of the speech in a visual form to the user,
- using the visual representation to select portions of the speech for storage and including the step of looking at the visual representation of the captured speech in the temporary storage and selectively categorizing portions of that speech, with the aid of the visual representation, after the speech has been captured in the temporary storage.

28. A method for recording and indexing speech information, said method comprising,
- obtaining a speech stream,
- storing the entire speech stream as an unannotated speech stream in a first, separate storage,
- automatically recognizing qualities of the speech stream, including tempo, fundamental pitch, and phonemes,
- categorizing portions of the speech stream by user command, and by association with the automatically recognized qualities,
- storing the categorized portions together with said automatically recognized qualities in a second storage,
- synchronizing at least a portion of the obtained speech stream with both the stored categorized portions and the stored automatically recognized qualities, and
- compiling the automatically recognized qualities with the categorized portions as compiled speech information in a manner which permits the compiled speech information to be organized, managed, and selectively retrieved by a user.
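
Claim 28 keeps the unannotated speech stream in one storage and the categorized portions, with their recognized qualities, in a second storage, then synchronizes the two. One plausible way to picture this is a store of raw samples plus a separate list of annotations that refer to the samples by offset. The sketch below assumes exactly that; the class `SpeechArchive`, its methods, the dictionary of qualities, and the use of a shared sample rate as the synchronization mechanism are all hypothetical and are not taken from the patent.

```python
class SpeechArchive:
    """Two separate stores kept in sync through sample offsets."""

    def __init__(self, sample_rate):
        self.sample_rate = sample_rate
        self.raw = []          # first storage: the unannotated stream
        self.annotations = []  # second storage: (start, end, category, qualities)

    def append_audio(self, samples):
        # The raw stream is only ever appended to, never annotated in place.
        self.raw.extend(samples)

    def annotate(self, start_s, end_s, category, qualities=None):
        # Convert times in seconds to sample offsets so that each
        # annotation stays aligned with the raw stream.
        start = round(start_s * self.sample_rate)
        end = round(end_s * self.sample_rate)
        if not 0 <= start < end <= len(self.raw):
            raise ValueError("annotation lies outside the stored stream")
        self.annotations.append((start, end, category, dict(qualities or {})))
        self.annotations.sort(key=lambda a: a[0])

    def compile_category(self, category):
        # Join each matching annotation with the audio it points at.
        return [
            {"category": c, "qualities": q, "samples": self.raw[s:e]}
            for s, e, c, q in self.annotations
            if c == category
        ]


archive = SpeechArchive(sample_rate=10)
archive.append_audio(range(30))  # stand-in for 3 seconds of audio samples
archive.annotate(1.0, 1.5, "question", {"tempo": "fast"})
archive.annotate(0.2, 0.6, "greeting")
compiled = archive.compile_category("question")
```

Because annotations store offsets rather than copies of the audio, the first storage remains an untouched record of the whole stream while the second can be edited or re-categorized freely.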

29. A speech information apparatus for recording, categorizing, organizing, managing and retrieving speech information, said apparatus comprising,
- a. speech stream means for obtaining a speech stream,
- b. first storage means for storing the speech stream in at least a temporary storage,
- c. extracting means for extracting multiple, selected features from the speech stream, and wherein the multiple features include the speaker's identity or location, duration of speech phrases, and pauses in speaking,
- d. constructing means for constructing a visual representation of the selected features of the speech stream,
- e. visual representation means for providing the visual representation to a user,
- f. categorizing means for categorizing portions of the speech stream, with or without the aid of the representation, by at least one of the following categorizing techniques:
  - user command and,
  - automatic recognition of speech qualities, including tempo, fundamental pitch, and phonemes, and
- g. second storage means for storing, in at least a temporary storage, data structure which represents the categorized portions of the speech stream.
- Dependent claims (30 to 53):
  - 30. The invention defined in claim 29 including directing means for directing the speech stream, as initially obtained, to a permanent storage.
  - 31. The invention defined in claim 29 including retrieving means for selectively retrieving one or more of the categorized portions of the speech stream.
  - 32. The invention defined in claim 29 including formatting means for controlling, under user control, a display format of the representation for display of categories of particular interest.
  - 33. The invention defined in claim 29 wherein the visual representation of the speech stream in the visual means and the storage of the speech stream in at least a temporary storage in the first storage means enable the categorizing of the portions of the speech stream to be done by a user at a time subsequent to an initial obtaining of the speech stream including at a time which occurs later than the initial obtaining of the speech stream.
  - 34. The invention defined in claim 29 wherein the categorization in the categorizing means is done by reference only to a visual representation in the visual means without the need to actually listen to the speech itself.
  - 35. The invention defined in claim 29 wherein the visual representation in the visual means is employed by a user to select the portion of the speech to be retrieved.
  - 36. The invention defined in claim 29 wherein the categorization produced in the categorizing means determines which portions of the speech stream are saved in permanent storage.
  - 37. The invention defined in claim 29 wherein the visual representation in the visual means shows patterns of the speech that occur over a period of time during the obtaining of the speech stream.
  - 38. The invention defined in claim 29 wherein the visual representation in the visual means takes the form of a document having category headings, and wherein selected categorized portions of one or more speech streams are incorporated in the document, being located under a respective category heading of the document.
  - 39. The invention defined in claim 29 wherein the visual representation in the visual means includes overlays indicating a particular categorization applied to a particular portion of the speech stream.
  - 40. The invention defined in claim 29 including processing means for processing selected items in accordance with programmed instructions and including marking means for marking the visual representation in the visual means to select portions of the speech for further processing in the processing means of those marked portions of the visual representations and related speech stream.
  - 41. The invention defined in claim 40 wherein the further processing in the processing means includes preparation of speech for voice mail.
  - 42. The invention defined in claim 40 wherein the further processing in the processing means includes at least one of the following:
    - selection of speech for noting on a calendar, and
    - selection of speech for updating a schedule.
  - 43. The invention defined in claim 40 wherein the further processing in the processing means includes the provision of alarms for automatically reminding the user of alarm events.
  - 44. The invention defined in claim 29 wherein the categorizing means include integrating means for integrating reference notes, including both manual and programmed notes, within the stored data structure of the speech stream.
  - 45. The invention defined in claim 44 wherein the integrating of the notes in the integrating means can be done concurrently with the obtaining of the speech stream.
  - 46. The invention defined in claim 44 wherein the integrating of the notes in the integrating means can be done after the speech stream is obtained.
  - 47. The invention defined in claim 29 wherein the categorizing means includes automatically detect and record and visually display on the visual means the speaker's identity, pauses, non speech sounds, emphasis, laughter, and pre-selected key words as pre-programmed by a user.
  - 48. The invention defined in claim 29 wherein the speech stream comes from a telephone call.
  - 49. The invention defined in claim 48 wherein the categorizing means categorize automatically by caller identity, date of the telephone call, number called, time of the telephone call, and duration of the telephone call.
  - 50. The invention defined in claim 29 wherein the thresholds of automatic categorizations are under user control.
  - 51. The invention defined in claim 29 which includes retrieving means for selectively retrieving categorized portions of the speech stream in any desired order for subsequent processing including audio play back and transcription, and wherein the retrieving means comprises both means for including and means for excluding by category.
  - 52. The invention defined in claim 51 wherein the means for excluding by category excludes pauses and non-speech sounds to thereby reduce the amount of time required for the selective retrieval and to improve the clarity and understanding of the retrieved categorized portions of the speech stream.
  - 53. The invention defined in claim 29 wherein the retrieving means for selectively retrieving includes means for initially retrieving only every n^th utterance, as demarcated by detected speech pauses, in order to speed up searching and replaying.

54. A speech information apparatus for recording, categorizing, organizing, managing and retrieving speech information transmitted by telephone, said apparatus comprising,
- a. a speech stream means for obtaining a speech stream from a telephone call,
- b. first storage means for storing the speech stream in at least a temporary storage,
- c. extracting means for extracting multiple, selected features from the speech stream, wherein the multiple features include the speaker's identity or location, duration of speech phrases, and pauses in speaking,
- d. categorizing means for categorizing portions of the speech stream by user command or by automatic recognition of speech qualities, including tempo, fundamental pitch, and phonemes,
- e. second storage means for storing, in at least a temporary storage, structure which represents the categorized portions of the speech stream, and
- f. retrieving means for selectively retrieving one or more of the categorized portions of the speech stream, and
- g. wherein the speech portions are categorized in the categorizing means by speaker by indicating which end of the telephone connection the speech is coming from.

55. A speech information apparatus for recording speech, said apparatus comprising,
- capture means for capturing the speech,
- temporary storage means for storing captured speech in a temporary storage,
- extracting means for extracting multiple, selected features from the speech, wherein the multiple features include the speaker's location, duration of speech phrases, and pauses in speaking,
- visual representation means for representing selected, extracted features of the speech in a visual form to a user,
- selection means for using the visual representation to select portions of the speech for storage, and including visual means for looking at the captured speech in the temporary store and categorizing means for selectively categorizing portions of that speech, with the aid of the visual representation, after the speech has been captured and stored in the temporary storage means.

56. A speech information apparatus for recording and indexing speech information, said apparatus comprising,
- speech stream means for obtaining a speech stream,
- first storage means for storing an entire speech stream as an unannotated speech stream in a first storage,
- automatic categorizing means for automatically recognizing qualities of the speech stream, including tempo, fundamental pitch, and phonemes,
- user command means for categorizing portions of the speech stream by user command and by association with the automatically recognized qualities,
- second storage means separate from the first storage means for storing the categorized portions of the speech stream together with the automatically recognized qualities,
- synchronizing means for synchronizing at least a portion of the obtained speech stream with the categorized portions and the automatically recognized qualities stored in said second storage, and
- compiling means for compiling the automatically recognized qualities with the categorized portions as compiled speech information in a manner which permits the compiled speech information to be organized, managed, selectively retrieved by a user.

57. A video information apparatus for recording, categorizing, organizing, managing and retrieving video information, said apparatus comprising,
- a. stream means for obtaining a video stream,
- b. first storage means for storing the speech stream in at least a temporary storage,
- c. extracting means for extracting multiple, selected features from the video stream,
- d. constructing means for constructing a visual representation of the selected features of the video stream,
- e. visual means for providing the visual representation to a user,
- f. categorizing means for categorizing portions of the speech stream by user command or by automatic recognition of visual or audio qualities, and
- g. second storage means for storing, in at least a temporary storage, structure which represents the categorized portions of the speech stream.

Current Assignee: Riverrun Technology
Original Assignee: Riverrun Technology
Inventors: McCusker, Michael V.; Russell, Steven P.
Primary Examiner(s): Hofsass, Jeffrey
Assistant Examiner(s): TSANG, FAN S
Application Number: US08/210,318
Time in Patent Office: 817 Days
Field of Search: 379/67, 379/88, 379/89, 379/96, 381/43, 381/44, 395/2
US Class Current: 379/88.01
CPC Class Codes: G06F 3/16 Sound input; Sound output s...
