Real-time speech-to-text conversion in an audio conference session

US 9,560,206 B2
Filed: 04/30/2010
Issued: 01/31/2017
Est. Priority Date: 04/30/2010
Status: Active Grant

7 Assignments

0 Petitions

Abstract

Various embodiments of systems, methods, and computer programs are disclosed for providing real-time resources to participants in an audio conference session. One embodiment is a method for providing real-time resources to participants in an audio conference session via a communication network. One such method comprises: a conferencing system establishing an audio conference session between a plurality of computing devices via a communication network, each computing device generating a corresponding audio stream comprising a speech signal; and in real-time during the audio conference session, a server: receiving and processing the audio streams to determine the speech signals; extracting words from the speech signals; analyzing the extracted words to determine a relevant keyword being discussed in the audio conference session; identifying a resource related to the relevant keyword; and providing the resource to one or more of the computing devices.
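The flow summarized in the abstract (extract words, find a relevant keyword, identify a related resource) can be sketched as a small chain of functions. This is only an illustrative sketch, not the patented implementation: it assumes speech has already been transcribed to text, uses a plain mention count in place of the relevance analysis, and the `CATALOG`, `STOPWORDS`, and function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical in-memory catalog standing in for a database or search engine.
CATALOG = {
    "budget": ["https://example.com/budget-template"],
    "roadmap": ["https://example.com/roadmap-overview"],
}

# Assumed stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "we", "is", "for", "in"}


def extract_words(transcript: str) -> list[str]:
    """Split already-transcribed speech into lowercase content words."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return [w for w in words if w not in STOPWORDS]


def relevant_keywords(counts: Counter, min_count: int) -> list[str]:
    """Return words mentioned at least min_count times, sorted alphabetically."""
    return sorted(w for w, n in counts.items() if n >= min_count)


def resources_for(keywords: list[str]) -> dict[str, list[str]]:
    """Look up resources for each keyword that has a catalog entry."""
    return {k: CATALOG[k] for k in keywords if k in CATALOG}


def process(transcripts: list[str], min_count: int = 2) -> dict[str, list[str]]:
    """Run the three stages over a sequence of transcribed utterances."""
    counts: Counter = Counter()
    for transcript in transcripts:
        counts.update(extract_words(transcript))
    return resources_for(relevant_keywords(counts, min_count))
```

In this sketch, a word only produces resources once it has been mentioned `min_count` times and has an entry in the catalog.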

18 Claims

1. A computer system for providing real-time resources to participants in an audio conference session, the computer system comprising:
- a conference system for establishing an audio conference session between a plurality of computing devices connected via a communication network; and
- a server configured to communicate with the conference system and the plurality of computing devices via the communication network, the server comprising:
  - a processor and a memory;
  - a pre-processing engine stored in the memory and executed by the processor, the pre-processing engine comprising logic configured to:
    - receive an audio stream associated with one or more of the computing devices, the audio stream comprising a speech signal; and
    - extract the speech signal from the audio stream;
  - a speech-to-text conversion engine stored in the memory and executed by the processor, the speech-to-text conversion engine comprising logic configured to extract words from the speech signal;
  - a relevance engine stored in the memory and executed by the processor, the relevance engine comprising an algorithm for outputting a relevant keyword or topic being discussed in the audio conference session based on a plurality of data inputs, the plurality of data inputs comprising the extracted words from the speech-to-text conversion engine, a speaker identity with a corresponding role or category of one or more participants who spoke the extracted words, the algorithm identifying the relevant keyword or topic by calculating and updating a relevance score during the audio conference session and, if the relevance score exceeds a threshold, outputting the relevant keyword or topic, wherein the relevance score is based on a usage density associated with the one or more extracted words; and
  - a resources engine stored in the memory and executed by the processor, the resources engine operatively coupled to the relevance engine and comprising logic configured to:
    - receive from the relevance engine the relevant keyword or topic identified by the algorithm based on the speaker identity with the corresponding role or category;
    - identify a plurality of resources related to the relevant keyword or topic;
    - display in a graphical user interface and during the audio conference session, the plurality of resources to the one or more computing devices in a conference user interface associated with the audio conference session; and
    - enable user-selection of one or more of the plurality of resources via the conference user interface.
- Dependent claims (2, 3, 4, 5, 6, 7, 8)
  - 2. The computer system of claim 1, wherein the conference system and server are remotely located.
  - 3. The computer system of claim 1, wherein the calculating and updating the relevance score comprises:
    - identifying a current instance of the extracted word;
    - generating a timestamp for the current instance of the extracted word; and
    - updating the relevance score for the extracted word based on the timestamp.
  - 4. The computer system of claim 1, wherein the logic configured to identify the plurality of resources related to the relevant keyword or topic comprises:
    - logic configured to query a database.
  - 5. The computer system of claim 1, wherein the logic configured to identify the plurality of resources related to the relevant keyword or topic comprises logic configured to:
    - generate a search query comprising the relevant keyword or topic;
    - send the search query to a search engine; and
    - receive a response to the search query, the response including the plurality of resources.
  - 6. The computer system of claim 5, wherein the response is provided to the one or more computing devices.
  - 7. The computer system of claim 1, wherein the graphical user interface comprises a virtual conference location based on a predefined simulated location view.
  - 8. The computer system of claim 7, wherein the virtual conference location comprises:
    - a first graphical representation of a conference location according to the predefined simulated location view; and
    - a second graphical representation of each of a plurality of participants with corresponding contact information displayed in the conference location.
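Claims 1 and 3 describe a relevance score that is updated per timestamped word instance, reflects usage density and the speaker's role or category, and triggers output when it exceeds a threshold. The claims do not give a formula, so the following is a minimal sketch under stated assumptions: usage density is taken to be the role-weighted number of mentions inside a sliding time window, and `RelevanceTracker`, `ROLE_WEIGHTS`, the window length, and the threshold value are all hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field

# Assumed role weights; the claims only say a role or category is an input.
ROLE_WEIGHTS = {"host": 2.0, "employee": 1.0, "vendor": 0.5}


@dataclass
class RelevanceTracker:
    window_seconds: float = 60.0  # assumed sliding window for usage density
    threshold: float = 3.0        # assumed score a keyword must exceed
    _mentions: dict = field(default_factory=lambda: defaultdict(deque))

    def update(self, word: str, timestamp: float, role: str) -> tuple[float, bool]:
        """Record one timestamped instance of a word.

        Returns the updated score and whether it exceeds the threshold.
        Timestamps are assumed to arrive in non-decreasing order.
        """
        weight = ROLE_WEIGHTS.get(role, 1.0)  # unknown roles get weight 1.0
        mentions = self._mentions[word]
        mentions.append((timestamp, weight))
        # Drop instances that fell out of the window ending at this timestamp.
        cutoff = timestamp - self.window_seconds
        while mentions and mentions[0][0] <= cutoff:
            mentions.popleft()
        score = sum(w for _, w in mentions)
        return score, score > self.threshold
```

With these assumed weights, a word spoken by a host, an employee, and a vendor within one minute scores 3.5 and crosses the threshold of 3.0, while a single later mention after the window has passed scores only 1.0.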

9. A method for providing real-time resources to participants in an audio conference session via a communication network, the method comprising:
- a conferencing system establishing an audio conference session between a plurality of computing devices via a communication network, each computing device generating a corresponding audio stream comprising a speech signal; and
- in real-time during the audio conference session, a server:
  - receiving and processing the audio streams to determine the speech signals;
  - extracting words from the speech signals;
  - inputting the extracted words to a relevance algorithm;
  - the relevance algorithm outputting a relevant keyword being discussed in the audio conference session based on a speaker identity with a corresponding role or category of one or more participants who spoke the extracted words, the relevant keyword determined by calculating and updating a relevance score during the audio conference session and, if the relevance score exceeds a threshold, outputting the relevant keyword, wherein the relevance score is based on a usage density associated with the one or more extracted words;
  - identifying a plurality of resources related to the relevant keyword output from the relevance algorithm based on the speaker identity with the corresponding role or category of the one or more participants who spoke the extracted words;
  - displaying the plurality of resources to one or more of the computing devices in a conference user interface associated with the audio conference session; and
  - receiving a user selection of one or more of the plurality of resources to present, via the conference user interface to the one or more participants.
- Dependent claims (10, 11, 12, 13, 14, 15, 16, 17, 18)
  - 10. The method of claim 9, wherein the conferencing system and the server comprise an integrated computer system.
  - 11. The method of claim 9, wherein the identifying the plurality of resources related to the relevant keyword comprises querying a search engine.
  - 12. The method of claim 11, wherein results of the search engine query are displayed in the conference user interface to the computing device.
  - 13. The method of claim 9, wherein the identifying the plurality of resources related to the relevant keyword comprises determining a URL associated with the corresponding resource, and the providing the plurality of resources to the computing device comprises providing the URL to a browser associated with the computing device.
  - 14. The method of claim 9, wherein one or more of the plurality of resources comprises a hypertext link.
  - 15. The method of claim 9, wherein the displaying the plurality of resources to the computing device comprises providing the plurality of resources within a virtual conference location.
  - 16. The method of claim 15, wherein the plurality of resources are displayed in a resources pane in the virtual conference location, wherein the resources pane enables the one or more participants to navigate and select one or more of the plurality of resources.
  - 17. The method of claim 9, wherein the role or category associated with the speaker identity is based on one or more of an organizational hierarchy, social networking information, or whether the speaker identity corresponds to an employee or vendor.
  - 18. The method of claim 9, wherein the role or category associated with the speaker identity is based on one or more of an organizational hierarchy, social networking information, or whether the speaker identity corresponds to an employee or vendor.
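Claims 11 and 13 mention querying a search engine and providing a resource URL to a browser, and claim 9 ends with a user selecting among the displayed resources. A minimal sketch of those two steps follows. The endpoint `https://search.example.com/search`, its `q` and `n` parameters, and the resource dictionaries are hypothetical placeholders, not anything specified in the patent.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://search.example.com/search"  # hypothetical endpoint


def build_search_url(keyword: str, max_results: int = 5) -> str:
    """Build a search URL for a relevant keyword (parameter names are assumed)."""
    query = urlencode({"q": keyword, "n": max_results})
    return f"{SEARCH_ENDPOINT}?{query}"


def select_resources(resources: list[dict], chosen: list[int]) -> list[str]:
    """Return URLs of the resources a user picked, ignoring out-of-range indexes."""
    return [resources[i]["url"] for i in chosen if 0 <= i < len(resources)]
```

For example, `build_search_url("quarterly budget")` yields a URL whose query string is `q=quarterly+budget&n=5`, which a conference user interface could hand to a browser.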

Current Assignee: American Teleconferencing Services Limited (Premiere Global Services Incorporated)
Original Assignee: American Teleconferencing Services Limited (Premiere Global Services Incorporated)
Inventors: Jones, Boland T.; Guthrie, David Michael; Schaefer, Laurence; Martin, J Douglas
Primary Examiner(s): Serrou, Abdelali
Application Number: US12/771,400
Publication Number: US 20110270609A1
Time in Patent Office: 2,468 Days
Field of Search: 704/235, 704/252, 704/231, 704/251
US Class Current: 1/1

CPC Class Codes
G10L 15/26   Speech to text systems G10L...
G10L 2015/088   Word spotting
H04M 2201/38   Displays
H04M 2201/40   using speech recognition
H04M 2203/2038   Call context notifications
H04M 2203/655   Combination of telephone se...
H04M 3/56   Arrangements for connecting...
H04M 3/562   where the conference facili...
H04M 3/565   relating to time schedule a...
H04M 7/0027   Collaboration services wher...
