Speech recognition and summarization

US 10,185,711 B1
Filed: 07/05/2016
Issued: 01/22/2019
Est. Priority Date: 09/10/2012
Status: Active Grant

2 Assignments

0 Petitions

Abstract

The subject matter of this specification can be embodied in, among other things, a method that includes receiving two or more data sets each representing speech of a corresponding individual attending an internet-based social networking video conference session, decoding the received data sets to produce corresponding text for each individual attending the internet-based social networking video conference, and detecting characteristics of the session from a coalesced transcript produced from the decoded text of the attending individuals for providing context to the internet-based social networking video conference session.
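
The coalescing step described in the abstract might be sketched as follows. This is only an illustration under stated assumptions: the `Segment` structure, its `start` timestamp field, and the function names are hypothetical and do not come from the patent, which does not specify how the per-participant text is merged.

```python
from dataclasses import dataclass
from heapq import merge


@dataclass(frozen=True)
class Segment:
    """One decoded stretch of speech (hypothetical structure)."""
    start: float   # seconds since the session began (assumed field)
    speaker: str
    text: str


def coalesce(*streams):
    """Merge per-participant segment lists, each already sorted by start
    time, into one transcript ordered by start time."""
    return list(merge(*streams, key=lambda seg: seg.start))


def render(transcript):
    """Format the coalesced transcript as 'speaker: text' lines."""
    return "\n".join(f"{seg.speaker}: {seg.text}" for seg in transcript)
```

Merging pre-sorted streams keeps each participant's own ordering intact while interleaving speakers, which is one plausible way to obtain a single transcript from which session characteristics can be detected.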

90 Citations

17 Claims

1. A computer-implemented method comprising:
- receiving, by a videoconference system that includes an automated speech recognizer and a context builder and from a first computing device that includes a microphone, speech data representing an utterance spoken by a particular participant of a video conference and captured by the microphone of the first computing device;
  
  transcribing, by the automated speech recognizer of the video-conference system, the speech data representing the utterance spoken by the particular participant of the video conference into text in real-time;
  
  determining, by the automated speech recognizer of the videoconference system, a topic of the video conference by analyzing one or more words and/or phrases in the text of the speech data;
  
  annotating, by the automated speech recognizer of the videoconference system, the text of the speech data by:
  
  determining one or more relevant terms in the text of the speech data as being potentially relevant to the determined topic; and
  
  identifying, using the one or more relevant terms in the text, one or more resources associated with the determined topic of the video conference, each identified resource comprising at least one of advertising content, a search result, an event, or a location; and
  
  for each identified resource:
  
  generating, using the context builder of the video conference system, a user interface component for the identified resource; and
  
  outputting, by the context builder of the video conference system, the corresponding user interface component for the identified resource to a second computing device in real-time, the corresponding user interface component when received by the second computing device causing the second computing device to display the corresponding user interface component on a videoconference graphical user interface executing on the second computing device.
  - 2. The computer-implemented method of claim 1, wherein identifying one or more resources associated with the determined topic of the video conference comprises obtaining, from an advertising module, the advertising content that corresponds to the determined topic of the video conference.
  - 3. The computer-implemented method of claim 1, wherein identifying one or more resources associated with the determined topic of the video conference comprises:
    - obtaining, from a search engine, one or more search results that are identified as a result of performing a query using at least one of the one or more relevant terms in the text of the speech data representing the utterance spoken by the particular participant of the video conference; and
      
      selecting a particular search result from among the one or more search results identified as the result of performing the query.
  - 4. The computer-implemented method of claim 1, wherein generating the corresponding user interface component for the identified resource comprises:
    - identifying the event associated with the determined topic of the video conference; and
      
      generating an invitation for the identified event, the invitation comprising at least one of a calendar date, time, location, or guest list for the identified event.
  - 5. The computer-implemented method of claim 1, wherein generating the corresponding user interface component for the identified resource comprises:
    - identifying the location associated with the determined topic of the video conference; and
      
      generating a map image associated with the identified location.
  - 6. The computer-implemented method of claim 1, wherein generating the corresponding user interface component for the identified resource comprises:
    - identifying the location associated with the determined topic of the video conference; and
      
      generating a hyperlink to a map associated with the identified location.
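
The steps of claim 1 that follow transcription (determining a topic, finding relevant terms, identifying resources, and generating a user interface component per resource) could be sketched roughly as below. Everything named here is an assumption for illustration: the keyword tables, the `RESOURCES` catalog, and the `UIComponent` type are hypothetical, and keyword counting stands in for whatever topic analysis an actual system would use. Speech recognition itself is out of scope; the sketch starts from already transcribed text.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical topic vocabulary; a real system would likely use a trained model.
TOPIC_KEYWORDS = {
    "dining": {"lunch", "dinner", "restaurant", "pizza"},
    "travel": {"flight", "hotel", "airport", "trip"},
}

# Hypothetical resource catalog keyed by (topic, relevant term).
RESOURCES = {
    ("dining", "pizza"): ("search_result", "Pizza places nearby"),
    ("dining", "lunch"): ("event", "Team lunch"),
    ("travel", "hotel"): ("location", "Hotel district"),
}


@dataclass(frozen=True)
class UIComponent:
    kind: str    # "advertising", "search_result", "event", or "location"
    label: str


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [w.strip(".,!?").lower() for w in text.split()]


def determine_topic(text):
    """Return the topic whose keywords occur most often, or None if none occur."""
    words = tokenize(text)
    counts = Counter()
    for topic, keywords in TOPIC_KEYWORDS.items():
        counts[topic] = sum(1 for w in words if w in keywords)
    topic, hits = counts.most_common(1)[0]
    return topic if hits > 0 else None


def relevant_terms(text, topic):
    """Return the words in the text that belong to the topic's vocabulary,
    in order of first appearance and without repeats."""
    terms = []
    for w in tokenize(text):
        if w in TOPIC_KEYWORDS.get(topic, ()) and w not in terms:
            terms.append(w)
    return terms


def build_components(text):
    """Produce one UI component for each resource found via a relevant term."""
    topic = determine_topic(text)
    if topic is None:
        return []
    components = []
    for term in relevant_terms(text, topic):
        resource = RESOURCES.get((topic, term))
        if resource is not None:
            components.append(UIComponent(*resource))
    return components
```

Under these assumptions, `build_components("Shall we get pizza for lunch?")` yields a search-result component and an event component, which the claimed system would then output to the second computing device for display.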

7. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  receiving, by a videoconference system that includes an automated speech recognizer and a context builder and from a first computing device that includes a microphone, speech data representing an utterance spoken by a particular participant of a video conference and captured by the microphone of the first computing device;
  
  transcribing, by the automated speech recognizer of the video-conference system, the speech data representing the utterance spoken by the particular participant of the video conference into text in real-time;
  
  determining, by the automated speech recognizer of the videoconference system, a topic of the video conference by analyzing one or more words and/or phrases in the text of the speech data;
  
  annotating, by the automated speech recognizer of the videoconference system, the text of the speech data by:
  
  determining one or more relevant terms in the text of the speech data as being potentially relevant to the determined topic; and
  
  identifying, using the one or more relevant terms in the text, one or more resources associated with the determined topic of the video conference, each identified resource comprising at least one of advertising content, a search result, an event, or a location; and
  
  for each identified resource:
  
  generating, using the context builder of the video conference system, a corresponding user interface component for the identified resource; and
  
  outputting, by the context builder of the video conference system, the corresponding user interface component for the identified resource to a second computing device in real-time, the corresponding user interface component when received by the second computing device causing the second computing device to display the corresponding user interface component on a videoconference graphical user interface executing on the second computing device.
  - 8. The system of claim 7, wherein identifying one or more resources associated with the determined topic of the video conference comprises obtaining, from an advertising module, the advertising content that corresponds to the determined topic of the video conference.
  - 9. The system of claim 7, wherein identifying one or more resources associated with the determined topic of the video conference comprises:
    - obtaining, from a search engine, one or more search results that are identified as a result of performing a query using at least one of the one or more relevant terms in the text of the speech data representing the utterance spoken by the particular participant of the video conference; and
      
      selecting a particular search result from among the one or more search results identified as the result of performing the query.
  - 10. The system of claim 7, wherein generating the corresponding user interface component for the identified resource comprises:
    - identifying the event associated with the determined topic of the video conference; and
      
      generating an invitation for the identified event, the invitation comprising at least one of a calendar date, time, location, or guest list for the identified event.
  - 11. The system of claim 7, wherein generating the corresponding user interface component for the identified resource comprises:
    - identifying the location associated with the determined topic of the video conference; and
      
      generating a map image associated with the identified location.
  - 12. The system of claim 7, wherein generating the corresponding user interface component for the identified resource comprises:
    - identifying the location associated with the determined topic of the video conference; and
      
      generating a hyperlink to a map associated with the identified location.
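
The event and location components recited in claims 10 and 12 might look something like the following sketch. The `Invitation` fields mirror the elements the claim lists (calendar date, time, location, guest list), but the type itself, the rendering format, and the `MAP_BASE_URL` placeholder host are assumptions made for illustration; no particular map service is implied by the claims.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlencode

# Placeholder host for illustration only; not a real map service.
MAP_BASE_URL = "https://maps.example.com/search"


@dataclass
class Invitation:
    """An invitation carrying at least one of date, time, location, guests."""
    title: str
    date: Optional[str] = None
    time: Optional[str] = None
    location: Optional[str] = None
    guests: List[str] = field(default_factory=list)


def render_invitation(inv):
    """Render only the fields that are present, one per line."""
    lines = [f"Invitation: {inv.title}"]
    if inv.date:
        lines.append(f"Date: {inv.date}")
    if inv.time:
        lines.append(f"Time: {inv.time}")
    if inv.location:
        lines.append(f"Location: {inv.location}")
    if inv.guests:
        lines.append("Guests: " + ", ".join(inv.guests))
    return "\n".join(lines)


def map_hyperlink(location):
    """Build a URL-encoded link to a map search for the given location."""
    return f"{MAP_BASE_URL}?{urlencode({'q': location})}"
```

Keeping every invitation field optional reflects the "at least one of" wording, so a component can be generated even when the conversation has only established, say, a date.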

13. A computer-readable storage device storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- receiving, by a videoconference system that includes an automated speech recognizer and a context builder and from a first computing device that includes a microphone, speech data representing an utterance spoken by a particular participant of a video conference and captured by the microphone of the first computing device;
  
  transcribing, by the automated speech recognizer of the video-conference system, the speech data representing the utterance spoken by the particular participant of the video conference into text in real-time;
  
  determining, by the automated speech recognizer of the videoconference system, a topic of the video conference by analyzing one or more words and/or phrases in the text of the speech data;
  
  annotating, by the automated speech recognizer of the videoconference system, the text of the speech data by:
  
  determining one or more relevant terms in the text of the speech data as being potentially relevant to the determined topic; and
  
  identifying, using the one or more relevant terms in the text, one or more resources associated with the determined topic of the video conference, each identified resource comprising at least one of advertising content, a search result, an event, or a location; and
  
  for each identified resource:
  
  generating, using the context builder of the video conference system, a corresponding user interface component for the identified resource; and
  
  outputting, by the context builder of the video conference system, the corresponding user interface component for the identified resource to a second computing device in real-time, the corresponding user interface component when received by the second computing device causing the second computing device to display the corresponding user interface component on a videoconference graphical user interface executing on the second computing device.
  - 14. The computer-readable storage device of claim 13, wherein identifying one or more resources associated with the determined topic of the video conference comprises:
    - obtaining, from a search engine, one or more search results that are identified as a result of performing a query using at least one of the one or more relevant terms in the text of the speech data representing the utterance spoken by the particular participant of the video conference; and
      
      selecting a particular search result from among the one or more search results identified as the result of performing the query.
  - 15. The computer-readable storage device of claim 13, wherein generating the corresponding user interface component for the identified resource comprises:
    - identifying the event associated with the determined topic of the video conference; and
      
      generating an invitation for the identified event, the invitation comprising at least one of a calendar date, time, location, or guest list for the identified event.
  - 16. The computer-readable storage device of claim 13, wherein generating the corresponding user interface component for the identified resource comprises:
    - identifying the location associated with the determined topic of the video conference; and
      
      generating a map image associated with the identified location.
  - 17. The computer-readable storage device of claim 13, wherein generating the corresponding user interface component for the identified resource comprises:
    - identifying the location associated with the determined topic of the video conference; and
      
      generating a hyperlink to a map associated with the identified location.
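
The search-based resource identification of claim 14 (query a search engine with relevant terms, then select one result) could be sketched as below. The claims do not say how the particular result is chosen, so the term-overlap rule here is an assumption, as are the `SearchResult` type and the `search` callable, which stands in for any real search engine interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class SearchResult:
    title: str
    url: str


def select_search_result(
    terms: Sequence[str],
    search: Callable[[str], List[SearchResult]],
) -> Optional[SearchResult]:
    """Query with the relevant terms and pick the result whose title shares
    the most words with those terms (assumed selection rule). Ties go to
    the earliest result; returns None when the search yields nothing."""
    query = " ".join(terms)
    results = search(query)
    if not results:
        return None
    wanted = {t.lower() for t in terms}

    def overlap(result):
        return len(wanted & set(result.title.lower().split()))

    return max(results, key=overlap)
```

Passing the search engine in as a callable keeps the selection logic testable without network access; a caller would supply a function that wraps whatever search backend is actually available.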

Current Assignee
Google LLC (Alphabet Inc.)
Original Assignee
Google LLC (Alphabet Inc.)
Inventors
Shires, Glen; Swigart, Sterling; Zolla, Jonathan; Gauci, Jason J.
Primary Examiner(s)
Guerra-Erazo, Edgar X

Application Number

US15/202,039
Time in Patent Office

931 Days
Field of Search

704/9, 704/10, 704/235, 704/243, 704/246, 704/270, 704/270.1, 704/275
US Class Current
CPC Class Codes

G06F 40/279   Recognition of textual enti...

G06F 40/30   Semantic analysis

G10L 15/1815   Semantic context, e.g. disa...

G10L 15/1822   Parsing for meaning underst...

G10L 15/26   Speech to text systems G10L...

G10L 21/10   Transforming into visible i...

H04N 7/15   Conference systems
