Speech recognition method, device and system based on artificial intelligence

US 10,360,913 B2
Filed: 12/27/2017
Issued: 07/23/2019
Est. Priority Date: 04/07/2017
Status: Active Grant

Abstract

The present disclosure provides a speech recognition method, device and system based on artificial intelligence. The method includes: collecting speech data to be recognized in a speech recognition process; sending an uplink data stream that includes the speech data to a server via an uplink connection to the server; and, in parallel with sending the uplink data stream, receiving a downlink data stream sent by the server via a downlink connection to the server. The downlink data stream includes result data, which the server obtains by performing speech recognition according to the speech data.
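
The parallel send and receive described in the abstract can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the two queues stand in for the uplink and downlink connections, `fake_server` is a hypothetical placeholder for the recognizer, and the `None` end-of-speech marker is an assumption made for this example.

```python
import queue
import threading


def fake_server(uplink: queue.Queue, downlink: queue.Queue) -> None:
    """Hypothetical recognizer: emits one result per received audio chunk."""
    while True:
        chunk = uplink.get()
        if chunk is None:  # assumed end-of-speech marker
            downlink.put(None)
            return
        downlink.put(f"partial result for {len(chunk)} bytes")


def recognize(chunks: list[bytes]) -> list[str]:
    """Send chunks on the uplink while results arrive on the downlink."""
    uplink: queue.Queue = queue.Queue()
    downlink: queue.Queue = queue.Queue()
    results: list[str] = []

    def send() -> None:
        for chunk in chunks:
            uplink.put(chunk)
        uplink.put(None)

    def receive() -> None:
        # Runs concurrently with send(), so results can be consumed
        # before the last chunk has been uploaded.
        while (item := downlink.get()) is not None:
            results.append(item)

    threads = [
        threading.Thread(target=fake_server, args=(uplink, downlink)),
        threading.Thread(target=send),
        threading.Thread(target=receive),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because the receiver does not wait for the sender to finish, a client built this way could display intermediate results while the user is still speaking.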

12 Claims

1. A speech recognition method based on artificial intelligence, comprising:
- collecting speech data to be recognized in a speech recognition process at a client device;
- sending uplink data stream from the client device to a server via an uplink connection to the server, wherein the uplink data stream comprises the speech data; and
- receiving, at the client device, downlink data stream sent by the server via a downlink connection to the server in parallel with sending, from the client device, the uplink data stream to the server, wherein the downlink data stream comprises result data, and the result data is obtained by the server performing speech recognition according to the speech data;
- wherein each of a uniform resource locator (URL) of the uplink connection and a URL of the downlink connection comprises a session identification of the speech recognition process, such that the server determines a correspondence relationship between the uplink connection and the downlink connection according to the session identifications, the URL of the uplink connection being distinct from the URL of the downlink connection.

2. The method according to claim 1, wherein the uplink connection and the downlink connection are based on a protocol, the protocol indicates a structure of data content in the uplink data stream and in the downlink data stream, and the structure of data content comprises a data type, a data length and/or value, the protocol being HyperText Transfer Protocol (HTTP);
- wherein the data type is configured to indicate a data processing mode of the data content.

3. The method according to claim 2, before sending the uplink data stream to the server via the uplink connection to the server, further comprising:
- performing packaging according to data types corresponding to the speech data, parameter data and/or application data to obtain first data content satisfying the protocol; and
- adding the first data content to the uplink data stream.

4. The method according to claim 2, after receiving the downlink data stream sent by the server via the downlink connection to the server, further comprising:
- acquiring the data type of second data content in the downlink data stream; and
- performing data processing on the second data content with a data processing mode indicated by the data type of the second data content.
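
As a rough illustration of claims 1 to 3, the sketch below builds two distinct URLs that carry the same session identification and packages data content as type, length and value. The `/up` and `/down` paths, the `sid` query parameter, the numeric type codes and the 1-byte type plus 4-byte length header are all assumptions made for this example; the claims do not specify any of them.

```python
import struct
from urllib.parse import urlencode

# Hypothetical type codes; the claims do not enumerate concrete values.
TYPE_SPEECH, TYPE_PARAM, TYPE_APP = 1, 2, 3


def connection_urls(base: str, session_id: str) -> tuple[str, str]:
    """Return (uplink_url, downlink_url); distinct URLs, same session id."""
    query = urlencode({"sid": session_id})
    return f"{base}/up?{query}", f"{base}/down?{query}"


def pack(data_type: int, value: bytes) -> bytes:
    """Package one item as type (1 byte), length (4 bytes, big-endian), value."""
    return struct.pack(">BI", data_type, len(value)) + value


def build_uplink(params: bytes, audio_chunks: list[bytes]) -> bytes:
    """Add parameter data and speech data to one uplink byte stream."""
    stream = pack(TYPE_PARAM, params)
    for chunk in audio_chunks:
        stream += pack(TYPE_SPEECH, chunk)
    return stream
```

A client following this sketch would open one HTTP request to each URL and write the output of `build_uplink` to the uplink request body while reading the downlink response.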

5. A speech recognition method based on artificial intelligence, comprising:
- receiving, at a server, an uplink data stream sent by a client via an uplink connection to the client;
- performing, by the server, speech recognition on speech data in the uplink data stream to obtain result data; and
- sending, from the server, downlink data stream to the client via a downlink connection to the client in parallel with receiving uplink data stream sent by the client, wherein the downlink data stream comprises the result data;
- wherein sending downlink data stream to the client via the downlink connection to the client comprises:
  - acquiring the downlink connection with a URL containing a session identification same as a session identification contained in a URL of the uplink connection, wherein the session identifications correspond one-to-one to speech recognition processes, the URL of the uplink connection being distinct from the URL of the downlink connection; and
  - sending the downlink data stream to the client via the acquired downlink connection.

6. The method according to claim 5, wherein the uplink connection and the downlink connection are based on a protocol, the protocol indicates a structure of data content in the uplink data stream and in the downlink data stream, and the structure of data content comprises a data type, a data length and/or value, the protocol being HyperText Transfer Protocol (HTTP);
- wherein the data type is configured to indicate a data processing mode of the data content.

7. The method according to claim 6, before performing speech recognition on speech data in the uplink data stream to obtain result data, further comprising:
- acquiring the data type of first data content in the uplink data stream; and
- acquiring a data processing mode indicated by the data type of the first data content as speech recognition.

8. The method according to claim 7, wherein after acquiring the data type, further comprising:
- when the data processing mode indicated by the data type of the first data content is not the speech recognition, performing data processing on the first data content with the data processing mode indicated by the data type of the first data content; and
- acquiring corresponding data types according to parameter data, result data and/or application data obtained by the data processing, and performing packaging according to the acquired data types to obtain second data content satisfying the protocol; and
- adding the second data content to the downlink data stream.
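
On the server side, claim 5 pairs each uplink connection with the downlink connection whose URL carries the same session identification. The following is a minimal sketch of such a lookup, not the patented implementation: the `SessionRegistry` class, the `sid` query parameter and the `/up` and `/down` paths are hypothetical names chosen for this example, and connections are represented by arbitrary Python objects.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


class SessionRegistry:
    """Maps a session identification to its uplink/downlink connections."""

    def __init__(self) -> None:
        self._sessions: dict[str, dict[str, object]] = {}

    @staticmethod
    def session_id(url: str) -> str:
        # Assumes the session identification travels as ?sid=... in the URL.
        return parse_qs(urlparse(url).query)["sid"][0]

    def register(self, url: str, connection: object) -> None:
        """Record a newly opened connection under its session id."""
        direction = "down" if urlparse(url).path.endswith("/down") else "up"
        self._sessions.setdefault(self.session_id(url), {})[direction] = connection

    def downlink_for(self, uplink_url: str) -> Optional[object]:
        """Find the downlink connection that shares the uplink's session id."""
        return self._sessions.get(self.session_id(uplink_url), {}).get("down")
```

Keying the registry on the session identification alone reflects the one-to-one relationship between session identifications and speech recognition processes stated in the claim.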

9. A speech recognition device based on artificial intelligence, comprising:
- a processor; and
- a memory, configured to store one or more software modules executable by the processor, wherein the one or more software modules comprise:
  - a collecting module, configured to collect speech data to be recognized in a speech recognition process at a client device;
  - a sending module, configured to send uplink data stream from the client device to a server via an uplink connection to the server, wherein the uplink data stream comprises the speech data; and
  - a receiving module, configured to receive, at the client device, downlink data stream sent by the server via a downlink connection to the server in parallel with sending, from the client device, the uplink data stream to the server, wherein the downlink data stream comprises result data, and the result data is obtained by the server performing speech recognition according to the speech data;
- wherein each of a URL of the uplink connection and a URL of the downlink connection comprises a session identification of the speech recognition process, such that the server determines a correspondence relationship between the uplink connection and the downlink connection according to the session identifications, the URL of the uplink connection being distinct from the URL of the downlink connection.

10. The device according to claim 9, wherein the uplink connection and the downlink connection are based on a protocol, the protocol indicates a structure of data content in the uplink data stream and in the downlink data stream, and the structure of data content comprises a data type, a data length and/or value, the protocol being HyperText Transfer Protocol (HTTP);
- wherein the data type is configured to indicate a data processing mode of the data content.

11. The device according to claim 10, wherein the one or more software modules further comprise:
- a packaging module, configured to perform packaging according to data types corresponding to the speech data, parameter data and/or application data to obtain first data content satisfying the protocol, and to add the first data content to the uplink data stream.

12. The device according to claim 10, wherein the one or more software modules further comprise:
- a processing module, configured to acquire a data type of second data content in the downlink data stream, and to perform data processing on the second data content with a data processing mode indicated by the data type of the second data content.
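
The processing module of claim 12 reads the data type of each item in the downlink data stream and applies the processing mode that the type indicates. One way to sketch this is a handler table keyed by data type. The header layout (1-byte type, 4-byte big-endian length), the `ProcessingModule` class and its `on` method are assumptions made for this example and are not taken from the patent.

```python
import struct
from typing import Callable, Iterator

HEADER = struct.Struct(">BI")  # assumed layout: 1-byte type, 4-byte length


def unpack_stream(stream: bytes) -> Iterator[tuple[int, bytes]]:
    """Yield (data_type, value) for each item in a type-length-value stream."""
    offset = 0
    while offset < len(stream):
        data_type, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        yield data_type, stream[offset:offset + length]
        offset += length


class ProcessingModule:
    """Dispatches each item to the handler registered for its data type."""

    def __init__(self) -> None:
        self._handlers: dict[int, Callable[[bytes], None]] = {}

    def on(self, data_type: int, handler: Callable[[bytes], None]) -> None:
        self._handlers[data_type] = handler

    def process(self, stream: bytes) -> None:
        for data_type, value in unpack_stream(stream):
            handler = self._handlers.get(data_type)
            if handler is not None:  # items with unknown types are skipped
                handler(value)
```

For instance, a client could register one handler that displays recognized text and another that applies parameter updates, each under its own hypothetical type code.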

Current Assignee: Baidu Online Network Technology (Beijing) Co., Ltd (Baidu Incorporated)
Original Assignee: Baidu Online Network Technology (Beijing) Co., Ltd (Baidu Incorporated)
Inventors: Du, Niandong; Xie, Yan; Tang, Haiyuan
Primary Examiner(s): Nguyen, Khai N.
Application Number: US15/854,904
Publication Number: US 20180293987A1
Time in Patent Office: 573 Days
Field of Search: 704/200, 704/231, 704/232, 704/270.1

CPC Class Codes
G10L 15/22   Procedures used during a sp...
G10L 15/30   Distributed recognition, e....
H04L 67/01   Protocols
H04L 67/10   in which an application is ...
H04L 67/141   Setup of application sessio...
H04L 67/146   Markers for unambiguous ide...
