Speech data collection over the world wide web

US 6,112,176 A
Filed: 05/16/1997
Issued: 08/29/2000
Est. Priority Date: 05/16/1997
Status: Expired due to Term

Abstract

In a computerized method for collecting speech data, Web pages of client computers connected to the Internet are enabled to acquire speech signal and information characterizing the speech. The addresses of the enabled Web pages are stored in a list in a memory of a Web server computer. Based on predetermined criteria and the list, some of the enabled client computers are selected to acquire the speech signal and information. The acquired speech signal and information are transmitted to the server computer to generate, train, and evaluate acoustic-phonetic models.

12 Claims

1. A computerized method for collecting speech processing model training data using the Internet, comprising the steps of:
- enabling client computers connected to the Internet to acquire speech signals and information characterizing the speech signals using Web pages;
  
  storing addresses of the client computers in a list in a memory of a Web server computer;
  
  selecting from the list, based upon predetermined criteria, some of the enabled client computers to acquire the speech signals and information characterizing the speech signals using the Web pages; and
  
  transmitting from at least one of the selected client computers, the acquired speech signals and information to the Web server computer, said Web server computer using the acquired and transmitted speech signals and information to generate and train speech processing models;
  
  the client computers are selected on the basis of Web domains, the Web domains are associated with specific linguistic groupings.
- Dependent claims (2, 3, 4):
  - 2. The method of claim 1 wherein the acquired speech signals and information collected at the Web server computer are used to evaluate speech processing models.
  - 3. The method of claim 1 wherein the information includes data characterizing an acoustic environment where the speech signals are initially acquired.
  - 4. The method of claim 1 wherein the information includes data characterizing the speaker of the speech signals.
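
The selecting step of claim 1 (choosing clients from the stored address list on the basis of Web domains associated with linguistic groupings) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not taken from the patent: the `DOMAIN_TO_GROUP` table, the choice of top-level domain as the "Web domain", the helper names, and the example addresses.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from a top-level Web domain to a linguistic grouping.
# The claim only says domains are "associated with specific linguistic
# groupings"; these particular pairs are illustrative assumptions.
DOMAIN_TO_GROUP = {
    "fr": "French",
    "de": "German",
    "jp": "Japanese",
}


def top_level_domain(address: str) -> str:
    """Return the last label of the host name in a client address."""
    # Addresses without a scheme are parsed as network locations.
    url = address if "//" in address else "//" + address
    host = urlsplit(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower()


def select_clients(client_list, wanted_group):
    """Pick the addresses whose domain maps to the wanted linguistic grouping."""
    return [
        address
        for address in client_list
        if DOMAIN_TO_GROUP.get(top_level_domain(address)) == wanted_group
    ]


# The list of client addresses held in the server's memory (made-up values).
clients = [
    "http://lab.example.fr/record.html",
    "http://speech.example.de/record.html",
    "studio.example.fr",
]
selected = select_clients(clients, "French")
```

Here the "predetermined criteria" reduce to a single domain lookup; a real system could combine this with other criteria, which the claim leaves open.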

5. Computer method for training acoustic-phonetic models using speech data collected over the Internet, comprising the steps of:
- using Web pages, enabling client computers connected to the Internet to acquire speech signals and information characterizing the speech signals;
  
  storing addresses of the client computers in a list in a memory of a Web server computer;
  
  selecting from the list, based upon predetermined criteria, some of the enabled client computers to acquire the speech signals and information characterizing the speech signals using the Web pages;
  
  transmitting from at least one of the selected client computers, the acquired speech signals and information to the Web server computer; and
  
  using the acquired and transmitted speech signals and information collected at the Web server computer, to generate and train acoustic-phonetic models of a speech processing system;
  
  selecting client computers on the basis of at least one of Web domain and linguistic groupings.
- Dependent claims (6, 7):
  - 6. A method as claimed in claim 5 further comprising the step of using the acquired and transmitted speech signals and information to evaluate acoustic-phonetic models.
  - 7. A method as claimed in claim 5 wherein the step of enabling includes enabling client computers to acquire information formed of at least one of data characterizing an acoustic environment where the speech signals are initially acquired and data characterizing the speaker of the speech signals.

8. Computer apparatus for collecting speech data over the Internet and training speech processing models with said collected speech data, comprising:
- a plurality of client computers connected to the Internet, each client computer having a respective Web Page enabled to acquire speech signals and information characterizing the speech signals; and
  
  a Web server computer coupled across the Internet for communicating with the client computers, said Web server computer making requests of certain client computers for speech signals and information characterizing the speech signals, in response to each request from the Web server computer, said respective certain client computers transmitting acquired speech signals and information to the Web server computer for use in training speech processing models;
  
  the Web server computer selects the certain client computers on the basis of Web domains, the Web domains are associated with specific linguistic groupings.
- Dependent claims (9, 10, 11, 12):
  - 9. Computer apparatus as claimed in claim 8 further comprising list means coupled to the Web server computer, said list means storing addresses of the client computers in a memory of the Web server computer, such that said Web server computer makes requests of certain client computers for speech signals and information characterizing the speech signals using said list means.
  - 10. Computer apparatus as claimed in claim 8 wherein the acquired speech signals and information are used to evaluate speech processing models.
  - 11. Computer apparatus as claimed in claim 8 wherein the information includes data characterizing an acoustic environment where the speech signals are acquired.
  - 12. Computer apparatus as claimed in claim 8 wherein the information includes data characterizing the speaker of the speech signals.
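
As a rough illustration of the data that claims 8 to 12 describe moving from client to server, the sketch below models one transmitted submission (speech signal plus information characterizing the acoustic environment and the speaker) and a server-side address list. The class names, field names, and the rule that the server accepts data only from listed clients are hypothetical choices for this example; the claims do not specify a data layout or an acceptance policy.

```python
from dataclasses import dataclass, field


@dataclass
class SpeechSubmission:
    """One upload from a client: the signal plus characterizing information."""
    client_address: str
    samples: bytes                                    # speech signal as acquired
    environment: dict = field(default_factory=dict)   # acoustic environment data
    speaker: dict = field(default_factory=dict)       # data about the speaker


class CollectionServer:
    """In-memory stand-in for the Web server computer (no networking)."""

    def __init__(self):
        self.client_list = []   # addresses of enabled clients (the stored list)
        self.received = []      # submissions available for model training

    def register(self, address: str) -> None:
        # Store each client address once.
        if address not in self.client_list:
            self.client_list.append(address)

    def receive(self, submission: SpeechSubmission) -> bool:
        # Assumption: only clients already on the list may contribute data.
        if submission.client_address not in self.client_list:
            return False
        self.received.append(submission)
        return True


server = CollectionServer()
server.register("client-a.example.fr")
accepted = server.receive(
    SpeechSubmission(
        client_address="client-a.example.fr",
        samples=b"\x00\x01\x02",
        environment={"microphone": "headset", "noise": "office"},
        speaker={"native_language": "French"},
    )
)
rejected = server.receive(SpeechSubmission("unknown.example.com", b""))
```

Keeping the environment and speaker information alongside each signal is what would let a later training or evaluation step group submissions by recording condition or speaker type.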

Current Assignee: Hewlett-Packard Development Company, L.P. (HP Inc.)
Original Assignee: Compaq Computer Corporation (HP Inc.)
Inventors: Weikart, Christopher M.; Goldenthal, William D.
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Azad, Abul K.
Application Number: US08/857,449
Time in Patent Office: 1,201 Days
Field of Search: 704/201, 704/270, 704/275, 704/231, 704/251, 704/257, 704/260, 707/501, 707/513
US Class Current: 704/257
CPC Class Codes: G10L 15/06 Creation of reference templ...
