LANGUAGE ANALYSIS BASED ON WORD-SELECTION, AND LANGUAGE ANALYSIS APPARATUS
Abstract
The invention relates to a method for wording-based speech analysis. To allow automated analysis of largely arbitrary features of the person from whom a voice file to be analysed originates, the invention departs from the known concept of evaluating static keyword lists for the personality type. The method comprises preparing a computer system by forming a reference sample, which allows the comparison with other persons that is necessary for feature recognition. The preparation of the computer system is followed by the recording and storage of a further voice file in addition to the voice files of the reference sample, the analysis of the additionally recorded voice file, and the output of the recognized features using at least one output unit connected to the computer system. The invention further relates to a speech analysis device for carrying out the method.
33 Claims
1-18. (canceled)
19. A method for automated language analysis based on word-selection, comprising the steps:
a) preparing a computer system (1.30) by
  aa) storing a plurality of reference language files (1.10) in a memory unit (1.20) of the computer system (1.30) in order to form a reference sample (1.40), wherein each reference language file (1.10) comprises a minimum number of 100 words, and each reference language file (1.10) originates from a different person having known characteristics,
  ab) storing a dictionary file (2.20) containing a multiplicity of different categories (2.10) in a memory unit (1.20) of the computer system (1.30), wherein all the words in the dictionary file (2.20) are classified in at least one of the categories (2.10),
  ac) making an individual comparison of each reference language file (1.10) in the reference sample (1.40) with the dictionary file (2.20) by calculating the percentage frequency (3.40) of the words in each reference language file (1.10) that are contained in each category (2.10) of the dictionary file (2.20), and
  ad) storing a set of rules (5.40) in a memory unit (1.20) of the computer system (1.30), which set of rules uses statistical and/or algorithmic methods to calculate associations at least between the percentage frequencies (3.40) calculated in step ac) in one or more categories (2.10) and at least one known characteristic (4.20) of the people from whom the reference language files (1.10) originate,
b) following preparation of the computer system in accordance with steps aa)-ad), recording and storing a language file (6.10), in addition to the reference language files (1.10) of the reference sample (1.40), in a memory unit (1.20) of the computer system (1.30), wherein each language file (6.10) and each reference language file is one of a text file or an audio file that is converted into a text file by a transcription,
c) analyzing the language file (6.10) additionally recorded and stored in step b), by
  ca) making an individual comparison of the language file (6.10) with the dictionary file (2.20) by calculating the percentage frequency (7.30) of the words in the language file (6.10) that are contained in each category (2.10) of the dictionary file (2.20), and
  cb) using the set of rules (5.40) to process the percentage frequencies (7.30) calculated in step ca), which set of rules uses statistical and/or algorithmic methods to examine the percentage frequencies (7.30) calculated in step ca) for similarities with the percentage frequencies (3.40) calculated in step ac), and classifies the language file (6.10) according to the established similarities, and associates said file with at least one known characteristic belonging to the people from whom the reference language files (1.10) originate,
d) creating an output file (8.20), which contains characteristics (4.20) associated with the language file (6.10) in step cb), and
e) outputting the output file (8.20),
f)
  fa) expanding the reference sample (1.40) in step aa) by adding as reference language files (1.10), each language file (6.10) recorded in step b),
  fb) providing a feedback through an input, which allows an evaluation of the correctness of the analysis of step c), and
  fc) updating and re-saving the set of rules (5.40) taking into account the enlarged database from step ad).
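As an illustration only, the percentage-frequency comparison of steps ac) and ca) and the similarity-based classification of step cb) could be sketched as follows. The claim leaves the "statistical and/or algorithmic methods" open; the nearest-neighbour rule with Euclidean distance used here is an assumption, not the claimed set of rules, and all names (`tokenize`, `category_percentages`, `build_reference_sample`, `classify`, `MIN_WORDS`) as well as the toy dictionary are hypothetical.

```python
import math
import re

MIN_WORDS = 100  # minimum length of a reference language file, per step aa)


def tokenize(text):
    """Lower-case the text and split it into words (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def category_percentages(text, dictionary):
    """Steps ac)/ca): percentage of the words in `text` found in each category."""
    words = tokenize(text)
    if not words:
        return {category: 0.0 for category in dictionary}
    return {
        category: 100.0 * sum(word in vocabulary for word in words) / len(words)
        for category, vocabulary in dictionary.items()
    }


def build_reference_sample(reference_files, dictionary, min_words=MIN_WORDS):
    """Step a): pair each reference file's percentages with its known characteristic."""
    sample = []
    for text, characteristic in reference_files:
        if len(tokenize(text)) < min_words:
            raise ValueError("reference language file is shorter than min_words")
        sample.append((category_percentages(text, dictionary), characteristic))
    return sample


def classify(text, sample, dictionary):
    """Step cb), sketched as a nearest-neighbour rule (an assumption)."""
    profile = category_percentages(text, dictionary)

    def distance(reference_profile):
        # Euclidean distance between the two percentage-frequency profiles.
        return math.sqrt(
            sum((profile[c] - reference_profile[c]) ** 2 for c in dictionary)
        )

    _, characteristic = min(sample, key=lambda entry: distance(entry[0]))
    return characteristic


# Toy data: far shorter than 100 words, so min_words is lowered for the demo.
dictionary = {"positive": {"good", "happy"}, "negative": {"bad", "sad"}}
references = [
    ("good happy good day", "optimist"),
    ("bad sad bad day", "pessimist"),
]
sample = build_reference_sample(references, dictionary, min_words=1)
print(classify("a happy good morning", sample, dictionary))  # prints "optimist"
```

Under this sketch, step f) would amount to appending each newly analysed file, together with the characteristic confirmed through the feedback input, to `sample` before the next classification.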
Specification