Variable rate video playback with synchronized audio
Abstract
The invention enables the apparent display rate of an audiovisual display to be varied. The invention can modify an original set of audio data in accordance with a target display rate, then modify a related original set of video data to conform to the modifications made to the audio data set, such that the modified audio and video data sets are synchronized. When the modified audio and video data sets so produced are used to generate an audiovisual display, the audiovisual display has an apparent display rate that approximates the target display rate. The target display rate can be faster or slower than a normal display rate at which an audiovisual display system generates an audiovisual display from the original sets of audio and video data. The target display rate can be established solely by a user instruction, by analysis of the audiovisual data, or by modification of a user-specified nominal target display rate based upon analysis of the audiovisual data. Preferably, the method for modifying the original audio data set is one that produces a modified audio data set that can be used to generate an audio display having little or no distortion.
102 Claims
-
1. For use with an audiovisual display system in which an original set of audio data and a related original set of video data can be used to generate an audiovisual display at a normal display rate, a method for enabling the apparent display rate of the audiovisual display to be varied from the normal display rate, the method comprising the steps of:
-
defining a correspondence between the original set of audio data and the original set of video data such that the original set of audio data is synchronized with the original set of video data;
determining a target display rate or rates for the audiovisual display;
creating a modified set of audio data, based upon the target display rate or rates and an evaluation of the content of the original set of audio data, that corresponds to the original set of audio data, wherein the step of creating a modified set of audio data further comprises the steps of:
identifying a first group of one or more audio samples of the original set of audio data that is adjacent to a second group of one or more audio samples from the original set of audio data that represents a corresponding pitch pulse or pulses that approximate to a predetermined degree the pitch pulse or pulses represented by the first group of audio samples; and
blending the audio samples from the first and second groups of audio samples to produce a third group of one or more audio samples that replaces, or is added to, the first and second groups of audio samples; and
creating a modified set of video data, based upon the modified set of audio data, the correspondence between the modified set of audio data and the original set of audio data, and the correspondence between the original set of audio data and the original set of video data, such that the modified set of video data is synchronized with the modified set of audio data.
-
2. A method as in claim 1, wherein the step of defining a correspondence between the original set of audio data and the original set of video data further comprises the steps of:
-
dividing the original set of video data into a plurality of subunits, each subunit of video data representing a duration of time that is substantially equal to the duration of time represented by each other subunit of video data;
dividing the original set of audio data into a plurality of segments, each segment representing a duration of time that is approximately coincident with and substantially equal to the duration of time of a corresponding subunit of video data; and
identifying corresponding subunits of video data and segments of audio data.
-
-
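The division into equal-duration video subunits and coincident audio segments recited above can be illustrated with a minimal sketch. It assumes the subunits are frames, that audio is a flat run of samples, and that boundaries are truncated to whole samples; the function name and parameters are hypothetical and do not come from the patent.

```python
from fractions import Fraction


def audio_segments_for_frames(num_samples, sample_rate, frame_rate):
    """Return one (start, end) sample range per video frame.

    Assumption: segment k covers the audio that plays while frame k is shown,
    with boundaries truncated to whole sample indices.
    """
    if sample_rate <= 0 or frame_rate <= 0:
        raise ValueError("rates must be positive")
    # Exact arithmetic avoids drift when the ratio is not an integer.
    samples_per_frame = Fraction(sample_rate) / Fraction(frame_rate)
    segments = []
    frame = 0
    while True:
        start = int(frame * samples_per_frame)
        if start >= num_samples:
            break
        end = min(int((frame + 1) * samples_per_frame), num_samples)
        segments.append((start, end))
        frame += 1
    return segments
```

The index of each returned range is the index of the corresponding frame, which is one simple way to record the correspondence between the two data sets.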
3. A method as in claim 1, wherein at least one target display rate is faster than a normal display rate.
-
4. A method as in claim 1, wherein at least one target display rate is slower than a normal display rate.
-
5. A method as in claim 1, wherein the target display rate is a sequence of target display rates.
-
6. A method as in claim 1, wherein the step of determining a target display rate further comprises the step of ascertaining the value of a nominal target display rate specified by a user of the audiovisual display system.
-
7. A method as in claim 1, wherein the step of determining a target display rate further comprises the step of evaluating the audio and/or video data to automatically determine the value of the target display rate.
-
8. A method as in claim 7, wherein the step of evaluating further comprises the steps of:
-
analyzing the original set of audio data; and
calculating the target display rate based upon the analysis of the original set of audio data.
-
-
9. A method as in claim 7, wherein the step of evaluating further comprises the steps of:
-
analyzing the original set of video data; and
calculating the target display rate based upon the analysis of the video data.
-
-
10. A method as in claim 9, wherein:
-
the step of analyzing the original set of video data further comprises ascertaining the relative rate of change of the video data along various population-based dimensions; and
the step of calculating further comprises the step of calculating the target display rate based upon the change in values of the data along the population-based dimensions.
-
-
11. A method as in claim 9, wherein:
-
the step of analyzing the original set of video data further comprises:
ascertaining portions of a video image represented by the original set of video data that change quickly; and
ascertaining the frequency with which such quick changes occur; and
the step of calculating further comprises the step of calculating the target display rate based upon the occurrence and frequency of quick changes in the video image.
-
-
12. A method as in claim 11, wherein the step of calculating further comprises establishing a target display rate for periods of time during which quick changes in the video image occur that is lower than the target display rate during other periods of time.
-
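As a rough illustration of lowering the target display rate while the video image changes quickly, the sketch below measures change as the mean absolute difference between consecutive frames. The change measure, the threshold, and the two fixed rates are assumptions made for the example; the claims do not specify them, and the sketch ignores how frequently the quick changes occur.

```python
def frame_change(prev, cur):
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)


def rates_from_video(frames, fast_rate, slow_rate, threshold):
    """Assign a target display rate to each frame.

    Hypothetical rule: use slow_rate where the frame differs from its
    predecessor by more than threshold, and fast_rate elsewhere.
    """
    if not frames:
        return []
    rates = [fast_rate]  # the first frame has no predecessor to compare with
    for prev, cur in zip(frames, frames[1:]):
        quick = frame_change(prev, cur) > threshold
        rates.append(slow_rate if quick else fast_rate)
    return rates
```

Here each frame is a flat list of pixel intensities; a real system would likely smooth the resulting rate sequence before using it.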
13. A method as in claim 9, wherein:
-
the step of analyzing the original set of video data further comprises tracking the motion of objects within a video image represented by the original set of video data; and
the step of calculating further comprises the step of calculating the target display rate based upon the appearance of new objects in the video image.
-
-
14. A method as in claim 13, wherein the step of calculating further comprises establishing a target display rate for periods of time during which new objects appear in the video image that is lower than the target display rate during other periods of time.
-
15. A method as in claim 7, wherein the step of evaluating further comprises the steps of:
-
performing a first analysis of the original set of audio data; and
performing a second analysis of the original set of audio data; and
calculating the target display rate based upon the first and second analyses of the audio data.
-
-
16. A method as in claim 7, wherein the step of evaluating further comprises the steps of:
-
performing a first analysis of the original set of video data;
performing a second analysis of the original set of video data;
calculating the target display rate based upon the first and second analyses of the video data.
-
-
17. A method as in claim 7, wherein the step of evaluating further comprises the steps of:
-
analyzing the original set of audio data; and
analyzing the original set of video data; and
calculating the target display rate based upon the analyses of the audio and video data.
-
-
18. A method as in claim 1, wherein the step of creating a modified set of audio data further comprises the step of analyzing the content of the original set of audio data, the modified set of audio data being created based upon, in addition to the target display rate or rates, the content of the audio data.
-
19. A method as in claim 2, wherein the step of creating a modified set of video data further comprises the steps of:
-
establishing a correspondence between the modified audio data set and the original video data set, based upon the correspondence between the modified audio data set and the original audio data set and the correspondence between the original audio data set and the original video data set;
grouping the audio data of the modified audio data set into audio segments having the same amount of data as found in audio segments of the original audio data set;
identifying one or more partial or complete subunits of video data from the original video data set that correspond to each of the audio segments of the modified audio data set, based upon the correspondence between the modified audio data set and the original video data set; and
modifying the subunits of video data in the original video data set as necessary to produce the modified video data set so that there is a one-to-one correspondence between audio segments of the modified audio data set and subunits of video data of the modified video data set.
-
-
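One way to picture the regrouping and frame selection described above is the sketch below. It assumes the correspondence between modified and original audio is available as a list giving, for every modified sample, the index of the original sample it derives from, and that every frame spans a whole number of samples. Choosing the most-covered original frame is an illustrative policy only; the names are hypothetical.

```python
from collections import Counter


def select_frames(origin, samples_per_frame):
    """Pick one original frame index for each modified audio segment.

    origin[i] is the index of the original audio sample from which modified
    sample i derives (an assumed representation of the correspondence).
    """
    chosen = []
    for start in range(0, len(origin), samples_per_frame):
        segment = origin[start:start + samples_per_frame]
        # Count how many samples of this segment fall in each original frame.
        counts = Counter(src // samples_per_frame for src in segment)
        # Most-covered frame wins; ties go to the earliest frame.
        best = min(counts, key=lambda f: (-counts[f], f))
        chosen.append(best)
    return chosen
```

With this policy a speed-up drops frames and a slow-down repeats them; blending or synthesizing frames, as in the later dependent claims, would replace the selection step.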
20. A method as in claim 1, wherein the step of creating a modified set of video data further comprises the step of eliminating data from the original video data set.
-
21. A method as in claim 1, wherein the step of creating a modified set of video data further comprises the step of adding data to the original video data set.
-
22. A method as in claim 1, wherein the step of creating a modified set of video data further comprises the step of blending data from the original video data set so that the modified video data set has less data than the original video data set.
-
23. A method as in claim 1, wherein the step of creating a modified set of video data further comprises the step of synthesizing data, based on the data in the original video data set, so that the modified video data set has more data than the original video data set.
-
24. A method as in claim 1, further comprising the steps of:
-
generating an audio display from the modified set of audio data; and
generating a video display from the modified set of video data.
-
-
25. A method as in claim 2, wherein the subunits of video data are frames of video data.
-
26. A method as in claim 19, wherein the subunits of video data are frames of video data.
-
27. A method as in claim 1, wherein the step of blending further comprises the step of performing a linear cross fade of the audio samples of the first group of audio samples with the corresponding audio samples of the second group of audio samples.
-
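A linear cross fade of two equal-length groups of samples can be sketched as follows. The particular weight schedule, which keeps both endpoints strictly between the two signals, is an assumption; the claim only requires that the fade be linear.

```python
def linear_cross_fade(first, second):
    """Blend two equal-length sample groups with linearly varying weights.

    The weight of `second` rises in equal steps from just above 0 to just
    below 1, so the result starts close to `first` and ends close to `second`.
    """
    if len(first) != len(second):
        raise ValueError("groups must have the same length")
    n = len(first)
    blended = []
    for i, (a, b) in enumerate(zip(first, second)):
        w = (i + 1) / (n + 1)
        blended.append((1 - w) * a + w * b)
    return blended
```

For example, fading `[1.0, 1.0, 1.0]` into `[0.0, 0.0, 0.0]` yields `[0.75, 0.5, 0.25]`.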
28. A method as in claim 8, wherein:
-
the step of analyzing the original set of audio data further comprises the step of ascertaining the stress with which spoken portions of the audio data are uttered; and
the step of calculating further comprises the step of calculating the target display rate or rates based upon the relative stresses of the spoken portions of the audio data.
-
-
29. A method as in claim 28, wherein the step of ascertaining stress further comprises the step of computing energy terms for the spoken portions of the audio data.
-
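Energy terms of the kind mentioned above are commonly computed as short-time energy over fixed windows. The sketch below assumes non-overlapping windows and a peak-normalized relative stress; both choices are illustrative and are not taken from the claims.

```python
def short_time_energy(samples, window):
    """Mean squared amplitude of each non-overlapping window of samples."""
    if window <= 0:
        raise ValueError("window must be positive")
    energies = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        energies.append(sum(x * x for x in chunk) / len(chunk))
    return energies


def relative_stress(energies):
    """Scale energies to the range 0..1 relative to the loudest window.

    Hypothetical normalization, used here as a stand-in for relative stress.
    """
    peak = max(energies, default=0.0)
    if peak <= 0:
        return [0.0 for _ in energies]
    return [e / peak for e in energies]
```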
30. A method as in claim 8, wherein:
-
the step of analyzing the original set of audio data further comprises the step of ascertaining the speaking rate at which spoken portions of the audio data are uttered; and
the step of calculating further comprises the step of calculating the target display rate or rates based upon the relative speeds of the spoken portions of the audio data.
-
-
31. A method as in claim 30, wherein the step of ascertaining speaking rates further comprises the step of ascertaining spectral changes in the spoken portions of the audio data.
-
32. A method as in claim 8, wherein:
-
the step of analyzing the original set of audio data further comprises the steps of:
ascertaining the stress with which spoken portions of the audio data are uttered;
ascertaining the speaking rate at which spoken portions of the audio data are uttered; and
combining corresponding stresses and speaking rates to produce audio tension values for the spoken portions; and
the step of calculating further comprises the step of calculating the target display rate or rates based upon the audio tension values of the spoken portions of the audio data.
-
-
33. A method as in claim 32, further comprising the step of ascertaining the value of a nominal target display rate specified by a user of the audiovisual display system, wherein the step of calculating further comprises the step of combining the audio tension values with the nominal target display rate to produce the target display rate.
-
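The combination of stress, speaking rate, and a user-specified nominal rate can be pictured with a small sketch. The weighted sum, the linear mapping around the mean tension, the direction of the adjustment (higher tension gives a lower rate), and every parameter name are assumptions chosen for illustration; the claims do not state a formula.

```python
def audio_tension(stress, speaking_rate, w_stress=0.5, w_rate=0.5):
    """Combine corresponding stress and speaking-rate values (hypothetical weights)."""
    return [w_stress * s + w_rate * r for s, r in zip(stress, speaking_rate)]


def target_rates(tension, nominal_rate, sensitivity=0.5, min_rate=0.5):
    """Derive a target display rate for each spoken portion.

    Assumed rule: portions with above-average tension play slower than the
    nominal rate, portions with below-average tension play faster.
    """
    if not tension:
        return []
    mean = sum(tension) / len(tension)
    rates = []
    for t in tension:
        rate = nominal_rate * (1 - sensitivity * (t - mean))
        rates.append(max(min_rate, rate))
    return rates
```

With tension values `[0.0, 1.0]` and a nominal rate of 2.0, the defaults give `[2.5, 1.5]`.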
34. A method as in claim 1, wherein the step of creating a modified set of audio data further comprises the steps of:
-
(i) dividing the original set of audio data into a plurality of segments, each segment representing a contiguous portion of the original set of audio data that occurs during a specified duration of time, each segment being adjacent to one or two other segments such that there are no gaps between segments and adjacent segments do not overlap;
(ii) selecting a first segment;
(iii) selecting a second segment, the second segment being temporally adjacent to the first segment;
(iv) overlapping an end portion of the first segment with an end portion of the second segment that is adjacent to the first segment, the end portion of the first segment including first segment overlap data and the end portion of the second segment including second segment overlap data;
(v) identifying as part of the modified set of audio data the audio data from the first segment that is not part of the first segment overlap data;
(vi) blending corresponding first segment overlap data and second segment overlap data; and
(vii) determining whether there are additional segments in the original set of audio data that have not been overlapped with an adjacent segment, wherein:
if there are additional segments, the following steps are further performed:
(viii) combining the blended overlap data with the audio data from the second segment that is not part of the second segment overlap data;
(ix) selecting the combined data as a new first segment; and
(x) selecting a new second segment that is temporally adjacent to the new first segment and that has not previously been selected as a segment; and
(xi) repeating steps (i) through (vii); and
if there are not additional segments, the following step is further performed:
(xii) identifying as part of the modified set of audio data the blended data and the audio data from the second segment that is not part of the second segment overlap data.
-
-
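The iterative overlap-and-blend procedure recited above can be sketched for the speed-up case as follows. The sketch assumes a fixed overlap length for every pair of segments and a linear cross fade for the blending step; choosing the overlap from the target rate or from the audio content is outside its scope, and the function name is hypothetical.

```python
def compress_by_overlap(segments, overlap):
    """Shorten audio by overlapping and blending adjacent segments.

    segments: list of sample lists that are contiguous and non-overlapping.
    overlap:  number of samples shared by each adjacent pair (assumed fixed).
    """
    if overlap <= 0:
        raise ValueError("overlap must be positive")
    if not segments:
        return []
    output = []
    first = list(segments[0])
    for second in segments[1:]:
        second = list(second)
        if overlap > min(len(first), len(second)):
            raise ValueError("overlap exceeds segment length")
        keep = len(first) - overlap
        # Emit the part of the first segment that is not overlap data.
        output.extend(first[:keep])
        # Blend the overlapped end of the first segment with the start of the second.
        blended = []
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)
            blended.append((1 - w) * first[keep + i] + w * second[i])
        # The blended data plus the rest of the second segment becomes the new first segment.
        first = blended + second[overlap:]
    # No segments remain: emit the blended data and the remainder.
    output.extend(first)
    return output
```

Each junction removes `overlap` samples, so N segments of total length T produce T - overlap * (N - 1) samples.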
35. A system for enabling the apparent display rate of an audiovisual display to be varied from a normal display rate at which an audiovisual display system can generate a display from an original set of audio data and a related original set of video data, comprising:
-
means for defining a correspondence between the original set of audio data and the original set of video data such that the original set of audio data is synchronized with the original set of video data;
means for determining a target display rate or rates for the audiovisual display;
means for creating a modified set of audio data, based upon the target display rate or rates and an evaluation of the content of the original set of audio data, that corresponds to the original set of audio data, wherein the means for creating a modified set of audio data further comprises:
means for identifying a first group of one or more audio samples of the original set of audio data that is adjacent to a second group of one or more audio samples from the original set of audio data that represents a corresponding pitch pulse or pulses that approximate to a predetermined degree the pitch pulse or pulses represented by the first group of audio samples; and
means for blending the audio samples from the first and second groups of audio samples to produce a third group of one or more audio samples that replaces, or is added to, the first and second groups of audio samples; and
means for creating a modified set of video data, based upon the modified set of audio data, the correspondence between the modified set of audio data and the original set of audio data, and the correspondence between the original set of audio data and the original set of video data, such that the modified set of video data is synchronized with the modified set of audio data.
-
36. A system as in claim 35, further comprising:
-
means for generating an audio display from the modified set of audio data; and
means for generating a video display from the modified set of video data.
-
-
37. A system as in claim 35, further comprising:
-
means for enabling a user to specify a nominal target display rate; and
means for ascertaining the value of the nominal target display rate.
-
-
38. A system as in claim 35, wherein the means for defining a correspondence between the original set of audio data and the original set of video data further comprises:
-
means for dividing the original set of video data into a plurality of subunits, each subunit of video data representing a duration of time that is substantially equal to the duration of time represented by each other subunit of video data;
means for dividing the original set of audio data into a plurality of segments, each segment representing a duration of time that is approximately coincident with and substantially equal to the duration of time of a corresponding subunit of video data; and
means for identifying corresponding subunits of video data and segments of audio data.
-
-
39. A system as in claim 35, wherein the means for determining a target display rate further comprises means for evaluating the audio and/or video data to automatically determine the value of the target display rate.
-
40. A system as in claim 39, wherein the means for evaluating further comprises:
-
means for analyzing the original set of audio data; and
means for calculating the target display rate based upon the analysis of the original set of audio data.
-
-
41. A system as in claim 39, wherein the means for evaluating further comprises:
-
means for analyzing the original set of video data; and
means for calculating the target display rate based upon the analysis of the video data.
-
-
42. A system as in claim 35, wherein the means for creating a modified set of audio data further comprises means for analyzing the content of the original set of audio data, the modified set of audio data being created based upon, in addition to the target display rate, the content of the audio data.
-
43. A system as in claim 38, wherein the means for creating a modified set of video data further comprises:
-
means for establishing a correspondence between the modified audio data set and the original video data set, based upon the correspondence between the modified audio data set and the original audio data set and the correspondence between the original audio data set and the original video data set;
means for grouping the audio data of the modified audio data set into audio segments having the same amount of data as found in audio segments of the original audio data set;
means for identifying one or more partial or complete subunits of video data from the original video data set that correspond to each of the audio segments of the modified audio data set, based upon the correspondence between the modified audio data set and the original video data set; and
means for modifying the video frames in the original video data set as necessary to produce the modified video data set so that there is a one-to-one correspondence between audio segments of the modified audio data set and video frames of the modified video data set.
-
-
44. A system as in claim 35, wherein at least one target display rate is faster than a normal display rate.
-
45. A system as in claim 35, wherein at least one target display rate is slower than a normal display rate.
-
46. A system as in claim 35, wherein the target display rate is a sequence of target display rates.
-
47. A system as in claim 40, wherein:
-
the means for analyzing the original set of audio data further comprises means for ascertaining the stress with which spoken portions of the audio data are uttered; and
the means for calculating further comprises means for calculating the target display rate or rates based upon the relative stresses of the spoken portions of the audio data.
-
-
48. A system as in claim 47, wherein the means for ascertaining stress further comprises means for computing energy terms for the spoken portions of the audio data.
-
49. A system as in claim 40, wherein:
-
the means for analyzing the original set of audio data further comprises means for ascertaining the speaking rate at which spoken portions of the audio data are uttered; and
the means for calculating further comprises means for calculating the target display rate or rates based upon the relative speeds of the spoken portions of the audio data.
-
-
50. A system as in claim 49, wherein the means for ascertaining speaking rates further comprises means for ascertaining spectral changes in the spoken portions of the audio data.
-
51. A system as in claim 40, wherein:
-
the means for analyzing the original set of audio data further comprises:
means for ascertaining the stress with which spoken portions of the audio data are uttered;
means for ascertaining the speaking rate at which spoken portions of the audio data are uttered; and
means for combining corresponding stresses and speaking rates to produce audio tension values for the spoken portions; and
the means for calculating further comprises means for calculating the target display rate or rates based upon the audio tension values of the spoken portions of the audio data.
-
-
52. A system as in claim 51, further comprising means for ascertaining the value of a nominal target display rate specified by a user of the audiovisual display system, wherein the means for calculating further comprises means for combining the audio tension values with the nominal target display rate to produce the target display rate.
-
53. A system as in claim 35, wherein the means for creating a modified set of audio data further comprises:
-
(i) means for dividing the original set of audio data into a plurality of segments, each segment representing a contiguous portion of the original set of audio data that occurs during a specified duration of time, each segment being adjacent to one or two other segments such that there are no gaps between segments and adjacent segments do not overlap;
(ii) means for selecting a first segment;
(iii) means for selecting a second segment, the second segment being temporally adjacent to the first segment;
(iv) means for overlapping an end portion of the first segment with an end portion of the second segment that is adjacent to the first segment, the end portion of the first segment including first segment overlap data and the end portion of the second segment including second segment overlap data;
(v) means for identifying as part of the modified set of audio data the audio data from the first segment that is not part of the first segment overlap data;
(vi) means for blending corresponding first segment overlap data and second segment overlap data;
(vii) means for determining whether there are additional segments in the original set of audio data that have not been overlapped with an adjacent segment;
(viii) means for performing, if there are additional segments, the following functions:
combining the blended overlap data with the audio data from the second segment that is not part of the second segment overlap data;
selecting the combined data as a new first segment;
selecting a new second segment that is temporally adjacent to the new first segment and that has not previously been selected as a segment; and
effecting operation of the (iv) means for overlapping, (v) means for identifying, (vi) means for blending corresponding first segment overlap data and second segment overlap data, (vii) means for determining, and (viii) means for performing or (ix) means for performing; and
(ix) means for performing, if there are not additional segments, the function of identifying as part of the modified set of audio data the blended data and the audio data from the second segment that is not part of the second segment overlap data.
-
-
54. A system as in claim 41, wherein:
-
the means for analyzing the original set of video data further comprises means for ascertaining the relative rate of change of the video data along various population-based dimensions; and
the means for calculating further comprises means for calculating the target display rate based upon the change in values of the data along the population-based dimensions.
-
-
55. A system as in claim 41, wherein:
-
the means for analyzing the original set of video data further comprises:
means for ascertaining portions of a video image represented by the original set of video data that change quickly; and
means for ascertaining the frequency with which such quick changes occur; and
the means for calculating further comprises means for calculating the target display rate based upon the occurrence and frequency of quick changes in the video image.
-
-
56. A system as in claim 55, wherein the means for calculating further comprises means for establishing a target display rate for periods of time during which quick changes in the video image occur that is lower than the target display rate during other periods of time.
-
57. A system as in claim 41, wherein:
-
the means for analyzing the original set of video data further comprises means for tracking the motion of objects within a video image represented by the original set of video data; and
the means for calculating further comprises means for calculating the target display rate based upon the appearance of new objects in the video image.
-
-
58. A system as in claim 57, wherein the means for calculating further comprises means for establishing a target display rate for periods of time during which new objects appear in the video image that is lower than the target display rate during other periods of time.
-
59. A system as in claim 39, wherein the means for evaluating further comprises:
-
means for performing a first analysis of the original set of audio data; and
means for performing a second analysis of the original set of audio data; and
means for calculating the target display rate based upon the first and second analyses of the audio data.
-
-
60. A system as in claim 39, wherein the means for evaluating further comprises:
-
means for performing a first analysis of the original set of video data;
means for performing a second analysis of the original set of video data;
means for calculating the target display rate based upon the first and second analyses of the video data.
-
-
61. A system as in claim 39, wherein the means for evaluating further comprises:
-
means for analyzing the original set of audio data; and
means for analyzing the original set of video data; and
means for calculating the target display rate based upon the analyses of the audio and video data.
-
-
62. A system as in claim 35, wherein the means for creating a modified set of video data further comprises means for eliminating data from the original video data set.
-
63. A system as in claim 35, wherein the means for creating a modified set of video data further comprises means for adding data to the original video data set.
-
64. A system as in claim 35, wherein the means for creating a modified set of video data further comprises means for blending data from the original video data set so that the modified video data set has less data than the original video data set.
-
65. A system as in claim 35, wherein the means for creating a modified set of video data further comprises means for synthesizing data, based on the data in the original video data set, so that the modified video data set has more data than the original video data set.
-
66. A system as in claim 38, wherein the subunits of video data are frames of video data.
-
67. A system as in claim 43, wherein the subunits of video data are frames of video data.
-
68. A system as in claim 35, wherein the means for blending further comprises means for performing a linear cross fade of the audio samples of the first group of audio samples with the corresponding audio samples of the second group of audio samples.
-
69. A computer readable medium encoded with one or more computer programs for enabling the apparent display rates of an audiovisual display to be varied from a normal display rate at which an audiovisual display system can generate a display from an original set of audio data and a related original set of video data, comprising:
-
instructions for defining a correspondence between the original set of audio data and the original set of video data such that the original set of audio data is synchronized with the original set of video data;
instructions for determining a target display rate or rates for the audiovisual display;
instructions for creating a modified set of audio data, based upon the target display rate or rates and an evaluation of the content of the original set of audio data, that corresponds to the original set of audio data, wherein the instructions for creating a modified set of audio data further comprise:
instructions for identifying a first group of one or more audio samples of the original set of audio data that is adjacent to a second group of one or more audio samples from the original set of audio data that represents a corresponding pitch pulse or pulses that approximate to a predetermined degree the pitch pulse or pulses represented by the first group of audio samples; and
instructions for blending the audio samples from the first and second groups of audio samples to produce a third group of one or more audio samples that replaces, or is added to, the first and second groups of audio samples; and
instructions for creating a modified set of video data, based upon the modified set of audio data, the correspondence between the modified set of audio data and the original set of audio data, and the correspondence between the original set of audio data and the original set of video data, such that the modified set of video data is synchronized with the modified set of audio data.
-
-
70. A computer readable medium as in claim 69, wherein the instructions for defining a correspondence further comprise:
-
instructions for dividing the original set of video data into a plurality of subunits, each subunit of video data representing a duration of time that is substantially equal to the duration of time represented by each other subunit of video data;
instructions for dividing the original set of audio data into a plurality of segments, each segment representing a duration of time that is approximately coincident with and substantially equal to the duration of time of a corresponding subunit of video data; and
instructions for identifying corresponding subunits of video data and segments of audio data.
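By way of illustration only, the division of the audio data into segments coincident with video frames might be sketched as follows. The sketch assumes a constant frame rate and a constant audio sample rate; the function name `audio_segments_for_frames` and its parameters are hypothetical and do not appear in the claims.

```python
def audio_segments_for_frames(num_frames, frame_rate, sample_rate, num_samples):
    """Return one (start, end) audio sample range per video frame.

    Frame k is assumed to span the time interval [k / frame_rate,
    (k + 1) / frame_rate), so its audio segment covers the samples that
    fall in that interval. The list index is the frame index, which gives
    the correspondence between subunits of video and segments of audio.
    """
    segments = []
    for k in range(num_frames):
        start = min(num_samples, round(k * sample_rate / frame_rate))
        end = min(num_samples, round((k + 1) * sample_rate / frame_rate))
        segments.append((start, end))
    return segments
```

Because adjacent ranges share a boundary, the segments neither overlap nor leave gaps, and each has substantially the same duration as its frame.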
-
-
71. A computer readable medium as in claim 70, wherein the instructions for creating a modified set of video data further comprise:
-
instructions for establishing a correspondence between the modified audio data set and the original video data set, based upon the correspondence between the modified audio data set and the original audio data set and the correspondence between the original audio data set and the original video data set;
instructions for grouping the audio data of the modified audio data set into audio segments having the same amount of data as found in audio segments of the original audio data set;
instructions for identifying one or more partial or complete subunits of video data from the original video data set that correspond to each of the audio segments of the modified audio data set, based upon the correspondence between the modified audio data set and the original video data set; and
instructions for modifying the video frames in the original video data set as necessary to produce the modified video data set so that there is a one-to-one correspondence between audio segments of the modified audio data set and video frames of the modified video data set.
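By way of illustration only, one possible way to arrive at a one-to-one correspondence between modified audio segments and video frames is sketched below. The sketch assumes that the correspondence between the modified and original audio data is available as a list `orig_index_of`, giving for each modified audio sample the index of the original sample it came from, and that one original frame is chosen per modified segment by looking at the segment midpoint. Both the data layout and the midpoint rule are assumptions of this sketch; dropping and repeating frames is only one of the modifications the claims contemplate.

```python
def frames_for_modified_audio(orig_index_of, samples_per_frame, num_orig_frames):
    """Choose one original video frame for each modified audio segment.

    The modified audio is grouped into segments of samples_per_frame
    samples, the same size as the original audio segments. For each
    segment, the original frame containing the source of the segment's
    midpoint sample is selected. Frames are skipped when the audio was
    shortened and repeated when it was lengthened.
    """
    num_segments = len(orig_index_of) // samples_per_frame
    chosen = []
    for j in range(num_segments):
        mid = j * samples_per_frame + samples_per_frame // 2
        frame = orig_index_of[mid] // samples_per_frame
        chosen.append(min(frame, num_orig_frames - 1))
    return chosen
```

For example, with audio compressed to half its length every second frame is selected, and with audio stretched to twice its length every frame is selected twice.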
-
-
72. A computer readable medium as in claim 69, wherein at least one target display rate is faster than a normal display rate.
-
73. A computer readable medium as in claim 69, wherein at least one target display rate is slower than a normal display rate.
-
74. A computer readable medium as in claim 69, wherein the target display rate is a sequence of target display rates.
-
75. A computer readable medium as in claim 69, wherein the instructions for determining a target display rate further comprise instructions for ascertaining the value of a nominal target display rate specified by a user of the audiovisual display system.
-
76. A computer readable medium as in claim 69, wherein the instructions for determining a target display rate further comprise instructions for evaluating the audio and/or video data to automatically determine the value of the target display rate.
-
77. A computer readable medium as in claim 76, wherein the instructions for evaluating further comprise:
-
instructions for analyzing the original set of audio data; and
instructions for calculating the target display rate based upon the analysis of the original set of audio data.
-
-
78. A computer readable medium as in claim 77, wherein:
-
the instructions for analyzing the original set of audio data further comprise instructions for ascertaining the stress with which spoken portions of the audio data are uttered; and
the instructions for calculating further comprise instructions for calculating the target display rate or rates based upon the relative stresses of the spoken portions of the audio data.
-
-
79. A computer readable medium as in claim 78, wherein the instructions for ascertaining stress further comprise instructions for computing energy terms for the spoken portions of the audio data.
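By way of illustration only, energy terms for portions of the audio data might be computed as the mean squared amplitude of each fixed-length segment, as sketched below. The segment length, the normalization by the peak energy, and the function names are assumptions of this sketch rather than details taken from the claims.

```python
def segment_energies(samples, segment_length):
    """Return the mean squared amplitude of each consecutive segment."""
    energies = []
    for start in range(0, len(samples), segment_length):
        segment = samples[start:start + segment_length]
        energies.append(sum(x * x for x in segment) / len(segment))
    return energies


def relative_stress(energies):
    """Scale the energies to the range 0..1 relative to the loudest segment.

    Treating relative energy as a stand-in for stress is an assumption
    made for illustration.
    """
    peak = max(energies, default=0.0)
    if peak == 0:
        return [0.0 for _ in energies]
    return [e / peak for e in energies]
```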
-
80. A computer readable medium as in claim 77, wherein:
-
the instructions for analyzing the original set of audio data further comprise instructions for ascertaining the speaking rate at which spoken portions of the audio data are uttered; and
the instructions for calculating further comprise instructions for calculating the target display rate or rates based upon the relative speeds of the spoken portions of the audio data.
-
-
81. A computer readable medium as in claim 80, wherein the instructions for ascertaining speaking rates further comprise instructions for ascertaining spectral changes in the spoken portions of the audio data.
-
82. A computer readable medium as in claim 77, wherein:
-
the instructions for analyzing the original set of audio data further comprise;
instructions for ascertaining the stress with which spoken portions of the audio data are uttered;
instructions for ascertaining the speaking rate at which spoken portions of the audio data are uttered; and
instructions for combining corresponding stresses and speaking rates to produce audio tension values for the spoken portions; and
the instructions for calculating further comprise instructions for calculating the target display rate or rates based upon the audio tension values of the spoken portions of the audio data.
-
-
83. A computer readable medium as in claim 82, further comprising instructions for ascertaining the value of a nominal target display rate specified by a user of the audiovisual display system, wherein the instructions for calculating further comprise instructions for combining the audio tension values with the nominal target display rate to produce the target display rate.
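By way of illustration only, the combination of stresses and speaking rates into audio tension values, and of those values with a nominal target display rate, might be sketched as follows. The weighted average, the linear adjustment around the nominal rate, and the parameters `stress_weight`, `depth` and `min_rate` are all hypothetical; the claims do not specify any particular formula. The sketch further assumes that stress and speaking rate have each been normalized to the range 0 to 1.

```python
def audio_tension(stresses, speaking_rates, stress_weight=0.5):
    """Combine corresponding stress and speaking-rate values (each 0..1)."""
    return [
        stress_weight * s + (1.0 - stress_weight) * r
        for s, r in zip(stresses, speaking_rates)
    ]


def target_rates(tensions, nominal_rate, depth=0.5, min_rate=0.25):
    """Adjust the user's nominal rate down where tension is high.

    A tension of 0.5 leaves the nominal rate unchanged; higher tension
    lowers the rate and lower tension raises it (an assumed rule).
    """
    return [
        max(min_rate, nominal_rate * (1.0 + depth * (0.5 - t)))
        for t in tensions
    ]
```

Under these assumptions, a nominal rate of 2.0 yields 1.5 for a fully tense portion and 2.5 for a fully relaxed one, so that emphasized or rapid speech is played more slowly than the surrounding material.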
-
84. A computer readable medium as in claim 69, wherein the instructions for creating a modified set of audio data further comprise:
-
(i) instructions for dividing the original set of audio data into a plurality of segments, each segment representing a contiguous portion of the original set of audio data that occurs during a specified duration of time, each segment being adjacent to one or two other segments such that there are no gaps between segments and adjacent segments do not overlap;
(ii) instructions for selecting a first segment;
(iii) instructions for selecting a second segment, the second segment being temporally adjacent to the first segment;
(iv) instructions for overlapping an end portion of the first segment with an end portion of the second segment that is adjacent to the first segment, the end portion of the first segment including first segment overlap data and the end portion of the second segment including second segment overlap data;
(v) instructions for identifying as part of the modified set of audio data the audio data from the first segment that is not part of the first segment overlap data;
(vi) instructions for blending corresponding first segment overlap data and second segment overlap data; and
(vii) instructions for determining whether there are additional segments in the original set of audio data that have not been overlapped with an adjacent segment, wherein;
if there are additional segments, the following instructions are further performed;
(viii) instructions for combining the blended overlap data with the audio data from the second segment that is not part of the second segment overlap data;
(ix) instructions for selecting the combined data as a new first segment; and
(x) instructions for selecting a new second segment that is temporally adjacent to the new first segment and that has not previously been selected as a segment; and
(xi) instructions for repeating instructions (i) through (vii); and
if there are not additional segments, the following instructions are further performed;
(xii) instructions for identifying as part of the modified set of audio data the blended data and the audio data from the second segment that is not part of the second segment overlap data.
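By way of illustration only, the segment-by-segment overlap and blend recited above might be sketched as follows for the case in which the audio is shortened. The sketch assumes a fixed overlap length for every pair of segments and a linear blend of the overlap data; an actual embodiment would be expected to choose the overlap from an evaluation of the audio content. The names `blend` and `compress_by_overlap` are hypothetical.

```python
def blend(first_overlap, second_overlap):
    """Linearly blend two equal-length runs of overlap data."""
    n = len(first_overlap)
    out = []
    for i, (a, b) in enumerate(zip(first_overlap, second_overlap)):
        w = (i + 1) / (n + 1)  # weight on the second segment rises with i
        out.append((1.0 - w) * a + w * b)
    return out


def compress_by_overlap(samples, segment_length, overlap):
    """Shorten audio by overlapping and blending adjacent segments."""
    if not 0 <= overlap < segment_length:
        raise ValueError("overlap must be in [0, segment_length)")
    # (i) divide into contiguous, non-overlapping segments
    segments = [samples[i:i + segment_length]
                for i in range(0, len(samples), segment_length)]
    if not segments:
        return []
    output = []
    first = list(segments[0])          # (ii) select a first segment
    for second in segments[1:]:        # (iii)/(x) next adjacent segment
        n = min(overlap, len(first), len(second))
        # (v) keep the part of the first segment outside the overlap
        output.extend(first[:len(first) - n])
        # (iv), (vi) overlap the end of first with the start of second
        blended = blend(first[len(first) - n:], second[:n])
        # (viii), (ix) blended data plus the rest of second is the new first
        first = blended + list(second[n:])
    # (xii) no segments remain: keep the blended data and the remainder
    output.extend(first)
    return output
```

With full-length segments, each pair of adjacent segments removes `overlap` samples, so twelve samples in segments of four with an overlap of two become eight.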
-
-
85. A computer readable medium as in claim 76, wherein the instructions for evaluating further comprise:
-
instructions for analyzing the original set of video data; and
instructions for calculating the target display rate based upon the analysis of the video data.
-
-
86. A computer readable medium as in claim 85, wherein:
-
the instructions for analyzing the original set of video data further comprise instructions for ascertaining the relative rate of change of the video data along various population-based dimensions; and
the instructions for calculating further comprise instructions for calculating the target display rate based upon the change in values of the data along the population-based dimensions.
-
-
87. A computer readable medium as in claim 85, wherein:
-
the instructions for analyzing the original set of video data further comprise;
instructions for ascertaining portions of a video image represented by the original set of video data that change quickly; and
instructions for ascertaining the frequency with which such quick changes occur; and
the instructions for calculating further comprise instructions for calculating the target display rate based upon the occurrence and frequency of quick changes in the video image.
-
-
88. A computer readable medium as in claim 87, wherein the instructions for calculating further comprise instructions for establishing a target display rate for periods of time during which quick changes in the video image occur that is lower than the target display rate during other periods of time.
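By way of illustration only, a lower target display rate during periods of quick change in the video image might be established as sketched below. The sketch assumes that each frame is a flat list of pixel intensities and that a "quick change" is a mean absolute difference between consecutive frames above a threshold; the measure, the threshold, and the function names are assumptions of this sketch.

```python
def mean_abs_difference(frame_a, frame_b):
    """Average absolute per-pixel difference between two frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def rates_from_quick_changes(frames, change_threshold, base_rate, reduced_rate):
    """Return one target display rate per frame.

    A frame that differs from its predecessor by more than
    change_threshold is assigned reduced_rate; every other frame,
    including the first, is assigned base_rate.
    """
    rates = []
    for k in range(len(frames)):
        quick = (k > 0 and
                 mean_abs_difference(frames[k - 1], frames[k]) > change_threshold)
        rates.append(reduced_rate if quick else base_rate)
    return rates
```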
-
89. A computer readable medium as in claim 85, wherein:
-
the instructions for analyzing the original set of video data further comprise instructions for tracking the motion of objects within a video image represented by the original set of video data; and
the instructions for calculating further comprise instructions for calculating the target display rate based upon the appearance of new objects in the video image.
-
-
90. A computer readable medium as in claim 89, wherein the instructions for calculating further comprise instructions for establishing a target display rate for periods of time during which new objects appear in the video image that is lower than the target display rate during other periods of time.
-
91. A computer readable medium as in claim 76, wherein the instructions for evaluating further comprise:
-
instructions for performing a first analysis of the original set of audio data;
instructions for performing a second analysis of the original set of audio data; and
instructions for calculating the target display rate based upon the first and second analyses of the audio data.
-
-
92. A computer readable medium as in claim 76, wherein the instructions for evaluating further comprise:
-
instructions for performing a first analysis of the original set of video data;
instructions for performing a second analysis of the original set of video data; and
instructions for calculating the target display rate based upon the first and second analyses of the video data.
-
-
93. A computer readable medium as in claim 76, wherein the instructions for evaluating further comprise:
-
instructions for analyzing the original set of audio data;
instructions for analyzing the original set of video data; and
instructions for calculating the target display rate based upon the analyses of the audio and video data.
-
-
94. A computer readable medium as in claim 69, wherein the instructions for creating a modified set of audio data further comprise instructions for analyzing the content of the original set of audio data, the modified set of audio data being created based upon, in addition to the target display rate or rates, the content of the audio data.
-
95. A computer readable medium as in claim 69, wherein the instructions for creating a modified set of video data further comprise instructions for eliminating data from the original video data set.
-
96. A computer readable medium as in claim 69, wherein the instructions for creating a modified set of video data further comprise instructions for adding data to the original video data set.
-
97. A computer readable medium as in claim 69, wherein the instructions for creating a modified set of video data further comprise instructions for blending data from the original video data set so that the modified video data set has less data than the original video data set.
-
98. A computer readable medium as in claim 69, wherein the instructions for creating a modified set of video data further comprise instructions for synthesizing data, based on the data in the original video data set, so that the modified video data set has more data than the original video data set.
-
99. A computer readable medium as in claim 69, further comprising:
-
instructions for generating an audio display from the modified set of audio data; and
instructions for generating a video display from the modified set of video data.
-
-
100. A computer readable medium as in claim 70, wherein the subunits of video data are frames of video data.
-
101. A computer readable medium as in claim 71, wherein the subunits of video data are frames of video data.
-
102. A computer readable medium as in claim 69, wherein the instructions for blending further comprise instructions for performing a linear cross fade of the audio samples of the first group of audio samples with the corresponding audio samples of the second group of audio samples.
Specification