Audio volume control device, control method and program
Abstract
An information processing apparatus includes a processor that receives captured image data and captured sound data corresponding to an environment in which content is reproduced, detects a user based on the captured image data, analyzes a situation of the environment based on a result of the detection and the captured sound data, and controls an audio volume corresponding to the reproduced content based on a result of the analyzing.
62 Citations
20 Claims
1. An information processing apparatus comprising:
an input circuit for reception of captured image data and captured sound data corresponding to an environment in which content is reproduced;
a processor that:
processes the captured image data and the captured sound data corresponding to the environment in which content is reproduced;
detects a user based on the captured image data;
analyzes a situation of the environment based on a result of the detection and the captured sound data;
determines a direction in the captured image data to a source of the captured sound data;
determines if the direction in the captured image data to the source of the captured sound data is coincident with a location of a face of a human detected in the captured image data; and
controls an audio volume corresponding to reproduced content based on a result of the analyzing, wherein
when a sound level corresponding to the captured sound data is greater than or equal to a predetermined threshold value,
the processor controls the audio volume corresponding to the reproduced content to remain unchanged when it is determined that the direction in the captured image data corresponding to the source of the captured sound data which is a human voice is coincident with the location of the face detected in the captured image data, and
the processor controls the audio volume corresponding to the reproduced content to increase when the processor determines that the direction in the captured image data corresponding to the source of the captured sound data which is a human voice is not coincident with the location of the face detected in the captured image data,
when the processor increases the audio volume when the processor determined that the direction in the captured image data corresponding to the source of the captured sound data which is a human voice is not coincident with the location of the face detected in the captured image data, the processor determines a volume increase amount based on the captured image data of a distance between the location of the detected user and the source of the captured sound data, and
in an event of a manual adjustment of a setting, the processor once an environmental situation is over automatically returns to a previous setting before the environmental situation occurred.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
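The "wherein" clause of claim 1 amounts to a small decision rule. The following is a minimal sketch of that rule, not the patented implementation. The names `SoundEvent`, `VolumeAction`, and `decide_volume_action` are hypothetical. The claim does not say what happens below the threshold or for sounds that are not a human voice; the sketch assumes the volume is left unchanged in both cases.

```python
from dataclasses import dataclass
from enum import Enum


class VolumeAction(Enum):
    KEEP = "keep"
    INCREASE = "increase"


@dataclass
class SoundEvent:
    level_db: float            # measured level of the captured sound
    is_human_voice: bool       # result of a hypothetical voice classifier
    coincides_with_face: bool  # sound direction matches a detected face


def decide_volume_action(event: SoundEvent, threshold_db: float) -> VolumeAction:
    """Apply the threshold and face-coincidence rule recited in claim 1."""
    # Assumption: below the threshold the claim's rule does not apply.
    if event.level_db < threshold_db:
        return VolumeAction.KEEP
    # Assumption: the claim only covers human voices; other sounds are ignored.
    if not event.is_human_voice:
        return VolumeAction.KEEP
    # Voice coming from a face seen in the image: leave the volume unchanged.
    if event.coincides_with_face:
        return VolumeAction.KEEP
    # Voice whose direction matches no detected face: raise the volume.
    return VolumeAction.INCREASE
```

For example, a loud voice whose direction matches no detected face yields `VolumeAction.INCREASE`, while the same voice coming from a detected face yields `VolumeAction.KEEP`.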
17. A method performed by an information processing apparatus, the method comprising:
receiving captured image data and captured sound data corresponding to an environment in which content is reproduced;
detecting a user based on the captured image data;
determining a direction in the captured image data to the source of the captured sound data;
determining if the direction in the captured image data to the source of the captured sound data is coincident with a location of a face of a human detected in the captured image data;
analyzing a situation of the environment based on a result of the detection and the captured sound data; and
controlling an audio volume corresponding to reproduced content based on a result of the analyzing, wherein
when a sound level corresponding to the captured sound data is greater than or equal to a predetermined threshold value,
the controlling includes controlling the audio volume corresponding to the reproduced content to remain unchanged when it is determined that the direction in the captured image data corresponding to the source of the captured sound data which is a human voice is coincident with the location of the face detected in the captured image data, and
the controlling includes controlling the audio volume corresponding to the reproduced content to increase when the direction in the captured image data corresponding to the source of the captured sound data which is a human voice is not coincident with the location of the face detected in the captured image data,
when the controlling includes controlling the audio volume to increase when the direction in the captured image data corresponding to the source of the captured sound data which is a human voice is not coincident with the location of the face detected in the captured image data, the controlling further includes determining a volume increase amount based on the captured image data of a distance between the location of the detected user and the source of the captured sound data, and
in an event of a manual adjustment of a setting, the controlling once an environmental situation is over automatically returns to a previous setting before the environmental situation occurred.
Dependent claims: 18
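Two steps of the method in claim 17 are geometric: testing whether the sound direction is coincident with a detected face, and deriving an increase amount from a distance. The sketch below illustrates one way these could be computed. Every specific is an assumption not found in the claims: a pinhole camera model with a known horizontal field of view, a sound azimuth supplied by some microphone array, a pixel tolerance, and a linear rule in which a closer source gives a larger increase. All function names are hypothetical.

```python
import math


def azimuth_to_pixel_x(azimuth_deg: float, image_width: int, hfov_deg: float) -> float:
    """Project a horizontal sound direction onto an image column (pinhole model)."""
    # Focal length in pixels, derived from the assumed horizontal field of view.
    focal_px = (image_width / 2) / math.tan(math.radians(hfov_deg) / 2)
    return image_width / 2 + focal_px * math.tan(math.radians(azimuth_deg))


def is_coincident(azimuth_deg: float, face_x_range: tuple[float, float],
                  image_width: int, hfov_deg: float,
                  tolerance_px: float = 20.0) -> bool:
    """Return True if the sound direction falls on the face's horizontal extent."""
    x = azimuth_to_pixel_x(azimuth_deg, image_width, hfov_deg)
    x_min, x_max = face_x_range
    return x_min - tolerance_px <= x <= x_max + tolerance_px


def volume_increase_steps(distance_m: float, max_steps: int = 5,
                          max_distance_m: float = 5.0) -> int:
    """Assumed rule: the closer the source is to the user, the larger the increase."""
    closeness = max(0.0, 1.0 - distance_m / max_distance_m)
    return round(max_steps * closeness)
```

With a 640-pixel-wide image and a 60 degree field of view, a sound arriving from straight ahead maps to column 320, so it is coincident with a face spanning columns 300 to 340. A sound arriving 25 degrees off axis is not.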
19. A non-transitory computer-readable medium including computer-program instructions, which when executed by an information processing apparatus, cause the information processing apparatus to perform a method comprising:
receiving captured image data and captured sound data corresponding to an environment in which content is reproduced;
detecting a user based on the captured image data;
determining a direction in the captured image data to the source of the captured sound data;
determining if the direction in the captured image data to the source of the captured sound data is coincident with a location of a face of a human detected in the captured image data;
analyzing a situation of the environment based on a result of the detection and the captured sound data; and
controlling an audio volume corresponding to reproduced content based on a result of the analyzing, wherein
when a sound level corresponding to the captured sound data is greater than or equal to a predetermined threshold value,
the controlling includes controlling the audio volume corresponding to the reproduced content to remain unchanged when it is determined that the direction in the captured image data corresponding to the source of the captured sound data which is a human voice is coincident with the location of the face detected in the captured image data, and
the controlling includes controlling the audio volume corresponding to the reproduced content to increase when the direction corresponding to the source of the captured sound data which is a human voice is not coincident with the location of the face detected in the captured image data, and
when the controlling includes controlling the audio volume to increase when the direction in the captured image data corresponding to the source of the captured sound data which is a human voice is not coincident with the location of the face detected in the captured image data, the controlling further includes determining a volume increase amount based on the captured image data of a distance between the location of the detected user and the source of the captured sound data, and
in an event of a manual adjustment of a setting, the controlling once an environmental situation is over automatically returns to a previous setting before the environmental situation occurred.
Dependent claims: 20
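The final limitation of the independent claims, returning to a previous setting once the environmental situation is over, can be pictured as a save-and-restore around the situation. The sketch below is one possible reading, not the claimed implementation. It assumes the "previous setting" is the volume in effect just before the situation began and that a manual adjustment made during the situation is discarded on restore. The class `VolumeController` and its method names are hypothetical.

```python
from typing import Optional


class VolumeController:
    """Tracks the volume and restores the pre-situation value afterwards."""

    def __init__(self, volume: int) -> None:
        self.volume = volume
        self._saved: Optional[int] = None  # volume before the current situation

    def situation_started(self, increase: int) -> None:
        # Remember the setting only once, at the start of the situation.
        if self._saved is None:
            self._saved = self.volume
        self.volume += increase

    def manual_adjust(self, new_volume: int) -> None:
        # A user may change the setting by hand while the situation lasts.
        self.volume = new_volume

    def situation_ended(self) -> None:
        # Return to the setting that was in effect before the situation began.
        if self._saved is not None:
            self.volume = self._saved
            self._saved = None
```

In this reading, a controller that starts at volume 10, is raised by 3 when a situation begins, and is then manually set to 20, goes back to 10 when the situation ends.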
Specification