GENERATING APPARATUS, SELECTING APPARATUS, GENERATION METHOD, SELECTION METHOD AND PROGRAM
Abstract
A generating apparatus is arranged to generate a set of gain vectors with respect to a transition model having observable visible states and unobservable hidden states and expressing a transition from a present visible state to a subsequent visible state according to an action, the set of gain vectors being generated for each visible state and used for calculation of a cumulative expected gain at and after a reference point in time. The apparatus includes a generation section for recursively generating, by retroacting from a future point in time to the reference point in time, a set of gain vectors containing at least one gain vector including a component of a cumulative expected gain with respect to each hidden state, from which set of gain vectors the gain vector giving the maximum of the cumulative expected gain is to be selected.
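The backward recursion described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes, for illustration only, that the hidden state stays fixed over time, that the horizon is finite, and that no pruning of dominated vectors is performed. The names `generate_gain_vectors`, `reward`, and `trans` are hypothetical and do not come from the source.

```python
from itertools import product


def generate_gain_vectors(n_visible, n_hidden, actions, reward, trans, horizon):
    """Return, for each visible state, a list of (action, gain_vector) pairs.

    reward(s, a, h): immediate gain in visible state s, action a, hidden state h.
    trans(s, a, h, s2): probability of moving from visible state s to s2 under
    action a when the hidden state is h (assumed fixed over time in this sketch).
    """
    # At the future end point no gain remains: one all-zero vector per visible state.
    vectors = [[(None, [0.0] * n_hidden)] for _ in range(n_visible)]
    # Step backward from the future point in time toward the reference point.
    for _ in range(horizon):
        new_vectors = []
        for s in range(n_visible):
            candidates = []
            for a in actions:
                # Pick one successor gain vector for every possible next visible state.
                for choice in product(*[vectors[s2] for s2 in range(n_visible)]):
                    vec = [
                        reward(s, a, h)
                        + sum(trans(s, a, h, s2) * choice[s2][1][h]
                              for s2 in range(n_visible))
                        for h in range(n_hidden)
                    ]
                    candidates.append((a, vec))
            new_vectors.append(candidates)
        vectors = new_vectors
    return vectors
```

Each component of a gain vector holds the cumulative expected gain for one hidden state. Without pruning, the number of candidate vectors grows very quickly with the horizon, so this sketch is only practical for small models.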
20 Claims
1-16. (canceled)
17. An apparatus arranged to generate a set of gain vectors with respect to a transition model having observable visible states and unobservable hidden states and expressing a transition from a present visible state to a subsequent visible state according to an action, the set of gain vectors being generated for each visible state and used for calculation of a cumulative expected gain at and after a reference point in time, the apparatus comprising:
a generation section for recursively generating, by retroacting from a future point in time to the reference point in time, the set of gain vectors containing at least one gain vector including a component of a cumulative expected gain with respect to each hidden state, from which set of gain vectors the gain vector giving the maximum of the cumulative expected gain is to be selected.
18. A program product for causing a computer to function as a generation apparatus arranged to generate a set of gain vectors with respect to a transition model having observable visible states and unobservable hidden states and expressing a transition from a present visible state to a subsequent visible state according to an action, the set of gain vectors being generated for each visible state and used for calculation of a cumulative expected gain at and after a reference point in time, the program product being executed to cause the computer to function as:
a generation section for recursively generating, by retroacting from a future point in time to the reference point in time, the set of gain vectors containing at least one gain vector including a component of a cumulative expected gain with respect to each hidden state, from which set of gain vectors the gain vector giving the maximum of the cumulative expected gain is to be selected.
19. An apparatus arranged to select an optimum action in a transition model having observable visible states and unobservable hidden states and expressing a transition from a present visible state to a subsequent visible state according to an action, the apparatus comprising:
an acquisition section for obtaining, with respect to each visible state, a set of gain vectors containing at least one gain vector including a component of a cumulative expected gain with respect to each hidden state and used for calculation of a cumulative expected gain at and after a reference point in time;
a gain selection section for selecting, from the gain vectors according to the present visible state, the gain vector maximizing the cumulative expected gain with respect to a probability distribution over the hidden states at the present point in time; and
an action selection section for selecting an action corresponding to the selected gain vector as an optimum action.
20. A program product for causing a computer to function as a selecting apparatus arranged to select an optimum action in a transition model having observable visible states and unobservable hidden states and expressing a transition from a present visible state to a subsequent visible state according to an action, the program product being executed to cause the computer to function as:
an acquisition section for obtaining, with respect to each visible state, a set of gain vectors containing at least one gain vector including a component of a cumulative expected gain with respect to each hidden state and used for calculation of a cumulative expected gain at and after a reference point in time;
a gain selection section for selecting, from the gain vectors according to the present visible state, the gain vector maximizing the cumulative expected gain with respect to a probability distribution over the hidden states at the present point in time; and
an action selection section for selecting an action corresponding to the selected gain vector as an optimum action.
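The selection steps recited in claims 19 and 20 can be sketched as follows. This is an illustrative sketch, not the claimed apparatus. It assumes the gain vectors are stored per visible state as `(action, vector)` pairs and that the probability distribution over hidden states is a plain list of probabilities. The name `select_action` and this data layout are hypothetical.

```python
def select_action(gain_vectors, visible_state, belief):
    """Pick the action whose gain vector maximizes the expected cumulative gain.

    gain_vectors: mapping from visible state to a list of (action, vector) pairs.
    belief: probabilities over the hidden states at the present point in time.
    """
    best_action, best_value = None, float("-inf")
    # Only the vectors stored for the present visible state are considered.
    for action, vec in gain_vectors[visible_state]:
        # Expected cumulative gain: inner product of the belief and the gain vector.
        value = sum(b * g for b, g in zip(belief, vec))
        if value > best_value:
            best_action, best_value = action, value
    return best_action, best_value
```

For example, with two candidate vectors `[1.0, 0.0]` (action `"a"`) and `[0.0, 1.0]` (action `"b"`) and a belief of `[0.25, 0.75]`, the second vector gives the larger inner product, so action `"b"` is selected.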
Specification