INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM

US 20100318478A1
Filed: 06/01/2010
Published: 12/16/2010
Est. Priority Date: 06/11/2009
Status: Abandoned Application

Abstract

An information processing device includes: a calculating unit configured to calculate a current-state series candidate that is a state series for an agent capable of actions reaching the current state, based on a state transition probability model obtained by performing learning of the state transition probability model stipulated by a state transition probability that a state will be transitioned according to each of actions performed by an agent capable of actions, and an observation probability that a predetermined observation value will be observed from the state, using an action performed by the agent, and an observation value observed at the agent when the agent performs an action; and a determining unit configured to determine an action to be performed next by the agent using the current-state series candidate in accordance with a predetermined strategy.
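
The calculation of a current-state series candidate can be illustrated with a Viterbi-style decoder over a model whose state transition probabilities depend on the action performed. This is only a minimal sketch under stated assumptions, not the method of the specification: the names `pi`, `A`, `B`, and `most_likely_state_series` are hypothetical, the model is assumed to be already learned, and states, actions, and observation values are assumed to be small integers.

```python
import math


def _log(p):
    # Logarithm that maps probability 0 to -inf instead of raising an error.
    return math.log(p) if p > 0.0 else float("-inf")


def most_likely_state_series(pi, A, B, actions, observations):
    """Return (state series, log-likelihood) for the given action and observation series.

    pi[i]           -- assumed initial probability of state i
    A[u][i][j]      -- probability of a transition from state i to state j under action u
    B[i][o]         -- probability of observing value o in state i
    actions[t]      -- action performed between step t and step t + 1
    observations[t] -- observation value at step t
    """
    if len(actions) != len(observations) - 1:
        raise ValueError("expected exactly one action between consecutive observations")
    n = len(pi)
    # Best log-probability of any state series ending in each state at step 0.
    delta = [_log(pi[i]) + _log(B[i][observations[0]]) for i in range(n)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for t, u in enumerate(actions):
        o = observations[t + 1]
        new_delta, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] + _log(A[u][i][j]))
            ptr.append(best_i)
            new_delta.append(delta[best_i] + _log(A[u][best_i][j]) + _log(B[j][o]))
        delta = new_delta
        back.append(ptr)
    # Trace back from the most likely final (current) state.
    last = max(range(n), key=lambda i: delta[i])
    series = [last]
    for ptr in reversed(back):
        series.append(ptr[series[-1]])
    series.reverse()
    return series, delta[last]


# Illustrative values only: two states, two actions (0 = stay, 1 = move).
pi = [0.5, 0.5]
A = [
    [[0.9, 0.1], [0.1, 0.9]],  # action 0
    [[0.1, 0.9], [0.9, 0.1]],  # action 1
]
B = [[0.8, 0.2], [0.2, 0.8]]
series, loglik = most_likely_state_series(pi, A, B, actions=[1, 0], observations=[0, 1, 1])
```

The last element of `series` plays the role of the current state, and the whole list is one candidate for the current-state series.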

16 Claims

1. An information processing device comprising:
- calculating means configured to calculate a current-state series candidate that is a state series for an agent capable of actions reaching the current state, based on a state transition probability model obtained by performing learning of said state transition probability model stipulated by a state transition probability that a state will be transitioned according to each of actions performed by an agent capable of actions, and an observation probability that a predetermined observation value will be observed from said state, using an action performed by said agent, and an observation value observed at said agent when said agent performs an action; and
  
  determining means configured to determine an action to be performed next by said agent using said current-state series candidate in accordance with a predetermined strategy.
  - 2. The information processing device according to claim 1, wherein said determining means determine an action in accordance with a strategy for increasing information of an unknown situation not obtained at said state transition probability model.
  - 3. The information processing device according to claim 2, wherein said calculating means estimate, with an action series of actions performed by said agent, and an observation value series of observation values observed at said agent when said actions are performed as an action series for recognition for recognizing the situation of an agent, and an observation value series, one or more state series for recognition that are state series wherein state transition occurs in which said action series for recognition and said observation value series are observed, and select one or more candidates of said current-state series out of one or more of said state series for recognition;
    - and wherein said determining means detect an action of which the state transition probability of state transition from a final state that is the final state of said current-state series candidate to an immediate before state that is a state immediately before said final state is the maximum as a return action wherein state transition for returning the state to said immediate before state regarding each of one or more candidates of said current-state series, obtain the sum of the state transition probabilities of state transitions with said final state as the transition source for each action as an action suitability degree representing suitability for performing the action thereof regarding each of one or more candidates of said current-state series, obtain an action other than said return action of actions of which said action suitability degree is equal to or greater than a predetermined threshold, as an action candidate to be performed next regarding each of one or more candidates of said current-state series, and determine an action to be performed next out of said action candidates to be performed next.
  - 4. The information processing device according to claim 1, wherein said determining means determine an action in accordance with a strategy for increasing information whereby the situation of said agent is recognizable.
  - 5. The information processing device according to claim 4, wherein said calculating means estimate, with an action series of actions performed by said agent, and an observation value series of observation values observed at said agent when said actions are performed as an action series for recognition for recognizing the situation of an agent, and an observation value series, one or more state series for recognition that are state series wherein state transition occurs in which said action series for recognition and said observation value series are observed, and select one or more candidates of said current-state series out of one or more of said state series for recognition;
    - and wherein said determining means detect an action of which the state transition probability of state transition from a final state that is the final state of said current-state series candidate to an immediate before state that is a state immediately before said final state is the maximum as an action to be performed next regarding each of one or more candidates of said current-state series, and determine an action to be performed next out of said action candidates to be performed next.
  - 6. The information processing device according to claim 1, wherein said determining means determine an action in accordance with a strategy for performing an action performed by said agent in a known situation similar to the current situation of said agent of known situations obtained at said state transition probability model.
  - 7. The information processing device according to claim 6, wherein said calculating means estimate, with an action series of actions performed by said agent, and an observation value series of observation values observed at said agent when said actions are performed as an action series for recognition for recognizing the situation of an agent, and an observation value series, one or more state series for recognition that are state series wherein state transition occurs in which said action series for recognition and said observation value series are observed, and select one or more candidates of said current-state series out of one or more of said state series for recognition;
    - and wherein said determining means obtain the sum of the state transition probabilities of state transitions with a final state that is the final state of said current-state series candidate as the transition source for each action as an action suitability degree representing suitability for performing the action thereof regarding each of one or more candidates of said current-state series, obtain an action of which said action suitability degree is equal to or greater than a predetermined threshold, as an action candidate to be performed next regarding each of one or more candidates of said current-state series, and determine an action to be performed next out of said action candidates to be performed next.
  - 8. The information processing device according to claim 1, wherein said determining means select a strategy for determining an action out of a plurality of strategies, and determine an action in accordance with the strategy thereof.
  - 9. The information processing device according to claim 8, wherein said determining means select a strategy for determining an action out of a strategy for increasing information of an unknown situation not obtained at said state transition probability model, and a strategy for increasing information whereby the situation of said agent is recognizable.
  - 10. The information processing device according to claim 9, wherein said determining means select a strategy based on elapsed time since an unknown situation not obtained at said state transition probability model.
  - 11. The information processing device according to claim 9, wherein said determining means select a strategy based on the time of a known situation obtained at said state transition probability model, or the percentage of an unknown situation not obtained at said state transition probability model, of imminent predetermined time.
  - 12. The information processing device according to claim 1, wherein said calculating means repeat to estimate, with an action series of actions performed by said agent, and an observation value series of observation values observed at said agent when said actions are performed as an action series for recognition for recognizing the situation of an agent, and an observation value series, a most likely state series that is a state series where state transition occurs in which likelihood for said action series for recognition, and said observation value series being observed is the highest, and to determine whether the situation of said agent is a known situation obtained at said state transition probability model, or an unknown situation not obtained at said state transition probability model based on said most likely state series while increasing the series lengths of said action series for recognition and said observation value series until determination is made that the situation of said agent is said unknown situation, estimate one or more of state series for recognition that are state series where state transition occurs in which said action series for recognition and said observation value series, of which the series lengths are shorter than said series lengths at the time of determination being made that the situation of said agent is said unknown situation by one sample worth are observed, and select one or more candidates of said current state series out of said one or more state series for recognition;
    - and wherein said determining means determine an action using one or more candidates of said current state series.
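
The return action and the action suitability degree recited in claims 3, 5, and 7 can be sketched as follows. This is a hedged illustration, not the implementation of the specification: the function names and the layout `A[u][i][j]` (probability of a transition from state i to state j under action u) are assumptions. The sketch also assumes that the rows of the learned model need not sum to 1, for example because an action was rarely or never tried from a state; otherwise the suitability degree would not discriminate between actions.

```python
def return_action(A, final_state, previous_state):
    """Action with the highest probability of moving from final_state back to previous_state."""
    return max(range(len(A)), key=lambda u: A[u][final_state][previous_state])


def action_suitability(A, final_state):
    """Per action, the sum of transition probabilities with final_state as the transition source."""
    return [sum(A[u][final_state]) for u in range(len(A))]


def candidate_actions(A, series, threshold, exclude_return=False):
    """Actions whose suitability degree is at least threshold.

    series is one current-state series candidate (a list of state indices).
    With exclude_return=True the return action is dropped, as in claim 3;
    with exclude_return=False the selection follows claim 7.
    """
    final_state = series[-1]
    suitability = action_suitability(A, final_state)
    candidates = [u for u, s in enumerate(suitability) if s >= threshold]
    if exclude_return and len(series) >= 2:
        back = return_action(A, final_state, series[-2])
        candidates = [u for u in candidates if u != back]
    return candidates


# Illustrative values only: three states, three actions.
A = [
    [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.9, 0.1]],   # action 0
    [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.7, 0.1, 0.2]],   # action 1
    [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 0.05]],  # action 2, barely learned from state 2
]
series = [0, 1, 2]  # candidate ending in state 2, preceded by state 1

back = return_action(A, final_state=2, previous_state=1)         # claim 5: perform this action next
known = candidate_actions(A, series, threshold=0.5)              # claim 7
explore = candidate_actions(A, series, 0.5, exclude_return=True)  # claim 3
```

With these values, action 0 is the return action, so it is the choice under the recognition strategy of claim 5 and is excluded under the strategy of claim 3. How a single action is finally picked from several candidates is left open here.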

13. An information processing method comprising the steps of:
- calculating of a current-state series candidate that is a state series for an agent capable of actions reaching the current state, based on a state transition probability model obtained by performing learning of said state transition probability model stipulated by a state transition probability that a state will be transitioned according to each of actions performed by an agent capable of actions, and an observation probability that a predetermined observation value will be observed from said state, using an action performed by said agent, and an observation value observed at said agent when said agent performs an action; and
  
  determining an action to be performed next by said agent using said current-state series candidate in accordance with a predetermined strategy.

14. A program causing a computer serving as:
- calculating means configured to calculate a current-state series candidate that is a state series for an agent capable of actions reaching the current state, based on a state transition probability model obtained by performing learning of said state transition probability model stipulated by a state transition probability that a state will be transitioned according to each of actions performed by an agent capable of actions, and an observation probability that a predetermined observation value will be observed from said state, using an action performed by said agent, and an observation value observed at said agent when said agent performs an action; and
  
  determining means configured to determine an action to be performed next by said agent using said current-state series candidate in accordance with a predetermined strategy.

15. An information processing device comprising:
- a calculating unit configured to calculate a current-state series candidate that is a state series for an agent capable of actions reaching the current state, based on a state transition probability model obtained by performing learning of said state transition probability model stipulated by a state transition probability that a state will be transitioned according to each of actions performed by an agent capable of actions, and an observation probability that a predetermined observation value will be observed from said state, using an action performed by said agent, and an observation value observed at said agent when said agent performs an action; and
  
  a determining unit configured to determine an action to be performed next by said agent using said current-state series candidate in accordance with a predetermined strategy.

16. A program causing a computer serving as:
- a calculating unit configured to calculate a current-state series candidate that is a state series for an agent capable of actions reaching the current state, based on a state transition probability model obtained by performing learning of said state transition probability model stipulated by a state transition probability that a state will be transitioned according to each of actions performed by an agent capable of actions, and an observation probability that a predetermined observation value will be observed from said state, using an action performed by said agent, and an observation value observed at said agent when said agent performs an action; and
  
  a determining unit configured to determine an action to be performed next by said agent using said current-state series candidate in accordance with a predetermined strategy.
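
Claims 8 to 11 recite selecting between strategies, for example based on the percentage of unknown situations within a recent predetermined time. The claims do not state which strategy is chosen under which condition, so the rule below is purely an assumption made for illustration: when the share of unknown situations in a recent window exceeds a limit, the sketch switches to the strategy for making the situation recognizable, and otherwise it keeps gathering information about unknown situations. The class name, constants, and parameters are hypothetical.

```python
from collections import deque

EXPLORE = "increase information of an unknown situation"
RECOGNIZE = "increase information whereby the situation is recognizable"


class StrategySelector:
    """Chooses a strategy from the share of unknown situations in a recent window."""

    def __init__(self, window, max_unknown_ratio):
        # Keep only the most recent `window` known/unknown determinations.
        self.history = deque(maxlen=window)
        self.max_unknown_ratio = max_unknown_ratio

    def record(self, is_known):
        """Store whether the situation at the latest step was judged known."""
        self.history.append(bool(is_known))

    def unknown_ratio(self):
        """Fraction of recent steps judged unknown (0.0 if nothing is recorded yet)."""
        if not self.history:
            return 0.0
        return self.history.count(False) / len(self.history)

    def select(self):
        # Assumed rule: too much recent time in unknown situations
        # triggers the recognition strategy; otherwise keep exploring.
        if self.unknown_ratio() > self.max_unknown_ratio:
            return RECOGNIZE
        return EXPLORE


selector = StrategySelector(window=4, max_unknown_ratio=0.5)
for is_known in [True, True]:
    selector.record(is_known)
early_choice = selector.select()
for is_known in [False, False, False]:
    selector.record(is_known)
late_choice = selector.select()
```

A selection based on elapsed time since the last unknown situation, as in claim 10, could be built the same way by storing a step counter instead of a window of flags.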

Current Assignee
Sony Corporation (Sony Group Corp.)
Original Assignee
Sony Corporation (Sony Group Corp.)
Inventors
Sabe, Kohtaro; Noda, Kuniaki; Kawamoto, Kenta; Yoshiike, Yukiko

Application Number

US12/791,240
Publication Number

US 20100318478A1
Field of Search
US Class Current

706/12
CPC Class Codes

G06F 18/295   Markov models or related mo...

G06N 3/006   based on simulated virtual ...

G06V 20/10   Terrestrial scenes scenes u...

G06V 40/20   Movements or behaviour, e.g...
