Processing device with intuitive learning capability
Abstract
A method and apparatus for providing learning capability to a processing device, such as a computer game, are provided. One of a plurality of computer actions to be performed on the computer-based device is selected. In the case of a computer game, the computer actions can take the form of moves taken by a computer-manipulated object. A user input indicative of a user action, such as a move by a user-manipulated object, is received. An outcome value of the selected computer action is determined based on the user action. For example, in the case of a computer game, an intersection between the computer-manipulated object and the user-manipulated object may generate an outcome value indicative of a failure, whereas a non-intersection between them may generate an outcome value indicative of a success. An action probability distribution that includes probability values corresponding to said plurality of computer actions is updated based on the determined outcome value. The next computer action is selected based on this updated action probability distribution. For example, the probability value of the last computer action taken can be increased if the outcome value represents a success, thereby increasing the chance that such computer action will be selected in the future. In contrast, the probability value of the last computer action taken can be decreased if the outcome value represents a failure, thereby decreasing the chance that such computer action will be selected in the future. In this manner, the computer-based device learns the strategy of the user. This learning is directed to achieving one or more objectives of the processing device. For example, in the case of a computer game, the objective may be to match the skill level of the game with that of the player.
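The select-and-update cycle described in the abstract can be illustrated with a minimal sketch. It assumes a linear reward-penalty scheme, which is one common learning-automaton update and is not stated in the abstract itself; the function names and the `rate` parameter are hypothetical.

```python
import random


def select_action(probs, rng=random):
    """Sample an action index according to the action probability distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def update(probs, chosen, success, rate=0.1):
    """Return a new distribution after one outcome (assumed linear reward-penalty).

    On success the chosen action's probability rises and the others shrink;
    on failure the chosen action's probability falls and the freed mass is
    shared equally among the other actions. The result still sums to 1.
    """
    n = len(probs)
    if n < 2:
        raise ValueError("need at least two actions")
    new = []
    for i, p in enumerate(probs):
        if success:
            new.append(p + rate * (1 - p) if i == chosen else p * (1 - rate))
        else:
            new.append(p * (1 - rate) if i == chosen
                       else p * (1 - rate) + rate / (n - 1))
    return new
```

Calling `update` with `success=True` makes the chosen action more likely to be selected next time, and `success=False` makes it less likely, which mirrors the behavior the abstract describes.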
42 Citations
760 Claims
-
1. A method of providing learning capability to a processing device having one or more objectives, comprising:
-
receiving an action performed by a user;
selecting one of a plurality of processor actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions;
determining an outcome value based on one or both of said user action and said selected processor action;
updating said action probability distribution using a learning automaton based on said outcome value; and
modifying one or more subsequent processor action selections, outcome value determinations, and action probability distribution updates based on said one or more objectives. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 62)
-
-
45. A processing device having one or more objectives, comprising:
-
a probabilistic learning module having a learning automaton configured for learning a plurality of processor actions in response to a plurality of actions performed by a user; and
an intuition module configured for modifying a functionality of said probabilistic learning module based on said one or more objectives. - View Dependent Claims (46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61)
-
-
63. A method of providing learning capability to a computer game having an objective of matching a skill level of said computer game with a skill level of a game player, comprising:
-
receiving an action performed by said game player;
selecting one of a plurality of game actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of game actions;
determining an outcome value based on said player action and said selected game action;
updating said action probability distribution based on said outcome value; and
modifying one or more subsequent game action selections, outcome value determinations, and action probability distribution updates based on said objective. - View Dependent Claims (64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111)
-
-
112. A computer game having an objective of matching a skill level of said computer game with a skill level of a game player, comprising:
-
a probabilistic learning module configured for learning a plurality of game actions in response to a plurality of actions performed by a game player; and
an intuition module configured for modifying a functionality of said probabilistic learning module based on said objective. - View Dependent Claims (113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152)
-
-
134. A method of providing learning capability to a processing device, comprising:
-
generating an action probability distribution comprising a plurality of probability values organized among a plurality of action subsets, said plurality of probability values corresponding to a plurality of processor actions;
selecting one of said plurality of action subsets; and
selecting one of said plurality of processor actions from said selected action subset.
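The two-stage selection recited in claim 134 might be sketched as follows. The claim does not say how a subset is chosen, so this example assumes a subset is sampled in proportion to the total probability it holds, and an action is then sampled within it; `select_from_subsets` and the data layout are hypothetical.

```python
import random


def select_from_subsets(subsets, rng=random):
    """Pick an action subset, then an action inside it.

    subsets: list of dicts mapping an action name to its probability value.
    Returns (subset_index, action_name).
    """
    # Stage 1 (assumed rule): weight each subset by the probability mass it holds.
    weights = [sum(s.values()) for s in subsets]
    idx = rng.choices(range(len(subsets)), weights=weights, k=1)[0]

    # Stage 2: sample one action from the selected subset only.
    chosen = subsets[idx]
    actions = list(chosen)
    action = rng.choices(actions, weights=[chosen[a] for a in actions], k=1)[0]
    return idx, action
```

Under this assumed rule the overall chance of picking any action equals its own probability value, so the grouping changes how the selection is organized without changing the distribution.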
-
-
153. A method of providing learning capability to a computer game, comprising:
-
generating an action probability distribution comprising a plurality of probability values organized among a plurality of action subsets, said plurality of probability values corresponding to a plurality of game actions;
selecting one of said plurality of action subsets; and
selecting one of said plurality of game actions from said selected action subset. - View Dependent Claims (154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186)
-
-
187. A method of providing learning capability to a processing device, comprising:
-
generating an action probability distribution using one or more learning algorithms, said action probability distribution comprising a plurality of probability values corresponding to a plurality of processor actions;
modifying said one or more learning algorithms; and
updating said action probability distribution using said modified one or more learning algorithms. - View Dependent Claims (188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209)
-
-
210. A method of providing learning capability to a computer game, comprising:
-
generating an action probability distribution using one or more learning algorithms, said action probability distribution comprising a plurality of probability values corresponding to a plurality of game actions;
modifying said one or more learning algorithms; and
updating said action probability distribution using said modified one or more learning algorithms. - View Dependent Claims (211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250)
-
-
251. A method of matching a skill level of a game player with a skill level of a computer game, comprising:
-
receiving an action performed by said game player;
selecting one of a plurality of game actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of game actions;
determining if said selected game action is successful;
determining a current skill level of said game player relative to a current skill level of said computer game; and
updating said action probability distribution using a reward algorithm if said selected game action is successful and said relative skill level is relatively high, or if said selected game action is unsuccessful and said relative skill level is relatively low. - View Dependent Claims (252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262)
-
-
263. A method of matching a skill level of a game player with a skill level of a computer game, comprising:
-
receiving an action performed by said game player;
selecting one of a plurality of game actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of game actions;
determining if said selected game action is successful;
determining a current skill level of said game player relative to a current skill level of said computer game; and
updating said action probability distribution using a penalty algorithm if said selected game action is unsuccessful and said relative skill level is relatively high, or if said selected game action is successful and said relative skill level is relatively low. - View Dependent Claims (264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274)
-
-
275. A method of matching a skill level of a game player with a skill level of a computer game, comprising:
-
receiving an action performed by said game player;
selecting one of a plurality of game actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of game actions;
determining if said selected game action is successful;
determining a current skill level of said game player relative to a current skill level of said computer game;
updating said action probability distribution using a reward algorithm if said selected game action is successful and said relative skill level is relatively high, or if said selected game action is unsuccessful and said relative skill level is relatively low; and
updating said action probability distribution using a penalty algorithm if said selected game action is unsuccessful and said relative skill level is relatively high, or if said selected game action is successful and said relative skill level is relatively low. - View Dependent Claims (276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286)
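The four reward and penalty conditions in claim 275 reduce to a single comparison, shown in the sketch below. The function name and the boolean encoding of "relatively high" versus "relatively low" are assumptions made for illustration; the claim does not say how the relative skill level is measured.

```python
def choose_algorithm(action_successful: bool, relative_skill_high: bool) -> str:
    """Return which update to apply to the action probability distribution.

    Reward:  successful and skill relatively high, or
             unsuccessful and skill relatively low.
    Penalty: unsuccessful and skill relatively high, or
             successful and skill relatively low.
    Both cases collapse to comparing the two flags.
    """
    return "reward" if action_successful == relative_skill_high else "penalty"
```

Rewarding unsuccessful game actions when the relative skill level is low steers the game away from actions that are too strong for the player, which is how the update can serve skill matching rather than simply winning.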
-
-
287. A method of matching a skill level of a game player with a skill level of a computer game, comprising:
-
receiving an action performed by said game player;
selecting one of a plurality of game actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of game actions;
determining if said selected game action is successful;
determining a current skill level of said game player relative to a current skill level of said computer game;
generating a successful outcome value if said selected game action is successful and said relative skill level is relatively high, or if said selected game action is unsuccessful and said relative skill level is relatively low; and
updating said action probability distribution based on said successful outcome value. - View Dependent Claims (288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298)
-
-
299. A method of matching a skill level of a game player with a skill level of a computer game, comprising:
-
receiving an action performed by said game player;
selecting one of a plurality of game actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of game actions;
determining if said selected game action is successful;
determining a current skill level of said game player relative to a current skill level of said computer game;
generating an unsuccessful outcome value if said selected game action is unsuccessful and said relative skill level is relatively high, or if said selected game action is successful and said relative skill level is relatively low; and
updating said action probability distribution based on said unsuccessful outcome value. - View Dependent Claims (300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310)
-
-
311. A method of matching a skill level of a game player with a skill level of a computer game, comprising:
-
receiving an action performed by said game player;
selecting one of a plurality of game actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of game actions;
determining if said selected game action is successful;
determining a current skill level of said game player relative to a current skill level of said computer game;
generating a successful outcome value if said selected game action is successful and said relative skill level is relatively high, or if said selected game action is unsuccessful and said relative skill level is relatively low;
generating an unsuccessful outcome value if said selected game action is unsuccessful and said relative skill level is relatively high, or if said selected game action is successful and said relative skill level is relatively low; and
updating said action probability distribution based on said successful outcome value and said unsuccessful outcome value. - View Dependent Claims (312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322)
-
-
323. A method of providing learning capability to a processing device, comprising:
-
generating an action probability distribution comprising a plurality of probability values corresponding to a plurality of processor actions; and
transforming said action probability distribution. - View Dependent Claims (324, 325, 326, 327, 328, 329, 330, 331, 332, 333)
-
-
334. A method of providing learning capability to a computer game, comprising:
-
generating an action probability distribution comprising a plurality of probability values corresponding to a plurality of game actions; and
transforming said action probability distribution. - View Dependent Claims (335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354)
-
-
355. A method of providing learning capability to a processing device, comprising:
-
generating an action probability distribution comprising a plurality of probability values corresponding to a plurality of processor actions; and
limiting one or more of said plurality of probability values. - View Dependent Claims (356, 357, 358, 359, 360, 361, 362, 363, 364, 365)
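One possible way to limit probability values, as recited in claim 355, is to enforce a minimum on every entry. The sketch below assumes a simple mixing rule that is not taken from the claims: each value is scaled down and a fixed floor is added, so the result still sums to 1. The name `limit_probabilities` and the `floor` parameter are hypothetical.

```python
def limit_probabilities(probs, floor=0.01):
    """Return a distribution in which no probability value is below `floor`.

    Assumes `probs` sums to 1. Each value p becomes floor + (1 - n*floor) * p,
    which keeps the total at 1 and also caps any single value at
    1 - (n - 1) * floor.
    """
    n = len(probs)
    if n == 0 or not 0 <= floor * n <= 1:
        raise ValueError("floor must satisfy 0 <= floor * len(probs) <= 1")
    scale = 1 - n * floor
    return [floor + scale * p for p in probs]
```

Because every action keeps a non-zero chance of being selected, a rule like this also keeps the distribution from collapsing onto a single action.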
-
-
366. A method of providing learning capability to a computer game, comprising:
-
generating an action probability distribution comprising a plurality of probability values corresponding to a plurality of game actions; and
limiting one or more of said plurality of probability values. - View Dependent Claims (367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379)
-
-
380. A method of providing learning capability to a processing device, comprising:
-
receiving an action performed by a user;
selecting one of a plurality of processor actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions;
determining an outcome value based on one or both of said user action and said selected processor action;
updating said action probability distribution based on said outcome value; and
repeating said foregoing steps, wherein said action probability distribution is prevented from substantially converging to a single probability value. - View Dependent Claims (381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405)
-
-
406. A processing device, comprising:
-
a probabilistic learning module configured for learning a plurality of processor actions in response to a plurality of actions performed by a user; and
an intuition module configured for preventing said probabilistic learning module from substantially converging to a single processor action. - View Dependent Claims (407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418)
-
-
419. A method of providing learning capability to an electronic device having a function independent of determining an optimum action, comprising:
-
receiving an action performed by a user;
selecting one of a plurality of processor actions, said action selection being based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions, wherein said selected processor action affects said electronic device function;
determining an outcome value based on said user action and said selected processor action; and
updating said action probability distribution based on said outcome value. - View Dependent Claims (420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444)
-
-
445. A processing device having a function independent of determining an optimum action, comprising:
-
an action selection module configured for selecting one of a plurality of processor actions, said action selection being based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions, wherein said selected processor action affects said electronic device function;
an outcome evaluation module configured for determining an outcome value based on one or both of said user action and said selected processor action; and
a probability update module configured for updating said action probability distribution based on said outcome value. - View Dependent Claims (446, 447, 448, 449, 450, 451, 452, 453, 454)
-
-
455. A method of providing learning capability to a processing device having one or more objectives, comprising:
-
receiving actions from a plurality of users;
selecting one or more of a plurality of processor actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions;
determining one or more outcome values based on one or both of said plurality of user actions and said selected one or more processor actions;
updating said action probability distribution using one or more learning automatons based on said one or more outcome values; and
modifying one or more subsequent processor action selections, outcome value determinations, and action probability distribution updates based on said one or more objectives. - View Dependent Claims (456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477)
-
-
478. A method of providing learning capability to a processing device having one or more objectives, comprising:
-
receiving actions from users divided amongst a plurality of user sets;
for each of said user sets:
selecting one or more of a plurality of processor actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions;
determining one or more outcome values based on one or more actions from said each user set and said selected one or more processor actions;
updating said action probability distribution using a learning automaton based on said one or more outcome values; and
modifying one or more subsequent processor action selections, outcome value determinations, and action probability distribution updates based on said one or more objectives. - View Dependent Claims (479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500)
-
-
501. A processing device having one or more objectives, comprising:
-
a probabilistic learning module having a learning automaton configured for learning a plurality of processor actions in response to actions from a plurality of users; and
an intuition module configured for modifying a functionality of said probabilistic learning module based on said one or more objectives. - View Dependent Claims (502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535)
-
-
536. A method of providing learning capability to a processing device having one or more objectives, comprising:
-
receiving a plurality of user actions;
selecting one or more of a plurality of processor actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions;
weighting said plurality of user actions;
determining one or more outcome values based on said selected one or more processor actions and said plurality of weighted user actions; and
updating said action probability distribution based on said outcome value. - View Dependent Claims (537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555)
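A minimal sketch of the weighting step in claim 536 is given below. It assumes that each user action has already been judged as a success or failure for the selected processor action, and that the outcome value is 1 when the weighted share of successes reaches one half; both the threshold and the name `weighted_outcome` are assumptions, not part of the claim.

```python
def weighted_outcome(successes, weights):
    """Combine per-user results into a single outcome value (1 or 0).

    successes: list of bools, one per user action.
    weights:   list of non-negative numbers, one per user action.
    """
    if not weights or len(successes) != len(weights):
        raise ValueError("successes and weights must be non-empty and equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Weighted share of user actions for which the processor action succeeded.
    score = sum(w for s, w in zip(successes, weights) if s) / total
    return 1 if score >= 0.5 else 0
```

With equal weights this behaves like a plain vote; raising one user's weight lets that user's actions dominate the outcome value.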
-
-
556. A processing device having one or more objectives, comprising:
-
an action selection module configured for selecting one or more of a plurality of processor actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions;
an outcome evaluation module configured for weighting a plurality of received user actions, and for determining one or more outcome values based on said selected one or more processor actions and said plurality of weighted user actions; and
a probability update module configured for updating said action probability distribution based on said outcome value. - View Dependent Claims (557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571)
-
-
572. A method of providing learning capability to a processing device having one or more objectives, comprising:
-
receiving a plurality of user actions;
selecting one of a plurality of processor actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions;
determining a success ratio of said selected processor action relative to said plurality of user actions;
comparing said determined success ratio to a reference success ratio;
determining an outcome value based on said success ratio comparison; and
updating said action probability distribution based on said outcome value. - View Dependent Claims (573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590)
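The success-ratio comparison in claim 572 can be illustrated as follows. The sketch assumes the ratio is the fraction of user actions against which the selected processor action succeeded, and that meeting or exceeding the reference ratio yields an outcome value of 1; the function name and the default reference value are hypothetical.

```python
def outcome_from_success_ratio(results, reference_ratio=0.5):
    """Return (success_ratio, outcome_value) for one selected processor action.

    results: list of bools, one per user action (True means success).
    """
    if not results:
        raise ValueError("at least one user action is required")
    ratio = sum(1 for r in results if r) / len(results)
    # Assumed rule: the outcome is a success when the ratio meets the reference.
    return ratio, (1 if ratio >= reference_ratio else 0)
```

The returned outcome value would then feed the probability update in the final step of the claim.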
-
-
591. A processing device having one or more objectives, comprising:
-
an action selection module configured for selecting one of a plurality of processor actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions;
an outcome evaluation module configured for determining a success ratio of said selected processor action relative to a plurality of user actions, for comparing said determined success ratio to a reference success ratio, and for determining an outcome value based on said success ratio comparison; and
a probability update module configured for updating said action probability distribution based on said outcome value. - View Dependent Claims (592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607)
-
-
608. A method of providing learning capability to a processing device having one or more objectives, comprising:
-
receiving actions from a plurality of users;
selecting one of a plurality of processor actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions;
determining if said selected processor action has a relative success level for a majority of said plurality of users;
determining an outcome value based on said success determination; and
updating said action probability distribution based on said outcome value. - View Dependent Claims (609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624)
-
-
625. A processing device having one or more objectives, comprising:
-
an action selection module configured for selecting one of a plurality of processor actions based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of processor actions;
an outcome evaluation module configured for determining if said selected processor action has a relative success level for a majority of a plurality of users, and for determining an outcome value based on said success determination; and
a probability update module configured for updating said action probability distribution based on said outcome value. - View Dependent Claims (626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639)
-
-
640. A method of providing learning capability to a processing device having one or more objectives, comprising:
-
receiving one or more user actions;
selecting one or more of a plurality of processor actions that are respectively linked to a plurality of user parameters, said selection being based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of linked processor actions;
linking said one or more selected processor actions with one or more of said plurality of user parameters;
determining one or more outcome values based on said one or more linked processor actions and said one or more user actions; and
updating said action probability distribution based on said one or more outcome values. - View Dependent Claims (641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660)
-
-
661. A processing device having one or more objectives, comprising:
-
an action selection module configured for selecting one or more of a plurality of processor actions that are respectively linked to a plurality of user parameters, said selection being based on an action probability distribution comprising a plurality of probability values corresponding to said plurality of linked processor actions;
an outcome evaluation module configured for linking said one or more selected processor actions with one or more of said plurality of user parameters, and for determining one or more outcome values based on said one or more linked processor actions and one or more user actions; and
a probability update module configured for updating said action probability distribution based on said one or more outcome values. - View Dependent Claims (662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679)
-
-
680. A method of providing learning capability to a phone number calling system having an objective of anticipating called phone numbers, comprising:
-
generating a phone list containing at least a plurality of listed phone numbers and a phone number probability distribution comprising a plurality of probability values corresponding to said plurality of listed phone numbers;
selecting a set of phone numbers from said plurality of listed phone numbers based on said phone number probability distribution;
generating a performance index indicative of a performance of said phone number calling system relative to said objective; and
modifying said phone number probability distribution based on said performance index. - View Dependent Claims (681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714)
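The steps of claim 680 might look like the sketch below. Several details are assumptions made only for illustration: the selected set is the `k` most probable numbers, the performance index is 1 when the called number was in that set and 0 otherwise, and a miss triggers a larger update than a hit. The function names and rate parameters are hypothetical.

```python
def select_favorites(phone_probs, k=3):
    """Return the k listed phone numbers with the highest probability values."""
    return sorted(phone_probs, key=phone_probs.get, reverse=True)[:k]


def record_call(phone_probs, called, favorites, hit_rate=0.05, miss_rate=0.2):
    """Return (performance_index, updated_distribution) after one call.

    performance_index is 1 if the called number was anticipated, else 0.
    The called number must already be on the phone list.
    """
    if called not in phone_probs:
        raise KeyError(called)
    index = 1 if called in favorites else 0
    # Assumed rule: learn faster when the system failed to anticipate the call.
    rate = hit_rate if index else miss_rate
    updated = {n: p * (1 - rate) for n, p in phone_probs.items()}
    updated[called] += rate  # total stays at 1
    return index, updated
```

For example, with three listed numbers and `k=2`, calling the least probable number gives a performance index of 0 and shifts probability toward that number, so it is more likely to appear in the next selected set.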
-
-
715. A phone number calling system having an objective of anticipating called phone numbers, comprising:
-
a probabilistic learning module configured for learning favorite phone numbers of a user in response to phone calls; and
an intuition module configured for modifying a functionality of said probabilistic learning module based on said objective. - View Dependent Claims (716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739)
-
-
740. A phone number calling system having an objective of anticipating called phone numbers, comprising:
-
a probabilistic learning module configured for learning favorite phone numbers of a user in response to phone calls; and
an intuition module configured for modifying a functionality of said probabilistic learning module based on said objective. - View Dependent Claims (741, 742, 743, 744, 745, 746, 747)
-
-
748. A method of providing learning capability to a phone number calling system, comprising:
-
receiving a plurality of phone numbers;
maintaining a phone list containing said plurality of phone numbers and a plurality of priority values respectively associated with said plurality of phone numbers;
selecting a set of phone numbers from said plurality of listed phone numbers based on said plurality of priority values; and
communicating said phone number set to a user. - View Dependent Claims (749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760)
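Claim 748 uses priority values rather than a probability distribution. A minimal sketch, assuming the priority value of a number is simply how many times it has been received, is shown below; the `PhoneList` class and its method names are hypothetical, and "communicating" the set is reduced to returning it.

```python
from collections import Counter


class PhoneList:
    """Phone list that keeps one priority value per phone number."""

    def __init__(self):
        self.priority = Counter()

    def receive(self, number):
        """Record a phone number; assumed rule: priority is the call count."""
        self.priority[number] += 1

    def favorites(self, k=3):
        """Return the k numbers with the highest priority values."""
        return [number for number, _ in self.priority.most_common(k)]
```

A caller would invoke `receive` for each phone number as it arrives and present the result of `favorites` to the user, for instance as a short list on a phone display.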
-
Specification