Applications

RLGO is a Go program based on reinforcement learning techniques. It combines TD learning and TD search, using a million binary features matching simple patterns of stones. RLGO outperformed traditional (pre-Monte-Carlo) programs in 9x9 Go.

Source MLJ PhD ICML-08 IJCAI-07
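
The papers above give RLGO's full details; as a minimal illustration, the learning rule at its core, linear TD(0) over sparse binary features, can be sketched as follows (names and sizes are made up, not RLGO's actual code):

```python
# Sketch of linear TD(0) with sparse binary features, as used (at far
# larger scale, with a million pattern features) in RLGO-style evaluation.

def td0_update(weights, active, active_next, reward, alpha=0.1, gamma=1.0):
    """One TD(0) step. `active`/`active_next` hold the indices of the
    binary features that are on in the current and next positions."""
    v = sum(weights[i] for i in active)           # V(s) = w . phi(s)
    v_next = sum(weights[i] for i in active_next)
    delta = reward + gamma * v_next - v           # TD error
    for i in active:                              # gradient of V(s) is phi(s)
        weights[i] += alpha * delta
    return delta

weights = [0.0] * 8
# terminal win (reward 1) from a position with features {0, 3} active
td0_update(weights, active=[0, 3], active_next=[], reward=1.0)
print(weights[0], weights[3])   # 0.1 0.1
```

Because the features are binary and sparse, each update touches only the handful of weights whose patterns match the position, which is what makes a million-feature evaluation fast.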

Joel Veness’ Meep is the first master-level chess program with an evaluation function that was learnt entirely from self-play, by bootstrapping from deep searches.

NIPS-09
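
The idea behind Meep's learning, which the NIPS-09 paper develops as RootStrap and TreeStrap, is to regress a linear evaluation toward the value returned by a deep search from the same position. A toy sketch (illustrative names, not Meep's code):

```python
# Sketch of bootstrapping an evaluation function from game-tree search:
# move the linear evaluation w . phi(s) toward the deep-search value.

def bootstrap_update(weights, features, search_value, alpha=0.01):
    """One regression step toward the search backup for one position."""
    v = sum(w * f for w, f in zip(weights, features))
    delta = search_value - v                      # error vs. search value
    return [w + alpha * delta * f for w, f in zip(weights, features)]

w = [0.0, 0.0, 0.0]
w = bootstrap_update(w, features=[1.0, 0.0, 2.0], search_value=1.5)
print(w)   # each active feature nudged toward the search value
```

Self-play then supplies an endless stream of (position, search value) pairs, so no hand-labelled training data is needed.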

Sylvain Gelly’s MoGo (2007) is a Go program based on Monte-Carlo tree search. It was the world’s first master-level 9x9 Computer Go program, and the first program to beat a human professional in even games on 9x9 boards and in handicap games on 19x19 boards.

Source CACM AIJ PhD AAAI-08 ICML-07
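
The selection step at the core of Monte-Carlo tree search can be sketched with the generic UCB1 rule (MoGo itself adds many refinements, such as RAVE and prior knowledge; this is only the basic UCT idea):

```python
import math

# Sketch of UCB1 selection, the core of Monte-Carlo tree search:
# balance exploitation (mean value) against exploration (visit counts).

def uct_select(children, c=1.4):
    """Pick the child maximising mean value plus an exploration bonus."""
    total = sum(ch["visits"] for ch in children)
    def ucb(ch):
        if ch["visits"] == 0:
            return float("inf")          # always try unvisited moves first
        mean = ch["wins"] / ch["visits"]
        return mean + c * math.sqrt(math.log(total) / ch["visits"])
    return max(children, key=ucb)

children = [{"move": "a", "wins": 6, "visits": 10},
            {"move": "b", "wins": 2, "visits": 10},
            {"move": "c", "wins": 0, "visits": 0}]
print(uct_select(children)["move"])   # c: unvisited moves are tried first
```

Repeating select, expand, rollout, and backup thousands of times per move concentrates simulations on the most promising lines.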

In a previous life, I was CTO for Elixir Studios and lead programmer on the PC strategy game Republic: the Revolution.

Trailer

Partially observable Monte-Carlo planning (POMCP) enables real-time planning in games with hidden state.

NIPS-10  

Demo Source
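
A distinctive ingredient of POMCP is its particle-filter belief state. A toy sketch of the update (illustrative only, not the paper's implementation): the belief over hidden states is an unweighted bag of particles, re-simulated through a generative model and kept only when the simulated observation matches the real one.

```python
import random

# Sketch of a particle-filter belief update in the spirit of POMCP.

def update_belief(particles, action, observation, step, n=100):
    """`step(state, action) -> (next_state, obs)` is a generative model."""
    new = []
    while len(new) < n:
        s = random.choice(particles)      # sample a hidden state
        s2, obs = step(s, action)
        if obs == observation:            # rejection sampling
            new.append(s2)
    return new

# Toy problem: the observation simply reveals the hidden state, so all
# inconsistent particles are filtered out.
def reveal(state, action):
    return state, state

belief = ["left"] * 50 + ["right"] * 50
belief = update_belief(belief, "listen", "left", reveal, n=20)
print(set(belief))   # {'left'}
```

The same simulator that drives the Monte-Carlo search also drives the belief update, so no explicit probability model of the hidden state is required.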

Monte-Carlo search in Civilization II beats the built-in AI.

JAIR IJCAI-11 ACL-11

Demo Source

Real-time strategy games are often plagued by pathfinding problems when large numbers of units move around the map. Cooperative pathfinding allows multiple units to coordinate their routes effectively in both space and time.

Demo AIIDE-05 AIW-06
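
The key data structure is a space-time reservation table: each unit plans in (x, y, t) and reserves the cells it will occupy, so later units route around it in both space and time. A simplified sketch (plain BFS rather than the paper's windowed A*; only vertex conflicts are handled, and goal cells are reserved only at the step of arrival):

```python
from collections import deque

# Sketch of cooperative pathfinding with a space-time reservation table.

def plan(start, goal, size, reserved, max_t=50):
    """BFS over (cell, time); `reserved` is a set of (x, y, t) triples."""
    frontier = deque([(start, 0, [start])])
    seen = {(start, 0)}
    while frontier:
        (x, y), t, path = frontier.popleft()
        if (x, y) == goal:
            return path
        for dx, dy in [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]:  # wait or move
            nx, ny, nt = x + dx, y + dy, t + 1
            if not (0 <= nx < size and 0 <= ny < size) or nt > max_t:
                continue
            if ((nx, ny), nt) in seen or (nx, ny, nt) in reserved:
                continue
            seen.add(((nx, ny), nt))
            frontier.append(((nx, ny), nt, path + [(nx, ny)]))
    return None

# Two units crossing head-on: the second routes around the first,
# because the first unit's cells are already reserved.
reserved = set()
for unit_start, unit_goal in [((0, 0), (2, 0)), ((2, 0), (0, 0))]:
    path = plan(unit_start, unit_goal, size=3, reserved=reserved)
    for t, cell in enumerate(path):
        reserved.add((*cell, t))
    print(path)
```

Treating time as just another search dimension is what lets units wait, sidestep, or follow each other instead of deadlocking.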

Deep reinforcement learning approaches superhuman performance in poker, without domain knowledge.

arXiv-16

SmooCT wins three silver medals at the Computer Poker Competition.

IJCAI-15 ICML-15

Deep reinforcement learning solves a variety of continuous manipulation and locomotion problems, using a single neural network architecture.

arXiv-17 ICLR-16 NIPS-16 NIPS-15

Demo

A single neural network architecture learns to play many different Atari games at human level, mapping raw video input directly to joystick output.

Nature-15 arXiv-17 ICLR-17 NIPS-16 AAAI-16 ICLR-16 ICML-DLW-15 NIPS-DLW-13

Demo Source
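
The learning rule underneath is Q-learning; a tabular sketch shows the update, while the Nature-15 agent replaces the table with a deep network over raw frames and stabilises training with experience replay and a target network:

```python
from collections import defaultdict

# Sketch of the Q-learning update underlying DQN, shown tabular for clarity.

def q_update(Q, s, a, r, s2, actions, alpha=0.5, gamma=0.99, done=False):
    """Move Q(s, a) toward the bootstrapped target r + gamma * max Q(s', .)."""
    target = r if done else r + gamma * max(Q[(s2, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

Q = defaultdict(float)
q_update(Q, "s0", "fire", 1.0, None, ["fire", "noop"], done=True)
print(Q[("s0", "fire")])   # 0.5
```

Because the same update and architecture apply to every game, nothing game-specific has to be engineered.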

AlphaGo Zero becomes the world’s strongest Go player, starting completely from scratch, without any human knowledge.

Nature-17 (info)

AlphaGo defeats a human professional player for the first time, by combining deep neural networks and tree search.

Nature-16 (info) ICLR-15 (older)
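
The combination of neural networks and tree search hinges on a selection rule, PUCT, that mixes the policy network's prior with search statistics. A simplified sketch (the constant and numbers are illustrative, not the paper's settings):

```python
import math

# Sketch of PUCT selection: balance the mean action value Q against an
# exploration bonus proportional to the policy network's prior P(s, a).

def puct_select(stats, c_puct=1.0):
    """stats: move -> {"P": prior, "N": visit count, "W": total value}."""
    total_n = sum(s["N"] for s in stats.values())
    def score(move):
        s = stats[move]
        q = s["W"] / s["N"] if s["N"] else 0.0               # mean value
        u = c_puct * s["P"] * math.sqrt(total_n) / (1 + s["N"])
        return q + u
    return max(stats, key=score)

stats = {"a": {"P": 0.6, "N": 10, "W": 5.0},
         "b": {"P": 0.3, "N": 0, "W": 0.0}}
print(puct_select(stats))   # b: a rarely visited move with a decent prior
```

The prior focuses the search on moves a strong player would consider, while the value network (and rollouts) score the resulting positions.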

First results on the new StarCraft II environment for reinforcement learning.

arXiv-17