
Deep Reinforcement Learning Nanodegree

Value Based Methods | Project: Navigation

Author : Alberto García García

Introduction

This project consists of training an agent with a reinforcement learning algorithm. The goal of the agent is to collect as many yellow bananas as possible while avoiding blue bananas in a large, square world.

image2

The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. Given this information, the agent has to learn how to best select actions. Four discrete actions are available, corresponding to:

  • 0 - move forward.
  • 1 - move backward.
  • 2 - turn left.
  • 3 - turn right.

Yellow bananas provide a reward of +1, while blue ones provide a reward of -1. The problem is considered solved when the average score over the last 100 episodes is equal to or greater than 13 points.
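The notebook tracks exactly this criterion while training. Below is a minimal sketch of the check, assuming episode scores are appended to a rolling window of the last 100 episodes (all names are illustrative, not taken from the notebook):

    from collections import deque

    scores_window = deque(maxlen=100)          # scores of the last 100 episodes

    def is_solved(window, target=13.0):
        # Solved once the window is full and its average reaches the target score.
        return len(window) == 100 and sum(window) / len(window) >= target

In practice, the score of each finished episode would be appended to scores_window and the check run after every episode.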

Guidelines

The whole training of the agent is implemented in the Navigation.ipynb notebook. You can either view its last saved execution or run it yourself on a Jupyter server. To do so, follow the steps below to set up the requirements:

  1. Create (and activate) a new environment with Python 3.6.

    • Linux or Mac:
    conda create --name drlnd python=3.6
    source activate drlnd
    • Windows:
    conda create --name drlnd python=3.6 
    activate drlnd
  2. Follow the instructions in the OpenAI Gym repository to perform a minimal install of OpenAI Gym.

  3. Install the dependencies in the python/ folder.

    cd ./python/
    pip install .
  4. Create an IPython kernel for the drlnd environment.

    python -m ipykernel install --user --name drlnd --display-name "drlnd"
  5. Download the Unity Environment and unzip it inside the solution/ directory. (If you are using 64-bit Windows, you can skip this step, since the repository already contains the environment files. Otherwise, delete the environment files and place the ones matching your OS there.) A short loading sketch follows this list.

  6. Before running the notebook, change the kernel to match the drlnd environment by using the drop-down Kernel menu.

    image3
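Once the environment files from step 5 are in place, a quick way to check the setup is to load the environment and step through one episode with random actions. This is a minimal sketch assuming the unityagents package installed in step 3; the file_name path shown is the Windows 64-bit build, so adjust it to the executable that matches your OS:

    import random
    from unityagents import UnityEnvironment

    # Example path for the Windows 64-bit build; change it to match your OS.
    env = UnityEnvironment(file_name="solution/Banana_Windows_x86_64/Banana.exe")
    brain_name = env.brain_names[0]

    env_info = env.reset(train_mode=False)[brain_name]
    state = env_info.vector_observations[0]    # 37-dimensional observation
    score = 0
    while True:
        action = random.choice([0, 1, 2, 3])   # forward, backward, left, right
        env_info = env.step(action)[brain_name]
        score += env_info.rewards[0]           # +1 yellow banana, -1 blue banana
        state = env_info.vector_observations[0]
        if env_info.local_done[0]:             # episode finished
            break
    print("Episode score:", score)
    env.close()

A random agent will hover around a score of 0; the training in the notebook is what pushes the average above 13.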

Results

The agent is able to solve the problem in 395 episodes. The weights of the deep Q-network are stored in agent.pth. Here's the agent's score history:

image4
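To reuse the trained agent, the agent.pth checkpoint can be loaded back into the Q-network. The sketch below assumes the checkpoint stores a PyTorch state_dict; the architecture shown is only illustrative and should be replaced by the QNetwork class actually defined in Navigation.ipynb so that parameter names and shapes match:

    import torch
    import torch.nn as nn

    class QNetwork(nn.Module):
        # Illustrative architecture: 37-dimensional state in, 4 action-values out.
        def __init__(self, state_size=37, action_size=4, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_size, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, action_size),
            )

        def forward(self, state):
            return self.net(state)

    qnetwork = QNetwork()
    qnetwork.load_state_dict(torch.load("agent.pth", map_location="cpu"))
    qnetwork.eval()    # act greedily: pick the argmax of the predicted Q-values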
