CSE4/510: Introduction to Reinforcement Learning

Fall 2019

Project 1 – Building Reinforcement Learning Environment

Due Date: Sunday, September 29, 11:59pm

1 Project Overview

The goal of the project is to explore and get an experience of building reinforcement learning environments, following the OpenAI Gym standards. The project consists of building deterministic and stochastic environments that are based on Markov decision process, and applying tabular method to solve them.

Part 1 [30 points] – Build a deterministic environment

Define a deterministic environment, where P(s′, r | s, a) ∈ {0, 1}. It has to have more than one state and more than one action.

Environment ideas:

  • Tic-Tac-Toe
  • Grid world
  • Student’s Life
  • Your own ideas are also welcome
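As a starting point, a deterministic grid-world environment could be sketched as follows, using the Gym-style reset/step interface. The class name, 4x4 layout, and reward values are illustrative assumptions, not part of the assignment:

```python
class DeterministicGridWorld:
    """A minimal size x size grid world with deterministic transitions.

    States are cells 0..size*size-1; the agent starts at 0 and the goal is
    the last cell. Every (state, action) pair leads to exactly one next
    state, so P(s', r | s, a) is either 0 or 1.
    """

    def __init__(self, size=4):
        self.size = size
        self.n_states = size * size
        self.n_actions = 4  # 0: up, 1: right, 2: down, 3: left
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        row, col = divmod(self.state, self.size)
        if action == 0:
            row = max(row - 1, 0)
        elif action == 1:
            col = min(col + 1, self.size - 1)
        elif action == 2:
            row = min(row + 1, self.size - 1)
        else:
            col = max(col - 1, 0)
        self.state = row * self.size + col
        done = self.state == self.n_states - 1
        reward = 1.0 if done else -0.1  # small step cost, goal bonus
        return self.state, reward, done, {}
```

Because the dynamics are deterministic, replaying the same action sequence from reset always produces the same trajectory, which is a useful sanity check for Part 1.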

Part 2 [30 points] – Build a stochastic environment

Define a stochastic environment, where ∑_{s′,r} P(s′, r | s, a) = 1. A modified version of the environment defined in Part 1 can be used.
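One way to turn the Part 1 environment into a stochastic one is to make the moves "slippery": the intended action is executed with some probability, and otherwise a random other action is taken. The slip probability of 0.2 and the reward values below are illustrative assumptions:

```python
import random


class StochasticGridWorld:
    """Slippery variant of a grid world: the intended action is taken with
    probability 1 - slip_prob; otherwise a random other action is executed.
    For every (s, a), the outgoing transition probabilities sum to 1.
    """

    def __init__(self, size=4, slip_prob=0.2, seed=None):
        self.size = size
        self.n_states = size * size
        self.n_actions = 4  # 0: up, 1: right, 2: down, 3: left
        self.slip_prob = slip_prob
        self.rng = random.Random(seed)
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def _move(self, state, action):
        row, col = divmod(state, self.size)
        if action == 0:
            row = max(row - 1, 0)
        elif action == 1:
            col = min(col + 1, self.size - 1)
        elif action == 2:
            row = min(row + 1, self.size - 1)
        else:
            col = max(col - 1, 0)
        return row * self.size + col

    def step(self, action):
        if self.rng.random() < self.slip_prob:
            # Slip: execute one of the other actions uniformly at random.
            action = self.rng.choice(
                [a for a in range(self.n_actions) if a != action])
        self.state = self._move(self.state, action)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else -0.1
        return self.state, reward, done, {}
```

Here each intended action leads to the intended next state with probability 0.8 and to each of the three other moves with probability 0.2/3, so the probabilities for every (s, a) sum to 1 as required.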

Part 3 [40 points] – Implement tabular method

Apply a tabular method to solve the environments that were built in Part 1 and Part 2.

Tabular methods options:

  • Dynamic programming
  • Q-learning
  • SARSA
  •  TD(0)
  • Monte Carlo
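As one example from the list above, tabular Q-learning could be sketched roughly as follows. It works against any environment exposing the reset/step interface used in Parts 1 and 2; the hyperparameter values are illustrative defaults, not prescribed by the assignment:

```python
import random
from collections import defaultdict


def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy.

    Update rule: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    Returns the learned Q-table as a dict mapping state -> list of values.
    """
    rng = random.Random(seed)
    Q = defaultdict(lambda: [0.0] * env.n_actions)
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(env.n_actions)
            else:
                a = max(range(env.n_actions), key=lambda i: Q[s][i])
            s2, r, done, _ = env.step(a)
            # Bootstrap from the greedy value of the next state.
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q
```

Running `q_learning` on both the deterministic and the stochastic environment and comparing the resulting greedy policies (and learning curves) is one way to produce the plots and interpretation asked for in the report.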

2 Deliverables

There are two parts in your submission (unless they are combined into a Jupyter notebook):

2.1 Report

The report can be submitted either as a pdf or directly in a Jupyter notebook, but it has to follow the report structure of the NIPS template.

In your report, describe the deterministic and stochastic environments that were defined. Show your results after applying an algorithm to solve the deterministic and stochastic types of problems; this might include plots and your interpretation of the results.

Show your understanding of:

  • Differences between the deterministic/stochastic environments
  • Example and role of transition-probability matrix
  • Main components of the RL environment
  • Explain tabular method that was used to solve the problem
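For the transition-probability matrix item above, one possible representation is the nested `P[s][a] = [(prob, next_state, reward, done)]` format used by Gym's toy-text environments (e.g. FrozenLake's `env.P`). The two-state MDP below is a purely illustrative example:

```python
# Illustrative two-state MDP: state 0 is the start, state 1 is terminal.
# P[s][a] lists (probability, next_state, reward, done) tuples.
P = {
    0: {
        0: [(1.0, 0, 0.0, False)],   # deterministic: stay in s=0
        1: [(0.8, 1, 1.0, True),     # stochastic: reach the goal...
            (0.2, 0, 0.0, False)],   # ...or slip back to s=0
    },
    1: {
        0: [(1.0, 1, 0.0, True)],
        1: [(1.0, 1, 0.0, True)],
    },
}

# Role of the matrix: it fully specifies the MDP dynamics, so for every
# (s, a) the outgoing probabilities must sum to 1.
for s in P:
    for a in P[s]:
        assert abs(sum(p for p, _, _, _ in P[s][a]) - 1.0) < 1e-9
```

A deterministic environment is the special case where every `P[s][a]` list contains a single tuple with probability 1.0.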

2.2 Code

The code of your implementations. Python is the only accepted language for this project. You can submit multiple files, but they all need to have clear naming. All Python code files should be packed in a ZIP file named YOUR_UBID_project1.zip. After extracting the ZIP file and executing the command python main.py in the first-level directory, it should generate all the results and plots you used in your report and print them out in a clear manner.

3 References

  • NIPS Styles (docx, tex)
  • Google Colab tutorial
  • GYM environments
  • Lecture slides
  • Richard S. Sutton and Andrew G. Barto, “Reinforcement Learning: An Introduction”, Second Edition, MIT Press, 2018

4 Submission

To submit your work, add your pdf and ipynb/Python script to a zip file YOUR_UBID_project1.zip and upload it to UBlearns (Assignments section). There is also an option to provide a link to your Google Colab notebook with the project files; in this case, submit a .txt file with the final link to the project files. After finishing the project, you may be asked to demonstrate it to the instructor if the results and reasoning in your report are not clear enough.

5 Important Information

This project is done individually. The standing policy of the Department is that all students involved in an academic integrity violation (e.g. plagiarism in any way, shape, or form) will receive an F grade for the course.

6 Important Dates

September 29, Sun, 11:59pm – Project 1 is Due
