CSE4/510: Introduction to Reinforcement Learning

Fall 2019

Project 1 – Building Reinforcement Learning Environment

Due Date: Sunday, September 29, 11:59pm

1 Project Overview

The goal of the project is to explore and get an experience of building reinforcement learning environments, following the OpenAI Gym standards. The project consists of building deterministic and stochastic environments that are based on Markov decision process, and applying tabular method to solve them.

Part 1 [30 points] – Build a deterministic environment

Define a deterministic environment, where P(s′, r | s, a) ∈ {0, 1}. It has to have more than one state and more than one action.

Environment ideas:

  • Tic-Tac-Toe
  • Grid world
  • Student’s Life
  • Your own ideas are also welcome
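As a starting point, a deterministic grid-world environment could be sketched as follows, using the Gym-style reset/step interface. The class name, 4x4 layout, and reward values are illustrative assumptions, not part of the assignment:

```python
class DeterministicGridWorld:
    """A minimal size x size grid world with deterministic transitions.

    States are cells 0..size*size-1; the agent starts at 0 and the goal is
    the last cell. Every (state, action) pair leads to exactly one next
    state, so P(s', r | s, a) is either 0 or 1.
    """

    def __init__(self, size=4):
        self.size = size
        self.n_states = size * size
        self.n_actions = 4  # 0: up, 1: right, 2: down, 3: left
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        row, col = divmod(self.state, self.size)
        if action == 0:
            row = max(row - 1, 0)
        elif action == 1:
            col = min(col + 1, self.size - 1)
        elif action == 2:
            row = min(row + 1, self.size - 1)
        else:
            col = max(col - 1, 0)
        self.state = row * self.size + col
        done = self.state == self.n_states - 1
        reward = 1.0 if done else -0.1  # small step cost, goal bonus
        return self.state, reward, done, {}
```

Because the dynamics are deterministic, replaying the same action sequence from reset always produces the same trajectory, which is a useful sanity check for Part 1.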

Part 2 [30 points] – Build a stochastic environment

Define a stochastic environment, where ∑_{s′,r} P(s′, r | s, a) = 1. A modified version of the environment defined in Part 1 can be used.
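One way to turn the Part 1 environment into a stochastic one is to make the moves "slippery": the intended action is executed with some probability, and otherwise a random other action is taken. The slip probability of 0.2 and the reward values below are illustrative assumptions:

```python
import random


class StochasticGridWorld:
    """Slippery variant of a grid world: the intended action is taken with
    probability 1 - slip_prob; otherwise a random other action is executed.
    For every (s, a), the outgoing transition probabilities sum to 1.
    """

    def __init__(self, size=4, slip_prob=0.2, seed=None):
        self.size = size
        self.n_states = size * size
        self.n_actions = 4  # 0: up, 1: right, 2: down, 3: left
        self.slip_prob = slip_prob
        self.rng = random.Random(seed)
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def _move(self, state, action):
        row, col = divmod(state, self.size)
        if action == 0:
            row = max(row - 1, 0)
        elif action == 1:
            col = min(col + 1, self.size - 1)
        elif action == 2:
            row = min(row + 1, self.size - 1)
        else:
            col = max(col - 1, 0)
        return row * self.size + col

    def step(self, action):
        if self.rng.random() < self.slip_prob:
            # Slip: execute one of the other actions uniformly at random.
            action = self.rng.choice(
                [a for a in range(self.n_actions) if a != action])
        self.state = self._move(self.state, action)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else -0.1
        return self.state, reward, done, {}
```

Here each intended action leads to the intended next state with probability 0.8 and to each of the three other moves with probability 0.2/3, so the probabilities for every (s, a) sum to 1 as required.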

Part 3 [40 points] – Implement tabular method

Apply a tabular method to solve the environments that were built in Part 1 and Part 2.

Tabular methods options:

  • Dynamic programming
  • Q-learning
  • SARSA
  •  TD(0)
  • Monte Carlo
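As one example from the list above, tabular Q-learning could be sketched roughly as follows. It works against any environment exposing the reset/step interface used in Parts 1 and 2; the hyperparameter values are illustrative defaults, not prescribed by the assignment:

```python
import random
from collections import defaultdict


def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy.

    Update rule: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    Returns the learned Q-table as a dict mapping state -> list of values.
    """
    rng = random.Random(seed)
    Q = defaultdict(lambda: [0.0] * env.n_actions)
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(env.n_actions)
            else:
                a = max(range(env.n_actions), key=lambda i: Q[s][i])
            s2, r, done, _ = env.step(a)
            # Bootstrap from the greedy value of the next state.
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q
```

Running `q_learning` on both the deterministic and the stochastic environment and comparing the resulting greedy policies (and learning curves) is one way to produce the plots and interpretation asked for in the report.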

2 Deliverables

There are two parts in your submission (unless they are combined into a Jupyter notebook):

2.1 Report

The report can be submitted either as a pdf or directly in a Jupyter notebook, but it has to follow the report structure of the NIPS template.

In your report, describe the deterministic and stochastic environments that were defined. Show your results after applying an algorithm to solve the deterministic and stochastic types of problems; this might include plots and your interpretation of the results.

Show your understanding of:

  • Differences between the deterministic/stochastic environments
  • Example and role of transition-probability matrix
  • Main components of the RL environment
  • Explain tabular method that was used to solve the problem
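For the transition-probability matrix item above, one possible representation is the nested `P[s][a] = [(prob, next_state, reward, done)]` format used by Gym's toy-text environments (e.g. FrozenLake's `env.P`). The two-state MDP below is a purely illustrative example:

```python
# Illustrative two-state MDP: state 0 is the start, state 1 is terminal.
# P[s][a] lists (probability, next_state, reward, done) tuples.
P = {
    0: {
        0: [(1.0, 0, 0.0, False)],   # deterministic: stay in s=0
        1: [(0.8, 1, 1.0, True),     # stochastic: reach the goal...
            (0.2, 0, 0.0, False)],   # ...or slip back to s=0
    },
    1: {
        0: [(1.0, 1, 0.0, True)],
        1: [(1.0, 1, 0.0, True)],
    },
}

# Role of the matrix: it fully specifies the MDP dynamics, so for every
# (s, a) the outgoing probabilities must sum to 1.
for s in P:
    for a in P[s]:
        assert abs(sum(p for p, _, _, _ in P[s][a]) - 1.0) < 1e-9
```

A deterministic environment is the special case where every `P[s][a]` list contains a single tuple with probability 1.0.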

2.2 Code

The code of your implementations. Python is the only accepted language for this project. You can submit multiple files, but they all need to have clear naming. All Python code files should be packed in a ZIP file named YOUR_UBID_project1.zip. After extracting the ZIP file and executing the command python main.py in the first-level directory, it should generate all the results and plots you used in your report and print them out in a clear manner.

3 References

  • NIPS Styles (docx, tex)
  • Google Colab tutorial
  • GYM environments
  • Lecture slides
  • Richard S. Sutton and Andrew G. Barto, “Reinforcement Learning: An Introduction”, Second Edition, MIT Press, 2018

4 Submission

To submit your work, add your pdf and ipynb/Python script to a zip file YOUR_UBID_project1.zip and upload it to UBlearns (Assignments section). There is also an option to provide a link to your Google Colab notebook with the project files; in this case, submit a .txt file with the final link to the project files. After finishing the project, you may be asked to demonstrate it to the instructor if the results and reasoning in your report are not clear enough.

5 Important Information

This project is done individually. The standing policy of the Department is that all students involved in an academic integrity violation (e.g. plagiarism in any way, shape, or form) will receive an F grade for the course.

6 Important Dates

September 29, Sun, 11:59pm – Project 1 is Due
