CSE4/510: Introduction to Reinforcement Learning
Fall 2019
Project 1 – Building Reinforcement Learning Environment
Due Date: Sunday, September 29, 11:59pm
1 Project Overview
The goal of the project is to explore and gain experience building reinforcement learning environments, following the OpenAI Gym standards. The project consists of building deterministic and stochastic environments based on a Markov decision process, and applying a tabular method to solve them.
Part 1 [30 points] – Build a deterministic environment
Define a deterministic environment, where P(s', r | s, a) ∈ {0, 1}. It must have more than one state and more than one action.
Environment ideas:
- Tic-Tac-Toe
- Grid world
- Student’s Life
- Your own ideas are also welcome
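As a concrete illustration, the grid-world idea above can be sketched as a minimal environment exposing the Gym-style reset()/step() interface. The 4x4 layout, the reward values, and the class name are illustrative assumptions, not requirements; the gym.Env base class is omitted so the sketch stays dependency-free:

```python
class DeterministicGridWorld:
    """A 4x4 grid world with the Gym-style step/reset interface.

    Deterministic: P(s', r | s, a) is always 0 or 1 -- each (state, action)
    pair maps to exactly one next state and one reward.
    """

    def __init__(self, size=4):
        self.size = size
        self.n_states = size * size
        self.n_actions = 4              # 0=up, 1=right, 2=down, 3=left
        self.goal = self.n_states - 1   # bottom-right corner is terminal
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        row, col = divmod(self.state, self.size)
        if action == 0:                 # up
            row = max(row - 1, 0)
        elif action == 1:               # right
            col = min(col + 1, self.size - 1)
        elif action == 2:               # down
            row = min(row + 1, self.size - 1)
        else:                           # left
            col = max(col - 1, 0)
        self.state = row * self.size + col
        done = self.state == self.goal
        reward = 1.0 if done else -0.1  # assumed reward scheme: step cost, goal bonus
        return self.state, reward, done, {}
```

Bumping into a wall simply leaves the agent in place, which keeps every (s, a) pair defined without a special case.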
Part 2 [30 points] – Build a stochastic environment
Define a stochastic environment, where ∑_{s', r} P(s', r | s, a) = 1. A modified version of the environment defined in Part 1 can be used.
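One common way to make a grid world stochastic is a "slippery" floor: the chosen action executes with some probability, and otherwise a different action fires. The sketch below is a self-contained example of this idea; the 0.8 success probability, layout, and rewards are assumptions:

```python
import random


class SlipperyGridWorld:
    """Stochastic 4x4 grid world with the Gym-style step/reset interface.

    The chosen action executes with probability 0.8; otherwise one of the
    other three actions fires. For every (s, a), the probabilities over
    (s', r) outcomes sum to 1.
    """

    SLIP = 0.2  # probability the intended action is replaced by another

    def __init__(self, size=4, seed=0):
        self.size = size
        self.goal = size * size - 1
        self.rng = random.Random(seed)
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        if self.rng.random() < self.SLIP:
            action = self.rng.choice([a for a in range(4) if a != action])
        row, col = divmod(self.state, self.size)
        if action == 0:                 # up
            row = max(row - 1, 0)
        elif action == 1:               # right
            col = min(col + 1, self.size - 1)
        elif action == 2:               # down
            row = min(row + 1, self.size - 1)
        else:                           # left
            col = max(col - 1, 0)
        self.state = row * self.size + col
        done = self.state == self.goal
        return self.state, (1.0 if done else -0.1), done, {}
```

Seeding the internal random generator makes experiment runs reproducible, which helps when comparing results in the report.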
Part 3 [40 points] – Implement tabular method
Apply a tabular method to solve the environments that were built in Part 1 and Part 2.
Tabular method options:
- Dynamic programming
- Q-learning
- SARSA
- TD(0)
- Monte Carlo
2 Deliverables
There are two parts to your submission (unless they are combined into a Jupyter notebook):
2.1 Report
The report can be done either as a PDF or directly in a Jupyter notebook, but it has to follow the report structure of the NIPS template.
In your report, describe the deterministic and stochastic environments that were defined. Show your results after applying an algorithm to solve both types of problems; this might include plots and your interpretation of the results.
Show your understanding of:
- Differences between deterministic and stochastic environments
- Example and role of transition-probability matrix
- Main components of the RL environment
- Explanation of the tabular method used to solve the problem
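For the transition-probability point above, a small worked example can help. The sketch below encodes P(s' | s, a) for a hypothetical 2-state, 2-action MDP as a dictionary and verifies that each row is a valid probability distribution (all numbers are made up for illustration):

```python
# P[(s, a)] maps each next state s' to P(s' | s, a)
# for a hypothetical 2-state, 2-action MDP.
P = {
    (0, 0): {0: 0.9, 1: 0.1},
    (0, 1): {0: 0.2, 1: 0.8},
    (1, 0): {0: 0.3, 1: 0.7},
    (1, 1): {1: 1.0},
}

# Role of the matrix: for every (s, a), the probabilities over next
# states must sum to 1 -- this is the Part 2 condition. A deterministic
# environment is the special case where every entry is 0 or 1.
for (s, a), dist in P.items():
    assert abs(sum(dist.values()) - 1.0) < 1e-9
```

The same structure generalizes to P(s', r | s, a) by keying each distribution on (next state, reward) pairs instead of next states alone.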
2.2 Code
The code of your implementations. Python is the only accepted language for this project. You can submit multiple files, but they all need clear naming. All Python code files should be packed in a ZIP file named YOUR_UBID_project1.zip. After extracting the ZIP file and executing the command python main.py in the first-level directory, it should generate all the results and plots you used in your report and print them out in a clear manner.
3 References
- NIPS Styles (docx, tex)
- Google Colab tutorial
- GYM environments
- Lecture slides
- Richard S. Sutton and Andrew G. Barto, "Reinforcement Learning: An Introduction", Second Edition, MIT Press, 2018
4 Submission
To submit your work, add your PDF and ipynb/Python script to a ZIP file named YOUR_UBID_project1.zip and upload it to UBlearns (Assignments section). Alternatively, you may provide a link to your Google Colab notebook with the project files; in that case, submit a .txt file containing the final link to the project files. After finishing the project, you may be asked to demonstrate it to the instructor if the results and reasoning in your report are not clear enough.
5 Important Information
This project is done individually. The standing policy of the Department is that all students involved in an academic integrity violation (e.g., plagiarism in any way, shape, or form) will receive an F grade for the course.
6 Important Dates
September 29, Sun, 11:59pm – Project 1 is Due