Homework 2 - Model-free prediction and control

Due May 17, 2020 by 11:59pm
Points 25
Submitting a file upload
File Types pdf, ipynb, and txt

This assignment covers DS4-DS5. Among other things, you will solve the Taxi and MountainCar problems from HW0.

Additional packages needed:

As in previous tinkering notebooks and homework, you will need the OpenAI gym package. You will also need the GridWorld environment used in previous notebooks.

The purpose of the assignment:

1. Work with some fundamental concepts via pen-and-paper exercises.

2. Implement and test methods for learning a policy from experience.

The assignment:

Download the notebook here.

The instructions for the assignment are given in the notebook.

Hand-in:

Hand in the notebook with your solutions, so that the grader can easily run your code. All code and other solutions should also be handed in as PDF files. If you prefer, you can export your notebook to a PDF to hand in the code, and write your answers in a separate PDF. (It is much easier to give feedback on your solutions if the code is available in a PDF.)

Passing requirements:

Each task is worth a certain number of points, indicated in the notebook. Depending on the quality of the answer, each question is scored at one of three levels: 0%, 50%, or 100% of the question's points.

To pass this assignment, you need to score at least 60% of the total, that is, at least 15 out of 25 points.
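
The arithmetic behind the passing threshold can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not the grader's actual tooling: the function names `total_score` and `passes` are hypothetical, and the example grades are made up.

```python
from fractions import Fraction

# The three scoring levels described above: 0%, 50%, or 100% of a question's points.
LEVELS = (Fraction(0), Fraction(1, 2), Fraction(1))


def total_score(graded):
    """Sum the points for (max_points, level) pairs, where level is 0, 0.5, or 1."""
    total = Fraction(0)
    for max_points, level in graded:
        level = Fraction(level)  # 0, 0.5 and 1 convert exactly
        if level not in LEVELS:
            raise ValueError(f"level must be 0, 0.5 or 1, got {level}")
        total += max_points * level
    return total


def passes(score, max_total=25, pass_fraction=Fraction(3, 5)):
    """Return True if score reaches 60% of the total (15 of 25 points)."""
    return score >= max_total * pass_fraction


# Hypothetical grades, for illustration only: (max points, level awarded).
example = [(2, 1), (2, 1), (2, 0.5), (4, 0.5), (1, 1)]
print(total_score(example), passes(total_score(example)))
```

Exact fractions are used so that half-point scores and the 60% threshold compare without floating-point rounding.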

The grading will be done through peer review. Instructions for the peer review are given here.

Questions:

If you have questions, write in the discussion forum or send an e-mail to per.mattsson@it.uu.se

HW2
Criteria	Ratings	Pts
Ex 1.1	Full Marks: 2 pts; Half Marks: 1 pt; No Marks: 0 pts	2
Ex 1.2	Full Marks: 2 pts; Half Marks: 1 pt; No Marks: 0 pts	2
Ex 1.3	Full Marks: 2 pts; Half Marks: 1 pt; No Marks: 0 pts	2
Ex 1.4	Full Marks: 2 pts; Half Marks: 1 pt; No Marks: 0 pts	2
Ex 1.5	Full Marks: 2 pts; Half Marks: 1 pt; No Marks: 0 pts	2
Task 2: Reasoning choice of method	Full Marks: 2 pts; Half Marks: 1 pt; No Marks: 0 pts	2
Task 2: Code and plot	Full Marks: 2 pts; Half Marks: 1 pt; No Marks: 0 pts	2
Task 2: Positive average reward	Full Marks: 2 pts; No Marks: 0 pts	2
Task 2: Average reward at least 7	Full Marks: 1 pt; No Marks: 0 pts	1
Task 3.1	Full Marks: 4 pts; Half Marks: 2 pts; No Marks: 0 pts	4
Task 3.2: Code	Full Marks: 1 pt; Half Marks: 0.5 pts; No Marks: 0 pts	1
Task 3.2: Average reward above -200	Full Marks: 1 pt; No Marks: 0 pts	1
Task 3.2: Discussion of results	Full Marks: 2 pts; Half Marks: 1 pt; No Marks: 0 pts	2