Course description
The course examines how the main reinforcement learning (RL) algorithms work. These algorithms have made breakthrough results possible in many tasks, from game-playing artificial intelligence to robotics. All the necessary theoretical results are presented with proofs, using a unified approach and consistent notation and definitions.
The objectives of the course are to provide up-to-date information about reinforcement learning problems and the algorithms for solving them, and to explain the differences between algorithms of various types and the reasons they are formulated the way they are. In class, students will be able to discuss fundamental questions of reinforcement learning and work through problems with the instructor.
To master the course, students need to know the basics of probability theory, numerical optimization methods, and Python programming, and should be familiar with the Python packages used for mathematical modeling: SciPy, NumPy, Matplotlib, Scikit-learn, PyTorch, and OpenAI Gym.
Instructors
Nikita Evgenievich Yudin