Temporal Difference Learning - the holy amalgamation of Monte Carlo and dynamic programming. Taking the best of both worlds, TD learning is a faster, model-free, and often more accurate way of solving reinforcement learning problems. TD is a concept that was first developed in reinforcement learning, and only later branched out to other...
[Read More]
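As a taste of the idea, here is a minimal sketch of the tabular TD(0) value update in Python. The environment interface (`env.reset()`, `env.step(action)` returning `(next_state, reward, done)`) and the `policy` callable are assumptions made for this illustration, not code from the article.

```python
from collections import defaultdict

def td0_value_estimate(env, policy, episodes=1000, alpha=0.1, gamma=0.99):
    """Tabular TD(0): nudge V(s) toward the bootstrapped target r + gamma * V(s')."""
    V = defaultdict(float)
    for _ in range(episodes):
        state = env.reset()          # assumed interface, see note above
        done = False
        while not done:
            action = policy(state)
            next_state, reward, done = env.step(action)
            # bootstrapped target: no environment model needed, only the sampled transition
            target = reward + gamma * V[next_state] * (not done)
            V[state] += alpha * (target - V[state])
            state = next_state
    return dict(V)
```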
Hey there! I am Shirshajit, and here I try to write articles on Artificial Intelligence. My current focus is Reinforcement Learning, Flutter, and Data Analytics.
My other areas of interest include Machine Learning, Computer Vision, Human-Computer Interaction, and Image Adversarial Attacks. I am also a decent amateur pianist who'll someday finish learning a fast classical piece (everything is currently on hiatus).
Roulette - To Play or Not to Play
RL Basics
Previously, we looked into the Monte Carlo family of reinforcement learning algorithms. These methods estimate the solution to an RL problem by learning from experience. Through simulation and blind exploration alone, a Monte Carlo agent reaches the optimal solution.
[Read More]
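To make "learning from experience" concrete, here is a rough first-visit Monte Carlo value-estimation sketch in Python. The episode format (each episode as a list of `(state, reward)` pairs) is an assumption made for this example.

```python
from collections import defaultdict

def first_visit_mc(episodes, gamma=0.99):
    """Estimate V(s) by averaging the returns observed after each state's first visit."""
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:                      # episode: list of (state, reward) pairs
        # record the index of each state's first occurrence
        first_visit = {}
        for t, (state, _) in enumerate(episode):
            first_visit.setdefault(state, t)
        # walk backwards, accumulating the discounted return G at every time step
        returns_at = [0.0] * len(episode)
        G = 0.0
        for t in reversed(range(len(episode))):
            G = episode[t][1] + gamma * G
            returns_at[t] = G
        # average the return following the first visit to each state
        for state, t in first_visit.items():
            returns_sum[state] += returns_at[t]
            returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}
```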
Monte Carlo - Learning from Experience
RL Basics
Last time we looked at dynamic programming, a family of methods that compute near-exact solutions to problems using a full-scale model of the environment dynamics. This time we are doing a complete 180. Monte Carlo (MC) methods are everything DP is not.
[Read More]
Dynamic Programming RL
RL Basics
Dynamic Programming (DP) in reinforcement learning refers to a set of algorithms that can be used to compute optimal policies when the agent knows everything about its surroundings, i.e. the agent has a perfect model of the environment. Although dynamic programming has a large number of drawbacks, it is the precursor...
[Read More]
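To illustrate what "a perfect model of the environment" buys you, here is a hypothetical value-iteration sketch. The transition model layout (`P[s][a]` as a list of `(probability, next_state, reward)` triples) and all names are assumptions for this example, not the post's own code.

```python
def value_iteration(P, n_states, n_actions, gamma=0.99, theta=1e-8):
    """Value iteration: needs the full transition model P, unlike Monte Carlo or TD."""
    V = [0.0] * n_states
    while True:
        delta = 0.0
        for s in range(n_states):
            # back up each action's expected value through the known dynamics
            q = [sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                 for a in range(n_actions)]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:                          # stop once values have converged
            break
    # extract the greedy policy with respect to the converged values
    policy = [max(range(n_actions),
                  key=lambda a: sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a]))
              for s in range(n_states)]
    return V, policy
```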
The OpenAI Gym
RL Basics
Is this working out?
[Read More]