Temporal Difference (TD) Learning - the holy amalgamation of Monte Carlo and dynamic programming. Taking the best of both worlds, TD learning is a faster, model-free, more accurate method of solving reinforcement learning problems. TD is a concept that was first developed in reinforcement learning, and only later branched to other...
[Read More]

Hey there! I am Shirshajit, and here I try to write articles on Artificial Intelligence. My current focus is Reinforcement Learning.

My other areas of interest include Machine Learning, Computer Vision and Image Adversarial Attacks. I am also a decent amateur pianist who'll someday finish learning a fast classical piece (everything is currently on hiatus).

## Roulette - To Play or Not to Play

### RL Basics

Previously, we looked into the Monte Carlo family of reinforcement learning algorithms. These methods estimate the solution to an RL problem by learning from experience. Just by simulation and blind exploration, a Monte Carlo agent reaches the optimal solution.
[Read More]

## Monte Carlo - Learning from Experience

### RL Basics

Last time we looked at dynamic programming: methods that compute near-exact solutions to problems using a full-scale model of the environment dynamics. This time we are doing a complete 180. Monte Carlo (MC) methods are everything DP is not.
[Read More]

## Dynamic Programming RL

### RL Basics

Dynamic Programming (DP) in reinforcement learning refers to a set of algorithms that can be used to compute optimal policies when the agent knows everything about its surroundings, i.e., the agent has a perfect model of the environment. Although dynamic programming has a large number of drawbacks, it is the precursor...
[Read More]

## The OpenAI Gym

### RL Basics

Is this working out?
[Read More]