Hands-on Tutorials

A Practical Guide to Deep Q-Networks

Reinforcement Learning is an exciting field of Machine Learning that’s attracting a lot of attention and popularity. Much of this popularity comes from breakthroughs in which algorithms such as AlphaGo and OpenAI Five achieved human-level performance in games such as Go and Dota 2. One of the core algorithms in Reinforcement Learning is Deep Q-Learning. Naturally, a lot of us want to learn more about the algorithms behind these impressive accomplishments. …


An Introduction to the Advantage Actor Critic Algorithm

In the field of Reinforcement Learning, the Advantage Actor Critic (A2C) algorithm combines two families of Reinforcement Learning algorithms: Policy-Based and Value-Based. Policy-Based agents directly learn a policy (a probability distribution over actions) mapping input states to output actions. Value-Based agents learn to select actions based on the predicted value of the input state or action. In our previous Deep Q-Learning Tutorial: minDQN, we learned to implement our own Deep Q-Network to solve the simple Cartpole environment. In this tutorial, we’ll be sharing a minimal Advantage Actor Critic (minA2C) implementation in order to help new…


Figure 1: The Policy Gradients Algorithm

1. What is Reinforcement Learning?

Reinforcement Learning is a field of Machine Learning that has produced many important AI breakthroughs, such as AlphaGo and OpenAI Five. The game of Go was widely considered quite difficult for computers to learn to play at the level of professional human players, and AlphaGo is significant for being the first machine to surpass the best human Go players. Importantly, both AlphaGo and OpenAI Five use Reinforcement Learning algorithms to learn to play their respective games. One of the main goals of Reinforcement Learning is to create software agents that learn to maximize their reward in certain…


In this post, we’ll explain what Momentum is and why it is a simple and effective improvement upon Stochastic Gradient Descent. We also show a minimal code example on the MNIST dataset where adding Momentum improves the model’s accuracy and training loss.
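The core of the Momentum update can be sketched in a few lines. This is a minimal illustration, not the post’s MNIST code; the learning rate, beta value, and toy objective f(w) = w² are arbitrary choices for demonstration:

```python
# Minimal sketch of the Momentum update rule: the velocity accumulates an
# exponentially weighted sum of past gradients, and the weight steps along
# this smoothed direction instead of the raw gradient.
def momentum_step(w, grad, velocity, lr=0.1, beta=0.9):
    velocity = beta * velocity + grad   # accumulate past gradients
    w = w - lr * velocity               # step along the smoothed direction
    return w, velocity

w, v = 1.0, 0.0
for _ in range(3):
    grad = 2 * w                        # gradient of the toy objective w**2
    w, v = momentum_step(w, grad, v)
print(w)
```

Because the velocity carries information from earlier steps, consistent gradient directions build up speed while oscillating directions partially cancel out.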

1. Exponential Smoothing (or Exponentially Weighted Averages)

When looking at noisy time-series data, such as the training/validation error graphs in TensorBoard, you may notice that the raw values jump around considerably. Quite often there is still a trend in the graph, and these trends become more obvious when you add some smoothing to the raw values.
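This style of smoothing can be sketched as follows. The smoothing weight of 0.9 and the sample values are arbitrary assumptions for illustration, not values from the post:

```python
# A minimal sketch of exponential smoothing as applied to a noisy series:
# each smoothed value blends the previous smoothed value with the new raw
# value, so recent points dominate while older points decay exponentially.
def exponential_smoothing(values, weight=0.9):
    smoothed = []
    last = values[0]                 # initialize with the first raw value
    for v in values:
        last = weight * last + (1 - weight) * v
        smoothed.append(last)
    return smoothed

noisy = [1.0, 3.0, 2.0, 5.0, 4.0]
print(exponential_smoothing(noisy))
```

A higher weight produces a smoother curve that reacts more slowly to new values, which is the same trade-off the smoothing slider in TensorBoard exposes.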

Exponential…


Logistic Regression is a standard technique for Classification problems. In Logistic Regression, a Sigmoid or Softmax function is applied to the output of a linear model to solve classification problems. In this post, I attempt to answer:

  • Which activation function should you choose: Sigmoid or Softmax?

1. Which activation function should you choose: Sigmoid or Softmax?

These activation functions map their inputs to values between 0 and 1 that can be interpreted as probabilities. This property is quite useful for classification problems, where the output represents the probability that the input belongs to a particular class.
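A small illustration of the difference (not from the original post; the logit values are arbitrary examples):

```python
import math

# Sigmoid squashes each value independently into (0, 1), while softmax
# normalizes a whole vector so the outputs sum to 1 and form a probability
# distribution over the classes.
def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print([sigmoid(x) for x in logits])  # element-wise; need not sum to 1
print(softmax(logits))               # sums to 1 across the 3 classes
```

This is the practical distinction behind the question: sigmoid treats each output independently (suited to binary or multi-label problems), while softmax couples the outputs into a single distribution (suited to multi-class problems).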

Sigmoid Function

The Sigmoid function is an S-shaped function that maps any real input to a value between 0 and 1. It is defined by the equation:

sigmoid(x) = 1 / (1 + e^(-x))

The objective of this page is to provide example reference code for defining an Image Classification Network using DepthWise Separable Convolutions. We’ll demonstrate all the working parts of an Image Classification Network, including loading the data, defining the network, optimizing weights on the GPU, and evaluating performance. This example code is written in PyTorch and runs on the Fashion MNIST dataset.

Why DepthWise Separable Convolutions?

Normal 2D convolutions map N input feature maps to M output feature maps using a linear combination of the N input feature maps. Normal 2D convolutions require a larger and larger number of parameters as the number of feature maps…
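The parameter savings can be illustrated with simple arithmetic. The channel and kernel sizes below are arbitrary examples, not values from the post, and bias terms are ignored for simplicity:

```python
# Hypothetical parameter-count comparison: a standard KxK convolution versus
# a depthwise-separable one (a depthwise KxK convolution followed by a
# pointwise 1x1 convolution).
n_in, n_out, k = 64, 128, 3

standard = n_in * n_out * k * k   # every output channel mixes all input channels
depthwise = n_in * k * k          # one KxK filter per input channel
pointwise = n_in * n_out          # 1x1 conv mixes channels together
separable = depthwise + pointwise

print(standard)   # 73728
print(separable)  # 8768
```

Splitting the spatial filtering (depthwise) from the channel mixing (pointwise) is what shrinks the parameter count, here by roughly a factor of 8.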


This post will demonstrate all the working parts of an Image Classification Network including loading the data, defining the network, optimizing weights on the GPU, and evaluating performance. Our objective is to provide example reference code for people who want to get a simple Image Classification Network working with PyTorch and Fashion MNIST.

Fashion MNIST Dataset

Fashion MNIST is a dataset of 70,000 grayscale 28×28 images spanning 10 classes of clothing items: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, and Ankle boot.

1. Check that GPU is available

import torch

# Check whether a CUDA-capable GPU is available and select a device accordingly
print(torch.cuda.is_available())
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

2. Download and load the Fashion MNIST dataset

import torch
from torchvision import datasets…

Mike Wang

Hi there, I write and teach about cool topics in Data Science
