Neural Networks: Supervised, Unsupervised & Reinforcement Learning

Types of Learning

There are three types of learning:

  • Supervised learning
  • Reinforcement learning
  • Unsupervised learning

Supervised Learning

In supervised learning we try to predict an output when given an input vector. Every training example pairs an input with a known target output, so both the input and the target are clearly defined.

Regression - the target output is a real number or a vector of real numbers
Classification - the target output is a class label

The model-class is a function that maps an input vector to a predicted output:

\begin{equation} y = f(x; W) \end{equation}

$y$ - output of the model-class
$x$ - input vector
$W$ - numerical parameters of the function $f$ (for example, the weights of a network)

Learning is the process of adjusting the parameters $W$ so as to reduce the discrepancy between the target output $t$ and the actual output $y$.

For regression, the discrepancy is commonly measured by the squared error:

\begin{equation} \dfrac{1}{2} (y - t) ^2 \end{equation}
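
To make the idea of adjusting $W$ more concrete, here is a minimal sketch in plain Python. It assumes a linear model-class $f(x; W) = w \cdot x + b$ and plain gradient descent; the names `predict`, `squared_error` and `train`, the learning rate, and the toy data are all illustrative choices, not part of the definitions above.

```python
def predict(x, w, b):
    """Model f(x; W) with W = (w, b): a weighted sum of the inputs plus a bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def squared_error(y, t):
    """Discrepancy 1/2 * (y - t)^2 between the actual output y and the target t."""
    return 0.5 * (y - t) ** 2


def train(data, n_inputs, learning_rate=0.05, epochs=500):
    """Adjust W by gradient descent on the squared error, one example at a time."""
    w = [0.0] * n_inputs
    b = 0.0
    for _ in range(epochs):
        for x, t in data:
            y = predict(x, w, b)
            error = y - t  # derivative of 1/2 * (y - t)^2 with respect to y
            # Move each parameter a small step against its gradient.
            w = [wi - learning_rate * error * xi for wi, xi in zip(w, x)]
            b -= learning_rate * error
    return w, b


# Toy data generated from t = 2 * x + 1 (made up for illustration).
data = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0), ([3.0], 7.0)]
w, b = train(data, n_inputs=1)
total_error = sum(squared_error(predict(x, w, b), t) for x, t in data)
print(w, b, total_error)
```

On this noise-free toy data the learned parameters should end up close to $w = 2$ and $b = 1$, with the total squared error close to zero. The factor $\dfrac{1}{2}$ in the error is a convenience: it cancels when differentiating, which is why the update uses `y - t` directly.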

Examples of supervised learning algorithms include:

  • linear regression for regression problems
  • random forest for classification and regression problems
  • support vector machines for classification problems

Unsupervised Learning

Here you have only input data (X) and no corresponding output variables. The goal is to model the underlying structure of the data in order to learn more about it. Once the data is normalized and metrics are compared across different groups, you may discover interesting facts about the data you are dealing with.

Unsupervised learning problems fall mainly into two groups:

Clustering - discovering how data can be grouped (e.g. grouping customers by their behavior)
Association - discovering rules that describe large portions of the data (e.g. customers who buy one item also tend to buy another)

Examples of unsupervised learning algorithms include:

  • k-means for clustering problems
  • Apriori algorithm for association rule learning problems
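
As an illustration of clustering, here is a minimal k-means sketch in plain Python. It alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. Taking the first `k` points as the initial centroids, the fixed number of iterations, and the sample points are assumptions made to keep the example short and deterministic; practical implementations usually initialize more carefully.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, iterations=20):
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean_point(members)
    return centroids, labels


# Made-up 2D points forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, labels = k_means(points, k=2)
print(centroids, labels)
```

Note that no target labels are passed in: the grouping comes only from the distances between the inputs, which is what distinguishes this from the supervised setting.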

Reinforcement Learning

The objective of reinforcement learning is to select each action so as to maximize the expected sum of future rewards. It is usually hard to tell which action was right or wrong, because the reward often arrives only later, and this time lag adds uncertainty to the learning and adjustment process.

Basically the learner decides what to do in order to maximize the reward. Unlike in supervised learning, the exact target output is not given; unlike in unsupervised learning, we do know that we are expecting an output, and the reward tells us how good it was.
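
One common way to handle the delay between an action and its reward is to credit each action with a discounted sum of the rewards that follow it. The sketch below assumes this discounting scheme; the discount factor `gamma`, the function name `discounted_returns`, and the example episode are illustrative assumptions and are not taken from the text above.

```python
def discounted_returns(rewards, gamma=0.9):
    """For each step t, compute r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ..."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical episode: three actions with no immediate reward, then a reward of 1.
rewards = [0.0, 0.0, 0.0, 1.0]
returns = discounted_returns(rewards, gamma=0.9)
print(returns)
```

With `gamma=0.9`, the first action in this episode receives a return of roughly 0.73 and the last one receives 1.0, so earlier actions still get some credit for a reward that arrives later, but less of it the longer the delay.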