Classification of Algorithms in Machine Learning

24-07-2022
chuong xuan

There are many types of learning algorithms, and they can be classified by several criteria, for example by how the model learns (learning style) or by function. The main criteria can be summarized as follows:

Does the training process need human supervision? This criterion gives Supervised, Unsupervised, Semi-Supervised, and Reinforcement Learning.
Does the system predict by comparing new data points with the stored training data points, or by finding patterns in the training data and building a predictive model from them? This criterion gives instance-based and model-based learning.
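To make the second criterion concrete, here is a minimal sketch that contrasts the two approaches on a toy one-dimensional data set. The data values, the labels, and the function names are invented for illustration; a real system would use a library implementation.

```python
# Toy 1-D training set of (feature, label) pairs. Values are invented for illustration.
train = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]


def predict_instance_based(x):
    """Instance-based: keep the training points and copy the label of the nearest one."""
    _, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_label


def fit_threshold(data):
    """Model-based: compress the training data into one parameter, a decision threshold."""
    small = [x for x, label in data if label == "small"]
    large = [x for x, label in data if label == "large"]
    # Use the midpoint between the two class means as the threshold.
    return (sum(small) / len(small) + sum(large) / len(large)) / 2


threshold = fit_threshold(train)


def predict_model_based(x):
    """Predict from the learned threshold only; the training points are no longer needed."""
    return "small" if x < threshold else "large"


print(predict_instance_based(3.0))
print(predict_model_based(3.0))
```

The instance-based predictor has to keep the whole training set around, while the model-based one keeps only the learned threshold.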

Supervised/Unsupervised Learning

These two learning styles are the ones most often used for labeling (classification) and clustering problems, respectively.

Supervised Learning

Supervised learning starts from input pairs (data, label) and learns to predict the output for new input. Formally, the input is the set X = {x1, x2, …, xn} and the corresponding label set is Y = {y1, y2, …, yn}, where X and Y are two vectors of the same length. Together, X and Y form the training set. From this training set we need to learn a function f that maps each element xi of X to a prediction y_pre = f(xi) that approximates its label yi.

Spam detection is a practical application of this model. The algorithm takes as input emails that are labeled as spam or not spam. Based on what it learns from them, the algorithm then determines whether a new email is spam or not. Common supervised learning algorithms include Linear Regression, Logistic Regression, and Neural Networks, among others.
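As a small illustration of learning f from a training set (X, Y), the sketch below fits a straight line by ordinary least squares, the simplest form of Linear Regression. The data values and the names `fit_line` and `f` are assumptions made for this example only.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Training set: inputs X with their labels Y (invented values that follow y = 2x + 1).
X = [1.0, 2.0, 3.0, 4.0]
Y = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(X, Y)


def f(x):
    """The learned function: maps a new input to a predicted label."""
    return slope * x + intercept


y_pre = f(5.0)  # prediction for an input that was not in the training set
print(slope, intercept, y_pre)
```

A spam filter follows the same pattern, except that the label is a category (spam or not spam) rather than a number, so a classifier such as Logistic Regression would take the place of the line.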

Unsupervised Learning

In unsupervised learning, we do not know the outcome (the labels); we only have the input data. The algorithm relies on the structure of the data to perform a task such as clustering. The system learns without anyone teaching it.

Two typical kinds of unsupervised learning problems are Clustering and Association.

Clustering: Splits the data into groups based on how closely related the data points are. For example, a social network can segment its customers in order to analyze user behavior and to target each group of similar users.
- Some popular clustering algorithms:
  - K-Means
  - k-Medians
  - Expectation Maximization
  - Hierarchical Cluster Analysis (HCA)
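As a rough sketch of how the first algorithm in the list works, the following one-dimensional K-Means alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The data (imagined monthly spending per customer), the starting centroids, and the function name `k_means` are assumptions for illustration.

```python
def k_means(points, centroids, iterations=10):
    """Minimal 1-D K-Means with fixed starting centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spending of six customers; no labels are given.
spending = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means(spending, centroids=[1.0, 12.0])
print(centroids)
print(clusters)
```

Practical implementations usually choose the starting centroids at random and stop once the assignments no longer change; the fixed start here only keeps the example reproducible.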

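Association: Discovers rules that describe relationships in the data, for example that customers who buy one product tend to buy another one as well.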
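Association rules are commonly scored with two measures, support and confidence. The sketch below computes both for a tiny set of shopping baskets; the baskets and the function names are invented for this example.

```python
# Hypothetical shopping baskets; each basket is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    matching = sum(1 for basket in transactions if itemset <= basket)
    return matching / len(transactions)


def confidence(antecedent, consequent):
    """Among baskets containing `antecedent`, the fraction that also contain `consequent`."""
    return support(antecedent | consequent) / support(antecedent)


# Score the rule "bread -> milk".
rule_support = support({"bread", "milk"})
rule_confidence = confidence({"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

Here the rule "bread -> milk" holds in two of the four baskets, and two of the three baskets that contain bread also contain milk.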

Semi-Supervised Learning

Problems where we have a large amount of data but only part of it is labeled are called Semi-Supervised Learning. This setting lies between Unsupervised and Supervised Learning. For example, only a portion of the images or texts is labeled (e.g. pictures of people or animals, or scientific or political texts), while most of the other images and texts, collected from the internet, are unlabeled. In practice, many Machine Learning problems fall into this group because collecting labeled data is time-consuming and expensive. Some types of data even require an expert to label them (medical images, for example). In contrast, unlabeled data can be obtained at low cost from the internet.
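One simple way to use unlabeled data is self-training: train on the few labeled points, label the unlabeled points the model is most sure about, and repeat. The sketch below does this with a nearest-centroid classifier on one-dimensional data. Self-training is only one of several semi-supervised techniques, and the data, labels, and function names are assumptions for illustration.

```python
def class_centroids(data):
    """Mean feature value per label, computed from (feature, label) pairs."""
    sums = {}
    for x, label in data:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def self_train(labeled, unlabeled, rounds=3):
    """Repeatedly pseudo-label the unlabeled points nearest to a class centroid."""
    data = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        centroids = class_centroids(data)
        # Treat distance to the nearest centroid as a (crude) confidence score.
        pool.sort(key=lambda x: min(abs(c - x) for c in centroids.values()))
        take = max(1, len(pool) // 2)
        confident, pool = pool[:take], pool[take:]
        data.extend((x, predict(centroids, x)) for x in confident)
    return class_centroids(data)


# Two expensive hand-labeled points and six cheap unlabeled ones (invented values).
labeled = [(1.0, "cat"), (9.0, "dog")]
unlabeled = [2.0, 3.0, 7.0, 8.0, 4.0, 6.0]

model = self_train(labeled, unlabeled)
print(predict(model, 3.5), predict(model, 7.5))
```

The pseudo-labels can be wrong, so real systems usually accept only predictions above a confidence threshold.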

Reinforcement Learning

In reinforcement learning, the input data is not fixed in advance; instead, the algorithm learns directly from experience. Each action the system takes earns a certain reward, so the more it learns, the better its output becomes. AlphaGo, for example, is a system that uses reinforcement learning and has beaten even the best Go player in the world.
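The reward-driven loop can be sketched with a multi-armed bandit, one of the simplest reinforcement learning settings. The agent below uses an epsilon-greedy strategy: it mostly picks the action with the best estimated reward and occasionally explores a random one. This is an illustrative toy and not how AlphaGo works; the reward probabilities and the name `run_bandit` are assumptions.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit whose arms pay 1.0 with the given probabilities."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)    # how often each action was tried
    values = [0.0] * len(reward_probs)  # running average reward per action
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: take the action with the best estimate so far.
            action = max(range(len(values)), key=lambda a: values[a])
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental update of the average reward for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


values, counts = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(values)), key=lambda a: values[a])
print(best_action, counts)
```

The agent is never told which action is best. It only sees the rewards, and its estimates improve as it gathers more experience.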

Summary

Determining which category your data and problem belong to is an important step toward finding a training algorithm that fits the data and the problem requirements. Hopefully this article has given you a better understanding of the types of algorithms in ML. In the following articles, I will explain and analyze each of the algorithms mentioned above.