The Basic Concepts of Machine Learning

“The development of full artificial intelligence could spell the end of the human race… It would take off on its own, and re-design itself at an ever-increasing rate. Humans, who are limited by slow biological evolution, couldn’t compete, and would be superseded.” – Stephen Hawking


Machine learning is a subfield of computer science and a branch of artificial intelligence whose objective is to develop techniques that allow computers to learn. More specifically, it is about creating programs capable of generalizing behavior from information provided in the form of examples; it is, therefore, a process of knowledge induction.

In many cases the field of machine learning overlaps with computational statistics, since both disciplines are based on data analysis. However, machine learning also focuses on the computational complexity of problems. Many of these problems are NP-hard, so much of the research done in machine learning is focused on designing feasible solutions to them. Machine learning can be seen as an attempt to automate parts of the scientific method by mathematical methods.

Machine learning has a wide range of applications, including search engines, medical diagnostics, fraud detection in the use of credit cards, stock market analysis, classification of DNA sequences, recognition of speech and written language, games and robotics.

Some machine learning systems try to eliminate any need for intuition or expert knowledge in the data analysis process, while others try to establish a collaborative framework between the expert and the computer. In any case, human intuition cannot be replaced entirely, since the system designer has to specify how the data are represented and which methods are used to manipulate and characterize them.


Machine learning results in a model that solves a given task. Among the most common families of models are:

The geometric models, built in the space of instances, which can have one, two or many dimensions. If there is a linear decision boundary between the classes, the data are said to be linearly separable. A linear decision boundary is defined by w · x = t, where w is a vector perpendicular to the decision boundary, x is an arbitrary point on the boundary, and t is the decision threshold.
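
As a minimal sketch, such a boundary can be evaluated directly. The weight vector and threshold below are invented for illustration; in practice they would be learned from data, for example by a perceptron or logistic regression:

```python
import numpy as np

# Hypothetical learned parameters of a linear decision boundary w . x = t
w = np.array([2.0, 1.0])   # vector perpendicular to the boundary
t = 4.0                    # decision threshold

def classify(x):
    """Assign the positive class if the point lies on or beyond the boundary."""
    return 1 if np.dot(w, x) >= t else -1

points = [np.array([3.0, 1.0]), np.array([0.5, 0.5])]
labels = [classify(p) for p in points]
print(labels)  # [1, -1]
```

Points with w · x above the threshold fall on one side of the boundary and are labeled positive; the rest are labeled negative.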

The probabilistic models, which attempt to determine the probability distribution linking the feature values to the values of the target variable. One of the key tools for developing probabilistic models is Bayesian statistics.
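
A toy sketch of the Bayesian idea, with invented word counts: the posterior probability of a class given a feature is proportional to the class prior times the likelihood of the feature under that class:

```python
from collections import Counter

# Invented toy data: (class, observed word) pairs
data = [("spam", "offer"), ("spam", "offer"), ("spam", "hello"),
        ("ham", "hello"), ("ham", "hello"), ("ham", "offer")]

class_counts = Counter(c for c, _ in data)   # P(class) numerators
pair_counts = Counter(data)                  # P(word | class) numerators

def posterior(word):
    """Unnormalized P(class | word) = P(word | class) * P(class) for each class."""
    scores = {}
    for c, n_c in class_counts.items():
        prior = n_c / len(data)
        likelihood = pair_counts[(c, word)] / n_c
        scores[c] = prior * likelihood
    return scores

print(posterior("offer"))  # "spam" should score higher than "ham"
```

The class with the larger posterior score is the model's prediction; normalizing the scores to sum to one would give proper probabilities.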

The logical models, which transform and express probabilities as rules, often organized in the form of decision trees.

Models can also be classified as grouping models and grading models. The former try to divide the space of instances into groups. The latter, as their name indicates, assign each instance a graded score by which instances can be distinguished. Geometric classifiers such as support vector machines are grading models.

Types of algorithms

[Figure: a support vector machine]

The different machine learning algorithms are grouped into a taxonomy according to their output. Some types of algorithms are:

Supervised learning

The algorithm produces a function that establishes a correspondence between the inputs and the desired outputs of the system. A classic example is the classification problem, where the learning system tries to label each of a series of vectors with one of several categories, drawing on a knowledge base of previously labeled examples. This type of learning can be very useful in biological research, computational biology and bioinformatics.
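
A minimal sketch of supervised classification, using a one-nearest-neighbour rule (one of many possible classifiers) over invented labeled examples:

```python
import math

# Invented training set: inputs with known labels
train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"),
         ((5.0, 5.0), "B"), ((6.0, 5.5), "B")]

def predict(x):
    """Label a new input with the label of its closest training example."""
    nearest = min(train, key=lambda ex: math.dist(x, ex[0]))
    return nearest[1]

print(predict((1.2, 1.1)))  # "A" (close to the first cluster)
print(predict((5.5, 5.2)))  # "B" (close to the second cluster)
```

The labeled examples play the role of the "knowledge base of previous labeling" described above: new inputs are categorized by analogy with them.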

Unsupervised learning

The whole modeling process is carried out on a set of examples consisting only of inputs to the system; there is no information about the categories of these examples. The system therefore has to be able to recognize patterns on its own in order to label new inputs.
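
A sketch of this idea using one-dimensional k-means clustering on invented, unlabeled data; the algorithm discovers the two groups without ever seeing a label:

```python
import random

def kmeans(points, k=2, iters=10):
    """Toy 1-D k-means: alternate assignment and centre-update steps."""
    centers = random.sample(points, k)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centre
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: abs(p - centers[j]))
            clusters[i].append(p)
        # Update step: move each centre to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return sorted(centers)

random.seed(0)
# Two obvious groups around 1 and around 8; the centres should land near them
print(kmeans([1.0, 1.2, 0.9, 8.0, 8.3, 7.9]))
```

Supervised learning would need each point tagged with its group; here the structure is recovered purely from the inputs.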

Semisupervised learning

This type of algorithm combines the two previous approaches, taking both labeled and unlabeled data into account in order to classify adequately.

Reinforcement learning

The algorithm learns by observing the world around it. Its input is the feedback it receives from the environment in response to its actions; the system therefore learns by trial and error.

Reinforcement learning is the most general of the three categories. Instead of an instructor telling the agent what to do, the intelligent agent must learn how the environment behaves through rewards (reinforcements) or punishments, resulting from success or failure respectively. The main objective is to learn a value function that helps the agent maximize the reward signal and thus optimize its policy, so that it understands the behavior of the environment and makes good decisions in pursuit of its objectives.

The main reinforcement learning algorithms are developed within the framework of solving finite Markov decision processes, which incorporate the Bellman equations and value functions. The three main families of methods are dynamic programming, Monte Carlo methods and temporal-difference learning.
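
As a sketch of temporal-difference learning, the tabular Q-learning loop below (one TD method) learns a policy for an invented five-state corridor whose right end pays a reward of 1; all states, actions and parameters are illustrative:

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal and rewarding
ACTIONS = [1, -1]     # move right or left
alpha, gamma, eps = 0.5, 0.9, 0.1

# Action-value table Q(s, a), initialized to zero
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(1)
for _ in range(200):
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Bellman-style TD update toward r + gamma * max_a' Q(s', a')
        target = r + gamma * max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

# Greedy policy per non-terminal state; it should prefer moving right (+1)
print([max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)])
```

The update rule moves each estimate toward a bootstrapped target built from the Bellman equation, which is exactly what distinguishes temporal-difference methods from Monte Carlo methods (which wait for the full return) and from dynamic programming (which requires a model of the environment).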


Transduction

Similar to supervised learning, but it does not explicitly construct a function. Instead, it tries to predict the categories of future examples directly, based on the training examples, their respective categories, and the new examples presented to the system.

Multi-task learning

Learning methods that reuse knowledge previously acquired by the system in order to tackle problems similar to those already seen.

The computational and performance analysis of machine learning algorithms is a branch of statistics known as computational learning theory.

Humans learn automatically; the process is so natural to us that we do not realize how it is done or everything it involves. From birth to death we carry out many processes, among them learning, through which we acquire knowledge and develop the ability to analyze and evaluate, both through methods and techniques and through our own experience. Machines, however, must be taught how to learn: if a machine cannot develop its skills, there is no learning process, only a repetitive sequence. We must also bear in mind that possessing knowledge, or carrying out the learning process well, does not imply knowing how to use it; good learning also means knowing how and when to apply what we know in everyday activities.

To achieve good learning it is necessary to consider all the factors that surround it, such as society, the economy, the environment, and so on. It is therefore necessary to take various measures to achieve adequate learning, and with it an adequate automation of learning. The first thing to take into account is the concept of knowledge itself: the understanding of a given topic or subject on which one can offer an opinion or point of view, and answer the questions that may arise about it.