Machine Learning

Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.

Machine learning is closely related to computational statistics, which focuses on making predictions using computers. The study of mathematical optimization delivers methods, theory and application domains to the field of machine learning. Data mining is a related field of study, focusing on exploratory data analysis through unsupervised learning. In its application across business problems, machine learning is also referred to as predictive analytics.

Machine learning methods

Supervised learning algorithms are trained using labeled examples, such as an input where the desired output is known. The learning algorithm receives a set of inputs along with the corresponding correct outputs, and the algorithm learns by comparing its actual output with correct outputs to find errors. Through methods like classification, regression, prediction and gradient boosting, supervised learning uses patterns to predict the values of the label on additional unlabeled data.
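The error-driven loop described above can be sketched with a perceptron, one of the simplest supervised classifiers. The perceptron is chosen here only for illustration and is not named in the text; the toy dataset, the function names and the parameter values are all assumptions.

```python
def predict(weights, bias, x):
    """Return 1 if the weighted sum of the inputs is positive, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(inputs, labels, epochs=20, learning_rate=1.0):
    """Adjust the weights only when the predicted label differs from the correct one."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(inputs, labels):
            # Compare the actual output with the correct output to find the error.
            error = target - predict(weights, bias, x)  # 0 when correct, +1 or -1 when wrong
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical labeled examples: the label is 1 only when both features are present.
training_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
training_labels = [0, 0, 0, 1]

weights, bias = train_perceptron(training_inputs, training_labels)

# Predict labels for additional, unlabeled inputs.
new_inputs = [(0.9, 0.8), (0.1, 0.3)]
print([predict(weights, bias, x) for x in new_inputs])
```

Practical work would normally rely on a library implementation and a held-out test set rather than a hand-written loop like this one.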

Unsupervised learning is applied to data that has no historical labels. Because no correct answers are provided, the algorithm must work out on its own what the data shows. The goal is to explore the data and find some structure within it. Popular techniques include self-organizing maps, nearest-neighbor mapping, k-means clustering and singular value decomposition. These algorithms are also used to segment text topics, recommend items and identify data outliers.
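As an illustration of finding structure without labels, the following is a minimal sketch of k-means clustering in plain Python. The sample points, the function names and the naive choice of the first k points as starting centroids are assumptions made for the example; library implementations use more careful initialization.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    centroids = list(points[:k])  # naive initialization (assumption): first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: squared_distance(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: each centroid moves to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

The number of clusters k has to be chosen by the user; the algorithm only decides which points belong together.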

In reinforcement learning, the algorithm discovers through trial and error which actions yield the greatest rewards. This type of learning has three primary components: the agent (the learner or decision maker), the environment (everything the agent interacts with) and actions (what the agent can do). The objective is for the agent to choose actions that maximize the expected reward over a given amount of time. A policy is the rule the agent uses to choose an action in each situation, and the agent reaches its goal much faster by following a good one. The goal in reinforcement learning is therefore to learn the best policy.
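The agent, environment and actions can be made concrete with a small sketch of tabular Q-learning, one common reinforcement learning algorithm that the text does not name. Everything in the example is an assumption chosen for illustration: the five-cell corridor environment, the reward of 1.0 at the goal, and the values of alpha, gamma and epsilon.

```python
import random

N_STATES = 5            # corridor cells 0..4; the agent starts in cell 0
GOAL = N_STATES - 1     # reaching the last cell ends the episode
ACTIONS = (-1, +1)      # action 0 moves left, action 1 moves right


def step(state, action):
    """Environment: apply the move, stop at the walls, reward 1.0 on reaching the goal."""
    next_state = min(max(state + ACTIONS[action], 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice: explore at random sometimes, and whenever the values tie."""
    if rng.random() < epsilon or q_row[0] == q_row[1]:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learn action values by trial and error over many episodes."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = choose_action(q[state], epsilon, rng)
            next_state, reward = step(state, action)
            # Move the estimate toward the reward plus the discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# The learned policy picks the higher-valued action in each non-goal cell.
policy = ["left" if row[0] > row[1] else "right" for row in q_table[:GOAL]]
print(policy)
```

The discount factor gamma makes rewards that arrive sooner worth more, which is why the learned policy prefers the shortest route to the goal.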

Machine learning tools

TensorFlow – open source software library for dataflow and differentiable programming across a range of tasks, developed by Google.

PyTorch – open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing. It was originally developed by Facebook’s AI Research lab (now Meta AI).

Scikit-learn – free software machine learning library for Python, designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.

NumPy – foundational library for numerical computing and data analysis in Python. Many of the other tools listed here build on or interoperate with it, although it is not itself a machine learning tool.

Pandas – fast, powerful, flexible and easy-to-use open source data analysis and manipulation tool. In machine learning it is mainly used to load and prepare tabular data in the form of DataFrames.