Common ML Algorithms for IoT: Linear Regression, Logistic Regression, SVM, and Decision Trees

In the realm of Edge AI and TinyML for IoT devices, efficient and lightweight machine learning algorithms are paramount. These algorithms must perform complex tasks with limited computational power, memory, and energy. This module explores four fundamental algorithms commonly adapted for these resource-constrained environments: Linear Regression, Logistic Regression, Support Vector Machines (SVM), and Decision Trees.

Linear Regression

Linear Regression is a supervised learning algorithm used for predicting a continuous target variable. It models the relationship between a dependent variable and one or more independent variables by fitting a linear equation to the observed data. For IoT, it's useful for tasks like predicting sensor readings or estimating remaining battery life based on usage patterns.

Linear Regression predicts continuous values by finding the best-fit line.

It establishes a linear relationship between input features and an output value, often represented by the equation y = mx + c (or its multi-dimensional equivalent).

The core idea is to minimize the difference between the predicted values and the actual values, typically using methods like Ordinary Least Squares (OLS). The 'best-fit' line is determined by finding the coefficients (m and c) that minimize the sum of squared errors. In IoT, this can be used to forecast temperature based on historical data or predict the power consumption of a device based on its operational parameters.
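The OLS fit described above can be sketched in plain Python for the single-feature case y = mx + c. The uptime/temperature numbers below are made up for illustration; a real deployment would fit on logged sensor data:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + c (single feature).

    Minimizes the sum of squared errors; the closed-form solution
    uses the covariance of x and y over the variance of x.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    c = mean_y - m * mean_x
    return m, c

# Hypothetical readings: hours of uptime vs. measured temperature (deg C)
hours = [1, 2, 3, 4, 5]
temps = [20.1, 22.0, 24.2, 25.9, 28.1]
m, c = fit_line(hours, temps)
forecast = m * 6 + c  # predicted temperature at hour 6
```

Because the fit reduces to a handful of sums, this runs comfortably on a microcontroller; inference is a single multiply-add.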

What type of problem does Linear Regression solve?

Regression problems, where the goal is to predict a continuous numerical value.

Logistic Regression

Logistic Regression is a supervised learning algorithm used for binary classification tasks. Despite its name, it's used for classification, not regression. It predicts the probability of a binary outcome (e.g., yes/no, true/false) by fitting data to a logistic function (sigmoid function).

Logistic Regression predicts the probability of a binary outcome.

It uses the sigmoid function to map any real-valued number into a value between 0 and 1, representing a probability. This probability is then used to classify the input.

The sigmoid function, σ(z) = 1 / (1 + e^-z), transforms a linear combination of input features (z) into a probability. A threshold (commonly 0.5) is then applied to classify the instance into one of two classes. For IoT, it's ideal for anomaly detection (e.g., detecting a faulty sensor reading) or classifying events (e.g., identifying if a motion sensor has been triggered).
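A minimal inference sketch of the sigmoid-plus-threshold step: the weights and bias here are hypothetical values assumed to have been learned offline for a "faulty sensor" detector, not a real trained model:

```python
import math

def sigmoid(z):
    """Map any real z into (0, 1): sigma(z) = 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias, threshold=0.5):
    """Return (probability, class label) for one input vector."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    p = sigmoid(z)
    return p, int(p >= threshold)

# Hypothetical weights learned offline: [reading deviation, signal strength]
weights = [0.8, -1.2]
bias = -0.1
p, label = predict([2.0, 0.5], weights, bias)  # z = 1.6 - 0.6 - 0.1 = 0.9
```

At inference time only the dot product and one `exp` are needed, which is why logistic regression is listed as low-complexity for IoT targets.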

What is the primary output of Logistic Regression?

The probability of an instance belonging to a particular class.

Support Vector Machines (SVM)

Support Vector Machines (SVM) are powerful supervised learning algorithms used for both classification and regression tasks, though most commonly for classification. SVMs work by finding an optimal hyperplane that best separates data points of different classes in a high-dimensional space.

SVM finds the optimal boundary (hyperplane) to separate data classes.

It aims to maximize the margin between the hyperplane and the nearest data points (support vectors) of each class, leading to robust classification.

The core principle is to find the hyperplane with the largest distance to the nearest training data points of any class. Maximizing this margin tends to improve generalization; in practice, soft-margin formulations add a tolerance for outliers and overlapping classes. For complex, non-linear separation, SVMs utilize kernel tricks (such as polynomial or radial basis function kernels) to map data into higher dimensions where linear separation is possible. In IoT, SVMs can be used for image classification (e.g., identifying objects captured by a camera) or for classifying different types of network traffic.

Imagine data points scattered on a 2D plane, representing two different classes. An SVM algorithm seeks to draw a line (a hyperplane in higher dimensions) that separates these classes. The key is that this line is positioned to be as far as possible from the closest points of each class. These closest points are called 'support vectors' because they 'support' the hyperplane. The distance between the hyperplane and these support vectors is called the 'margin'. A larger margin generally leads to better generalization. For non-linearly separable data, SVMs use kernel functions to implicitly map the data into a higher-dimensional space where a linear separation becomes possible.
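Inference for a trained linear SVM is just the sign of the signed distance to the hyperplane, and the margin width is 2/||w||. The hyperplane coefficients below are hypothetical, standing in for values learned offline:

```python
import math

def svm_decision(x, w, b):
    """Linear SVM inference: which side of the hyperplane w.x + b = 0?"""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical hyperplane learned offline
w = [2.0, 1.0]
b = -4.0

# Margin width between the two support-vector boundaries: 2 / ||w||
margin_width = 2.0 / math.sqrt(sum(wi * wi for wi in w))

cls_a = svm_decision([3.0, 1.0], w, b)  # score = 6 + 1 - 4 = 3  -> +1
cls_b = svm_decision([0.5, 0.5], w, b)  # score = 1 + 0.5 - 4 = -2.5 -> -1
```

With a linear kernel the deployed model is only `w` and `b`, so memory cost is one float per feature; kernelized SVMs must instead store support vectors, which is where the "medium (depends on kernel)" complexity in the table below comes from.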


What are the 'support vectors' in an SVM?

The data points closest to the hyperplane that define its position and margin.

Decision Trees

Decision Trees are supervised learning algorithms that use a tree-like structure of decisions and their possible consequences. They are intuitive and can handle both classification and regression tasks. For IoT, their interpretability and ability to handle non-linear relationships make them valuable.

Decision Trees make decisions by splitting data based on feature values.

They create a flowchart-like structure where internal nodes represent tests on attributes, branches represent the outcome of the test, and leaf nodes represent class labels or predicted values.

The tree is built by recursively partitioning the data based on the feature that best splits the data according to a certain criterion (e.g., Gini impurity or information gain for classification, variance reduction for regression). This process continues until a stopping criterion is met (e.g., maximum depth, minimum samples per leaf). Decision Trees are prone to overfitting, so techniques like pruning or using ensemble methods (like Random Forests or Gradient Boosting) are often employed. In IoT, they can be used for simple rule-based decision making, such as determining if a device should enter a low-power mode based on sensor inputs.
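The low-power-mode example mentioned above maps directly onto code: a small tree is just nested feature tests ending in leaf decisions. The features and thresholds here are hypothetical:

```python
def should_sleep(temperature_c, motion_detected, battery_pct):
    """Hand-built decision tree: should the device enter low-power mode?

    Internal nodes test one feature each; leaves return the decision.
    """
    if motion_detected:           # root split: activity present?
        return False              # stay awake while motion is detected
    if battery_pct < 20:          # next split: low battery forces sleep
        return True
    return temperature_c < 5      # leaf rule: sleep only in cold idle conditions
```

In practice such a tree would be induced from data (e.g., by minimizing Gini impurity at each split) rather than hand-written, but the deployed artifact is the same: a short chain of comparisons, which is why trees are both fast and highly interpretable on constrained devices.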

What are the main components of a Decision Tree?

Internal nodes (tests on attributes), branches (outcomes of tests), and leaf nodes (class labels or predicted values).

Algorithm Suitability for IoT

| Algorithm | Primary Use Case | Complexity (for IoT) | Interpretability |
| --- | --- | --- | --- |
| Linear Regression | Predicting continuous values | Low | High |
| Logistic Regression | Binary classification | Low | High |
| SVM | Classification (can do regression) | Medium (depends on kernel) | Medium (with linear kernel) |
| Decision Trees | Classification & regression | Low to Medium (depending on depth) | High |

For TinyML and Edge AI, model size, inference speed, and energy consumption are critical. Algorithms that can be efficiently quantized or pruned, and have fewer parameters, are generally preferred.
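As a rough illustration of the quantization idea, a minimal sketch of symmetric int8 quantization (one shared scale per weight vector, an assumption; real toolchains such as TensorFlow Lite also support per-channel scales and zero points):

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats onto [-127, 127] with one scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [qi * scale for qi in q]

# Example float weights from a small linear model
weights = [0.8, -1.2, 0.05]
q, scale = quantize_int8(weights)   # ints fit in 1 byte each instead of 4-8
approx = dequantize(q, scale)       # close to the originals, within one scale step
```

Storing one byte per weight instead of a 32-bit float cuts model size roughly 4x and enables integer-only inference, which is why quantization-friendly models dominate in TinyML.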

Learning Resources

Introduction to Machine Learning - Coursera(video)

Andrew Ng's foundational course covers Linear Regression, Logistic Regression, and introduces concepts relevant to SVMs and Decision Trees.

Scikit-learn Documentation: Linear Models(documentation)

Official documentation for Linear Regression and Logistic Regression implementations in Python's scikit-learn library.

Scikit-learn Documentation: Support Vector Machines(documentation)

Comprehensive guide to Support Vector Machines, including kernels and practical usage in scikit-learn.

Scikit-learn Documentation: Decision Trees(documentation)

Detailed explanation of Decision Tree algorithms, including their implementation and parameters in scikit-learn.

Machine Learning for Beginners: Logistic Regression(video)

A clear and concise video explanation of how Logistic Regression works, suitable for beginners.

Understanding SVM with Kernels(video)

An animated explanation of Support Vector Machines and the role of kernel functions in handling non-linear data.

Decision Trees Explained(video)

A visual tutorial that breaks down the concepts behind Decision Trees and how they make predictions.

Towards Data Science: Linear Regression Explained(blog)

A visually rich blog post explaining the intuition and mathematics behind Linear Regression.

Towards Data Science: Logistic Regression vs. SVM(blog)

A comparative analysis of Logistic Regression and SVM, highlighting their strengths and weaknesses.

Wikipedia: Decision Tree Learning(wikipedia)

A comprehensive overview of decision tree learning algorithms, their history, and applications.