1.0 Models
Simple models
A model is a mathematical function, or a set of functions whose outputs are chained to one another's inputs. The purpose of a model is to make predictions based on its inputs.
An example of the simplest possible model would be:

y = mx + c

In the context of Machine Learning, this is instead represented as:

y = wx + b

In the above example, w is known as a weight, and b as a bias parameter. Parameters of the model are optimised during the training stage.
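As a concrete illustration, a model of this form can be written in a few lines of Python. The parameter values below are arbitrary placeholders, and the names w, b, and predict are illustrative choices rather than standard notation:

```python
# A minimal sketch of the model y = wx + b.
# The values of w and b here are arbitrary example parameters.
w = 2.0  # weight
b = 0.5  # bias

def predict(x):
    # Return the model's prediction for the input x.
    return w * x + b

print(predict(3.0))  # prints 6.5
```

During training, it is the values of w and b that would be adjusted; the form of the function stays the same.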
Neural networks
A common form of model in Machine Learning is known as a feed-forward neural network. The following is a minimalist example of a feed-forward neural network:
Diagram 1.0.1: Example of a basic feed-forward neural network, with only 3 inputs, 1 hidden layer, and 1 output.
Diagram 1.0.2: Continued example; inspection of inputs, contents, and outputs of one neuron (h1) within the hidden layer of Diagram 1.0.1. Specifically, this is a linear regression function wrapped in an activation function σ, which could alternatively be written σ(f(x)).
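To make Diagram 1.0.2 concrete, here is a minimal Python sketch of the neuron h1. The choice of the logistic sigmoid for σ, and all of the weight and bias values, are assumptions made for illustration; the diagram itself does not fix them:

```python
import math

def sigma(z):
    # One common choice of activation function: the logistic sigmoid.
    return 1.0 / (1.0 + math.exp(-z))

def h1(x1, x2, x3, w1, w2, w3, b):
    # The linear regression function f(x), wrapped in the
    # activation function: sigma(f(x)).
    return sigma(w1 * x1 + w2 * x2 + w3 * x3 + b)

# Three inputs, as in Diagram 1.0.1; weights and bias chosen arbitrarily.
print(h1(1.0, 2.0, 3.0, w1=0.4, w2=-0.2, w3=0.1, b=0.05))
```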
What is a feed-forward neural network?
- A feed-forward neural network is made up of layers
- Each layer is made up of neurons
- Each neuron within the same layer is assigned the same function, but different weights
- The weights of the functions are adjusted during the training stage of a feed-forward neural network
- The types of functions involved are typically linear regression functions, such as in Diagram 1.0.2
- A linear regression function within a neuron may contain at most as many weights as there are neurons in the previous layer
- Inputs are taken from the outputs of a previous layer of functions, processed, and then output to the next layer of functions
- Functions within neurons are typically wrapped in activation functions, which may cause certain neurons within a layer to output zero unless a specific condition is met
The choice of the quantities of neurons and layers, and of the functions that make them up, is known as the model architecture.
As a result of the above, a feed-forward neural network can be represented as one giant mathematical function. However, as neural networks grow, that function quickly becomes extremely complicated algebraically, so the graphical format is common in teaching.
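As a sketch of this idea, the small network of Diagram 1.0.1 can be written out in Python, with the layers composed into a single network function at the end. The size of the hidden layer (two neurons), the choice of σ, and every weight and bias below are illustrative assumptions, not values taken from the diagram:

```python
import math

def sigma(z):
    # Logistic sigmoid, as in the earlier sketch; repeated here so this
    # example runs on its own.
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # sigma(f(x)): a weighted sum plus a bias, wrapped in the activation.
    return sigma(sum(w * x for w, x in zip(weights, inputs)) + bias)

# Hidden layer: each neuron applies the same function with its own weights,
# one weight per neuron (here, per input) in the previous layer.
hidden_params = [
    ([0.4, -0.2, 0.1], 0.05),   # h1
    ([0.3, 0.8, -0.5], -0.10),  # h2 (assumed; the diagram's hidden-layer
                                #     size is not specified here)
]
output_params = ([0.7, -0.3], 0.02)  # the single output neuron

def network(x1, x2, x3):
    # The whole network is one composed function: output(hidden(inputs)).
    hidden_outputs = [neuron([x1, x2, x3], w, b) for w, b in hidden_params]
    w, b = output_params
    return neuron(hidden_outputs, w, b)

print(network(1.0, 2.0, 3.0))
```

Writing network out by substituting every call would produce exactly the "one giant function" described above, which is why the graphical and programmatic forms are preferred.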
The notion of the activation function is inspired by neurons in the human brain, which pass information on to other neurons only when they are sufficiently stimulated. It is in this sense that neural networks are based on the mechanisms of the human brain.
Practice questions
1. Considering that a brain neuron passes information on only when it is excited, what properties make for a typical activation function?
Answer
Theoretically, any function which passes information on only once its input exceeds a certain limit. So any function you are familiar with which is increasing for all values of x beyond a certain value of x would serve. Alternatively, even a piecewise function could do, for example:

σ(x) = 0, for x < 0
σ(x) = x, for x ≥ 0
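A minimal Python sketch of that piecewise function follows. As it happens, this particular shape is a widely used activation function in practice (the rectifier, or ReLU), though the reasons for its popularity are left for later chapters:

```python
def activation(x):
    # Passes the input on only once it exceeds the limit (here, zero);
    # otherwise the "neuron" outputs zero.
    return x if x >= 0 else 0.0

print(activation(-1.5))  # 0.0: not excited enough, nothing is passed on
print(activation(2.0))   # 2.0: the input is passed through
```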
However, as the reader continues through this textbook, it will become clear why some functions are preferable to others.
2. Are the weights within a model considered part of the model architecture?
Answer
No. The architecture is the choice of the quantities of neurons and layers and of the functions within them; the weights are parameters of the model, and are adjusted during the training stage rather than being fixed as part of the architecture.
3. How many hidden layers and neurons can a neural network have?
Answer
In principle, any quantity: the quantities of hidden layers and of neurons per layer are design decisions, and form part of the model architecture rather than being fixed by any mathematical rule.