By Tech Brew Staff
Definition:
Most advanced machine learning systems use a structure called a neural network to model data. Loosely modeled after the human brain, neural networks consist of layers of nodes, each defined by weights and biases.
Images, text, and other data are converted into arrays of numbers, called tensors, and fed into the input layer of nodes. From there, the data is multiplied by a set of weights and passed to each of the nodes in the next layer, each of which sums its weighted inputs and adds a bias number.
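Here's what one layer of that computation looks like in a few lines of Python. The shapes, values, and the ReLU activation are illustrative assumptions, not anything specified above:

```python
import numpy as np

# Sketch of one fully connected layer: each node in the next layer
# computes a weighted sum of its inputs and adds its own bias.
rng = np.random.default_rng(0)

x = rng.random(4)        # input tensor: 4 numbers (e.g., pixel values)
W = rng.random((3, 4))   # weights: 3 next-layer nodes, 4 incoming weights each
b = rng.random(3)        # one bias per node

z = W @ x + b            # weighted sums plus biases, one value per node
a = np.maximum(z, 0)     # a common activation (ReLU) before the next layer
print(a)
```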
Illustrating how neural networks work
Imagine you have data on people’s heights and shoe sizes. You plot it on a graph with a line of best fit, y = mx + b, where m is the average shoe-size increase for every additional inch of height and b is a baseline shoe size. This line maps a relationship within your data and is essentially one neural network node: m is the weight value and b is the bias value. You could scale that up into an endlessly more complicated, layered system that could plot increasingly complex nonlinear relationships between all kinds of characteristics: age, gender, foot width, genetic traits. From those, it could also predict something like shoe size. That’s a massively oversimplified version of a neural network, but at its core, it’s akin to a giant statistical equation.
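To make the analogy concrete, here's that single "node" as runnable Python. The height and shoe-size numbers are invented purely for illustration:

```python
import numpy as np

# One "node": a line of best fit with a single weight (m) and a bias (b).
heights = np.array([60, 63, 66, 69, 72, 75])             # inches
shoe_sizes = np.array([6.5, 7.5, 8.5, 9.5, 10.5, 11.5])

m, b = np.polyfit(heights, shoe_sizes, 1)   # fit the line y = m*x + b

def predict(height):
    return m * height + b                   # weight times input, plus bias

print(f"weight m = {m:.2f}, bias b = {b:.2f}")
print(f"predicted shoe size at 70 inches: {predict(70):.1f}")
```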
The weights are often initially set to random numbers and tweaked, along with the biases, as training data is fed through the system. Most modern neural networks include many “hidden layers” of nodes between the input and output. As data trickles deeper into the layers, the concepts analyzed become more sophisticated and abstract. That’s where image models might zero in on, say, the ears in a cat photo, or natural language models might parse certain sentence structures.
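Here's a sketch of that tweaking process, using the same made-up shoe-size data and plain gradient descent; the learning rate and step count are arbitrary choices for illustration:

```python
import numpy as np

# Start with a random weight and bias, then repeatedly nudge them to
# shrink the average squared prediction error, the basic idea of training.
rng = np.random.default_rng(1)
heights = np.array([60., 63., 66., 69., 72., 75.])
shoe_sizes = np.array([6.5, 7.5, 8.5, 9.5, 10.5, 11.5])

x = (heights - heights.mean()) / heights.std()  # normalize for stable steps
m, b = rng.random(2)                            # random starting values

for _ in range(500):
    error = (m * x + b) - shoe_sizes
    grad_m = 2 * np.mean(error * x)   # gradient of squared error w.r.t. m
    grad_b = 2 * np.mean(error)       # gradient w.r.t. b
    m -= 0.1 * grad_m                 # tweak the weight downhill
    b -= 0.1 * grad_b                 # tweak the bias downhill

print(f"learned m = {m:.2f}, b = {b:.2f}")
```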
Neural networks were first conceived of in the 1940s and 1950s, but they’ve fallen in and out of favor along with the ebbs and flows of AI research over the decades. The most recent boom around AI started in the early 2010s with a convergence of trends: a specific formulation called convolutional neural networks, the rise of graphics processing units (GPUs) to handle large computing loads, and massive image datasets. These together supercharged the field of computer vision, or image recognition, and led to a boom in deep learning—AI that involves multi-layered neural networks.