The Beauty of Neural Networks
By Daniel Gitelman, Vishwak Pabba, Manav Jairam
What actually happens when you train a neural network? Neural networks are commonly described as “black boxes”: despite their widespread use, most people do not know what they are, how they work, or why they are used. In our visualizations, we explore how simple regression models generalize well to linearly structured data but fail to capture more intricate patterns. We then demonstrate how neural networks, in contrast, excel at modeling intricate datasets and learning complex relationships between variables.
To provide some context for the upcoming visualizations, the data visualized and used for training before the final section is all labeled either 0 or 1. Points labeled 0 are colored red, and points labeled 1 are blue. The same color scheme applies to the heatmaps that represent a model's predictions across regions of the input space. Throughout this page, you will be asked to try training various models; once you do, changes should be apparent in the corresponding heatmap. You can view a video demonstrating key features of the visualization here.
Basic Logistic Regression
One of the most basic machine learning models used for classification is logistic regression. When fit to a simple, linearly separable dataset, logistic regression can be an effective method for learning patterns in data. Try pressing the train button and observe how a logistic regression model learns the boundary between differently labeled points. You should see the heatmap divided in the middle by a diagonal, mirroring the divide in the data it learns from.
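If you want to reproduce this behavior offline, here is a minimal sketch using scikit-learn (not the code behind our visualization; the dataset parameters are illustrative assumptions):

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Two well-separated clusters stand in for a linearly separable dataset.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=1.0, random_state=0)

clf = LogisticRegression()
clf.fit(X, y)

# The learned boundary is the line where w . x + b = 0.
print("weights:", clf.coef_, "bias:", clf.intercept_)
print("training accuracy:", clf.score(X, y))
```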
Logistic Regression on Complex Data
However, this same model trained on a complex dataset, such as the one seen below, performs poorly. Again, try pressing the train button to observe how logistic regression attempts to learn the pattern in the data. You might see the predictions wander during training without ever settling into one place. Regardless, the model will fail to learn the shape of the data, because its decision boundary is a single straight line and no straight line can separate these classes.
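A quick way to see this failure outside the visualization is to fit the same model on concentric rings, where no straight line can separate the classes. This sketch assumes scikit-learn's make_circles as a stand-in for our complex dataset:

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression

# Two concentric rings: class 1 inside, class 0 outside.
X, y = make_circles(n_samples=500, noise=0.05, factor=0.5, random_state=0)

clf = LogisticRegression()
clf.fit(X, y)

# Logistic regression can only draw a straight line, so its accuracy
# stays near chance (~0.5) no matter how long it trains.
print("training accuracy:", clf.score(X, y))
```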
Hidden Layers on a Complex Dataset
This is where neural networks become important. By adding hidden layers and neurons to the architecture, we observe a markedly improved ability to capture the intricacies of the complex dataset.
The hidden layers within a neural network are composed of neurons, each of which receives a weighted linear combination of the previous layer's outputs. Each neuron then passes that input through an activation function, which introduces nonlinearity and lets the network learn more effective representations of the data. In our model, we use the ReLU activation function, which passes positive values through unchanged and outputs zero for negative values.
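Concretely, ReLU is just max(0, x). A one-line NumPy sketch:

```python
import numpy as np

def relu(x):
    # Passes positive values through unchanged; negative values become 0.
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]
```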
The overarching purpose of the additional hidden layers and neurons is to learn the relationships between features. Comparing the logistic regression model to the neural network, you might notice that each individual neuron is almost identical to a logistic regression model: a weighted sum followed by a nonlinearity. By stacking and combining these neurons, however, a neural network can learn much more than a single logistic regression can.
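To illustrate the point (again as a hedged sketch, not our visualization code), scikit-learn's MLPClassifier stacks exactly these logistic-regression-like units. On the concentric rings from the earlier sketch, it succeeds where logistic regression fails:

```python
from sklearn.datasets import make_circles
from sklearn.neural_network import MLPClassifier

X, y = make_circles(n_samples=500, noise=0.05, factor=0.5, random_state=0)

# Two hidden layers of 8 ReLU neurons each -- every neuron is a weighted
# sum plus a nonlinearity, but together they bend the decision boundary.
mlp = MLPClassifier(hidden_layer_sizes=(8, 8), activation="relu",
                    max_iter=2000, random_state=0)
mlp.fit(X, y)
print("training accuracy:", mlp.score(X, y))  # should be near 1.0
```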
Try experimenting with the model parameters below and observe the effect of the model's size on how quickly and how effectively it learns the pattern in the data as it trains. In addition, observe how the weights change as the model learns. A weight's sign is encoded by color, green representing a positive weight and red a negative one, and its magnitude is encoded by the thickness of the line visualizing it. You should see weights grow or fade away as the model strengthens or weakens specific connections between the neurons those weights join.
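The same encoding is easy to sketch in code: after fitting, the network's weight matrices can be read off and mapped to a color and a thickness. This schematic assumes the fitted `mlp` from the sketch above and only mimics how our visualization maps weights to lines:

```python
import numpy as np

# mlp.coefs_ holds one weight matrix per layer-to-layer connection.
for layer, W in enumerate(mlp.coefs_):
    for i, j in np.ndindex(*W.shape):
        w = W[i, j]
        color = "green" if w >= 0 else "red"  # sign -> color
        width = abs(w)                        # magnitude -> line thickness
        # A drawing routine would render the edge (layer, i) -> (layer+1, j)
        # with this color and width; near-zero weights effectively vanish.
        if width > 0.5:  # print only the strongest connections
            print(f"layer {layer}: edge {i}->{j}: {color}, width {width:.2f}")
```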
Learning From Real-World Data
While showing that a neural network can learn patterns in two-dimensional synthetic data is cool, the question remains: how well can a neural network learn complex real-world relationships between features and their output?
Using data from the Boston Housing dataset (https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html), which contains 506 data points, we demonstrate the performance of a neural network on real-world data. We feed most of the dataset's features into the input layer and model a regression problem: predicting the prices of Boston houses. Try experimenting with the various parameters and explore how additional complexity or different model structures let the model fit the data more precisely. To track how well the model is learning, you can use the loss graph below, which shows the network's mean squared error on the data. You might also notice that the total number of weights grows rapidly as layers and neurons are added: each fully connected layer contributes roughly width-squared weights, so widening or deepening the network multiplies the parameter count. This capacity is what lets neural networks learn patterns in data far more complex than Boston housing prices.
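To make the parameter-count claim concrete, here is a small helper that counts the weights and biases of a fully connected network; the hidden-layer sizes are illustrative assumptions, with 13 inputs matching the Boston dataset's feature count:

```python
def count_parameters(layer_sizes):
    """Weights + biases for a fully connected network with these layer sizes."""
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:])
    return weights + biases

# 13 input features, one regression output; hidden layers vary.
for hidden in [(8,), (16, 16), (32, 32, 32)]:
    sizes = (13, *hidden, 1)
    print(sizes, "->", count_parameters(sizes), "parameters")
```

Doubling the width of every hidden layer roughly quadruples the weight count between hidden layers, which is where the network's extra capacity comes from.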
Past Runs
Conclusion
In conclusion, we hoped to show how neural networks learn and how the number of parameters affects that learning process. If there is one thing we want everyone to take away from this project, it is a better understanding of how neural networks learn complex relationships, in contrast to simpler models. Rather than simply accepting that neural networks are complex, we want everyone to understand where that complexity comes from and what it actually does. We hope we have made this clear by letting the viewer investigate the effects of complexity in a neural network, and by relating the neural network to a simpler model to show how it is built up.