Hello, dear readers! It’s amazing how fast time flies–we’re already 1/5 of the way through this summer project if you can believe it. With our final team member Alex back in Bethlehem, it really feels like we can hit the ground running. We’ve had our hands full, and I’m very excited to update you on my progress!
Our week started off much as last week ended–exploring different Python libraries and collecting needed resources to use later in the project. I personally spent some time with PyTorch, another open-source library for building Machine Learning models in Python. The absolutely amazing web resource Learn PyTorch has great documentation walking through the process of building and testing models for different scenarios, and I cannot recommend it enough as a starting point for others interested in ML. Following its lead, along with guidance from Prof. R., I was able to build a small feed-forward Neural Network with it!
I’ll explain the basic structure of the network, and then show off the results I was able to get.
So, for the layman, let’s lay down a foundation. What’s a neural network, why am I making one, and how did I do it? First, and perhaps most important: a neural network chains together layers of simple computational units called ‘neurons’, and it trains by reading through large datasets and slowly ‘learning’ what makes that data tick. The network adjusts its decision-making parameters as it trains until eventually it can make decisions related to whatever you taught it. Most networks focus on one of two major tasks: regression and classification. The former attempts to predict numerical values, whereas the latter attempts to sort an input into one of a list of possible categories. Our project focuses on the former–ideally, experimental data goes in, and the model predicts experiment results without anyone ever having to touch a lab.
Before we get that far, though, we need to start at the absolute basics. My first task was to predict a single value, using a simple function as training data. I won’t take you through the nitty-gritty for this example, since my code heavily mimics the tutorials above. At a high level, the process is simple! Data needs to be generated–in this case, at Prof R’s request, I had Python generate data from the first-order rate equation often used in chemistry. This data gets loaded into PyTorch tensors and split into a training and a testing segment. (More complex models will include a third, validation split, used to fine-tune model parameters before full testing.)
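For the curious, here’s a rough sketch of what that data step can look like. The rate constant, starting concentration, and number of points below are just placeholder values I picked for illustration, not the exact ones from my run:

```python
import torch

# First-order rate law: concentration decays exponentially over time.
# k (rate constant) and A0 (starting concentration) are placeholder values.
k, A0 = 0.5, 1.0
t = torch.linspace(0, 10, 200).unsqueeze(1)   # time points, shape (200, 1)
A = A0 * torch.exp(-k * t)                    # concentration at each time point

# Split into training and testing segments (80/20).
split = int(0.8 * len(t))
t_train, A_train = t[:split], A[:split]
t_test, A_test = t[split:], A[split:]
```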
We then define the model network; in this case, it’s a Sequential stack of fully-connected linear layers (ones where every neuron in a layer is connected to every neuron in the next) along with activation layers that control how strongly each neuron’s output gets passed forward. In my case, this turned out like the nearby snippet of code. This model follows a ‘pyramid’ structure for its neurons, starting with the 1 input, expanding it outward, then slowly bringing it back down until the model has 1 distinct answer. You can also see two further variables: a loss function and an optimizer. The former tells the tool how to measure the model’s error, and the latter is how the model chooses to update its neurons during training.
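If the snippet image doesn’t load for you, here’s roughly the shape of it. The exact layer widths, the ReLU activations, and the SGD learning rate here are stand-ins, not necessarily what my real model used:

```python
import torch
from torch import nn

# A pyramid-shaped stack: 1 input expands outward, then narrows back to 1 answer.
model = nn.Sequential(
    nn.Linear(1, 16),
    nn.ReLU(),
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)

loss_fn = nn.MSELoss()                                    # how we measure error
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # how weights get updated
```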
We then train this model, feeding the data through the network many, many times–each full pass is otherwise known as an epoch. Epochs are broken down into ‘batches’ of data, so the model can optimize more than once per dataset cycle. Each batch is pushed forward through the model, and then a ‘backward pass’ works out how each neuron’s weight contributed to the error. The optimizer then takes a step, updating the weights according to the algorithm we specified earlier. This process can repeat as often as we, the designers, want, but it’s best to stop the model once the loss can’t drop any further. This helps prevent overfitting, a phenomenon where a model memorizes the exact dataset rather than the underlying pattern it represents, and so can’t make good predictions on unseen examples.
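Strung together with the sketches above, a bare-bones version of that training loop might look like this. The epoch count and batch size are guesses for illustration, and epoch_losses is just a name I made up for bookkeeping:

```python
from torch.utils.data import DataLoader, TensorDataset

# Break the training data into batches.
train_loader = DataLoader(TensorDataset(t_train, A_train), batch_size=16, shuffle=True)

epoch_losses = []
for epoch in range(500):
    running_loss = 0.0
    for t_batch, A_batch in train_loader:
        pred = model(t_batch)            # forward pass
        loss = loss_fn(pred, A_batch)    # how far off were we?
        optimizer.zero_grad()            # clear old gradients
        loss.backward()                  # backward pass: compute gradients
        optimizer.step()                 # update the weights
        running_loss += loss.item()
    epoch_losses.append(running_loss / len(train_loader))  # record loss per epoch
```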
Here’s where we see how we did! Above is a ‘loss graph’, which in its simplest terms shows us how far off a model’s predictions are as it continues to learn from the data. We’ve plotted our loss (in this case, mean squared error) against epochs, a measure of how long the model has been training.
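Making a graph like that is just a matter of handing the recorded per-epoch losses (the epoch_losses list from the loop sketch above) to matplotlib:

```python
import matplotlib.pyplot as plt

# Plot the recorded loss for each epoch to see training progress.
plt.plot(epoch_losses)
plt.xlabel("Epoch")
plt.ylabel("Mean squared error")
plt.title("Training loss")
plt.show()
```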
Finally, we get to see the model make predictions! This network was trained on data from a first-order rate equation, a common function used in chemistry. The plot actually shows a rate of change–one that’s relatively linear in nature, though not quite. While this particular model isn’t spot on, it’s surprisingly close for such a small set of training data and a simple network structure.
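The prediction step itself is short. Assuming the same sketched names from above (model, t_test, A_test), it goes roughly like this:

```python
# Switch to evaluation mode and predict on the unseen test times.
model.eval()
with torch.inference_mode():
    A_pred = model(t_test)

# Compare the model's guesses against the true curve.
plt.plot(t_test, A_test, label="True concentration")
plt.plot(t_test, A_pred, label="Model prediction")
plt.xlabel("Time")
plt.ylabel("Concentration")
plt.legend()
plt.show()
```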
...whew, that is a LOT for a ‘simple’ model! However, this exercise helped a ton in learning the process of building a machine learning model from scratch, and gave me fun graphs to show off in the process (always a bonus). To anyone else attempting a similar project, I seriously recommend this method of starting small and working your way up to the project you have in mind. Incremental progress lets you test your own knowledge as you go, and mistakes won’t feel nearly as punishing when they happen.
Our team also had some really great discussions about further project goals, especially centered on the problem we hope to solve for Dr. Menicucci’s laboratory class. Even this early on, I consider the end-user experience probably the most crucial aspect of development. Our success is measured by whether our models see use in a classroom, after all! Fittingly, the module now has a name: The Digital Laboratory Twin. All in all, I feel really great about our progress, and can’t wait to see where development takes us!
-Maddie
References:
Predictive modeling and multiobjective optimization of diamond turning process of single-crystal silicon using RSM and desirability function approach – Scientific Figure on ResearchGate. Available from: https://www.researchgate.net/figure/Typical-artificial-neural-network-structure-64_fig2_333206147 [accessed 16 Jun, 2023]