A Simple Neural Network

Cosmocoder
5 min read · Jun 22, 2023

So today is a milestone for me as a programmer. Throughout all of my studying to become a competent dev, whenever asked what kind of work I would like to do in this field, my response has always shot up from some unknown place deep inside…

…”A.I. and Machine Learning — for sure”

I don’t really know why, it’s just so awesome — not only the applications but also the chance to project ourselves even more deeply onto something external and thereby perhaps get to know ourselves in a new and more exciting way. And maybe the complexity of this medium is finally enough to provide meaningful room for our inner selves.

For today’s post, as always, I want to keep this short and sweet. The goal is to take a very high-level glance at the creation, training, and application of a simple neural network.

The problem…

Let’s say I have a collection of 1,000 haiku and each one has a score attached to it — anywhere from 0 to 100.

Now what this means to me is that whoever scored these has some kind of criteria by which they determine the beauty of a haiku — i.e. a way of thinking about haiku.

And I want to create a consciousness (or rather, a model) that can be trained and, after training, is capable of thinking in a similar way.

Now, from the biological / neurological side, let’s shoot for the stars and attempt to distill the neurology we’re interested in down to just a couple of ideas… (which is ridiculous, but oh well, here goes)

  1. Neurons generally receive signals from other neurons and then propagate signals forward to other neurons.
  2. What determines whether or not a neuron will fire is usually whether or not it has received a sufficient amount of signaling from its upstream neuron(s) — enough to initiate an action potential.

All of this is to say that if five neurons all terminate at a single neuron — each of the five has to contribute a certain amount of energy in order to push the downstream neuron to fire.

Now, from the neural network / programming side, what we’re asking is: which “neurons” (words) contributed to the firing of this single downstream neuron (the haiku being “good”)?

So right away let’s codify “good”…

I’m thinking any haiku with a score above 85 is pretty good. Now we’re in a position to consider this in a binary fashion.

< 85 = 0 (the neuron didn’t fire)

≥ 85 = 1 (the neuron fired, the haiku was good!)
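In code, that binarization is just a threshold check (the scores below are made up for illustration):

```python
# Hypothetical scores for a few haiku (illustrative only).
scores = [87, 62, 91, 85, 40]

# A haiku "fires" (label 1) when its score is 85 or above.
labels = [1 if s >= 85 else 0 for s in scores]
print(labels)  # → [1, 0, 1, 1, 0]
```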

For today’s task, let’s keep it simple and consider a group of words that occur across many of our haiku to be independent nodes; each one, when present, has an opportunity to contribute to the haiku being beautiful or not.

So here are our neurons, one for each word:

You can see that I’ve ignored what are commonly known as “stop words” or words that we might not imagine to meaningfully contribute to the score (“a”, “to”, “the”, “and”, etc.)
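If you’re curious how that filtering might look in code, here’s a rough sketch (the stop-word list is a tiny made-up sample; real lists are much longer):

```python
# A small, hypothetical stop-word list for illustration.
STOP_WORDS = {"a", "to", "the", "and", "is", "of"}

def tokenize(haiku: str) -> list[str]:
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    words = [w.strip(".,;:!?—") for w in haiku.lower().split()]
    return [w for w in words if w and w not in STOP_WORDS]

print(tokenize("The light of a candle"))  # → ['light', 'candle']
```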

Each line from the individual word to the score itself represents its contribution to the score. The only issue at the moment is that we don’t know how much each one contributes.

So what we need is each one’s weight or relative importance when considering how our network will “think” regarding the score that we want to predict.

Training

Now let’s train the model and, in doing so, find the weight of each word. To do this, I want to take about 70% of my haiku and analyze each one. We’ll save the other 30% for testing the model to establish its mean squared error (which lets us know how accurate our model is).
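A quick sketch of what that 70/30 split could look like (the dataset here is just placeholder data standing in for real haiku):

```python
import random

# Hypothetical dataset: each entry is (words, score); placeholders here.
haikus = [([f"word{i}"], i * 10) for i in range(10)]

random.seed(0)      # reproducible shuffle
random.shuffle(haikus)

split = int(len(haikus) * 0.7)            # first 70% for training
train_set, test_set = haikus[:split], haikus[split:]
print(len(train_set), len(test_set))      # → 7 3
```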

What I’ll do is simple:

For each haiku, each time the score is 85 or greater, I’ll increment the weight for the words present in the haiku. Conversely, if the score is below 85, I’ll decrement the weight for these words.
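Here’s a rough sketch of that update rule in code (the haiku and scores below are made up for illustration):

```python
from collections import defaultdict

# Hypothetical training examples: (words in the haiku, score).
train_set = [
    (["light", "candle", "dew"], 87),
    (["wind", "dust", "road"], 55),
    (["moon", "light", "pond"], 90),
]

weights = defaultdict(int)

for words, score in train_set:
    delta = 1 if score >= 85 else -1  # fired → increment, otherwise decrement
    for word in words:
        weights[word] += delta

print(dict(weights))  # 'light' appears in two good haiku, so it ends up at 2
```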

The first haiku contained the words light, candle, and dew. Its score was 87, so my weights now look like this:

Now I’ve continued the analysis of the rest of my haiku collection to further refine my weights and after all have passed through, this is my end result…

Now I’m given a new haiku, one by Yosa Buson —

The light of a candle

Is transferred to another candle —

Spring twilight

So how can I determine what my newly trained network will make of this?

Simple: I’ll just analyze it by applying my weights to get a predicted total score.
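A sketch of what that scoring step might look like. The weights here are purely illustrative (the actual trained values aren’t shown), picked so the example sums to the 94 from this walkthrough; each distinct word counts once, and unknown words contribute nothing:

```python
# Purely illustrative weights; not the real trained values.
weights = {"light": 30, "candle": 35, "spring": 18, "twilight": 11}

def predict(words):
    """Sum the weight of each distinct word; unknown words contribute 0."""
    return sum(weights.get(w, 0) for w in set(words))

# Buson's haiku after stop-word removal.
new_haiku = ["light", "candle", "transferred", "another",
             "candle", "spring", "twilight"]
print(predict(new_haiku))  # → 30 + 35 + 0 + 0 + 18 + 11 = 94
```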

And there it is. Using this haiku and applying our weights and summing the “action potentials” leading to our target, we ended up with a stout score of 94!

In other words, our downstream neuron fired!

And to double check how reliable this prediction really is, we would use our other 30% of the haiku that we did not train the model with to test and then quantify how accurate our model really is.

If we were to find that our mean squared error (or root mean squared error) for the model is too high, then it might mean we need a larger dataset, some deeper hyperparameter tuning, or a number of other adjustments that we’ll look at in more complex examples.
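That check is just averaging squared errors over the held-out haiku; a sketch with made-up weights and test data:

```python
# Hypothetical trained weights and held-out test haiku: (words, true score).
weights = {"light": 30, "candle": 35, "moon": 25, "dust": -5}
test_set = [(["light", "candle"], 70), (["moon", "dust"], 25)]

def predict(words):
    return sum(weights.get(w, 0) for w in set(words))

# Mean squared error (and its root) over the held-out 30%.
errors = [(predict(words) - score) ** 2 for words, score in test_set]
mse = sum(errors) / len(errors)
rmse = mse ** 0.5
print(mse, rmse)  # → 25.0 5.0
```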

And that’s it — going to stop here on this one. Just a brief notion to get the mind wondering about all of the potential complexities that can (and will!) branch off from here ;)


Cosmocoder

Hi. I’m Nathan and I’m a full-stack software dev making my way through DS and Algo. Looking forward to sharing experiences along the way.