<p>Part 2 of my foray into machine learning. In part one we built a neuron class in python. In this post we'll start to
train it to be useful... or as useful as a binary classifier can be.</p>

<p>If you haven't read [Part 1]({% post_url 2019-07-16-neuron %}) I would highly suggest you start there. As a quick recap
though, here is what we accomplished by the end of the last post.</p>
<p>We built a neuron class that looks like the following:</p>
<pre><code class="language-python">import random

class Neuron:
    def __init__(self, number_of_inputs, weights=None):
        self.number_of_inputs = number_of_inputs
        self.weights = weights if weights is not None else self._random_weights()

    def _random_weights(self):
        return [random.normalvariate(0, .5) for _ in range(self.number_of_inputs)]

    def _weighted_sum(self, inputs):
        return sum([inputs[i] * self.weights[i] for i in range(self.number_of_inputs)])

    def _activate(self, w_sum):
        return -1 if w_sum &#x3C; 0 else 1

    def predict(self, inputs):
        w_sum = self._weighted_sum(inputs)
        return self._activate(w_sum)
</code></pre>
<p>We can instantiate it, and then get it to make predictions from inputs, but those predictions are basically a coin toss,
based on the random weights assigned at initialization.</p>
<pre><code class="language-python">n = Neuron(number_of_inputs=2)
n.predict([100, 50])
</code></pre>
<p>Running this code a dozen times should yield a pretty random split of 1s and 0s. The goal of this post is to reduce
that randomness it steer it closer and closer to a correct prediction.</p>
<h1>Forward the Foundation</h1>
<h3>Baseline</h3>
<p>To start, I created a bunch of random data for my perceptron to work with. Without doing any training, I ran it through
several attempts at classifying the data. Here are a few of its outcomes:</p>
<div style="text-align: center;">
    <img src="/assets/images/articles/initial_performance.png">
</div>
The green points are class A, the blue points are class b, and the red points were incorrectly guessed by the perceptron.
Each of these attempts differs because the random starting weights between each perceptron were different. Attemp 2 is
clearly on the right track, while the other two are very wrong. We'll start steering these weights to get closer and
closer to accurate predictions.
<h3>Step six: training</h3>
<p>Our perceptron is going to go through supervised learning to train. This means we train it on a subset of data for which
we know the expected outcome. Are the inputs above, class A or class B? Our neuron doesn't know, so we need
to tell it. It can then look at what it predicted, and what it should have predicted, and make some changes to itself to
compensate. The only adjustments this neuron can make are to its weights, and we want it to be making adjustments based
on its error. To calculate this error, we subtract the predicted value from the expected value, which gives us
these possible error values:</p>
<pre><code>| Desired | Predicted | Error |
|---------|-----------|-------|
|       1 |         1 |     0 |
|       1 |        -1 |     2 |
|      -1 |        -1 |     0 |
|      -1 |         1 |    -2 |
</code></pre>
<p>We then take these error values, multiply them by our training rate (I'll come back to this), and then multiply that value
by each input to get the values we need to steer each weight. I think this makes more sense to actually see in code, so
here goes.</p>
<h4>Adjusting weights</h4>
<p>We'll start with writing a function that will handle adjusting the weights.</p>
<pre><code class="language-python">class Neuron:
    # ...
    def _adjust_weights(self, inputs, error):
        for i in range(self.number_of_inputs):
            self.weights[i] += inputs[i] * error * self.training_rate
</code></pre>
<p>So here, we take every input, multiply it by the overall error, then by the training rate, then we add that to that
input's respective weight.</p>
<h4>Training Rate</h4>
<p>Basically, a training rate is how hard you steer the neuron. Very high training rates will have a tendency to oscillate
back and forth over the optimal weight, each time overshooting it, and trying to correct back, and subsequently
overshooting again. To solve this, we steer by only small amounts at a time. This does mean it can take longer to land
on the optimal weight. So there is a balance to how hard we are going to steer our weights, and that value will probably
differ from dataset to dataset. I have chosen, somewhat arbitrarily, a training rate of 0.1, set at the neuron's
initialization.</p>
<pre><code class="language-python">class Neuron:
    def __init__(self, number_of_inputs, weights=None, training_rate=0.1):
        self.number_of_inputs = number_of_inputs
        self.weights = weights if weights is not None else self._random_weights()
        self.training_rate = training_rate
</code></pre>
<p>This makes it easy to play around with training rates without having to change my base neuron code each time.
We have the ability to adjust our weights, now it's time to flex those neuron muscles... brains have muscles, right?
No... moving on...</p>
<h4>Put in reps</h4>
<p>We are going to make a function called, predictably, <code>train</code>. This function could work a couple of ways:</p>
<ol>
    <li>
        Take in a single set of inputs, the single expected output, make a prediction, and then adjust its weights, only doing this
        once per call.
    </li>
    <li>
        Take in our whole set of training data, loop over it on its own, and train all at once.
    </li>
</ol>
For this neuron, 
I'm choosing to go with the second option, a step training process, feeding it the training data one piece at a time.
This will make my implementation of my neuron slightly more cumbersome, since I'll have to handle the data looping 
there, but it will allow me to do some interesting things with the neuron as a whole. Read on, you'll see what I mean.
<pre><code class="language-python">class Neuron:
    # ...
    def train(self, inputs, label):
        output = self.predict(inputs)
        error = label - output
        self._adjust_weights(inputs, error)
</code></pre>
<p>Now that we can train, lets do it, and see what we get!</p>
<div style="text-align: center;">
    <img src="/assets/images/articles/trained_performance.png">
</div>
<p>As you can see here, with the same base data, and just a little bit of training, (I chose to train on only 50% of the
data, so 250 data points) we can get very close to correct, even with each attempt having very different starting weights</p>
<p>To get these graphs, I actually handled the testing outside of the neuron class, by doing a for loop of my testing
data and using the <code>predict</code> method, but we can easily have the neuron handle this.</p>
<h3>Step 7: testing</h3>
<p>While we are training one point at a time, I want to batch test, letting our neuron handle the looping.</p>
<pre><code class="language-python">class Neuron:
    # ...
    def test(self, x_test, y_test):
        error = 0
        for i, inputs in enumerate(x_test):
            if self.predict(inputs) != y_test[i]:
                error += 1
        return error / len(x_test)
</code></pre>
<p>The <code>x_test</code> here is a 2D array of all the inputs, ie: <code>x_test = [[10, 20], [75, 14], ...]</code> and <code>y_test</code> is an array
of all of the expected outputs, in the same order: <code>y_test = [1, -1, ...]</code>. So with this example: the expected output
of <code>[10, 20]</code> would be 1, and <code>[75, 14]</code> would be -1.</p>
<p>Typically, training and testing data are split out from the same set of data, training on something like 70% of the data
and testing on the unseen 30%. These percentages vary, and the process of splitting out the data is beyond the scope
of this post.</p>
<p>The error rate we return here is just the average error percentage. So if we test on 250 data points and we get 5 of them
wrong, our error rate would be 5 / 250 or 2%. I prefer to think in terms of accuracy, which would be 1 - error rate (98%
here). But it seems pretty much all of machine learning talks about reducing error, rather than increasing accuracy, so I'll stick
with that.</p>
<h3>Step Finally Done: tracking error rates</h3>
<p>We have all of our parts in place, and now we can start to tie them together in interesting ways. The one I'm implementing
is a simple way to track how training is affecting error rate. I'm calling this function <code>run</code></p>
<pre><code class="language-python">class Neuron:
    # ...
    def run(self, x, y):
        half = len(x) // 2
        x_train = x[:half]
        x_test = x[half:]
        y_train = y[:half]
        y_test = y[half:]
        error_rates = [self.test(x_test, y_test)]
        for i, inputs in enumerate(x_train):
            self.train(inputs, y_train[i])
            error_rates.append(self.test(x_test, y_test))
        return error_rates
</code></pre>
<p>This function splits our data in half. We are assuming the data is sufficiently shuffled before coming in to this function.
Data pre-processing is not really a part of building my neuron. Computerphile has a great playlist that goes in to this
kind of thing much better than I could. See that
<a href="https://www.youtube.com/playlist?list=PLzH6n4zXuckpfMu_4Ff8E7Z1behQks5ba" target="_blank">here</a>.</p>
<p>Now that we have split data, we grab see what our error rate is completely untrained. With our baseline in hand, we can
start to see how training changes our error rates as we go. These should, ideally, trend towards 0. Let's run it and
find out!</p>
<div style="text-align: center;">
    <img src="/assets/images/articles/final_run.png">
</div>
The plot on the left is the error rate over time, and the plot on the right is the final test result, after all the
training has been done. Feel free to play with all this yourself. All of the code, plus some supplementary bits can
be found [here](https://github.com/jimmykodes/neuron_blog_post).
<h2>The whole enchilada. Thanks, now I want dinner!</h2>
<p>Here it is. The whole kit and caboodle. Lock, stock, and barrel. Hook, line, and sinker. You get the idea...</p>
<pre><code class="language-python">import random

class Neuron:
    def __init__(self, number_of_inputs, weights=None, training_rate=0.1):
        self.number_of_inputs = number_of_inputs
        self.weights = weights if weights is not None else self._random_weights()
        self.training_rate = training_rate

    def _random_weights(self):
        return [random.normalvariate(0, .5) for _ in range(self.number_of_inputs)]

    def _weighted_sum(self, inputs):
        return sum([inputs[i] * self.weights[i] for i in range(self.number_of_inputs)])

    def _activate(self, w_sum):
        return -1 if w_sum &#x3C; 0 else 1

    def _adjust_weights(self, inputs, error):
        for i in range(self.number_of_inputs):
            self.weights[i] += inputs[i] * error * self.training_rate

    def predict(self, inputs):
        w_sum = self._weighted_sum(inputs)
        return self._activate(w_sum)

    def train(self, inputs, label):
        output = self.predict(inputs)
        error = label - output
        self._adjust_weights(inputs, error)

    def test(self, x_test, y_test):
        error = 0
        for i, inputs in x_test:
            if self.predict(inputs) != y_test[i]:
                error += 1
        return error / len(x_test)

    def run(self, x, y):
        half = len(x) // 2
        x_train = x[:half]
        x_test = x[half:]
        y_train = y[:half]
        y_test = y[half:]
        error_rates = [self.test(x_test, y_test)]
        for i, inputs in enumerate(x_train):
            self.train(inputs, y_train[i])
            error_rates.append(self.test(x_test, y_test))
        return error_rates
</code></pre>
<p>The core of this is just over 30 lines of code, and the whole thing is just under 50. Short, sweet, and to the point.</p>
<h2>Stay tuned!</h2>
<p>Part 3 of this series will be on a neural network, not just a perceptron.</p>

Training our Neuron. Algorithms to keep it on track.