Neural networks, Artificial Intelligence, Machine learning


If you are an experienced Python programmer and/or AI scientist, you probably don't need this page.


Otherwise: get a taste of AI. It is the skill of the near future!

AI is the buzzword, but Machine Learning is actually the better term: real intelligence and consciousness are not realistic at the moment.

But you have to start somewhere!

Follow me on this adventure. I recently started this myself and found a lot of info on the web, but mostly far too complicated.

So I collected and adapted stuff until it was simple enough to understand. It is fascinating!

I use the programming languages Python and some Java, maybe JavaScript, so it helps if you read up on those.

I hope to arouse some interest with some of you, especially as education in this field is lagging behind in Holland (some unis now actually have a numerus clausus, the ultimate stupidity). I also do this because you learn best yourself when you write something up to explain it to others.

AI (or ML, or deep learning, take your pick) is about software structures based on, or emulating, the human brain. They learn from experience, from large datasets, or even without datasets (see later).

So AI learns by itself, instead of being preprogrammed for a particular purpose by fallible humans.




The mythical NAND gate as illustration of a neuron

The NAND gate is an interesting starting point and an easy exercise for emulating a single artificial neuron. It doesn't learn yet, but it shows the internal workings.

It is also the basic building block of THE universal computer.

NAND gates are available as simple electronic integrated circuits, with 4 or more gates on a chip (you can buy them cheaply, for example the TI 7401 chips).

A NAND (NOT-AND) gate (the chip) will output a ZERO (zero volts) only if both inputs are ONE (5 volts). In all other cases it outputs a ONE.

Look it up; there are also AND, OR, XOR and NOT gates, doing other things with ones and zeros.
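A minimal sketch of those gates in Python, so you can print the truth tables yourself. The function names are my own choice for illustration, they do not come from any library; the gates are built with Python's ordinary bit operators.

```python
# Two-input logic gates on the values 0 and 1.
# The function names are illustrative, not from any library.

def and_gate(a, b):
    return a & b

def or_gate(a, b):
    return a | b

def xor_gate(a, b):
    return a ^ b

def nand_gate(a, b):
    # NOT-AND: the output is 0 only when both inputs are 1
    return 1 - (a & b)

print("a b | AND OR XOR NAND")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "| ", and_gate(a, b), " ", or_gate(a, b),
              " ", xor_gate(a, b), "  ", nand_gate(a, b))
```

Only the last row (both inputs 1) gives a 0 in the NAND column.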

responsive design image

With multiple NAND gates we can perform binary additions, compare binary numbers, and make flipflops as memory cells and thus registers.

The building blocks of a computer are memory, registers, arithmetic units (AU), comparators and counters. All of these can be constructed with NAND gates, and therefore so can a computer.

You would need a lot of them, but it's definitely doable and it has actually been done. Think of the early days of computer chips and Intel's 4004: a similar 4-bit CPU is certainly doable with NAND gates.

If you are interested, Google the TI 7401 chips: 4 NAND gates in silicon, 14-pin DIL, cheap and still generally available.

An artificial software Perceptron/neuron with two inputs.

responsive design image


This chapter covers the very basics, to get you started with the AI concepts as well as with some Python to play around with, before we get into the real thing.

A neuron (or perceptron) will take inputs and produce a particular output. This happens all the time in your brain, but here we will create the software model of a neuron/perceptron.

By applying WEIGHTS to the inputs, we can make a particular input more or less important than another; and by adding a so-called BIAS (a number) we can make the neuron easier or harder to trigger.

After multiplying the inputs by the weights and summing the results, we apply a TRIGGER (also called ACTIVATION FUNCTION) to the sum: the function that lets the neuron fire or remain quiet.

(important summary) The functions of the perceptron are thus:

- Input the x1 ... xn values (variables or array)

- Do the summing algorithm with weights w1 ... wn and bias b: w1 * x1 + w2 * x2 + ... + wn * xn + b

- Perform the trigger or activation function on the result of the sum

- Output the end result to the world

At this stage we use a simple binary activation: if the resulting sum is <= 0 the output becomes 0, else (sum > 0) the output becomes 1.

In the model in this chapter the values of the weights and bias are not preset, so you can still play around with various values!

x1 to xn are the input values, and output is the value we are trying to achieve using the weights and bias. Here we use only two inputs, x1 and x2 (use 1 or 0). Start with bias 3 and both weights -2. These values will let the neuron behave as a NAND.
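The four steps above can be sketched as one small function that works for any number of inputs. Only the NAND values (weights -2, -2 and bias 3) come from the text; the AND and OR values are my own assumption, picked by hand to show that different weights give different gates.

```python
def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, then the simple binary activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0

# (weights, bias) per gate. NAND is taken from the text;
# AND and OR are hand-picked assumptions for illustration.
GATES = {
    "NAND": ((-2, -2), 3),
    "AND": ((1, 1), -1),
    "OR": ((1, 1), 0),
}

for name, (weights, bias) in GATES.items():
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(name, x1, x2, "->", perceptron((x1, x2), weights, bias))
```

The function itself never changes; only the numbers you feed it decide which gate it is.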

In Python speak this becomes:

print("defining a perceptron")

# read the two inputs, the two weights and the bias
x1 = int(input("number x1: "))
x2 = int(input("number x2: "))
w1 = int(input("number weight 1: "))
w2 = int(input("number weight 2: "))
b = int(input("number for bias: "))

# weighted sum of the inputs plus the bias
output = (x1 * w1) + (x2 * w2) + b

# simple binary activation
if output > 0:
    output = 1
else:
    output = 0

print(output)

You can paste the above into IDLE (the Python shell/editor), save it with a .py extension in the directory where python.exe lives and run it under Python.

Better practice is to add the location of python.exe to the Path variable in the system environment variables. That allows you to run Python from any folder on your system (and you avoid cluttering up the folder that holds python.exe). I haven't done that yet. Ask me if you don't know how to set the system path.

(Alarm!) Python can get very angry if you misuse indenting; and don't forget brackets, colons etc.!

By all means try out various weights and biases; nothing gets damaged, except your ego perhaps.

Use 0 or 1 for the inputs, we are still talking binary.

Here is a screenshot of what actually happens.

responsive design image

As expected, with x1 and x2 both being 1 the result is 0; that's what a NAND gate does.

Try some other x1, x2, weight and bias input values. Just to pass the time.

Perceptron/neuron weighted (hardcoded) to function as NAND gate, defined as Python function:

This is the same program as in the previous chapter, but as we want to use the NAND gate repeatedly in the next chapters, it's useful to define it as a Python function.

A function can be called repeatedly with input parameters, instead of retyping the same code all the time.

For our purpose this particular perceptron is hard coded to work as a NAND gate, now with only x1 and x2 as variable input.

Copy the program lines below into IDLE (the handy Python shell), save them under some name you fancy with a .py extension in the folder where python.exe lives, start the Windows command shell, cd to that folder and run it.

Now do it:

def nand(x1, x2):
    print("defining a perceptron NAND as function")
    # NAND defined by weights -2 and bias 3
    w1 = -2
    w2 = -2
    b = 3
    output = (x1 * w1) + (x2 * w2) + b
    if output > 0:
        output = 1
    else:
        output = 0
    return output

print("nand is now a function")
x1 = int(input("number 1 or 0: "))
x2 = int(input("number 1 or 0: "))
print("Output NAND is: ", nand(x1, x2))

This is the result in the command shell:

responsive design image

Try the whole NAND truth table to see if it works.

Perceptron/neuron hardcoded as a Java class:

WTH, just for the fun of it I downloaded the Java SDK and tried my hand at the previous NAND routine, now as a Java class (actually two classes).

These are my first stumblings in Java, so there are very probably better ways to do things. I adapted listings from the book "Java for Dummies"; by adapting and changing existing listings you start to understand what's happening.

OOP is a totally different way of thinking for me and the program may still be a bit awkward. I am trying to wrap my brains around what the benefit is of the Java class/object approach compared with the Python function approach, or with what I wrote back in the seventies in assembler: a main program with a huge number of subroutines and interrupt handling routines.

I use Wordpad to edit the source. An IDE may be better, as it can prevent you from making some standard errors. At this stage I prefer the direct approach of a Wordpad file: if you make an error the program will not compile. Straight text input is totally unforgiving.

The program consists of two Java classes, Neuron and UseNeuron.

You must save the two Wordpad files with these names (including capitals) and the extension .java

The Neuron class defines a generic two-input, one-output logical gate (can be anything: NAND, AND, OR, whatever).

The UseNeuron class defines a NAND object, gets the input values and produces the result with a very simple activation. Try to understand it; there is a lot of tutorial info on the net.

UseNeuron also defines an AND object for illustration, which I will not use.

For Java I have set the system environment path to point at the Java executables, so I can run the Java programs from the folder where they live (a bit cleaner than what I did with Python).

So create and save your .java Wordpad files.

Then run:

javac Neuron.java

This will compile into a file Neuron.class

There is no point in running java Neuron: the class only holds data and has no main method, so Java would only complain that it cannot find one.

Then run:

javac UseNeuron.java

which compiles into a class file UseNeuron.class

and run:

java UseNeuron

which will get the x1 and x2 inputs and produce the NAND value.

The class source:

public class Neuron {

    /* This is the Java neuron definition modelled after listing 7.1;
     * it only defines the class Neuron.
     * x1 and x2 are the two inputs,
     * output is the output after activation.
     */
    double x1;
    double x2;
    double output;
}

And the UseNeuron class source:

import static java.lang.System.out;
import java.util.Scanner;

public class UseNeuron {
    /* This class is based on Java for Dummies listing 7.2 */
    /* implements object Nand neuron using class Neuron */
    public static void main(String args[]) {
        Neuron nandNeuron;
        Neuron andNeuron;

        /* only using the nand */
        nandNeuron = new Neuron();
        andNeuron = new Neuron();

        /* further study: why I can't fill nandNeuron.x1 */
        /* directly from the scanner */

        /* here obtain the input values x1 and x2 */
        Scanner scanner = new Scanner(System.in);
        out.print("Enter your x1: ");
        int x1 = scanner.nextInt();

        out.print("Enter your x2: ");
        int x2 = scanner.nextInt();

        /* here the neuron function using */
        /* fixed weights and bias */
        nandNeuron.x1 = x1;
        nandNeuron.x2 = x2;
        nandNeuron.output = (nandNeuron.x1 * -2) + (nandNeuron.x2 * -2) + 3;

        /* here the very simple activation function */
        /* just determining 0 or 1 */
        if (nandNeuron.output > 0) {
            nandNeuron.output = 1;
        } else {
            nandNeuron.output = 0;
        }

        /* and display the result */
        out.print("The NAND function: ");
        out.print(nandNeuron.x1 + " NAND " + nandNeuron.x2 + " gives: ");
        out.println(nandNeuron.output);
    }
}

The result in the command shell: two ones give a zero, as expected.

responsive design image

Try the whole NAND truth table to see if it works.

It does.

Now a program using the NAND neural function to emulate a 1 bit binary adder:

If you have seen enough of the NAND gates, you may skip to the Neural Network section, the real thing.

The previous chapter was just about one simple single NAND gate.

The next step is to implement a binary adder for two one-bit numbers plus a carry from a previous adder.

Adding two bits takes 4 NAND perceptrons (together they form an XOR) plus 1 NAND perceptron for the carry bit. Taking the carry from a previous adder into account needs a second group of 4, so the function below calls nand 9 times in total.

The output is 2 bits: 1 bit sum plus 1 bit carry.

The binary truth table for this is then:

0 + 0 = 0 carry 0 result shows as 0 0 (decimal 0)

0 + 1 = 1 carry 0 result shows as 0 1 (decimal 1)

1 + 1 = 0 carry 1 result shows as 1 0 (decimal 2)

If there was a carry from a previous stage, the end result for 1 + 1 would be 11 (3).

responsive design image

Without further ado I have defined the add1bit as a function, which calls the nand function multiple times.

The program structure is simple: a. define the nand function, b. define an add1bit function, c. (main) the code lines to input the values you want to add and run the add1bit function, which will run the nand function, then d. output the result. The lines under c. are called 'main' in other languages like C, Pascal and Java.

The program lines are copied directly from IDLE; you can copy/paste them back into IDLE or paste them directly into a Wordpad file and save with extension .py.

Save in the location where python.exe lives and run through the Windows command shell. For Linux the principle is the same.

The variables o1 up to o7, cin, cout and o refer to the NAND gates (i.e. neurons!) in the diagram, so you can see what happens.

# using the nand function 9 times to create an AU
# adding two one-bit binary numbers, carry in and carry out

def nand(a, b):
    w1 = -2
    w2 = -2
    bias = 3
    out = (a * w1) + (b * w2) + bias
    if out > 0:
        out = 1
    else:
        out = 0
    return out

def add1bit(a, b, cin):
    # o1..o4 form an XOR of a and b
    o1 = nand(a, b)
    o2 = nand(a, o1)
    o3 = nand(b, o1)
    o4 = nand(o2, o3)
    # o5..o7 and o form an XOR of that result and the carry in
    o5 = nand(cin, o4)
    o6 = nand(cin, o5)
    o7 = nand(o4, o5)
    o = nand(o6, o7)
    # carry out
    cout = nand(o1, o5)
    return cout, o

# input binary numbers to be added
print("Enter two 1-bit binary numbers")
a0 = int(input("number 1 or 0: "))
b0 = int(input("number 1 or 0: "))
cin = int(input("carry in, number 1 or 0: "))

# now add 'm up
result = add1bit(a0, b0, cin)
print("Result now is carry - sum: ", result)
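Instead of typing in all eight input combinations by hand, you can let Python check the adder. This is a small sketch that repeats the nand and add1bit functions (without the input prompts) and compares every result with ordinary addition; the check loop is my own addition.

```python
from itertools import product

def nand(a, b):
    # perceptron with weights -2, -2 and bias 3
    return 1 if (a * -2) + (b * -2) + 3 > 0 else 0

def add1bit(a, b, cin):
    o1 = nand(a, b)
    o2 = nand(a, o1)
    o3 = nand(b, o1)
    o4 = nand(o2, o3)
    o5 = nand(cin, o4)
    o6 = nand(cin, o5)
    o7 = nand(o4, o5)
    o = nand(o6, o7)
    cout = nand(o1, o5)
    return cout, o

# try all 8 combinations of a, b and carry in
for a, b, cin in product((0, 1), repeat=3):
    cout, s = add1bit(a, b, cin)
    # carry counts for 2, sum for 1: together they must equal a + b + cin
    ok = (2 * cout + s) == (a + b + cin)
    print(a, "+", b, "+ carry", cin, "-> carry", cout, "sum", s, "OK" if ok else "WRONG")
```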

Here the screendump of the process

responsive design image

Adding two two-bit binary numbers

I got carried away slightly, so skip to the neural network chapters if you have had enough.

The diagram below is actually a 4-bit adder. As we are doing this as an exercise, we only implement an adder for two two-bit numbers in Python. Doing a full 4-bit adder is just more of the same.

responsive design image

This is the embryonic beginning of a real computer.

An AU (arithmetic unit) is used for several purposes, such as adding numbers or functioning as the program counter (which points to the address of the next instruction to be fetched from memory and executed).

Once this is done, adding 4 or 8 bit numbers is just more of the same (won't do that here, promise).

The truth table for addition of two two-bit numbers is:

00 + 00 = 00

01 + 00 = 01

01 + 01 = 10

10 + 01 = 11

10 + 10 = 00 plus carry 1 (100)

11 + 01 = 00 plus carry 1 (100)

11 + 10 = 01 plus carry 1 (101)

11 + 11 = 10 plus carry 1 (110)

Output is 2 bits plus carry

Values change when the input carry of a previous stage is true. We use the nand function defined above again as building block.

Here in Python speak:

# a multi bit binary number adder using the nand function as basic block
# each one bit plus carry adder is used as a function
# numbers represented by a1 a0 and b1 b0, c is carry

def nand(a, b):
    w1 = -2
    w2 = -2
    bias = 3
    out = (a * w1) + (b * w2) + bias
    if out > 0:
        out = 1
    else:
        out = 0
    return out

def add1bit(a, b, cin):
    o1 = nand(a, b)
    o2 = nand(a, o1)
    o3 = nand(b, o1)
    o4 = nand(o2, o3)
    o5 = nand(cin, o4)
    o6 = nand(cin, o5)
    o7 = nand(o4, o5)
    o = nand(o6, o7)
    cout = nand(o1, o5)
    return cout, o

def add2bits(a1, a0, b1, b0, cin):
    # add the low bits first; their carry out is the carry in for the high bits
    cout, o0 = add1bit(a0, b0, cin)
    cin = cout
    cout, o1 = add1bit(a1, b1, cin)
    return o1, o0, cout

# input binary numbers to be added
print("Enter two 2-bit binary numbers")
a1 = int(input("a1 number 1 or 0: "))
a0 = int(input("a0 number 1 or 0: "))
b1 = int(input("b1 number 1 or 0: "))
b0 = int(input("b0 number 1 or 0: "))
cin = int(input("carry in, number 1 or 0: "))

# now add 'm up
bit1, bit0, carry = add2bits(a1, a0, b1, b0, cin)
print("carry: -", carry, "bit1: -", bit1, "bit0: -", bit0)

Below you see what happens in the command shell.

The result shows binary 111, which is decimal 7, as we added 11 (3) + 11 (3) + 1 = 111 (7)
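For those who do want to see "more of the same": a sketch of an adder for any number of bits, made by chaining add1bit so that each carry out feeds the next carry in. The function name add_bits and the choice to pass the numbers as lists of bits (most significant bit first) are my own assumptions, not part of the programs above.

```python
def nand(a, b):
    # perceptron with weights -2, -2 and bias 3
    return 1 if (a * -2) + (b * -2) + 3 > 0 else 0

def add1bit(a, b, cin):
    o1 = nand(a, b)
    o4 = nand(nand(a, o1), nand(b, o1))        # a XOR b
    o5 = nand(cin, o4)
    o = nand(nand(cin, o5), nand(o4, o5))      # (a XOR b) XOR cin
    cout = nand(o1, o5)
    return cout, o

def add_bits(a_bits, b_bits, cin=0):
    """Add two equally long bit lists, most significant bit first.
    Returns (carry out, list of sum bits)."""
    carry = cin
    result = []
    # start at the least significant bit and let the carry ripple upwards
    for a, b in zip(reversed(a_bits), reversed(b_bits)):
        carry, s = add1bit(a, b, carry)
        result.insert(0, s)
    return carry, result

print(add_bits([1, 1], [1, 1], 1))              # 3 + 3 + 1
print(add_bits([0, 1, 1, 0], [0, 1, 0, 1]))     # 6 + 5
```

The first call repeats the 11 + 11 + 1 example from the text; the second adds two 4-bit numbers without any extra code.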

responsive design image

A flipflop memory latch from nandgates - utter madness

Just for the sake of utter madness you could attempt to create a one-bit memory latch (or flipflop) with neurons.

responsive design image

Obviously totally bonkers, as we use an Intel i7 desktop, Windows 10 and Python to emulate a single $1 chip in order to create a one-bit memory.

I will skip this for now as we are ready to move to more intelligent neurons.

Perhaps nice as an exercise in Python programming sometime, as the flipflop has an issue: it's clocked.

Read the outputs in the diagram as: red LED is 1 and green LED is 0. The NOT gate is easy: use the NAND gate and short-circuit the inputs (works in Python as well).
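A rough sketch of both ideas in Python. The NOT gate is the nand function with both inputs tied together. The latch is the simplest unclocked variant I know (two cross-coupled NAND gates with active-low set and reset inputs), which is an assumption on my part and not necessarily the clocked circuit in the diagram. The names not_gate and sr_latch are made up for this example.

```python
def nand(a, b):
    # perceptron with weights -2, -2 and bias 3
    return 1 if (a * -2) + (b * -2) + 3 > 0 else 0

def not_gate(a):
    # "short-circuit the inputs": feed the same value to both inputs
    return nand(a, a)

def sr_latch(set_n, reset_n, q, q_bar):
    """Two cross-coupled NAND gates. set_n and reset_n are active-low:
    0 means 'do it', 1 means 'leave alone'. q and q_bar are the stored state."""
    for _ in range(4):              # a few passes so the feedback can settle
        q = nand(set_n, q_bar)
        q_bar = nand(reset_n, q)
    return q, q_bar

print("NOT 0 =", not_gate(0), " NOT 1 =", not_gate(1))

q, q_bar = 0, 1                          # start with a stored 0
q, q_bar = sr_latch(0, 1, q, q_bar)      # set
print("after set:  ", q, q_bar)
q, q_bar = sr_latch(1, 1, q, q_bar)      # hold: the bit is remembered
print("after hold: ", q, q_bar)
q, q_bar = sr_latch(1, 0, q, q_bar)      # reset
print("after reset:", q, q_bar)
```

The "hold" step is the memory: with both inputs at 1 the latch keeps whatever it stored last.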

Sigmoid neurons, first derivative and matrix multiplication - don't panic.


Some mathematical shit here which is good to have heard of.

The issue with the activation function so far is that it flips abruptly between 0 and 1 and could become unstable.

The difference between perceptrons and sigmoid neurons is that sigmoid neurons don't just output 1 or 0. They can output any real number between 0 and 1, so values such as 0.486654 and 0.9988321 are legitimate outputs, as we will see later.

The sigmoid function is easy to write in Python with the numpy toolset.

In Python speak: if you use the 'as' option you can refer to numpy as 'np', a lot shorter. Like this:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

I won't go deep into Euler's number and the like. Look it up; there is lots of stuff on the internet.

In order to play around with AI you only have to know how to import numpy and how to call the function.

Rather than a very steep transition between 0 and 1, the sigmoid gives a sliding (non-linear) scale of numbers, for example:

responsive design image
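To see the sliding scale in numbers rather than in a picture, here is a small sketch that prints the sigmoid for a handful of inputs. The input values are arbitrary picks of mine.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# arbitrary sample inputs, from clearly negative to clearly positive
for z in (-6, -2, -1, 0, 1, 2, 6):
    print(z, "->", round(float(sigmoid(z)), 4))
```

An input of 0 gives exactly 0.5; large negative inputs creep towards 0 and large positive inputs creep towards 1, without ever reaching them.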


One more important concept you have to be aware of is the derivative (in Dutch: afgeleide). Sorry about that.

You only have to be aware of what its use is; understanding is another matter.

The first derivative of a function gives the slope of the tangent line at a particular point of the original function. If the derivative is negative the curve goes down at that point (reading from left to right), otherwise it goes up. Its size tells you how steep the curve is at that particular point of the graph.

It is used to check whether we should adjust parameters forward or backward when training a neural network. This is an oversimplification, so by all means check on the web.
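For the sigmoid the derivative has a handy shortcut: if s is the sigmoid output, the slope at that point is s * (1 - s). The sketch below compares that shortcut with a brute-force slope measured between two points very close together. The helper names sigmoid_slope and numeric_slope are my own.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_slope(z):
    # shortcut: the derivative of the sigmoid expressed in its own output
    s = sigmoid(z)
    return s * (1 - s)

def numeric_slope(f, z, h=1e-5):
    # brute force: rise over run between two points very close to z
    return (f(z + h) - f(z - h)) / (2 * h)

for z in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(z, "shortcut:", round(float(sigmoid_slope(z)), 6),
          "measured:", round(float(numeric_slope(sigmoid, z)), 6))
```

The slope is largest in the middle (0.25 at z = 0) and nearly flat at the ends, which is why a neuron that is already very sure of itself barely changes during training.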

Matrices and multiplying matrices

In the NAND gate examples I coded each neuron as a function.

In neural networks a more efficient way is to use matrices and matrix multiplication.

Neurons and synapses (the weights) in a network are represented by matrices. For our purpose here matrices are stored in numpy arrays, and the multiplication etc. is done in one fell swoop for all neurons using numpy dot multiplication of arrays (= matrices). There is a load of stuff on the internet on the subject, but here is a small summary so you can understand the code lines that follow.


array1 = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])

Matrix 4 x 3, ie 4 rows 3 columns





Matrix 3 x 1 ie 3 rows 1 column

When we multiply these two matrices the result will be array3;






Multiply gives matrix 4 x 1 ie 4 rows 1 column

The rule in general is :

·         The number of columns of the 1st matrix must equal the number of rows of the 2nd matrix.

·         And the result will have the same number of rows as the 1st matrix, and the same number of columns as the 2nd matrix.
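A small sketch of the rule with numpy. The 4 x 3 matrix is the one from the text; the values in the 3 x 1 matrix are made up by me purely for illustration.

```python
import numpy as np

# 4 rows x 3 columns (from the text)
array1 = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])

# 3 rows x 1 column (hypothetical values, chosen only for this example)
array2 = np.array([[2],
                   [3],
                   [5]])

# columns of array1 (3) equal rows of array2 (3), so this works
array3 = np.dot(array1, array2)

print(array1.shape, "times", array2.shape, "gives", array3.shape)
print(array3)
```

Each row of the result is the weighted sum for one row of array1, for example the second row is 0*2 + 1*3 + 1*5 = 8. Swapping the order, np.dot(array2, array1), fails because 1 column does not match 4 rows.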

Don't worry too much. It's nice if you can understand it, but numpy will solve the whole thing for you, and fast.

A 2 layer artificial neural network (ANN) to play with

I first have to apologize to all those clever people (PhDs, professors and the like) who have also published on this subject. I've reused (uh, borrowed?) ideas and Python solutions and, after a good deal of mixing and shuffling, came up with the easy examples below.

I hope I added some value by placing it in the context and sequence of this, uhh, tutorial and making it accessible to morons like myself.

What I also noticed is that there is a naming convention issue here: you will see a 2-layer neural network also referred to as a single layer. I see a network as multiple connected things, so a minimum useful network will consist of TWO LAYERS OF NEURONS.

responsive design image

On the left is a TWO LAYER network, consisting of neurons forming the input layer and, in this case, a single neuron as OUTPUT LAYER that takes the synapses from the input layer and does its thing with those.

On the right is a three layer network, again with an input layer and an output layer, but in between a so-called hidden layer that does its thing with the input layer synapses.

A THREE layer network can do cleverer things than a two layer network. Remember, each neuron consists of the functions described above.

OK, at this stage I went back to my Python 2-layer neural network to clean it up for publication here.

In the program below we will work with a simple two layer network, see the picture:

responsive design image

The program will do the following, in line with the picture:

- A triple-neuron input layer with values

- Synapses (weights) as an array between the input neurons and the output neuron

- The summing algorithm for the synapses, followed by the sigmoid activation, generates layer 1, which is the output

- The output is compared with a separate target array, and the difference (the error) is used to change the weights

- The cycle is repeated a large number of times; then the last layer 1 array is printed, which will closely match the target (we hope)

Here is the program. Have a look, and mind the indents when you copy it.

import numpy as np

# sigmoid function - deriv flag indicates to yield sigmoid or its derivative
# (for the derivative, x must already be a sigmoid output)
def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    else:
        return 1 / (1 + np.exp(-x))

# input dataset
x = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])

# output dataset that we try to achieve
y = np.array([[0,0,1,1]]).T

# seed random numbers to make calculation deterministic
np.random.seed(1)

# initialize weights randomly with mean 0
syn0 = 2 * np.random.random((3,1)) - 1

# set number of training iterations
for i in range(10000):
    # forward propagation, do the sums with the weights
    # (the third input column is always 1, so its weight acts as the bias)
    l0 = x
    l1 = nonlin(np.dot(l0, syn0))
    # compare result with wanted output
    l1_error = y - l1
    # multiply how much we missed by the
    # slope = derivative of the sigmoid at the values in l1
    l1_delta = l1_error * nonlin(l1, True)
    # update weights
    syn0 += np.dot(l0.T, l1_delta)

print("Output After Training:")
print(l1)
print("The actual synapse/weights for this result: ")
print(syn0)

So run the program in the usual way from the command shell. After some thinking (10000 iterations!) the result will show the following.

Remember the output we wanted the program to learn was [ 0, 0, 1, 1]

The result is [0.009], [0.007], [0.993], [0.992], and this result was obtained with the synapse containing the weights:

[9.672], [-0.207], [-4.629] give or take a few more digits.

So what has happened? The program has taught itself to recognize the pattern of bits in the input x array as 0011. So if the input x was a picture (bit pattern) of a three, the program would now have learned to recognize this as 0011, binary 3. (Strictly speaking the network produces one output per input row, and in this dataset the target simply equals the first bit of each row, which is why the first weight comes out so large.)

So2: if we now take the synapse (weights) and use them in a much simpler non-learning program to read the same x array, it should yield the same output.

So3: what we have achieved is a (very simple) neural network which recognizes this pattern as a 3.

Think about that. That's amazing. At least, the first time I did this I found it amazing.

responsive design image

I will demonstrate with the following tiny program (the non-learning version of the previous one) that this particular x array, combined with these weights, will indeed be recognised as representing a three.

Note that the actual result is not precisely 0011: the zeros come out as very small numbers, while the ones are distinctly much larger.

You will find that when screwing around with the input array x and the target y, the result sometimes becomes unstable; a lot of combinations, however, are very clear and consistent.

Here is the little program that takes the input array x and applies the weights we found above, in order to demonstrate that without further learning the result will be the same 0011.

print("This program uses the weights of a training cycle;")
print("then applies these to the input matrix;")
print("which should then result in the target output dataset")

import numpy as np

# sigmoid function or derivative
def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    else:
        return 1 / (1 + np.exp(-x))

# input dataset matrix 4x3
x = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])

# output dataset 4x1 that was learned
y = np.array([[0,0,1,1]]).T

# initialize weights with the learned weights
syn0 = np.array([[9.67299303], [-0.2078435], [-4.62963669]])

# forward propagation, generate output layer l1
l0 = x
l1 = nonlin(np.dot(l0, syn0))

print("input dataset x")
print(x)
print("target y was")
print(y)
print("Output Layer After Training:")
print(l1)
print("final synapse, for this x/y combination")
print(syn0)

Load and run in the same way as the previous examples. The output layer l1 clearly represents 0011.

Sufficient stuff here to play around with. For example, try to shorten the weights to e.g. -4.62 and so on. That will work just fine.

Test by changing the input array; this will result in a non-match!

The screendump of the command shell now will look as follows:

responsive design image
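A sketch of that last test: the learned weights (rounded to three decimals) applied to a few input rows the network never saw during training. The three unseen rows are made up by me, and the helper name predict is hypothetical.

```python
import math

# learned weights from the training run, rounded to three decimals
weights = [9.672, -0.207, -4.629]

def predict(row):
    # weighted sum followed by the sigmoid, for one input row
    z = sum(x * w for x, w in zip(row, weights))
    return 1 / (1 + math.exp(-z))

print("rows from the training set:")
for row in ([0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]):
    print(row, "->", round(predict(row), 3))

print("rows the network has never seen (my own picks):")
for row in ([0, 0, 0], [0, 1, 0], [1, 0, 0]):
    print(row, "->", round(predict(row), 3))
```

The training rows still give a clear 0 0 1 1 with the shortened weights. Of the unseen rows, [1, 0, 0] is still a confident 1, but [0, 0, 0] and [0, 1, 0] land near 0.5: the network has no clear opinion, which is the non-match.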

A 3-layer artificial neural network (ANN), going deeper

We have done the simple stuff, which should give you an idea of what a neuron can do and some background on the math used.

We've seen that the two layer network is represented by l0 as input layer, syn0 as the weights, and l1 as output.

So basically if we add the following:

syn1 and l2

we would have a 3-layer network with l0 as input layer, l1 as the so-called hidden layer and l2 as output.

A 3 layer network can solve more complex things because of the extra step in between.

First let's look at the picture (I like pictures). Note that I have only drawn 4 neurons in the hidden layer and 4 in the output. The program will use 32 neurons in the hidden layer and a single output neuron, which gives one output value per input row. The principle is the same.

responsive design image

Program-wise the expansion from a two layer to a three layer network is not extremely difficult if we understand the 2 layer version. It's about adding the extra arrays and the syn1 handling.

The program as shown below will have the following matrices during its runtime:

(Matrix is shown as row x column)

1. Input l0 (= X): 6 x 5 matrix

2. syn0: 5 x 32 (32 is hiddenSize in the program)

3. Layer l1 is the result of (l0 . syn0): 6 x 32

4. syn1: 32 x 1

5. Output l2 is the result of (l1 . syn1): 6 x 1
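You can let numpy confirm this list of shapes before training anything. The sketch below uses matrices filled with zeros, because only the shapes matter here; the sigmoid is left out since it does not change the shape of a matrix. The variable name hidden_size is my own spelling of hiddenSize.

```python
import numpy as np

hidden_size = 32

l0 = np.zeros((6, 5))                 # input: 6 rows x 5 columns
syn0 = np.zeros((5, hidden_size))     # weights input -> hidden
syn1 = np.zeros((hidden_size, 1))     # weights hidden -> output

l1 = np.dot(l0, syn0)                 # hidden layer
l2 = np.dot(l1, syn1)                 # output layer

for name, matrix in (("l0", l0), ("syn0", syn0), ("l1", l1),
                     ("syn1", syn1), ("l2", l2)):
    print(name, matrix.shape)
```

Change hidden_size to 64 and only the middle three shapes change; the input and output stay 6 x 5 and 6 x 1.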

Once you understand this and have got it to run, it's not very difficult to use a larger input matrix, e.g. 12 x 10, and see what happens.

But first the program with a 6 by 5 matrix:

# 3 Layer Neural Network:
# with variable hidden layer size
import numpy as np

hiddenSize = 32

# sigmoid and derivative function
def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    else:
        return 1 / (1 + np.exp(-x))

# input dataset format 6 rows x 5 columns
X = np.array([
    [0,1,1,0,0],
    [1,0,0,1,0],
    [0,0,1,0,0],
    [0,1,0,0,0],
    [1,0,0,0,0],
    [1,1,1,1,1]])

# output dataset 6 rows x 1 column
y = np.array([[0],[0],[0],[0],[1],[0]])

# seed random numbers
np.random.seed(1)

# randomly initialize our weights with mean 0
syn0 = 2 * np.random.random((5, hiddenSize)) - 1
syn1 = 2 * np.random.random((hiddenSize, 1)) - 1

# now learn
for j in range(60000):
    # Feed forward through layers 0, 1, and 2
    l0 = X
    l1 = nonlin(np.dot(l0, syn0))
    l2 = nonlin(np.dot(l1, syn1))
    # how much did we miss the target value?
    l2_error = y - l2
    # if (j % 10000) == 0:
    #     print("Error:" + str(np.mean(np.abs(l2_error))))
    # in what direction is the target value?
    l2_delta = l2_error * nonlin(l2, deriv=True)
    # how much did each l1 value contribute to the l2 error (according to the weights)?
    l1_error = l2_delta.dot(syn1.T)
    # in what direction is the target l1?
    # were we really sure? if so, don't change too much.
    l1_delta = l1_error * nonlin(l1, deriv=True)
    # update both sets of weights
    syn1 += l1.T.dot(l2_delta)
    syn0 += l0.T.dot(l1_delta)

# to use the learned weights separately we need to save syn0 and syn1
print("input matrix is: ")
print(X)
print("output after training: ")
print(l2)
print("output needed was")
print(y)
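The comment about saving the weights can be sketched with numpy's own savez and load functions. The random arrays below are only stand-ins with the right shapes (5 x 32 and 32 x 1), not real trained weights, and the file name weights.npz in a temporary folder is my own choice.

```python
import os
import tempfile
import numpy as np

# stand-ins with the same shapes as the trained syn0 and syn1
rng = np.random.default_rng(1)
syn0 = 2 * rng.random((5, 32)) - 1
syn1 = 2 * rng.random((32, 1)) - 1

# write both arrays into one file (hypothetical name, temporary folder)
path = os.path.join(tempfile.mkdtemp(), "weights.npz")
np.savez(path, syn0=syn0, syn1=syn1)

# later, in the non-learning program, read them back by name
loaded = np.load(path)
syn0_loaded = loaded["syn0"]
syn1_loaded = loaded["syn1"]

print(syn0_loaded.shape, syn1_loaded.shape)
```

In the training program you would call np.savez after the loop; the separate non-learning program then only needs the np.load part and the forward pass.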

The input matrix vaguely resembles a digit 2 (as far as possible with a 6x5 matrix). The output y to be achieved I have set as 000010, which would be binary 2; it can also be seen as the second bit from the right (I will come back to that issue later).

Now all the program has to do is to recognize the input matrix as a 2; let's see what the command shell says:

Hm .....

responsive design image

These numbers may seem a bit mysterious, but they are in scientific notation.

This is m times 10 to the power of n. If n is a negative number, you actually divide: m e-03 means m divided by 10 to the power of 3.

So the larger the number after the minus sign, the smaller the value, QED.

One number is clearly a lot larger than the others, as its exponent is e-01; the second largest is e-03.

That's a lot smaller.
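If the e-notation stays mysterious, let Python translate it. A tiny sketch; the three sample values are made up by me and are not the actual program output.

```python
# made-up sample values in scientific notation
values = [9.9e-01, 3.2e-03, 4.5e-05]

for v in values:
    # same number twice: once in e-notation, once written out in full
    print(f"{v:.1e} = {v:.5f}")

# e-03 really means "divided by 10 to the power of 3"
print(1e-03 == 1 / 10**3)

# picking the winner is simply taking the largest value
print("largest:", max(values), "at position", values.index(max(values)))
```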

So yes, eureka, indeed: the program has recognized the input quasi-picture 2 and linked it to the number 2.

Now play around with different input matrices and output matrices and see where the program will fail. I would set the number of iterations at 60,000 or less (try); fewer is a lot faster.

You can also play with hiddenSize and see where the program dies.

The following screendump I did with a digit 6 (of sorts) and two different ways of output: the first one where the 6th bit corresponds with the input number 6, the second one with binary six (110) corresponding with the input number.

In general the method with one bit works best. You will also see that the quality of the input determines whether the output is correct; try it.
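The two ways of writing the wanted output can be sketched as two small functions. Both names are made up for this example. For the one-bit method I assume positions are counted from the left, with digit 0 on the last (10th) position; that counting direction is my assumption.

```python
def one_bit_target(digit, size=10):
    """One position per digit: digits 1..9 use position 1..9,
    digit 0 uses the last position (assumed counting from the left)."""
    position = digit if digit != 0 else size
    return [1 if i == position else 0 for i in range(1, size + 1)]

def binary_target(digit, bits=4):
    """The digit written as a binary number, most significant bit first."""
    return [int(c) for c in format(digit, f"0{bits}b")]

for digit in (6, 3, 0):
    print(digit, "one bit:", one_bit_target(digit), "binary:", binary_target(digit))
```

With the one-bit method every digit has its own output, so the network only has to make one value large. With the binary method several outputs must be right at the same time, which may be part of why it is less reliable.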

In the case of this input matrix both versions yielded a correct output; it needs more study to find out why, and whether this can be made consistent.

I will make a version with input matrix 12 x 10 and output 10 x 1, and do the full set of digits from 0 to 9 with a single-bit-out-of-10 output, where the 0 (zero) should give bit 10. Probably it's best to raise hiddenSize to 64. We'll see.

responsive design image

Next come the results of a larger matrix representing digits 0 to 9. Is it consistent?


Below are the results of the same program, but with a larger input array of 10 by 12. It is still a toy application, so it's probably not very scalable to very large arrays, for example those that would represent an image with greyscale pixel values. The output is 10 wide, where each position represents a number, so from 1, 2, 3, 4 ... to 9, 0.

responsive design image

Actually the three is mirrored. That shows we can teach the ANN to see the mirrored three as a ..... 3.

responsive design image

I won't place all the runs on this page. I did do some more runs, however, and for now it's performing fairly consistently with this matrix size.

The next step is to get a more powerful routine and try it with the image databases that are freely available on the net.

But first, I am fascinated by the next subject: reinforcement learning.

Reinforcement learning with BB8 - into robotics - failed utterly!

I decided to take you (and myself) on a more ambitious adventure. The plan was to take BB8, which can be programmed in JavaScript, and adapt the Q-learning routines I found so that BB8 will learn by itself to find a path through a room or maze, with no previous information.

This is different from the previous neural network exercises, which learned from examples and a required output. This is also a bit harder, a large number of bits harder, megabits actually.
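To give an idea of what Q-learning does, here is a toy sketch in Python. It has nothing to do with BB8 itself: the "room" is just a corridor of 5 cells, the learner starts in cell 0 and is rewarded only when it reaches cell 4. The corridor, the reward and all the numbers (learning rate, discount, exploration rate) are assumptions I made for illustration; they are not the routines mentioned above.

```python
import random

N_STATES = 5                    # cells 0..4, the goal is cell 4
ACTIONS = (0, 1)                # 0 = step left, 1 = step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration

def step(state, action):
    """Move one cell (walls at both ends). Reward 1 only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # one value per cell and action
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # explore now and then (or when undecided), otherwise take the best action
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            nxt, reward = step(state, action)
            # Q-learning update: nudge the value towards reward + discounted best next value
            q[state][action] += ALPHA * (reward + GAMMA * max(q[nxt]) - q[state][action])
            state = nxt
    return q

q_table = train()
policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print("learned action per cell (1 = right):", policy)
```

Nobody tells the program that "right" is the way to go; it only ever gets a reward at the goal, and the values trickle back from there to the start cell.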

After some research I abandoned the BB8 path. Its sensors and accuracy make this too complex. BB8 is just a charming toy, let's accept that.

responsive design image

It would have been nice.

The PROBLEM with BB8:

A first trial resulted in failure. The collision detection is not consistent. BB8 has to hit a wall really hard to trigger the function. That's no good; I need it to go at very low speed and still detect the difference between walls and doors, i.e. bump and no bump. The sensitivity cannot be adjusted.

The other parameters can be helpful but they are not very accurate. The velocity feedback etc. is based on what the motors know, rather than the actual velocity of the thing.

Actually I would need more sensors (vision/infrared, ultrasound or something like that). My aim still is to get a real but simple droid with more senses. I would have preferred not to have to buy one, as I have BB8.

Orbotix have not been very helpful in providing info, so it is too much effort to get this going.

Pausing for a while - time passes 7/3/2019

Time flies. I have gone through a divorce and moved house, so that kept me somewhat busy and distracted.

We've done a very simple 2-layer and 3-layer network, which can already learn to recognize simple input arrays and give them a meaning. BY ALL MEANS PLAY AROUND WITH IT, CHANGE ALL PARAMETERS (ALWAYS PROTECT YOUR WAY BACK WHEN IT GETS MESSY).

As I gradually find my bearings, I still mean to continue with reinforcement learning, preferably using a small robot in real space based on Arduino, Raspberry Pi or similar.

I won't get productive again before Q1 2019; too much to do.

If you are interested, look at the OpenWorm project (Google it); it is close to what I wanted to do, but as a joint open source effort.

I forgot to say that below is a pic of a (1972) 7401 chip on an experimenter board that I found in a box of aged stuff. Still works!

responsive design image

