Machine Learning, Perceptrons and Ruby

Machine learning (ML) refers to a collection of programming techniques that allow computers to make distinctions or recognize patterns without explicit instructions. The field is built on statistical methods and emerged from artificial intelligence research in the late 1950s and early 1960s. Applications of ML include optical character recognition, sentiment analysis, computer vision and prediction making. People with ML experience are highly sought after in the job market, and learning-based algorithms are making more and more important decisions in our society. So as an emerging programmer, it's probably worthwhile to learn a bit about how machines learn.

Use conventional code if you can articulate a concrete series of actions that would produce the desired functionality.
Should I use Machine Learning? (Source: Learning Machines)


As an introduction to ML, this post will walk through how to build a single-layer perceptron in Ruby. The perceptron was one of the first functional ML algorithms. It was developed by Frank Rosenblatt in 1957 and was used to build a machine that could identify certain objects. At the time, Rosenblatt described the perceptron as “the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.”

I am far from an expert in this field, but luckily perceptrons are relatively straightforward models to build. I have seen them written in Python, Java, and JavaScript but had a hard time finding a Ruby version. Attempting to build one in Ruby seemed like a decent contribution that I could make.

Using a common biological analogy, a perceptron is like a single neuron in a larger brain. It is designed to take in inputs and, based on those inputs, generate an output for other neurons.


Diagram of a neuron
Neurons (Source: Nature of Code)


A diagram of a single layer perceptron
Perceptron (Source: Nature of Code)
A diagram that illustrates how a perceptron can be useful
Example use case (Source: Learning Machines)

The basic procedure that a perceptron follows is:

  1. Take in input values, such as X, Y, Z coordinates
  2. Multiply these values by their respective “weights”. (The appropriate values for these weights are what is being “learned”; they are typically random when initialized.)
  3. Sum all the weighted values
  4. Pass this sum into a function that returns either a 1 (true) or 0 (false)

The perceptron first goes through a training process where data with a known “correct” output is passed through. As the training data is passed through, the weights are adjusted until the inputs produce the correct outputs with a reasonably high success rate. Following this training process, the perceptron can take in new data without a known output and predict the correct output.
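Before diving into the full program, here is a minimal sketch of the training rule described above (the names `train_step` and the AND-function example are my own, not from the sources I followed). Each pass compares the prediction to the known answer and nudges every weight by the learning rate times the error times that weight's input:

```ruby
# Step function: fire (1.0) if the weighted sum crosses the threshold of 0.
def predict(inputs, weights)
  sum = weights.zip(inputs).sum { |w, x| w * x }
  sum >= 0.0 ? 1.0 : 0.0
end

# One training step: nudge each weight toward the correct output.
# error is -1.0, 0.0, or 1.0, so weights only move when the prediction is wrong.
def train_step(inputs, expected, weights, l_rate)
  error = expected - predict(inputs, weights)
  weights.each_index { |j| weights[j] += l_rate * error * inputs[j] }
end

# Example: learn the AND function (the first input is the bias, always 1.0).
weights = [0.0, 0.0, 0.0]
data = [[[1.0, 0.0, 0.0], 0.0], [[1.0, 0.0, 1.0], 0.0],
        [[1.0, 1.0, 0.0], 0.0], [[1.0, 1.0, 1.0], 1.0]]
20.times { data.each { |inputs, expected| train_step(inputs, expected, weights, 0.5) } }
puts data.all? { |inputs, expected| predict(inputs, weights) == expected }  # => true
```

AND is linearly separable, so the weights settle on a correct answer within a few epochs.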


To make my own perceptron I followed along with this video, which builds one in Python, and translated it to Ruby as best I could.

The code:

# Ruby translation of the original Python code

def main
  #       [bias,    x,    y, expected output]
  data = [[1.00, 0.08, 0.72, 1.0],
          [1.00, 0.10, 1.00, 0.0],
          [1.00, 0.26, 0.58, 1.0],
          [1.00, 0.35, 0.95, 0.0],
          [1.00, 0.45, 0.15, 1.0],
          [1.00, 0.60, 0.30, 1.0],
          [1.00, 0.70, 0.65, 0.0],
          [1.00, 0.92, 0.45, 0.0],
          [1.00, 0.93, 0.46, 0.0],
          [1.00, 0.09, 0.72, 1.0],
          [1.00, 0.10, 0.73, 1.0]]
  # initial weights
  weights = [0.20, 1.00, -1.00]

  # 50 epochs, learning rate of 0.5, no plot, stop early, not verbose
  train_weights(data, weights, 50, 0.5, false, true, false)
end

# pass the weighted sum of the inputs through a step function
def predict(inputs, weights)
  threshold = 0.0
  total_activation = 0.0
  weights.zip(inputs).each do |weight, input|
    total_activation += input * weight
  end
  total_activation >= threshold ? 1.0 : 0.0
end

def accuracy(matrix, weights)
  preds = []
  num_correct = 0.0
  i = 0
  while i < matrix.size
    # get prediction (every column except the expected output)
    pred = predict(matrix[i][0...-1], weights)
    preds << pred
    # check accuracy of prediction
    num_correct += 1.0 if pred == matrix[i][-1]
    i += 1
  end
  puts "Predictions: #{preds}"
  # return overall prediction accuracy
  num_correct / matrix.size
end

# train the perceptron on the data from the matrix;
# the trained weights are returned at the end of this function
def train_weights(matrix, weights, nb_epoch = 10, l_rate = 1.0, do_plot = false, stop_early = true, verbose = true)

  # iterate over the number of epochs
  nb_epoch.times do
    # calculate accuracy
    cur_acc = accuracy(matrix, weights)
    puts "Weights: #{weights}"
    puts "Accuracy: #{cur_acc}"

    # check if we are done
    break if cur_acc == 1.0 && stop_early

    # iterate over the training data
    i = 0
    while i < matrix.size
      prediction = predict(matrix[i][0...-1], weights)
      error = matrix[i][-1] - prediction

      puts "Training on data index #{i}" if verbose

      # iterate over each weight and nudge it toward the correct output
      j = 0
      while j < weights.size
        puts "\tWeight #{j}" if verbose
        weights[j] += l_rate * error * matrix[i][j]
        j += 1
      end

      i += 1
    end
  end
  weights
end

main


At this point the code could certainly be improved, but it works on some level. When I first tested it I saw the weights being slowly adjusted but never saw the accuracy change, and it predicted a single output for all input data, which is not how it's supposed to work. The culprits turned out to be an accuracy method that never returned its result and a training loop that never actually applied the weight update. I may also take some time to refactor the while loops, since I converted them from Python in a very simplistic way.

Terminal output from the ruby perceptron
My Terminal output from the ruby perceptron


Two lines demonstrate the correct and predicted classification of each point on a grid
As more data is added the accuracy improves (Source: Nature of Code)

Now, let's say I got this thing working perfectly. It seems relatively simple for a “Machine Learning Algorithm”, right? Well, it is. It turns out that a perceptron is about as powerful as an IF statement. The major failing of a perceptron is that it can only solve linearly separable problems: if the difference between two groups of data can't be defined by a straight line, a perceptron will fail. The classic example of this is the XOR (exclusive OR) problem.
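To see the limitation concretely, here is a small sketch of my own (not from the sources above) that trains a single perceptron on the four XOR points. No matter how many epochs it runs, no set of weights can classify all four points correctly:

```ruby
# Step function perceptron: fire (1.0) if the weighted sum is non-negative.
def predict(inputs, weights)
  weights.zip(inputs).sum { |w, x| w * x } >= 0.0 ? 1.0 : 0.0
end

# XOR truth table: [bias, x, y] => expected output
xor_data = [[[1.0, 0.0, 0.0], 0.0], [[1.0, 0.0, 1.0], 1.0],
            [[1.0, 1.0, 0.0], 1.0], [[1.0, 1.0, 1.0], 0.0]]

weights = [0.0, 0.0, 0.0]
100.times do
  xor_data.each do |inputs, expected|
    error = expected - predict(inputs, weights)
    weights.each_index { |j| weights[j] += 0.5 * error * inputs[j] }
  end
end

correct = xor_data.count { |inputs, expected| predict(inputs, weights) == expected }
puts "#{correct}/4 correct"  # stuck below 4/4 -- no single line separates XOR
```

The weights settle into a repeating cycle instead of converging, because every straight line through the plane leaves at least one XOR point on the wrong side.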

Two graphs demonstrate linearly separable data XOR
Left, linearly separable data. Right, non-linearly separable data


This XOR problem with perceptrons was a major blow to the field, and its discovery contributed to an “AI winter” in which there was little AI/ML funding to be found. However, it was eventually shown that chaining multiple perceptrons together by adding “hidden layers” solves the XOR problem. Thus the perceptron, while limited, represents a fundamental building block in ML that has been expanded on to produce terrifying, brilliant and powerful software.
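As a quick illustration of why hidden layers help (with weights hand-picked by me rather than learned), just two hidden perceptrons feeding a third are enough to compute XOR: one hidden unit computes OR, another computes AND, and the output unit fires when OR is true but AND is not:

```ruby
# A perceptron "neuron": weighted sum of [bias, *inputs] through a step function.
def neuron(inputs, weights)
  weights.zip([1.0] + inputs).sum { |w, x| w * x } >= 0.0 ? 1.0 : 0.0
end

# Hand-picked weights (not learned): XOR(x, y) = OR(x, y) AND NOT AND(x, y)
def xor(x, y)
  h_or  = neuron([x, y], [-0.5, 1.0, 1.0])   # fires if x + y >= 0.5
  h_and = neuron([x, y], [-1.5, 1.0, 1.0])   # fires if x + y >= 1.5
  neuron([h_or, h_and], [-0.5, 1.0, -1.0])   # fires if h_or and not h_and
end

puts [[0, 0], [0, 1], [1, 0], [1, 1]].map { |x, y| xor(x, y) }.inspect
# => [0.0, 1.0, 1.0, 0.0]
```

The hidden layer carves the plane with two lines instead of one, and the output unit combines the two half-planes, which is exactly what a single perceptron cannot do.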


Please see these sources for more information. Much of what I wrote about here comes from Patrick Hebron's Learning Machines class taught at ITP (NYU), and his syllabus (below) provides a great intro for newbies. In addition, another(!) ITP professor has made a series of videos on the subject. It's funny, and provides a much better tutorial than I did 🙂.
