
I'm working through the textbook *Learning From Data*, and one of the problems in the first chapter (Problem 1.5) has the reader implement the Adaline algorithm from scratch; I chose to do so in Python. The issue I'm running into is that my weight vector $\textbf{w}$ immediately blows up to infinity before the algorithm converges. Am I doing something incorrect here? As far as I can tell, I am implementing it exactly as the text describes. Below is my Python code. Here $\textbf{y}$ takes on the values -1 and 1, so it is a classification problem.

import numpy as np
import pandas as pd

# Generate w* vector, the true weights
dim = 2
wstar = 2000*np.random.rand(dim+1) - 1000

# Generate the random sample of size 100
trainSize = 100
train = pd.DataFrame(2000*np.random.rand(trainSize, dim) - 1000)
train['intercept'] = np.ones(trainSize)
cols = train.columns.tolist()
cols = cols[-1:] + cols[:-1]
train = train[cols]

# Classify the points
train['y'] = np.sign(np.dot(train.iloc[:, 0:3], wstar))

# Now we run the ADALINE algorithm on the training data
# Declare w vector
w = np.zeros(dim+1)

# Column of guesses
train['guess'] = np.ones(trainSize)

# s column
train['s'] = np.dot(train.iloc[:, 0:3], w)

# Set eta
eta = 5
iterations = 0
while not all(train['y']*train['s'] > 1):
    if iterations >= 1000:
        break
    # Picking a random point
    randInt = np.random.randint(len(train))
    # Temporary values for calculating new w
    temp_s = train['s'].iloc[randInt]
    temp_x = train.iloc[randInt, 0:3]
    temp_y = train['y'].iloc[randInt]
    # Calculating new w
    if temp_y*temp_s <= 1:
        w = w + eta*(temp_y - temp_s)*temp_x
    # Calculating new guesses and s values
    train['s'] = np.dot(train.iloc[:, 0:3], w)
    train['guess'] = np.sign(train['s'])
    iterations += 1


1 Answer


First of all, let me add this schema, which I think illustrates nicely the transition and improvement from Rosenblatt's original perceptron to the Adaline algorithm:

[Schema comparing Rosenblatt's perceptron with the Adaline algorithm]

In Adaline, the activation is linear, so the cost function built from the error $y(t) - s(t)$ is differentiable and the weights can be updated by gradient descent. There is no restriction that $y$ and $s$ have the same sign: the objective is simply to minimize the squared error between $y$ and $s$.
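To make that concrete, here is a minimal sketch of a single stochastic Adaline update, $\mathbf{w} \leftarrow \mathbf{w} + \eta\,(y - s)\,\mathbf{x}$ with $s = \mathbf{w}^{\top}\mathbf{x}$ (the function and variable names here are just for illustration). Since the step is scaled by both $\eta$ and $\mathbf{x}$, a learning rate as large as your $\eta = 5$ combined with inputs of magnitude up to 1000 makes every step overshoot, which is consistent with the weights blowing up:

import numpy as np

def adaline_step(w, x_i, y_i, eta=0.01):
    """One stochastic Adaline update: a gradient step on the
    squared error 0.5 * (y_i - s)**2 for a single sample."""
    s = np.dot(w, x_i)            # linear activation, s = w . x
    error = y_i - s               # signed error; no sign restriction on y*s
    return w + eta * error * x_i  # step scaled by eta and by x_i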

Below you can find the code provided in the excellent book *Python Machine Learning* by Sebastian Raschka:

import numpy as np

class AdalineSGD(object):
    """ADAptive LInear NEuron classifier.

    Parameters
    ------------
    eta : float
        Learning rate (between 0.0 and 1.0)
    n_iter : int
        Passes over the training dataset.
    shuffle : bool (default: True)
        Shuffles training data every epoch if True
        to prevent cycles.
    random_state : int
        Random number generator seed for random weight
        initialization.

    Attributes
    -----------
    w_ : 1d-array
        Weights after fitting.
    cost_ : list
        Sum-of-squares cost function value averaged over all
        training samples in each epoch.
    """
    def __init__(self, eta=0.01, n_iter=10,
                 shuffle=True, random_state=None):
        self.eta = eta
        self.n_iter = n_iter
        self.w_initialized = False
        self.shuffle = shuffle
        self.random_state = random_state

    def fit(self, X, y):
        """Fit training data.

        Parameters
        ----------
        X : {array-like}, shape = [n_samples, n_features]
            Training vectors, where n_samples is the number of
            samples and n_features is the number of features.
        y : array-like, shape = [n_samples]
            Target values.

        Returns
        -------
        self : object
        """
        self._initialize_weights(X.shape[1])
        self.cost_ = []
        for i in range(self.n_iter):
            if self.shuffle:
                X, y = self._shuffle(X, y)
            cost = []
            for xi, target in zip(X, y):
                cost.append(self._update_weights(xi, target))
            avg_cost = sum(cost) / len(y)
            self.cost_.append(avg_cost)
        return self

    def partial_fit(self, X, y):
        """Fit training data without reinitializing the weights."""
        if not self.w_initialized:
            self._initialize_weights(X.shape[1])
        if y.ravel().shape[0] > 1:  # if we have more than one sample
            for xi, target in zip(X, y):
                self._update_weights(xi, target)
        else:
            self._update_weights(X, y)
        return self

    def _shuffle(self, X, y):
        """Shuffle training data."""
        r = self.rgen.permutation(len(y))
        return X[r], y[r]

    def _initialize_weights(self, m):
        """Initialize weights to small random numbers."""
        self.rgen = np.random.RandomState(self.random_state)
        self.w_ = self.rgen.normal(loc=0.0, scale=0.01,
                                   size=1 + m)
        self.w_initialized = True

    def _update_weights(self, xi, target):
        """Apply the Adaline learning rule to update the weights."""
        output = self.activation(self.net_input(xi))
        error = (target - output)
        self.w_[1:] += self.eta * xi.dot(error)
        self.w_[0] += self.eta * error
        cost = 0.5 * error**2
        return cost

    def net_input(self, X):
        """Calculate net input."""
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def activation(self, X):
        """Compute linear activation."""
        return X

    def predict(self, X):
        """Return class label after unit step."""
        return np.where(self.activation(self.net_input(X))
                        >= 0.0, 1, -1)
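For completeness, here is a quick usage sketch on synthetic data in the style of the question's setup (the data generation and the chosen eta and n_iter are illustrative assumptions, not from the book). As in the book, the features are standardized first, which keeps a learning rate like 0.01 stable even though the raw inputs range over $[-1000, 1000]$:

import numpy as np

# Illustrative data: 100 points in [-1000, 1000]^2, labeled by a
# random true weight vector (intercept first), as in the question.
rng = np.random.RandomState(1)
X = 2000 * rng.rand(100, 2) - 1000
wstar = 2000 * rng.rand(3) - 1000
y = np.sign(wstar[0] + X.dot(wstar[1:]))

# Standardize the features so a small eta converges reliably.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

ada = AdalineSGD(eta=0.01, n_iter=15, random_state=1)
ada.fit(X_std, y)
print(ada.cost_[-1])                     # average cost over the final epoch
print((ada.predict(X_std) == y).mean())  # training accuracy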
