The k-nearest neighbors (KNN) algorithm is used to perform data classification.
To predict the class of a new record, the algorithm relies on the k records from the training data set that are most similar to this new record.
The similarity between records can be measured in different ways; the Euclidean distance is generally a good starting point.
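As a concrete illustration, the Euclidean distance between two records can be computed as follows (a minimal sketch; the function name and the example values are ours, not from the original):

```python
from math import sqrt

def euclidean_distance(v1, v2):
    # Sum of the squared differences over each coordinate, then take the square root.
    return sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

# Example: distance between two (age, amount) records
print(euclidean_distance([25, 40000], [30, 60000]))
```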
The algorithm is as follows:
For an input x, what is its class y if I rely on the set of k neighbors nearest to x?
- Find the k entries in the training data that are closest to the input x (here we will use, for example, the Euclidean distance).
- Have each of these training entries vote for its class.
- Return the majority class.
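The three steps above can be sketched as a single function (a minimal sketch under our own naming; rows are assumed to have the form [feature1, feature2, ..., class], as in the example data later in this article):

```python
from collections import Counter
from math import sqrt

def knn_predict(training, x, k):
    # Step 1: find the k training rows closest to x,
    # using the Euclidean distance on the feature columns only.
    def dist(row):
        return sqrt(sum((a - b) ** 2 for a, b in zip(row[:-1], x)))
    neighbors = sorted(training, key=dist)[:k]
    # Steps 2 and 3: each neighbor votes for its class; the majority wins.
    votes = Counter(row[-1] for row in neighbors)
    return votes.most_common(1)[0][0]
```

For instance, with two well-separated clusters in the training data, a query point near one cluster is assigned that cluster's class.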
The success of the algorithm depends on the amount of training data and on the quality of the distance measure between two vectors.
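The quality of the distance measure matters in a concrete way here: an amount of 60000 dwarfs an age of 30, so the amount would dominate a raw Euclidean distance. A common remedy is to rescale each numeric feature to the [0, 1] range before computing distances (a minimal min-max sketch; the function name is ours):

```python
def min_max_normalize(data):
    # Rescale each column of a numeric dataset to the [0, 1] range,
    # so that no single feature dominates the distance.
    cols = list(zip(*data))
    mins = [min(c) for c in cols]
    maxs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, mins, maxs)]
        for row in data
    ]
```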
Here is an example of implementing and using this in Python on a database concerning the granting of a loan based on age and amount requested. The class is the YES ('OUI') or NO ('NON') answer.
from math import sqrt

# Make a classification prediction
def predire_classification(donnee_test, training, nombre_voisins_vote):
    voisins = recherche_voisins(training, donnee_test, nombre_voisins_vote)
    sorties = [vecteur[-1] for vecteur in voisins]
    prediction = max(set(sorties), key=sorties.count)
    return prediction

# Euclidean distance between 2 vectors (the last element is the class, so it is skipped)
def distance_euclidienne(vecteur1, vecteur2):
    distance = 0.0
    for i in range(len(vecteur1) - 1):
        distance += (vecteur1[i] - vecteur2[i]) ** 2
    return sqrt(distance)

# Search for the neighbors
def recherche_voisins(training, donnee_test, nbVoisins):
    distances = list()
    for ligne_training in training:
        dist = distance_euclidienne(donnee_test, ligne_training)
        distances.append((ligne_training, dist))
    distances.sort(key=lambda tup: tup[1])
    kVoisins = list()
    for i in range(nbVoisins):
        kVoisins.append(distances[i][0])
    return kVoisins

# Training data
donnees_apprentissage = [[25, 40000, 'NON'],
                         [30, 60000, 'OUI'],
                         # ... (rows elided in the original)
                         [32, 10000, 'NON']]

prediction = predire_classification(donnees_apprentissage[1], donnees_apprentissage, 3)
print('We should find %s, prediction is: %s.' % (donnees_apprentissage[1][-1], prediction))
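To judge whether the number of neighbors is well chosen, a simple leave-one-out check can be run over the training set: each record is predicted from all the others and the hits are counted. Below is a self-contained sketch (the function names are ours, and the compact predict function repeats the same distance-plus-majority-vote logic as the code above; rows keep the [feature1, feature2, class] format):

```python
from collections import Counter
from math import sqrt

def predict(training, x, k):
    # k-NN: Euclidean distance on the feature columns, then a majority vote.
    dist = lambda row: sqrt(sum((a - b) ** 2 for a, b in zip(row[:-1], x)))
    nearest = sorted(training, key=dist)[:k]
    return Counter(row[-1] for row in nearest).most_common(1)[0][0]

def leave_one_out_accuracy(data, k):
    # Predict each row from all the other rows and count correct predictions.
    hits = sum(
        predict(data[:i] + data[i + 1:], row[:-1], k) == row[-1]
        for i, row in enumerate(data)
    )
    return hits / len(data)
```

Running this for several values of k on the full training set gives a quick, if optimistic, indication of which k to pick.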