Thursday, July 7, 2011

SVM classification, an intuitive explanation and some test with LIBSVM

In a previous post I focused on linear classification and on the kernel trick to make linear separable points not linear separable.
...In this post I would explain why this trick is the base to build one of the most powerful classifier: the support vectors machines (SVM).
Consider the following set of point red and blue:

As you can see, you cannot find a straight line to divide blue points from the red points.
...But as mentioned before using the kernel trick we can find a space having greater dimension where these points can be separate by  straight line.
Formally,
we are looking for a  straight line

xi ·w+b≥+1 for yi =+1 and xi ·w+b≤−1 for yi =−1
where xi represents the points (blue or red), yi =+1 to represent red points and y=−1for blue points.
and represent the unknown parameters of the straight lines.
The above equations can be combined into one set of inequalities: yi(xi ·w+b)−1≥0 ∀i
The SVM works out the classification bringing the points in a "kernel space" where the points can be divided finding a simple hyperplane.
I won't enter in technical details (the theory is based on Lagrange multipliers, kernel space, and Karush-Kuhn-Tucker Conditions) but if you are interested you can find an easy but exhaustive  description on "A Tutorial on Support Vector Machines for Pattern Recognition".
I implemented in Mathematica an easy routine to call the library LIBSVM to classify the above points (contact me to obtain the notebook):
To obtain this results (overall accuracy of classification: 99,2%) i trained the SVM using a gaussian kernel.
Another 2D sample (just to highlight the hyperplane I removed the color gradations):

2D points
Hyperplanes founds (in blue) for the above points



One of the interesting aspects of SVM is its vector notation: it allows a complete generalization of the problem: you can use the same algorithm to solve problem in any dimension!
For example in 3D scenario, you have:



3d Points 
Hyperplanes found



...another example:









Now we are ready to jump into the real world and attempt to classify text documents!
In the next post I will explain a real case application of document classification using SVM.
As usual: STAY TUNED!!
cristian

4 comments:

  1. Talking about kernels. I started to play with SVM in the context of learning to rank. Is it me, or training of non-linear kernels is painfully slow?

    ReplyDelete
  2. Hi Itman,
    Generally SVM training is pretty fast (formally the time complexity follows a quadratic function, even if you can find almost linear implementations), however the time required in the training set depends on:
    1) size of training set
    2) kernel you are using
    3) capacity factor you are using.
    4) intrinsic complexity of the problem
    5) ability to tune the points 1,2,3. :)

    About the point 2) many people believe that a gaussian kernel is always the best kernel (because it is infinite dimensional) but is not true!!
    As we can see in the next post, for example, in the document classification the linear kernel is absolutely faster and precise than other complex kernels.
    BTW before train a sys (using SVM o whatever algo) the most important steps are:
    1) describe properly the dataset (in svm the way chosen to build the vectors)
    2) select different training sets
    3) define a proper strategy for param tuning
    cheers
    c.

    ReplyDelete
  3. Thanks for the great blog. I've bookmarked it :P

    ReplyDelete
  4. telasmosquiteira-sp.com.br

    telas mosquiteiras sp
    telas mosquiteira sp

    As telas mosquiteiras sp , telas mosquiteiro sp garantem ar puro por toda casa livrando-a completamente dos mosquitos e insetos indesejáveis. As telas mosquiteira garantem um sono tranquilo a toda família, livrando e protegendo-nas dos mais diversos insetos. Muitos destes insetos são transmissores de doenças e a tela mosquiteira é indispensável no combate a mosquitos transmissores de doenças.
    s
    A dengue, por exemplo, já matou centenas de pessoas só na capital de São Paulo e um pequeno investimento em nossas telas mosquiteiras podem salvar vidas. As telas mosquiteiras também impedem a entrada de insetos peçonhentos como as aranhas e os escorpiões, estes insetos também oferecem risco, pois seu veneno em poucos minutos podem levar uma criança a morte.
    telas mosquiteira jundiai
    telas mosquiteiro jundiai
    telas mosquiteira São Paulo
    telas mosquiteiro São Paulo
    telas mosquiteiras sp
    telas mosquiteiras Jundiai
    telas mosquiteira sp
    telas mosquiteiro Jundiai
    telas mosquiteira sao paulo
    telas mosquiteiro sao paulo

    A chegada da temporada Primavera/Verão traz consigo a elevação da temperatura e a maior ocorrência de chuvas. Mas não é só isso. As estações mais quentes do ano causam muita dor de cabeça e muitos zumbidos indesejáveis em função das pragas urbanas – pernilongos, baratas, cupins e outros insetos -, que afetam todas as regiões brasileiras.

    Nossa missão é oferecer telas mosquiteiras de qualidade a um preço acessível, fazendo com que as telas mosquiteiras sejam uma opção viável para muitas pessoas.

    telas mosquiteiras Jundiaí
    telas mosquiteiro Jundiai
    telas mosquiteiras jundiai
    telas mosquiteiro industria
    telas mosquiteira restaurante
    telas mosquiteiro restaurante
    telas mosquiteira empresa
    telas mosquiteiro empresa

    ReplyDelete