Thursday, July 7, 2011

SVM classification, an intuitive explanation and some tests with LIBSVM

In a previous post I focused on linear classification and on the kernel trick, which maps points that are not linearly separable into a space where they become linearly separable.
In this post I will explain why this trick is the basis of one of the most powerful classifiers: the support vector machine (SVM).
Consider the following set of red and blue points:

As you can see, you cannot find a straight line to divide blue points from the red points.
But as mentioned before, using the kernel trick we can find a higher-dimensional space where these points can be separated by a straight line.
Formally,
we are looking for a straight line satisfying

x_i · w + b ≥ +1 for y_i = +1
x_i · w + b ≤ −1 for y_i = −1

where x_i represents the points (blue or red), y_i = +1 labels the red points and y_i = −1 the blue points, and w and b are the unknown parameters of the separating line.
The above equations can be combined into one set of inequalities: y_i (x_i · w + b) − 1 ≥ 0 ∀i
The SVM carries out the classification by mapping the points into a "kernel space" where they can be divided by a simple hyperplane.
I won't go into the technical details (the theory is based on Lagrange multipliers, kernel spaces and the Karush-Kuhn-Tucker conditions), but if you are interested you can find an accessible yet exhaustive description in "A Tutorial on Support Vector Machines for Pattern Recognition".
I implemented a simple routine in Mathematica that calls the LIBSVM library to classify the above points (contact me to obtain the notebook):
To obtain these results (overall classification accuracy: 99.2%) I trained the SVM using a Gaussian kernel.
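The original routine is a Mathematica notebook, but as a rough equivalent here is a minimal Python sketch that drives LIBSVM through its Python bindings (assuming they are installed as the libsvm package). The ring-shaped data, the kernel width '-g' and the capacity factor '-c' below are illustrative assumptions, not the settings behind the 99.2% figure above.

    import math
    import random

    from libsvm.svmutil import svm_train, svm_predict

    def ring_point(label):
        # +1: points on an outer ring, -1: points clustered near the origin,
        # so no straight line in 2D can separate the two classes.
        radius = random.uniform(1.5, 2.0) if label == +1 else random.uniform(0.0, 0.5)
        angle = random.uniform(0.0, 2.0 * math.pi)
        return {1: radius * math.cos(angle), 2: radius * math.sin(angle)}

    train_labels = [+1] * 200 + [-1] * 200
    train_points = [ring_point(y) for y in train_labels]

    # '-t 2' selects the Gaussian (RBF) kernel, '-g' its width, '-c' the capacity factor.
    model = svm_train(train_labels, train_points, '-t 2 -g 0.5 -c 1')

    # svm_predict prints the overall accuracy; acc is a tuple
    # (accuracy %, mean squared error, squared correlation coefficient).
    test_labels = [+1] * 50 + [-1] * 50
    test_points = [ring_point(y) for y in test_labels]
    predicted_labels, acc, _ = svm_predict(test_labels, test_points, model)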
Another 2D example (to highlight the hyperplane, I removed the color gradations):

2D points
Hyperplanes found (in blue) for the above points



One of the interesting aspects of SVM is its vector notation: it allows a complete generalization of the problem, so you can use the same algorithm to solve problems in any dimension (see the short sketch after the figures below)!
For example, in a 3D scenario you have:



3D points
Hyperplanes found



Another example:

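To make the dimension independence concrete, here is a hypothetical continuation of the Python sketch above (same imports, toy data): with LIBSVM the dimension is implicit in the feature vectors, so the training call does not change when moving from 2D to 3D.

    # Same svm_train call as before, now with 3-dimensional toy vectors;
    # LIBSVM only sees {index: value} pairs, so nothing else changes.
    labels_3d = [+1, +1, -1, -1]
    points_3d = [
        {1: 1.8, 2: 0.2, 3: 1.7},
        {1: 1.6, 2: 1.9, 3: 0.1},
        {1: 0.1, 2: 0.3, 3: 0.2},
        {1: 0.2, 2: 0.1, 3: 0.4},
    ]
    model_3d = svm_train(labels_3d, points_3d, '-t 2 -g 0.5 -c 1')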
Now we are ready to jump into the real world and attempt to classify text documents!
In the next post I will describe a real-world application of SVM: document classification.
As usual: STAY TUNED!!
cristian

11 comments:

  1. Talking about kernels. I started to play with SVMs in the context of learning to rank. Is it me, or is training with non-linear kernels painfully slow?

    ReplyDelete
  2. Hi Itman,
    Generally SVM training is pretty fast (formally the time complexity is quadratic in the size of the training set, even if you can find almost-linear implementations); however, the training time depends on:
    1) size of training set
    2) kernel you are using
    3) capacity factor you are using.
    4) intrinsic complexity of the problem
    5) your ability to tune points 1, 2 and 3. :)

    Regarding point 2), many people believe that a Gaussian kernel is always the best kernel (because its feature space is infinite dimensional), but that is not true!!
    As we will see in the next post, for example, in document classification the linear kernel is much faster and more precise than more complex kernels.
    BTW, before training a system (with SVM or whatever algorithm) the most important steps are:
    1) describe the dataset properly (for SVM, the way you choose to build the feature vectors)
    2) select different training sets
    3) define a proper strategy for parameter tuning (see the grid-search sketch below)
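    As a hypothetical illustration of point 3), LIBSVM's built-in cross-validation ('-v') makes a simple grid search over the capacity factor and the Gaussian kernel width easy to write (Python bindings assumed, C and gamma grids are just example values):

    from libsvm.svmutil import svm_train

    def grid_search(labels, points):
        # '-v 5' makes svm_train return the 5-fold cross-validation accuracy
        # instead of a model; '-q' silences the solver output.
        best_c, best_g, best_acc = None, None, 0.0
        for c in (0.1, 1, 10, 100):
            for g in (0.01, 0.1, 1, 10):
                acc = svm_train(labels, points, f'-t 2 -c {c} -g {g} -v 5 -q')
                if acc > best_acc:
                    best_c, best_g, best_acc = c, g, acc
        return best_c, best_g, best_acc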
    cheers
    c.

    ReplyDelete
  3. Thanks for the great blog. I've bookmarked it :P

    ReplyDelete