Want to know about classification models for machine learning and data analytics?
Classification is the problem of identifying which of several categories a new observation belongs to, on the basis of a data set of observations whose categories are already known.
It is a supervised machine learning task, where a training set of correctly labeled observations is available. The corresponding unsupervised procedure is known as clustering, which involves grouping data into categories based on some measure of inherent similarity or distance.
For example:
1. An email could be “spam” or “non-spam”.
2. Bank customers could be “happy” or “unhappy”.
Let’s consider the second example:
Say a bank has customers, some of whom are happy with its services and some of whom are not. The bank wants to know which customers are unhappy, for an obvious reason: it can then improve its services and end up with more happy customers.
So what would the bank do? It has customer data, which includes information such as age, city, and account type (savings or current). Given customers whose satisfaction is already known, it can fit a machine learning algorithm to this data and use it to predict which customers are most likely to be unhappy. This is binary classification: there are only two classes, “happy” and “unhappy”.
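To make the bank example concrete, here is a minimal sketch of binary classification using logistic regression trained with plain gradient descent. Logistic regression is just one possible algorithm, chosen here for illustration. The two features (complaints in the last year and years as a customer divided by 10), the data values, and the function names `train` and `predict` are all assumptions made up for this sketch, not real bank data or an existing library API.

```python
import math

# Hypothetical training data: [complaints_last_year, years_as_customer / 10].
# Label 1 means "unhappy", 0 means "happy". All values are made up.
X = [[0, 0.5], [1, 0.8], [0, 0.2], [1, 0.3],
     [4, 0.1], [5, 0.6], [3, 0.4], [6, 0.2]]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression weights with batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this customer: probability minus label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the average gradient of the log loss.
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 ("unhappy") if the predicted probability is at least 0.5."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(z) >= 0.5 else 0


w, b = train(X, y)
print(predict(w, b, [6, 0.1]))  # new customer with many complaints
print(predict(w, b, [0, 0.9]))  # long-standing customer with no complaints
```

In practice the bank would use many more customers and features, and would check the model on customers that were held out of training before trusting its predictions.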
Some popular classification algorithms are:
4. Support Vector Machines
Classification problems can be of the following types:
1. Binary classification: you are given only two categories to predict, such as spam or non-spam.
2. Multi-class classification: you are given more than two categories. There could be 3, 10, 100, 1,000, 10,000 or more. Google, for example, has the capacity to solve problems with a very high number of classes.
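The multi-class case can be sketched in the same spirit. The example below uses a nearest-centroid classifier, one of the simplest methods that handles any number of classes: it averages the training points of each class and assigns a new point to the class whose average is closest. The three labels, the two features (complaints in the last year and number of products held), the data values, and the function names are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

# Hypothetical data: [complaints_last_year, products_held], three classes.
train_X = [[0, 3], [1, 4], [0, 4],
           [3, 2], [4, 2], [3, 3],
           [7, 1], [8, 1], [7, 0]]
train_y = ["happy"] * 3 + ["neutral"] * 3 + ["unhappy"] * 3


def fit_centroids(X, y):
    """Compute the mean feature vector (centroid) of each class."""
    groups = defaultdict(list)
    for xi, yi in zip(X, y):
        groups[yi].append(xi)
    return {
        label: [sum(col) / len(points) for col in zip(*points)]
        for label, points in groups.items()
    }


def predict_label(centroids, x):
    """Return the label whose centroid is nearest to x (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


centroids = fit_centroids(train_X, train_y)
for customer in ([0, 4], [3.5, 2], [8, 0]):
    print(customer, predict_label(centroids, customer))
```

Nothing in this approach is specific to three classes: adding a fourth label to `train_y` simply adds a fourth centroid. Binary classification is the special case with two.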