K-Neighbors Classifier (KNC) |
The K-Neighbors Classifier implements
neighbors-based classification, where k is an integer value specified by
the user. It is a type of instance-based or non-generalizing learning:
it does not attempt to construct a general internal model; it simply
stores instances of the training data. Classification is computed from a
simple majority vote of the nearest neighbors of each point: a query
point is assigned the class that has the most representatives within its
k nearest neighbors. |
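As a minimal sketch of this majority-vote scheme, scikit-learn's KNeighborsClassifier can be used as follows (the toy data is illustrative, not from the source):

```python
# Minimal sketch of neighbors-based classification (illustrative toy data).
from sklearn.neighbors import KNeighborsClassifier

X = [[0], [1], [2], [3]]  # stored training instances (no general model is built)
y = [0, 0, 1, 1]          # class labels

knn = KNeighborsClassifier(n_neighbors=3)  # k is chosen by the user
knn.fit(X, y)             # "fitting" essentially stores the data

# The query 1.1 has nearest neighbors 1, 2, 0 with labels 0, 1, 0,
# so the majority vote assigns class 0.
print(knn.predict([[1.1]]))  # → [0]
```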
SVM |
Support vector machines (SVM) are a set of supervised
learning methods capable of performing multi-class classification on
datasets. The implementation is based on the libsvm library. The fit
time scales at least quadratically with the number of samples and may be
impractical beyond tens of thousands of samples. |
NuSVM |
NuSVM is similar to SVM, but accepts a slightly different set of
parameters and has a different mathematical formulation: it uses a
parameter nu to control the number of support vectors. It is also based
on libsvm. |
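A minimal sketch contrasting the two libsvm-backed estimators in scikit-learn; C and nu are the parameters in which they differ, and the toy data is illustrative:

```python
# SVC penalizes margin violations via C; NuSVC instead bounds the
# fraction of support vectors (and margin errors) via nu. Both wrap libsvm.
from sklearn.svm import SVC, NuSVC

X = [[0, 0], [1, 1], [4, 4], [5, 5]]
y = [0, 0, 1, 1]

svc = SVC(C=1.0).fit(X, y)      # regularization strength C
nusvc = NuSVC(nu=0.5).fit(X, y) # nu in (0, 1]

print(svc.predict([[0.5, 0.5]]), nusvc.predict([[4.5, 4.5]]))
```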
Decision Tree Classifier (DTC) |
Decision Tree
Classifier is a non-parametric supervised learning method. It is an
algorithm capable of performing multi-class classification on datasets.
The goal is to create models that predict the value of a target variable
by learning simple decision rules inferred from the data features. For
example, a classical decision tree learns from the data to approximate a
sine curve with a set of if-then-else decision rules. The deeper the
tree, the more complex the decision rules and the more closely the model
fits the data. |
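The learned if-then-else rules can be inspected directly with scikit-learn's export_text utility; a minimal sketch on illustrative toy data:

```python
# A shallow decision tree and the if-then-else rules it learned
# (illustrative toy data).
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree))  # prints the learned decision rules as text
```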
Random Forest Classifier (RFC) |
The Random Forest
Classifier is an ensemble learning method for classification that
operates by constructing a multitude of decision trees at training time
and outputting the class that is the mode of the classes of the
individual trees. Random decision forests correct for decision trees’
habit of overfitting to their training set. A random forest is a
meta-estimator that fits a number of decision tree classifiers on
various sub-samples of the dataset and uses averaging to improve the
predictive accuracy and control over-fitting. |
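A minimal sketch of such an ensemble with scikit-learn's RandomForestClassifier (toy data and hyperparameters are illustrative):

```python
# A forest of trees fit on bootstrap sub-samples; the predicted class is
# the (probability-averaged) vote of the individual trees.
from sklearn.ensemble import RandomForestClassifier

X = [[0], [1], [2], [10], [11], [12]]
y = [0, 0, 0, 1, 1, 1]

rfc = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
print(rfc.predict([[0.5], [11.5]]))
```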
AdaBoost Classifier (ABC) |
An AdaBoost (51) classifier
is a meta-estimator that begins by fitting a classifier on the original
dataset and then fits additional copies of the classifier on the same
dataset but where the weights of incorrectly classified instances are
adjusted such that subsequent classifiers focus more on difficult
cases. |
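The re-weighting scheme described above can be sketched with scikit-learn's AdaBoostClassifier (toy data illustrative; the default base learner is a decision stump):

```python
# AdaBoost re-weights the training set each round so that later
# classifiers concentrate on previously misclassified cases.
from sklearn.ensemble import AdaBoostClassifier

X = [[0], [1], [2], [3], [4], [5]]
y = [0, 0, 0, 1, 1, 1]

ada = AdaBoostClassifier(n_estimators=25, random_state=0).fit(X, y)
print(ada.predict([[0.5], [4.5]]))
```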
Gradient Boosting Classifier (GBC) |
Gradient Boosting
Classifier builds an additive model in a forward stage-wise fashion. It
allows for the optimization of arbitrary differentiable loss functions.
In each stage, n_classes regression trees are fit on the negative
gradient of the binomial or multinomial deviance loss function. Binary
classification is a special case in which only a single regression tree
is induced per stage. |
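A minimal sketch of the stage-wise additive model with scikit-learn's GradientBoostingClassifier (toy data and hyperparameters are illustrative):

```python
# Stage-wise additive boosting: each stage fits regression trees to the
# negative gradient of the deviance loss (one tree per stage here,
# since this toy problem is binary).
from sklearn.ensemble import GradientBoostingClassifier

X = [[0], [1], [2], [3], [4], [5]]
y = [0, 0, 0, 1, 1, 1]

gbc = GradientBoostingClassifier(n_estimators=50, learning_rate=0.1,
                                 random_state=0).fit(X, y)
print(gbc.predict([[0.5], [4.5]]))
```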
Gaussian Naive Bayes (GNB) |
In Gaussian Naive
Bayes, the likelihood of the features is assumed to be Gaussian. The
model can perform online updates to its parameters via
partial_fit. |
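The online-update capability can be sketched with scikit-learn's GaussianNB and its partial_fit method (toy batches are illustrative):

```python
# Online learning with Gaussian Naive Bayes: partial_fit updates the
# per-class Gaussian parameters batch by batch.
from sklearn.naive_bayes import GaussianNB

gnb = GaussianNB()
# All classes must be declared on the first partial_fit call.
gnb.partial_fit([[0.0], [0.2]], [0, 0], classes=[0, 1])
# A later batch updates the model without refitting from scratch.
gnb.partial_fit([[1.0], [1.2]], [1, 1])

print(gnb.predict([[0.1], [1.1]]))
```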
Linear Discriminant Analysis (LDA) |
Linear Discriminant
Analysis is a classifier with a linear decision boundary, generated by
fitting class conditional densities to the data and using Bayes’ rule. The
model fits a Gaussian density to each class, assuming that all classes
share the same covariance matrix. The fitted model can be used to reduce
the dimensionality of the input, projecting it to the most
discriminative directions. |
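Both uses of LDA, classification and supervised dimensionality reduction, can be sketched with scikit-learn (toy data illustrative):

```python
# LDA as a classifier and as supervised dimensionality reduction:
# transform projects onto at most n_classes - 1 discriminative directions.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.array([[0., 0.], [0., 1.], [1., 0.],   # class 0
              [4., 4.], [4., 5.], [5., 4.]])  # class 1
y = [0, 0, 0, 1, 1, 1]

lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y)
X_proj = lda.transform(X)  # shape (6, 1): one direction for two classes
print(lda.predict([[0.5, 0.5], [4.5, 4.5]]))
```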
Quadratic Discriminant Analysis (QDA) |
Quadratic
Discriminant Analysis is a classifier with a quadratic decision
boundary, generated by fitting class conditional densities to the data
and using Bayes’ rule. The model fits a Gaussian density to each
class. |
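A minimal sketch with scikit-learn's QuadraticDiscriminantAnalysis (toy data illustrative):

```python
# QDA fits one Gaussian per class (each with its own covariance),
# yielding a quadratic decision boundary via Bayes' rule.
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

X = np.array([[0., 0.], [0.5, 0.5], [1., 0.], [0., 1.],
              [5., 5.], [5.5, 4.5], [6., 5.], [5., 6.]])
y = [0, 0, 0, 0, 1, 1, 1, 1]

qda = QuadraticDiscriminantAnalysis().fit(X, y)
print(qda.predict([[0.2, 0.2], [5.2, 5.2]]))
```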