Title: | Kernel Factory: An Ensemble of Kernel Machines |
---|---|
Description: | Binary classification based on an ensemble of kernel machines ("Ballings, M. and Van den Poel, D. (2013), Kernel Factory: An Ensemble of Kernel Machines. Expert Systems With Applications, 40(8), 2904-2913"). Kernel factory is an ensemble method where each base classifier (random forest) is fit on the kernel matrix of a subset of the training data. |
Authors: | Michel Ballings, Dirk Van den Poel |
Maintainer: | Michel Ballings <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.3.0 |
Built: | 2025-02-21 02:43:32 UTC |
Source: | https://github.com/cran/kernelFactory |
Credit
contains credit card applications. The dataset has a good mix of continuous and categorical features.
data(Credit)
A data frame with 653 observations, 15 predictors, and a binary criterion variable called Response.
All observations with missing values are deleted.
Frank, A. and Asuncion, A. (2010). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
The original dataset can be downloaded at http://archive.ics.uci.edu/ml/datasets/Credit+Approval
data(Credit)
str(Credit)
table(Credit$Response)
kernelFactory
implements an ensemble method for kernel machines (Ballings and Van den Poel, 2013).
kernelFactory(x = NULL, y = NULL, cp = 1, rp = round(log(nrow(x), 10)), method = "burn", ntree = 500, filter = 0.01, popSize = rp * cp * 7, iters = 80, mutationChance = 1/(rp * cp), elitism = max(1, round((rp * cp) * 0.05)), oversample = TRUE)
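Several defaults in the signature above are expressions that depend on the number of rows of x and on cp. The following is a minimal base R sketch that evaluates those same expressions outside the package, so their values can be inspected before fitting. The helper name kf_defaults is hypothetical and not part of kernelFactory; 653 is the number of observations in the Credit data.

```r
# Hypothetical helper (not part of kernelFactory): evaluates the default
# expressions copied from the usage signature for a given number of rows.
kf_defaults <- function(n_rows, cp = 1) {
  rp <- round(log(n_rows, 10))                  # default number of row partitions
  list(
    rp             = rp,
    popSize        = rp * cp * 7,               # default GA population size
    mutationChance = 1 / (rp * cp),             # default GA mutation chance
    elitism        = max(1, round((rp * cp) * 0.05))  # default GA elitism
  )
}

# Defaults implied by the full Credit data (653 rows, one column partition)
d <- kf_defaults(653)
str(d)
```

Because rp is rounded from a base-10 logarithm, very small training sets (three rows or fewer) give rp = 0 under these expressions, in which case rp should be set explicitly.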
x |
A data frame of predictors (numeric, integer or factor). Categorical variables need to be factors. Indicator values should not be too imbalanced because this might produce constants in the subsetting process. |
y |
A factor containing the response vector. Only {0,1} is allowed. |
cp |
The number of column partitions. |
rp |
The number of row partitions. |
method |
Can be one of the following: POLynomial kernel function ( |
ntree |
Number of trees in the Random Forest base classifiers. |
filter |
Either NULL (deactivated) or a proportion denoting the minimum class size of dummy predictors. This parameter is used to remove near-constants. For example, if nrow(xTRAIN)=100 and filter=0.01, then all dummy predictors with any class size equal to 1 will be removed. Set this higher (e.g., 0.05 or 0.10) in case of errors. |
popSize |
Population size of the genetic algorithm. |
iters |
Number of generations of the genetic algorithm. |
mutationChance |
Mutation chance of the genetic algorithm. |
elitism |
Elitism parameter of the genetic algorithm. |
oversample |
Oversample the smallest class. This helps avoid problems related to the subsetting procedure (e.g., if rp is too high). |
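The filter argument described above can be illustrated with a small base R sketch. It is not the package's internal code: the helper near_constant is hypothetical, the toy data are made up, and the rule used here (drop a 0/1 dummy when its smallest class holds at most filter * nrow(x) observations) is an assumption chosen to match the nrow = 100, filter = 0.01 example in the argument description.

```r
# Hypothetical helper (not part of kernelFactory): flags 0/1 dummy columns
# whose smallest class holds at most filter * nrow(x) observations.
near_constant <- function(x, filter = 0.01) {
  threshold <- filter * nrow(x)
  vapply(x, function(col) {
    counts <- table(factor(col, levels = c(0, 1)))  # counts for both classes
    min(counts) <= threshold
  }, logical(1))
}

# Made-up data: 100 rows, one balanced dummy and one with a single 1
toy <- data.frame(
  balanced = rep(c(0, 1), each = 50),
  rare     = c(1, rep(0, 99))
)

flags <- near_constant(toy, filter = 0.01)
kept  <- toy[, !flags, drop = FALSE]   # keep only columns that pass the filter
print(flags)
print(names(kept))
```

Raising filter (for example to 0.05) makes the rule stricter, which is the adjustment the description suggests when subsetting produces errors.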
An object of class kernelFactory, which is a list with the following elements:
trn |
Training data set. |
trnlst |
List of training partitions. |
rbfstre |
List of used kernel functions. |
rbfmtrX |
List of augmented kernel matrices. |
rsltsKF |
List of models. |
cpr |
Number of column partitions. |
rpr |
Number of row partitions. |
cntr |
Number of partitions. |
wghts |
Weights of the ensemble members. |
nmDtrn |
Vector indicating the numeric (and integer) features. |
rngs |
Ranges of numeric predictors. |
constants |
To exclude from newdata. |
Authors: Michel Ballings and Dirk Van den Poel, Maintainer: [email protected]
Ballings, M. and Van den Poel, D. (2013), Kernel Factory: An Ensemble of Kernel Machines. Expert Systems With Applications, 40(8), 2904-2913.
#Credit Approval data available at UCI Machine Learning Repository
data(Credit)
#take subset (for the purpose of a quick example) and split into train and test
Credit <- Credit[1:100,]
train.ind <- sample(nrow(Credit), round(0.5*nrow(Credit)))
#Train Kernel Factory on training data
kFmodel <- kernelFactory(x=Credit[train.ind, names(Credit) != "Response"],
                         y=Credit[train.ind, "Response"],
                         method="random")
#Deploy Kernel Factory to predict response for test data
#predictedresponse <- predict(kFmodel, newdata=Credit[-train.ind, names(Credit) != "Response"])
kFNews
shows the NEWS file of the kernelFactory package.
kFNews()
None.
kernelFactory, predict.kernelFactory
kFNews()
predict.kernelFactory
predicts the response for new data using a kernelFactory model.
## S3 method for class 'kernelFactory'
predict(object, newdata = NULL, predict.all = FALSE, ...)
object |
An object of class kernelFactory. |
newdata |
A data frame with the same predictors as in the training data. |
predict.all |
TRUE or FALSE. If TRUE and both rp and cp are 1, the individual predictions of the random forest are returned. If TRUE and either rp or cp is greater than 1, the predictions of all the ensemble members are returned. |
... |
Not used currently. |
A vector containing the response probabilities.
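Since the returned value is a vector of probabilities, a common follow-up step is to convert it into class labels and compare those with the observed response. The sketch below uses base R only. The probabilities and labels are made up for illustration, the 0.5 cutoff is an assumption, and the probabilities are assumed to refer to class 1.

```r
# Made-up probabilities and observed labels, purely for illustration;
# in practice probs would come from predict() on a kernelFactory model.
probs  <- c(0.91, 0.12, 0.55, 0.40, 0.78)
actual <- factor(c(1, 0, 1, 1, 0), levels = c(0, 1))

# Assumed cutoff of 0.5: probabilities at or above it are labelled 1
predicted <- factor(as.integer(probs >= 0.5), levels = c(0, 1))

# Cross-tabulate observed versus predicted classes and compute accuracy
confusion <- table(actual = actual, predicted = predicted)
accuracy  <- mean(predicted == actual)

print(confusion)
print(accuracy)
```

A different cutoff may be more appropriate when the two classes are imbalanced or when the costs of the two error types differ.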
#Credit Approval data available at UCI Machine Learning Repository
data(Credit)
#take subset (for the purpose of a quick example) and split into train and test
Credit <- Credit[1:100,]
train.ind <- sample(nrow(Credit), round(0.5*nrow(Credit)))
#Train Kernel Factory on training data
kFmodel <- kernelFactory(x=Credit[train.ind, names(Credit) != "Response"],
                         y=Credit[train.ind, "Response"],
                         method="random")
#Deploy Kernel Factory to predict response for test data
predictedresponse <- predict(kFmodel, newdata=Credit[-train.ind, names(Credit) != "Response"])