| Title: | Competitive Adaptive Reweighted Sampling (CARS) Algorithm |
|---|---|
| Description: | Implements the Competitive Adaptive Reweighted Sampling (CARS) algorithm for variable selection from high-dimensional datasets using Partial Least Squares (PLS) regression models. CARS iteratively applies Monte Carlo sub-sampling and exponential variable elimination to select the most informative variables (features), choosing the subset with the lowest cross-validated RMSE. The implementation is inspired by the work of Li et al. (2009) <doi:10.1016/j.aca.2009.06.046>. The algorithm is widely applied in chemometrics, for example with near-infrared (NIR), mid-infrared (MIR), and hyperspectral data. |
| Authors: | Md. Ashraful Haque [aut, cre], Avijit Ghosh [aut], Sayantani Karmakar [aut], Harsh Sachan [aut], Shalini Kumari [aut] |
| Maintainer: | Md. Ashraful Haque <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.5.0 |
| Built: | 2026-05-19 10:00:01 UTC |
| Source: | https://github.com/mah-iasri/carsalgo |
The CARSAlgorithm() function creates a configuration object for the
Competitive Adaptive Reweighted Sampling (CARS) algorithm. Pass this object
to fit.CARSAlgorithm to run variable selection on a high-dimensional dataset.
CARSAlgorithm(max_iter = 100, N = 50, cv_folds = 5, random_state = 42)
max_iter |
Maximum number of CARS iterations. Default 100. |
N |
Number of Monte Carlo sub-sampling runs per iteration. Default 50. |
cv_folds |
Number of folds for k-fold cross-validation. Default 5. |
random_state |
Integer seed for reproducibility. Default 42. |
An object of class "CARSAlgorithm": a named list of
hyperparameters to be passed to fit.CARSAlgorithm.
cars_obj <- CARSAlgorithm(max_iter = 20, N = 30, cv_folds = 5)
cars_obj
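For readers unfamiliar with S3 configuration objects, the following is a minimal sketch of how a constructor of this kind could be written in base R. The function name `new_cars_config`, the class name `cars_config_demo`, and the validation checks are assumptions made for illustration; the package's actual implementation may differ.

```r
# Hypothetical constructor for illustration; not the package's source code.
new_cars_config <- function(max_iter = 100, N = 50, cv_folds = 5,
                            random_state = 42) {
  # Basic sanity checks on the hyperparameters (assumed, not documented)
  stopifnot(max_iter >= 1, N >= 1, cv_folds >= 2)
  # A named list of hyperparameters tagged with an S3 class
  structure(
    list(max_iter = max_iter, N = N, cv_folds = cv_folds,
         random_state = random_state),
    class = "cars_config_demo"
  )
}

cfg <- new_cars_config(max_iter = 20, N = 30)
class(cfg)
cfg$cv_folds
```

Storing only hyperparameters in the object keeps configuration separate from data, so the same object can be reused across several datasets.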
Generic function for fitting model objects to data. Methods are
dispatched based on the class of cars_obj.
fit(cars_obj, ...)
cars_obj |
A model configuration object (e.g., a CARSAlgorithm object). |
... |
Additional arguments passed to the specific method. |
Depends on the method. See fit.CARSAlgorithm.
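As a rough illustration of S3 dispatch on the class of the first argument, the sketch below defines a generic and one method in base R. The names `fit_demo` and `cars_config_demo` are hypothetical and chosen so they do not clash with the package's own `fit`; the method body only reports input shapes rather than running any algorithm.

```r
# Hypothetical generic: UseMethod() picks a method from the class of `obj`.
fit_demo <- function(obj, ...) UseMethod("fit_demo")

# Fallback when no class-specific method exists
fit_demo.default <- function(obj, ...) {
  stop("no fit_demo method for class ", class(obj)[1])
}

# Method selected for objects of class "cars_config_demo"
fit_demo.cars_config_demo <- function(obj, X, y, ...) {
  # A real method would run the algorithm; this one only reports shapes.
  list(n_samples = nrow(X), n_features = ncol(X), max_iter = obj$max_iter)
}

cfg <- structure(list(max_iter = 20), class = "cars_config_demo")
out <- fit_demo(cfg, X = matrix(0, nrow = 4, ncol = 3), y = numeric(4))
str(out)
```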
Applies the CARS algorithm to a high-dimensional data matrix X and
response vector y, iteratively selecting the optimal variable subset
via PLS regression on Monte Carlo sub-samples and adaptive reweighted sampling.
## S3 method for class 'CARSAlgorithm'
fit(cars_obj, X, y, max_components = 10L, plot = TRUE, plot_path = NULL, ...)
cars_obj |
A CARSAlgorithm object. |
X |
Numeric matrix of predictors (n_samples x n_features). |
y |
Numeric response vector of length n_samples. |
max_components |
Integer cap on PLS latent components. Default 10L. |
plot |
Logical. Whether to display and save the RMSECV curve. Default TRUE. |
plot_path |
File path for saving the RMSECV plot. Default NULL. |
... |
Currently unused. |
This function iteratively:
1. Sub-samples the calibration set (Monte Carlo, N runs per iteration).
2. Fits a PLS model and extracts regression coefficients.
3. Selects variables by Adaptive Reweighted Sampling (ARS), with probability proportional to absolute coefficient magnitude.
4. Evaluates the subset via k-fold cross-validation (RMSECV).
5. Retains the best subset and repeats with an exponentially shrinking variable set.
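The sub-sampling, weighting, and shrinking steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal code: it uses one Monte Carlo run per iteration, an 80% sub-sampling fraction, and a one-component PLS stand-in (whose coefficients are proportional to X'y on centered data), and it omits the cross-validation step. The retention ratio follows the exponentially decreasing function described by Li et al. (2009), which equals 1 at the first iteration and 2/p at the last.

```r
# Illustrative sketch only; all names and settings here are assumptions.
set.seed(1)
n <- 60; p <- 40
X <- matrix(rnorm(n * p), nrow = n)
y <- 2 * X[, 5] - 1.5 * X[, 20] + rnorm(n, sd = 0.5)

n_iter <- 10
# Exponentially decreasing retention ratio: 1 at i = 1, 2/p at i = n_iter
a <- (p / 2)^(1 / (n_iter - 1))
k <- log(p / 2) / (n_iter - 1)
keep_ratio <- a * exp(-k * seq_len(n_iter))

active <- seq_len(p)          # indices of variables still in play
sizes <- integer(n_iter)      # number of variables kept per iteration
for (i in seq_len(n_iter)) {
  # Monte Carlo sub-sampling: draw 80% of the rows without replacement
  rows <- sample.int(n, size = floor(0.8 * n))
  Xs <- scale(X[rows, active, drop = FALSE], scale = FALSE)
  ys <- y[rows] - mean(y[rows])
  # One-component PLS stand-in: weights proportional to |X'y|
  w <- abs(drop(crossprod(Xs, ys)))
  # Exponential elimination: keep only the top-weighted variables
  n_keep <- min(length(active), max(2L, round(p * keep_ratio[i])))
  top <- order(w, decreasing = TRUE)[seq_len(n_keep)]
  # ARS: weighted sampling with replacement, then drop duplicates
  draws <- sample.int(n_keep, size = n_keep, replace = TRUE, prob = w[top])
  active <- sort(active[top[unique(draws)]])
  sizes[i] <- length(active)
}
sizes
active
```

Because sampling is done with replacement, variables with small weights are often never drawn, so the active set can shrink faster than the retention ratio alone would force.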
A named list with:
best_features: Sorted 1-based column indices of selected features.
best_rmsecv: Lowest RMSECV achieved across all iterations.
rmsecv_history: Numeric vector of best RMSECV per iteration.
num_features_history: Integer vector of feature count per iteration.
plot: A ggplot2 object of the RMSECV curve.
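To make the RMSECV values concrete, here is a minimal sketch of k-fold RMSECV for a fixed feature subset. It assumes ordinary least squares (`lm`) in place of PLS so that it runs in base R alone; the helper name `rmsecv` and the simulated data are hypothetical.

```r
# Illustrative k-fold RMSECV for a fixed feature subset (OLS instead of PLS).
set.seed(1)
n <- 100
X <- matrix(rnorm(n * 10), nrow = n)
y <- 2 * X[, 2] - 1.5 * X[, 7] + rnorm(n, sd = 0.5)

rmsecv <- function(X, y, features, k = 5) {
  Xs <- X[, features, drop = FALSE]
  colnames(Xs) <- paste0("v", features)
  df <- data.frame(y = y, Xs)
  # Randomly assign each sample to one of k folds
  folds <- sample(rep(seq_len(k), length.out = nrow(df)))
  sq_err <- numeric(nrow(df))
  for (f in seq_len(k)) {
    test <- folds == f
    model <- lm(y ~ ., data = df[!test, , drop = FALSE])
    pred <- predict(model, newdata = df[test, , drop = FALSE])
    sq_err[test] <- (df$y[test] - pred)^2
  }
  # Root mean squared error over all held-out predictions
  sqrt(mean(sq_err))
}

rm_true  <- rmsecv(X, y, features = c(2, 7))  # the informative columns
rm_noise <- rmsecv(X, y, features = c(1, 3))  # pure-noise columns
c(informative = rm_true, noise = rm_noise)
```

A subset containing the informative columns should give a clearly lower RMSECV than a subset of noise columns, which is the criterion used to compare candidate subsets.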
set.seed(1)
X <- matrix(rnorm(100 * 200), nrow = 100)
y <- X[, 5] * 2 + X[, 50] * -1.5 + rnorm(100, sd = 0.5)
cars_obj <- CARSAlgorithm(max_iter = 15, N = 30, cv_folds = 5)
result <- fit(cars_obj, X, y, max_components = 8)
cat("Best RMSECV :", result$best_rmsecv, "\n")
cat("Selected features:", result$best_features, "\n")
Print method for CARSAlgorithm objects
## S3 method for class 'CARSAlgorithm'
print(x, ...)
x |
A CARSAlgorithm object. |
... |
Ignored. |
No return value; called for side effects.
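For context, an S3 print method of this kind might look like the sketch below. The class name `cars_config_demo` and the output layout are hypothetical. Returning the object invisibly is a common base R convention for print methods and is an assumption here; the package's own method may behave differently.

```r
# Hypothetical print method for an illustrative class.
print.cars_config_demo <- function(x, ...) {
  cat("<cars_config_demo>\n")
  # One line per hyperparameter, e.g. "  max_iter: 20"
  for (nm in names(x)) {
    cat("  ", nm, ": ", format(x[[nm]]), "\n", sep = "")
  }
  invisible(x)  # printing is the side effect; the object is returned invisibly
}

cfg <- structure(list(max_iter = 20, N = 30), class = "cars_config_demo")
print(cfg)
```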