celer is a Python package that solves Lasso-like problems and provides estimators that follow the scikit-learn API. Thanks to a tailored implementation, celer provides a fast solver that tackles large-scale datasets with millions of features up to 100 times faster than scikit-learn.
Currently, the package handles the following problems:
Problem | Support Weights | Native cross-validation |
---|---|---|
Lasso | ✓ | ✓ |
ElasticNet | ✓ | ✓ |
Group Lasso | ✓ | ✓ |
Multitask Lasso | ✕ | ✓ |
Sparse Logistic regression | ✕ | ✕ |
If you are interested in other models, such as non-convex penalties (SCAD, MCP), sparse group lasso, group logistic regression, Poisson regression, or Tweedie regression, have a look at our companion package skglm.
celer is licensed under the BSD 3-Clause license, so you are free to use it.
If you do so, please cite:
@InProceedings{pmlr-v80-massias18a,
title = {Celer: a Fast Solver for the Lasso with Dual Extrapolation},
author = {Massias, Mathurin and Gramfort, Alexandre and Salmon, Joseph},
booktitle = {Proceedings of the 35th International Conference on Machine Learning},
pages = {3321--3330},
year = {2018},
volume = {80},
}
@article{massias2020dual,
author = {Mathurin Massias and Samuel Vaiter and Alexandre Gramfort and Joseph Salmon},
title = {Dual Extrapolation for Sparse GLMs},
journal = {Journal of Machine Learning Research},
year = {2020},
volume = {21},
number = {234},
pages = {1-33},
url = {http://jmlr.org/papers/v21/19-587.html}
}
celer is specially designed to handle Lasso-like problems, which makes it a fast solver for such problems.
In particular, it comes with tools such as:
- automated parallel cross-validation
- support of sparse and dense data
- optional feature centering and normalization
- unpenalized intercept fitting
celer also provides easy-to-use estimators, as it is designed to follow the scikit-learn API.
To get started, install celer via pip:
pip install -U celer
In your Python console, run the following commands to fit a Lasso estimator on a toy dataset:
>>> from celer import Lasso
>>> from celer.datasets import make_correlated_data
>>> X, y, _ = make_correlated_data(n_samples=100, n_features=1000)
>>> estimator = Lasso()
>>> estimator.fit(X, y)
This is just a starter example.
Make sure to browse the celer documentation to learn more about its features.
To get familiar with the celer API, you can also explore the gallery of examples, which includes examples on real-life datasets as well as timing comparisons with other solvers.
celer is an open-source project and hence relies on community efforts to evolve.
Your contribution is highly valuable and can come in three forms:
- bug report: you may encounter a bug while using celer. Don't hesitate to report it on the issue section.
- feature request: you may want to extend/add new features to celer. You can use the issue section to make suggestions.
- pull request: you may have fixed a bug, enhanced the documentation, ... you can submit a pull request and we will respond asap.
For the last form of contribution, here are the steps to help you set up celer on your local machine:
- Fork the repository and afterwards run the following command to clone it on your local machine
git clone https://github.com/{YOUR_GITHUB_USERNAME}/celer.git
- cd to the celer directory and install it in editable mode by running
cd celer
pip install -e .
- To run the gallery examples and build the documentation, run the following
cd doc
pip install -e .[doc]
make html