new file: packages/orf/orf.1.0.1/opam #22843

Merged · 6 commits · Jul 3, 2024
64 changes: 64 additions & 0 deletions packages/orf/orf.1.0.1/opam
@@ -0,0 +1,64 @@
opam-version: "2.0"
authors: "Francois Berenger"
maintainer: "unixjunkie@sdf.org"
homepage: "https://github.com/UnixJunkie/orf"
bug-reports: "https://github.com/UnixJunkie/orf/issues"
dev-repo: "git+https://github.com/UnixJunkie/orf.git"
license: "LGPL-2.1-or-later WITH OCaml-LGPL-linking-exception"
build: ["dune" "build" "-p" name "-j" jobs]
depends: [
"batteries" {>= "3.2.0"}
"cpm" {>= "6.0.0"}
"dolog" {>= "4.0.0"}
"dune" {>= "2.8"}
"minicli"
"molenc"
"ocaml"
"parany" {>= "11.0.0"}
"line_oriented"
]
depopts: [
"conf-gnuplot"
]
synopsis: "OCaml Random Forests"
description:"""
Random Forests (RFs) can do classification or regression modeling.

Random Forests are one of the workhorse of modern machine
learning. Especially, they cannot over-fit to the training set, are
fast to train, predict fast, parallelize well and give you a reasonable
model even without optimizing the model's default hyper-parameters. In
other words, it is hard to shoot yourself in the foot while training
or exploiting a Random Forests model. In comparison, with deep neural
networks it is very easy to shoot yourself in the foot.

Using out-of-bag (OOB) samples, you can even get an estimate of an
RF's performance, without the need for a held-out (test) data set.

Their only drawback is that RFs, being an ensemble model, cannot
predict values outside of the range seen in the training set (a
serious limitation if you are trying to optimize or minimize something
in order to discover outliers relative to your training set samples).

For the moment, this implementation only considers sparse vectors of
integers as features; i.e. categorical variables need to be
one-hot encoded.
For classification, the dependent variable must be an integer
(encoding a class label).
For regression, the dependent variable must be a float.

Bibliography
============

Breiman, Leo. (1996). Bagging Predictors. Machine learning, 24(2),
123-140.

Breiman, Leo. (2001). Random Forests. Machine learning, 45(1), 5-32.

Geurts, P., Ernst, D., & Wehenkel, L. (2006). Extremely Randomized
Trees. Machine learning, 63(1), 3-42."""
url {
src: "https://github.com/UnixJunkie/orf/archive/refs/tags/v1.0.1.tar.gz"
checksum: "md5=8e58bcd5ccfd2cb11ab5c09a23000379"
}
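
The OOB remark in the description is worth unpacking: each tree is
trained on a bootstrap sample drawn with replacement, so roughly a
third of the rows are never seen by a given tree and can act as a free
validation set for it. Below is a minimal OCaml sketch of that
bookkeeping only; it is independent of orf's actual API, and all
function names are made up for illustration.

(* Draw n indices with replacement; indices never drawn are the
   out-of-bag (OOB) samples for that tree. *)
let bootstrap_sample rng n =
  Array.init n (fun _ -> Random.State.int rng n)

(* Indices in [0, n-1] that do not appear in [in_bag]. *)
let oob_indices n in_bag =
  let seen = Array.make n false in
  Array.iter (fun i -> seen.(i) <- true) in_bag;
  List.filter (fun i -> not seen.(i)) (List.init n (fun i -> i))

let () =
  let rng = Random.State.make [| 42 |] in
  let n = 1000 in
  let in_bag = bootstrap_sample rng n in
  (* Expect roughly (1 - 1/e), i.e. ~37% of samples, to be OOB. *)
  Printf.printf "OOB samples: %d/%d\n"
    (List.length (oob_indices n in_bag)) n

Aggregating, for each sample, the predictions of only those trees for
which it was OOB yields a performance estimate without a held-out set.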
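The extrapolation limitation follows from how a forest predicts: each
tree outputs an average of training targets, and the forest averages
the trees, so the result is a convex combination that can never leave
the training range. A toy OCaml illustration (all values invented):

(* Each tree predicts some mean of training targets, so every tree
   prediction, and hence their average, lies within the training range. *)
let mean a = Array.fold_left (+.) 0.0 a /. float_of_int (Array.length a)

let () =
  let training_targets = [| 1.0; 2.5; 4.0; 7.5 |] in
  let tree_preds = [| 2.0; 3.5; 6.0 |] in  (* each within [1.0, 7.5] *)
  Printf.printf "forest prediction: %g (training max: %g)\n"
    (mean tree_preds)
    (Array.fold_left max neg_infinity training_targets)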
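Since features must be sparse vectors of integers, a categorical
variable has to be expanded into indicator features first. Here is a
hedged sketch of such one-hot encoding; the (index, value) pair
representation of a sparse vector is an assumption for illustration,
not orf's actual input type.

(* Position of [x] in [categories]; raises if the category is unknown. *)
let index_of x categories =
  let rec go i = function
    | [] -> invalid_arg "index_of: unknown category"
    | c :: rest -> if c = x then i else go (i + 1) rest
  in
  go 0 categories

(* One-hot: a categorical value becomes a single (feature_index, 1) pair. *)
let one_hot ~offset categories x =
  [ (offset + index_of x categories, 1) ]

let () =
  let colors = [ "red"; "green"; "blue" ] in
  List.iter (fun (i, v) -> Printf.printf "%d:%d\n" i v)
    (one_hot ~offset:0 colors "green")  (* prints 1:1 *)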