
Evovest/EvoTrees.jl: Boosted trees in Julia


Project name: Evovest/EvoTrees.jl

Repository: https://github.com/Evovest/EvoTrees.jl

Language: Julia 100.0%

EvoTrees


A Julia implementation of boosted trees with CPU and GPU support. Efficient histogram-based algorithms with support for multiple loss functions, notably multi-target objectives such as maximum-likelihood methods.

R binding available.

Input features are expected to be a Matrix{Float64} or Matrix{Float32}. Tables/DataFrames can be handled through MLJ (see below).

Supported tasks

CPU

  • linear
  • logistic
  • Poisson
  • L1 (MAE regression)
  • quantile
  • multi-classification (softmax)
  • Gaussian (max likelihood)

Set parameter device="cpu".

GPU

  • linear
  • logistic
  • Gaussian (max likelihood)

Set parameter device="gpu".
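Switching backends only requires changing the `device` parameter on the model constructor. A minimal sketch, assuming `EvoTreeGaussian` accepts `device` as a keyword argument (naming per the Parameters section below); the values are illustrative:

```julia
using EvoTrees

# Gaussian max-likelihood objective trained on the GPU;
# device="cpu" (the default) runs the same configuration on CPU
config = EvoTreeGaussian(nrounds=100, max_depth=5, device="gpu")
```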

Installation

Latest:

julia> Pkg.add(url="https://github.com/Evovest/EvoTrees.jl")

From General Registry:

julia> Pkg.add("EvoTrees")

Performance

Data consists of randomly generated Float32 values. Training is performed over 200 iterations. Code to reproduce is available in the repository.

EvoTrees: v0.8.4; XGBoost: v1.1.1

CPU: AMD Threadripper 3970X (16 threads); GPU: NVIDIA RTX 2080

Training:

Dimensions / Algo   XGBoost Hist   EvoTrees   EvoTrees GPU
100K x 100          1.10s          1.80s      3.14s
500K x 100          4.83s          4.98s      4.98s
1M x 100            9.84s          9.89s      7.37s
5M x 100            45.5s          53.8s      25.8s

Inference:

Dimensions / Algo   XGBoost Hist   EvoTrees   EvoTrees GPU
100K x 100          0.164s         0.026s     0.013s
500K x 100          0.796s         0.175s     0.055s
1M x 100            1.59s          0.396s     0.108s
5M x 100            7.96s          2.15s      0.543s

Parameters

  • loss: one of {:linear, :logistic, :poisson, :L1, :quantile, :softmax, :gaussian}
  • device: "cpu" (default) or "gpu"
  • nrounds: number of boosting rounds, integer, default=10
  • λ: L2 regularization, float, default=0.0
  • γ: minimum gain required to make a split, float, default=0.0
  • η: learning rate, float, default=0.1
  • max_depth: maximum tree depth, integer, default=5
  • min_weight: minimum weight needed in a child node, float >= 0, default=1.0
  • rowsample: fraction of rows sampled per round, float in [0, 1], default=1.0
  • colsample: fraction of columns sampled per round, float in [0, 1], default=1.0
  • nbins: number of bins into which each feature is quantized, integer, default=64
  • α: sets the quantile for :quantile loss, or the bias in :L1, float in [0, 1], default=0.5
  • metric: evaluation metric, one of {:mse, :rmse, :mae, :logloss, :quantile, :gini, :gaussian, :none}, default=:none
  • rng: random number generator, either a Random.AbstractRNG or an Int used as a seed, default=123
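To make the parameter list concrete, here is a sketch of a fully specified regressor, assuming all of the above are keyword arguments of the `EvoTreeRegressor` constructor (as in the MLJ example below); the values are illustrative, not recommendations:

```julia
using EvoTrees

# every keyword matches a parameter from the list above
config = EvoTreeRegressor(
    loss=:linear, device="cpu",
    nrounds=100, λ=1.0, γ=0.0, η=0.05,
    max_depth=6, min_weight=1.0,
    rowsample=0.8, colsample=0.8,
    nbins=64, α=0.5, metric=:mse, rng=123)
```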

MLJ Integration

See official project page for more info.

using StatsBase: sample
using Statistics: mean
using EvoTrees
using EvoTrees: sigmoid, logit
using MLJBase

# synthetic data: a single feature with a noisy, sigmoid-shaped target
features = rand(10_000) .* 5 .- 2
X = reshape(features, (size(features)[1], 1))
Y = sin.(features) .* 0.5 .+ 0.5
Y = logit.(Y) + randn(size(Y))
Y = sigmoid.(Y)
y = Y
X = MLJBase.table(X)

# @load EvoTreeRegressor
# linear regression
tree_model = EvoTreeRegressor(loss=:linear, max_depth=5, η=0.05, nrounds=10)

# set machine
mach = machine(tree_model, X, y)

# partition data
train, test = partition(eachindex(y), 0.7, shuffle=true); # 70:30 split

# fit data
fit!(mach, rows=train, verbosity=1)

# continue training
mach.model.nrounds += 10
fit!(mach, rows=train, verbosity=1)

# predict on train data
pred_train = predict(mach, selectrows(X, train))
mean(abs.(pred_train - selectrows(Y, train)))

# predict on test data
pred_test = predict(mach, selectrows(X, test))
mean(abs.(pred_test - selectrows(Y, test)))

Getting started using the internal API

Minimal example to fit a noisy sinus wave.
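The original code for this example did not survive extraction. Below is a minimal sketch based on the internal `fit_evotree`/`predict` API, assuming `fit_evotree` takes the parameter struct followed by the feature matrix and target vector; hyperparameter values are illustrative:

```julia
using EvoTrees

# noisy sinus wave: one feature, target = sin(x) + Gaussian noise
X = rand(2_000, 1) .* 2π .- π
Y = sin.(X[:, 1]) .+ randn(2_000) .* 0.1

# parameter names follow the Parameters section above
params = EvoTreeRegressor(loss=:linear, nrounds=100, η=0.05, max_depth=5, nbins=32)
model = fit_evotree(params, X, Y)
pred = predict(model, X)
```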

