JuliaStats/NMF.jl: A Julia package for non-negative matrix factorization


Project name:

JuliaStats/NMF.jl

Project URL:

https://github.com/JuliaStats/NMF.jl

Programming language:

Julia 100.0%

Project description:

NMF.jl

A Julia package for non-negative matrix factorization (NMF).

Development Status

Note: Non-negative matrix factorization is an area of active research, and new algorithms are proposed every year. Contributions are very welcome.

Done

Lee & Seung's Multiplicative Update (for both MSE & Divergence objectives)
(Naive) Projected Alternating Least Squares
ALS Projected Gradient Methods
Coordinate Descent Methods
Random Initialization
NNDSVD Initialization
Sparse NMF
Separable NMF

To do

Probabilistic NMF

Overview

Non-negative Matrix Factorization (NMF) generally refers to the techniques for factorizing a non-negative matrix X into the product of two lower rank matrices W and H, such that WH optimally approximates X in some sense. Such techniques are widely used in text mining, image analysis, and recommendation systems.
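To make the shapes concrete, here is a minimal sketch in plain Julia (standard library only, not this package's API). The sizes, the variable names, and the helper relerr are arbitrary choices for illustration. It builds a non-negative X from known factors and measures how well a pair (W, H) reconstructs it in the Frobenius norm, one common meaning of "approximates X in some sense".

```julia
using LinearAlgebra, Random

Random.seed!(1)                      # fixed seed, only for reproducibility

p, n, k = 6, 8, 2                    # X is p×n, the target rank is k
W_true = rand(p, k)                  # non-negative p×k factor
H_true = rand(k, n)                  # non-negative k×n factor
X = W_true * H_true                  # non-negative matrix of rank at most k

# Relative Frobenius reconstruction error of a candidate factorization.
relerr(X, W, H) = norm(X - W * H) / norm(X)

println(size(X))                     # (6, 8)
println(relerr(X, W_true, H_true))   # 0.0, since X was built from these factors
```

In practice only X is known, and an NMF algorithm searches for non-negative W and H that make this error (or another objective) small.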

This package provides two sets of tools, one for initialization and one for optimization. A typical NMF procedure consists of two steps: (1) use an initialization function to initialize W and H; and (2) use an optimization algorithm to pursue the optimal solution.

Most types and functions in this package (except the high-level function nnmf) are not exported. Users are encouraged to call them with the NMF. prefix. This lets the package use shorter names internally and makes user code more explicit.

High-Level Interface

The package provides a high-level function nnmf that runs the entire procedure (initialization + optimization):

nnmf(X, k, ...)

This function factorizes the input matrix X into the product of two non-negative matrices W and H.

In general, it returns a result instance of type NMF.Result, which is defined as

struct Result
    W::Matrix{Float64}    # W matrix
    H::Matrix{Float64}    # H matrix
    niters::Int           # number of elapsed iterations
    converged::Bool       # whether the optimization procedure converges
    objvalue::Float64     # objective value of the last step
end

The function supports the following keyword arguments:

init: A symbol that indicates the initialization method (default = :nndsvdar).

This argument accepts the following values:
- random: matrices filled with uniformly random values
- nndsvd: standard version of NNDSVD
- nndsvda: NNDSVDa variant
- nndsvdar: NNDSVDar variant
- spa: Successive Projection Algorithm
- custom: use custom matrices W0 and H0

alg: A symbol that indicates the factorization algorithm (default = :greedycd).

This argument accepts the following values:
- multmse: Multiplicative update (using MSE as objective)
- multdiv: Multiplicative update (using divergence as objective)
- projals: (Naive) Projected Alternating Least Squares
- alspgrad: Alternating Least Squares using Projected Gradient Descent
- cd: Coordinate Descent solver that uses Fast Hierarchical Alternating Least Squares (implementation similar to scikit-learn)
- greedycd: Greedy Coordinate Descent
- spa: Successive Projection Algorithm

maxiter: Maximum number of iterations (default = 100).

tol: Tolerance of changes upon convergence (default = 1.0e-6).

replicates: Number of times to perform factorization (default = 1).

W0: Option for custom initialization (default = nothing).

H0: Option for custom initialization (default = nothing).

Note: W0 and H0 may be overwritten. To avoid this, pass in copies.

update_H: Option for specifying whether to update H (default = true).

verbose: Whether to show procedural information (default = false).

Initialization

NMF.randinit(X, k[; zeroh=false, normalize=false])

Initialize W and H given the input matrix X and the rank k. This function returns a pair (W, H).

If X has size (p, n), then W and H have sizes (p, k) and (k, n) respectively.
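The signature above also lists two keyword arguments, zeroh and normalize. The following is a rough sketch, in plain Julia, of the behavior they select. The name randinit_sketch is hypothetical, and drawing entries uniformly from [0, 1) is an assumption. It is a stand-in for illustration, not the package's implementation.

```julia
using Random

# Hypothetical stand-in for illustration; not the package's implementation.
function randinit_sketch(X::AbstractMatrix, k::Integer;
                         zeroh::Bool=false, normalize::Bool=false)
    p, n = size(X)
    W = rand(p, k)                           # uniform random entries in [0, 1)
    if normalize
        W ./= sum(W, dims=1)                 # make every column of W sum to one
    end
    H = zeroh ? zeros(k, n) : rand(k, n)     # zero H when only W is needed
    return W, H
end

X = rand(4, 7)
W, H = randinit_sketch(X, 3; zeroh=true, normalize=true)
println(size(W), " ", size(H))               # (4, 3) (3, 7)
```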

Usage:
```
W, H = NMF.randinit(X, 3)
```
For some algorithms (e.g. ALS), only W needs to be initialized. In such cases, one may set the keyword argument zeroh to true; the output H will then simply be a zero matrix of size (k, n).

Another keyword argument is normalize. If normalize is set to true, the columns of W are normalized so that each column sums to one.

NMF.nndsvd(X, k[; zeroh=false, variant=:std])

Use the Non-Negative Double Singular Value Decomposition (NNDSVD) algorithm to initialize W and H.

Reference: C. Boutsidis, and E. Gallopoulos. SVD based initialization: A head start for nonnegative matrix factorization. Pattern Recognition, 2007.
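For readers who want to see the idea behind the standard variant, here is a minimal sketch using only Julia's LinearAlgebra. The name nndsvd_sketch is hypothetical and the code is a simplified reading of the reference, not this package's implementation. Each singular triplet is split into positive and negative parts, and the dominant non-negative pair is kept.

```julia
using LinearAlgebra

# Sketch of the standard NNDSVD variant; `nndsvd_sketch` is a hypothetical name.
function nndsvd_sketch(X::AbstractMatrix{<:Real}, k::Integer)
    p, n = size(X)
    F = svd(X)                                   # X = U * Diagonal(S) * V'
    W = zeros(p, k)
    H = zeros(k, n)

    # The leading singular vectors of a non-negative matrix can be taken
    # non-negative, so their absolute values are used directly.
    W[:, 1] = sqrt(F.S[1]) .* abs.(F.U[:, 1])
    H[1, :] = sqrt(F.S[1]) .* abs.(F.V[:, 1])

    for j in 2:k
        u, v = F.U[:, j], F.V[:, j]
        up, un = max.(u, 0), max.(-u, 0)         # positive and negative parts of u
        vp, vn = max.(v, 0), max.(-v, 0)         # positive and negative parts of v
        mp = norm(up) * norm(vp)
        mn = norm(un) * norm(vn)
        # Keep whichever sign pattern carries more of the rank-one term.
        a, b, m = mp >= mn ? (up, vp, mp) : (un, vn, mn)
        m == 0 && continue                       # degenerate term: leave zeros
        scale = sqrt(F.S[j] * m)
        W[:, j] = scale .* a ./ norm(a)
        H[j, :] = scale .* b ./ norm(b)
    end
    return W, H
end

# For a rank-one non-negative matrix, a single component reproduces X.
X1 = [1.0, 2.0, 3.0] * [4.0 5.0]
W1, H1 = nndsvd_sketch(X1, 1)
println(norm(W1 * H1 - X1) < 1e-10)              # true
```

The zeros this standard version leaves in W and H are the reason it produces a rather sparse initialization.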

Usage:
```
W, H = NMF.nndsvd(X, k)
```
This function has two keyword arguments:
- zeroh: set H to all zeros when it is true, or have H initialized when it is false (the default).
- variant: the variant of the algorithm. The default is std, the standard version, which generates a rather sparse W. The other values, a and ar, select the NNDSVDa and NNDSVDar variants respectively. In particular, ar is recommended for dense NMF.

NMF.spa(X, k)

Use the Successive Projection Algorithm (SPA) to initialize W and H.

Reference: N. Gillis and S. A. Vavasis, Fast and robust recursive algorithms for separable nonnegative matrix factorization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 4, pp. 698-714, 2013.

Usage:
```
W, H = NMF.spa(X, k)
```

Factorization Algorithms

This package provides multiple factorization algorithms. Each algorithm corresponds to a type. One can create an algorithm instance by choosing a type and specifying the options, and run the algorithm using NMF.solve!:

The NMF.solve! Function

NMF.solve!(alg, X, W, H)

Use the algorithm alg to factorize X into W and H.

Here, W and H must be pre-allocated matrices (respectively of size (p, k) and (k, n)). W and H must be appropriately initialized before this function is invoked. For some algorithms, both W and H must be initialized (e.g. multiplicative updating); while for others, only W needs to be initialized (e.g. ALS).

The matrices W and H are updated in place.

Algorithms

Multiplicative Updating

Reference: Daniel D. Lee and H. Sebastian Seung. Algorithms for Non-negative Matrix Factorization. Advances in NIPS, 2001.

This algorithm supports two kinds of objectives: minimizing mean-squared-error (:mse) and minimizing divergence (:div). Both W and H need to be initialized.

MultUpdate(obj::Symbol=:mse,        # objective, either :mse or :div
           maxiter::Integer=100,    # maximum number of iterations
           verbose::Bool=false,     # whether to show procedural information
           tol::Real=1.0e-6,        # tolerance of changes on W and H upon convergence
           update_H::Bool=true,     # whether to update H
           lambda_w::Real=0.0,      # L1 regularization coefficient for W
           lambda_h::Real=0.0)      # L1 regularization coefficient for H

Note: the values above are default values for the keyword arguments. One can override part (or all) of them.
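As a rough illustration of what one iteration does for the :mse objective, the sketch below applies the update rules from the Lee and Seung reference in plain Julia. The name mult_update_sketch! and the small constant eps_div, which guards against division by zero, are assumptions for this example. The package's own solver, including its convergence test and L1 regularization, is not reproduced here.

```julia
using LinearAlgebra, Random

# One pass of the Lee & Seung updates for the squared-error objective,
# applied in place. `mult_update_sketch!` is a hypothetical name, and
# `eps_div` (guarding against division by zero) is an assumption.
function mult_update_sketch!(X, W, H; eps_div=1e-12)
    H .*= (W' * X) ./ (W' * W * H .+ eps_div)
    W .*= (X * H') ./ (W * (H * H') .+ eps_div)
    return W, H
end

Random.seed!(1)
X = rand(10, 12)
W, H = rand(10, 3), rand(3, 12)      # both factors need a positive start

err_before = norm(X - W * H)
for _ in 1:50
    mult_update_sketch!(X, W, H)
end
err_after = norm(X - W * H)

println(err_after < err_before)      # the error goes down from the random start
```

Because every update multiplies by a non-negative ratio, entries that start non-negative stay non-negative, and an entry that is exactly zero stays zero. This is why both W and H need a proper initialization.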

(Naive) Projected Alternating Least Squares

This algorithm alternately updates W and H while holding the other fixed. Each update step solves for W or H without enforcing the non-negativity constraint, and then sets all negative entries to zero. Only W needs to be initialized.

ProjectedALS(maxiter::Integer=100,    # maximum number of iterations
             verbose::Bool=false,     # whether to show procedural information
             tol::Real=1.0e-6,        # tolerance of changes on W and H upon convergence
             update_H::Bool=true,     # whether to update H
             lambda_w::Real=1.0e-6,   # L2 regularization coefficient for W
             lambda_h::Real=1.0e-6)   # L2 regularization coefficient for H
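The two half-steps described above can be sketched in plain Julia as follows. The name projected_als_sketch! is hypothetical, and treating lambda_w and lambda_h as ridge terms added to the normal equations is an assumption about how the L2 coefficients enter. It is not taken from the package source.

```julia
using LinearAlgebra, Random

# One projected-ALS sweep: solve each ridge-regularized least-squares problem
# without the sign constraint, then clamp negative entries to zero.
# `projected_als_sketch!` is a hypothetical name, not the package's solver.
function projected_als_sketch!(X, W, H; lambda_w=1e-6, lambda_h=1e-6)
    # H-step: minimize ||X - W H||^2 + lambda_h ||H||^2 over H, then project.
    H .= max.((W' * W + lambda_h * I) \ (W' * X), 0)
    # W-step: minimize ||X - W H||^2 + lambda_w ||W||^2 over W, then project.
    W .= max.(((H * H' + lambda_w * I) \ (H * X'))', 0)
    return W, H
end

Random.seed!(1)
X = rand(8, 9)
W = rand(8, 3)
H = zeros(3, 9)                      # only W needs a meaningful starting value
for _ in 1:20
    projected_als_sketch!(X, W, H)
end
println(all(W .>= 0) && all(H .>= 0))   # true: both factors stay non-negative
```

The H-step overwrites H entirely from W and X, which is why H can start as a zero matrix.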

Alternating Least Squares Using Projected Gradient Descent

Reference: Chih-Jen Lin. Projected Gradient Methods for Non-negative Matrix Factorization. Neural Computation, 19 (2007).

This algorithm adopts the alternating least squares strategy. An efficient projected gradient descent method is used to solve each sub-problem. Both W and H need to be initialized.

ALSPGrad(maxiter::Integer=100,      # maximum number of iterations (in main procedure)
         maxsubiter::Integer=200,   # maximum number of iterations in solving each sub-problem
         tol::Real=1.0e-6,          # tolerance of changes on W and H upon convergence
         tolg::Real=1.0e-4,         # tolerable gradient norm in sub-problem (first-order optimality)
         update_H::Bool=true,       # whether to update H
         verbose::Bool=false)       # whether to show procedural information
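To illustrate what "projected gradient descent on a sub-problem" means, the sketch below solves the H sub-problem with W held fixed. It deliberately swaps in a constant step size 1/L for the step-size search used in the reference, so it is a simplified variant rather than Lin's method or the package's solver. The name pgrad_h_sketch! is hypothetical, and the keywords maxsubiter and tolg only mirror the roles described above.

```julia
using LinearAlgebra, Random

# Projected gradient descent on min over H >= 0 of 0.5 * ||X - W H||_F^2 with W
# fixed. A constant step 1/L replaces the line search of the reference method.
# `pgrad_h_sketch!` is a hypothetical name.
function pgrad_h_sketch!(X, W, H; maxsubiter=200, tolg=1e-4)
    WtW = W' * W
    WtX = W' * X
    step = 1 / opnorm(WtW)                   # L = largest eigenvalue of W'W
    for _ in 1:maxsubiter
        G = WtW * H - WtX                    # gradient with respect to H
        # Projected gradient: drop components that would push a zero entry negative.
        PG = ifelse.((H .> 0) .| (G .< 0), G, 0.0)
        norm(PG) <= tolg && break            # first-order optimality reached
        H .= max.(H .- step .* G, 0)         # gradient step, then projection
    end
    return H
end

Random.seed!(1)
X = rand(8, 9); W = rand(8, 3); H = rand(3, 9)
before = norm(X - W * H)
pgrad_h_sketch!(X, W, H)
after = norm(X - W * H)
println(after <= before)                     # the sub-problem objective does not increase
```

The W sub-problem has the same form after transposing, so a full alternating scheme would call the same routine on the transposed matrices.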

Coordinate Descent solver with Fast Hierarchical Alternating Least Squares

Reference: Andrzej Cichocki and Anh-Huy Phan. Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 92.3: 708-721 (2009).

Sequential constrained minimization on a set of squared Euclidean distances over W and H matrices. Uses l_1 and l_2 penalties to enforce sparsity.

CoordinateDescent(maxiter::Integer=100,      # maximum number of iterations (in main procedure)
                  verbose::Bool=false,       # whether to show procedural information
                  tol::Real=1.0e-6,          # tolerance of changes on W and H upon convergence
                  update_H::Bool=true,       # whether to update H
                  α::Real=0.0,               # constant that multiplies the regularization terms
                  regularization=:both,      # select whether the regularization affects the components (H), the transformation (W), both or none of them (:components, :transformation, :both, :none)
                  l₁ratio::Real=0.0,         # l1 / l2 regularization mixing parameter (in [0; 1])
                  shuffle::Bool=false)       # if true, randomize the order of coordinates in the CD solver
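A hedged sketch of one such sweep over the rows of H, with W held fixed, is given below in plain Julia. The names hals_h_sketch!, l1 and l2 are hypothetical. Mapping them to the options above as l1 = α·l₁ratio and l2 = α·(1 − l₁ratio) follows the scikit-learn convention and is an assumption about this package, not something stated here.

```julia
using LinearAlgebra, Random

# One coordinate-descent (HALS-style) sweep over the rows of H with W fixed, for
# the objective 0.5 * ||X - W H||_F^2 + l1 * sum(H) + (l2 / 2) * ||H||_F^2.
# `hals_h_sketch!`, `l1` and `l2` are hypothetical names.
function hals_h_sketch!(X, W, H; l1=0.0, l2=0.0)
    WtW = W' * W
    WtX = W' * X
    for j in axes(H, 1)
        d = WtW[j, j] + l2
        d > 0 || continue                    # skip a degenerate (all-zero) column of W
        # Gradient of the objective with respect to row j of H.
        g = WtW[j:j, :] * H .- WtX[j:j, :] .+ l1 .+ l2 .* H[j:j, :]
        # Exact minimizer along row j, clamped at zero.
        H[j:j, :] .= max.(H[j:j, :] .- g ./ d, 0)
    end
    return H
end

Random.seed!(1)
X = rand(8, 9); W = rand(8, 3); H = rand(3, 9)
before = norm(X - W * H)
hals_h_sketch!(X, W, H)
after = norm(X - W * H)
println(after <= before)                     # each row update is an exact minimization
```

A positive l1 subtracts a constant before clamping, which drives small entries to exactly zero. This is how the l_1 penalty produces sparsity.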

Greedy Coordinate Descent

Reference: Cho-Jui Hsieh and Inderjit S. Dhillon. Fast coordinate descent methods with variable selection for non-negative matrix factorization. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1064–1072 (2011).

This algorithm is a fast coordinate descent method with variable selection. Both W and H need to be initialized.

GreedyCD(maxiter::Integer=100,  # maximum number of iterations (in main procedure)
         verbose::Bool=false,   # whether to show procedural information
         tol::Real=1.0e-6,      # tolerance of changes on W and H upon convergence
         update_H::Bool=true,   # whether to update H
         lambda_w::Real=0.0,    # L1 regularization coefficient for W
         lambda_h::Real=0.0)    # L1 regularization coefficient for H
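The "variable selection" idea can be sketched for the H sub-problem as follows: rather than cycling through the entries in a fixed order, each step updates the single entry whose exact one-variable minimization lowers the objective the most. This is a simplified reading of the reference in plain Julia. The names greedy_cd_h_sketch! and inner_iters are hypothetical, regularization is omitted, and the package's solver may differ in its details.

```julia
using LinearAlgebra, Random

# Greedy coordinate descent on the H sub-problem with W fixed: within each
# column of H, repeatedly update the single entry whose exact one-variable
# minimization lowers 0.5 * ||x - W h||^2 the most.
# `greedy_cd_h_sketch!` and `inner_iters` are hypothetical names.
function greedy_cd_h_sketch!(X, W, H; inner_iters=50, tol=1e-10)
    WtW = W' * W
    G = WtW * H - W' * X                     # gradient with respect to H
    k = size(H, 1)
    for c in axes(H, 2)
        for _ in 1:inner_iters
            best_i, best_s, best_dec = 0, 0.0, tol
            for i in 1:k
                WtW[i, i] > 0 || continue
                # Optimal non-negative step for entry (i, c) and its decrease.
                s = max(H[i, c] - G[i, c] / WtW[i, i], 0.0) - H[i, c]
                dec = -G[i, c] * s - 0.5 * WtW[i, i] * s^2
                if dec > best_dec
                    best_i, best_s, best_dec = i, s, dec
                end
            end
            best_i == 0 && break             # no entry gives a meaningful decrease
            H[best_i, c] += best_s
            G[:, c] .+= best_s .* WtW[:, best_i]   # keep the gradient column current
        end
    end
    return H
end

Random.seed!(1)
X = rand(8, 9); W = rand(8, 3); H = rand(3, 9)
before = norm(X - W * H)
greedy_cd_h_sketch!(X, W, H)
after = norm(X - W * H)
println(after <= before)                     # only objective-decreasing steps are taken
```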

Successive Projection Algorithm for Separable NMF

Reference: N. Gillis and S. A. Vavasis, "Fast and robust recursive algorithms for separable nonnegative matrix factorization," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 4, pp. 698-714, 2013.

A separable matrix X can be written as X = WH = W[I V]P, where W has rank k, I is the identity matrix, the entries of each column of V sum to at most one, and P is a permutation matrix that arbitrarily rearranges the columns of [I V]. Separable NMF aims to decompose a separable matrix X into two nonnegative factor matrices W and H such that WH equals X. This algorithm is used for separable NMF. Both W and H need to be initialized with init=:spa.
```
SPA(obj::Symbol=:mse)   # objective :mse or :div
```
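The column-selection step that gives SPA its name can be sketched in plain Julia as below. The name spa_select_sketch is hypothetical, and the small test matrix is made up for illustration. Recovering H once the columns are chosen (for example by non-negative least squares) is not shown.

```julia
using LinearAlgebra

# Column selection step of the Successive Projection Algorithm: pick the column
# with the largest residual norm, project it out of every column, and repeat.
# `spa_select_sketch` is a hypothetical name; it returns the selected indices.
function spa_select_sketch(X::AbstractMatrix{<:Real}, k::Integer)
    R = float.(X)                            # residual matrix (a copy of X)
    selected = Int[]
    for _ in 1:k
        j = argmax(vec(sum(abs2, R, dims=1)))    # column with the largest norm
        u = R[:, j] / norm(R[:, j])
        R .-= u * (u' * R)                   # remove the component along u
        push!(selected, j)
    end
    return selected
end

# Separable test matrix: columns 2, 4 and 5 of X are the columns of W themselves,
# columns 1 and 3 are non-negative combinations with weights summing to at most one.
W = [1.0 0 0; 0 2 0; 0 0 3; 1 1 1]
H = [0.5 0 0.2 1 0;
     0.5 1 0.3 0 0;
     0.0 0 0.4 0 1]
X = W * H

idx = spa_select_sketch(X, 3)
println(sort(idx))                           # [2, 4, 5]
```

Taking X[:, idx] then recovers the columns of W, up to their order.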

Examples

Here are examples that demonstrate how to use this package to factorize a non-negative dense matrix.

Use High-level Function: nnmf

using NMF

# X is a non-negative input matrix and k the target rank, both prepared beforehand
r = nnmf(X, k; alg=:multmse, maxiter=30, tol=1.0e-4)

W = r.W
H = r.H

Use Multiplicative Update

import NMF

# initialize
W, H = NMF.randinit(X, 5)

# optimize
NMF.solve!(NMF.MultUpdate{Float64}(obj=:mse, maxiter=100), X, W, H)

Use Naive ALS

import NMF

# initialize
W, H = NMF.randinit(X, 5)

# optimize
NMF.solve!(NMF.ProjectedALS{Float64}(maxiter=50), X, W, H)

Use ALS with Projected Gradient Descent

import NMF

# initialize
W, H = NMF.nndsvd(X, 5, variant=:ar)

# optimize
NMF.solve!(NMF.ALSPGrad{Float64}(maxiter=50, tolg=1.0e-6), X, W, H)

Use Coordinate Descent

import NMF

# initialize
W, H = NMF.nndsvd(X, 5, variant=:ar)

# optimize
NMF.solve!(NMF.CoordinateDescent{Float64}(maxiter=50, α=0.5, l₁ratio=0.5), X, W, H)

Use Greedy Coordinate Descent

import NMF

# initialize
W, H = NMF.nndsvd(X, 5, variant=:ar)

# optimize
NMF.solve!(NMF.GreedyCD{Float64}(maxiter=50), X, W, H)

Use Successive Projection Algorithm for Separable NMF

import NMF

# initialize
W, H = NMF.spa(X, 5)

# optimize
NMF.solve!(NMF.SPA{Float64}(obj=:mse), X, W, H)
