• 设为首页
  • 点击收藏
  • 手机版
    手机扫一扫访问
    迪恩网络手机版
  • 关注官方公众号
    微信扫一扫关注
    迪恩网络公众号

haampie/IncompleteLU.jl: Crout ILU for Julia

原作者: [db:作者] 来自: 网络 收藏 邀请

开源软件名称:

haampie/IncompleteLU.jl

开源软件地址:

https://github.com/haampie/IncompleteLU.jl

开源编程语言:

Julia 100.0%

开源软件介绍:

Build Status codecov

ILU for SparseMatrixCSC

This package implements the left-looking or Crout version of ILU for the SparseMatrixCSC type. It exports the function ilu.

How to install

IncompleteLU can be installed through the Julia package manager:

julia> ]
pkg> add IncompleteLU

The package is then available via

julia> using IncompleteLU

When to use this package

Whenever you need an incomplete factorization of a sparse and non-symmetric matrix.

The package also provides means to apply the factorization in-place via ldiv!, forward_substitution! and backward_substitution!. This is useful in the context of left, right or split preconditioning. See the example below.

Example

Using a drop tolerance of 0.01, we get a reasonable preconditioner with a bit of fill-in.

> using IncompleteLU, LinearAlgebra, SparseArrays
> using BenchmarkTools
> A = sprand(1000, 1000, 5 / 1000) + 10I
> fact = @btime ilu($A, τ = 0.001)
  2.894 ms (90 allocations: 1.18 MiB)
> norm((fact.L + I) * fact.U' - A)
0.05736313452207798
> nnz(fact) / nnz(A)
3.6793806030969844

Full LU is obtained when the drop tolerance is 0.0.

> fact = @btime ilu($A, τ = 0.)
  209.293 ms (106 allocations: 12.18 MiB)
> norm((fact.L + I) * fact.U' - A)
1.5262736852530086e-13
> nnz(fact) / nnz(A)
69.34213528932355

Preconditioner

ILU is typically used as preconditioner for iterative methods. For instance

using IterativeSolvers, IncompleteLU
using SparseArrays, LinearAlgebra
using BenchmarkTools
using Plots

"""
Benchmarks a non-symmetric n × n × n problem
with and without the ILU preconditioner.
"""
function mytest(n = 64)
    N = n^3

    A = spdiagm(
      -1 => fill(-1.0, n - 1), 
       0 => fill(3.0, n), 
       1 => fill(-2.0, n - 1)
    )
    Id = sparse(1.0I, n, n)
    A = kron(A, Id) + kron(Id, A)
    A = kron(A, Id) + kron(Id, A)
    x = ones(N)
    b = A * x

    LU = ilu(A, τ = 0.1)
    @show nnz(LU) / nnz(A)

    # Benchmarks
    prec = @benchmark ilu($A, τ = 0.1)
    @show prec
    with = @benchmark bicgstabl($A, $b, 2, Pl = $LU, max_mv_products = 2000)
    @show with
    without = @benchmark bicgstabl($A, $b, 2, max_mv_products = 2000)
    @show without

    # Result
    x_with, hist_with = bicgstabl(A, b, 2, Pl = LU, max_mv_products = 2000, log = true)
    x_without, hist_without = bicgstabl(A, b, 2, max_mv_products = 2000, log = true)

    @show norm(b - A * x_with) / norm(b) 
    @show norm(b - A * x_without) / norm(b)

    plot(hist_with[:resnorm], yscale = :log10, label = "With ILU preconditioning", xlabel = "Iteration", ylabel = "Residual norm (preconditioned)", mark = :x)
    plot!(hist_without[:resnorm], label = "Without preconditioning", mark = :x)
end

mytest()

Outputs

nnz(LU) / nnz(A) = 2.1180353639352374
prec = Trial(443.781 ms)
with = Trial(766.141 ms)
without = Trial(2.595 s)
norm(b - A * x_with) / norm(b) = 2.619046427010899e-9
norm(b - A * x_without) / norm(b) = 1.2501603557459283e-8

Residual norm with preconditioner

The algorithm

The basic algorithm loops roughly as follows:

for k = 1 : n
  row = zeros(n); row[k:n] = A[k,k:n]
  col = zeros(n); col[k+1:n] = A[k+1:n,k]

  for i = 1 : k - 1 where L[k,i] != 0
    row -= L[k,i] * U[i,k:n]
  end

  for i = 1 : k - 1 where U[i,k] != 0
    col -= U[i,k] * L[k+1:n,i]
  end

  # Apply a dropping rule in row and col

  U[k,:] = row
  L[:,k] = col / U[k,k]
  L[k,k] = 1
end

which means that at each step k a complete row and column are computed based on the previous rows and columns:

          k
+---+---+---+---+---+---+---+---+
| \ |   | x | x | x | x | x | x |
+---+---+---+---+---+---+---+---+
|   | \ | x | x | x | x | x | x |
+---+---+---+---+---+---+---+---+
|   |   | . | . | . | . | . | . | k
+---+---+---+---+---+---+---+---+
| x | x | . | \ |   |   |   |   |
+---+---+---+---+---+---+---+---+
| x | x | . |   | \ |   |   |   |
+---+---+---+---+---+---+---+---+
| x | x | . |   |   | \ |   |   |
+---+---+---+---+---+---+---+---+
| x | x | . |   |   |   | \ |   |
+---+---+---+---+---+---+---+---+
| x | x | . |   |   |   |   | \ |
+---+---+---+---+---+---+---+---+

col and row are the .'s, updated by the x's.

At step k we load (part of) a row and column of the matrix A, and subtract the previous rows and columns times a scalar (basically a SpMV product). The problem is that our matrix is column-major, so that loading a row is not cheap. Secondly, it makes sense to store the L factor column-wise and the U factor row-wise (so that we can append columns and rows without data movement), yet we need access to a row of L and a column of U.

The latter problem can be worked around without expensive searches. It's basically smart bookkeeping: going from step k to k+1 requires updating indices to the next nonzero of each row of U after column k. If you now store for each column of U a list of nonzero indices, this is the moment you can update it. Similarly for the L factor.

The matrix A can be read row by row as well with the same trick.

Accumulating a new sparse row or column

Throughout the steps two temporary row and column accumulators are used to store the linear combinations of previous sparse rows and columns. There are two implementations of this accumulator: the SparseVectorAccumulator performs insertion in O(1), but stores the indices unordered; therefore a sort is required when appending to the SparseMatrixCSC. The InsertableSparseVector performs insertion sort, which can be slow, but turns out to be fast in practice. The latter is a result of insertion itself being an O(1) operation due to a linked list structure, and the fact that sorted vectors are added, so that the linear scan does not have to restart at each insertion.

The advantage of SparseVectorAccumulator over InsertableSparseVector is that the former postpones sorting until after dropping, while InsertableSparseVector also performs insertion sort on dropped values.

Todo

The method does not implement scaling techniques, so the τ parameter is really an absolute dropping tolerance parameter.




鲜花

握手

雷人

路过

鸡蛋
该文章已有0人参与评论

请发表评论

全部评论

专题导读
上一篇:
timholy/IProfile.jl: Profilers for Julia发布时间:2022-07-09
下一篇:
JuliaPy/Pandas.jl: A Julia front-end to Python's Pandas package.发布时间:2022-07-09
热门推荐
阅读排行榜

扫描微信二维码

查看手机版网站

随时了解更新最新资讯

139-2527-9053

在线客服(服务时间 9:00~18:00)

在线QQ客服
地址:深圳市南山区西丽大学城创智工业园
电邮:jeky_zhao#qq.com
移动电话:139-2527-9053

Powered by 互联科技 X3.4© 2001-2213 极客世界.|Sitemap