Alexander-Barth/NCDatasets.jl: Load and create NetCDF files in Julia

Open-source software name:

Alexander-Barth/NCDatasets.jl

Open-source software URL:

https://github.com/Alexander-Barth/NCDatasets.jl

Programming language:

Julia 99.2%

Software description:

NCDatasets

NCDatasets allows one to read and create netCDF files. NetCDF datasets and attribute lists behave like Julia dictionaries, and variables behave like Julia arrays.

The module NCDatasets provides support for the following netCDF CF conventions:

  • Values equal to _FillValue are returned as missing
  • scale_factor and add_offset are applied if present
  • time variables (recognized by the units attribute) are returned as DateTime objects.
  • Support of the CF calendars (standard, gregorian, proleptic gregorian, julian, all leap, no leap, 360 day)
  • The raw data can also be accessed (without the transformations above).
  • Contiguous ragged array representation
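As a minimal sketch of the first two conventions and of raw access, the following writes packed integers to a scratch file and reads them back both decoded and raw. The file name, the variable name temp and the attribute values are assumptions chosen for illustration; NCDataset is used because the manual describes it as an alias for Dataset.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical scratch file

NCDataset(fname, "c") do ds
    defDim(ds, "x", 3)
    # Stored as Int16 on disk; the attributes describe how to decode the values.
    v = defVar(ds, "temp", Int16, ("x",), fillvalue = Int16(-9999), attrib = [
        "scale_factor" => 0.1,
        "add_offset"   => 10.0,
    ])
    # Write the packed integers directly through the raw view (no transformation).
    v.var[:] = Int16[0, 50, -9999]
end

decoded, raw = NCDataset(fname) do ds
    v = ds["temp"]
    # v[:] applies _FillValue, scale_factor and add_offset; v.var[:] does not.
    (v[:], v.var[:])
end

rm(fname)
```

Here decoded should hold the scaled floating-point values with missing in place of the fill value, while raw holds the integers exactly as stored.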

Other features include:

  • Support for NetCDF 4 compression and variable-length arrays (i.e. arrays of vectors where each vector can potentially have a different length)
  • The module also includes a utility function ncgen which generates the Julia code that would produce a netCDF file with the same metadata as a template netCDF file.
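The two features above can be tried together in a short sketch. It assumes the defVar keywords deflatelevel and shuffle and the query function deflate behave as in recent NCDatasets releases; the file and variable names are made up for the example.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical scratch file

NCDataset(fname, "c") do ds
    defDim(ds, "lon", 100)
    defDim(ds, "lat", 110)
    # Request zlib compression (level 5) with the shuffle filter enabled.
    v = defVar(ds, "temperature", Float32, ("lon", "lat");
               deflatelevel = 5, shuffle = true)
    v[:, :] = zeros(Float32, 100, 110)
end

# Query the compression settings that were actually stored.
isshuffled, isdeflated, level = NCDataset(fname) do ds
    deflate(ds["temperature"])
end

# Print Julia code that would recreate a file with the same metadata.
ncgen(fname)

rm(fname)
```

The code printed by ncgen is a starting point to edit by hand, for instance to fill in the actual data.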

Installation

Inside the Julia shell, you can download and install the package by issuing:

using Pkg
Pkg.add("NCDatasets")

Windows users are required to pin the version of NetCDF_jll until this issue is resolved (help is more than welcome).

using Pkg
Pkg.add("NetCDF_jll")
Pkg.pin(name="NetCDF_jll", version="400.702.400")

Manual

This manual is a quick introduction to using NCDatasets.jl. For more details, see the stable or latest documentation.

Explore the content of a netCDF file

Before reading the data from a netCDF file, it is often useful to explore the list of variables and attributes defined in it.

For interactive use, the following commands (without ending semicolon) display the content of the file similarly to ncdump -h file.nc:

using NCDatasets
ds = Dataset("file.nc")

This creates the central structure of NCDatasets.jl, Dataset, which represents the contents of the netCDF file (without immediately loading everything into memory). NCDataset is an alias for Dataset.

The following displays the information just for the variable varname:

ds["varname"]

To get the global attributes, use:

ds.attrib

Displaying the dataset ds itself produces a listing like:

Dataset: file.nc
Group: /

Dimensions
   time = 115

Variables
  time   (115)
    Datatype:    Float64
    Dimensions:  time
    Attributes:
     calendar             = gregorian
     standard_name        = time
     units                = days since 1950-01-01 00:00:00
[...]
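The same information can also be queried programmatically, since the dataset and its attribute list behave like dictionaries. The sketch below first builds a small file resembling the listing (the file name and the title attribute are assumptions) and then inspects it.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical scratch file
NCDataset(fname, "c") do ds
    defDim(ds, "time", 115)
    ds.attrib["title"] = "example"
    defVar(ds, "time", Float64, ("time",), attrib = [
        "units" => "days since 1950-01-01 00:00:00"])
end

ds = NCDataset(fname)
varnames = collect(keys(ds))          # names of all variables
ntime    = ds.dim["time"]             # length of the dimension "time"
attnames = collect(keys(ds.attrib))   # names of the global attributes
dims     = dimnames(ds["time"])       # dimension names of one variable
units    = ds["time"].attrib["units"] # a single variable attribute
close(ds)

rm(fname)
```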

Load a netCDF file

Loading a variable with known structure can be achieved by accessing the variables and attributes directly by their name.

# The mode "r" stands for read-only. The mode "r" is the default mode and the parameter can be omitted.
ds = Dataset("/tmp/test.nc","r")
v = ds["temperature"]

# load a subset
subdata = v[10:30,30:5:end]

# load all data
data = v[:,:]

# load all data ignoring attributes like scale_factor, add_offset, _FillValue and time units
data2 = v.var[:,:]


# load an attribute
unit = v.attrib["units"]
close(ds)

In the example above, the subset can also be loaded with:

subdata = Dataset("/tmp/test.nc")["temperature"][10:30,30:5:end]

This can be useful in an interactive session. However, the file test.nc is not closed immediately (closing is triggered later by Julia's garbage collector), which can be a problem if you open many files. On Linux, the number of open files is often limited to 1024 (soft limit). If you write to a file, you should always close it to make sure that the data is properly written to disk.

An alternative way to ensure the file has been closed is to use a do block: the file will be closed automatically when leaving the block.

data = Dataset(filename,"r") do ds
    ds["temperature"][:,:]
end # ds is closed
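When many files have to be processed, the do block keeps at most one file open at a time. The following is a sketch under the assumption of three small scratch files, which are created first so that the example is self-contained; the file and variable names are illustrative only.

```julia
using NCDatasets

# Create three hypothetical input files with a constant "temperature" each.
dir = mktempdir()
filenames = [joinpath(dir, "part$i.nc") for i in 1:3]
for (i, fname) in enumerate(filenames)
    NCDataset(fname, "c") do ds
        defDim(ds, "x", 4)
        v = defVar(ds, "temperature", Float64, ("x",))
        v[:] = fill(Float64(i), 4)
    end
end

# Each file is opened, read and closed before the next one is opened.
means = map(filenames) do fname
    NCDataset(fname, "r") do ds
        data = ds["temperature"][:]
        sum(data) / length(data)
    end
end
```

Because every handle is released when its block ends, the loop does not accumulate open files, however long the list of file names is.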

Create a netCDF file

The following gives an example of how to create a netCDF file by defining dimensions, variables and attributes.

using NCDatasets
using DataStructures
# This creates a new NetCDF file /tmp/test.nc.
# The mode "c" stands for creating a new file (clobber)
ds = Dataset("/tmp/test.nc","c")

# Define the dimensions "lon" and "lat" with sizes 100 and 110, respectively.
defDim(ds,"lon",100)
defDim(ds,"lat",110)

# Define a global attribute
ds.attrib["title"] = "this is a test file"

# Define the variable temperature with the attribute units
v = defVar(ds,"temperature",Float32,("lon","lat"), attrib = OrderedDict(
    "units" => "degree Celsius"))

# add additional attributes
v.attrib["comments"] = "this is a string attribute with Unicode Ω ∈ ∑ ∫ f(x) dx"

# Generate some example data
data = [Float32(i+j) for i = 1:100, j = 1:110]

# write a single column
v[:,1] = data[:,1]

# write the complete data set
v[:,:] = data

close(ds)
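Time variables can be created in the same way. As a sketch, assuming a units attribute of the form "days since ..." and the default (standard) calendar, DateTime values can be assigned directly and are stored as numbers; the file name and the dates are made up for the example.

```julia
using NCDatasets
using Dates

fname = tempname() * ".nc"   # hypothetical scratch file

NCDataset(fname, "c") do ds
    defDim(ds, "time", 3)
    t = defVar(ds, "time", Float64, ("time",), attrib = [
        "units" => "days since 2000-01-01 00:00:00"])
    # DateTime values are converted to days since the reference date.
    t[:] = [DateTime(2000, 1, 1), DateTime(2000, 1, 2), DateTime(2000, 1, 3)]
end

times, rawtimes = NCDataset(fname) do ds
    # Decoded DateTime objects and the raw numbers stored in the file.
    (ds["time"][:], ds["time"].var[:])
end

rm(fname)
```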

Edit an existing netCDF file

When you need to modify variables or attributes in a netCDF file, you have to open it with the "a" option. Here, for example, we add a global attribute creator to the file created in the previous step.

ds = Dataset("/tmp/test.nc","a")
ds.attrib["creator"] = "your name"
close(ds);
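The "a" mode can also be used to add a new variable to an existing file. The sketch below is self-contained, so it creates its own scratch file first; the variable name count is an assumption for illustration.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical scratch file

# Create a minimal file with one dimension and one global attribute.
NCDataset(fname, "c") do ds
    defDim(ds, "x", 2)
    ds.attrib["title"] = "this is a test file"
end

# Reopen it in append mode; existing content is kept.
NCDataset(fname, "a") do ds
    ds.attrib["creator"] = "your name"
    v = defVar(ds, "count", Int32, ("x",))
    v[:] = Int32[1, 2]
end

title, creator, counts = NCDataset(fname) do ds
    (ds.attrib["title"], ds.attrib["creator"], ds["count"][:])
end

rm(fname)
```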

Benchmark

The benchmark loads a variable of the size 1000x500x100 in slices of 1000x500 (applying the scaling of the CF conventions) and computes the maximum of each slice and the average of each maximum over all slices. This operation is repeated 100 times. The code is available at https://github.com/Alexander-Barth/NCDatasets.jl/tree/master/test/perf .

Module            median  minimum  mean   std. dev.
R-ncdf4           0.572   0.550    0.575  0.023
python-netCDF4    0.504   0.498    0.505  0.003
julia-NCDatasets  0.228   0.212    0.226  0.005

All runtimes are in seconds. Julia 1.6.0 (with NCDatasets b953bf5), R 3.4.4 (with ncdf4 1.17) and Python 3.6.9 (with netCDF4 1.5.4). The CPU is an i7-7700.
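The benchmarked operation can be sketched as follows. This is not the code from the test/perf directory: it uses a much smaller array, a made-up variable name v, no scale_factor or add_offset, and no timing or repetition. It only illustrates the slice-by-slice access pattern.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical scratch file
sz = (100, 50, 10)           # far smaller than the 1000x500x100 of the benchmark

NCDataset(fname, "c") do ds
    defDim(ds, "lon", sz[1])
    defDim(ds, "lat", sz[2])
    defDim(ds, "time", sz[3])
    v = defVar(ds, "v", Float32, ("lon", "lat", "time"))
    for n in 1:sz[3]
        # Slice n is filled with the constant n, so its maximum is n.
        v[:, :, n] = fill(Float32(n), sz[1], sz[2])
    end
end

# Load one 2D slice at a time, take its maximum, and average the maxima.
function mean_of_slice_maxima(fname)
    NCDataset(fname) do ds
        v = ds["v"]
        nslices = size(v, 3)
        total = 0.0
        for n in 1:nslices
            total += maximum(v[:, :, n])
        end
        total / nslices
    end
end

result = mean_of_slice_maxima(fname)
rm(fname)
```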

Filing an issue

When you file an issue, please include sufficient information that would allow somebody else to reproduce the issue, in particular:

  1. Provide the code that generates the issue.
  2. If they are necessary to run your code, provide the netCDF file(s) used.
  3. Make your code and netCDF file(s) as simple as possible (while still showing the error and being runnable). A big thank you to the 5-star-premium-gold users who do not forget this point!
