Open-source project: simdjson/simdjson
Repository: https://github.com/simdjson/simdjson
Language: C++ (96.0%)
Project description:

simdjson : Parsing gigabytes of JSON per second

JSON is everywhere on the Internet. Servers spend a *lot* of time parsing it. We need a fresh approach. The simdjson library uses commonly available SIMD instructions and microparallel algorithms to parse JSON 4x faster than RapidJSON and 25x faster than JSON for Modern C++.
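
As a rough illustration of what "microparallel" processing can look like, the sketch below classifies up to 64 input bytes at once into a single 64-bit mask, using comparisons combined with bitwise OR instead of per-character branches. This is not simdjson's actual code: `structural_mask` and `count_bits` are hypothetical helper names, the loop is portable scalar C++ rather than SIMD intrinsics, and it deliberately ignores the case where a structural character appears inside a quoted string, which a real parser has to handle.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper (not part of the simdjson API): returns a mask where
// bit i is set when block[i] is one of the JSON structural characters
// { } [ ] : ,  Only the first 64 bytes of the block are examined.
// Assumption: characters inside quoted strings are NOT filtered out here.
inline std::uint64_t structural_mask(std::string_view block) {
    std::uint64_t mask = 0;
    const std::size_t n = block.size() < 64 ? block.size() : 64;
    for (std::size_t i = 0; i < n; ++i) {
        const char c = block[i];
        // Combine the comparisons with bitwise OR so the loop body has no
        // data-dependent branch; the result is 0 or 1.
        const std::uint64_t is_structural = static_cast<std::uint64_t>(
            (c == '{') | (c == '}') | (c == '[') | (c == ']') |
            (c == ':') | (c == ','));
        mask |= is_structural << i;
    }
    return mask;
}

// Hypothetical helper: counts the set bits in a mask by clearing the
// lowest set bit on each iteration.
inline int count_bits(std::uint64_t mask) {
    int count = 0;
    while (mask != 0) {
        mask &= mask - 1;
        ++count;
    }
    return count;
}
```

For the input `{"a":[1,2]}`, the mask has bits 0, 4, 5, 7, 9, and 10 set, one per structural character. Later stages can then walk the set bits rather than re-inspecting every byte, which is the general idea behind working on bitmasks instead of individual characters.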
This library is part of the Awesome Modern C++ list.

Quick Start

The simdjson library is easily consumable with a single .h and .cpp file.

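#include <iostream>
#include "simdjson.h"
using namespace simdjson;
int main(void) {
    ondemand::parser parser;
    // Load the file into a padded buffer, as the parser expects.
    padded_string json = padded_string::load("twitter.json");
    // Start iterating over the document; values are parsed on demand.
    ondemand::document tweets = parser.iterate(json);
    // Navigate to search_metadata.count and read it as an unsigned integer.
    std::cout << uint64_t(tweets["search_metadata"]["count"]) << " results." << std::endl;
}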

Documentation

Usage documentation is available:

Performance results

The simdjson library uses three-quarters fewer instructions than the state-of-the-art parser RapidJSON. To our knowledge, simdjson is the first fully-validating JSON parser to run at gigabytes per second (GB/s) on commodity processors. It can parse millions of JSON documents per second on a single core.

The following figure represents parsing speed in GB/s for parsing various files on an Intel Skylake processor (3.4 GHz) using the GNU GCC 10 compiler (with the -O3 flag). We compare against the best and fastest C++ libraries on benchmarks that load and process the data. The simdjson library offers full Unicode (UTF-8) validation and exact number parsing.

The simdjson library offers high speed whether it processes tiny files (e.g., 300 bytes) or larger files (e.g., 3 MB). The following plot presents parsing speed for synthetic files over various sizes generated with a script on a 3.4 GHz Skylake processor (GNU GCC 9, -O3).

All our experiments are reproducible.

For NDJSON files, we can exceed 3 GB/s with our multithreaded parsing functions.

Real-world usage

If you are planning to use simdjson in a product, please work from one of our releases.

Bindings and Ports of simdjson

We distinguish between "bindings" (which just wrap the C++ code) and a port to another programming language (which reimplements everything).

About simdjson

The simdjson library takes advantage of modern microarchitectures, parallelizing with SIMD vector instructions, reducing branch misprediction, and reducing data dependency to take advantage of each CPU's multiple execution cores.

Some people enjoy reading our paper: A description of the design and implementation of simdjson is in our research article:
We have an in-depth paper focused on the UTF-8 validation:
We also have an informal blog post providing some background and context.

For the video inclined,

Funding

The work is supported by the Natural Sciences and Engineering Research Council of Canada under grant number RGPIN-2017-03910.

Contributing to simdjson

Head over to CONTRIBUTING.md for information on contributing to simdjson, and HACKING.md for information on source, building, and architecture/design.

License

This code is made available under the Apache License 2.0.

Under Windows, we build some tools using the windows/dirent_portable.h file (which is outside our library code): it is under the liberal (business-friendly) MIT license.

For compilers that do not support C++17, we bundle the string-view library, which is published under the Boost license. Like the Apache license, the Boost license is a permissive license allowing commercial redistribution.

For efficient number serialization, we bundle Florian Loitsch's implementation of the Grisu2 algorithm for binary to decimal floating-point numbers. The implementation was slightly modified by the JSON for Modern C++ library. Both Florian Loitsch's implementation and JSON for Modern C++ are provided under the MIT license.

For runtime dispatching, we use some code from the PyTorch project licensed under 3-clause BSD.