Open source name: spotify/scio
Open source URL: https://github.com/spotify/scio
Open source language: Scala 74.0%
Open source introduction:
Scio
Scio is a Scala API for Apache Beam and Google Cloud Dataflow inspired by Apache Spark and Scalding. Scio 0.3.0 and future versions depend on Apache Beam (
Features
* provided by Google Cloud Dataflow
Quick Start
Download and install the Java Development Kit (JDK) version 8.
Install sbt.
Use our giter8 template to quickly create a new Scio job repository:
Switch to the new repo (default
Run the included word count example:
List result files and inspect content:
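
For orientation, the computation behind a word count can be sketched with plain Scala collections. This is an illustration only and not the code of the included example: the object `WordCountSketch`, its helpers `tokenize` and `countWords`, and the sample input are hypothetical names chosen here, and no Scio or Beam API is used. In a Scio pipeline the same two steps would typically correspond to `flatMap` followed by `countByValue` on an `SCollection`.

```scala
// Hypothetical, dependency-free sketch of word count logic (not Scio code).
object WordCountSketch {
  // Split a line on anything that is not a letter or apostrophe,
  // dropping the empty tokens produced by leading delimiters.
  def tokenize(line: String): Seq[String] =
    line.split("[^a-zA-Z']+").filter(_.nonEmpty).toSeq

  // Flatten all lines into words, then count occurrences of each word.
  def countWords(lines: Seq[String]): Map[String, Long] =
    lines
      .flatMap(tokenize)
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.size.toLong }

  def main(args: Array[String]): Unit = {
    // Sample input is an assumption made for this illustration.
    val lines = Seq("to be or not to be", "that is the question")
    countWords(lines).toSeq.sortBy(_._1).foreach { case (word, count) =>
      println(s"$word: $count")
    }
  }
}
```

A distributed runner partitions the input and merges partial counts, which is why the job's output is written as several result files rather than a single in-memory map.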
Documentation
Getting Started is the best place to start with Scio. If you are new to Apache Beam and distributed data processing, check out the Beam Programming Guide first for a detailed explanation of the Beam programming model and concepts. If you have experience with other Scala data processing libraries, check out this comparison between Scio, Scalding and Spark. Finally, check out this document about the relationship between Scio, Beam and Dataflow.
Example Scio pipelines and tests can be found under scio-examples. Many of them are direct ports of Beam's Java examples. See this page for some of them with side-by-side explanations. Also see Big Data Rosetta Code for common data processing code snippets in Scio, Scalding and Spark.
Artifacts
Scio includes the following artifacts:
License
Copyright 2021 Spotify AB. Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0