k8s-rdma-device-plugin is a device plugin for Kubernetes that manages RDMA devices.
RDMA (remote direct memory access) is a high-performance network protocol with the following major advantages:
Zero-copy
Applications can transfer data without involving the network software stack. Data is sent and received directly to and from the application buffers, without being copied between network layers.
Kernel bypass
Applications can perform data transfer directly from userspace without the need to perform context switches.
No CPU involvement
Applications can access remote memory without consuming any CPU on the remote machine. The remote machine's memory is read without any intervention from the remote process (or processor), and the caches of the remote CPU(s) are not filled with the accessed memory content.
You can read this post to get more information about RDMA.
This plugin allows you to use RDMA devices in the containers of a Kubernetes cluster. In addition, the plugin can work together with sriov-cni to provide high-performance network connections for distributed applications, especially GPU-based distributed applications such as TensorFlow, Spark, etc.
Quick Start
Build
Install the libibverbs development package. On CentOS:
# yum install libibverbs-devel -y
Then run build:
# ./build
# ls bin
k8s-rdma-device-plugin
Work with Kubernetes
Preparing RDMA node
Install ibverbs libraries, then start kubelet with --feature-gates=DevicePlugins=true.
FROM mellanox/mofed421_docker:latest
CMD ["/bin/sleep", "360000"]
TODO
Share RDMA device for the containers in the same pod
Generally speaking, for RoCE with k8s, all containers in the same pod should share the same RDMA devices. Kubernetes does not support this yet.
Work with sriov-cni plugin
Kubernetes calls the device plugin (DP) when admitting a pod, and calls the CNI plugin when creating the sandbox container. We need a way to pass RDMA device information from the DP to the CNI plugin. Refer to issue 32.
Work with NVIDIA GPU plugin
For high performance, we should coordinate k8s-rdma-device-plugin with the NVIDIA device plugin, so that the RDMA devices and GPU devices allocated to the same container are located under the same PCIe switch.
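One way to reason about PCIe locality is to compare the resolved sysfs paths of the devices: on Linux, a PCI device's path under /sys/devices lists every bridge above it, so two devices behind the same switch share a longer path prefix. The Go sketch below illustrates that heuristic only. It is not how either plugin works today, the function names sharedDepth and closest are hypothetical, and the device paths in main describe a made-up topology.

```go
package main

import (
	"fmt"
	"strings"
)

// sharedDepth counts the leading path components that two resolved
// sysfs device paths have in common. A larger value means the devices
// share more of their PCI hierarchy.
func sharedDepth(a, b string) int {
	as := strings.Split(strings.Trim(a, "/"), "/")
	bs := strings.Split(strings.Trim(b, "/"), "/")
	n := 0
	for n < len(as) && n < len(bs) && as[n] == bs[n] {
		n++
	}
	return n
}

// closest returns the candidate that shares the deepest PCI ancestry
// with target. Ties go to the earlier candidate; an empty candidate
// list yields "".
func closest(target string, candidates []string) string {
	best, bestDepth := "", -1
	for _, c := range candidates {
		if d := sharedDepth(target, c); d > bestDepth {
			best, bestDepth = c, d
		}
	}
	return best
}

func main() {
	// Hypothetical topology: the NIC and the second GPU sit behind the
	// same switch (upstream port 0000:03:00.0); the first GPU hangs off
	// a different root complex.
	nic := "/sys/devices/pci0000:00/0000:00:02.0/0000:03:00.0/0000:04:08.0/0000:05:00.0"
	gpus := []string{
		"/sys/devices/pci0000:80/0000:80:02.0/0000:81:00.0",
		"/sys/devices/pci0000:00/0000:00:02.0/0000:03:00.0/0000:04:10.0/0000:06:00.0",
	}
	fmt.Println("GPU closest to NIC:", closest(nic, gpus))
}
```

In a real setup the paths would come from resolving the symlinks under /sys/bus/pci/devices, for example with filepath.EvalSymlinks. The harder open problem remains the coordination itself, since each device plugin allocates its devices independently.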