Open-source project name: vmware/kube-fluentd-operator
Open-source project URL: https://github.com/vmware/kube-fluentd-operator
Primary language: Go (87.3%)

kube-fluentd-operator (KFO)

Overview

Kubernetes Fluentd Operator (KFO) is a Fluentd config manager with batteries included: config validation, no need to restart, and sensible defaults and best practices built in. Use Kubernetes labels to filter/route logs per namespace!

kube-fluentd-operator configures Fluentd in a Kubernetes environment. It compiles a Fluentd configuration from configmaps (one per namespace), similar to how an Ingress controller would compile nginx configuration from several Ingress resources. This way a single instance of Fluentd can handle all log shipping for an entire cluster, and the cluster admin does NOT need to coordinate with namespace admins.

Cluster administrators set up Fluentd only once and namespace owners can configure log routing as they wish. KFO will re-configure Fluentd accordingly and make sure logs originating from a namespace will not be accessible by other tenants/namespaces.

KFO also extends the Fluentd configuration language, making it possible to refer to pods based on their labels and the container name pattern. This enables very fine-grained targeting of log streams for the purpose of pre-processing before shipping. Writing a custom processor, adding a new Fluentd plugin, or writing a custom Fluentd plugin makes KFO extensible for any use case and any external logging ingestion system.

Finally, it is possible to ingest logs from a file on the container filesystem. While this is not recommended, there are still legacy or misconfigured apps that insist on logging to the local filesystem.

Try it out

The easiest way to get started is using the Helm chart. Official images are not published yet, so you need to pass the image.repository and image.tag manually:

git clone git@github.com:vmware/kube-fluentd-operator.git
helm install kfo ./kube-fluentd-operator/charts/log-router \
--set rbac.create=true \
--set image.tag=v1.16.6 \
--set image.repository=vmware/kube-fluentd-operator

Alternatively, deploy the Helm chart from a GitHub release:

CHART_URL='https://github.com/vmware/kube-fluentd-operator/releases/download/v1.16.6/log-router-0.4.0.tgz'
helm install kfo ${CHART_URL} \
--set rbac.create=true \
--set image.tag=v1.16.6 \
--set image.repository=vmware/kube-fluentd-operator

Then create a namespace demo and a configmap holding its Fluentd configuration:

kubectl create ns demo
cat > fluent.conf << EOF
<match **>
@type null
</match>
EOF
# Create the configmap with a single entry "fluent.conf"
kubectl create configmap fluentd-config --namespace demo --from-file=fluent.conf=fluent.conf
# The following step is optional: the fluentd-config is the default configmap name.
# kubectl annotate namespace demo logging.csp.vmware.com/fluentd-configmap=fluentd-config
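
As a rough mental model of what the operator does with this configmap: every tag is confined to the namespace that owns the configmap, and tags that reach outside it are rejected. The sketch below is an illustration only. `scope_tag` is a hypothetical helper, not part of KFO, and the real operator (written in Go) generates richer tags than this simplified version.

```python
def scope_tag(tag: str, namespace: str) -> str:
    """Confine one <match>/<filter> tag to `namespace`.

    Hypothetical, simplified model of the behaviour described in this
    README; it is not the operator's actual implementation.
    """
    if tag == "**":
        # A bare match-all only covers the owning namespace.
        return f"{namespace}.**"
    if tag == "$thisns" or tag.startswith("$thisns."):
        # $thisns is a placeholder for the owning namespace.
        return namespace + tag[len("$thisns"):]
    if tag == namespace or tag.startswith(namespace + "."):
        # Already scoped to the owning namespace.
        return tag
    raise ValueError(
        f"bad tag for <match>: {tag}. "
        f"Tag must start with **, $thisns or {namespace}"
    )


if __name__ == "__main__":
    print(scope_tag("**", "demo"))
    print(scope_tag("$thisns.web", "demo"))
    try:
        scope_tag("hello-world", "demo")
    except ValueError as err:
        print(err)
```

Raising an error instead of silently rewriting an out-of-namespace tag mirrors the README's approach of reporting configuration problems back to the namespace owner.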
In a minute, this configuration would be translated to something like this:

<match demo.**>
@type null
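</match>

Even though the tag ** is used in the <match> directive, it is expanded to demo.** so that only logs from the demo namespace are matched.

All configuration errors are stored in the annotation logging.csp.vmware.com/fluentd-status. For example, after replacing ** with an invalid tag such as hello-world:

# extract just the value of logging.csp.vmware.com/fluentd-status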
kubectl get ns demo -o jsonpath='{.metadata.annotations.logging\.csp\.vmware\.com/fluentd-status}'
bad tag for <match>: hello-world. Tag must start with **, $thisns or demo

When the configuration is made valid again, the error is cleared from the fluentd-status annotation.

To see kube-fluentd-operator in action you need a cloud log collector like logz.io, loggly, papertrail or ELK accessible from the K8S cluster. A simple loggly configuration looks like this (replace TOKEN with your customer token):

<match **>
@type loggly
loggly_url https://logs-01.loggly.com/inputs/TOKEN/tag/fluentd
</match>

Build

Get the code using go get:

go get -u github.com/vmware/kube-fluentd-operator/config-reloader
cd $GOPATH/src/github.com/vmware/kube-fluentd-operator
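# build a base-image (subshell, so the working directory stays at the repo root)
(cd base-image && make build-image)
# build helm chart (subshell, so the working directory stays at the repo root)
(cd charts/log-router && make helm-package)
# build the daemon
cd config-reloader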
make install
make build-image
# run with mock data (loaded from the examples/ folder)
make run-once-fs
# run with mock data in a loop (may need to ctrl+z to exit)
make run-loop-fs
# inspect what is generated from the above command
ls -l tmp/

Project structure

Config-reloader

This is where the interesting work happens. The dependency graph shows the high-level package interaction and general dataflow.

How does it work

It works by rewriting the user-provided configuration. This is possible because kube-fluentd-operator knows about the Kubernetes cluster and the current namespace, and also has some sensible defaults built in. To get a quick idea of what happens behind the scenes, consider this configuration deployed in a namespace called monitoring:

<filter $labels(server=apache)>
@type parser
<parse>
@type apache2
</parse>
</filter>
<filter $labels(app=django)>
@type detect_exceptions
language python
</filter>
<match **>
@type es
</match>

It gets processed into the following configuration which is then fed to Fluentd:

<filter kube.monitoring.*.*>
@type record_transformer
enable_ruby true
<record>
kubernetes_pod_label_values ${record["kubernetes"]["labels"]["app"]&.gsub(/[.-]/, '_') || '_'}.${record["kubernetes"]["labels"]["server"]&.gsub(/[.-]/, '_') || '_'}
</record>
</filter>
<match kube.monitoring.*.*>
@type rewrite_tag_filter
<rule>
key kubernetes_pod_label_values
pattern ^(.+)$
tag ${tag}._labels.$1
</rule>
</match>
<filter kube.monitoring.*.*.**>
@type record_transformer
remove_keys kubernetes_pod_label_values
</filter>
<filter kube.monitoring.*.*._labels.*.apache _proc.kube.monitoring.*.*._labels.*.apache>
@type parser
<parse>
@type apache2
</parse>
</filter>
<match kube.monitoring.*.*._labels.django.*>
@type rewrite_tag_filter
<rule>
invert true
key _dummy
pattern /ZZ/
tag 3bfd045d94ce15036a8e3ff77fcb470e0e02ebee._proc.${tag}
</rule>
</match>
<match 3bfd045d94ce15036a8e3ff77fcb470e0e02ebee._proc.kube.monitoring.*.*._labels.django.*>
@type detect_exceptions
remove_tag_prefix 3bfd045d94ce15036a8e3ff77fcb470e0e02ebee
stream container_info
</match>
<match kube.monitoring.*.*._labels.*.* _proc.kube.monitoring.*.*._labels.*.*>
@type es
</match>

Configuration

Basic usage

To give the illusion that every namespace runs a dedicated Fluentd, the user-provided configuration is post-processed. In general, expressions starting with $ are macros that are expanded.

The admin namespace

Kube-fluentd-operator defines one namespace to be the admin namespace. By default this is set to kube-system. A single match-all directive in the admin namespace routes all logs from all namespaces:

<match **>
@type ...
# destination configuration omitted
</match>
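
The tag patterns used throughout these examples follow Fluentd's matching rules: `*` matches exactly one dot-separated tag part, and `**` matches zero or more parts. The sketch below is a simplified model of those rules, assuming plain patterns only (it ignores `{a,b}` alternation and other syntax). The names `tag_matches` and `_match_parts` are made up for this illustration and are not Fluentd or KFO APIs.

```python
from typing import List


def _match_parts(pattern: List[str], tag: List[str]) -> bool:
    """Match a list of pattern parts against a list of tag parts."""
    if not pattern:
        return not tag
    head, rest = pattern[0], pattern[1:]
    if head == "**":
        # "**" may swallow zero or more tag parts.
        return any(_match_parts(rest, tag[i:]) for i in range(len(tag) + 1))
    if not tag:
        return False
    if head == "*" or head == tag[0]:
        # "*" consumes exactly one tag part; a literal must be equal.
        return _match_parts(rest, tag[1:])
    return False


def tag_matches(patterns: str, tag: str) -> bool:
    """True if `tag` matches any of the whitespace-separated patterns."""
    return any(
        _match_parts(p.split("."), tag.split(".")) for p in patterns.split()
    )


if __name__ == "__main__":
    admin = "systemd.** kube.kube-system.** k8s.** docker"
    for t in ("systemd.nginx.service", "docker", "kube.demo.web.nginx"):
        print(t, tag_matches(admin, t))
```

Under this model a pattern such as `systemd.*` does not match a three-part tag like `systemd.nginx.service`, whereas `systemd.**` does.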
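Fluentd assumes it is running in a distro with systemd and generates logs with Fluentd tags under these prefixes: systemd (OS-level units), docker, k8s (Kubernetes-internal components) and kube (pod logs, tagged kube.<namespace>.<pod_name>.<container_name>).

As the admin namespace is processed first, a match-all directive would consume all logs and any other namespace configuration will become irrelevant (unless @type copy is used). A configuration for the admin namespace that captures only the Kubernetes-internal and OS-level logs looks like this:

<match systemd.** kube.kube-system.** k8s.** docker>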
# all k8s-internal and OS-level logs
# destination config omitted...
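</match>

Note the ** in systemd.**: in a Fluentd tag pattern, * matches exactly one tag part while ** matches zero or more, so systemd.* would miss tags with more than one part after the prefix.

Using the $labels macro

A very useful feature is the $labels macro, which selects log streams by pod labels and, through the special _container key, by container name:

<filter $labels(app=log-router, _container=reloader)>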
@type parser
reserve_data true
<parse>
@type logfmt
</parse>
</filter>
<match **>
@type loggly
# destination config omitted
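</match>

The above config will pipe all logs from the reloader container of pods labelled with app=log-router through a logfmt parser before sending them to loggly.

If you use Kubernetes recommended labels for the pods and deployments, then KFO will rewrite the $labels expressions that reference them. For example, let's assume the labels app_kubernetes_io/name=kafka, app.kubernetes.io/name=nginx-ingress and tag=noisy are used in the fluentd-config in the testing namespace.

This fluentd configmap in the testing namespace:

<filter **>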
@type concat
timeout_label @DISTILLERY_TYPES
key message
stream_identity_key cont_id
multiline_start_regexp /^(\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}|\[\w+\]\s|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b|=\w+ REPORT====|\d{2}\:\d{2}\:\d{2}\.\d{3})/
flush_interval 10
</filter>
<match **>
@type relabel
@label @DISTILLERY_TYPES
</match>
<label @DISTILLERY_TYPES>
<filter $labels(app_kubernetes_io/name=kafka)>
@type parser
key_name log
format json
reserve_data true
suppress_parse_error_log true
</filter>
<filter $labels(app.kubernetes.io/name=nginx-ingress, _container=controller)>
@type parser
key_name log
<parse>
@type json
reserve_data true
time_format %FT%T%:z
emit_invalid_record_to_error false
</parse>
</filter>
<match $labels(tag=noisy)>
@type null
</match>
</label>

will be rewritten inside of KFO pods as this:

<filter kube.testing.**>
@type concat
flush_interval 10
key message
multiline_start_regexp /^(\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}|\[\w+\]\s|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b|=\w+ REPORT====|\d{2}\:\d{2}\:\d{2}\.\d{3})/
stream_identity_key cont_id
timeout_label @-DISTILLERY_TYPES-0e93f964a5b5f1760278744f1adf55d58d0e78ba
</filter>
<match kube.testing.**>
@label @-DISTILLERY_TYPES-0e93f964a5b5f1760278744f1adf55d58d0e78ba
@type relabel
</match>
<match kube.testing.**>
@label @-DISTILLERY_TYPES-0e93f964a5b5f1760278744f1adf55d58d0e78ba
@type null
</match>
<label @-DISTILLERY_TYPES-0e93f964a5b5f1760278744f1adf55d58d0e78ba>
<filter kube.testing.*.*._labels.*.kafka.*>
@type parser
format json
key_name log
reserve_data true
suppress_parse_error_log true
</filter>
<filter kube.testing.*.controller._labels.nginx_ingress.*.*>
@type parser
key_name log
<parse>
@type json
emit_invalid_record_to_error false
reserve_data true
time_format %FT%T%:z
</parse>
</filter>
<match kube.testing.*.*._labels.*.*.noisy>
@type null
</match>
</label>

All plugins that change the Fluentd tag are disabled for security reasons. Otherwise a rogue configuration may divert another namespace's logs to itself by prepending its name to the tag.

Ingest logs from a file in the container

The only allowed <source> directive is of type mounted-file:

<source>
@type mounted-file
path /var/log/welcome.log
labels app=grafana, _container=test-container
<parse>
@type none
</parse>
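</source>

The labels parameter works like the $labels macro: it selects the pods and the container whose file is ingested.

The above configuration would translate at runtime to something similar to this:

<source>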
@type tail
path /var/lib/kubelet/pods/723dd34a-4ac0-11e8-8a81-0a930dd884b0/volumes/kubernetes.io~empty-dir/logs/welcome.log
pos_file /var/log/kfotail-7020a0b821b0d230d89283ba47d9088d9b58f97d.pos
read_from_head true
tag kube.kfo-test.welcome-logger.test-container
<parse>
@type none
</parse>
</source>

Dealing with multi-line exception stacktraces (since v1.3.0)

Most log streams are line-oriented. However, stacktraces always span multiple lines. kube-fluentd-operator integrates stacktrace processing using fluent-plugin-detect-exceptions. If a Java-based pod produces stacktraces in the logs, then the stacktraces can be collapsed into a single log event like this:

<filter $labels(app=jpetstore)>
@type detect_exceptions
# you can skip language in which case all possible languages will be tried: go, java, python, ruby, etc...
language java
</filter>
# The rest of the configuration stays the same even though quite a lot of tag rewriting takes place
<match **>
@type es
</match>

Notice how a <filter> directive is used here even though detect_exceptions rewrites tags: internally, kube-fluentd-operator translates it into <match> directives, so the end user does not need to deal with tag rewriting. Also, users don't need to bother with setting the correct stream parameter; kube-fluentd-operator generates it.

Reusing output plugin definitions (since v1.6.0)

Sometimes you only have a few valid options for log sinks: a dedicated S3 bucket, the ELK stack you manage, etc. The only flexibility you're after is letting namespace owners filter and parse their logs. In such cases you can abstract over an output plugin configuration, reducing it to a simple name which can be referenced from any namespace.

For example, let's assume you have an S3 bucket for a "test" environment and you use loggly for a "staging" environment. The first thing you do is define these two outputs in the admin namespace:

admin-ns.conf:
<match systemd.** docker kube.kube-system.** k8s.**>
@type loggly
loggly_url https://logs-01.loggly.com/inputs/TOKEN/tag/fluentd
</match>
<plugin test>
@type s3
aws_key_id YOUR_AWS_KEY_ID
aws_sec_key YOUR_AWS_SECRET_KEY
s3_bucket YOUR_S3_BUCKET_NAME
s3_region AWS_REGION
</plugin>
<plugin staging>
@type loggly
loggly_url https://logs-01.loggly.com/inputs/TOKEN/tag/fluentd
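</plugin>

In the above example for the admin configuration, the <match> directive routes the systemd, docker, kube-system and k8s logs, and the two <plugin> directives define the reusable outputs. A namespace can refer to the test and staging plugins by name:

acme-test.conf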
<match **>
@type test
</match>
acme-staging.conf
<match **>
@type staging
</match>

kube-fluentd-operator will insert the content of the <plugin> directive into the <match> directive.

Retagging based on log contents (since v1.12.0)

Sometimes you might need to split a single log stream to perform different processing based on the contents of one of the fields. To achieve this you can use the retag plugin. Logs that are emitted by this plugin can subsequently be filtered and processed by using the $tag macro:

<match $labels(app=apache)>
@type retag
<rule>
key message
pattern /^(ERROR) .*$/
tag notifications.$1 # refer to a capturing group using $number
</rule>
<rule>
key message
pattern /^(FATAL) .*$/
tag notifications.$1
</rule>
<rule>
key message
pattern /^(ERROR|FATAL) .*$/
tag notifications.other
invert true # rewrite tag when unmatch pattern
</rule>
</match>
<filter $tag(notifications.ERROR)>
# perform some extra processing
</filter>
<filter $tag(notifications.FATAL)>
# perform different processing
</filter>
<match $tag(notifications.**)>
# send to common output plugin
</match>

kube-fluentd-operator ensures that tags specified using the $tag macro never conflict with tags from other namespaces.

Sharing logs between namespaces

By default, you can consume logs only from your own namespaces. Often it is useful for multiple namespaces (tenants) to get access to the log streams of a shared resource (pod, namespace). kube-fluentd-operator makes this possible using two constructs: the source namespace expresses its intent to share logs with a destination namespace, and the destination namespace expresses its desire to consume logs from a source. As a result, logs are streamed only when both sides agree.

A source namespace can share with another namespace using @type share.

producer namespace configuration:

<match $labels(msg=nginx-ingress)>
@type copy
<store>
@type share
# share all logs matching the labels with the namespace "consumer"
with_namespace consumer
</store>
</match>

consumer namespace configuration:

# use $from(producer) to get all shared logs from a namespace called "producer"
<label @$from(producer)>
<match **>
# process all shared logs here as usual
</match>
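</label>

The consuming namespace can use the usual syntax inside the <label @$from(producer)> directive. The producing namespace needs to wrap @type share within a <store> directive, as in the producer configuration above.

Log metadata

Often you run multiple Kubernetes clusters but you need to aggregate all logs to a single destination. To distinguish between different sources, kube-fluentd-operator can attach arbitrary metadata to every log event. Using the Helm chart, metadata can be enabled like this:

helm install ... \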
--set meta.key=metadata \
--set meta.values.region=us-east-1 \
--set meta.values.env=staging \
--set meta.values.cluster=legacy

Every log event, be it from a pod, mounted-file or a systemd unit, will now carry this metadata:

{
  "metadata": {
    "region": "us-east-1",
    "env": "staging",
    "cluster": "legacy"
  }
}

All logs originating from a file look exactly like all other Kubernetes logs. However, their stream field is set to the path of the source file:

{
"message": "Some message from the welcome-logger pod",
"stream": "/var/log/welcome.log",
"kubernetes": {
"container_name": "test-container",
"host": "ip-11-11-11-11.us-east-2.compute.internal",
"namespace_name": "kfo-test",
"pod_id": "723dd34a-4ac0-11e8-8a81-0a930dd884b0",
"pod_name": "welcome-logger",
"labels": {
"msg": "welcome",
"test-case": "b"
},
"namespace_labels": {}
},
"metadata": {
"region": "us-east-2",
"cluster": "legacy",
"env": "staging"
}
}

Custom resource definition (CRD) support (since v1.13.0)

Custom resources were introduced in the v1.13.0 release. They provide a dedicated resource for Fluentd configurations, which makes it possible to manage them more consistently and to move away from generic ConfigMaps. It is possible to create configs for a new application simply by attaching a FluentdConfig resource to the application manifests, rather than using a more generic ConfigMap with specific names and/or labels.

apiVersion: logs.vdp.vmware.com/v1beta1
kind: FluentdConfig
metadata:
  name: fd-config
spec:
  fluentconf: |
    <match kube.ns.**>
      @type relabel
      @label @NOTIFICATIONS
    </match>
    <label @NOTIFICATIONS>
      <match **>
        @type null
      </match>
    </label>

A new "crd" datasource has been introduced. The datasource is configurable through the Helm chart values, so users who are currently set up with ConfigMaps and do not want to switch over to FluentdConfigs can keep using them. The config-reloader can install the CRD at startup if requested, so no manual action is needed to enable it on the cluster. Existing configurations provided through ConfigMaps can be migrated to CRDs through the following migration flow:

[Figure: migration flow from ConfigMaps to FluentdConfig resources]
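
One step of such a migration, wrapping an existing fluent.conf entry in a FluentdConfig manifest, might be sketched as follows. This is only an illustration under stated assumptions: `configmap_to_fluentdconfig` is a hypothetical helper (not shipped with KFO), the default resource name `fd-config` is taken from the example manifest, and copying the namespace from the ConfigMap is an assumption about how the resource should be placed.

```python
import json


def configmap_to_fluentdconfig(configmap: dict, name: str = "fd-config") -> dict:
    """Build a FluentdConfig manifest from a ConfigMap object.

    Hypothetical helper: `configmap` is the dict form of a ConfigMap that
    holds a single "fluent.conf" entry, as in the examples in this README.
    """
    data = configmap.get("data", {})
    if "fluent.conf" not in data:
        raise KeyError("configmap has no 'fluent.conf' entry")

    metadata = {"name": name}
    namespace = configmap.get("metadata", {}).get("namespace")
    if namespace:
        # Assumption: the FluentdConfig lives in the ConfigMap's namespace.
        metadata["namespace"] = namespace

    return {
        "apiVersion": "logs.vdp.vmware.com/v1beta1",
        "kind": "FluentdConfig",
        "metadata": metadata,
        "spec": {"fluentconf": data["fluent.conf"]},
    }


if __name__ == "__main__":
    example = {
        "metadata": {"name": "fluentd-config", "namespace": "demo"},
        "data": {"fluent.conf": "<match **>\n  @type null\n</match>\n"},
    }
    # JSON is printed here to keep the sketch free of third-party
    # dependencies; the result has the same shape as the YAML manifest.
    print(json.dumps(configmap_to_fluentdconfig(example), indent=2))
```

The helper only reshapes data; switching the operator to the "crd" datasource is still done through the Helm chart values.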
Tracking Fluentd version

This project tries to keep up with major releases of the Fluentd docker image.
Plugins in latest release (1.16.6)