Open-source project: gregbkr/kubernetes-kargo-logging-monitoring
Repository URL: https://github.com/gregbkr/kubernetes-kargo-logging-monitoring
Primary language: Python 58.6%

Deploy kubernetes via kargo with logging (efk) & monitoring (prometheus) support

What you will get:
Prerequisites:
More info: you can find an overview of that setup on my blog: https://greg.satoshi.tech/

Summary:
1. Deploy kubernetes

We will deploy a base k8s cluster with multi-master, etcd, and dns support. You can modify the architecture depending on the values you set later in inventory.cfg.

1.1 Clone repo
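For example, on the bastion (a minimal sketch; the target folder is just the repo's default name):

```bash
# Clone this repository, which holds the logging/monitoring manifests used below
git clone https://github.com/gregbkr/kubernetes-kargo-logging-monitoring.git
cd kubernetes-kargo-logging-monitoring
```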
1.2 Deploy coreos nodes

Bastion: an ubuntu vm from which you will install the infra with the kargo tool (ansible recipes in the background) and manage/pilot the k8s cluster with kubectl. You need the latest version of ansible, and the netaddr package installed on the bastion too:
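A minimal sketch, assuming a pip-based install on the ubuntu bastion:

```bash
# Install ansible plus the netaddr module that kargo's inventory logic relies on
sudo apt-get update && sudo apt-get install -y python-pip
sudo pip install ansible netaddr
```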
Firewall: this setup doesn't manage firewall rules yet. Please create a security group with ports 0-40000 tcp & udp open between all k8s servers inside that group. The bastion will be outside this group; give it access to the nodes on ports 22, 80 and 443 so this ubuntu server can run the kargo ansible recipes (22/tcp) and kubectl (443/tcp). Open outside access to 80, 443/tcp, 5601/tcp (kibana), 3000 & 3002/tcp (grafana), 8080/tcp (traefik), 9090/tcp (prometheus), 9999/tcp (k8s-dashboard). If you need to implement a firewall, a good start is here: https://github.com/gregbkr/kubernetes-ansible-logging-monitoring/blob/master/ansible/roles/k8s/tasks/create_secgroup_rules.yml

Coreos: with your preferred cloud provider, or on baremetal, install a basic, up-to-date coreos on as many nodes as you need.

1.3 Deploy k8s

We are using the powerful kargo project. It is made of many ansible scripts that build your cloud and k8s on top. To pilot it you can use:
Kargo-cli is being rebuilt in go, so I will just use the underlying ansible recipes at the moment. Please clone kargo into your k8s repo:
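A minimal sketch (kargo has since been renamed kubespray upstream, so the clone URL may differ for you):

```bash
# Fetch the kargo ansible recipes next to this repo
git clone https://github.com/kubernetes-incubator/kargo.git
cd kargo
```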
First, fill the inventory file with your node info:
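A minimal sketch, assuming three nodes and kargo's inventory format at the time of writing (group names, hostnames and paths are placeholders, adjust to your cluster):

```bash
# Two masters (also etcd members) and three workers
cat > inventory/inventory.cfg <<'EOF'
[all]
node1 ansible_host=10.0.0.11 ip=10.0.0.11
node2 ansible_host=10.0.0.12 ip=10.0.0.12
node3 ansible_host=10.0.0.13 ip=10.0.0.13

[kube-master]
node1
node2

[etcd]
node1
node2
node3

[kube-node]
node1
node2
node3

[k8s-cluster:children]
kube-master
kube-node
EOF
```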
Deploy k8s
Note: you can also modify nginx_kube_apiserver_port instead of kube_apiserver_port to something other than 443 to enable loadbalancers as well. This would leave the API server running on port 443 while the internal proxy listens on a different port.

Set the ansible configuration with your key and inventory:
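A minimal sketch of an ansible.cfg, assuming an ssh key at ~/.ssh/id_rsa and the inventory path used above:

```bash
# Point ansible at the coreos nodes (user "core") and the inventory file
cat > ansible.cfg <<'EOF'
[defaults]
inventory = inventory/inventory.cfg
remote_user = core
private_key_file = ~/.ssh/id_rsa
host_key_checking = False
EOF
```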
You can now add logging support (efk) out of the box by adding the following flag:
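A minimal sketch (kargo exposes this flag in the cluster group_vars; the exact file path may differ in your checkout):

```bash
# Enable the elasticsearch/fluentd/kibana add-on at install time
echo "efk_enabled: true" >> inventory/group_vars/k8s-cluster.yml
```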
Then deploy k8s with ansible:
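A minimal sketch of the playbook run (the flags are the usual kargo ones, adjust to your setup):

```bash
# -b: become root on the nodes, -v: verbose output
ansible-playbook -i inventory/inventory.cfg cluster.yml -b -v
```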
Run it a few times until there are no more errors.
You can find ansible.cfg and inventory.yml examples in ./util

1.4 Install kubectl

Kubectl is your local admin client to pilot the k8s cluster. One version of kubectl is already present on the master, but it is better to have it locally, on your admin/bastion host. Please use the same version as the server. You will be able to talk to and pilot k8s with this tool.

Get kubectl
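A minimal sketch, assuming a linux/amd64 bastion (replace v1.x.y with the version reported by kubectl version on the master):

```bash
# Download a kubectl binary matching the server version and put it on the PATH
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.x.y/bin/linux/amd64/kubectl
chmod +x kubectl && sudo mv kubectl /usr/local/bin/kubectl
```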
Get the cert from master
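A minimal sketch, assuming kargo's default certificate location on the first master (file names may vary slightly between versions):

```bash
# Copy the CA and admin certificates generated by kargo to the bastion
scp core@master1_ip:/etc/kubernetes/ssl/ca.pem .
scp core@master1_ip:/etc/kubernetes/ssl/admin-*.pem .
```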
Configure kubectl
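A minimal sketch using the certificates copied above (cluster/user names, file names and the api port are assumptions, adjust them to your cluster):

```bash
# Create a context pointing at the cluster with the copied certs, then test it
kubectl config set-cluster my-k8s --server=https://master1_ip:443 --certificate-authority=ca.pem
kubectl config set-credentials my-admin --client-certificate=admin.pem --client-key=admin-key.pem
kubectl config set-context my-k8s --cluster=my-k8s --user=my-admin
kubectl config use-context my-k8s
kubectl get nodes   # should list your coreos nodes
```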
Autocompletion
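A minimal sketch for bash on the bastion:

```bash
# Enable kubectl completion for the current shell and future ones
source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc
```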
If you hit issues, see the troubleshooting section. Do you want to migrate k8s or add a new node? Please see the annexes.

2. Deploy logging (efk) to collect k8s & containers events

Please note that you may already have efk deployed if you enabled the efk_enabled flag when you installed via kargo. If that is the case, you don't need to deploy anything, just access the right service:
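A minimal check (kargo's efk add-on is assumed to live in the kube-system namespace):

```bash
# Look for the elasticsearch/kibana services deployed by kargo
kubectl get svc,pods --namespace kube-system | grep -Ei 'kibana|elasticsearch'
```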
I haven't mapped this service to any lb yet (it was added after this doc was written). You can't access the kibana UI directly because no host port (only an internal one) is declared in the service configuration. To give access to the service you will have to modify the lb:
And deploy it (see the lb section). The same goes for traefik: you need to edit the name and namespace of the service:
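A minimal sketch of such an edit, assuming kargo named the service kibana-logging in kube-system (check with the kubectl get svc command above); the file name and host are placeholders:

```bash
# Point the kibana ingress at kargo's efk service instead of the local one
cat > kibana-ingress.yaml <<'EOF'
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kibana
  namespace: kube-system
spec:
  rules:
  - host: kibana.satoshi.tech
    http:
      paths:
      - backend:
          serviceName: kibana-logging
          servicePort: 5601
EOF
kubectl apply -f kibana-ingress.yaml
```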
2.1 Deploy elasticsearch, fluentd, kibana
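A minimal sketch, assuming the manifests sit in this repo's logging folder:

```bash
# Create the deployments and services for the efk stack
kubectl apply -f logging/
```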
2.2 Access services

From here, you should be able to access our services from your laptop, as long as your cloud server IPs are public:
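For example, using the NodePorts declared in the service manifests:

```bash
# Elasticsearch should answer with its json banner; kibana is a web UI
curl http://any_minion_node_ip:39200
# Browse to http://any_minion_node_ip:35601 for kibana
```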
To enable that access, we set type: NodePort and nodePort: 35601/39200 in the kibana/elasticsearch service manifests, to make things easier to learn at this point. Because we want to control how and from where our public services are accessed, we will set up a loadbalancer in a later section.

2.3 See logs in kibana

Check that logs are coming into kibana: you just need to refresh, select Time-field name: @timestamp and click Create. Load and view your first dashboard: Management > Saved Objects > Import > logging/dashboards/elk-v1.json

3. Monitoring services and containers

It seems like two schools are gently fighting over container monitoring:
More on which one to choose: kubernetes-retired/heapster#645

3.1 Monitoring with prometheus

Create monitoring containers
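A minimal sketch, assuming the prometheus/grafana manifests sit in this repo's monitoring folder:

```bash
# Deploy prometheus, its exporters and grafana in the monitoring namespace
kubectl apply -f monitoring/
```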
Prometheus

Access the gui: http://any_minion_node_ip:30090

Go to Status > Targets: you should see mostly green. We get one false-positive error scraping k8s-node on two ports, 9102 and 80; as long as 9102 is good, we get the data. If you see "context deadline exceeded" or "getsockopt connection refused", you will have to open firewall rules between the nodes. For example, in the k8s security group you need to open 9100 and 10255.

Try a query: "node_memory_Active" > Execute > Graph --> you should see 2 lines representing both nodes.

Grafana

Log in to the interface (login: admin | pass: admin): http://any_minion_node_ip:30000

Load some dashboards: dashboard > home

Load other public dashboards: Grafana GUI > Dashboards > Import

Already loaded:
Other good dashboards:
3.2 Monitoring2 with heapster
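A minimal sketch, assuming heapster/influxdb/grafana manifests in a monitoring2 folder matching the monitoring2 namespace mentioned below:

```bash
# Deploy heapster, influxdb and a second grafana in the monitoring2 namespace
kubectl apply -f monitoring2/
```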
Access services
You can load the Cluster or Pods dashboards. When viewing Pods, type "namespace=monitoring2" manually to view stats for the related containers.

4. Kubernetes dashboard addon

The dashboard addon lets you see k8s services and containers via a nice GUI.
Access the GUI: http://any_minion_node_ip:30999

5. Docker private registry addon

I use the manifests here, with some small modifications. For some reason, I have to force the image docker.io/registry:2, otherwise docker can't connect to the deployed registry. We use the registry without a password or tls to make it easier. The docker daemon on the nodes should already have the service network range configured as an insecure registry, so once you make the registry service available on the k8s network, docker will be able to use it. To check the daemon on one node (you should see --insecure-registry=10.233.0.0/18):
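For example, from the bastion:

```bash
# The dockerd command line on a coreos node should carry the insecure-registry flag
ssh core@any_minion_node_ip "ps aux | grep -o 'insecure-registry=[0-9./]*'"
```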
Deploy the registry:
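A minimal sketch, assuming the manifests sit in this repo's registry folder:

```bash
# Create the registry deployment and its service
kubectl apply -f registry/
```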
In order to reach the registry from outside or from within k8s, please set up traefik (see the lb section) and create a dns record (ex: registry.satoshi.tech) pointing to your lb node. Then add an ingress config so traefik is aware of the registry service:
Get the ip of the service, and check service access from one node
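A minimal sketch (the registry listens on its default port 5000; replace REGISTRY_CLUSTER_IP with the IP returned by kubectl):

```bash
# Find the registry service IP, then query its catalog from a node
kubectl get svc --all-namespaces | grep registry
ssh core@any_minion_node_ip "curl -s http://REGISTRY_CLUSTER_IP:5000/v2/_catalog"
```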
Or from the ubuntu image:
Or from anywhere:
Add an image to the private registry
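A minimal sketch run from one of the nodes (the service range is already trusted as insecure there; REGISTRY_CLUSTER_IP is the IP found above):

```bash
# Tag a small public image with the registry service address and push it
docker pull busybox
docker tag busybox REGISTRY_CLUSTER_IP:5000/busybox
docker push REGISTRY_CLUSTER_IP:5000/busybox
```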
6. Gitlab for CI & CD

This setup is based on the great blog of lwolfs. First, please edit the gitlab config:
Deploy gitlab
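A minimal sketch, assuming the manifests sit in this repo's gitlab folder:

```bash
# Create the gitlab deployments and services defined in the folder
kubectl apply -f gitlab/
```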
Use traefik as a loadbalancer (see the lb section), create a dns record pointing to your lb_node (ex: gitlab.satoshi.tech) and deploy the related ingress:
You should be able to access gitlab (login: root / rootpassword) at http://gitlab.satoshi.tech or http://a_minion_ip:30088

Set up the cache server for docker runner builds
Create the runner cache folder
Register a runner to gitlab:
Answer the questions:
Delete that temp container:
Edit your final runner config and replace:
Deploy runner
Gitlab CI and CD

We will now test a full continuous integration run (docker build + test of a simple flask app saying helloworld) and a simplified continuous deployment (redeploy the flask container in k8s, on staging and prod, which are the same k8s here). For CD, your runner will need access to k8s, so bring the kubectl certificates over to the ci folder (for production, please be careful when sharing certificates in a repo).
Create a repo (k8s-ci) in gitlab GUI, and link the ci folder to that repo, such as mine:
Then edit the gitlab-ci config
And push:
Go to gitlab to check the pipeline and its status. If you get an error, try hitting the retry button, or remove most of gitlab-ci.yml and try again. Check the app to get the "hello world" message:
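A minimal sketch (how the app is reachable depends on your gitlab-ci.yml; the hostname here is only a placeholder):

```bash
# Ask the deployed flask app for its greeting
curl http://helloworld.satoshi.tech
```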
At the end, you should be able to edit app.py to say "helloworld 3!", push the code, and see/curl the result once the pipeline is finished!

7. LoadBalancers

If you are on aws or google cloud, these providers will automatically set up a loadbalancer matching the *-ingress.yaml configuration. For all other cloud providers and baremetal, you will have to take care of that step yourself. Luckily, I will present two types of loadbalancer below ;-)
7.1 Service-loadbalancer

Because kargo runs an nginx proxy for kubelet to access the api on all minions, port 443 is not available by default for any lb to listen for public requests. You should change either the kube_apiserver_port or nginx_kube_apiserver_port option described in the kargo configuration above. Create the load-balancer to be able to reach your services from the internet. Give 1 or more nodes the loadbalancer role:
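A minimal sketch (the label key/value and node name are the usual service-loadbalancer conventions and may differ in this repo):

```bash
# Mark a minion as loadbalancer, then start the daemonset that binds the host ports
kubectl label node minion1 role=loadbalancer
kubectl apply -f service-loadbalancer.yaml
```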
If you change the config, use "kubectl delete -f service-loadbalancer.yaml" to force a delete/create, then the discovery of the newly created services. Add/remove services? Please edit service-loadbalancer.yaml

Access services
7.2 Traefik

Any new service exposed by a *-ingress.yaml will be caught by traefik and made available without a restart. To experience the full power of traefik, please purchase a domain name (ex: satoshi.tech) and point that record to the node you chose to be the lb. This record will help create the automatic certificate via the acme standard.
Then, for each service you will use, create a dns A record:
Based on which name you use to access the lb_node, traefik will forward to the right k8s service. Now you need to edit the configuration:
Label a minion as "loadbalancer" (see the previous section) and create the dynamic proxy to be able to reach your services from the internet.
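A minimal sketch (the label and manifest name mirror the previous section and are assumptions about this repo's layout):

```bash
# Reuse the loadbalancer label and start traefik as the dynamic proxy
kubectl label node minion1 role=loadbalancer --overwrite
kubectl apply -f traefik.yaml
```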
Access services

If set in traefik, please use login/pass: test/test. You can use http or https.
7.3 Security considerations

These lb nodes are a kind of DMZ server where you could later balance your DNS queries. For a production environment, I would recommend that only DMZ services (service-loadbalancer, traefik, nginx, ...) run here, because these servers apply less restrictive firewall rules (ex: open 80, 443, 5601, 3000) than the other minion k8s nodes. So I would create a second security group (sg), k8s-dmz, with the same rules as k8s, plus rules between both zones so k8s services can talk to k8s and k8s-dmz. Then open 80, 443, 5601, 3000 for k8s-dmz only. Like this, the k8s sg still protects the more sensitive containers from direct public access/scans.

The same applies to the master node. I would create a new sg for it, k8s-master, so only this group permits access from kubectl (ports 80, 443).

Then you should remove all NodePorts from the service configurations, so no service is exposed when scanning a classic minion. For that, please comment out the "# type: NodePort" section in all *-service.yaml files.

7.4 Scaling loadbalancers

Add more loadbalancers by adding more loadbalancer nodes. Because we use the Daemonset type of job, every new node tagged as loadbalancer will spawn a loadbalancer container.

Use ansible to add a node
Label it as a loadbalancer node
Then just check the new containers getting created: kubectl get all --all-namespaces

For service-loadbalancer, try to access new_lb_minion_ip:5601. For traefik, add a dns A record kibana.satoshi.tech --> new_lb_minion_ip so dns resolution is balanced between the old and new lb_node. Test some pings, and access kibana.satoshi.tech a few times...

8. Data persistence

In this setup, if you lose the influxdb or elasticsearch containers, k8s will restart the container but you will lose the data. You have a few options to make your data persistent:
I will demonstrate the first 3 solutions. More info on volume types: https://kubernetes.io/docs/user-guide/volumes/

8.1 EmptyDir

If you open the influxdb deployment, you will notice that it is already configured for "emptyDir". So if the container crashes and gets restarted on the same node, your data will stay. But if you delete the container, or it gets rescheduled on another node, you will lose the data.
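A minimal check (the namespace follows the heapster section above; adjust if your influxdb lives elsewhere):

```bash
# Show which monitoring2 deployments already declare an emptyDir volume
kubectl get deployments --namespace monitoring2 -o yaml | grep -B4 emptyDir
```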
8.2 HostPath

We mount into the container a folder that lives physically on the node where the container runs. This data is persistent, so you can kill the container and restart it to get the data back, as long as you don't change nodes. It is a good idea, then, to label one node so influxdb is always deployed on the node where the data lives.
And use the tag below:
The storage config:
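A minimal sketch combining the node label, the nodeSelector tag and a hostPath volume (the label name and volume details are illustrative; /srv/influxdb matches the check command below):

```bash
# Pin influxdb to the node that holds the data
kubectl label node your_influx_node influxdb=data

# Fragment to merge into the influxdb deployment: schedule on that node
# and mount /srv/influxdb from the host instead of an emptyDir volume
cat <<'EOF'
      nodeSelector:
        influxdb: data
      containers:
      - name: influxdb
        volumeMounts:
        - name: influxdb-storage
          mountPath: /var/lib/influxdb
      volumes:
      - name: influxdb-storage
        hostPath:
          path: /srv/influxdb
EOF
```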
Check that the data is indeed on the host: ssh -i ~/.ssh/id_rsa_sbexx core@your_influx_node sudo ls /srv/influxdb