Fluentd: Kubernetes Log Collection with Fluentd and Elasticsearch

log data flow from Fluentd to Elasticsearch

What is Fluentd?

Fluentd is a cross-platform data collector that is widely used to collect, transform, and ship logs to backends such as Elasticsearch. It decouples data sources from log storage systems by providing a unified logging layer in between.

In this quick-start demo, we’ll use Fluentd to collect, transform, and ship logs from Kubernetes Pods to an Elasticsearch cluster. Fluentd runs as a DaemonSet on every Kubernetes node, with the node’s container log directories mounted into the Pod.
It tails the container log files, filters and transforms the log records, and finally ships them to an Elasticsearch cluster, where they are indexed and stored.
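Under the hood, the fluentd-kubernetes-daemonset image used later in this post wires that pipeline together with a Fluentd configuration roughly like the sketch below. This is simplified and illustrative; the config actually bundled in the image is more elaborate.

```
# Simplified sketch of the pipeline; not the image's exact bundled config
<source>
  @type tail                          # tail container log files on the node
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
</source>

<filter kubernetes.**>
  @type kubernetes_metadata           # enrich records with Pod/Namespace metadata
</filter>

<match **>
  @type elasticsearch                 # ship records to Elasticsearch
  host elasticsearch.elk.svc.cluster.local
  port 9200
</match>
```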

Running a Fluentd DaemonSet on a Kubernetes Cluster

Prerequisites:
  • A Kubernetes cluster (for information on how to deploy a GKE cluster, see this post)
  • The kubectl command-line tool, configured to connect to the Kubernetes cluster
  • Admin privileges on the Kubernetes cluster
  • An up-and-running Elasticsearch cluster with a Kibana frontend (for information on how to deploy Elasticsearch and Kibana on GKE, see this post)
Connect to the GKE cluster
gcloud container clusters get-credentials demo-k8s-cluster
Create a Service Account and RBAC Permissions for Fluentd

We need to create a Kubernetes ServiceAccount and a ClusterRole with “get, list, watch” permissions on Pods and Namespaces, and bind them together using a ClusterRoleBinding. Fluentd uses these permissions to enrich log records with Pod metadata.

Copy the following configuration to a file called fluentd-sa.yaml.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    app: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system

Apply fluentd-sa.yaml using the kubectl command.

cloudshell:~/elk$ kubectl apply -f fluentd-sa.yaml
serviceaccount/fluentd created
clusterrole.rbac.authorization.k8s.io/fluentd created
clusterrolebinding.rbac.authorization.k8s.io/fluentd created
Deploying the Fluentd DaemonSet

Copy the following configuration to a file called fluentd.yaml and apply it with the kubectl command to deploy the Fluentd DaemonSet on all Kubernetes nodes.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.14.1-debian-elasticsearch7-1.0
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: "elasticsearch.elk.svc.cluster.local"
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENTD_SYSTEMD_CONF
            value: disable
          - name: FLUENT_CONTAINER_TAIL_EXCLUDE_PATH
            value: /var/log/containers/fluent*
          - name: FLUENT_ELASTICSEARCH_SSL_VERIFY
            value: "false"
          - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
            value: /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/ 
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

cloudshell:~/elk$ kubectl apply -f fluentd.yaml
daemonset.apps/fluentd created
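The FLUENT_CONTAINER_TAIL_PARSER_TYPE regex in the manifest above targets containerd/CRI-style log lines: a timestamp, the stream name, a one-character log tag, then the message. As a rough local illustration of what the capture groups pick up, the same pattern can be replayed with sed on a made-up log line:

```shell
# A made-up CRI-style container log line (timestamp and message are invented)
line='2021-10-05T12:00:00.123456789Z stdout F hello world'

# Replay the tail parser's capture groups with sed:
# group 1 = time, 2 = stream, 4 = logtag, 5 = log
echo "$line" | sed -E 's@^(.+) (stdout|stderr)( (.))? (.*)$@time=\1 stream=\2 logtag=\4 log=\5@'
# prints: time=2021-10-05T12:00:00.123456789Z stream=stdout logtag=F log=hello world
```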

The Fluentd DaemonSet above ships logs to the “elasticsearch.elk.svc.cluster.local” endpoint, which we deployed in an earlier post.

List the Fluentd Pods created by the DaemonSet using the ‘kubectl‘ command

cloudshell:~/elk$ kubectl get pods -n kube-system |grep fluentd
fluentd-5zdbt                                                  1/1     Running     0          2m14s
fluentd-c26mx                                                  1/1     Running     0          2m14s
fluentd-xd7j2                                                  0/1     Running     0          2m14s
kumar_valla@cloudshell:~/elk$
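Note that one of the Pods above is still reporting 0/1 ready. A quick way to count how many Fluentd Pods are fully ready is to filter the READY column with awk; the snippet below runs against a saved copy of the listing above, so it works without a cluster:

```shell
# Saved copy of the `kubectl get pods -n kube-system | grep fluentd` listing above
sample='fluentd-5zdbt   1/1   Running   0   2m14s
fluentd-c26mx   1/1   Running   0   2m14s
fluentd-xd7j2   0/1   Running   0   2m14s'

# Count Pods whose READY column ($2) reports all containers ready
echo "$sample" | awk '$2 == "1/1" { n++ } END { print n }'
# prints: 2
```

Against a live cluster, pipe `kubectl get pods -n kube-system | grep fluentd` into the same awk filter instead of the saved sample.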
Testing Log Collection

Now we are going to deploy a simple Pod that generates log messages, so we can check whether the Fluentd DaemonSet ships those logs to Elasticsearch. This Pod continuously prints a line like “hello, enjoy DevOps Counsel Fluetd demo” in a never-ending while loop.

Let’s deploy a Pod called counter to generate logs. Copy the following configuration to a file called logger-pod.yaml.

apiVersion: v1
kind: Pod
metadata:
 name: counter
spec:
 containers:
 - name: count
   image: busybox
   args: [/bin/sh, -c,
           'i=0; while true; do echo "$i: hello, enjoy DevOps Counsel Fluetd demo"; i=$((i+1)); sleep 1; done']

Deploy logger-pod.yaml using the kubectl command.

cloudshell:~/elk$ kubectl apply -f logger-pod.yaml
pod/counter created
cloudshell:~/elk$ kubectl get pods
NAME      READY   STATUS    RESTARTS   AGE
counter   1/1     Running   0          21s
cloudshell:~/elk$ kubectl logs counter |tail -5
76: hello, enjoy DevOps Counsel Fluetd demo
77: hello, enjoy DevOps Counsel Fluetd demo
78: hello, enjoy DevOps Counsel Fluetd demo
79: hello, enjoy DevOps Counsel Fluetd demo
80: hello, enjoy DevOps Counsel Fluetd demo

Now we will open the Kibana console and look for these logs.

On the Kibana console, go to ‘Analytics‘ -> ‘Discover‘ to search the logs stored in the Elasticsearch cluster, and type the log message above into the search box. There you can see the logs generated by the counter Pod.
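If you prefer querying Elasticsearch directly rather than through Kibana, a match-phrase search for the message would look roughly like the fragment below. It assumes the daemonset image's default logstash_format indices (logstash-*) and that the message text lands in the log field; adjust both to your setup.

```
GET /logstash-*/_search
{
  "query": {
    "match_phrase": { "log": "hello, enjoy DevOps Counsel Fluetd demo" }
  }
}
```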

Conclusion

In this quick-start demo we deployed a Fluentd DaemonSet on a GKE cluster to collect, transform, and ship logs to an Elasticsearch cluster. You can find more information about Fluentd in the official documentation.

For Elasticsearch and Kibana setup, refer to this earlier post.
