
Centralized Logging for Windows Containers on Amazon EKS using Fluent Bit

Introduction

Today, Amazon Web Services (AWS) announced support for Fluent Bit container images for the Windows operating system. This support eliminates the need for Windows customers to implement custom logging solutions in their application code or to manage custom agents on their Windows nodes to scrape the logs. For more details about the supported Windows versions and the Fluent Bit image tags, please visit our documentation here.

Fluent Bit is a fast and flexible log processor and router supported by various operating systems. It's used to route logs to various AWS destinations, such as Amazon CloudWatch, Amazon Kinesis Data Firehose, Amazon Simple Storage Service (Amazon S3), and Amazon OpenSearch Service. In addition to the AWS destinations, Fluent Bit supports common partner solutions such as Datadog, Splunk, custom HTTP servers, and many more. For more details about container log routing, please visit the blog link here. AWS already supports Fluent Bit container images based on the Linux operating system, and this release allows customers to have a centralized mechanism for processing and routing their logs across Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Kubernetes Service (Amazon EKS) for both Linux and Windows workloads.

In this post, we cover how Amazon EKS customers can deploy Fluent Bit Windows images as a DaemonSet on their Windows nodes to stream Internet Information Services (IIS) logs generated in the Windows pods to Amazon CloudWatch Logs as a way to centralize logging. We'll configure Fluent Bit to route all the logs generated by pods in different namespaces to their respective CloudWatch log groups. Additionally, each log entry will be enriched with Kubernetes metadata such as namespace, pod name, container name, image name, host name, and so on. For more details on deploying Fluent Bit on an Amazon ECS Windows cluster, please visit our blog post here.

Prerequisites

Prerequisites and assumptions:

  • An Amazon EKS cluster (1.20 or newer) up and running. See this step-by-step guide.
  • Amazon EKS Windows worker nodes launched. See this step-by-step guide.
  • You have properly installed and configured the AWS Command Line Interface (AWS CLI), eksctl, and kubectl.
  • For building the Windows container image, you have created an Amazon Elastic Compute Cloud (Amazon EC2) Windows instance with Docker installed, based on the same Windows version as the Amazon EKS Windows worker nodes. Alternatively, you can use Amazon EC2 Image Builder to build your container image.
  • You have created an Amazon Elastic Container Registry (Amazon ECR) repository to host the Windows container image. See this step-by-step tutorial.

Solution overview

In this post, we’ll complete the following tasks:

  1. Validate the Windows worker nodes are up and running.
  2. Create Kubernetes namespace amazon-cloudwatch in which Fluent Bit runs.
  3. Configure the AWS Identity and Access Management (IAM) policies required to enable Fluent Bit to send logs to the required destinations.
  4. Create a config map with the required configuration.
  5. [Optional] Build a Windows container image containing IIS and LogMonitor.
  6. Deploy Fluent Bit on Windows node as a DaemonSet.
  7. Deploy Windows container image containing IIS and LogMonitor.
  8. [Optional] Access the IIS pods to generate logs.
  9. Check the logs in Amazon CloudWatch Logs.
  10. Clean up the created resources.

Architecture of setup used in this blog post

Walkthrough

1. Validate the Windows worker nodes are up and running

To follow along with this tutorial, you must have an existing Amazon EKS cluster with Windows nodes running in it. To create one, please follow our guide here.

  • To check that Windows worker nodes are ready, run the following command.
kubectl get nodes -o wide

You can expect output similar to the following:

NAME                                           STATUS   ROLES    AGE   VERSION                INTERNAL-IP      OS-IMAGE                         KERNEL-VERSION                 CONTAINER-RUNTIME
ip-192-168-24-199.us-west-2.compute.internal   Ready    <none>   13h   v1.23.12-eks-d2d28e7   192.168.24.199   Windows Server 2019 Datacenter   10.0.17763.3532                docker://20.10.17
ip-192-168-47-140.us-west-2.compute.internal   Ready    <none>   13h   v1.23.12-eks-d2d28e7   192.168.47.140   Windows Server 2019 Datacenter   10.0.17763.3532                containerd://1.6.6
ip-192-168-9-130.us-west-2.compute.internal    Ready    <none>   13h   v1.23.9-eks-ba74326    192.168.9.130    Amazon Linux 2                   5.4.209-116.367.amzn2.x86_64   docker://20.10.17

As you can see, we have two Windows Server 2019 nodes running Kubernetes version 1.23, using the Docker and containerd runtimes, respectively.

2. Create Kubernetes namespace amazon-cloudwatch for Fluent Bit

Kubernetes namespaces provide a scope for names and organize your workload inside the cluster. For consolidating all the Fluent Bit resources, we’ll create a namespace named amazon-cloudwatch.

2.1 Create a file with the following content and name it namespace.yaml:

apiVersion: v1
kind: Namespace
metadata:
  name: amazon-cloudwatch
  labels:
    name: amazon-cloudwatch

2.2 Run the following command to create the namespace:

kubectl apply -f namespace.yaml

3. Configure the IAM policies required to enable Fluent Bit to send logs to the required destinations

Fluent Bit requires specific IAM permissions to send logs to AWS destinations. The permissions required for a Fluent Bit output plugin are listed in the documentation for that specific output plugin. For example, we use Amazon CloudWatch as the log destination, so the following IAM permissions apply:

{
    "Version": "2012-10-17",
    "Statement": [
    {
        "Effect": "Allow",
        "Action": [
            "logs:CreateLogStream",
            "logs:CreateLogGroup",
            "logs:PutLogEvents"
        ],
        "Resource": "*"
    }
    ]
}

You can also check for additional details in the official Fluent Bit documentation.

Going by the principle of least privilege, instead of creating and distributing AWS credentials to the containers or using the Amazon EC2 instance's role, you can use IAM Roles for Service Accounts (IRSA) and configure your pods to use a Kubernetes service account. For more details, please refer to the documentation here.
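For reference, the IAM role created for IRSA carries a trust policy that restricts it to this specific service account. A sketch of what eksctl generates is shown below; the account ID, Region, and OIDC provider ID are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::111122223333:oidc-provider/oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:amazon-cloudwatch:fluent-bit-windows"
        }
      }
    }
  ]
}
```

The `sub` condition pins the role to the fluent-bit-windows service account in the amazon-cloudwatch namespace, so no other pod in the cluster can assume it.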

Alternatively, Fluent Bit can use the permissions from the IAM role attached to the Amazon EC2 instance. For more details on how to attach specific IAM policies when creating nodegroups using eksctl, please visit the schema documentation here. You can also attach the IAM policies once the nodegroup has been created. Please refer to documentation here.

3.1 To configure IAM roles for service accounts (i.e., IRSA) on Amazon EKS, we need to associate an IAM OpenID Connect (OIDC) provider to the Amazon EKS cluster. Check if your Amazon EKS cluster has an existing IAM OIDC provider by executing the following command:

# Please replace <CLUSTER_NAME> and <REGION> with the actual values 
# before running the command.
aws eks describe-cluster --name <CLUSTER_NAME> --region <REGION> --query "cluster.identity.oidc.issuer" --output text

3.2 If you need to create an IAM OIDC provider, run the following command:

# Please replace <CLUSTER_NAME> and <REGION> with the actual values 
# before running the command.
eksctl utils associate-iam-oidc-provider  --region <REGION> --cluster <CLUSTER_NAME> --approve 

3.3 Create a file named fluent-bit-policy.json with the policy mentioned above, then run the following command to create the IAM policy:

aws iam create-policy --policy-name fluent-bit-policy --policy-document file://fluent-bit-policy.json
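You'll need the new policy's ARN in step 3.4. A convenient (hypothetical) variant of the command above captures it directly with the AWS CLI's `--query` option; the actual call needs AWS credentials, so it is shown commented out here, with the shape of the resulting ARN below (111122223333 is a placeholder account ID):

```shell
# With credentials configured, the ARN can be captured in one step:
#   FLUENT_BIT_POLICY_ARN=$(aws iam create-policy \
#       --policy-name fluent-bit-policy \
#       --policy-document file://fluent-bit-policy.json \
#       --query 'Policy.Arn' --output text)

# The returned ARN has the following shape:
ACCOUNT_ID="111122223333"   # placeholder account ID
FLUENT_BIT_POLICY_ARN="arn:aws:iam::${ACCOUNT_ID}:policy/fluent-bit-policy"
echo "$FLUENT_BIT_POLICY_ARN"
```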

3.4 Create the IAM service account and attach the policy previously created.

Run the following command to create IAM service account using eksctl:

# Please replace <CLUSTER_NAME>, <REGION>, and 
# <FLUENT_BIT_POLICY_ARN> with the actual values before running the commands.
eksctl create iamserviceaccount --cluster <CLUSTER_NAME> \
--region <REGION> \
--attach-policy-arn <FLUENT_BIT_POLICY_ARN> \
--name fluent-bit-windows \
--namespace amazon-cloudwatch \
--approve

To configure the same using the AWS CLI, please refer to the documentation here.

4. Create a config map with the required configuration

We need to provide a few details to Fluent Bit to configure it, namely the cluster name and Region. We accomplish this using a ConfigMap.

4.1 Run the following to create a config map for providing configuration options to Fluent Bit:

# Please replace <CLUSTER_NAME> and <REGION> with the actual values before running the commands.
ClusterName=<CLUSTER_NAME>
RegionName=<REGION>
FluentBitReadFromHead='Off'

kubectl create configmap fluent-bit-cluster-info \
--from-literal=cluster.name=${ClusterName} \
--from-literal=logs.region=${RegionName} \
--from-literal=read.head=${FluentBitReadFromHead} -n amazon-cloudwatch 

By default, Fluent Bit reads log files from the tail and captures only new logs written after it's deployed. If you want the opposite, set FluentBitReadFromHead='On' and it will collect all available logs in the file system.

5. [Optional] Build a Windows container image containing IIS and LogMonitor

If you already have a Windows container image for your application, please feel free to skip this section.

5.1 To test the functionality explained in this post, we create a Windows container image containing IIS and LogMonitor. By default, Windows containers send logs to Event Tracing for Windows (ETW), the Event Log, and custom log files. However, log processors such as Fluent Bit and Fluentd fetch container logs from an STDOUT pipeline, which doesn't exist in Windows containers. LogMonitor is an open-source tool created by Microsoft that creates an STDOUT pipeline within the Windows container so such tools can fetch logs the same way they do in a Linux environment.

For instructions on how to use LogMonitor, see the official GitHub repository or the Microsoft blog post about it.

In the example below, we have a Dockerfile that builds the Windows container image containing IIS and LogMonitor.

FROM mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
 
#Set powershell as default shell
SHELL ["powershell", "-NoLogo", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]
 
#Add X-Forward-For Header to IIS Default website log
RUN Add-WebConfigurationProperty -pspath 'MACHINE/WEBROOT/APPHOST' -filter "system.applicationHost/sites/siteDefaults/logFile/customFields" -name "." -value @{logFieldName='X-Forwarded-For';sourceName='X-Forwarded-For';sourceType='RequestHeader'}
 
#Add STDOUT LogMonitor binary and config in json format
COPY LogMonitor.exe LogMonitorConfig.json 'C:\LogMonitor\'
WORKDIR /LogMonitor
 
ENTRYPOINT ["C:\\LogMonitor\\LogMonitor.exe", "powershell.exe"]
CMD ["C:\\ServiceMonitor.exe w3svc;"]

LogMonitorConfig.json

This sample LogMonitorConfig configuration retrieves all log files with the extension .log saved in C:\inetpub\logs and its subdirectories, including the IIS access logs.

{
  "LogConfig": {
    "sources": [
      {
        "type": "EventLog",
        "startAtOldestRecord": true,
        "eventFormatMultiLine": false,
        "channels": [
          {
            "name": "system",
            "level": "Error"
          }
        ]
      },
      {
        "type": "File",
        "directory": "c:\\inetpub\\logs",
        "filter": "*.log",
        "includeSubdirectories": true
      },
      {
        "type": "ETW",
        "providers": [
          {
            "providerName": "IIS: WWW Server",
            "ProviderGuid": "3A2A4E84-4C21-4981-AE10-3FDA0D9B0F83",
            "level": "Information"
          },
          {
            "providerName": "Microsoft-Windows-IIS-Logging",
            "ProviderGuid": "7E8AD27F-B271-4EA2-A783-A47BDE29143B",
            "level": "Information",
            "keywords": "0xFF"
          }
        ]
      }
    ]
  }
}

Once the build completes, push the image to your Amazon ECR repository.
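The push sequence on the Windows build instance looks roughly like the following. The account ID, Region, repository name, and tag are placeholders to replace with your own values; the Docker and AWS CLI calls are shown commented since they require the build instance and credentials:

```shell
# Hypothetical values - replace with your own before running.
ACCOUNT_ID="111122223333"
REGION="us-west-2"
REPO="iis-logmonitor"

REGISTRY="${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com"
IMAGE_URI="${REGISTRY}/${REPO}:windows-ltsc2019"
echo "$IMAGE_URI"

# On the Windows build instance (requires Docker and the AWS CLI):
#   docker build -t "$IMAGE_URI" .
#   aws ecr get-login-password --region "$REGION" | \
#       docker login --username AWS --password-stdin "$REGISTRY"
#   docker push "$IMAGE_URI"
```

The resulting image URI is what you'll substitute for <IIS_CONTAINER_IMAGE> in step 7.1.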

6. Deploy Fluent Bit on Windows nodes as a DaemonSet

A DaemonSet ensures that all (or some) nodes run a copy of a pod; as nodes are added to the cluster, pods are added to them. To make sure that every Windows worker node runs a copy of the Windows Fluent Bit pod, we deploy a DaemonSet using the manifest specified in the following steps.

6.1 Copy the following manifest into a file named fluent-bit-daemon-set.yaml.

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-windows-role
  namespace: amazon-cloudwatch
rules:
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
  - apiGroups: [""]
    resources:
      - namespaces
      - pods
      - pods/logs
      - nodes
      - nodes/proxy
    verbs: ["get", "list", "watch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluent-bit-windows-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-windows-role
subjects:
# Assuming that the Service Account was created earlier with name fluent-bit-windows.
- kind: ServiceAccount
  name: fluent-bit-windows
  namespace: amazon-cloudwatch
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-windows-config
  namespace: amazon-cloudwatch
  labels:
    k8s-app: fluent-bit-windows
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush                       5
        Log_Level                   info
        Daemon                      off
        net.dns.resolver            LEGACY
        Parsers_File                parsers.conf
        
    @INCLUDE application-log.conf
 
  application-log.conf: |
    [INPUT]
        Name                tail
        Tag                 application.*
        Exclude_Path        C:\\var\\log\\containers\\fluent-bit*
        Path                C:\\var\\log\\containers\\*.log
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  container_firstline
        Parser              docker
        DB                  C:\\var\\fluent-bit\\state\\flb_container.db
        Read_from_Head      ${READ_FROM_HEAD}
 
    [INPUT]
        Name                tail
        Tag                 application.*
        Path                C:\\var\\log\\containers\\fluent-bit*
        Parser              docker
        DB                  C:\\var\\fluent-bit\\state\\flb_log.db
        Read_from_Head      ${READ_FROM_HEAD}
 
    [FILTER]
        Name                kubernetes
        Match               application.*
        Kube_URL            https://kubernetes.default.svc.cluster.local:443
        Kube_Tag_Prefix     application.C.var.log.container.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
        Labels              Off
        Annotations         Off
        Use_Kubelet         Off
        Buffer_Size         0
 
    [OUTPUT]
        Name                cloudwatch_logs
        Match               application.*_default_*
        region              ${AWS_REGION}
        log_group_name      /aws/containerinsights/${CLUSTER_NAME}/default
        log_stream_prefix   ${HOST_NAME}-
        auto_create_group   true
        extra_user_agent    container-insights
 
    [OUTPUT]
        Name                cloudwatch_logs
        Match               application.*_amazon-cloudwatch_*
        region              ${AWS_REGION}
        log_group_name      /aws/containerinsights/${CLUSTER_NAME}/amazon-cloudwatch
        log_stream_prefix   ${HOST_NAME}-
        auto_create_group   true
        extra_user_agent    container-insights
 
  parsers.conf: |
    [PARSER]
        Name                docker
        Format              json
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ
 
    [PARSER]
        Name                container_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ
 
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit-windows
  namespace: amazon-cloudwatch
  labels:
    k8s-app: fluent-bit-windows
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    matchLabels:
      k8s-app: fluent-bit-windows
  template:
    metadata:
      labels:
        k8s-app: fluent-bit-windows
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - name: fluent-bit-windows
        image: public.ecr.aws/aws-observability/aws-for-fluent-bit:windowsservercore-latest
        imagePullPolicy: Always
        env:
          - name: AWS_REGION
            valueFrom:
              configMapKeyRef:
                name: fluent-bit-cluster-info
                key: logs.region
          - name: CLUSTER_NAME
            valueFrom:
              configMapKeyRef:
                name: fluent-bit-cluster-info
                key: cluster.name
          - name: HOST_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: READ_FROM_HEAD
            valueFrom:
              configMapKeyRef:
                name: fluent-bit-cluster-info
                key: read.head
        resources:
          limits:
            memory: 600Mi
          requests:
            cpu: 500m
            memory: 600Mi
        volumeMounts:
        # Only read-only access to the following mounts is required
        - name: fluentbitstate
          mountPath: C:\var\fluent-bit\state
        - name: varlog
          mountPath: C:\var\log
          readOnly: true
        - name: varlibdockercontainers
          mountPath: C:\ProgramData\docker\containers
          readOnly: true
        - name: fluent-bit-config
          mountPath: C:\fluent-bit\etc\
          readOnly: true          
      terminationGracePeriodSeconds: 30
      volumes:
      - name: fluentbitstate
        hostPath:
          path: C:\var
      - name: varlog
        hostPath:
          path: C:\var\log
      - name: varlibdockercontainers
        hostPath:
          path: C:\ProgramData\docker\containers
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-windows-config
      nodeSelector:
        kubernetes.io/os: windows          
      serviceAccountName: fluent-bit-windows

Based on the above configuration, Fluent Bit will:

  • Use the tail input plugin to find new log entries appended to the container-specific log files maintained by the container runtime.
  • Use the Fluent Bit Kubernetes filter to append additional metadata to each log entry by querying the Kubernetes API server.
  • Send application logs of pods running in the default Kubernetes namespace to the Amazon CloudWatch log group ending with default.
  • Send logs generated by the Fluent Bit pods running in the amazon-cloudwatch Kubernetes namespace to the Amazon CloudWatch log group ending with amazon-cloudwatch.

The tag attached to each log entry has the format application.C.var.log.container.<POD_NAME>_<NAMESPACE_NAME>_<CONTAINER_NAME>-<DOCKER_ID>. In the above Fluent Bit configuration, we route logs based on namespaces using the wildcards in the Match settings of the output plugins. For more details about how Fluent Bit can be configured to use Tag and Match, please visit the documentation here.
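To make the routing concrete, here is a small sketch using a made-up pod name and container ID. A shell case statement mirrors the wildcard Match rules from the OUTPUT sections, showing which log group a given tag lands in (CLUSTER_NAME stands in for your cluster name):

```shell
# Hypothetical tag for a winiis pod in the "default" namespace;
# the pod name and container ID are made-up examples.
tag="application.C.var.log.container.winiis-5f6d8b9c7-abcde_default_winiis-0123456789abcdef.log"

# Each pattern below mirrors a Match rule from an OUTPUT section.
case "$tag" in
  application.*_default_*)
    dest="/aws/containerinsights/CLUSTER_NAME/default" ;;
  application.*_amazon-cloudwatch_*)
    dest="/aws/containerinsights/CLUSTER_NAME/amazon-cloudwatch" ;;
  *)
    dest="(no matching OUTPUT)" ;;
esac
echo "$dest"
```

Because the namespace is embedded between underscores in the tag, a wildcard such as *_default_* is enough to route per namespace without regular expressions.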

You can also visit our documentation here for instructions on customizing your Amazon CloudWatch Log Group or Stream with Kubernetes metadata.

Note: In this post, we configured the Kubernetes filter to use the Kubernetes API server instead of the kubelet endpoint to query the additional metadata. This is because kubelet runs in the host network namespace on Windows, while Fluent Bit runs in the container network. The kubelet endpoint can be used if the Fluent Bit pods are launched as HostProcess pods.

6.2 Deploy this manifest using the following command:

kubectl apply -f fluent-bit-daemon-set.yaml
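After the apply, you can confirm that one Fluent Bit pod is running per Windows node. The verification commands need cluster access, so they are shown commented; the selector is the k8s-app label from the manifest above:

```shell
# Label selector matching the pods created by the DaemonSet manifest.
selector="k8s-app=fluent-bit-windows"
echo "$selector"

# With kubectl configured against your cluster:
#   kubectl get daemonset fluent-bit-windows -n amazon-cloudwatch
#   kubectl get pods -n amazon-cloudwatch -l "$selector" -o wide
```

The DESIRED and READY counts of the DaemonSet should match the number of Windows worker nodes in the cluster.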

7. Deploy Windows container image containing IIS and LogMonitor

In this step, we deploy Windows pods on the Windows worker nodes.

7.1 Create a deployment file named windows_manifest.yaml with the following content:

# Please replace <IIS_CONTAINER_IMAGE> with the image created in step 5.1
---
apiVersion: v1
kind: Service
metadata:
  name: winiis
  labels:
    run: winiis
spec:
  ports:
  - port: 80
    protocol: TCP
  selector:
    run: winiis
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: winiis
  name: winiis
  namespace: default
spec:
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      run: winiis
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        run: winiis
    spec:
      containers:
      - image: <IIS_CONTAINER_IMAGE>
        imagePullPolicy: IfNotPresent
        name: winiis
      dnsPolicy: ClusterFirst
      nodeSelector:
        kubernetes.io/os: windows
      restartPolicy: Always
      terminationGracePeriodSeconds: 30

7.2 Deploy the manifest using the following command:

kubectl apply -f windows_manifest.yaml

8. [Optional] Access the IIS pods to generate logs

This is an optional step. You can either wait for actual traffic to hit your Windows pods hosting the IIS web server, or you can log in to a container and generate requests yourself.

8.1 Run the following command to log into your container:

kubectl exec -it <your_winiis_pod_name> -- powershell

8.2 From inside the container, run the following command to hit the web server:

Invoke-WebRequest winiis -UseBasicParsing

9. Check the logs in Amazon CloudWatch Logs

In this step, we log in to the Amazon CloudWatch console and observe the logs that were generated. The Amazon CloudWatch console can be accessed via the link here. Open the Log groups console from the side panel via Logs, then Log groups.

You can expect log groups with the following names:

  • /aws/containerinsights/<CLUSTER_NAME>/default
  • /aws/containerinsights/<CLUSTER_NAME>/amazon-cloudwatch

Each log group contains the logs from the pods deployed in the respective Kubernetes namespace.
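If you prefer the CLI, the snippet below builds the expected log group names (the cluster name is a placeholder) and shows, commented out, the describe-log-groups call that confirms they exist:

```shell
# Placeholder cluster name - replace with your own.
CLUSTER_NAME="my-eks-cluster"

# The two log groups created by the OUTPUT sections of the Fluent Bit config.
for ns in default amazon-cloudwatch; do
  echo "/aws/containerinsights/${CLUSTER_NAME}/${ns}"
done

# With AWS credentials configured:
#   aws logs describe-log-groups \
#       --log-group-name-prefix "/aws/containerinsights/${CLUSTER_NAME}/"
```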

Cleaning up

When you have finished the tutorial in this post, clean up the resources associated with it to avoid incurring charges for resources that you aren’t using:

  • Delete the Windows deployment
kubectl delete -f windows_manifest.yaml
  • Delete the Fluent Bit DaemonSet
kubectl delete -f fluent-bit-daemon-set.yaml
  • Delete the log groups created by Fluent Bit
aws logs delete-log-group --log-group-name /aws/containerinsights/<CLUSTER_NAME>/default
aws logs delete-log-group --log-group-name /aws/containerinsights/<CLUSTER_NAME>/amazon-cloudwatch
  • Delete the Windows container image created in step 5
  • Delete the Service Account and IAM policy created in step 3
  • Delete the amazon-cloudwatch namespace

Conclusion

In this post, we showed you how to deploy Fluent Bit as a DaemonSet on Windows worker nodes in Amazon EKS.

Using Fluent Bit as a log router is a beneficial way to centralize logging across Amazon EKS for both Linux and Windows workloads. Fluent Bit can send the logs to various destination solutions allowing flexibility for customers.