Increased Disk Load After Fluent Bit Update to 3.1.2 #9819

Open
Aliev-L opened this issue Jan 10, 2025 · 2 comments

Aliev-L commented Jan 10, 2025

Bug Report: Increased Disk Load and Space Usage After Fluent Bit Update to 3.1.2
Description
After updating Fluent Bit to version 3.1.2, we noticed a significant increase in disk load and space usage, leading to performance degradation on the system.

Steps to Reproduce
1. Update Fluent Bit to version 3.1.2.
2. Observe the increased disk load and disk space usage.
3. Monitor the system's performance, which is negatively impacted by the increased disk usage.
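
The report does not state how the disk load was measured. One way to put a number on it for the comparison between versions, assuming a Linux node where the Fluent Bit process ID is known and `/proc/<pid>/io` is readable (same user or root), is to sample the kernel's per-process I/O counters twice and compute the write rate. This is a minimal sketch, not part of Fluent Bit; all function names are hypothetical.

```python
import time
from pathlib import Path


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value.strip())
    return counters


def read_proc_io(pid):
    """Read the current I/O counters of a process (Linux only)."""
    return parse_proc_io(Path(f"/proc/{pid}/io").read_text())


def write_rate(before, after, seconds):
    """Bytes per second sent to the storage layer between two samples."""
    return (after["write_bytes"] - before["write_bytes"]) / seconds


def measure_write_rate(pid, interval=10.0):
    """Sample the counters twice, `interval` seconds apart."""
    before = read_proc_io(pid)
    time.sleep(interval)
    after = read_proc_io(pid)
    return write_rate(before, after, interval)
```

Running `measure_write_rate(pid)` against the Fluent Bit process on 3.1.1 and again on 3.1.2, under a similar log volume, would give two figures that can be compared directly.
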
Expected Behavior
The update was expected to have no significant impact on disk load or disk space usage, nor cause degradation in system performance.
Configuration
The configuration for Fluent Bit is as follows:

config:
    service: |
      [SERVICE]
        Flush {{ .Values.flush }}
        Log_Level {{ .Values.logLevel }}
        Daemon        off
        Parsers_File /fluent-bit/etc/conf/custom_parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port {{ .Values.metricsPort }}
    inputs: |
      [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        multiline.parser  cri
        Mem_Buf_Limit     50MB
        Skip_Long_Lines   On
        Skip_Empty_Lines  On
        Refresh_Interval  10

      [INPUT]
        Name              tail
        Tag               calico.*
        Path              /var/log/calico/cni/cni.log
        Mem_Buf_Limit     20MB
        Skip_Long_Lines   On
        Skip_Empty_Lines  On
        Refresh_Interval  10

      [INPUT]
        Name systemd
        Tag host.*
        Systemd_Filter _SYSTEMD_UNIT=kubelet.service
        Read_From_Tail On

    filters: |
      [FILTER]
        Name parser
        Match calico.*
        Key_Name log
        Parser calico_log_parser
        Reserve_Data On
        Preserve_Key On

      [FILTER]
        Name modify
        Match calico.*
        Set host ${NODE_NAME}
        Set tag cni_log

      [FILTER]
        Name kubernetes
        Match kube.*
        Kube_URL https://kubernetes.default.svc:443
        Merge_Log On
        K8S-Logging.Parser On
        K8S-Logging.Exclude On
        Labels Off
        Annotations Off
        Buffer_Size 512k

      [FILTER]
        Name nest
        Match *
        Operation lift
        Nested_under kubernetes

      [FILTER]
        Name modify
        Match *
        Condition Key_value_matches level_name EMERGENCY
        Set level 0

      [FILTER]
        Name modify
        Match *
        Condition Key_value_matches level_name ALERT
        Set level 1

      [FILTER]
        Name modify
        Match *
        Condition Key_value_matches level_name CRITICAL
        Set level 2

      [FILTER]
        Name modify
        Match *
        Condition Key_value_matches level_name ERROR
        Set level 3

      [FILTER]
        Name modify
        Match *
        Condition Key_value_matches level_name WARNING
        Set level 4

      [FILTER]
        Name modify
        Match *
        Condition Key_value_matches level_name NOTICE
        Set level 5

      [FILTER]
        Name modify
        Match *
        Condition Key_value_matches level_name INFO
        Set level 6

      [FILTER]
        Name modify
        Match *
        Condition Key_value_matches level_name DEBUG
        Set level 7

      [FILTER]
        Name modify
        Match *
        Hard_rename message log

      [FILTER]
        Name modify
        Match *
        Condition Key_value_does_not_match upstream_response_time ^(\d+\.\d+)$
        Remove upstream_response_time

      [FILTER]
        Name modify
        Match *
        Condition Key_value_does_not_match upstream_status ^(\d+)$
        Remove upstream_status

      [FILTER]
        Name modify
        Match *
        Condition Key_value_does_not_match level ^([1-7])$
        Remove level

      [FILTER]
        Name modify
        Match host.*
        Rename MESSAGE log
        Set tag kubelet_log
        Set host ${NODE_NAME}
    outputs: |
      [OUTPUT]
        Name          gelf
        Match         *
        Host          {{ .Values.host }}
        Port          12201
        Mode          udp
        Gelf_Short_Message_Key log
        Workers 1
    customParsers: |
      [PARSER]
        Name   json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z
      [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        # Command      |  Decoder | Field | Optional Action
        # =============|==================|=================
        Decode_Field_As   escaped    log
      [PARSER]
        Name        syslog
        Format      regex
        Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key    time
        Time_Format %b %d %H:%M:%S
      [PARSER]
        Name        k8s-nginx-ingress
        Format      regex
        Regex       ^(?<remote_addr>[^ ]*) - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] (\[(?<proxy_alternative_upstream_name>[^ ]*)\] )?(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<reg_id>[^ ]*).*$
        Time_Key    time
        Time_Format %d/%b/%Y:%H:%M:%S %z
      [PARSER]
        Name        cri
        Format      regex
        Regex       ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>.*)$
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z
      [PARSER]
        Name   nginx
        Format regex
        Regex ^(?<remote_addr>[^ ]*) - (?<user>[^ ]*) (?<request_time>[^ ]*) (?<upstream_response_time>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)".*$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z
      [PARSER]
        Name   kong
        Format regex
        Regex ^(?<remote_addr>[^ ]*) - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?))(?<query>\?[^\"]*)?(?: +(?<http_version>\S+))" (?<code>[^ ]*) (?<size>[^ ]*) (?<request_time>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) "(?<gzip_ratio>[^ ]*)"$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z
      [PARSER]
        Name        calico_log_parser
        Format      regex
        Regex       ^(?<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[(?<level_name>[^\]]+)\]\[(?<log_id>\d+)\] (?<file_name>\S+) (?<line_number>\d+): (?<message>.+)
        Time_Key    time
        Time_Format %Y-%m-%d %H:%M:%S.%L
        Types       timestamp:timestamp, level_name:string, log_id:integer, file_name:string, line_number:integer, message:string
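
For readers following the filter chain, the level-related `modify` filters above can be modelled in a few lines. This is a rough sketch that assumes string-valued fields and unanchored regex search; it is not Fluent Bit's implementation, and `apply_level_filters` is a hypothetical name.

```python
import re

# Severity names and the level each "modify" filter in the config sets for them.
SEVERITY_LEVELS = {
    "EMERGENCY": "0", "ALERT": "1", "CRITICAL": "2", "ERROR": "3",
    "WARNING": "4", "NOTICE": "5", "INFO": "6", "DEBUG": "7",
}
# Pattern taken from the last level-related filter in the config.
KEEP_LEVEL = re.compile(r"^([1-7])$")


def apply_level_filters(record):
    """Model the level-related filters for records whose values are strings."""
    out = dict(record)
    level_name = out.get("level_name")
    if isinstance(level_name, str):
        for name, level in SEVERITY_LEVELS.items():
            # Condition Key_value_matches level_name <name>  ->  Set level <n>
            if re.search(name, level_name):
                out["level"] = level
    level = out.get("level")
    # Condition Key_value_does_not_match level ^([1-7])$  ->  Remove level
    if isinstance(level, str) and not KEEP_LEVEL.search(level):
        del out["level"]
    return out
```

Under this model a record with `level_name` set to `EMERGENCY` is assigned level `0` and then loses it again, because `0` falls outside `[1-7]`. The report does not say whether that is intended.
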

Screenshot
(image: Screenshot 2025-01-10 at 12 29 11)

Environment Information
Kubernetes Version: 1.28.10
Server OS: Debian GNU/Linux 11
Fluent Bit Version: 3.1.2 (Issue starts from this version)
Filters and Plugins: Various filters applied (e.g., parser, modify, kubernetes, nest) and custom parsers for different log formats.
Additional Context
The increased disk load is significantly affecting system performance, making it difficult to maintain smooth operation. This issue needs to be resolved promptly to restore normal system functionality.

edsiper (Member) commented Jan 14, 2025

Which version were you using before?

Aliev-L (Author) commented Jan 14, 2025

@edsiper 3.0.4.
In the end I updated to 3.1.1.
The disk load appears starting from version 3.1.2.
