Followup on Issue 89 #2199

Closed
jcampaner opened this issue May 24, 2020 · 2 comments

jcampaner commented May 24, 2020

Bug Report

Describe the bug
Using stdin as input and es as output, I am seeing that processing stops and the engine shuts down when a JSON line longer than 16K is encountered. Truncating the stdin stream to 16,000 characters per line (e.g. piping through "cut -c1-16000") avoids the problem. This concern was already raised in #89. I tried setting the buffer_size property to various values, but it did not seem to have any effect. I would like lines longer than 16K to be included in the output.
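
For clarity, this is a minimal config-file sketch of what I am trying to achieve on the command line. It assumes in_stdin accepts a Buffer_Size property; that assumption is exactly the knob in question, and on v1.3.11 setting it did not appear to change anything for me. The es properties match the command further below.

    [INPUT]
        Name         stdin
        # Assumption: an in_stdin Buffer_Size option large enough for long lines
        Buffer_Size  32k

    [OUTPUT]
        Name         es
        Host         127.0.0.1
        Port         9200
        Index        yourindex

Run with: cat your.ndjson | ${FLUENT_BIT_ROOT}/bin/td-agent-bit -c fluent-bit.conf (the -c flag points td-agent-bit at a configuration file; the name fluent-bit.conf is arbitrary).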

To Reproduce

  • Create an NDJSON file with one line longer than 16K.

  • Example
    {"message":"<some string longer than 16k>","stream":"stdout","@timestamp":"2018-06-11T14:37:30.681701731Z"}

  • Steps to reproduce the problem (a scripted version follows this list):
    execute cat your.ndjson | ${FLUENT_BIT_ROOT}/bin/td-agent-bit -i stdin -o es -p Host=127.0.0.1 -p Port=9200 -p Index=yourindex -p Buffer_Size=32k
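
A scripted version of the steps above; the 20,000-character message is an arbitrary size, and anything over 16K should trigger the problem:

    # Build a one-line NDJSON file whose "message" field is ~20K characters.
    long_msg=$(head -c 20000 /dev/zero | tr '\0' 'x')
    printf '{"message":"%s","stream":"stdout","@timestamp":"2018-06-11T14:37:30.681701731Z"}\n' \
        "$long_msg" > your.ndjson

    # Feed it to Fluent Bit exactly as in the step above.
    cat your.ndjson | ${FLUENT_BIT_ROOT}/bin/td-agent-bit -i stdin -o es \
        -p Host=127.0.0.1 -p Port=9200 -p Index=yourindex -p Buffer_Size=32k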

Expected behavior
Presuming that I am interpreting the resolution of issue #89 correctly: we should see one record in the target ES index for every line of the input NDJSON that does not exceed the buffer size, and we should be able to configure the buffer size on stdin so that lines longer than 16K can be processed as input.
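
A quick way to check that expectation, assuming the Elasticsearch instance from the command above is reachable, is to compare the number of input lines with the document count in the index:

    # Number of lines fed in versus number of documents indexed.
    wc -l < your.ndjson
    curl -s 'http://127.0.0.1:9200/yourindex/_count?pretty'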

Screenshots
Copyright (C) Treasure Data

[2020/05/24 12:39:31] [ info] [storage] version=1.0.3, initializing...
[2020/05/24 12:39:31] [ info] [storage] in-memory
[2020/05/24 12:39:31] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/05/24 12:39:31] [ info] [engine] started (pid=28932)
[2020/05/24 12:39:31] [ info] [sp] stream processor started
[2020/05/24 12:39:42] [ warn] [in_stdin] end of file (stdin closed by remote end)
[2020/05/24 12:39:42] [ info] [input] pausing stdin.0
[2020/05/24 12:39:42] [ warn] [engine] service will stop in 5 seconds
[2020/05/24 12:39:47] [ info] [engine] service stopped
[2020/05/24 12:39:47] [ info] [input] pausing stdin.0

Your Environment

  • Version used: Fluent Bit v1.3.11
  • Configuration: -i stdin -o es -p Host=127.0.0.1 -p Port=9200 -p Index=yourindex -p Buffer_Size=32k
  • Environment name and version (e.g. Kubernetes? What version?): Running from bash 4.2.46(2)-release
  • Server type and version: Oracle VM VirtualBox (Windows host)
  • Operating System and version: CentOS 7
  • Filters and plugins: None

Additional context
I can work around the problem by truncating the problematic lines, but in reading the prior issue I am hoping that this measure is (or will be) unnecessary and that all lines in the input file can be loaded so that the log data is more complete.
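
For completeness, the workaround currently looks like the pipeline below. Note that cut truncates long lines rather than dropping them, so anything past 16,000 characters is lost and those records are no longer well-formed JSON:

    # Workaround sketch: cap every line at 16,000 characters before Fluent Bit reads it.
    cat your.ndjson | cut -c1-16000 | ${FLUENT_BIT_ROOT}/bin/td-agent-bit -i stdin -o es \
        -p Host=127.0.0.1 -p Port=9200 -p Index=yourindex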

@github-actions (Contributor) commented

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label May 16, 2021

@github-actions (Contributor) commented

This issue was closed because it has been stalled for 5 days with no activity.
