Support multiline log entries (stack traces) #1

@zhiqiang-zz

Problem

The current implementation splits logs on newlines (\n), which breaks multiline log entries such as:

  • Rust/Java stack traces
  • JSON blobs spanning multiple lines
  • Error messages with context

Each line is treated as a separate log entry, so patterns fail to match and entries are categorized incorrectly.

Proposed Solution

Use a "first-line pattern" approach (same as Filebeat/Logstash/Fluentd):

  1. Detect entry start pattern - LLM analyzes sample logs and identifies a regex that matches the START of a new log entry (e.g., timestamp + log level)
  2. Merge continuation lines - Lines not matching the pattern are appended to the previous entry
  3. Process merged entries - Patterns and pipeline work on complete log entries

Implementation

New function: detectFirstLinePattern()

  • LLM call to detect the first-line regex
  • Returns a pattern like ^\d{4}-\d{2}-\d{2}.*?(DEBUG|INFO|WARN|ERROR)
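
Because the pattern comes from an LLM, it may be worth validating before it is used. A minimal sketch, assuming a hypothetical helper compileFirstLinePattern() (not part of the current code) that rejects invalid regexes and patterns that match none of the sample lines; the sample log lines are made up for illustration:

```typescript
// Hypothetical helper: validate a candidate first-line regex (e.g. returned by the LLM)
// against sample lines before it is used to merge entries.
function compileFirstLinePattern(candidate: string, sampleLines: string[]): RegExp | null {
  let pattern: RegExp;
  try {
    pattern = new RegExp(candidate);
  } catch {
    return null; // not a valid regex
  }
  // A usable first-line pattern must match at least one sample line.
  const matches = sampleLines.filter((line) => pattern.test(line)).length;
  return matches > 0 ? pattern : null;
}

// Illustrative sample input (not taken from real logs).
const sample = [
  "2024-01-15 10:00:00 ERROR request failed",
  "    at handler (server.ts:42)",
  "2024-01-15 10:00:01 INFO retrying",
];
const ok = compileFirstLinePattern("^\\d{4}-\\d{2}-\\d{2}.*?(DEBUG|INFO|WARN|ERROR)", sample);
const bad = compileFirstLinePattern("(unclosed", sample);
console.log(ok !== null, bad === null); // true true
```

If validation fails, one option would be to retry detection or fall back to plain newline splitting; which one fits is an open design question.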

New function: mergeMultilineEntries()

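function mergeMultilineEntries(lines: string[], firstLinePattern: RegExp): string[] {
  const entries: string[] = [];
  // null means no entry has been started yet (an empty string is a valid entry)
  let currentEntry: string | null = null;
  for (const line of lines) {
    // Reset lastIndex so a pattern with the g or y flag tests each line from the start
    firstLinePattern.lastIndex = 0;
    if (firstLinePattern.test(line)) {
      // A new entry begins: flush the previous one
      if (currentEntry !== null) entries.push(currentEntry);
      currentEntry = line;
    } else if (currentEntry === null) {
      // Lines before the first match start an entry of their own (no leading "\n")
      currentEntry = line;
    } else {
      currentEntry += "\n" + line;
    }
  }
  if (currentEntry !== null) entries.push(currentEntry);
  return entries;
}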
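
To illustrate the intended behavior, here is a small usage sketch on a Java-style stack trace. The log lines are invented for the example, and the function is a condensed stand-in for the merging logic above so that the block runs on its own:

```typescript
// Condensed stand-in for the merging logic so the example runs standalone.
function mergeMultilineEntries(lines: string[], firstLinePattern: RegExp): string[] {
  const entries: string[] = [];
  for (const line of lines) {
    if (entries.length === 0 || firstLinePattern.test(line)) {
      entries.push(line); // start a new entry
    } else {
      entries[entries.length - 1] += "\n" + line; // continuation line
    }
  }
  return entries;
}

// Illustrative input: one error with a stack trace, followed by a normal line.
const raw = [
  "2024-01-15 10:00:00 ERROR Unhandled exception",
  "java.lang.NullPointerException: user is null",
  "    at com.example.UserService.load(UserService.java:42)",
  "2024-01-15 10:00:01 INFO Request completed",
].join("\n");

const firstLine = /^\d{4}-\d{2}-\d{2}.*?(DEBUG|INFO|WARN|ERROR)/;
const merged = mergeMultilineEntries(raw.split("\n"), firstLine);

console.log(merged.length);               // 2
console.log(merged[0].split("\n").length); // 3
```

Splitting on \n alone would yield four entries here; with the first-line pattern the stack trace stays attached to its ERROR line.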

CLI flag

  • --first-line-pattern <regex> - skip LLM detection, use provided pattern
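
One way the flag could be read, as a sketch only: readFirstLinePatternFlag() is a hypothetical helper, and the project's actual argument parsing may look different. It returns undefined when the flag is absent so the caller can fall back to LLM detection.

```typescript
// Hypothetical helper: read --first-line-pattern <regex> from an argv-style array.
// Returns undefined when the flag is absent, so the caller can fall back to LLM detection.
function readFirstLinePatternFlag(argv: string[]): RegExp | undefined {
  const i = argv.indexOf("--first-line-pattern");
  if (i === -1) return undefined;
  const value = argv[i + 1];
  if (value === undefined) {
    throw new Error("--first-line-pattern requires a <regex> argument");
  }
  return new RegExp(value); // throws SyntaxError on an invalid regex
}

// "app.log" is an illustrative positional argument, not a documented one.
const fromFlag = readFirstLinePatternFlag(["--first-line-pattern", "^\\d{4}-\\d{2}-\\d{2}"]);
const absent = readFirstLinePatternFlag(["app.log"]);
console.log(fromFlag?.test("2024-01-15 INFO started"), absent); // true undefined
```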
