WIP - Java Wrapper for Elasticsearch's ES|QL Parser#6207

Draft
eric-forte-elastic wants to merge 11 commits into main from elasticsearch_esql_parser


@eric-forte-elastic eric-forte-elastic commented May 27, 2026

Pull Request

Issue link(s):

Summary - What I changed

ES|QL query verification used to require a live Elasticsearch cluster: we'd create test indices with the right mappings, rewrite the rule's query to use them, send it to _query, and tear the indices down. This gave us appropriate errors, along with some syntax validation and query parsing, but it also required a running stack.

This PR introduces a small JVM daemon that wraps Elasticsearch's ES|QL EsqlParser and Verifier, which Kibana also uses.
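For readers unfamiliar with the daemon pattern, the following is a minimal sketch of how a Python process can keep a long-lived child process open and exchange one JSON message per line over stdin/stdout. It is not the protocol this PR implements: the `LineJsonDaemon` class, the `id`/`query` request fields, and the `FAKE_DAEMON` stand-in (a Python child used so the sketch runs without Java) are all hypothetical names chosen for illustration.

```python
import json
import subprocess
import sys


class LineJsonDaemon:
    """Hypothetical client: one JSON request per line in, one JSON reply per line out."""

    def __init__(self, command: list[str]) -> None:
        # Start the child once and reuse it, so startup cost is paid a single time.
        self._proc = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered text pipes
        )

    def request(self, payload: dict) -> dict:
        assert self._proc.stdin is not None and self._proc.stdout is not None
        self._proc.stdin.write(json.dumps(payload) + "\n")
        self._proc.stdin.flush()
        line = self._proc.stdout.readline()
        if not line:
            raise RuntimeError("daemon exited before replying")
        return json.loads(line)

    def close(self) -> None:
        # Closing stdin signals end-of-input; the child is expected to exit.
        if self._proc.stdin is not None:
            self._proc.stdin.close()
        self._proc.wait(timeout=10)
        if self._proc.stdout is not None:
            self._proc.stdout.close()

    def __enter__(self) -> "LineJsonDaemon":
        return self

    def __exit__(self, *exc_info: object) -> None:
        self.close()


# Stand-in for the JVM process (assumption): a Python child that echoes each request.
FAKE_DAEMON = [
    sys.executable,
    "-u",
    "-c",
    "import json, sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    req = json.loads(line)\n"
    "    print(json.dumps({'id': req['id'], 'echo': req['query']}), flush=True)\n",
]

if __name__ == "__main__":
    with LineJsonDaemon(FAKE_DAEMON) as daemon:
        print(daemon.request({"id": 1, "query": "FROM logs | LIMIT 5"}))
```

Keeping the process alive across requests is the main reason to use a daemon here: under this design, JVM startup would be paid once rather than once per rule.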

Note

This assumes that adding a Java library to the repo is preferable to using a remote stack. That is a large assumption and may not be what we want to do. Adding another language could incur a significant maintenance burden, and we should consider this before merging.

Two other items also need to be resolved:

  1. Update the repo variable name in vars.esql_validation in pythonpackage.yml.
  2. CI needs an elastic/elasticsearch checkout for the validator build; it currently checks out main. We may want to check out the specific version for each branch and run the validation against each backported branch.

How To Test

This PR wires the JVM validator into the existing Python ES|QL validation pipeline. Running view-rule with the existing remote validation flags (for example, --esql-validation) now uses the JVM instead and produces the same results.

Example Run

detection-rules on  elasticsearch_esql_parser [!?] is  v1.6.42 via  v3.12.13 (detection-rules-build) on  eric.forte took 6s 
❯ python -m detection_rules view-rule rules/network/initial_access_newly_observed_fortigate_admin_logon.toml --esql-validation

█▀▀▄ ▄▄▄ ▄▄▄ ▄▄▄ ▄▄▄ ▄▄▄ ▄▄▄ ▄▄▄ ▄   ▄      █▀▀▄ ▄  ▄ ▄   ▄▄▄ ▄▄▄
█  █ █▄▄  █  █▄▄ █    █   █  █ █ █▀▄ █      █▄▄▀ █  █ █   █▄▄ █▄▄
█▄▄▀ █▄▄  █  █▄▄ █▄▄  █  ▄█▄ █▄█ █ ▀▄█      █ ▀▄ █▄▄█ █▄▄ █▄▄ ▄▄█

{
  "author": [
    "Elastic"
  ],
  "description": "This rule detects the first observed successful login of a user with the Administrator role to the FortiGate management interface within the last 5 days. First-time administrator logins can indicate newly provisioned accounts, misconfigurations, or unauthorized access using valid credentials and should be reviewed promptly.",
  "from": "now-7205m",
  "interval": "5m",
  "language": "esql",
  "license": "Elastic License v2",
  "name": "First-Time FortiGate Administrator Login",
  "note": "## Triage and Analysis\n\n### Investigating First-Time FortiGate Administrator Login\n\nThis alert indicates that a user with the **Administrator** role has successfully logged in to the FortiGate management interface for the first time within the last 5 days of observed data.\n\nBecause administrator access provides full control over network security devices, any newly observed admin login should be validated to confirm it is expected and authorized.\n\n### Investigation Steps\n\n- **Identify the account**\n  - Review `source.user.name` and confirm whether the account is known and officially provisioned.\n  - Determine whether this is a newly created administrator or an existing account logging in for the first time.\n\n- **Validate the source**\n  - Review `source.ip` and confirm whether it originates from a trusted management network, VPN, or jump host.\n  - Investigate geolocation or ASN if the source IP is external or unusual.\n\n- **Review login context**\n  - Examine associated FortiGate log messages for details such as login method, interface, or authentication source.\n  - Check for additional administrative actions following the login (policy changes, user creation, configuration exports).\n  - Review  `fortinet.firewall.profile` to identify the FortiGate Admin Profile the identity logged in under.\n\n- **Correlate with recent changes**\n  - Verify whether there were recent change requests, onboarding activities, or maintenance windows that explain the login.\n  - Look for other authentication attempts (failed or successful) from the same source or user.\n\n### False Positive Considerations\n\n- Newly onboarded administrators or service accounts.\n- First-time logins after log retention changes or data source onboarding.\n- Automation, backup, or monitoring tools introduced recently.\n- Lab, development, or test FortiGate devices.\n\n### Response and Remediation\n\n- **If authorized**\n  - Document the activity and consider adding an exception if the behavior is expected.\n  - Ensure the account follows least-privilege and MFA best practices.\n\n- **If suspicious or unauthorized**\n  - Disable or restrict the administrator account immediately.\n  - Rotate credentials and review authentication sources.\n  - Audit recent FortiGate configuration changes.\n  - Review surrounding network activity for lateral movement or persistence attempts.",
  "query": "FROM logs-fortinet_fortigate.*, filebeat-* metadata _id\n\n| WHERE data_stream.dataset == \"fortinet_fortigate.log\" and\n        event.category == \"authentication\" and event.action == \"login\" and\n        event.outcome == \"success\" and source.user.roles == \"Administrator\" and source.user.name is not null\n| stats Esql.logon_count = count(*),\n       Esql.first_time_seen = MIN(@timestamp),\n       Esql.source_ip_values = VALUES(source.ip),\n       Esql.message_values = VALUES(message) by source.user.name, fortinet.firewall.profile\n\n// first time seen is within 6m of the rule execution time and for the last 5d of events history\n| eval Esql.recent = DATE_DIFF(\"minute\", Esql.first_time_seen, now())\n| where Esql.recent <= 6 and Esql.logon_count == 1\n\n// move dynamic fields to ECS equivalent for rule exceptions\n| eval source.ip = MV_FIRST(Esql.source_ip_values)\n\n| keep source.ip,\n       source.user.name,\n       fortinet.firewall.profile,\n       Esql.logon_count,\n       Esql.first_time_seen,\n       Esql.source_ip_values,\n       Esql.message_values,\n       Esql.recent\n",
  "references": [
    "https://www.elastic.co/docs/reference/integrations/fortinet_fortigate",
    "https://www.cisa.gov/news-events/alerts/2026/01/28/fortinet-releases-guidance-address-ongoing-exploitation-authentication-bypass-vulnerability-cve-2026"
  ],
  "related_integrations": [
    {
      "package": "fortinet_fortigate",
      "version": "^1.31.0"
    }
  ],
  "required_fields": [
    {
      "ecs": false,
      "name": "fortinet.firewall.profile",
      "type": "keyword"
    },
    {
      "ecs": true,
      "name": "source.ip",
      "type": "ip"
    },
    {
      "ecs": true,
      "name": "source.user.name",
      "type": "keyword"
    }
  ],
  "risk_score": 73,
  "rule_id": "55a372b9-f5b6-4069-a089-8637c00609a2",
  "severity": "high",
  "tags": [
    "Use Case: Threat Detection",
    "Tactic: Initial Access",
    "Resources: Investigation Guide",
    "Domain: Network",
    "Domain: Identity",
    "Data Source: Fortinet",
    "Data Source: Fortinet FortiGate"
  ],
  "threat": [
    {
      "framework": "MITRE ATT&CK",
      "tactic": {
        "id": "TA0001",
        "name": "Initial Access",
        "reference": "https://attack.mitre.org/tactics/TA0001/"
      },
      "technique": [
        {
          "id": "T1078",
          "name": "Valid Accounts",
          "reference": "https://attack.mitre.org/techniques/T1078/"
        }
      ]
    }
  ],
  "timestamp_override": "event.ingested",
  "type": "esql",
  "version": 4
}

You can test all rules with the command below (takes ~4 min, versus 10 min using a remote stack):

python -m detection_rules dev test esql-validation --verbosity 1

Total rules: 159
Failed rules: 0
Failed rules written to failed_rules.log

You can also test the Java validation directly using the following script as an example.

Test Script

from detection_rules.esql_parser import EsqlValidator

with EsqlValidator() as v:
    # 1. Valid query — show the parsed output columns (same shape as the
    #    ES|QL HTTP API's response["columns"]).
    result = v.validate(
        "FROM logs | WHERE foo == 1 | LIMIT 5",
        indices={"logs": {"properties": {
            "foo": {"type": "integer"},
            "name": {"type": "keyword"},
        }}},
    )
    if result.ok:
        print("columns:")
        for col in result.columns:
            print(f"  {col}")
    else:
        for err in result.errors:
            print(f"{err.type} at {err.line}:{err.column}: {err.message}")

    # 2. Typo — parse error path.
    result = v.validate(
        "FROM logs | WHETRE foo == 1 | LIMIT 5",
        indices={"logs": {"properties": {"foo": {"type": "integer"}}}},
    )
    if not result.ok:
        for err in result.errors:
            print(f"{err.type} at {err.line}:{err.column}: {err.message}")

    # 3. Unsupported field type — query still validates, but the offending
    #    column surfaces `original_types` (and `suggested_cast` when one
    #    can be inferred), matching the live API for type conflicts.
    result = v.validate(
        "FROM logs",
        indices={"logs": {"properties": {
            "foo": {"type": "integer"},
            "blob": {"type": "binary"},
        }}},
    )
    if result.ok:
        print("columns (with unsupported field):")
        for col in result.columns:
            print(f"  {col}")
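The `indices` values in the script above are ordinary `{"properties": ...}` mapping dictionaries. As a rough sketch, a mapping of that shape could be derived from a rule's `required_fields` list (as shown in the view-rule output) by expanding dotted field names into nested objects. The helper name `mapping_from_required_fields` is hypothetical, and whether the validator accepts nested object mappings rather than only flat field names is an assumption that has not been checked against this PR.

```python
from typing import Any


def mapping_from_required_fields(required_fields: list[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical helper: expand dotted field names into a nested mapping dict."""
    root: dict[str, Any] = {"properties": {}}
    for field in required_fields:
        node = root
        *parents, leaf = field["name"].split(".")
        for part in parents:
            # Create intermediate object nodes on demand.
            child = node["properties"].setdefault(part, {})
            child.setdefault("properties", {})
            node = child
        node["properties"].setdefault(leaf, {})["type"] = field["type"]
    return root


# The required_fields entries from the example rule above.
required_fields = [
    {"ecs": False, "name": "fortinet.firewall.profile", "type": "keyword"},
    {"ecs": True, "name": "source.ip", "type": "ip"},
    {"ecs": True, "name": "source.user.name", "type": "keyword"},
]

mapping = mapping_from_required_fields(required_fields)
```

Note that `required_fields` lists only some of the fields a query touches (the example query also filters on `event.category`, for instance), so a mapping built this way would likely need to be merged with ECS and integration schemas before it is useful for validation.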

Checklist

  • Added a label for the type of pr: bug, enhancement, schema, maintenance, Rule: New, Rule: Deprecation, Rule: Tuning, Hunt: New, or Hunt: Tuning so guidelines can be generated
  • Added the meta:rapid-merge label if planning to merge within 24 hours
  • Secret and sensitive material has been managed correctly
  • Automated testing was updated or added to match the most common scenarios
  • Documentation and comments were added for features that require explanation

Contributor checklist

@eric-forte-elastic eric-forte-elastic added the enhancement (New feature or request), python (Internal python for the repository), and minor labels May 27, 2026
@github-actions
Contributor

Enhancement - Guidelines

These guidelines serve as a reminder of the considerations to address when adding a feature to the code.

Documentation and Context

  • Describe the feature enhancement in detail (alternative solutions, description of the solution, etc.) if not already documented in an issue.
  • Include additional context or screenshots.
  • Ensure the enhancement includes necessary updates to the documentation and versioning.

Code Standards and Practices

  • Code follows established design patterns within the repo and avoids duplication.
  • Ensure that the code is modular and reusable where applicable.

Testing

  • New unit tests have been added to cover the enhancement.
  • Existing unit tests have been updated to reflect the changes.
  • Provide evidence of testing and validating the enhancement (e.g., test logs, screenshots).
  • Validate that any rules affected by the enhancement are correctly updated.
  • Ensure that performance is not negatively impacted by the changes.
  • Verify that any release artifacts are properly generated and tested.
  • Conducted system testing, including fleet, import, and create APIs (e.g., run make test-cli, make test-remote-cli, make test-hunting-cli)

Additional Checks

  • Verify that the enhancement works across all relevant environments (e.g., different OS versions).
  • Confirm that the proper version label is applied to the PR: patch, minor, or major.

@eric-forte-elastic eric-forte-elastic self-assigned this May 29, 2026

Labels

enhancement (New feature or request), minor, python (Internal python for the repository)


1 participant