Simpler field filter#1

Open
igitur wants to merge 3 commits into TimoSavi:master from igitur:simpler-field-filter

Conversation

@igitur
Contributor

@igitur igitur commented Dec 16, 2025

Simpler Filter Syntax and Documentation Updates

Summary

  • Added syntactic sugar for filter field definitions: field name filter filtername instead of field name * * * filtername
  • Updated documentation to Texinfo 6.8 format
  • Added documentation examples for the new filter syntax

Changes

1. Simpler Filter Syntax (e50643c)

  • Modified src/parserc.c to support field name filter filtername syntax
  • This is equivalent to the existing field name * * * filtername syntax
  • Makes configuration files more readable and intuitive

2. Documentation Updates (a14bce7, 7b2c2f2)

  • Updated HTML documentation generated by Texinfo 6.5 → 6.8
  • Added documentation for the new filter syntax in all formats (HTML, Info, Texinfo)
  • Included practical examples showing usage:
    filter uppercase "tr '[:lower:]' '[:upper:]'"
    filter datefmt "date -f - '+%Y-%m-%d'"
    
    structure data {
        type separated ,
        record entry {
            field id
            field name filter uppercase
            field date filter datefmt
        }
    }

Benefits

  • Improved readability: Clearer syntax for filter definitions
  • Backward compatible: Old syntax still works
  • Better documentation: Updated examples and modern formatting

@igitur
Contributor Author

igitur commented Dec 16, 2025

Hi @TimoSavi. After many years, I'm still a loyal user of your tool. I'm unsure whether this will break backwards compatibility, but I'm keen to hear your thoughts. Even if you don't like the syntactic sugar change, I'd love to update the documentation with better examples, which would help an MCP server like Context7.

@TimoSavi
Owner

TimoSavi commented Jan 5, 2026

Hi @igitur, sorry for the late answer, and thanks for your contribution (very rare these days for ffe). About the filter: it could be done, but I am a little bit confused about this change because it breaks the overall notation and applies to only one config item. Making the documentation better is always welcome.

@igitur
Contributor Author

igitur commented Jan 15, 2026

Hi, sorry, I was away on holiday myself.

Yes, I realised the same thing shortly after submitting this. I think my issue is that the repeated asterisks in some field definitions feel a bit clumsy and verbose.

Currently, to implement a filter (in my use case), I have to define it, e.g.:
field effective_date * * * to_iso8601

I'm using variable-length fields, hence the first asterisk.
I'm using no lookup table, hence the 2nd asterisk.
I'm not using any output definition, hence the 3rd asterisk.

I'll postulate that most character-delimited files these days have variable-length fields (fixed-width fields are increasingly rare, but of course do exist). And I think it's rare that ffe users would want to use more than one of the lookup/output/filter features simultaneously for a single field.

Therefore I propose the new alternative syntax (original syntax would still be supported), where fields can be defined as:
field name feature featurename. This would imply variable length fields always (where applicable).

feature could take 1 of 3 values: lookup, output or filter, and featurename would be the regular name that would have been present in the old syntax.

So examples:
field effective_date lookup my_lookup, analogous to field effective_date * my_lookup
field effective_date output my_output_def, analogous to field effective_date * * my_output_def
field effective_date filter datefmt, analogous to field effective_date * * * datefmt

Currently the 3rd parameter (length|*) is either an integer or an asterisk, so a keyword in that position cannot be mistaken for a length, and introducing the new syntax should not be a breaking change, right? To confirm, the old syntax will still be supported.
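
To make the mapping above concrete, here is a minimal sketch in Python (not ffe's C, and not the code in this PR). The helper name `desugar_field` and the table `FEATURE_POSITION` are hypothetical, and the sketch assumes tokens are separated by plain whitespace with no quoted strings.

```python
# Hypothetical helper, not part of ffe: rewrites the proposed keyword form
# "field <name> <feature> <featurename>" into the existing positional form.

# Slot index of each feature after the length column (length lookup output filter).
FEATURE_POSITION = {"lookup": 0, "output": 1, "filter": 2}


def desugar_field(line: str) -> str:
    tokens = line.split()  # assumption: whitespace-separated, no quoting
    # The existing length column is an integer or "*", so a feature keyword
    # in the third position identifies the new form unambiguously.
    if len(tokens) == 4 and tokens[0] == "field" and tokens[2] in FEATURE_POSITION:
        slots = ["*"] * (FEATURE_POSITION[tokens[2]] + 1)
        slots[-1] = tokens[3]
        # The new form always implies a variable-length field, hence the "*".
        return " ".join(["field", tokens[1], "*"] + slots)
    return line  # existing notation passes through unchanged


print(desugar_field("field effective_date filter datefmt"))
```

Lines already written in the positional notation are returned untouched, which is the backward-compatibility property the proposal relies on.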

I don't feel too strongly about this change, so if you're uncomfortable with it, it's OK. It's more for ease of use, because I'm still a heavy ffe user.

If this proposal is accepted, I also want to propose support for inline definitions for output and filter, e.g.

field field_name output { justify left }
or
field field_name filter { date -f - '+%Y-%m-%d' }

So I hope that clarifies the context in which this syntax change is proposed.

@TimoSavi
Owner

TimoSavi commented Jan 19, 2026

Hi,
I would make this a little bit more versatile. We could have two notations for single-line config items:

  • current * notation for missing options
  • "fat arrow" notation for needed options only

Like if we have:

field effective_date * * * to_iso8601
field effective_date * mylookup * to_iso8601

this could be done alternatively (name is optional too):

field name=>effective_date filter=>to_iso8601
field name=>effective_date lookup=>mylookup filter=>to_iso8601

and for other keywords too.

This should be done in parse_option(), so that you have a list of allowed option names, their positions, and the maximum option count for each keyword in the current notation. Options and their count would be returned to the caller as if they had been given the traditional way, so the rest of the code would stay intact.

Or something similar. Some logic would be needed to identify the notation used (e.g. every given option contains "=>"; someone could use the current notation with "=>" in the actual data for each option, but that is probably rare).
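
A rough sketch of that normalization step, written in Python for brevity rather than as a patch to parse_option(). The names `OPTION_ORDER` and `normalize_options` are hypothetical, only the `field` keyword is filled in (using the option order discussed in this thread), and the "name is optional" case is not handled.

```python
# Hypothetical table: positional option order per keyword. Only "field" is
# listed here, following the order name, length, lookup, output, filter.
OPTION_ORDER = {"field": ["name", "length", "lookup", "output", "filter"]}


def normalize_options(keyword, options):
    """Return the options as if they had been written in the current notation."""
    order = OPTION_ORDER[keyword]
    # Treat the line as fat-arrow notation only if every option contains "=>".
    if not options or not all("=>" in opt for opt in options):
        return list(options)
    positional = ["*"] * len(order)
    for opt in options:
        key, value = opt.split("=>", 1)
        if key not in order:
            raise ValueError(f"unknown option {key!r} for keyword {keyword!r}")
        positional[order.index(key)] = value
    # Drop trailing "*" so the result matches a hand-written positional line.
    while positional and positional[-1] == "*":
        positional.pop()
    return positional


print(normalize_options("field", ["name=>effective_date", "filter=>to_iso8601"]))
```

Because the function returns the same list a traditional line would have produced, the caller does not need to know which notation was used.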
