Merged
Changes from 1 commit
2 changes: 1 addition & 1 deletion src/main/java/org/z3950/zing/cql/CQLGenerator.java
Original file line number Diff line number Diff line change
@@ -204,7 +204,7 @@ private String generate_base_relation() throws MissingParameterException {
private String generate_term() {
switch (rnd.nextInt(10)) {
case 0: return "cat";
- case 1: return "\"cat\"";
+ case 1: return "\\\"cat\\\"";
Contributor

We discussed this briefly on Slack, but I still don't understand the reason for the change. Yes, this needs to round-trip correctly, but so do all strings in any CQL query, and hence all strings that we generate in the query generator. So why do we care what this one is?

Contributor Author

This PR changes term handling so that escape sequences are preserved in the term. As a result, any " in a term is always preceded by a backslash.

Contributor

Are you saying that only some terms are round-tripped correctly?

Contributor Author

@adamdickmeiss, Oct 16, 2025

Yep. A term containing a bare " would not round-trip correctly, but parsing never produces such a term.

case 2: return "comp.os.linux";
case 3: return "xml:element";
case 4: return "<xml.element>";
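
The invariant discussed in this thread, that a term coming out of the parser never holds a double quote without a backslash in front of it, can be stated as a small check. This is only a sketch: `BareQuoteCheck` and `hasBareQuote` are hypothetical names, not part of CQLGenerator or the library, and it assumes a backslash always escapes exactly the next character.

```java
// Hypothetical helper for illustration; not part of the cql-java sources.
class BareQuoteCheck {
    /** Returns true if term contains a double quote that no backslash escapes. */
    static boolean hasBareQuote(String term) {
        boolean escaped = false;
        for (int i = 0; i < term.length(); i++) {
            char ch = term.charAt(i);
            if (escaped) {
                escaped = false; // this char is consumed by the preceding backslash
            } else if (ch == '\\') {
                escaped = true;  // assumption: backslash escapes the next char only
            } else if (ch == '"') {
                return true;     // quote with no escaping backslash
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasBareQuote("\"cat\""));     // old generator term
        System.out.println(hasBareQuote("\\\"cat\\\"")); // new generator term
    }
}
```

Under that assumption, the old generator term `"cat"` fails the check and the new term `\"cat\"` passes it, which is the distinction the author is drawing.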
16 changes: 6 additions & 10 deletions src/main/java/org/z3950/zing/cql/CQLLexer.java
@@ -69,19 +69,15 @@ else if (comp.equals("<>")) {
//remember quote char
char mark = c;
qi++;
- boolean escaped = false;
buf.setLength(0); //reset buffer
- while (qi < ql) {
- if (!escaped && qs.charAt(qi) == mark) //terminator
- break;
- if (escaped && strchr("*?^\\", qs.charAt(qi))) //no escaping for d-quote
- buf.append("\\");
- if (!escaped && qs.charAt(qi) == '\\') { //escape-char
- escaped = true;
+ while (qi < ql && qs.charAt(qi) != mark) {
+ if (qs.charAt(qi) == '\\') { //escape-char
+ if (qi == ql - 1) {
+ break; //unterminated
+ }
+ buf.append(qs.charAt(qi));
Contributor

I don't understand the intent of this section at all. It looks like it can't lex "a\"b" at all, but will return the string a. Am I wrong?

Contributor Author

bin/CQLParser 
"a\"b"
<searchClause>
  <index>cql.serverChoice</index>
  <relation>
    <value>=</value>
  </relation>
  <term>a\"b</term>
</searchClause>

Contributor

OK, good to see. My code-reading foo is depressed.

qi++;
- continue;
}
- escaped = false; //reset escape
buf.append(qs.charAt(qi));
qi++;
}
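
Because the old and new lines of this hunk are interleaved with the review thread, it may help to see the loop as it reads after the change, lifted into a stand-alone method. Sketch only: `QuotedScanSketch` and `scanQuoted` are hypothetical names, `StringBuilder` stands in for whatever type `buf` has in CQLLexer, and the rest of the lexer is omitted.

```java
// Hypothetical stand-alone version of the quoted-string loop; not CQLLexer itself.
class QuotedScanSketch {
    /**
     * Scans a quoted string whose opening quote is at index start and
     * returns its body with escape sequences kept verbatim.
     */
    static String scanQuoted(String qs, int start) {
        int ql = qs.length();
        int qi = start;
        char mark = qs.charAt(qi); // remember quote char
        qi++;
        StringBuilder buf = new StringBuilder();
        while (qi < ql && qs.charAt(qi) != mark) {
            if (qs.charAt(qi) == '\\') { // escape char
                if (qi == ql - 1) {
                    break; // unterminated: lone backslash at end of input
                }
                buf.append(qs.charAt(qi)); // keep the backslash itself
                qi++;
            }
            buf.append(qs.charAt(qi)); // ordinary char, or the escaped one
            qi++;
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        // Java literal for the six characters "a\"b" as typed at the CQLParser prompt.
        System.out.println(scanQuoted("\"a\\\"b\"", 0)); // prints a\"b
    }
}
```

The backslash branch appends the backslash and then falls through to append the character after it, so an escaped quote can never be mistaken for the terminator. That is why `"a\"b"` yields `a\"b` rather than `a`.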
2 changes: 1 addition & 1 deletion src/main/java/org/z3950/zing/cql/CQLTermNode.java
@@ -229,7 +229,7 @@ static String maybeQuote(String str) {
str.indexOf('/') != -1 ||
str.indexOf('(') != -1 ||
str.indexOf(')') != -1) {
- str = '"' + str.replace("\"", "\\\"") + '"';
+ str = '"' + str + '"';
Contributor

I'm probably missing something, but this looks wrong to me. It looks like it will render a"b as "a"b", when surely it should be "a\"b" as in the previous version?

Contributor Author

I repeat: " will always be preceded by a backslash.

Contributor

You're asserting a precondition that where str contains a ", it is always immediately prefixed with a \ — right? If so, then this is OK, I guess, but feels fragile.

Contributor Author

The bare quote case is now considered.

Contributor

Thank you.

}

return str;
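
One way to make the quoting step robust against the fragility raised in this thread is to add a backslash only in front of quotes that are not already escaped. The thread does not show how the bare-quote case was finally handled, so the following is a hypothetical sketch (`QuoteSketch` and `quote` are illustrative names), not the code that was merged. It assumes a backslash escapes exactly the next character and does not try to repair a trailing lone backslash.

```java
// Hypothetical sketch; not the maybeQuote implementation from CQLTermNode.
class QuoteSketch {
    /** Wraps str in double quotes, escaping only quotes that are still bare. */
    static String quote(String str) {
        StringBuilder out = new StringBuilder("\"");
        boolean escaped = false;
        for (int i = 0; i < str.length(); i++) {
            char ch = str.charAt(i);
            if (!escaped && ch == '"') {
                out.append('\\'); // bare quote: add the missing backslash
            }
            out.append(ch);
            // A backslash escapes the next char unless it was itself escaped.
            escaped = !escaped && ch == '\\';
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        System.out.println(quote("a\\\"b")); // term already escaped
        System.out.println(quote("a\"b"));   // term with a bare quote
    }
}
```

With this approach both `a\"b` and `a"b` render as `"a\"b"`, so the output is the same whether or not the caller honoured the precondition.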
2 changes: 1 addition & 1 deletion src/test/resources/regression/06/03.xcql
@@ -3,5 +3,5 @@
<relation>
<value>=</value>
</relation>
- <term>^cat says "fish"</term>
+ <term>^cat says \"fish\"</term>
</searchClause>
2 changes: 1 addition & 1 deletion src/test/resources/regression/06/06.xcql
@@ -3,5 +3,5 @@
<relation>
<value>=</value>
</relation>
- <term>^cat*fishdog"horse?</term>
+ <term>^cat*fishdog\"horse?</term>
</searchClause>
2 changes: 1 addition & 1 deletion src/test/resources/regression/12/01.xcql
@@ -3,5 +3,5 @@
<relation>
<value>=</value>
</relation>
- <term>term\*\?\^</term>
+ <term>te\rm\*\?\^</term>
</searchClause>