6

The Problem:

Note: NO_BACKSLASH_ESCAPES mode is ENABLED.

I am using mysqli prepared statements for querying the database. The target code is like:

$conn = new mysqli('localhost', 'root', 'mypass', 'mydb');
$stmt = $conn->prepare('select * from `table` where `data` like ?;');

Now, let's say I manually(from mysql client, not from php) insert 3 records which have the literal values of JDxDD, JD_DD and JD\DD:

INSERT INTO `table` SET `data` = 'JDxDD';
INSERT INTO `table` SET `data` = 'JD_DD';
INSERT INTO `table` SET `data` = 'JD\DD';

Now, consider this:

$stmt = $conn -> prepare('SELECT * from `table1` where `data` like ?;');
$key = 'JD_DD';
$stmt -> bind_param('s', $key);
$stmt-> execute();

This returns all 3 rows, as expected. Now, I want to just select the one with value JD_DD (i.e. exclude the JDxDD and JD\DD). So, as answered here, I need to escape the _ using escape keyword to $key = 'JD\_DD';.

However, this:

$stmt = $conn -> prepare('SELECT * from `table1` where `data` like ? escape \'\\\';');
/*
first and last \ are for putting the ' quote,
middle 2 are for putting a \ as a part of string.
*/

evaluates the $stmt to be false, calling $conn->error gives this error:

You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ''\'' at line 1

Both of following codes give the same Syntax error as above:

(1) I tried switching to " for encapsulating the query, as:

$stmt = $conn -> prepare("SELECT * from `table1` where `data` like ? escape '\\';");

(2) And this:

$stmt = $conn -> prepare("SELECT * from `table1` where `data` like ? escape '\';");

Please note that $key = 'JD\_DD';, while can be effectively used with (instead of LIKE) an = operator, is just an example. The $key can be complex such as 'JD\_D%\_X%'; or like.

My efforts:

Here are some codes that produces the desired output:

(1) Using an alternative escape character:

$stmt = $conn -> prepare("SELECT * from `table1` where `data` like ? escape '|';");
$key = 'JD|_DD';
$stmt->bind_param('s', $key);

This works fine as one would expect.

(2) Omitting the escape word altogether:

$stmt = $conn -> prepare('SELECT * from `table1` where `data` like ?;');
$key = 'JD\_DD';
$stmt->bind_param('s', $key);

This produces the desired JD_DD output. Why?? Even with the NO_BACKSLASH_ESCAPES MODE ENABLED, why does \ works as an escape character?

(3) The following also works:

$stmt = $conn -> prepare("SELECT * from `table1` where `data` like ? escape '\\\';");
$key = 'JD\_DD';
$stmt->bind_param('s', $key);

(4) And so does the following:

$stmt = $conn -> prepare("SELECT * from `table1` where `data` like ? escape '\\\\';");
$key = 'JD\_DD';
$stmt->bind_param('s', $key);

The (3) and (4) produce the same output. Again, Why?

(5) Also, I could turn off the NO_BACKSLASH_ESCAPES mode, but that seems more like a workaround rather than a solution.

The mariadb docs says:

If you need to match the characters _ or %, you must escape them. By default, you can prefix the wildcard characters the backslash characer \ to escape them. The backslash is used both to encode special characters like newlines when a string is parsed as well as to escape wildcards in a pattern after parsing. Thus, to match an actual backslash, you sometimes need to double-escape it as "\\\\".

To avoid difficulties with the backslash character, you can change the wildcard escape character using ESCAPE in a LIKE expression. The argument to ESCAPE must be a single-character string.

Also, this SO post showed a bug that was fixed in MariaDB 10.1.22. I am using MariaDB 10.1.13, and that bug is fixed in my version. (i.e. select 'a%b' like '%\%\%'; returns 0 as expected.)

I also went through this mysql bug.

Any help understanding the nature and origin of this issue, and/or the right way to code for like clause in cases where NO_BACKSLASH_ESCAPES mode is enabled, is greatly appreciated.

Also, this is my first SO question, so I have tried to be as informative and proper as I could be. Any critics/suggestions regarding a better way to express my question is also appreciated.

Thank you!

Edit:

Simply using = instead of LIKE won't help. There may be wildcards based on the user preferences. For example, if the user prefers to fetch all relevant suggestions based on his input (say 'J%D'), then such query becomes WHERE data LIKE '%J\%D%';. In case when the user want to fetch only what he typed, I can surely use = in place of LIKE.

Edit 2:

Here are some tests I did:

With NO_BACKSLASH_ESCAPES mode DISABLED

(1) INSERT INTO table set name = 'abc\def';

this inserts abcdef as expected.

(2) INSERT INTO table set name = 'xyz\\pqr';

this inserts xyz\pqr as expected.

(3) SELECT * FROM table where name = 'abc\def';

this returns abcdef as expected.

(4) SELECT * FROM table where name = 'abc\\def';

this returns 0 rows as expected.

(5) SELECT * FROM table where name = 'xyz\pqr';

this returns 0 rows as expected.

(6) SELECT * FROM table where name = 'xyz\\pqr';

this returns xyz\pqr as expected.

(7) SELECT * FROM table where name LIKE 'abc\def';

this returns abcdef as expected.

(8) SELECT * FROM table where name LIKE 'abc\\def';

this returns abcdef. not expected. Isn't (8) supposed to be showing only if there are any abc\def?

(9) SELECT * FROM table where name LIKE 'xyz\pqr';

this returns 0 rows as expected.

(10) SELECT * FROM table where name LIKE 'xyz\\pqr';

this returns 0 rows. not expected. Isn't (10) supposed to be showing xyz\pqr?

Now with NO_BACKSLASH_ESCAPES mode ENABLED The data inserted at (1) and (2) is also kept.

(11) INSERT INTO table set name = 'abc\def';

this inserts abc\def as expected.

(12) INSERT INTO table set name = 'xyz\\pqr';

this inserts xyz\\pqr as expected.

(13) SELECT * FROM table where name = 'abc\def';

this returns abc\def (inserted at 11) as expected.

(14) SELECT * FROM table where name = 'abc\\def';

this returns 0 rows as expected.

(15) SELECT * FROM table where name = 'xyz\pqr';

this returns xyz\pqr (inserted at 2) as expected.

(16) SELECT * FROM table where name = 'xyz\\pqr';

this returns xyz\\pqr (inserted at 12) as expected.

(17) SELECT * FROM table where name LIKE 'abc\def';

this returns abcdef (inserted at (1)). not expected. With this mode on, I would expect it to take the \ literally (unless used with escape = '\'). Expected result is abc\def (inserted at 11).

(18) SELECT * FROM table where name LIKE 'abc\\def';

this returns abc\def (inserted at (11)). not expected. With this mode on, I would expect it to take both the \\ literally. Expected result is 0 rows as there is no data like abc\\def.

(19) SELECT * FROM table where name LIKE 'xyz\pqr';

this returns 0 rows. not expected. With this mode on, I would expect it to take the \ literally. Expected result is xyz\pqr (inserted at 2).

(20) SELECT * FROM table where name LIKE 'xyz\\pqr';

this returns xyz\pqr (inserted at (11)). not expected. With this mode on, I would expect it to take both the \\ literally. Expected result is xyz\\pqr (inserted at 12).


So, to conclude, the NO_BACKSLASH_ESCAPES mode does in INSERT and WHERE col = value as expected - taking every \ literally when set to ON. BUT, with WHERE col LIKE value it gets strange as seen above.

For LIKE clauses, the \ is NOT taken literally even if the mode is set to ON.

I now strongly suspect that I have misunderstood some very basic things regarding LIKE clauses. Any clarification is very much welcomed!

Jay Dadhania
  • 374
  • 3
  • 13
  • 1
    `'a%b' like '%\%\%'` _should_ return 0. It is asking for "anything (first %) followed by _two_ percents. – Rick James Jul 13 '18 at 15:23
  • 1
    What exactly is the question when your effort nr. 2 gives you the desired result? When you have `data LIKE ?` and provide `"JD\_DD"` you will get, as expected, the only row with the value `JD_DD`. – Progman Jul 13 '18 at 20:07
  • 1
    Keep in mind that NO_BACKSLASH_ESCAPES does not affect the values for your prepared statement, it only affects strings in your query like `SELECT 'abc\def';` (which will return "abc\def" or "abcdef" depending on your setting). – Progman Jul 13 '18 at 20:13
  • @RickJames yes, that is what I get. I wanted to say that I *don't have* this bug in my version, so my issue is originated somewhere else. My bad, I was not clear there. Edited. – Jay Dadhania Jul 13 '18 at 20:43
  • @Progman _that is exactly my question_, NO_BACKSLASH_ESCAPES tells the parser to take the '\' character as _literal \._ But in this case, even with this mode, the \ in the passed string `JD\_DD` _is not taken literally._ – Jay Dadhania Jul 13 '18 at 20:49
  • 1
    @JayDadhania NO_BACKSLASH_ESCAPE is only relevant for strings you create in the query, but you already have the string value as a value for the prepared statement available. See how `SELECT 'abc\def';` is affected when you enable and disable this flag. The setting has no affect when you already have the value. – Progman Jul 13 '18 at 20:55
  • @Progman I have tested some queries as you suggested, and have added the results in original question. Thank you very much for your interest! – Jay Dadhania Jul 13 '18 at 22:24
  • 1
    @JayDadhania Check the note on https://dev.mysql.com/doc/refman/8.0/en/string-comparison-functions.html#operator_like. When you want to search for a `\` with LIKE you need to have `\\` in the input string, and you need to write `\\\\` to get one, when you create the string in the query. The LIKE operator has a parser as well: "[...] and again when the pattern match is made". – Progman Jul 14 '18 at 08:29
  • Thank you so much, @Progman! I added that as an answer and made it a community wiki. Much appreciated, man! – Jay Dadhania Jul 14 '18 at 12:23
  • Clarification needed -- were those insert/select tests done via the commandline tool? Or through some api (eg PHP)? – Rick James Jul 14 '18 at 16:04
  • @RickJames the tests in Edit 2 are done in a Query Tab in HeidiSQL, which is same as doing in command line. – Jay Dadhania Jul 14 '18 at 16:32
  • Bravo to the thorough testing. – Rick James Jul 14 '18 at 16:32

2 Answers2

2

Simply use = instead of LIKE??

LIKE should be used only when you want the wildcard actions of % and/or _.

Revised case

Requirements:

  • User can enter _, %, or \ -- and the test is to take them literally, plus
  • User's input is the beginning of the string

Solution 1 avoids dealing with LIKE:

WHERE LEFT(data, CHAR_LENGTH($input)) = $input

Solution 2 makes LIKE work without changing NO_BACKSLASH_ESCAPES:

First, quadruple each \ , second prefix each _ or % with a \ . Then tack on % and do LIKE.

If the Requirement is that $input is to be located anywhere in data, then

WHERE INSTR($input, data)
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Of course, that was just an example. I want to be able to query something complex like `where data LIKE 'JD\_D%\_X%';`. The `$key` is prepared from arbitrary user input, so I need to know the proper way handling the escape etc. Also, I can sanitize the user input to remove the special characters like `%`, but I don't want to. I want my users to be able to any literal data they want, and search/retrieve it *however they like*. Also, This is *only for learning purposes*, and I will disable the NO_BACKSLASH_ESCAPES mode, but I want my application to work in correct way regardless of this mode. – Jay Dadhania Jul 13 '18 at 04:38
  • @JayDadhania - Do you allow users to enter wildcards to mean wildcards? _And_ enter the same characters to be taken literally? Which parts of the example are escapes, which are wild, which are literal? I know you are asking about escaping, but it is unclear what cases need escaping. – Rick James Jul 13 '18 at 15:20
  • Yes, all characters coming from *the user input* needs to be taken as *literals*. For example, if the user types in `J%D`, the `%` inside it should be converted into `literal %` such as `J\%D`. I know I can achieve this by escaping, my question is regarding the `escape` clause used in prepare statement, and the *value of `escape` such as '\' or '|'`, and *why* does `J%D` need to be escaped even when NO_BACKSLASH_ESCAPES mode is *ON*. Also, with this mode, backslash *works as an escape character*, unexpected for this mode. – Jay Dadhania Jul 13 '18 at 16:15
  • If there are no wildcards, use `=` instead of `LIKE`. – Rick James Jul 13 '18 at 16:33
  • There *may be* wildcards based on the user preferences. For example, if the user prefers to fetch *all relevant suggestions* based on his input (say `'J%D'`), then such query becomes `WHERE data LIKE '%J\%D%';`. In case when the user want to fetch *only what he typed*, I can surely use `=` in place of `LIKE`. I updated my question to reflect this. – Jay Dadhania Jul 13 '18 at 19:10
  • 1
    @JayDadhania - Ok, then I would expect 2 or 4 backslashes to work -- it depends on how many layers of PHP the string goes through. Each layer would cut the number in half. – Rick James Jul 13 '18 at 22:06
  • you're correct. I've added this as an answer to my question. Thank you! – Jay Dadhania Jul 14 '18 at 15:49
  • 1
    And, apparently, `LIKE` is one of the layers that cuts in half the backslashes. – Rick James Jul 14 '18 at 16:09
2

Update 14-07-2018: (The Solution)

As @Progman's comment pointed to the mysql docs, here is the summary from the docs:

Note

Because MySQL uses C escape syntax in strings (for example, \n to represent a newline character), you must double any \ that you use in LIKE strings. For example, to search for \n, specify it as \\n. To search for \, specify it as \\\\; this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against.

Exception: At the end of the pattern string, backslash can be specified as \\. At the end of the string, backslash stands for itself because there is nothing following to escape.

Please do note that the above summary is for cases when NO_BACKSLASH_ESCAPES is disabled.

That said, here is my simple understanding about this:

The string in LIKE clause passes through 2 things:

(1) The Parser

(2) The Pattern Matcher

Thus, when looking for a literal \ in normal mode, we have to double escape it as \\\\, once for the parser and once for the pattern matcher.

When looking for a literal \ in NO_BACKSLASH_ESCAPES mode, we can skip the escaping that is meant for the parser (because this mode makes the parser take them as literal), BUT we must escape once that is for the pattern matcher. So, in this mode, we must use \\ to look for a literal \.

The NO_BACKSLASH_ESCAPES mode DOES NOT affect the Pattern Matcher!


To conclude:

(1) for same results, the normal mode requires 2x slashes than NO_BACKSLASH_ESCAPES mode, regardless of its use in = or LIKE. These are same:

SET @@sql_mode = '';
SELECT * FROM table where name LIKE 'abc\\\\def';

and

SET @@sql_mode = 'NO_BACKSLASH_ESCAPES';
SELECT * FROM table where name LIKE 'abc\\def';

both returns abc\def as result.

(2) for same results, the LIKE requires 2x slashes than =, regardless of its use in normal or NO_BACKSLASH_ESCAPES mode. These are same:

SELECT * FROM table where name LIKE 'abc\\\\def';

and

SELECT * FROM table where name = 'abc\\def';

both returns abc\def in normal mode and abc\\def in NO_BACKSLASH_ESCAPES mode.

Thanks a ton, @Progman!

Jay Dadhania
  • 374
  • 3
  • 13