What is the meaning and difference between these queries?
SELECT U'String' FROM dual;
and
SELECT N'String' FROM dual;
What is the meaning and difference between these queries?
SELECT U'String' FROM dual;
and
SELECT N'String' FROM dual;
In this answer i will try to provide informations from official resources
N'' is used to convert a string to NCHAR or NVARCHAR2 datatype
According to this Oracle documentation Oracle - Literals
The syntax of text literals is as follows:
where
Nornspecifies the literal using the national character set (NCHARorNVARCHAR2data).
Also in this second article Oracle - Datatypes
The N'String' is used to convert a string to NCHAR datatype
From the article listed above:
The following example compares the
translated_descriptioncolumn of thepm.product_descriptionstable with a national character set string:SELECT translated_description FROM product_descriptions WHERE translated_name = N'LCD Monitor 11/PM';
U'' is used to handle the SQL NCHAR String Literals in Oracle Call Interface (OCI)
Based on this Oracle documentation Programming with Unicode
The Oracle Call Interface (OCI) is the lowest level API that the rest of the client-side database access products use. It provides a flexible way for C/C++ programs to access Unicode data stored in SQL
CHARandNCHARdatatypes. Using OCI, you can programmatically specify the character set (UTF-8, UTF-16, and others) for the data to be inserted or retrieved. It accesses the database through Oracle Net.
OCI is the lowest-level API for accessing a database, so it offers the best possible performance.
Handling SQL NCHAR String Literals in OCI
You can switch it on by setting the environment variable
ORA_NCHAR_LITERAL_REPLACEtoTRUE. You can also achieve this behavior programmatically by using theOCI_NCHAR_LITERAL_REPLACE_ONandOCI_NCHAR_LITERAL_REPLACE_OFFmodes inOCIEnvCreate()andOCIEnvNlsCreate(). So, for example,OCIEnvCreate(OCI_NCHAR_LITERAL_REPLACE_ON)turns onNCHARliteral replacement, whileOCIEnvCreate(OCI_NCHAR_LITERAL_REPLACE_OFF)turns it off.[...] Note that, when the
NCHARliteral replacement is turned on,OCIStmtPrepareandOCIStmtPrepare2will transformN'literals withU'literals in the SQL text and store the resulting SQL text in the statement handle. Thus, if the application usesOCI_ATTR_STATEMENTto retrieve the SQL text from theOCIstatement handle, the SQL text will returnU'instead ofN'as specified in the original text.
From datatypes perspective, there is not difference between both queries provided
N'string' just returns the string as NCHAR type.
U'string' returns also NCHAR type, however it does additional processing to the string: it replaces \\ with \ and \xxxx with Unicode code point U+xxxx, where xxxx are 4 hexadecimal digits. This is similar to UNISTR('string'), the difference is that the latter returns NVARCHAR2.
U' literals are useful when you want to have a Unicode string independent from encoding and NLS settings.
Example:
select n'\€', u'\\\20ac', n'\\\20ac' from dual;
N'\€' U'\\\20AC' N'\\\20AC'
----- ---------- ----------
\€ \€ \\\20ac
when using N' we denote that given datatype is NCHAR or NVARCHAR.
U' is used to denote unicode
The documented N'' literals are the same as standard character literals ('') except that their data type is NVARCHAR2 and not VARCHAR2. It is important to note that the characters in these literals, together with the entire SQL statement, are converted from the client character set to the database character set when transmitted to the server. All characters from the literals that are not supported by the database character set are lost.
The data type of the undocumented U'' literals is also NVARCHAR2. The content of a U'' literal is interpreted like the input to the SQL UNISTR function. That is, each character sequence \xxxx, where each x is one hex digit, is interpreted as a UTF-16 code point U+xxxx. I am not sure why the U'' literals are undocumented. I can only guess. They are used internally by the NCHAR literal replacement feature, which, when enable on a client, automatically translates N'' literals to U'' literals. This prevents the mentioned data loss due to character set conversion and enables literal Unicode data to be provided for NVARCHAR2 columns even if the database character set is not Unicode.
The two queries in this thread's question are generally not equivalent because the literal text would be interpreted differently. However, if no backslash is present in the literals, no difference can be observed.