Mono Documentation

ECMA-334 C# Language Specification

9.4.1: Unicode escape sequences

A Unicode escape sequence represents a Unicode character. Unicode escape sequences are processed in identifiers (9.4.2), regular string literals (9.4.4.5), and character literals (9.4.4.4). A Unicode character escape is not processed in any other location (for example, to form an operator, punctuator, or keyword).
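As a rough illustration of where escapes are and are not processed, the sketch below compares an identifier, a character literal, a regular string literal, and a verbatim string literal. The class name EscapeLocationsDemo and the variable names are hypothetical, chosen only for this example. The verbatim string case is included as one instance of "any other location".

```csharp
using System;

static class EscapeLocationsDemo
{
    static void Main()
    {
        // Identifier: \u0061 is the letter "a", so this declares a variable named a.
        int \u0061 = 1;

        char c = '\u0041';            // character literal: processed, yields 'A'
        string regular = "\u0041";    // regular string literal: processed, yields "A"
        string verbatim = @"\u0041";  // verbatim string literal: not processed

        Console.WriteLine(a);                 // 1
        Console.WriteLine(c == 'A');          // True
        Console.WriteLine(regular == "A");    // True
        Console.WriteLine(verbatim.Length);   // 6 (backslash, u, 0, 0, 4, 1)
    }
}
```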

unicode-escape-sequence:
    \u hex-digit hex-digit hex-digit hex-digit
    \U hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit

A Unicode escape sequence represents the single Unicode character formed by the hexadecimal number following the "\u" or "\U" characters. Since C# uses a 16-bit encoding of Unicode characters in character and string values, a Unicode character in the range U+10000 to U+10FFFF is represented using two Unicode surrogate characters. Unicode characters with code points above U+10FFFF are not supported.
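The surrogate representation can be observed directly. The following is a minimal sketch, not part of the specification: SurrogateDemo and CodeUnits are hypothetical names introduced here, and U+1D11E is an arbitrary code point chosen from the range U+10000 to U+10FFFF.

```csharp
using System;

static class SurrogateDemo
{
    // Returns the 16-bit code units of s as four-digit uppercase hex, separated by spaces.
    public static string CodeUnits(string s)
    {
        var parts = new string[s.Length];
        for (int i = 0; i < s.Length; i++)
            parts[i] = ((int)s[i]).ToString("X4");
        return string.Join(" ", parts);
    }

    static void Main()
    {
        string bmp = "\u0066";         // U+0066 fits in one 16-bit code unit
        string astral = "\U0001D11E";  // U+1D11E needs two surrogate characters

        Console.WriteLine(bmp.Length);                // 1
        Console.WriteLine(astral.Length);             // 2
        Console.WriteLine(CodeUnits(astral));         // D834 DD1E
        Console.WriteLine(char.IsSurrogatePair(astral[0], astral[1]));    // True
        Console.WriteLine(char.ConvertToUtf32(astral, 0).ToString("X"));  // 1D11E
    }
}
```

Note that Length counts 16-bit code units, so a single character written with \U can contribute 2 to the length of a string.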

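Multiple translations are not performed. For instance, the string literal "\u005Cu005C" is equivalent to the six-character string \u005C (a backslash followed by the characters u005C) rather than to a single "\", because the backslash produced by the first escape does not begin a new escape sequence. [Note: The Unicode value \u005C is the character "\". end note]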
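The single-pass behavior can be checked with a short program. This is a sketch only: NoRescanDemo is a hypothetical name, and the comparison strings use the simple escape sequence \\ (defined elsewhere in the specification) to write a backslash without a Unicode escape.

```csharp
using System;

static class NoRescanDemo
{
    static void Main()
    {
        // The first six characters, \u005C, are translated once to a backslash.
        // The remaining characters u005C are kept as ordinary text.
        string s = "\u005Cu005C";

        Console.WriteLine(s.Length);        // 6
        Console.WriteLine(s);               // \u005C
        Console.WriteLine(s == "\\u005C");  // True: backslash followed by u005C
        Console.WriteLine(s == "\\");       // False: not a single backslash
    }
}
```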

[Example: The example

    class Class1
    {
        static void Test(bool \u0066)
        {
            char c = '\u0066';
            if (\u0066)
                System.Console.WriteLine(c.ToString());
        }
    }

shows several uses of \u0066, which is the escape sequence for the letter "f". The program is equivalent to

    class Class1
    {
        static void Test(bool f)
        {
            char c = 'f';
            if (f)
                System.Console.WriteLine(c.ToString());
        }
    }

end example]