ECMA-334 C# Language Specification
9.1: Programs
A C# program consists of one or more source files, known formally as compilation units (16.1). A source file is an ordered sequence of Unicode characters. Source files typically have a one-to-one correspondence with files in a file system, but this correspondence is not required.
Conceptually speaking, a program is compiled using three steps:
1. Transformation, which converts a file from a particular character repertoire and encoding scheme into a sequence of Unicode characters.
2. Lexical analysis, which translates a stream of Unicode input characters into a stream of tokens.
3. Syntactic analysis, which translates the stream of tokens into executable code.
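The first two steps can be illustrated with a minimal sketch in Python. It is not part of the specification. The names transform, lex, and TOKEN_PATTERN are hypothetical, and the token grammar is a toy assumption that is far simpler than the C# lexical grammar. Syntactic analysis is omitted because it depends on the full grammar.

```python
import re

# Toy token grammar, assumed for illustration only; NOT the C# lexical grammar.
TOKEN_PATTERN = re.compile(
    r"\s+"                                    # whitespace: matched, then skipped
    r"|(?P<identifier>[^\W\d]\w*)"            # Unicode letter or underscore, then word characters
    r"|(?P<number>\d+)"                       # run of decimal digits
    r"|(?P<punctuator>[{}()\[\];.,=+\-*/<>])" # single-character punctuators
)


def transform(raw, encoding="utf-8"):
    """Step 1: convert bytes in a given encoding into a sequence of Unicode characters."""
    return raw.decode(encoding)


def lex(text):
    """Step 2: translate the character stream into a list of (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_PATTERN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        # lastgroup is None for whitespace, which produces no token.
        if match.lastgroup is not None:
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    source_bytes = "class Ω { int x = 42; }".encode("utf-8")
    for kind, lexeme in lex(transform(source_bytes)):
        print(kind, lexeme)
```

Keeping the two steps as separate functions mirrors the conceptual separation in the list above: the lexer only ever sees Unicode characters and does not depend on how the file was encoded.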
Conforming implementations must accept Unicode source files encoded with the UTF-8 encoding form (as defined by the Unicode standard), and transform them into a sequence of Unicode characters. Implementations may choose to accept and transform additional character encoding schemes (such as UTF-16, UTF-32, or non-Unicode character mappings).
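As a rough sketch of this requirement (again in Python, not part of the specification), the function below always attempts strict UTF-8 first and only then tries encodings that an implementation has chosen to support. The name decode_source, the extra_encodings parameter, and the policy of trying UTF-8 first and falling back in order are all assumptions. The specification only requires that UTF-8 be accepted and does not say how additional encodings are selected.

```python
def decode_source(raw, extra_encodings=()):
    """Decode source bytes into Unicode characters.

    UTF-8 is always attempted, since it is the encoding that must be accepted.
    extra_encodings is a hypothetical, implementation-chosen list of
    additional encodings, tried in order if strict UTF-8 decoding fails.
    """
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as utf8_error:
        for encoding in extra_encodings:
            try:
                return raw.decode(encoding)
            except UnicodeDecodeError:
                continue
        # No optional encoding applied: report the original UTF-8 failure.
        raise utf8_error


if __name__ == "__main__":
    print(decode_source("string π = \"pi\";".encode("utf-8")))
    print(decode_source("int x;".encode("utf-16"), extra_encodings=("utf-16",)))
```

A real implementation would more likely select the encoding from an explicit compiler option or a byte order mark than by trial decoding. The fallback loop is used here only to keep the sketch short.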