Character References
&#decimal-number;
&#xhexadecimal-number;
All XML parsers are based on the Unicode character set, no matter what the external encoding of the XML file is. It is theoretically possible to author documents directly in Unicode, but many text-editing, storage, and delivery systems do not fully support the Unicode character set. To allow XML authors to include Unicode characters in their documents' content without forcing them to abandon their existing editing tools, XML provides the character reference mechanism.
A character reference allows an author to specify a Unicode character by its code point number, written in either decimal or hexadecimal. The parser inserts that character into the output stream it delivers to the XML application. Consider an XML document that includes the following character data:
&#xa9; 2002 O'Reilly &amp; Associates
In this example, the parser would replace the character reference with the actual Unicode character and pass it to the client application:
© 2002 O'Reilly & Associates
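As a rough illustration of this substitution, the following Python sketch feeds a small document to the standard library's xml.etree.ElementTree parser and inspects the text handed back to the application. The copyright wrapper element is hypothetical and exists only to make the fragment well-formed. Other parsers should behave the same way for this input.

```python
import xml.etree.ElementTree as ET

# The <copyright> wrapper element is hypothetical; it only gives the
# character data a root element so the fragment is well-formed.
source = "<copyright>&#xa9; 2002 O'Reilly &amp; Associates</copyright>"

root = ET.fromstring(source)
text = root.text  # the parser has already expanded the references

# Show the result in ASCII-safe form, plus the code point of the
# first character, which came from the character reference.
print(ascii(text))
print(hex(ord(text[0])))
```

The application never sees the reference itself, only the character it stands for.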
Character references may not be used in element or attribute names, although they may be used in attribute values.
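The difference between names and attribute values can be checked with a short sketch, again assuming Python's xml.etree.ElementTree. The notice element and holder attribute are made-up names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Allowed: character references inside an attribute value.
# The element and attribute names here are hypothetical.
ok = ET.fromstring('<notice holder="O&#x27;Reilly &#xa9;"/>')
holder = ok.get("holder")

# Not allowed: a character reference inside an element name.
# &#x65; is the letter "e", but references are not expanded in names,
# so the document is not well-formed.
try:
    ET.fromstring("<notic&#x65;/>")
    name_accepted = True
except ET.ParseError:
    name_accepted = False

print(ascii(holder))
print(name_accepted)
```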
Note that the digits of a hexadecimal character reference are case-insensitive (i.e., &#xa9; is equivalent to &#xA9;). The x that introduces the hexadecimal form, however, must be lowercase.
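A quick way to confirm this equivalence is to parse each spelling and compare the results. The sketch below assumes Python's xml.etree.ElementTree, and the expand helper and the c element are hypothetical names introduced for the example. It also tries an uppercase X prefix, which the XML grammar does not permit because the prefix is spelled &#x.

```python
import xml.etree.ElementTree as ET

def expand(reference):
    """Parse a one-element document and return its text content."""
    # The <c> wrapper element is hypothetical.
    return ET.fromstring("<c>" + reference + "</c>").text

lower = expand("&#xa9;")    # lowercase hexadecimal digits
upper = expand("&#xA9;")    # uppercase hexadecimal digits
decimal = expand("&#169;")  # the same code point in decimal

# The prefix itself is case-sensitive: "&#X" is rejected.
try:
    expand("&#Xa9;")
    capital_x_accepted = True
except ET.ParseError:
    capital_x_accepted = False

print(lower == upper == decimal)
print(capital_x_accepted)
```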