Internationalized Resource Identifier | |
---|---|
Abbreviation | IRI |
Status | Proposed Standard |
Year started | 22 April 2002 |
First published | 22 April 2002 |
Latest version | 21 January 2020 |
Organization | IETF |
Authors | |
Base standards | |
Domain | Character encoding |
Website | RFC 3987 |
The Internationalized Resource Identifier (IRI) is an internet protocol standard which builds on the Uniform Resource Identifier (URI) protocol by greatly expanding the set of permitted characters.[1][2][3] It was defined by the Internet Engineering Task Force (IETF) in 2005 in RFC 3987. While URIs are limited to a subset of the US-ASCII character set (characters outside that set must be mapped to octets according to some unspecified character encoding, then percent-encoded), IRIs may additionally contain most characters from the Universal Character Set (Unicode/ISO 10646),[4][5] including Chinese, Japanese, Korean, and Cyrillic characters.
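The dependence of plain URIs on a character encoding can be illustrated with a short sketch. It uses Python's standard `urllib.parse.quote`; the sample character and the two encodings compared are chosen purely for illustration and are not taken from the standard.

```python
from urllib.parse import quote

# The same non-ASCII character yields different percent-encoded forms
# depending on which character encoding maps it to octets first.
char = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

as_latin1 = quote(char, encoding="latin-1")  # one octet: E9
as_utf8 = quote(char, encoding="utf-8")      # two octets: C3 A9

print(as_latin1)
print(as_utf8)
```

Because a URI does not record which encoding produced its octets, a recipient cannot tell from the URI alone which character was meant.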
Note that IRIs (Internationalized Resource Identifiers) [11] are expected to replace URIs in the near future.
This document defines a new protocol element, the Internationalized Resource Identifier (IRI), as a complement to the Uniform Resource Identifier (URI). An IRI is a sequence of characters from the Universal Character Set (Unicode/ISO 10646). A mapping from IRIs to URIs is defined, which means that IRIs can be used instead of URIs, where appropriate, to identify resources. The approach of defining a new protocol element was chosen instead of extending or changing the definition of URIs.
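A minimal sketch of such a mapping follows, assuming that each non-ASCII character is converted to its UTF-8 octets and each octet is then percent-encoded. The function name `iri_to_uri` is hypothetical, and the sketch deliberately omits Unicode normalization and any special treatment of host names, so it should be read as an illustration rather than a complete implementation of the standard.

```python
def iri_to_uri(iri: str) -> str:
    """Map an IRI to a URI by percent-encoding non-ASCII characters.

    Simplified illustration: ASCII characters are copied unchanged;
    every other character is encoded as UTF-8 and each resulting
    octet is written as %HH with uppercase hexadecimal digits.
    """
    pieces = []
    for ch in iri:
        if ord(ch) < 128:
            pieces.append(ch)  # already allowed in the ASCII-only output
        else:
            pieces.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(pieces)


print(iri_to_uri("http://example.org/ros\u00e9"))
```

An identifier that is already pure ASCII passes through this function unchanged, which reflects the idea that IRIs complement URIs rather than redefine them.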
This document defines a new protocol element called Internationalized Resource Identifier (IRI) by extending the syntax of URIs to a much wider repertoire of characters. It also defines "internationalized" versions corresponding to other constructs from [RFC3986], such as URI references. The syntax of IRIs is defined in section 2, and the relationship between IRIs and URIs in section 3.