Source code
Revision control
Copy as Markdown
Other Tools
Metadata-Version: 2.1
Name: idna
Version: 2.10
Summary: Internationalized Domain Names in Applications (IDNA)
Author: Kim Davies
Author-email: kim@cynosure.com.au
License: BSD-like
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Internet :: Name Service (DNS)
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*
Internationalized Domain Names in Applications (IDNA)
=====================================================
Support for the Internationalised Domain Names in Applications
This is the latest version of the protocol and is sometimes referred to as
“IDNA 2008”.
This library also provides support for Unicode Technical Standard 46,
This acts as a suitable replacement for the “encodings.idna” module that
comes with the Python standard library, but only supports the
Basic functions are simply executed:
.. code-block:: pycon
# Python 3
>>> import idna
>>> idna.encode('ドメイン.テスト')
b'xn--eckwd4c7c.xn--zckzah'
>>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
ドメイン.テスト
# Python 2
>>> import idna
>>> idna.encode(u'ドメイン.テスト')
'xn--eckwd4c7c.xn--zckzah'
>>> print idna.decode('xn--eckwd4c7c.xn--zckzah')
ドメイン.テスト
Packages
--------
The latest tagged release version is published in the PyPI repository:
Installation
------------
To install this library, you can use pip:
.. code-block:: bash
$ pip install idna
Alternatively, you can install the package using the bundled setup script:
.. code-block:: bash
$ python setup.py install
This library works with Python 2.7 and Python 3.4 or later.
Usage
-----
For typical usage, the ``encode`` and ``decode`` functions will take a domain
name argument and perform a conversion to A-labels or U-labels respectively.
.. code-block:: pycon
# Python 3
>>> import idna
>>> idna.encode('ドメイン.テスト')
b'xn--eckwd4c7c.xn--zckzah'
>>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
ドメイン.テスト
You may use the codec encoding and decoding methods using the
``idna.codec`` module:
.. code-block:: pycon
# Python 2
>>> import idna.codec
>>> print u'домена.испытание'.encode('idna')
xn--80ahd1agd.xn--80akhbyknj4f
>>> print 'xn--80ahd1agd.xn--80akhbyknj4f'.decode('idna')
домена.испытание
Conversions can be applied at a per-label basis using the ``ulabel`` or ``alabel``
functions if necessary:
.. code-block:: pycon
# Python 2
>>> idna.alabel(u'测试')
'xn--0zwm56d'
Compatibility Mapping (UTS #46)
+++++++++++++++++++++++++++++++
specification no longer normalizes input from different potential ways a user
may input a domain name. This functionality, known as a “mapping”, is now
considered by the specification to be a local user-interface issue distinct
from IDNA conversion functionality.
This library provides one such mapping, that was developed by the Unicode
it provides for both a regular mapping for typical applications, as well as
a transitional mapping to help migrate from older IDNA 2003 applications.
For example, “Königsgäßchen” is not a permissible label as *LATIN CAPITAL
LETTER K* is not allowed (nor are capital letters in general). UTS 46 will
convert this into lower case prior to applying the IDNA conversion.
.. code-block:: pycon
# Python 3
>>> import idna
>>> idna.encode(u'Königsgäßchen')
...
idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
>>> idna.encode('Königsgäßchen', uts46=True)
b'xn--knigsgchen-b4a3dun'
>>> print(idna.decode('xn--knigsgchen-b4a3dun'))
königsgäßchen
Transitional processing provides conversions to help transition from the older
2003 standard to the current standard. For example, in the original IDNA
specification, the *LATIN SMALL LETTER SHARP S* (ß) was converted into two
*LATIN SMALL LETTER S* (ss), whereas in the current IDNA specification this
conversion is not performed.
.. code-block:: pycon
# Python 2
>>> idna.encode(u'Königsgäßchen', uts46=True, transitional=True)
'xn--knigsgsschen-lcb0w'
Implementors should use transitional processing with caution, only in rare
cases where conversion from legacy labels to current labels must be performed
(i.e. IDNA implementations that pre-date 2008). For typical applications
that just need to convert labels, transitional processing is unlikely to be
beneficial and could produce unexpected incompatible results.
``encodings.idna`` Compatibility
++++++++++++++++++++++++++++++++
Function calls from the Python built-in ``encodings.idna`` module are
mapped to their IDNA 2008 equivalents using the ``idna.compat`` module.
Simply substitute the ``import`` clause in your code to refer to the
new module name.
Exceptions
----------
All errors raised during the conversion following the specification should
raise an exception derived from the ``idna.IDNAError`` base class.
More specific exceptions that may be generated as ``idna.IDNABidiError``
when the error reflects an illegal combination of left-to-right and right-to-left
characters in a label; ``idna.InvalidCodepoint`` when a specific codepoint is
an illegal character in an IDN label (i.e. INVALID); and ``idna.InvalidCodepointContext``
when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO
or CONTEXTJ but the contextual requirements are not satisfied.)
Building and Diagnostics
------------------------
The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for
performance. These tables are derived from computing against eligibility criteria
in the respective standards. These tables are computed using the command-line
script ``tools/idna-data``.
This tool will fetch relevant tables from the Unicode Consortium and perform the
required calculations to identify eligibility. It has three main modes:
* ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``,
the pre-calculated lookup tables using for IDNA and UTS 46 conversions. Implementors
who wish to track this library against a different Unicode version may use this tool
to manually generate a different version of the ``idnadata.py`` and ``uts46data.py``
files.
* ``idna-data make-table``. Generate a table of the IDNA disposition
(e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC
* ``idna-data U+0061``. Prints debugging output on the various properties
associated with an individual Unicode codepoint (in this case, U+0061), that are
used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging
or analysis.
The tool accepts a number of arguments, described using ``idna-data -h``. Most notably,
the ``--version`` argument allows the specification of the version of Unicode to use
in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata``
will generate library data against Unicode 9.0.0.
Note that this script requires Python 3, but all generated library data will work
in Python 2.7.
Testing
-------
The library has a test suite based on each rule of the IDNA specification, as
well as tests that are provided as part of the Unicode Technical Standard 46,
The tests are run automatically on each commit at Travis CI: