In the introduction to domain names, you learned that a domain name is nothing more than a set of letters meant to represent an IP address somewhere on the internet. You also learned that the main point of a domain name is to make web addresses easier for people to remember and type.
Just to Review
When a computer wants to know where to find a site like iGoldRush.com, it first asks the Registry for that extension (in this case, .com). The Registry tells the user’s computer which nameserver to go to. The user’s computer then goes to that specific nameserver, which tells it exactly which IP address to go to. Finally, the user’s computer goes to that IP address and finds the information it was looking for.
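To make the lookup chain concrete, here is a minimal Python sketch of the simplified model described above. The nameserver name and the IP address are made-up placeholders (the address comes from a range reserved for documentation), and real DNS resolution involves more steps, such as caching, that this toy model leaves out.

```python
# Toy model of the lookup chain: Registry -> nameserver -> IP address.
# All data below is hypothetical and exists only for illustration.

# The Registry knows which nameserver is responsible for each domain.
REGISTRY = {
    "igoldrush.com": "ns1.example-nameserver.net",
}

# Each nameserver knows the IP address for the domains it serves.
NAMESERVERS = {
    "ns1.example-nameserver.net": {
        "igoldrush.com": "192.0.2.10",
    },
}


def resolve(domain):
    """Return the IP address for a domain by following the chain."""
    name = domain.lower()  # domain names are not case-sensitive
    nameserver = REGISTRY[name]           # step 1: ask the Registry
    return NAMESERVERS[nameserver][name]  # step 2: ask the nameserver


print(resolve("iGoldRush.com"))
```

The two dictionaries stand in for the two questions the computer asks: "which nameserver?" and "which IP address?".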
Right now, domain names are all stored in ASCII characters. ASCII stands for American Standard Code for Information Interchange, which represents the most basic and necessary English language characters (A-Z, a-z, 0-9 and punctuation) in a standardized fashion that both computers and users can understand. If you live in an English-speaking country, chances are any character shown on the keyboard you are using is included in ASCII. Since the Registries only recognize ASCII (which is based on the English alphabet), they only allow users to register domain names containing English letters, numbers, and hyphens. This rule is called the LDH (letter-digit-hyphen) restriction. Any label that follows the restriction is called an LDH-label.
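The LDH restriction is simple enough to check in a few lines of Python. The sketch below is illustrative only: the function name is made up, and it also applies two standard DNS rules not covered in the text above, namely that a label is at most 63 characters long and cannot begin or end with a hyphen.

```python
import re

# Letters, digits, and hyphens only; 1 to 63 characters;
# no hyphen at the very start or the very end.
LDH_PATTERN = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_ldh_label(label):
    """Return True if the label obeys the letter-digit-hyphen rule."""
    return LDH_PATTERN.fullmatch(label) is not None


print(is_ldh_label("igoldrush"))      # True
print(is_ldh_label("i-gold-rush"))    # True
print(is_ldh_label("-igoldrush"))     # False: starts with a hyphen
print(is_ldh_label("язолотойпорыв"))  # False: not ASCII
```

Note that the check is applied to one label at a time (the part between the dots), not to the whole domain name.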
The Problem with English Domain Names
So what if someone in China wanted to register a domain for their office supply business? They would have to use English characters, even though many of the Chinese business’s customers might not be able to read or remember the name.
This issue has caused much discussion and argument from the very beginning, and many have proposed different ways to fix it. Right now, the domain industry uses a system in which domain names that use non-ASCII characters (for example, any domain name written in Chinese characters) are converted into an ASCII format in a process called ToASCII. This lets people register Internationalized Domain Names (IDNs).
How the DNS Handles Multilingual Domain Names
The process starts right at the Registrar’s website. For example, suppose someone in Russia wants to register a Russian version of igoldrush.com. They type the name яЗолотойПорыв.com into the ‘check availability’ window. The Registrar begins the conversion to ASCII by normalizing the name, which includes making sure all the letters are lowercase. This step is called the Nameprep algorithm. The name “яЗолотойПорыв.com” becomes “язолотойпорыв.com”. язолотойпорыв.com is the U-label (the label in Unicode format). The name must now be converted, or translated, to an A-label (the label in ASCII format).
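You can try the Nameprep step yourself. Python's standard library happens to include a nameprep function in its encodings.idna module, so the short sketch below uses it; a Registrar's own software may well work differently, and the variable names here are only for illustration.

```python
from encodings.idna import nameprep

# The label exactly as the user typed it, with mixed upper and lower case.
typed_label = "яЗолотойПорыв"

# Nameprep normalizes the label, which includes lowercasing the letters.
u_label = nameprep(typed_label)

print(u_label)  # язолотойпорыв
```

Nameprep works on a single label, so the ".com" part is left out here.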
The name now goes through another algorithm called Punycode, in which the non-ASCII (Unicode) characters are converted into a string of ASCII characters, preceded by the prefix xn--. The name that Punycode would come up with for язолотойпорыв.com is xn--b1ajghobabilr7itb.com. Notice the xn-- at the front. These four characters precede every A-label. The xn-- prefix is reserved for multilingual domain names.
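The Punycode step can also be reproduced in Python, which ships with a "punycode" codec and an "idna" codec that combines Nameprep and Punycode. The helper names to_a_label and to_ascii_domain below are made up for this sketch; it assumes each label has already been through Nameprep.

```python
ACE_PREFIX = "xn--"  # the four characters that mark an A-label


def to_a_label(u_label):
    """Convert one lowercase label to its ASCII form using Punycode."""
    if u_label.isascii():
        return u_label  # plain ASCII labels such as "com" are left alone
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_ascii_domain(domain):
    """Convert every label of a domain name, one label at a time."""
    return ".".join(to_a_label(label) for label in domain.split("."))


print(to_ascii_domain("язолотойпорыв.com"))
# xn--b1ajghobabilr7itb.com

# The built-in "idna" codec does Nameprep and Punycode in one go.
print("яЗолотойПорыв.com".encode("idna").decode("ascii"))
# xn--b1ajghobabilr7itb.com
```

Notice that only the Russian label changes. The "com" label is already ASCII, so it passes through untouched.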
Once the Registrar has the domain name in ASCII format, it checks to see if xn--b1ajghobabilr7itb.com has been registered before. If not, and the domain name is available, the Registrant (user) goes on with the normal process of registering a domain name.
Some Registrars are able to register IDNs, but do not make the ToASCII translation as part of their service. In such cases, someone wanting to register the IDN яЗолотойПорыв.com would have to first have it translated by an outside source and then register the A-label (the ASCII translation of the name). Such ‘translators’ are relatively common. You can see one in action on VeriSign’s website.
When someone in Russia wants to visit яЗолотойПорыв.com, they simply type the name into their browser and they’ll be taken to the site registered in the example above. Many modern browsers will automatically translate the Russian domain name into the corresponding A-label behind the scenes. Depending on the browser and its language settings (for example, if the computer is set up in English), the address bar may display the A-label instead of the Russian characters. This cuts down on potential IDN scams, in which look-alike characters from another alphabet are used to imitate a well-known domain name.
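The display decision can be sketched in Python as well. This is only a rough illustration of the idea, not how any particular browser works: the function names and the notion of a set of "trusted scripts" are assumptions made for this example, and the script of each character is guessed from the first word of its official Unicode name.

```python
import unicodedata


def to_unicode(ascii_domain):
    """Turn an A-label domain back into its Unicode (U-label) form."""
    return ascii_domain.encode("ascii").decode("idna")


def scripts_used(text):
    """Guess the scripts of the non-ASCII characters from their names."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in text
        if not ch.isascii()
    }


def display_form(ascii_domain, trusted_scripts):
    """Show Unicode only if every script in the name is trusted."""
    unicode_domain = to_unicode(ascii_domain)
    if scripts_used(unicode_domain) <= set(trusted_scripts):
        return unicode_domain
    return ascii_domain  # fall back to the A-label


name = "xn--b1ajghobabilr7itb.com"
print(display_form(name, {"CYRILLIC"}))  # язолотойпорыв.com
print(display_form(name, set()))         # xn--b1ajghobabilr7itb.com
```

A user whose settings include Russian would see the readable name, while everyone else would see the xn-- form and could tell at a glance that the address is not plain English.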
As the IDN industry grows, changes are inevitable. But, as those in control try to preserve the stability of the current system, the inevitable changes are sure to be slow and incremental.