IDN Policy

1. Introduction

This document sets out the policy for IDN (Internationalized Domain Name) registrations under .ලංකා and .இலங்கை at the LK Domain Registry. The policy and procedure are designed to ensure reliable and reasonable assignment of IDNs to registrants.

2. Abbreviations

Character: A character is a vowel, a consonant or a composite (a consonant with a vowel modifier) in Sinhala/Tamil script, a digit, a Latin letter or a hyphen (-).

Domain name: A domain name is a unique address that can be used to identify a resource on the Internet. It consists of one or more domain labels.

Domain label: A domain label is a string bounded by period(s) “.”.

Ex: In the domain name nic.lk, “nic” is a domain label.

3.2. IDN Policy

3.2.1.       Registrants can request IDN domains under .ලංකා and .இலங்கை at the LK Domain Registry.

3.2.2.      The requested domain name should consist only of the characters given in the relevant IDN language table. The language tables for .ලංකා and .இலங்கை are attached in Appendix-A.

E.g.: If you are registering a .ලංකා domain name, you should use only characters that are included in the .ලංකා language table.

3.2.3.      The domain string must contain at least two letters.

3.2.4.      The string should consist of valid Unicode code points and should comply with the linguistic rules of the respective language. However, it does not need to comply with the spelling rules.

E.g.:

ෙඅ is not a valid string.

නය and ණය are both valid strings, and they are considered two different strings.

3.2.5.      The string should not contain the pattern “xn--”.
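
The checks in 3.2.3 and 3.2.5 could be sketched as follows. This is only an illustration under two assumptions: that "letters" in 3.2.3 means code points whose Unicode general category is a letter, and that the pattern in 3.2.5 is the "xn--" prefix used by Punycode-encoded labels. The function names are hypothetical and are not part of the registry's systems.

```python
import unicodedata


def count_letters(label):
    """Count code points whose Unicode general category is a letter (L*)."""
    return sum(1 for ch in label if unicodedata.category(ch).startswith("L"))


def passes_basic_checks(label):
    """Apply rules 3.2.3 and 3.2.5 to a single domain label."""
    # 3.2.3: at least two letters (assumption: Unicode letter code points,
    # so vowel signs and other combining marks are not counted).
    if count_letters(label) < 2:
        return False
    # 3.2.5: the label must not contain the pattern "xn--".
    if "xn--" in label.lower():
        return False
    return True


print(passes_basic_checks("\u0db1\u0dba"))  # Sinhala label with two letters
print(passes_basic_checks("xn--abc"))       # contains the reserved pattern
```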

3.2.6.      If you request a domain name containing zwj (U+200D), we register two domain names as a bundle: the domain name that contains the zwj and the same domain name without the zwj.

  • Even if we register two domain names the domain registration charges will not be affected.

E.g.:

If you request a .ලංකා domain containing ් + ර or ් + ය, we register an extra domain name containing the rakaransaya (්‍ර) or yansaya (්‍ය) along with the requested domain.

E.g.: If you request සත්‍ය.ලංකා, we register සත්ය.ලංකා as well.

සත්‍ය.ලංකා        සත්ය.ලංකා
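
The bundling described in 3.2.6 can be illustrated with a short sketch. It assumes the bundle consists of exactly two forms: one with every zwj removed, and one with a zwj inserted between al-lakuna and each following ර or ය. The function name bundle_labels is hypothetical.

```python
ZWJ = "\u200d"        # zero width joiner
AL_LAKUNA = "\u0dca"  # Sinhala sign al-lakuna
YA = "\u0dba"         # Sinhala letter yayanna
RA = "\u0dbb"         # Sinhala letter rayanna


def bundle_labels(label):
    """Return the set of label forms registered together under 3.2.6."""
    # Form without any zwj.
    without_zwj = label.replace(ZWJ, "")
    # Form with a zwj between al-lakuna and a following RA or YA
    # (rakaransaya and yansaya).
    with_zwj = without_zwj
    for base in (RA, YA):
        with_zwj = with_zwj.replace(AL_LAKUNA + base, AL_LAKUNA + ZWJ + base)
    return {without_zwj, with_zwj}


# Either spelling of the requested label yields the same bundle.
requested = "\u0dc3\u0dad\u0dca\u200d\u0dba"
for form in sorted(bundle_labels(requested)):
    print([f"U+{ord(ch):04X}" for ch in form])
```

A label with no al-lakuna + ර or al-lakuna + ය sequence yields a single form, so no extra name would be registered for it.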

4. IDN Label Rules

This set of rules (Appendix-B) guides you in creating a valid domain name for .ලංකා and .இலங்கை domains.

5. Appendix

5.1. Appendix-A

5.1.1. Permitted String Table for .ලංකා domains

The following table specifies the IDN (Internationalized Domain Names) Language Table used by the LK Domain Registry for the registration of Sinhala language domain labels in the .lk and .ලංකා domains. It is based on the recommendation of the ICTA IDN working group.

Other restrictions on the allowable character sequences exist, which are not documented in this table.

Latin
U+002D HYPHEN-MINUS
U+0030..U+0039 DIGIT ZERO – DIGIT NINE 0-9
U+0061..U+007A LATIN SMALL LETTER A – LATIN SMALL LETTER Z a-z

Sinhala
U+0D82 Sinhala sign anusvaraya (ං)
U+0D83 Sinhala sign visargaya (ඃ)
U+0D85 Sinhala letter ayanna (අ)
U+0D86 Sinhala letter aayanna (ආ)
U+0D87 Sinhala letter aeyanna (ඇ)
U+0D88 Sinhala letter aeeyanna (ඈ)
U+0D89 Sinhala letter iyanna (ඉ)
U+0D8A Sinhala letter iiyanna (ඊ)
U+0D8B Sinhala letter uyanna (උ)
U+0D8C Sinhala letter uuyanna (ඌ)
U+0D8D Sinhala letter iruyanna (ඍ)
U+0D8E Sinhala letter iruuyanna (ඎ)
U+0D91 Sinhala letter eyanna (එ)
U+0D92 Sinhala letter eeyanna (ඒ)
U+0D93 Sinhala letter aiyanna (ඓ)
U+0D94 Sinhala letter oyanna (ඔ)
U+0D95 Sinhala letter ooyanna (ඕ)
U+0D96 Sinhala letter auyanna (ඖ)
U+0D9A Sinhala letter alpapraana kayanna (ක)
U+0D9B Sinhala letter mahaapraana kayanna (ඛ)
U+0D9C Sinhala letter alpapraana gayanna (ග)
U+0D9D Sinhala letter mahaapraana gayanna (ඝ)
U+0D9E Sinhala letter kantaja naasikyaya (ඞ)
U+0D9F Sinhala letter sanyaka gayanna (ඟ)
U+0DA0 Sinhala letter alpapraana cayanna (ච)
U+0DA1 Sinhala letter mahaapraana cayanna (ඡ)
U+0DA2 Sinhala letter alpapraana jayanna (ජ)
U+0DA3 Sinhala letter mahaapraana jayanna (ඣ)
U+0DA4 Sinhala letter taaluja naasikyaya (ඤ)
U+0DA5 Sinhala letter taaluja sanyooga naaksikyaya (ඥ)
U+0DA7 Sinhala letter alpapraana ttayanna (ට)
U+0DA8 Sinhala letter mahaapraana ttayanna (ඨ)
U+0DA9 Sinhala letter alpapraana ddayanna (ඩ)
U+0DAA Sinhala letter mahaapraana ddayanna (ඪ)
U+0DAB Sinhala letter muurdhaja nayanna (ණ)
U+0DAD Sinhala letter alpapraana tayanna (ත)
U+0DAE Sinhala letter mahaapraana tayanna (ථ)
U+0DAF Sinhala letter alpapraana dayanna (ද)
U+0DB0 Sinhala letter mahaapraana dayanna (ධ)
U+0DB1 Sinhala letter dantaja nayanna (න)
U+0DB3 Sinhala letter sanyaka dayanna (ඳ)
U+0DB4 Sinhala letter alpapraana payanna (ප)
U+0DB5 Sinhala letter mahaapraana payanna (ඵ)
U+0DB6 Sinhala letter alpapraana bayanna (බ)
U+0DB7 Sinhala letter mahaapraana bayanna (භ)
U+0DB8 Sinhala letter mayanna (ම)
U+0DB9 Sinhala letter amba bayanna (ඹ)
U+0DBA Sinhala letter yayanna (ය)
U+0DBB Sinhala letter rayanna (ර)
U+0DBD Sinhala letter dantaja layanna (ල)
U+0DC1 Sinhala letter taaluja sayanna (ශ)
U+0DC2 Sinhala letter muurdhaja sayanna (ෂ)
U+0DC3 Sinhala letter dantaja sayanna (ස)
U+0DC4 Sinhala letter hayanna (හ)
U+0DC5 Sinhala letter muurdhaja layanna (ළ)
U+0DC6 Sinhala letter fayanna (ෆ)
U+0DCA Sinhala sign al-lakuna (්)
U+0DCF Sinhala vowel sign aela-pilla (ා)
U+0DD0 Sinhala vowel sign ketti aeda-pilla (ැ)
U+0DD1 Sinhala vowel sign diga aeda-pilla (ෑ)
U+0DD2 Sinhala vowel sign ketti is-pilla (ි)
U+0DD3 Sinhala vowel sign diga is-pilla (ී)
U+0DD4 Sinhala vowel sign ketti paa-pilla (ු)
U+0DD6 Sinhala vowel sign diga paa-pilla (ූ)
U+0DD8 Sinhala vowel sign gaetta-pilla (ෘ)
U+0DD9 Sinhala vowel sign kombuva (ෙ)
U+0DDA Sinhala vowel sign diga kombuva (ේ)
U+0DDB Sinhala vowel sign kombu deka (ෛ)
U+0DDC Sinhala vowel sign kombuva haa aela-pilla (ො)
U+0DDD Sinhala vowel sign kombuva haa diga aela-pilla (ෝ)
U+0DDE Sinhala vowel sign kombuva haa gayanukitta (ෞ)
U+0DF2 Sinhala vowel sign diga gaetta-pilla (ෲ)
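
As an illustration of how the table above might be applied, the sketch below encodes its code points as inclusive ranges and reports any character of a label that falls outside them. The ranges are transcribed from the table as listed here. The names SINHALA_TABLE_RANGES and disallowed_code_points are hypothetical. Note that zwj (U+200D) is not listed in the table, so this sketch reports it. The sequence restrictions of Appendix-B are not checked.

```python
# Inclusive code point ranges transcribed from the Latin and Sinhala
# sections of the permitted string table.
SINHALA_TABLE_RANGES = [
    (0x002D, 0x002D), (0x0030, 0x0039), (0x0061, 0x007A),
    (0x0D82, 0x0D83), (0x0D85, 0x0D8E), (0x0D91, 0x0D96),
    (0x0D9A, 0x0DA5), (0x0DA7, 0x0DAB), (0x0DAD, 0x0DB1),
    (0x0DB3, 0x0DBB), (0x0DBD, 0x0DBD), (0x0DC1, 0x0DC6),
    (0x0DCA, 0x0DCA), (0x0DCF, 0x0DD4), (0x0DD6, 0x0DD6),
    (0x0DD8, 0x0DDE), (0x0DF2, 0x0DF2),
]


def in_table(ch):
    """True if the character's code point lies in one of the table ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in SINHALA_TABLE_RANGES)


def disallowed_code_points(label):
    """List the code points of a label that the table does not permit."""
    return [f"U+{ord(ch):04X}" for ch in label if not in_table(ch)]


# An empty list means every character of the label is in the table.
print(disallowed_code_points("\u0dbd\u0d82\u0d9a\u0dcf"))
print(disallowed_code_points("A\u0da6"))
```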


5.1.2. Permitted String Table for .இலங்கை domains

This document specifies the IDN (Internationalized Domain Names) Language Table used by the LK Domain Registry for the registration of Tamil language labels in the .lk and .இலங்கை domains. These are based on the recommendation of the ICTA IDN working group.

Latin
U+002D HYPHEN-MINUS
U+0030..U+0039 DIGIT ZERO – DIGIT NINE 0-9
U+0061..U+007A LATIN SMALL LETTER A – LATIN SMALL LETTER Z a-z

Tamil
U+0B83 TAMIL SIGN VISARGA = aytham
U+0B85 TAMIL LETTER A
U+0B86 TAMIL LETTER AA
U+0B87 TAMIL LETTER I
U+0B88 TAMIL LETTER II
U+0B89 TAMIL LETTER U
U+0B8A TAMIL LETTER UU
U+0B8E TAMIL LETTER E
U+0B8F TAMIL LETTER EE
U+0B90 TAMIL LETTER AI
U+0B92 TAMIL LETTER O
U+0B93 TAMIL LETTER OO
U+0B94 TAMIL LETTER AU
U+0B95 TAMIL LETTER KA
U+0B99 TAMIL LETTER NGA
U+0B9A TAMIL LETTER CA
U+0B9C TAMIL LETTER JA
U+0B9E TAMIL LETTER NYA
U+0B9F TAMIL LETTER TTA
U+0BA3 TAMIL LETTER NNA
U+0BA4 TAMIL LETTER TA
U+0BA8 TAMIL LETTER NA
U+0BA9 TAMIL LETTER NNNA
U+0BAA TAMIL LETTER PA
U+0BAE TAMIL LETTER MA
U+0BAF TAMIL LETTER YA
U+0BB0 TAMIL LETTER RA
U+0BB1 TAMIL LETTER RRA
U+0BB2 TAMIL LETTER LA
U+0BB3 TAMIL LETTER LLA
U+0BB4 TAMIL LETTER LLLA
U+0BB5 TAMIL LETTER VA
U+0BB6 TAMIL LETTER SHA
U+0BB7 TAMIL LETTER SSA
U+0BB8 TAMIL LETTER SA
U+0BB9 TAMIL LETTER HA
U+0BBE TAMIL VOWEL SIGN AA
U+0BBF TAMIL VOWEL SIGN I ி
U+0BC0 TAMIL VOWEL SIGN II
U+0BC1 TAMIL VOWEL SIGN U
U+0BC2 TAMIL VOWEL SIGN UU
U+0BC6 TAMIL VOWEL SIGN E
U+0BC7 TAMIL VOWEL SIGN EE
U+0BC8 TAMIL VOWEL SIGN AI
U+0BCA TAMIL VOWEL SIGN O
U+0BCB TAMIL VOWEL SIGN OO
U+0BCC TAMIL VOWEL SIGN AU
U+0BCD TAMIL SIGN VIRAMA

5.2. Appendix-B

5.2.1. IDN Label Rules for .ලංකා domains

  • IDN rules for Indic scripts are based on strings rather than individual Unicode characters, as Indic letters (akshara) are represented by strings of Unicode characters.
  • We define the sets (consonants, vowels, modifiers, semi consonants, zwj etc.) into which we group the letters.

SinhalaVowel = [

Sinhala_Letter_A = U+0D85 # (අ)

Sinhala_Letter_AA = U+0D86 # (ආ)

Sinhala_Letter_AE = U+0D87 # (ඇ)

Sinhala_Letter_AEE = U+0D88 # (ඈ)

Sinhala_Letter_I = U+0D89 # (ඉ)

Sinhala_Letter_II = U+0D8A # (ඊ)

Sinhala_Letter_U = U+0D8B # (උ)

Sinhala_Letter_UU = U+0D8C # (ඌ)

Sinhala_Letter_vR = U+0D8D # (ඍ)

Sinhala_Letter_vRR = U+0D8E # (ඎ)

Sinhala_Letter_E = U+0D91 # (එ)

Sinhala_Letter_EE = U+0D92 # (ඒ)

Sinhala_Letter_AI = U+0D93 # (ඓ)

Sinhala_Letter_O = U+0D94 # (ඔ)

Sinhala_Letter_OO = U+0D95 # (ඕ)

Sinhala_Letter_AU = U+0D96 # (ඖ)

]

SinhalaConsonant = [

Sinhala_Letter_KHA = U+0D9A # (ක)

Sinhala_Letter_GA = U+0D9B # (ඛ)

Sinhala_Letter_GHA = U+0D9C # (ග)

Sinhala_Letter_NGA = U+0D9D # (ඝ)

Sinhala_Letter_NGGA = U+0D9E # (ඞ)

Sinhala_Letter_CA = U+0D9F # (ඟ)

Sinhala_Letter_CHA = U+0DA0 # (ච)

Sinhala_Letter_JA = U+0DA1 # (ඡ)

Sinhala_Letter_JHA = U+0DA2 # (ජ)

Sinhala_Letter_NYA = U+0DA3 # (ඣ)

Sinhala_Letter_JNYA = U+0DA4 # (ඤ)

Sinhala_Letter_NYJA = U+0DA5 # (ඥ)

Sinhala_Letter_NYJA = U+0DA6 # (ඦ)

Sinhala_Letter_TTA = U+0DA7 # (ට)

Sinhala_Letter_TTHA = U+0DA8 # (ඨ)

Sinhala_Letter_DDA = U+0DA9 # (ඩ)

Sinhala_Letter_DDHA = U+0DAA # (ඪ)

Sinhala_Letter_NNA = U+0DAB # (ණ)

Sinhala_Letter_NNDDA = U+0DAC # (ඬ)

Sinhala_Letter_TA = U+0DAD # (ත)

Sinhala_Letter_THA = U+0DAE # (ථ)

Sinhala_Letter_DA = U+0DAF # (ද)

Sinhala_Letter_DHA = U+0DB0 # (ධ)

Sinhala_Letter_NA = U+0DB1 # (න)

Sinhala_Letter_NDA = U+0DB3 # (ඳ)

Sinhala_Letter_PA = U+0DB4 # (ප)

Sinhala_Letter_PHA = U+0DB5 # (ඵ)

Sinhala_Letter_BA = U+0DB6 # (බ)

Sinhala_Letter_BHA = U+0DB7 # (භ)

Sinhala_Letter_MA = U+0DB8 # (ම)

Sinhala_Letter_MBA = U+0DB9 # (ඹ)

Sinhala_Letter_YA = U+0DBA # (ය)

Sinhala_Letter_RA = U+0DBB # (ර)

Sinhala_Letter_LA = U+0DBD # (ල)

Sinhala_Letter_VA = U+0DC0 # (ව)

Sinhala_Letter_SHA = U+0DC1 # (ශ)

Sinhala_Letter_SSA = U+0DC2 # (ෂ)

Sinhala_Letter_SA = U+0DC3 # (ස)

Sinhala_Letter_HA = U+0DC4 # (හ)

Sinhala_Letter_LLA = U+0DC5 # (ළ)

Sinhala_Letter_FA = U+0DC6 # (ෆ)

]

SinhalaModifiers = [

Sinhala_Vowel_Sign_AA = U+0DCF # (ා)

Sinhala_Vowel_Sign_AE = U+0DD0 # (ැ)

Sinhala_Vowel_Sign_AEE = U+0DD1 # (ෑ)

Sinhala_Vowel_Sign_I = U+0DD2 # (ි)

Sinhala_Vowel_Sign_II = U+0DD3 # (ී)

Sinhala_Vowel_Sign_U = U+0DD4 # (ු)

Sinhala_Vowel_Sign_UU = U+0DD6 # (ූ)

Sinhala_Vowel_Sign_VR = U+0DD8 # (ෘ)

Sinhala_Vowel_Sign_VRR = U+0DF2 # (ෲ)

Sinhala_Vowel_Sign_E = U+0DD9 # (ෙ)

Sinhala_Vowel_Sign_EE = U+0DDA # (ේ)

Sinhala_Vowel_Sign_AI = U+0DDB # (ෛ)

Sinhala_Vowel_Sign_VI = U+0DDF # (ෟ)

Sinhala_Vowel_Sign_O = U+0DDC # (ො)

Sinhala_Vowel_Sign_OO = U+0DDD # (ෝ)

Sinhala_Vowel_Sign_AU = U+0DDE # (ෞ)

Sinhala_Sign_ALLAKUNA = U+0DCA # (්)

]

SinhalaSemiConsonants = [

Sinhala_Sign_Anusvaraya = U+0D82 # (ං)

Sinhala_Sign_Visargaya = U+0D83 # (ඃ)

]

ZWJ = [

ZWJ = U+200D # (zwj)

]

English_Letters=[A-Z or a-z]

Digits=[0 to 9]

  • Rules

# Rules have the following format:

# <sequence>:<result>

# Key:

# <sequence> is the sequence of characters starting from the current position in the label where each element is either a named character or a member of a character set defined above.

# <result>  is either “fail” or “next”

# Logically, a label is processed by iterating through its character positions

# In each iteration, each rule is checked with the substring starting from the current character position.

# If the current substring matches then the result is applied as follows:

#   fail: stop, the label is invalid

#   next: move to the next character position

# If the processing reaches the end of the string, then the label is valid.

#

# Variants:

# A variant is defined by a rule of the form

# <sequence1> | <sequence2> : variant

# If the current substring matches either <sequence1> or <sequence2>, then note that

#    the label contains a variant, and then move to the next character position.

# The rules are defined as follows.

1. The first letter can be a vowel, a consonant, a digit or an English letter

Ex: Sinhala_Letter_A(0D85). . . . Sinhala_Letter_AU(0D96)

Sinhala_Letter_KHA(0D9A) … Sinhala_Letter_FA(0DC6)

English_Letter (A – Z or a – z)

Digits (0 to 9)

2. A vowel can be followed by another vowel, a consonant, a semi consonant, an English letter or a digit

Ex:

Sinhala_Letter_A  Sinhala_Letter_AA (අආ)

Sinhala_Letter_I Sinhala_Letter_RA (ඉර)

Sinhala_Letter_A Sinhala_Sign_Anusvaraya(අං),
Sinhala_Letter_A Sinhala_Sign_Visargaya(අඃ)

Sinhala_Letter_A English_Letter_C (අc)

Sinhala_Letter_A 1 (අ1)

3. A consonant can be followed by another consonant, a modifier, a vowel, al-lakuna, a semi consonant, a digit or an English letter

Ex:

Sinhala_Letter_GHA Sinhala_Letter_MA (ගම)
Sinhala_Letter_GHA Sinhala_Vowel_Sign_AA Sinhala_Letter_LA Sinhala_Sign_ALLAKUNA Sinhala_Letter_LA (ගාල්ල)

Sinhala_Letter_KHA Sinhala_Sign_ALLAKUNA Sinhala_Letter_LA Sinhala_Vowel_Sign_I Sinhala_Letter_FA Sinhala_Letter_DDA Sinhala_Sign_ALLAKUNA (ක්ලිෆඩ්)

Sinhala_Letter_NA Sinhala_Sign_Anusvaraya Sinhala_Letter_GHA Sinhala_Vowel_Sign_II (නංගී)

Sinhala_Letter_GHA 1 (ග1)
Sinhala_Letter_GHA m(ගm)

4. A digit or an English letter can be followed by a vowel, a consonant, a digit or an English letter

Ex:

English_Letter_A Sinhala_Letter_I Sinhala_Letter_RA (Aඉර)

English_Letter_B Sinhala_Letter_GHA Sinhala_Letter_MA (Bගම)

English_Letter_A English_Letter_B (AB)

5. A semi consonant can be followed by a vowel, a consonant, a digit or an English letter

Ex:

Sinhala_Letter_A Sinhala_Sign_Visargaya Sinhala_Letter_RA(අඃර)

Sinhala_Letter_KHA Sinhala_Sign_Anusvaraya Sinhala_Letter_vR(කංඍ)

Sinhala_Letter_KHA Sinhala_Sign_Anusvaraya 1 (කං1)
Sinhala_Letter_KHA Sinhala_Sign_Anusvaraya English_Letter_a (කංa)

6. A modifier can be followed by a semi consonant, a vowel, a consonant, a digit or an English letter

Ex:

Sinhala_Letter_KHA Sinhala_Vowel_Sign_II Sinhala_Sign_Anusvaraya (කීං)

Sinhala_Letter_NA Sinhala_Vowel_Sign_AA Sinhala_Letter_U Sinhala_Letter_LA (නාඋල)

Sinhala_Letter_NA Sinhala_Vowel_Sign_AA Sinhala_Letter_GHA Sinhala_Letter_SA (නාගස)

Sinhala_Letter_NA Sinhala_Vowel_Sign_AA 1 (නා1)

7. Sinhala_Sign_ALLAKUNA can be followed by a vowel, a consonant, zwj, a digit or an English letter

Ex:

Sinhala_Letter_GHA Sinhala_Letter_LA Sinhala_Sign_ALLAKUNA Sinhala_Letter_A Sinhala_Letter_MA Sinhala_Vowel_Sign_U Sinhala_Letter_NNA(ගල්අමුණ)

Sinhala_Letter_A Sinhala_Letter_TA Sinhala_Sign_ALLAKUNA Sinhala_Letter_LA(අත්ල)

Sinhala_Letter_KHA Sinhala_Sign_ALLAKUNA 200D Sinhala_Letter_RA(ක + ් + zwj + ර) = ක්‍ර

Sinhala_Letter_BA Sinhala_Letter_SA Sinhala_Sign_ALLAKUNA 1 (බස්1)

8. After a zwj, Sinhala_Letter_YA or Sinhala_Letter_RA can follow.

Ex:

ක්‍ර = Sinhala_Letter_KHA Sinhala_Sign_ALLAKUNA zwj(200D) Sinhala_Letter_RA (ක + ් + zwj + ර)

ක්‍ය = Sinhala_Letter_KHA Sinhala_Sign_ALLAKUNA zwj(200D) Sinhala_Letter_YA (ක + ් + zwj + ය)
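
Rules 1 to 8 can be read as a table of which character class may come directly after which. The sketch below is one possible reading, not the registry's implementation. It assumes that al-lakuna forms its own class separate from the other modifiers, that digits and English letters share one class, and that hyphen-minus is rejected because the eight rules do not mention it. The names classify, FOLLOWERS and is_valid_sinhala_label are hypothetical.

```python
def classify(ch):
    """Map a character to its class from the sets defined in 5.2.1."""
    cp = ord(ch)
    if 0x0D85 <= cp <= 0x0D8E or 0x0D91 <= cp <= 0x0D96:
        return "vowel"
    if (0x0D9A <= cp <= 0x0DB1 or 0x0DB3 <= cp <= 0x0DBB
            or cp == 0x0DBD or 0x0DC0 <= cp <= 0x0DC6):
        return "consonant"
    if cp == 0x0DCA:
        return "allakuna"
    if (0x0DCF <= cp <= 0x0DD4 or cp == 0x0DD6
            or 0x0DD8 <= cp <= 0x0DDF or cp == 0x0DF2):
        return "modifier"
    if cp in (0x0D82, 0x0D83):
        return "semi"
    if cp == 0x200D:
        return "zwj"
    if ch.isascii() and ch.isalnum():
        return "ascii"  # digit or English letter
    return None


# Classes allowed directly after each class (rules 2 to 8).
FOLLOWERS = {
    "vowel": {"vowel", "consonant", "semi", "ascii"},
    "consonant": {"consonant", "modifier", "vowel", "allakuna", "semi", "ascii"},
    "ascii": {"vowel", "consonant", "ascii"},
    "semi": {"vowel", "consonant", "ascii"},
    "modifier": {"semi", "vowel", "consonant", "ascii"},
    "allakuna": {"vowel", "consonant", "zwj", "ascii"},
    "zwj": {"consonant"},  # narrowed to YA and RA below (rule 8)
}

YA_RA = ("\u0dba", "\u0dbb")


def is_valid_sinhala_label(label):
    """Check a label against rules 1 to 8 of 5.2.1."""
    if not label:
        return False
    prev = classify(label[0])
    # Rule 1: first character is a vowel, consonant, digit or English letter.
    if prev not in ("vowel", "consonant", "ascii"):
        return False
    for ch in label[1:]:
        cur = classify(ch)
        if cur is None or cur not in FOLLOWERS[prev]:
            return False
        if prev == "zwj" and ch not in YA_RA:
            return False
        prev = cur
    return True


print(is_valid_sinhala_label("\u0d9c\u0dcf\u0dbd\u0dca\u0dbd"))  # example from rule 3
print(is_valid_sinhala_label("\u0dd9\u0d85"))  # vowel sign first, see 3.2.4
```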

5.2.2. IDN Label Rules for .இலங்கை domains

  • IDN rules for Indic scripts are based on strings rather than individual Unicode characters, as Indic letters (akshara) are represented by strings of Unicode characters.
  • We define the sets (consonants, vowels, vowel signs, etc.) into which we group the letters.

TamilVowel = [

Tamil_Letter_A

Tamil_Letter_AA

Tamil_Letter_I

Tamil_Letter_II

Tamil_Letter_U

Tamil_Letter_UU

Tamil_Letter_E

Tamil_Letter_EE

Tamil_Letter_AI

Tamil_Letter_O

Tamil_Letter_OO

Tamil_Letter_AU

]

TamilConsonant = [

Tamil_Letter_KA

Tamil_Letter_NGA

Tamil_Letter_CA

Tamil_Letter_JA

Tamil_Letter_NYA

Tamil_Letter_TTA

Tamil_Letter_NNA

Tamil_Letter_TA

Tamil_Letter_NA

Tamil_Letter_NNNA

Tamil_Letter_PA

Tamil_Letter_MA

Tamil_Letter_YA

Tamil_Letter_RA

Tamil_Letter_RRA

Tamil_Letter_LA

Tamil_Letter_LLA

Tamil_Letter_LLLA

Tamil_Letter_VA

Tamil_Letter_SHA

Tamil_Letter_SSA

Tamil_Letter_SA

Tamil_Letter_HA

]

TamilVowelSign = [

Tamil_Vowel_Sign_AA

Tamil_Vowel_Sign_I

Tamil_Vowel_Sign_II

Tamil_Vowel_Sign_U

Tamil_Vowel_Sign_UU

Tamil_Vowel_Sign_E

Tamil_Vowel_Sign_EE

Tamil_Vowel_Sign_AI

Tamil_Vowel_Sign_O

Tamil_Vowel_Sign_OO

Tamil_Vowel_Sign_AU

]

TAMIL SIGN VISARGA – Aytham

ASCIIDigit = [0-9]

  • Rules

# Rules have the following format:

# <sequence> : <result>

# Key:

# <sequence> is the sequence of characters starting from the current position in the label

#    where each element is either a named character or a member of a character set defined above.

# <result> is either “fail” or “next”

# Logically, a label is processed by iterating through its character positions

# In each iteration, each rule is checked with the substring starting from the current character position.

# If the current substring matches then the result is applied as follows:

#   fail: stop, the label is invalid

#   next: move to the next character position after the end of the matched string

# If the processing reaches the end of the string, then the label is valid.

# Variants:

# A variant is defined by a rule of the form

# <sequence1> | <sequence2> : variant

# If the current substring matches either <sequence1> or <sequence2>, then note that

#    the label contains a variant, and then move to the next character position.

# we now define each of the special cases, and finally the general rules.

# allow ik + ssa as either a single glyph (க்ஷ) or separate glyphs (க்‌ஷ)

# these are variants of each other

# NOTE GD-20100416: we could also allow just one form, and make the other invalid

# This is the only place where ZWNJ is valid

Tamil_Letter_KA Tamil_Sign_Pulli Tamil_Letter_SSA | Tamil_Letter_KA Tamil_Sign_Pulli ZWNJ Tamil_Letter_SSA : variant

# the ZWNJ is not valid anywhere else except in the sequence2 above

ZWNJ : fail

# disallow old form of Shri (ஸ+்+ர+ீ)

Tamil_Letter_SA Tamil_Sign_Pulli Tamil_Letter_RA Tamil_Vowel_Sign_II : fail

# Note: the valid representation of Shri is

# Tamil_Letter_SHA Tamil_Sign_Pulli Tamil_Letter_RA Tamil_Vowel_Sign_II (ஶ+்+ர+ீ)

# we don’t need a special rule for this

# disallow a LLA after a consonant with a Kombu (e.g. கெ ள) unless it is modified by a vowel sign or Pulli

#    to avoid confusion with TamilConsonant+Vowel Sign AU

# It is presumed that this sequence will never occur in a valid word

# the kombu should be preceded by a consonant

TamilConsonant Tamil_Vowel_Sign_E Tamil_Letter_LLA TamilVowelSign : next

TamilConsonant Tamil_Vowel_Sign_E Tamil_Letter_LLA Tamil_Sign_Pulli : next

TamilConsonant Tamil_Vowel_Sign_E Tamil_Letter_LLA : fail

# disallow a LLA after Letter O (ஒ ள) unless it is modified by a vowel sign or Pulli

#   to avoid confusion with Letter AU (ஔ)

# again, we assume that this sequence will never occur in a valid word

Tamil_Letter_O Tamil_Letter_LLA TamilVowelSign : next

Tamil_Letter_O Tamil_Letter_LLA Tamil_Sign_Pulli : next

Tamil_Letter_O Tamil_Letter_LLA : fail

# General Rules

# a vowel sign or a pulli (virama) can only follow a consonant and is not valid elsewhere

TamilConsonant TamilVowelSign : next

TamilConsonant Tamil_Sign_Pulli  : next

TamilVowelSign : fail

Tamil_Sign_Pulli : fail

# allow consonants, vowels, Aytham, European numerals anywhere (unless disallowed by previous rules)

TamilConsonant : next

TamilVowel : next

Tamil_Sign_Aytham : next

ASCIIDigit : next

Hyphen-Minus : next

# IDN rules, which are not implemented in this table, restrict the placement of hyphen-minus

# anything else is invalid

: fail
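
The rule list above can be sketched as a small ordered matcher. This is an illustration only. It assumes the first matching rule wins, that "next" and "variant" both advance past the whole matched sequence, and that the code points are those given in the Tamil table of Appendix-A (with Pulli as U+0BCD, Aytham as U+0B83 and ZWNJ as U+200C). The names cps, one, RULES and check_tamil_label are hypothetical. Hyphen placement is not restricted here, as noted in the rules.

```python
def cps(*spans):
    """Build a character set from inclusive (first, last) code point spans."""
    return frozenset(chr(c) for lo, hi in spans for c in range(lo, hi + 1))


def one(cp):
    """Character set holding a single code point."""
    return frozenset(chr(cp))


VOWEL = cps((0x0B85, 0x0B8A), (0x0B8E, 0x0B90), (0x0B92, 0x0B94))
CONS = cps((0x0B95, 0x0B95), (0x0B99, 0x0B9A), (0x0B9C, 0x0B9C),
           (0x0B9E, 0x0B9F), (0x0BA3, 0x0BA4), (0x0BA8, 0x0BAA),
           (0x0BAE, 0x0BB9))
SIGN = cps((0x0BBE, 0x0BC2), (0x0BC6, 0x0BC8), (0x0BCA, 0x0BCC))
DIGIT = frozenset("0123456789")
KA, SSA, SA, RA = one(0x0B95), one(0x0BB7), one(0x0BB8), one(0x0BB0)
LLA, O, E_SIGN, II_SIGN = one(0x0BB3), one(0x0B92), one(0x0BC6), one(0x0BC0)
PULLI, AYTHAM, ZWNJ, HYPHEN = one(0x0BCD), one(0x0B83), one(0x200C), one(0x2D)

# Rules in the order they appear in 5.2.2: (sequence, result).
RULES = [
    ([KA, PULLI, SSA], "variant"),
    ([KA, PULLI, ZWNJ, SSA], "variant"),
    ([ZWNJ], "fail"),
    ([SA, PULLI, RA, II_SIGN], "fail"),
    ([CONS, E_SIGN, LLA, SIGN], "next"),
    ([CONS, E_SIGN, LLA, PULLI], "next"),
    ([CONS, E_SIGN, LLA], "fail"),
    ([O, LLA, SIGN], "next"),
    ([O, LLA, PULLI], "next"),
    ([O, LLA], "fail"),
    ([CONS, SIGN], "next"),
    ([CONS, PULLI], "next"),
    ([SIGN], "fail"),
    ([PULLI], "fail"),
    ([CONS], "next"), ([VOWEL], "next"), ([AYTHAM], "next"),
    ([DIGIT], "next"), ([HYPHEN], "next"),
]


def check_tamil_label(label):
    """Return (is_valid, has_variant) for a label under the rules above."""
    pos, has_variant = 0, False
    while pos < len(label):
        for seq, result in RULES:
            window = label[pos:pos + len(seq)]
            if len(window) == len(seq) and all(
                    ch in allowed for ch, allowed in zip(window, seq)):
                break
        else:
            return False, has_variant  # final catch-all rule ": fail"
        if result == "fail":
            return False, has_variant
        if result == "variant":
            has_variant = True
        pos += len(seq)  # assumption: advance past the matched sequence
    return True, has_variant


print(check_tamil_label("\u0b87\u0bb2\u0b99\u0bcd\u0b95\u0bc8"))
print(check_tamil_label("\u0bb8\u0bcd\u0bb0\u0bc0"))  # old form of Shri
```

Because the sets in 5.2.2 contain no Latin letters, this reading rejects them through the final catch-all rule, even though the Tamil table in Appendix-A lists a-z.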