In this unit we will learn about the history of Tibetan writing, its relation to the spoken language, its scripts, its romanization systems, and some of its other features. We will also learn about a few different kinds of Tibetan literature.
The invention of Tibetan writing is traditionally attributed to Tʰönmi Sambhoṭa during the reign of Songtsän Gampo in the mid-7th century. The history of Tibetan writing can generally be split into three periods: Old Tibetan, Classical Tibetan, and Modern Tibetan.
The term Old Tibetan is used to describe the language of Tibet from the 7th century up until the 11th or early 12th century. It includes the earliest Tibetan writings from the time of the Tibetan Empire (c. 7th – 9th centuries), as well as later writings from the Age of Fragmentation (སིལ་བུའི་དུས་ silbü: tʰü̲) that followed the collapse of the empire. There are two types of Tibetan writing that survive from this period: inscriptions (carved writings) and manuscripts (handwritten texts).
Many Old Tibetan inscriptions survive on steles, tablets, rocks, bells, and walls; they discuss mainly imperial and political matters, as well as early Tibetan Buddhism. Some of these inscriptions are documented in the book Old Tibetan Inscriptions (edited by Iwao, Hill, and Takeuchi; 2009). A few Old Tibetan inscriptions also survive on precious metal artefacts, some of which are discussed in Amy Heller’s article Tibetan Inscriptions on Ancient Silver and Gold Vessels and Artefacts (2013).
Old Tibetan manuscripts are generally divided into two sets: those from the time of the Tibetan Empire, and those from the Age of Fragmentation. Old Tibetan manuscripts from the time of the Tibetan Empire have been recovered from the forts of Miran and Mazar Tagh in the Taklamakan Desert on the Silk Road. Many of these manuscripts have been catalogued and made available online on the website Old Tibetan Documents Online. The International Dunhuang Project hosts a collection of virtual exhibits on the history and culture of the Silk Road, as well as site profiles on Miran and Mazar Tagh that contain photos, architectural maps, and a catalogue of texts and artefacts found at each site. (Note: site profiles may take a few minutes to load.)
Old Tibetan manuscripts from the Age of Fragmentation have been recovered most notably from the Mogao caves near the oasis town of Dunhuang, at the eastern entrance to the Taklamakan. These manuscripts have been catalogued and made available online through the International Dunhuang Project.
Codicology is the study of manuscripts and the materials used to make them. One of the most prominent scholars in Tibetan codicology is Agnieszka Helman-Ważny. Many of her papers are available with no paywall on academia.edu if you create a free account. A good introduction to the codicology of Tibetan manuscripts is her article A Note on the Manuscript Culture of Tibet (2019).
After the collapse of the Tibetan Empire, Old Tibetan continued to be used as an international lingua franca on the Silk Road until the 11th century. Travelers and merchants on the Silk Road used bilingual phrasebooks written in Tibetan, Chinese, Sanskrit, or Khotanese to help them navigate the many languages of the Taklamakan desert.
The term “Classical Tibetan” or “Middle Tibetan” is used to describe the language of Tibetan writings beginning in the 11th or 12th century. These writings show a variety of spelling and grammar changes from Old Tibetan, including the development of a clear distinction between aspirated and unaspirated consonants (letters like ཅ་ and ཆ་ were interchangeable in Old Tibetan) and the loss of the post-suffix ད་.
This period is also marked by the rise of woodblock book printing, or xylography, which spread slowly at first around the periphery of Tibet between the 12th and 14th centuries, and then more widely in Central Tibet in the 15th century.
“The earliest known datable printed work in Tibetan language (but not from Tibet proper) is a small prayer produced in Khara Khoto, a Tangut city in western Inner Mongolia, in 1153 and preserved in the Institute of Oriental Manuscripts in St. Petersburg.”
-Agnieszka Helman-Ważny (from “The Choice of Materials in Early Tibetan Printed Books,” in Tibetan Printing: Comparison, Continuities, and Change, p. 537)
Two of the best English-language books on Tibetan printing published to date are Tibetan Printing (2016) and Tibetan Manuscript and Xylograph Traditions (2016).
There is no precisely defined boundary between Classical Tibetan and Modern Tibetan. Modern Tibetan generally shows a shift in tense marking from the main verb to the auxiliary verb, as well as a relatively complex system of evidentiality.
If Old Tibetan was characterized by inscriptions and manuscripts, and Classical Tibetan by printed books, then Modern Tibetan is surely characterized by the digitization of the language and its subsequent spread on the internet. The online presence of the Tibetan language has grown as Tibetan language digitization has become more and more sophisticated.
Apple devices were the first to offer native support for Tibetan script, which has led to Apple’s widespread popularity among Tibetan and Bhutanese people. Apple’s Tibetan language system was designed in the 1980s through a collaboration between a former Apple programmer named Steve Hartwell and the Tibetologists Tsuguhito Takeuchi and Yoshiro Imaeda. The project was supported by the government of Bhutan and by Otani University in Japan.
Since then, figures like Ven. Lobsang Monlam have been instrumental in improving Tibetan language digitization and accessibility through initiatives like the Great Monlam Dictionary. The Tibetan Himalayan Library hosts a webpage (mirror) that discusses Tibetan Unicode, fonts, and keyboards. This topic is also discussed on the South Asia Language Resource Center from the University of Chicago. Another valuable resource is Digital Tibetan, which contains information, guides, and tools related to Tibetan language digitization and computation. In 2019, the Central Tibetan Administration launched an online dictionary to modernize and standardize Tibetan terminology, adapting it to fields such as science, technology, engineering, and math.
Diglossia is when a single language has two different forms that use different grammar and vocabulary.
Standard Tibetan is diglossic because there is a great difference between the spoken language (ཁ་སྐད་ kʰakä) and the written language (ཡིག་སྐད་ yi̲kkä). This course mainly teaches the spoken language.
The written language uses spellings and particles that were codified hundreds of years ago in grammatical texts like the སུམ་ཅུ་པ་ Sumchupa. Because the written language is relatively standardized, it allows people from different regions to communicate even if their spoken dialects are no longer mutually intelligible.
Also, because the classic texts describing the written language are hundreds of years old, people who have only studied Classical Tibetan often find that they can read Modern Tibetan writing. However, it should be emphasized that written or literary Tibetan is not the same thing as Classical Tibetan. Classical Tibetan describes a period in the history of the language that would have had both a written and a spoken form. Naturally, only written texts have survived from that period, and our data on spoken Classical Tibetan is mostly limited to written narrative dialogues.
Tibetan spelling is notoriously opaque, but this wasn’t always the case. When the Tibetan script was created in the 7th century, it reflected the spoken pronunciation of the language at the time. However, even in the time of Old Tibetan the spelling began to fall out of step with the pronunciation. Chinese and Khotanese transcriptions of Tibetan indicate that the prefix letters ག་ and བ་ had stopped being pronounced as early as the 9th century, but continued to be written in the spelling.
With few exceptions, most changes in Tibetan pronunciation have not been reflected in its spelling. There have been minor updates to the spelling like the loss of the post-suffix ད་ during the transition to Classical Tibetan, or spellings such as ཡོག་རེད་ and གལ་ that have appeared during the transition to Modern Central Tibetan. But by and large, the spelling has remained unchanged since the time of Old Tibetan.
There are many Tibetic lects that have preserved older forms of pronunciation that are more similar to the spelling. Ladakhi has very conservative pronunciation, with words like ལྤགས་ and ལྟད་མོ་ pronounced as “hlpaks” and “hltadmo”, compared to Standard Tibetan’s “pak” and “tämo”. The ཁྱོད་ཅག་ང་ཅག་ Chocha Ngacha language also has very conservative pronunciation, with words like བྲད་ and ཕྱུག་པོ་ pronounced as “bra̲t” and “phyukpo”, compared to Standard Tibetan’s “trʰä̲” and “chʰukpo”.
Standard Tibetan speakers sometimes use non-standard spellings for certain words to reflect their modern pronunciation, but these are generally considered to be improper in the written language. The rate of change for the spelling is far slower than for the pronunciation, anyway, so innovative new spellings remain the exception rather than the norm.
The spoken language and written language sometimes differ in their vocabulary (e.g. འགྲེལ་བརྗོད་ vs. འགྲེལ་བཤད་ for “explanation”), grammar (e.g. རུང་ vs. ནའི་ for “although”), and pronunciation (e.g. ཞིག་ as shi̲k or as chik). Speakers employ different strategies for navigating this gap.
Speaking styles
More educated speakers often use more features of the written language in their speech. Many written words such as the ལྷག་བཅས་ particles (ཏེ་དེ་སྟེ་) are admissible in speech even if they are less common than their spoken counterparts (such as ནས་). We could perhaps differentiate between written-style speech, which uses features of the written language, and spoken-style speech, which does not.
Writing styles
Formal writing uses the written language (which I will call written-style writing), and using elements of the spoken language in one’s writing is typically seen as improper. This norm extends even to online spaces. However, it’s not uncommon to encounter spoken-style writing online.
Recitation styles
When reading a written text out loud, speakers may either read it as written (which I will call written-style recitation), or they may convert it to the corresponding speech forms (which I will call spoken-style recitation). For example, the word སོ་སོའི་ would be recited as “sosö:” in written-style recitation, and “soso ki” in spoken-style recitation.
There is a whole set of pronunciation exceptions that relate to the difference between the spoken and the written language.
For example, as we will see in unit 8, Standard Tibetan has mostly collapsed the distinction between different verb tenses in the verbal root. The verb གཏོང་ (“to send”) typically has the forms བཏང་, གཏོང་, གཏང་, and ཐོངས་ for the past tense, present tense, future tense, and imperative, respectively. However, in the spoken language the form བཏང་ tang is used for all tenses. Therefore, when reading a text out loud, someone might pronounce the written word གཏོང་ as བཏང་, which is a typical example of spoken-style recitation. If this is pointed out, they may correct themselves and pronounce it as གཏོང་ instead.
Some words have a standard pronunciation in their written form, and an unexpected pronunciation in their spoken form. For example, the conjuncts པྲ་ tra and ཕྲ་ trʰa are often pronounced as པ་ pa and ཕ་ pʰa in the spoken language. As a result, the word ཕྲུག་གུ་ (also spelled ཕྲུ་གུ་) is pronounced trʰuku in written-style recitation but pʰuku in the spoken language. Another example is ལྷམ་གོག་, which is pronounced hlamkʰok in written-style recitation but hamkʰok in the spoken language.
There are certain regular pronunciation exceptions that seem to relate to the collapse in verb tense. One common example is the root འབྱུང་ (“to arise, to exit”), which is typically pronounced as chʰu̲ng (like its past tense form བྱུང་) and not as ju̲ng. As a result, words like འབྱུང་བ་ (“element”), འབྱུང་ཁུངས་ (“origin”) and འབྱུང་འགྱུར་ (“future”) are generally pronounced as chʰu̲ngwa, chʰu̲ngkhung, and chʰu̲nggyur, and not as ju̲ngwa, ju̲ngkhung, or ju̲nggyur. Unlike spoken-style recitation, these pronunciations are used even in careful pronunciation, so they can be understood as simple exceptions to the pronunciation rules we’ve learned.
There are other pronunciation exceptions that apply even in careful speech, but have an unclear origin. For example, the word ཡུག་ཡུག་ (“wave of the hand”) is pronounced as yuk-yuk and not yu̲k-yuk. The word འདེད་ is pronounced tʰe, showing both a past-tense pronunciation (the past tense is written དེད་) and an unexpected high tone. The word བདག་ is often pronounced tʰa̲k rather than da̲k. These are just a few common examples.
It’s not necessary to memorize these exceptions right now. It is enough to be aware that they exist, and that you will likely encounter other words with unexpected pronunciations.
The Tibetan alphabet is written in a variety of scripts that broadly fall into two categories: དབུ་ཅན་ Uchän (lit. “with a head”) and དབུ་མེད་ U-me (lit. “without a head”) scripts. Uchän is so-called because it has a “head”, i.e. a strong horizontal line, written at the top of each letter. This is similar to what is known as a serif in Western typography. U-me scripts lack such a line.
Uchän has only one main type of script, but U-me is a family of scripts including ཚུགས་རིང་ Tsʰukring (lit. “long form”), འཁྱུག་ཡིག་ Kyʰukyik (lit. “fast letters”), and a variety of other scripts. To operate comfortably in Tibetan spaces you need to know all three of these scripts (Uchän, Tsʰukring, and Kyʰukyik); however, for most online contexts, Uchän alone is sufficient.
Tibetan calligraphy has specific norms for the kind of pen to use, the angle to write at, and the proportions to follow, but this topic is beyond the scope of this course.
The script that we’ve learned in this course is དབུ་ཅན་ Uchän. Uchän is the main script used for Tibetan in most contexts, but it is somewhat formal and often written in a slow and careful manner. We learned how to write Uchän script in Unit 1 §6.1.
The classic U-me script is ཚུགས་རིང་ Tsʰukring (lit. “long-form”), which can be learned here. It is usually written slowly and carefully. Children in Ütsang often first learn to write in this script and only later switch to Uchän. Here is the above quote written in Tsʰukring:
Studying Tsʰukring will give you a good foundation for all other U-me scripts, because they all use similar letter forms which are often quite distinct from Uchän. One of the most identifiable features of U-me scripts is the use of a short vertical ཚེག་ tsʰek instead of a dot. You can study Tsʰukring here and here.
Another U-me script which is more suitable for handwriting is ཚུགས་ཐུང་ Tsʰuktʰung (lit. “short-form”). You can study Tsʰuktʰung here. Below is the same quote written in Tsʰuktʰung:
Tsʰukring and Tsʰuktʰung are also called སུག་རིང་ Sukring (lit. “long-legged”) and སུག་ཐུང་ Suktʰung (lit. “short-legged”) in some sources.
The script most often used for handwriting, especially for quick notes, is འཁྱུག་ཡིག་ Kyʰukyik (lit. “fast letters”). Here is the same quote written in Kyʰukyik:
Note: The fonts used in the images above are the Qomolangma fonts created by yalasoo.
You can study Kyʰukyik here, or using this video by BodYig Jung. In the first link it is called ཚུགས་མ་འཁྱུག་ Tsʰukma kyʰuk instead.
I also want to mention this book, which teaches all three main scripts: Uchän, Tsʰukring, and Kyʰukyik. By the end of it you will be able to comfortably read and write all the main kinds of Tibetan script.
Tibetan has many other scripts used for various purposes. Many are just subtle variants on one of the above scripts, and are used for writing ordinary text in Tibetan. Some are more eccentric scripts that appear mainly in Tibetan calligraphy books.
There are also scripts like Horyik, which is often used in Tibetan calligraphers’ seals, or Ranjana, which is often used for writing Sanskrit text. You’ve already seen both of these scripts, even if you didn’t know it. Horyik is the script used in this website’s logo, and in the symbols at the top of each section heading. Ranjana is used in the image at the top of this unit, on the top half of the Maṇi wheels.
There are many Tibetan calligraphy books that document different script styles. Three calligraphy books that can be viewed for free online are this, this, and this.
Paleography is the study of writing systems and handwriting styles. One of the most prominent scholars in Tibetan paleography is Sam van Schaik. Many of his papers are available with no paywall on academia.edu if you create a free account. A good introduction to the paleography of Tibetan scripts is his article Towards a Tibetan Paleography (2014).
We’ve discussed how the pronunciation of most Tibetic languages has diverged from the spelling. This raises the question: when we want to write Tibetan words in other scripts like the English alphabet, do we want to represent their spelling or their pronunciation?
This question is the basis of the distinction between transliteration and transcription. Transliteration refers to a system that represents the spelling of a language using the letters of another language. Transcription refers to a system that represents the pronunciation of a language using the letters of another language. Romanization is a term for transliterations or transcriptions that use the Roman alphabet, which is the alphabet used to write English. This section will focus on romanizations.
Various systems of Tibetan transliteration have become popular in Western academia. The most common one is called Wylie (ཝེ་ལི་), named after its creator, Turrell V. Wylie. It is extremely useful to know Wylie because many English-language books, papers, and websites use it to transliterate Tibetan terms. Because it is a transliteration system, Wylie aims to faithfully represent the spelling of Tibetan words, but not necessarily their pronunciation.
The Wylie transliteration of the 30 consonants is:
Any letters that are not main letters use the same transliteration scheme but without the default “a”. So, for example, བསྒྲུབས་ is transliterated as bsgrubs, and འཁྱམས་ is transliterated as 'khyams.
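To make the rule above concrete, here is a minimal Python sketch of Wylie transliteration for single syllables. It is not the THL converter or any published tool, and the names `BASE`, `SUBJOINED`, `VOWELS`, and `to_wylie` are made up for this illustration. It assumes the syllable is typed in Unicode, where the lower letters of a stack are stored as separate "subjoined" characters, and it only handles syllables whose main letter can be located from a vowel sign or a stack. Anything else raises an error rather than guessing.

```python
TSHEG = "\u0f0b"  # the dot written between syllables

# Code points of the 30 consonants in alphabetical order.
_CODEPOINTS = [
    0x0F40, 0x0F41, 0x0F42, 0x0F44,  # ka kha ga nga
    0x0F45, 0x0F46, 0x0F47, 0x0F49,  # ca cha ja nya
    0x0F4F, 0x0F50, 0x0F51, 0x0F53,  # ta tha da na
    0x0F54, 0x0F55, 0x0F56, 0x0F58,  # pa pha ba ma
    0x0F59, 0x0F5A, 0x0F5B, 0x0F5D,  # tsa tsha dza wa
    0x0F5E, 0x0F5F, 0x0F60, 0x0F61,  # zha za 'a ya
    0x0F62, 0x0F63, 0x0F64, 0x0F66,  # ra la sha sa
    0x0F67, 0x0F68,                  # ha a
]
# Wylie values without the default "a" (the last letter is just "a" itself).
_WYLIE = ("k kh g ng c ch j ny t th d n p ph b m "
          "ts tsh dz w zh z ' y r l sh s h").split() + [""]

BASE = {chr(cp): w for cp, w in zip(_CODEPOINTS, _WYLIE)}
# In Unicode, each subjoined form sits 0x50 above its base letter.
SUBJOINED = {chr(cp + 0x50): w for cp, w in zip(_CODEPOINTS, _WYLIE)}
VOWELS = {"\u0f72": "i", "\u0f74": "u", "\u0f7a": "e", "\u0f7c": "o"}


def to_wylie(syllable: str) -> str:
    """Transliterate one Tibetan syllable into Wylie (limited sketch)."""
    out = []           # one Wylie piece per Tibetan character
    stack_end = None   # index in `out` of the last subjoined letter
    has_vowel = False
    prev = ""
    for ch in syllable.rstrip(TSHEG):
        if ch in BASE:
            # Wylie writes "g.y" for prefix ga + main letter ya, to keep
            # it apart from ga with a subjoined ya ("gy").
            if ch == "\u0f61" and prev == "\u0f42":
                out[-1] = "g."
            out.append(BASE[ch])
        elif ch in SUBJOINED:
            out.append(SUBJOINED[ch])
            stack_end = len(out) - 1
        elif ch in VOWELS:
            # A vowel sign on a non-initial 'a (as in pa'o) means the
            # letter before it carried the default "a".
            if prev == "\u0f60" and len(out) > 1 and not has_vowel:
                out.insert(len(out) - 1, "a")
            out.append(VOWELS[ch])
            has_vowel = True
        else:
            raise ValueError(f"unsupported character U+{ord(ch):04X}")
        prev = ch
    if not has_vowel:
        if stack_end is not None:
            out.insert(stack_end + 1, "a")   # default "a" follows the stack
        elif len(out) == 1:
            out.append("a")                  # a lone letter is the main letter
        else:
            raise ValueError("cannot locate the main letter in this syllable")
    return "".join(out)
```

With this sketch, `to_wylie` applied to བསྒྲུབས་ and འཁྱམས་ gives bsgrubs and 'khyams, matching the examples above. A syllable like དགས་, which has no vowel sign and no stack, is rejected, because locating its main letter takes spelling rules that this sketch leaves out.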
Many new students of Tibetan fall into the trap of using Wylie as a guide to modern Tibetan pronunciation, but this is not what Wylie is intended for. For example, the word དེ་ is spelled as de in Wylie, but speakers of Kha lects (like the one taught in this course) pronounce it as tʰe̲.
The original form of Wylie was well-suited for representing ordinary Tibetan text, but little else. The Tibetan and Himalayan Library (THL) created the Extended Wylie Transliteration Scheme in order to represent the full range of common Tibetan characters, including a variety of special symbols, punctuation marks, and letters that sometimes appear in Tibetan script. This is the most widely used transliteration scheme for anything that’s not covered in the original Wylie system.
Wylie is just one of many different transliteration systems for Tibetan, with its own strengths and weaknesses. Many other transliteration schemes exist, such as Library of Congress transliteration and Bialek transliteration. The appendix to Bialek’s paper includes an overview of many different scholars’ transliteration schemes for Tibetan:
There is no need to study all of these, but this chart can serve as a useful reference for any transliteration systems you encounter in your study of Tibetan.
The Tibetan and Himalayan Library has also produced the Simplified Phonetic Transcription system for representing Standard Tibetan speech in a way that’s accessible to a broad audience. For example, the phrase དགའ་ལྡན་ཕོ་བྲང་ཕྱོགས་ལས་རྣམ་རྒྱལ། is transcribed as “ganden podrang chok lé namgyal” in this system. Although this transcription is suitable for a broad audience, it does not capture several important nuances of Standard Tibetan pronunciation which are important for anyone seriously learning the language. For example, it does not mark the difference between “e” and “ä”, the raising of “a” to “ä” before ལ་, or the difference between aspirated and unaspirated consonants.
A variety of other romanized transcription systems have been created, mostly by Buddhist organizations in the West. LotsawaHouse hosts a useful online tool for generating phonetic transcriptions of Tibetan text in a range of different transcription systems.
Another important transcription system is Tibetan Pinyin. Tibetan Pinyin is typically used by Chinese-speaking people in the Tibet Autonomous Region and throughout China. This system is based on the principles of Chinese pinyin and subject to the syllable constraints of the Chinese lects, so it is not particularly useful for non-Chinese audiences.
Tibetan literature covers a wide range of genres and subjects. We are only covering a few important ones here.
The Topics of Knowledge (Sanskrit: vidyāsthāna, Tibetan: རིག་གནས་ ri̲knä) are a set of traditional fields of knowledge originally imported from India as part of the spread of Buddhism into Tibet. They are often divided into five, ten, or eighteen topics. When divided into ten topics, they consist of the five major topics and the five minor topics.
The five major topics of knowledge are:
The five minor topics of knowledge are:
There are various introductions to these topics, such as this introduction to Tibetan logic, this introduction to Tibetan medicine, or this introduction to Tibetan theatre. Of the ten topics, Tibetan medicine has received perhaps the most attention in English-language sources. I also recommend Heruka Institute’s video on the 10 topics of knowledge.
The Buddhist Digital Resource Center (formerly the Tibetan Buddhist Resource Center) has collected a large number of texts on the Topics of Knowledge which it now hosts on its website and on the Internet Archive. Some texts on the Topics of Knowledge include this, this, and this.
Buddhism has been a major influence on Tibetan culture for over a thousand years. Tibetan Buddhist literature has received a great deal of international attention since the 1950s, and non-Tibetan people are often introduced to Tibetan language and culture through Buddhism specifically. The most notable online repositories of Tibetan Buddhist literature are BDRC and 84000.
Modern Tibetan literature includes genres such as poetry, short stories, and novels. Perhaps the most famous Tibetan poet and writer is Dhondup Gyal, who published works such as Waterfall of Youth and The Narrow Footpath. The anthology Old Demons, New Deities, edited by Tenzin Dickie, presents a collection of short stories by Tibetan authors. Tibetan novels include ཕལ་པའི་ཁྱིམ་ཚང་གི་སྐྱིད་སྡུག (The Joys and Sorrows of an Ordinary Family) by Tashi Palden, with an English-language introduction available here; དྲེལ་པའི་མི་ཚེ། (The Life of a Mule Driver) by ལྷག་པ་དོན་གྲུབ་ hlakpa tʰö̲ndrup, with an English-language introduction available here; and the English-language book The Tibetan Suitcase by Tsering Namgyal Khortsa.
Also worth mentioning are the Journal of Tibetan Literature; the paper Witnessing Exile, which is a short introduction to modern Tibetan literature; the anthology Contemporary Tibetan Short Stories; this translation of 10 young Tibetan authors’ views on modern Tibetan literature; and the book Modern Tibetan Literature and Social Change.
This section contains notes on various other topics related to Tibetan writing.
Early Tibetan manuscripts were written in a variety of formats, including scrolls, codices, concertina-format texts, and sheets of paper. However, the textual format that would become dominant in Tibetan culture is the དཔེ་ཆ་ pechʰa (usually written simply as “pecha” in English). This format has been used for both manuscripts and printed books.
Pechas are made of very wide, short pages with writing on both the front (recto) and back (verso) sides of a page, bounded by a clear black margin. They are usually protected in a wood and cloth cover when not in use, and stacked densely on wooden shelves.
In the 20th century Tibetan books began to be published in the Western book (དེབ་ tʰep) format, which has since become the most common format for new Tibetan books. Nevertheless, pechas are still produced, sold, and used, and there are various resources online (such as PechaMaker and Digital Tibetan) that support pecha digitization.
Tibetan only has two main punctuation marks: the ཚེག་ tsʰek, which is a small dot (་) written between syllables, and the ཤད་ shä, which is a vertical line (།) used to mark the end of a sentence or to break a long piece of text into smaller segments.
Longer sections of text and lines of verse are followed by two shä (༎), and chapter or text endings are often marked by four shä (༎ ༎). Text titles and the recto pages of a pecha are marked with a ཡིག་མགོ་ yi̲ng-go (༄༅། །) which helps those pages be easily recognized. This can be particularly useful if a loose-leaf pecha becomes scattered and needs to be put back in order.
There are a couple of important usage rules for shä in Uchän script. First of all, the letter ང་ cannot be directly followed by a shä; you must put a tsʰek in between the letter ང་ and the shä. Second of all, when the letter ག་ has no vowel mark, its right “leg” acts like a shä and so no shä is added to it at the end of a segment of text.
So, for example, here is how you would mark the end of a segment of text for a typical word, a word ending in ང་, and a word ending in ག་:
There are many other punctuation marks, variants, and rules, but these are the most common ones and the most important to know.
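The two shä rules described above can also be written as a small function. The following Python sketch is only an illustration: the name `close_segment` is hypothetical, and it assumes that an ordinary ending drops its final tsheg before the shä is added. It looks only at the very last character of the segment, so it treats a ང་ or ག་ that carries a vowel sign like any other ending.

```python
TSHEG = "\u0f0b"  # dot between syllables
SHAD = "\u0f0d"   # vertical stroke (shä)
NGA = "\u0f44"    # the letter nga
GA = "\u0f42"     # the letter ga


def close_segment(text: str) -> str:
    """Return `text` with the ending a segment of text would normally take."""
    body = text.rstrip(TSHEG)       # drop any trailing tsheg first
    if not body:
        raise ValueError("empty segment")
    last = body[-1]
    if last == NGA:
        # nga may not touch the shä directly: keep a tsheg in between.
        return body + TSHEG + SHAD
    if last == GA:
        # A bare ga already ends in a shä-like stroke: add nothing.
        return body
    return body + SHAD
```

Under these assumptions, a word such as བོད་ comes out as བོད།, a word ending in ང་ such as གང་ comes out as གང་།, and a word ending in a bare ག་ such as བདག་ comes out as བདག with no shä at all.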
The Tibetan alphabet has a range of special letters and conjuncts to represent sounds from Sanskrit, English, and other languages. A common example of this is that the ཏ་ row can be written backwards to represent the retroflex sounds of Sanskrit, which are pronounced with the tip of the tongue flipped backwards in the mouth. These sounds are typically romanized by putting a dot below the letter “t”, “d”, “n”, or “s”. Here are the five basic reversed letters (ལོག་ཡིག་ lo̲k-yik) used for Sanskrit sounds:
Romanized: | ṭa | ṭha | ḍa | ṇa | ṣa |
Devanagari: | ट | ठ | ड | ण | ष |
Tibetan: | ཊ་ | ཋ་ | ཌ་ | ཎ་ | ཥ་ |
The letter ཊ་ is also used to represent the English letter “t”. For example, the Tibetan word མོ་ཊ་ moṭa means “car” or “vehicle”, and is derived from the English word “motor”. A ཊ་ is used rather than a ཏ་ because the Tibetan ཏ་ is pronounced near the teeth, whereas the English “t” is pronounced further back in the mouth, more similar to the retroflex “ṭ” of Sanskrit.
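If you work with Tibetan text on a computer, it can help to know that the reversed letters are separate Unicode characters rather than mirrored glyphs of the ordinary ones. The sketch below pairs each ordinary letter with its reversed counterpart and builds the dotted romanization by adding a combining dot below to the plain Roman letter. The names `REVERSED`, `PLAIN`, and `romanize_reversed` are invented for this example.

```python
import unicodedata

# Ordinary letter -> reversed letter, following the table above.
REVERSED = {
    "\u0f4f": "\u0f4a",  # ta  -> reversed ta
    "\u0f50": "\u0f4b",  # tha -> reversed tha
    "\u0f51": "\u0f4c",  # da  -> reversed da
    "\u0f53": "\u0f4e",  # na  -> reversed na
    "\u0f64": "\u0f65",  # sha -> reversed sha
}

# Roman spelling of each reversed letter before the dot is added.
PLAIN = {
    "\u0f4a": "ta",
    "\u0f4b": "tha",
    "\u0f4c": "da",
    "\u0f4e": "na",
    "\u0f65": "sa",
}

DOT_BELOW = "\u0323"  # combining dot below


def romanize_reversed(letter: str) -> str:
    """Romanize a reversed letter by dotting its first Roman letter."""
    plain = PLAIN[letter]
    dotted = plain[0] + DOT_BELOW + plain[1:]
    # NFC folds the letter and the combining dot into one precomposed character.
    return unicodedata.normalize("NFC", dotted)
```

Calling `romanize_reversed` on each value of `REVERSED` yields the romanized row of the table, and `unicodedata.name` on the same characters shows the doubled-letter names Unicode gives them, such as TIBETAN LETTER TTA for ཊ.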
Tibetan script often makes use of abbreviations (བསྐུངས་ཡིག་ kung-yik) to make letters or words more compact. This may involve condensing words into a shorter form, such as condensing བཀྲ་ཤིས་ into བཀྲིས་, or using special characters to save space, such as:
There are many resources on abbreviations, including:
You can also search abbreviations on the Resources for Kanjur and Tenjur Studies’ website (note: your search term must be romanized), or by searching for the original word on Christian Steinert’s dictionary and scrolling down to see if it has any listed abbreviations.
We’ve finished our discussion of Tibetan letters, scripts, and writing, so in unit 5 we will finally turn our attention to Standard Tibetan grammar.