Phoneme List

The phonetic system forms the basis of speech playback in the Vocaloid software. Symbols used in the phoneme system are based on X-SAMPA.

How Phonemes Work
Note: The following applies to the Vocaloid 2 system. Although both programs work in a similar fashion, some things may not apply to the original Vocaloid or may work differently there than in Vocaloid 2.

The phonetic system works by taking diphone and triphone samples from a sample library and reassembling them according to how a word is pronounced phonetically. The word "example" would appear to be made up of the components "ex", "am" and "ple"; however, to produce this word phonetically, the data needed is "ɪg", "zɑːm" and "pəl". Japanese and English Vocaloids use the same method of arranging the phonetic library, with English Vocaloids requiring more diphone and triphone samples than Japanese ones overall. Due to the software's musical nature, monophone and polyphone samples may also need to be considered for closer vocal pitching and pronunciation. However, the user has access to the pronunciation only at the phonetic level; the finer levels of vocal speech adjustment cannot currently be accessed. Many of the samples needed are also not shared between the Japanese and English languages.

To create and edit phonemes, a user must right-click on a note and select "Note Properties". Here they can edit a phoneme and add additional effects through the "Note Expression Property" and "Vibrato Property" windows. As a shortcut, the user can press the Alt key and the down arrow key at the same time to edit the phonetic data directly. For instance, to roll the phoneme U:, edit it to read U: R, making sure there is a space between U: and R so the program knows they are different phoneme effects. When entering a phoneme, the user must take care with capitalization, as an upper-case letter may not have the same effect as its lower-case counterpart (for example, Z and z are different phonemes, so they do not produce the same result).
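
As a rough illustration of the two entry rules above (symbols are separated by spaces, and case matters), the following Python sketch splits an entry string and flags anything outside a small symbol set. The function names and the `KNOWN_SYMBOLS` set are hypothetical and exist only for this example; a real voicebank defines its own symbol list, and this is not how the Vocaloid editor itself is implemented.

```python
# Hypothetical subset of symbols, for illustration only.
# A real voicebank defines its own (much larger) set.
KNOWN_SYMBOLS = {"U:", "R", "Z", "z"}


def split_phonemes(entry: str) -> list:
    """Split a phoneme entry into individual symbols on whitespace."""
    return entry.split()


def unknown_symbols(entry: str, known=KNOWN_SYMBOLS) -> list:
    """Return the symbols in the entry that are not in the known set.

    The comparison is case-sensitive, so "Z" and "z" are distinct.
    """
    return [symbol for symbol in split_phonemes(entry) if symbol not in known]


if __name__ == "__main__":
    print(split_phonemes("U: R"))   # two separate symbols
    print(unknown_symbols("U:R"))   # the missing space produces one unknown symbol
    print("Z" == "z")               # case matters
```

Leaving out the space, as in `U:R`, yields a single unrecognised symbol rather than two separate effects.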



A Vocaloid's dictionary will attempt to match the correct phonemes with the word the user enters, although not every word can be found in the dictionary. If you want to write the word "bungle" but do not know how to use the phoneme system, an easier way is to find a few similar words and work from there; in this case "bangle" and "bung" can be used to enter the word "bungle" manually. "Bung" gives us the phonemes bh V N and "bangle" gives bh { N g V l. "Bungle" is formed from the bh V N of "bung" with the addition of the g V l from "bangle", giving us bh V N g V l.
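
The "bungle" construction can be sketched in a few lines of Python. The phoneme strings are the ones quoted in the paragraph above; the `splice` helper is a hypothetical name introduced here, and choosing how many trailing phonemes to borrow is still a manual decision made by ear.

```python
def splice(head: list, donor: list, tail_length: int) -> list:
    """Append the last `tail_length` phonemes of `donor` to `head`."""
    return head + donor[-tail_length:]


# Phonemes as given by the dictionary for the two known words.
bung = "bh V N".split()
bangle = "bh { N g V l".split()

# Keep all of "bung" and borrow the final three phonemes of "bangle".
bungle = splice(bung, bangle, 3)

if __name__ == "__main__":
    print(" ".join(bungle))
```

The joined result is the string that would be typed into the phoneme field for the note.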

However, phonemes do not always produce the same results; they may sound different, or weaker or stronger, depending on the preceding or following phoneme. To make a consonant sound stronger than the following vowel, raising the consonant's Brightness, Breathiness or Dynamics will often work to some degree.

Note that not all Vocaloids have the same phonemes; the breathing phonemes [br1] to [br5] are one example. There are also some phonemes that are found only in one language, so not all of the Japanese and English Vocaloids will share the same phonemes. Also, while a Vocaloid's help guide will list the alphabets of the languages, it may not include additional notes. If a user allows the program to auto-find phonemes and it encounters a word it cannot identify, it will automatically write it as the phoneme u: by default. In addition, if the user manually enters a phoneme that the Vocaloid does not have in its voicebank, there will be no sound at all when the Vocaloid is played back.
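
The two fallback behaviours described above can be modelled with a toy lookup. This is only a sketch of the observable behaviour, not the engine's actual algorithm: the dictionary contents, the voicebank symbol set and the function names are all assumptions made for the example.

```python
# Default written by the program when a word is not found (per the text above).
FALLBACK = ["u:"]

# Hypothetical one-entry dictionary and voicebank symbol set, for illustration.
TOY_DICTIONARY = {"bung": ["bh", "V", "N"]}
TOY_VOICEBANK = {"bh", "V", "N", "u:"}


def lookup(word: str, dictionary=TOY_DICTIONARY) -> list:
    """Return the stored phonemes for a word, or the u: fallback if unknown."""
    return dictionary.get(word.lower(), FALLBACK)


def audible(phonemes: list, voicebank=TOY_VOICEBANK) -> list:
    """Keep only the phonemes the voicebank contains; the rest stay silent."""
    return [p for p in phonemes if p in voicebank]


if __name__ == "__main__":
    print(lookup("bung"))              # found in the toy dictionary
    print(lookup("bungle"))            # not found, so the fallback is used
    print(audible(["bh", "V", "br1"]))  # br1 is absent from this toy voicebank
```

Checking a manual entry against the voicebank's symbol list before playback is a quick way to explain a note that unexpectedly produces no sound.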

Using One Language To Create Another
A user can use the phoneme system to create languages from scratch, so long as it is within the Vocaloid's capabilities. Because of how they are set up, Japanese Vocaloids are more limiting for use with the English language, since the phonology of Japanese, including its phonemes, accents, tones, intonation, morae and assimilations, is very different from that of English. As each consonant in Japanese is always followed by an inseparable vowel and consonants do not form clusters, each consonant is generally pronounced weakly and not independently, except ん (n), sokuon and some transliterated phonemes for non-Japanese words. Because of this, some Japanese Vocaloids' consonant sounds contain a slight vowel sound so that they connect smoothly to the following vowel and sound right in Japanese.

Also, even if the X-SAMPA, IPA, Latin alphabet or other symbol transcriptions are the same, the actual pronunciations in Japanese and English are not always the same. For instance, the symbol S is often pronounced /ʃ/ by English Vocaloids and /ɕ/ by Japanese ones; Japanese "a" is a low central vowel that lies between the English "a" in "father" and the English "a" in "dad"; and Japanese "r" is not the same as either "r" or "l" in English. (See "Japanese Phonetic System" below.)

In addition, the English language often puts emphasis on certain syllables of words (stress accent), while the Japanese language frequently uses pitch accent. These differences between the two languages frequently make Japanese Vocaloids retain a Japanese accent when there are no perfectly equivalent phonemes, even if users manage to make them sing in the correct language. Conversely, the same can happen to English Vocaloids, which often have an English accent when they sing in other languages.

Another consideration with English Vocaloids is their regional accent. This will not affect a Vocaloid's overall performance or the handling of the Vocaloid engine, and they will use identical phonemes regardless. The only effect is a particular stress or emphasis on certain vowels and consonants that may not be present in another English Vocaloid, which may make an English Vocaloid sound different from what a user expects. Examples of Vocaloids that may be affected include Sonika, who has a British accent, and Big Al, who has an American one; Luka Megurine, who retains a Japanese accent, is also included. One noted example of a regional accent affecting a Vocaloid's output is Big Al's pronunciation of vowel sounds; he can often be harder to make sing in Japanese because of it. In contrast, regional accent has much less of an effect among Japanese Vocaloids singing in Japanese.

Regardless, it is difficult for any Vocaloid to sing in a language it is not intended for, and doing so may take hours of trial and error. A user may have to make the dictionary for another language from scratch; however, this also allows a user to be creative, even going so far as to invent languages of their own if they desire. Essentially, the more time a user spends getting familiar with the phoneme system, the more they can get out of the Vocaloid program. Sometimes phonemes that are not equivalent work better than equivalent ones in the target language; for example, when Miriam sings in Japanese, v V sounds closer than w V to the actual pronunciation of w a, the Japanese particle は.

Another technique that can be used on Vocaloids is phoneme slicing. This can be applied to Japanese phonemes for Japanese Vocaloids, either in the Vocaloid software itself or in additional software such as FL Studio. The length of the note is decreased, or cut down, until only the half of the pronunciation needed for the spoken Japanese is heard (for example, "su" becomes "s"). However, this will affect the singing capabilities of the Vocaloid, and the notes being cut have to be much longer than normal. Although this technique may be hard for new users and results in a lack of singing smoothness, it increases the chances of getting a closer match to the intended sound. It can also be applied to English-capable Vocaloids.

An additional note is that, so far, Sonika is regarded as one of the Vocaloids with the most potential to "sing in any language" due to her unique setup, and Luka will also work, having both English and Japanese voicebanks, depending on which language the user is working in. Users' techniques often produce surprising results; however, without the aid of other music or audio software, the outcome is greatly influenced by how much a Vocaloid's phonetic system has phonologically in common with that of the target language.

For more explanations on the differences between English and Japanese Vocaloids see Language Issues.

Flaws in the Phonetic System
Vocaloids must have the correct diphonetic sounds to avoid sounding choppy. However, the Vocaloid system will attempt to sound out all diphonetic data assigned to the phonemes used, even if that particular sound is not needed, so Vocaloids with too many sounds become slightly slurred. A natural speaker may not sound out every diphonetic sound when singing, for various reasons such as naturally slurred vocals, a localised accent, vocal disorders such as stuttering, or speech impediments such as a lisp. This restriction may limit the ability of a Vocaloid to mimic the language it is intended for.

In some cases, Vocaloids like the Kagamines may have missing pronunciations. When a Vocaloid is given a pronunciation it cannot sound out, it will not sing anything at all, even if the phonetic data is registered by the Vocaloid engine. The current Vocaloids also cannot recreate some languages due to their different and contrasting vocal structures. This has been pointed out in regard to Sonika's claim of "being able to speak any language". Because sounds cannot be picked out at the diphonetic level, spelling a word as it is spelled in the dictionary of that language will sometimes not produce the correct phonetic result. This means the user will have to swap phonetic data until the pronunciation is correct. This is most noticeable with English Vocaloids and is due to the more complex nature of the English language. On occasion, words have to be written as they sound, rather than how they are spelled.

Since Japanese Vocaloids do not have to blend their words like English ones and have just 500 diphones to use, they can produce choppier results than English Vocaloids when used for non-Japanese words. Additional tuning both inside and outside of the Vocaloid software may also have to be applied. English Vocaloids, being the opposite of their Japanese cousins in this respect, produce the opposite problem: they may attempt to match up every phonetic combination given to them within their dictionary where possible if the user does not take this into account. To prevent this, the user may have to break up the phonetic data often enough to prevent as much of the unneeded blending as possible, avoiding whole-word construction where it is most likely to appear. There are also a number of known words used by English-capable Vocaloids that have more than one pronunciation due to stress accents. However, the user cannot obtain the correct result from what the software gives them, since Vocaloid can currently store only one pronunciation of each word. Without knowing how to sound out the alternative pronunciation, these words can be a problem for non-native English speakers:
 * Wind - The wind blew; you wind me up
 * Read - I will read the book; I read the book
 * Tear - You have a tear in your eye; The paper has a tear in it 
 * Bow - You must bow before royalty; I tie a bow in my hair.
 * Live - The show was broadcast on TV live; I know where you live
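
One way to keep track of such words outside the software is a small table of manual overrides keyed by word and sense. The sketch below is a note-keeping aid only, not part of Vocaloid; the names are hypothetical, and the two transcriptions for "live" are approximate X-SAMPA-style strings made up for illustration, not taken from any voicebank dictionary.

```python
# Illustrative transcriptions only; NOT taken from any voicebank dictionary.
# The built-in dictionary is modelled as holding one entry per spelling.
DEFAULT_DICTIONARY = {"live": "l I v"}

# User-maintained alternatives, keyed by (word, sense).
USER_OVERRIDES = {("live", "adjective"): "l aI v"}


def phonemes_for(word: str, sense: str = None):
    """Return the override for (word, sense) if one exists, else the default."""
    key = word.lower()
    override = USER_OVERRIDES.get((key, sense))
    if override is not None:
        return override
    return DEFAULT_DICTIONARY.get(key)


if __name__ == "__main__":
    print(phonemes_for("live"))               # default entry
    print(phonemes_for("live", "adjective"))  # manual override
```

The override string would still have to be typed into the note's phoneme field by hand, since the software itself stores only the one default.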

As noted in this section, due to the sheer number of things to take into account, English-capable Vocaloids can often be far more complex than Japanese Vocaloids because of the problems presented by the English language. Liberally interpreted, English Vocaloids have a greater language capacity than their Japanese cousins because they have more vowels and clearly separated consonant sounds, and are therefore easier to make sing in other languages, although both will only be using the equivalent or quasi-equivalent phonemes according to the setup of the phonetic system of either language. Japanese Vocaloids can often be far simpler to use, despite the more limited voicebank and low-quality results for non-Japanese words.

English Phonetic System
The following is a list of phonemes needed to make the Vocaloid sing in English.

''Special note: This was the list provided by Big Al's help file, however there were some incorrect entries within the released list. Entering some of the words provided here as examples for the phoneme usage will not result in the expected phonemes that were used for the list. In addition, the list did not indicate which particular letters the phoneme applied to; the wikia has underlined the relevant letters for the benefit of readers. Of the Japanese Vocaloids, only Luka will be able to use this system properly. ''

Japanese Phonetic System
The following are lists of phonemes needed to make the Vocaloid sing in Japanese.

List 1
Special note: This is based on Big Al's help file, with some information added to show the English equivalent or quasi-equivalent phonemes for Japanese phonemes, with symbols, and to compare their actual pronunciations. Even if the symbol transcriptions are the same, the actual pronunciations in each language are often different, as each IPA transcription shows. This guide is meant for users who are working to make an English or Japanese Vocaloid sing in the opposite language. However, additional work will be needed to get closer to the target language's phoneme usage.

Additional notes

 * Linguistically, the phonemes that the English and Japanese languages have in common are k, g, s, z, Z, tS, h, b, p, j and m. Both English and Japanese voicebanks also have e, S, dZ, d, N, n and w; however, these phonemes generally do not sound the same. (See IPA in each language)
 * Since all the voicebanks have their distinctive characteristics, their phonemes do not always produce the same result especially in languages which they are not intended for.
 * The above is particularly true for Miku and Rin, who are said to sound excessively aged when singing in another language, even in normal configurations and higher octaves.
 * Some consonants in the Japanese phonemes (and certain English phonemes) are not intended to be encoded standalone. Using them for such may sometimes result in audio distortion, clicks or sound loops.

List 2
Special note: This Japanese phonetic list is taken from the help file of the Vocaloid2 voicebanks developed by Crypton Future Media.

Additional notes

 * Crypton’s Vocaloids, including Kaito and Meiko, have almost the same Japanese phonetic system. To use z, Z, h\, N and N', users need to edit the phonemes directly rather than entering kana characters.
 * Rin/Len Kagamine Act 1 can pronounce h\ while their Act 2 cannot (comparison of consonant sounds Act 1, Act 2).


 * Vocaloids of Internet Co. Ltd., such as Gackpoid or Megpoid, mostly share the same system as Crypton’s, but they do not have z and Z sounds. As is often the case with the Japanese language, they are replaced by dz and dZ.
 * The h\ sound commonly works only in h\ e (ぇ, xe) and h\ o (ぉ, xo).
 * Japanese VOCALOID2 voicebanks can combine the a and i phonemes (e.g. w a i), but the original VOCALOID voicebanks cannot. The workaround is simply to use the j consonant (the "y" sound) instead: w a j.
 * N\, N or n alone tends to be pronounced as "ng". This is the basis for Japanese Vocaloids being used for South-East Asian languages.
 * However, some SEA languages pronounce "u" differently from Japanese. Only Miku, Gackpoid and Iroha can pronounce "u" closer to the way SEA languages do.
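
Two of the substitutions in the notes above (z and Z replaced by dz and dZ, and a i replaced by a j) are mechanical enough to sketch in Python. The function and table names are hypothetical, and the sketch only rewrites symbol lists; whether the result sounds right on a given voicebank still has to be checked by ear.

```python
# Replacement table for voicebanks that lack z and Z (per the notes above).
NO_Z_TABLE = {"z": "dz", "Z": "dZ"}


def substitute(phonemes: list, table=NO_Z_TABLE) -> list:
    """Replace each symbol found in the table; leave the others unchanged."""
    return [table.get(p, p) for p in phonemes]


def avoid_a_i(phonemes: list) -> list:
    """Rewrite an i that directly follows an a as j (w a i becomes w a j)."""
    result = list(phonemes)
    for index in range(1, len(result)):
        if result[index - 1] == "a" and result[index] == "i":
            result[index] = "j"
    return result


if __name__ == "__main__":
    print(" ".join(substitute("z a".split())))
    print(" ".join(avoid_a_i("w a i".split())))
```

Keeping the substitution table as data makes it easy to maintain a separate table per voicebank.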

Misc.
The following is a list of phonemes that will alter the effect of a note in a certain way.

''Special note: Not all the Vocaloids will share these particular effects. Sweet Ann, for instance, does not have the breathing phonemes. Some Vocaloids, such as Kaito and Meiko, have a breathing phoneme /*in/ instead. Sonika has much more capability within her voicebanks, but lacks the 5 breathing phonemes. Despite Prima having them, Tonio does not.''


 * Example of Breathing Phonetics in use
 * Example of rolling phonetic in use

Additional Help
Note that both Zero-G and PowerFX also have tutorials of their own.


 * How To Make a Vocaloid Breathe Using VOCALOID: Explanation on how some of the Japanese Vocaloids sound when you use the breathing effects
 * Comparative Table of English and Japanese Phonetic System of Japanese and English Vocaloids, including notes on whether each Vocaloid has a given phoneme. The list also includes information on how to transform the quasi-equivalent phonemes in Japanese and English into the opposite language effectively.
 * Vocaphonetic: A Japanese community site for creating and distributing Japanese dictionary data for English Vocaloids to sing better in Japanese. Dictionary data is available for both Vocaloid and Vocaloid2.
 * Vocaloid Phonetic Library - a quick look up guide for Phonetics of all Vocaloids.
 * From English to Japanese - Using Tonio, these are instructions for how Japanese users can make Tonio sing in Japanese. Also shown is how closely, and how much of, the Japanese language Tonio can reproduce.
 * Tutorial - a tutorial showing a user making Miku sing in "English" using Japanese phonemes.
 * Making Big-Al sing Japanese