The following is a tutorial made for VOCALOID fans by fellow VOCALOID fans.
The following is a list of auxiliary phonemes that alter the effect of a note or control certain effects. It is possible to use a VOCALOID voicebank without ever touching this set of data; however, using them within a song can help a VOCALOID sound more human-like and emotional. In all cases, the data has to be entered manually through the note properties.
[*in] and [*out] are used to mimic a singer inhaling and exhaling while singing. Because a singer breathes out as they sing, there is actually little need for [*out], and its usage is more limited. To use them correctly, it is best to place them on a different track from the main melody.
[Asp] seems to be intended as a slot for a breathing sample; however, not all of the first-generation voicebanks seem to have one. In those voicebanks, it seems to alter how the sound is rendered, which affects the pronunciation.
[Asp] seems to be an empty slot in the phonetic system carried over from the first version of VOCALOID. Even as an empty slot, it affects how the sound is rendered and therefore the pronunciation.
When placed between a vowel pair, it generates devoicing and blending of the second vowel. It has various potential applications.
It can be used as a blending phoneme to correct some choppy phoneme combinations, a common problem in the Japanese voicebanks. There is less need for [Asp] in English voicebanks, as these tend to have the opposite issue (unwanted blending and slurred sounds), which makes [Sil] much more useful for them.
The blending effect also allows the emulation of some sounds, such as a faux /æ/ produced by combining the [e] and [a] phonemes.
[Sil] breaks the recorded transition between two phonemes.
Also allows the insertion of a rest between notes. It's particularly useful for keeping small notes separated at high tempos or for generating a staccato effect.
Big Al's use of [R] is much more limited than that of other English VOCALOIDs. It works only at the beginning of a syllable, and only if an adjacent note precedes it.
Japanese voicebanks can approximate this effect by blending successive flaps into a trill, which produces a sound similar to a rolled R.
Breaths ([br1] - [br5])
These replace the [*in] and [*out] phonetics of VOCALOID. The majority of VOCALOID2 voicebanks have inhaling sounds; the exception is Big Al, who has grunts and exhaling sounds.
With the breaths ([br1] to [br5]) in VOCALOID2, there is one particular glitch to note. If several are placed side by side with no breaks between them, then depending on the note length, not all of them may be sounded. This may also happen if they are placed too close to a sung note. For this reason, it is recommended to place them on a different track.
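As a rough illustration of the spacing advice above, the following Python sketch flags breath notes that touch a neighbouring note. It is not part of the VOCALOID editor: the `Note` structure, the tick unit and the `min_gap` threshold are all hypothetical, chosen only to show the idea.

```python
from typing import NamedTuple

class Note(NamedTuple):
    start: int    # position in ticks (hypothetical unit)
    length: int   # duration in ticks
    phoneme: str  # phoneme entered in the note properties

BREATHS = {"br1", "br2", "br3", "br4", "br5"}

def risky_breaths(notes, min_gap=1):
    """Return breath notes with less than min_gap ticks to a neighbour.

    min_gap is an assumed threshold, not a documented VOCALOID value.
    """
    ordered = sorted(notes, key=lambda n: n.start)
    flagged = []
    for i, note in enumerate(ordered):
        if note.phoneme not in BREATHS:
            continue
        end = note.start + note.length
        # Gap between the end of the previous note and this breath.
        prev_close = i > 0 and (
            note.start - (ordered[i - 1].start + ordered[i - 1].length) < min_gap
        )
        # Gap between the end of this breath and the next note.
        next_close = i + 1 < len(ordered) and (
            ordered[i + 1].start - end < min_gap
        )
        if prev_close or next_close:
            flagged.append(note)
    return flagged

track = [
    Note(0, 480, "br1"),
    Note(480, 480, "br2"),   # starts exactly where br1 ends
    Note(1200, 480, "a"),
    Note(2000, 240, "br3"),  # well separated from the sung note
]
flagged = risky_breaths(track)
```

Here `flagged` contains the [br1] and [br2] notes, which sit side by side, while [br3] is left alone because it has a gap before it.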
Devoiced Sonorant [*_0] (replace the * with the intended phoneme). Adding _0 to a sonorant makes it voiceless and barely audible. The sonorants include the vowels, glides, approximants, liquids (laterals and rhotics) and nasals, so the available devoiced sonorants vary per language.
Available for all currently released voicebanks from V3 onwards. It also works for V2 voicebanks imported into V3.
"Asp" is short for "aspiration" and normally applies to consonants rather than vowels.
From VOCALOID3 onwards, [Asp] can no longer be used as a blending phoneme, apparently because the pronunciation glitch was removed in that version of the software.
Despite this, the phoneme still seems to affect the rendering of the pitch.
[Sil] is now easier to use with the Japanese voicebanks, as typing っ into the lyrics inserts this phoneme automatically.
For Korean VOCALOIDs, initial ㅇ = [N] (silent consonant/glottal stop) and final ㅇ = [Np] (velar nasal consonant) can work in place of [Sil].
[-] extends the vowel pronunciation across several notes.
Japanese voicebanks can now use this phonetic data; however, its usage is not critical because of the way the language works. For this language, it allows better control of the pronunciation.
First, it allows control over the stress or attack of the vowel, making it more stable and consistent across several notes.
Also, as it affects the rendering at the triphone level, it may change the pronunciation, stress and transitions of the adjacent consonant.
Other languages such as English cannot connect certain pronunciations (diphthongs and rhotic vowels) without this, and it helps smooth the transitions across notes. It can still be entered in the lyrics as the old hyphen/slash input used in V2.
Some voicebanks, like SONiKA or Miku English, may manifest some particular sound glitches when using this phoneme.
Most VOCALOID3 and later voicebanks now add their breaths by inserting WAV samples, removing the need for [br1] - [br5]. However, OLIVER and various other VOCALOIDs can still make use of the VOCALOID2 system.
The Devoiced Sonorant [*_0] is not of much use in English voicebanks because of the way the language works (reduced vowels tend to shift toward an unstressed vowel rather than lose their voicing). It can still be used as a supplementary resource.
In the case of Japanese voicebanks, it allows a more natural or colloquial pronunciation. Usually the vowels [M], [i] (after a voiceless palatal consonant) and occasionally [o] (after a voiceless plosive) become devoiced when they are placed between two voiceless consonants or between a voiceless consonant and a silence. However, this varies across dialects and also depends on the emphasis or manner of speech.
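The devoicing pattern described above can be sketched as a small Python helper that marks candidate vowels with _0 in a phoneme list. This is only an illustration under stated assumptions: the `VOICELESS` set is a partial, assumed list of symbols, and the rule is simplified to [M] and [i] without the palatal condition or the occasional [o]. Since devoicing varies by dialect and emphasis, the output should be treated as a suggestion to check by ear.

```python
# Assumed, non-exhaustive set of voiceless consonant symbols (illustrative).
VOICELESS = {"k", "s", "S", "t", "ts", "tS", "h", "C", "p"}
# Simplification: only [M] and [i] are treated as devoiceable here.
DEVOICEABLE = {"M", "i"}

def suggest_devoicing(phonemes):
    """Append _0 to [M]/[i] between voiceless consonants or before silence."""
    result = []
    for idx, ph in enumerate(phonemes):
        prev_ph = phonemes[idx - 1] if idx > 0 else None
        next_ph = phonemes[idx + 1] if idx + 1 < len(phonemes) else None
        # End of the list or an explicit [Sil] both count as silence.
        followed_by_silence = next_ph is None or next_ph == "Sil"
        if (
            ph in DEVOICEABLE
            and prev_ph in VOICELESS
            and (next_ph in VOICELESS or followed_by_silence)
        ):
            result.append(ph + "_0")
        else:
            result.append(ph)
    return result

desu = suggest_devoicing(["d", "e", "s", "M"])
ashita = suggest_devoicing(["a", "S", "i", "t", "a"])
mizu = suggest_devoicing(["m", "i", "z", "M"])
```

With these inputs, the final [M] of "desu" and the [i] of "ashita" are marked as devoiced, while "mizu" is left unchanged because its vowels are next to voiced consonants.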
Many producers have found this phonetic data useful for extending a VOCALOID's language capabilities. For example, it lets a Japanese VOCALOID come much closer to English or other languages, because it works around the restrictive CV phonotactics of Japanese by producing single or coda (end of syllable) consonant sounds.
The [*_0] also allows some interesting effects, like imitating a whispering voice, some breath effects, or doubling as a voice release.
When importing, the VOCALOID3 software is capable of making the Devoiced Sonorant ([*_0]) out of phonetics already available within the imported V2 voicebanks. It is unknown how the software processes the samples to achieve this; however, it may be related to the manipulation of the harmonic content, an option previously available in V1.
[?] greatly increases a VOCALOID's potential capabilities by allowing them to mimic certain dialects such as Cockney, North American and Scottish English.
In Japanese, glottal stops occur at the end of interjections of surprise or anger, and are represented by the character っ. However, adding っ to the lyrics will input a [Sil] phoneme instead.
Korean voicebanks use a different symbol for the glottal stop, [N].