! The following is a tutorial made for VOCALOID fans by fellow VOCALOID fans. !

Optimum Range and Optimum Tempo are recommended ranges often given out by VOCALOID developers. See also the listing of vocal stats.


Optimum range and tempo were first given out in VOCALOID2 as a guide to where a voice performs best within VOCALOID itself. The first release to list them was Hatsune Miku, and it became common for Japanese voicebanks in particular to list the recommended optimum range of the voice and its tempo capabilities.

It is often argued that these are merely recommendations, but they act as a form of "beginner's guide" on how to use a vocal, and there is a reason they are given out. They let users know where each vocal gives its best singing results, which in theory makes each vocal easier to use, especially for newer users still learning the engine. A potential consumer can read the optimum range and tempo and judge whether a VOCALOID voicebank suits them and their style of music or, if they wish to buy it regardless, know what performance characteristics to expect.

It is important to remember that even within these optimum ranges, a VOCALOID still has limitations set by the engine itself. A VOCALOID is still a vocal synthesizer, and there is a limit to how good it can sound. Users shouldn't expect a result better than VOCALOID itself can give.

Despite how notable the optimum range and tempo are, very little has been said about them by the YAMAHA Corporation or any of the third-party voicebank developers. What is known comes mostly from the observations and reactions of the engine's users.

Optimum Range

VOCALOID3 list of optimum ranges for various releases on the VOCALOID STORE.

Of the two optimums, range has the greater impact, as it has far more apparent and distinct effects on how the vocal sounds.

Optimum range is where on the piano roll a VOCALOID is best at singing. This does not mean that a producer should stick only to this range, but it gives an idea of where to begin when working with the vocal. Usually, a VOCALOID's best keys lie within certain areas of the optimum range, and going outside the range brings a marked drop in vocal quality.
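As a rough illustration of the idea, the sketch below treats an optimum range as a span of keys on the piano roll and tests whether a note falls inside it. This is only a sketch under stated assumptions: the `semitone_index` and `in_optimum_range` helpers are hypothetical names, notes are written as a letter, an optional sharp and a single-digit octave, and the A2 to E4 range in the example is made up for demonstration and is not the listed range of any real voicebank.

```python
# Semitone offset of each key within one octave of the piano roll.
NOTE_OFFSETS = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}


def semitone_index(note: str) -> int:
    """Turn a name such as 'A#2' into a number that rises one step per key.

    Assumes a single-digit, non-negative octave. The value is only used to
    compare notes with each other, so no particular MIDI numbering
    convention is implied.
    """
    name, octave = note[:-1], int(note[-1])
    return 12 * octave + NOTE_OFFSETS[name]


def in_optimum_range(note: str, lowest: str, highest: str) -> bool:
    """Return True if `note` lies between `lowest` and `highest`, inclusive."""
    return semitone_index(lowest) <= semitone_index(note) <= semitone_index(highest)


if __name__ == "__main__":
    # Hypothetical optimum range, for demonstration only.
    lowest, highest = "A2", "E4"
    for note in ("G2", "A#2", "E4", "F4"):
        print(note, in_optimum_range(note, lowest, highest))
```

A note that fails the check can still be used; it only marks where tone collapse and extra editing become more likely.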

When venturing outside the optimum range, notes are much more prone to tone collapse, which impacts the singing results. How well this is handled depends on the version of VOCALOID. Users who know how to repair these issues will have no problems venturing outside an optimum vocal range, though some VOCALOID releases, such as the original Megurine Luka[1] or Megpoid[2] voicebanks, really suffered outside their optimum ranges. Megurine Luka's high notes became thin and weak, leaving her unable to go high into the octaves and making a soprano-style vocal range difficult to reach because her treble range would collapse. GUMI, meanwhile, could suffer heavy vocal collapse when venturing higher than a mezzo-soprano range, as background noises became more apparent and created a noise loop.

The optimum vocal range is based on the VOCALOID's combined recorded Stationary layers (how many there are and how many keys each layer covers), with some additional leeway depending on how compatible the voice is with the engine. Optimum ranges therefore double as an indication of the singer's vocal range, such as whether it is a Baritone, Alto, Soprano, etc. VOCALOIDs like kokone or Macne Nana V4 have fairly large vocal ranges due to the number of layers recorded or the quality of their recorded samples. However, as confirmed in regards to Dandy 704 from the Chipspeech software, having multiple layers doesn't necessarily increase a vocal range, as a single monotone layer gives a vocal the most flexibility and, in turn, a wider range of engine compatibility. The reason for recording more layers is instead to increase realism. An example is ALYS of the Alter/ego software, who originally had a mono-layer release and later a multi-layer release, with notable differences in realism and vocal range between them.

Once outside the recorded keys, VOCALOID continues to morph the voice to match each key's expected behaviour, with some improvisation on the engine's part being observed as a result. Vocal traits get stretched out and recalculated to hit each note, which can create additional glitches, bugs and general flaws. The result can be unpleasant at times, and realism especially can be lost. This is also why tones commonly collapse outside an optimum vocal range: the vocal traits that maintain tone quality are lost during the morphing needed to match notes outside the range.

As such, a user may like Hatsune Miku at a key of #A2, but a VOCALOID like VY2 can handle that note better because it is within VY2's vocal range, and VY2 suffers far fewer issues than Miku at this key. It therefore makes more sense to use VY2 at this lower range, as it will produce a higher quality result than Hatsune Miku will. However, bear in mind that this also works the other way around, with Miku being much better at soprano-range lyrics than VY2, unless VY2 is switched over to its "Falsetto" voicebank, VY2v3, which can handle that range.

Over time, studios have in many cases increased the number of layers they give a VOCALOID voicebank. For example, in VOCALOID2 the number of stationary layers was just 2, while by the end of VOCALOID3, 3-layered releases were being noted. The optimum ranges of VOCALOID2 vocals are therefore often smaller than those of many VOCALOID3 or later voicebanks. Voicebanks that are updated to later engines do not always gain layers, however, so they tend to stick closer to their original release ranges. Macne Nana V4 compared with her original release, the Character Vocal Series and the VY series are all examples of increased ranges, though Macne Nana remains the vocal with the largest increase. Vocals like Megpoid V4 maintained their original ranges.

Even when a vocal range is given out, it is not safe to presume that a VOCALOID will perform flawlessly and that no issues will occur, as exemplified by SONiKA. Though she has a vocal range of 12 whole keys, only the keys around #B2 to #G3 are crisp and clear, while the rest are mumbled. This can give the illusion that SONiKA is only capable of using the keys around the #B2 to #G3 range, when in reality the sharpness of these keys is caused by the layers of her vocal stacking up perfectly. This happens to many VOCALOIDs, though SONiKA is one of the most noticeable examples. Another example is KAITO V3 versus his original release: the apparent "vocal range" of the original VOCALOID release is not the result of an actual recorded vocal range, and the VOCALOID3 KAITO actually has the larger sample pool, meaning it has the larger vocal range.

In addition, it is important to remember that while a voicebank gives its best results within the optimum vocal range, its behaviour is not always consistent throughout. For example, kokone has both a normal range and a falsetto range built into a single voicebank. At points in the octaves, as a user goes up and down the scale, the vocal will sound either more like normal singing or like falsetto, sometimes switching without warning. Other voicebanks, such as SF-A2 miki, can sound very different depending on where in the octaves they are used, while KAITO's treble range is strong and his bass softer.

Optimum Tempo

Compared to the optimum range, tempo is a lot more flexible, particularly in later engines thanks to improvements in data handling. Optimum tempos for VOCALOID3 and later voicebanks venture as high as 200bpm, which is unheard of among VOCALOID2 optimum tempos.

Optimum tempo is centered more on the articulation layer's ability to maintain words, as well as on the genres of music best suited to the tone of the vocal. It is therefore based on the samples themselves, such as how they sound and how long they are. In short, it affects which genres of music the VOCALOID can cover well.

For example, a typical fast-paced rock song is set at 120bpm, while a ballad is closer to 60bpm. Yuzuki Yukari can handle both slow ballads and faster rock songs thanks to her 60 ~ 120bpm tempo range. Kizuna Akari, whose optimum tempo range is 70 ~ 190bpm, is much better at handling rock songs but worse than Yukari at handling ballads. Yukari can get around this with her later "JUN" release from Yuzuki Yukari V4, whose 80 ~ 200bpm range makes her even better than Akari at these faster-paced genres, though once again she loses out on her coverage of ballads in the process.
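The comparison above can be sketched as a simple lookup. The tempo ranges are the three quoted in this section; the `TempoRange` type and the `voicebanks_for` helper are hypothetical names used only for illustration, and treating both ends of a range as inclusive is an assumption.

```python
from typing import NamedTuple


class TempoRange(NamedTuple):
    """An optimum tempo range in beats per minute."""
    low_bpm: int
    high_bpm: int

    def covers(self, bpm: int) -> bool:
        # Assumption: both ends of the listed range count as inside it.
        return self.low_bpm <= bpm <= self.high_bpm


# Optimum tempo ranges as quoted in this section.
OPTIMUM_TEMPO = {
    "Yuzuki Yukari": TempoRange(60, 120),
    "Kizuna Akari": TempoRange(70, 190),
    "Yuzuki Yukari V4 JUN": TempoRange(80, 200),
}


def voicebanks_for(bpm: int) -> list[str]:
    """List the voicebanks whose optimum tempo range includes `bpm`."""
    return [name for name, tempo in OPTIMUM_TEMPO.items() if tempo.covers(bpm)]


if __name__ == "__main__":
    for bpm in (60, 120, 190, 200):
        print(bpm, voicebanks_for(bpm))
```

With these figures, only the original Yuzuki Yukari covers a 60bpm ballad, all three cover a 120bpm rock song, and only the "JUN" release reaches 200bpm.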

Tempo ranges are a lot easier to work around than vocal ranges, as tempo doesn't necessarily affect the VOCALOID enough to notice. In addition, the tempo range is often independent of the size of the optimum vocal range, as exemplified by the ZOLA PROJECT releases, whose 3 voicebanks each have a small optimum vocal range but a huge optimum tempo range. A small vocal range therefore does not mean that a voicebank cannot handle a wide range of music.

The lyrics are the thing most notably affected when a song sits too far beyond either end of the optimum tempo range. Tempos that are too high or too low can produce low quality results.

At high tempos, words are not heard clearly and do not form correctly. One of the earliest examples of this is "裏表ラバーズ (Ura-omote Lovers)", in which some listeners find they can barely make out the lyrics because of how fast the tempo is at times. In some voicebanks, words collapse completely, or sounds such as diphones or triphones are skipped. Faults like this, however, can barely be heard at extreme tempos, simply because the song is moving too fast for the listener to pick them up.

In contrast, at lower tempos the lyrics are not so much the problem. Instead, faults in the voicebank itself, such as a lack of smoothness, tend to show up more, and tone collapse can also be more noticeable. Very few voicebanks currently have optimum tempos set below 70bpm, so very few can handle these slower genres of music.

Since users tend to use a VOCALOID for any genre of music, this optimum is the more ignored of the two.

Unlisted Optimums

Some VOCALOID releases, such as YOHIOloid or the Standard Vocals, do not list their optimums, although the optimums still exist. Those who are used to finding the best ranges of a VOCALOID have little trouble working them out, though misconceptions about ranges can develop if a user doesn't know how to use a VOCALOID.

Without general editing tools or the education to use them, unlisted optimums can work against any user who can't identify the vocal ranges of these voicebanks, and can make the voicebanks off-putting to use. This can also lead to myths about the vocal range. For example, before E-Capsule Co. Ltd gave out an optimum vocal range, SONiKA was generally thought to have a very poor vocal range due to the aforementioned issue of having a few sharp-sounding keys while all others were muffled.

The 5 vocals of the original VOCALOID engine in particular have no optimum range or tempo listed. This is partly due to how differently the engine worked compared to VOCALOID2 and later engines, as its method of referencing samples was different. Vocal range was more likely to depend on how compatible the vocal was with the engine. KAITO was highly compatible with the engine, which meant he had a lot of "give" to his vocal. While compatibility is still an issue in modern VOCALOID engines, the original VOCALOID is the most notable for it. As mentioned earlier, this is why users often feel KAITO's original VOCALOID vocal has more vocal range than his later VOCALOID3 release.


  1. Taken from notes on Nico Nico Pedia
  2. Taken from notes on Nico Nico Pedia