! The following is a tutorial made for VOCALOID fans by fellow VOCALOID fans. !

The following is a tutorial on what is needed for a basic VOCALOID bedroom/indie musician set up. It highlights many of the things a producer may need to consider, some of which may not be obvious when first starting out. It is aimed at new VOCALOID producers in particular.



A special budget starter kit Amazon once sold showing some of the items that a bedroom EDM producer may have in their set up[1]

To begin, the user needs the following:

  • A computer, either a PC or a Mac
    • It must meet the requirements to run the software, so specifications such as RAM have to be checked.
    • VOCALOID and VOCALOID2 are only PC compatible, and the Windows version matters: VOCALOID is buggy on versions after Windows 7, and VOCALOID2 has some known bugs on Windows 8 and 10.
    • VOCALOID3, VOCALOID4 and VOCALOID5 are PC and Mac compatible, though be aware that the license for these versions covers only one of the two platforms, not both.
    • Physical copies of any version of the software also require a CD or DVD drive.
  • The VOCALOID software and at least 1 voicebank
    • VOCALOID3 and VOCALOID4 and their respective voicebanks can be purchased separately.
    • VOCALOID, VOCALOID2 and VOCALOID5 come with at least 1 vocal.
    • Piapro Studio is compatible with VOCALOID3 and VOCALOID4 vocals and comes with any VOCALOID3 or VOCALOID4 release from Crypton Future Media, Inc.
    • For users of Cubase, there is also VOCALOID Editor for Cubase.
  • A DAW (Digital audio workstation)
    • Recommended DAWs include Steinberg's Cubase or PreSonus's Studio One software. Piapro Studio also comes with a DAW.
    • DAWs that support ReWire and VSTi are recommended, as certain versions of VOCALOID, such as VOCALOID5, can use them. An $80.00 ReWire and VSTi add-on can be bought for Studio One Artist versions 3 and 4.
  • Audio editing software
    • Many DAWs include their own editing tools.
    • Being able to at least mix VOCALOID vocals with instruments is a required skill, but it is important to also be able to add various vocal effects such as vibrato, echoing, etc.
    • Free options like Audacity are also available.

Additional Extras

The following are non-essential but useful tools.

  • A MIDI keyboard can also be an option for general music making. VOCALOID5 can use one as an alternative input method, though older versions of the software may not.
  • A decent set of headphones for playback. A good headset makes it possible to hear delicate levels of sound, which helps the producer listen to their music as it is being created.
  • Various instrument-producing plug-ins
    • Samples, plug-ins and other software for creating music.
    • The DAW may already come with some of these, depending on the product.
  • A graphics tablet. This is usually used for media, especially in art-based projects, but certain tablets are recommended for music production because lines can be drawn in the parameter sections more quickly and precisely than with a mouse.
    • Tablets do not work well with VOCALOID itself because drawing lines within parameters is slow; however, they tend to work well with Piapro Studio.[2]


Initial Set Up

It is important to be aware that even a basic set up for VOCALOID can be expensive. An average producer could pay $200 or more on VOCALOID alone, depending on the quality they are after. In some places, such as the United Kingdom or European Union countries, VAT is added on top of the price of every purchase, for example 20% in the United Kingdom.

This is without including $100+ for a DAW, though free options are available.

This is the price of a basic bedroom producer set up. A budget therefore has to be set according to what exactly the producer has in mind, bearing in mind that non-essentials may have to be purchased at a later date.
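As a rough illustration of the figures above, the following Python sketch adds up a starter budget. The function name `starter_budget` and its parameters are illustrative assumptions, the $200 and $100 inputs are only the lower bounds mentioned above, and the 20% rate is the VAT example; real prices and tax rates will differ.

```python
def starter_budget(vocaloid_usd, daw_usd=0.0, vat_rate=0.0):
    """Return the total cost of a starter set up, including VAT if any.

    Hypothetical helper for illustration; all amounts are in US dollars.
    """
    if vat_rate < 0:
        raise ValueError("vat_rate must not be negative")
    subtotal = vocaloid_usd + daw_usd
    # VAT is charged on top of the listed price of every purchase.
    return round(subtotal * (1 + vat_rate), 2)


# Illustrative figures only: $200 for VOCALOID, $100 for a DAW.
no_vat = starter_budget(200, 100)
with_vat = starter_budget(200, 100, vat_rate=0.20)
free_daw = starter_budget(200, 0, vat_rate=0.20)
print(no_vat, with_vat, free_daw)
```

Comparing the three totals shows how much of the starting cost comes from tax and how much a free DAW can offset it.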

The main reason to set a budget is that it can be tempting to spend a lot of money when first starting out, when it is better to focus on a few things such as percussion, a lead instrument, etc. Many modern bands consist of a keyboardist, a lead guitarist, a bass guitarist, a drummer and a lead singer, with backing singers and a rhythm guitarist as optional extras. A basic guide to which instruments to invest in can vary from this, but will follow the same sort of pattern. Knowing exactly what is needed before setting out saves money and reduces costs in the long term.

Other styles require different set ups. For example, classic jazz uses a quite different set of instruments, such as a saxophone, a trombone, a trumpet, a cello or double bass, and a classic piano. Opera style music will usually work with orchestral instruments such as harps, classic pianos and other traditional orchestra instruments. If the user is aiming for this style of music, companies such as Zero-G Limited or Internet Co., Ltd. often have sample packs and other resources covering it. Checking what studios have to offer is a good habit to have. The genre of music will likely also influence which type of VOCALOID voicebank the user purchases.

Some developers of VOCALOID voicebanks are aware of the cost of getting into VOCALOID. A number of multi-bank products have been released both as a complete package and as separate voicebanks. Buying all the voicebanks separately can be expensive, and a producer usually saves money by buying the "complete" bundled release. However, on a low budget, separate purchases may be the only way a producer can obtain all of a VOCALOID's voicebanks: the cost is spread out over a period of time, allowing the producer to slowly collect all the voicebanks, or just the ones they want.

An example of this is the Megpoid V4 release.

Music can be made with only a single instrument such as a piano or a guitar, and a singer alone can create a cappella style songs. While other instruments may be needed for variety, having 50 plug-ins for a guitar is unnecessary if the user is only going to use one or two of them.

In addition to the items on the "essentials" list on this page, a producer can also buy many other types of sound recording equipment or MIDI instruments, depending on what they want to do. However, these are not necessarily important or useful and depend entirely on the producer's needs or goals with their music. For example, the VOCALOID Keyboard is used for live performances and is one of the few ways of creating spontaneous VOCALOID results. Unless a producer is performing on stage, it is not necessary at all. Likewise, the VOCALOOP is also intended for live or random sound creation, but it is not only unnecessary but also extremely difficult to find due to the limited number of units sold. When first setting up, sticking to the basics is better than investing in things that a producer may never use or that may not prove useful at all for VOCALOID music making.

Updating a Set up

One thing to consider is how long the producer's set up will remain effective. It may not last forever, and paying for items as they need to be replaced or updated may become an issue. This may not seem important when the producer is just starting out, but it can become a problem in the future. Updating a set up also matters because it lets a producer increase the quality of their music. A new user should note this before they begin so it does not come as a surprise later on. By carefully budgeting and taking updates into account, a producer can make the set up last and stay effective for a number of years.

Always be aware that technology moves forward and that any set up will sooner or later have to be updated. In terms of VOCALOID itself, the engine has so far been updated approximately every 4 years. Because this has been reliable so far, it is reasonable to plan in 4-year periods. A decent set up now can last up to 8 years before there is any need to update, but in 12 years it may be completely out of date.

However, a producer does not always have to buy the latest equipment. Older equipment can often be purchased second hand, which brings down the price; a set up that is only a couple of years out of date can still last for a number of years. Normally, musicians find a comfortable set up and stick with it, adjusting only slightly over time: they replace or add individual items and budget for what they update, rather than replacing the entire set up at once. A complete overhaul is very rarely needed unless the set up becomes completely out of date. Some producers even prefer older models of equipment or older software versions because features and capabilities were removed in later versions. A lot of VOCALOID musicians feel this way about VOCALOID4 versus VOCALOID5, for example.

So while thinking about the future may not seem an obvious thing to do immediately before or after completing the initial set up, it helps to make a producer aware of what to expect going forward. An organized producer will always have a plan, even a rough one, for their future goals. After all, the set up is a type of investment, be it for hobby or business purposes.

Updating VOCALOID

As for VOCALOID itself, while updating adds new features, there is not always a need to update as soon as a newer version of the engine is released. Older voicebanks usually benefit from being moved into a newer engine, but updating is expensive, so it is not always possible to do so as soon as the new version comes out.

A number of discounts are usually available for updating, but these are not universal, and some are offered to Japanese customers only. While a discount does save a producer money on each move between engine versions, this does not mean a great amount of money is saved in the long run.

A producer can actually save more money by skipping engine versions, in other words buying every other version of the software. For example, going directly from VOCALOID3 to VOCALOID5 costs less than buying VOCALOID4 and then VOCALOID5. So if a producer has VOCALOID3, they might as well get the VOCALOID5 engine and ignore VOCALOID4. Unless they work at a professional level, there is no need to be fully up to date, and the indie producer or hobbyist can afford to have a slightly older set up. So despite VOCALOID updating about every 4 years, a producer can get away with updating every 8 years.
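To make the difference concrete, the sketch below counts how many engine purchases a producer makes over a given period, assuming a new engine roughly every 4 years as described above. The function `engine_purchases` and the 16-year horizon are hypothetical, and no prices are used because they vary by region and discount.

```python
def engine_purchases(years, release_interval=4, buy_every_nth=1):
    """Count engine purchases over `years`.

    release_interval: assumed years between engine releases.
    buy_every_nth: 1 buys every release, 2 buys every other release.
    """
    if release_interval <= 0 or buy_every_nth <= 0:
        raise ValueError("intervals must be positive")
    releases = years // release_interval
    return releases // buy_every_nth


horizon = 16  # hypothetical planning horizon in years
every_release = engine_purchases(horizon)
every_other = engine_purchases(horizon, buy_every_nth=2)
print(every_release, every_other)
```

Under these assumptions, skipping every other release halves the number of engine purchases over the same period.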

Updating voicebanks

With regard to voicebanks, the situation is very similar. There is not always an advantage or a need to update a voicebank every time VOCALOID gets a new release, and doing so can be expensive. Voicebanks do get dated (there is no denying that VOCALOID2 vocals are overall inferior to modern voicebanks), but the rule does not apply in every instance. Updates can also add to or expand on previous releases and be worth buying to gain access to these additions. For example, when Yuzuki Yukari was updated to VOCALOID4, two new voicebanks were added on top of the original. Yukari also received other additions: exVoice, and an update of her VOICEROID+ voicebank to VOICEROID2, making her a well supported update in more ways than just with VOCALOID.

However, not every voicebank update features a vocal that has been hugely improved. While there is obviously a difference between the original Gackpoid and V3 Gackpoid, it is fair to ask whether there is much difference between V3 Gackpoid and Gackpoid V4. Improvements and adjustments were made between these versions, but in this example there were fewer reasons for a producer to rush to update from V3 Gackpoid to Gackpoid V4. The same is true of a number of other VOCALOID voicebanks. This applies especially if the user owns VOCALOID5, which does not have access to XSY, so the difference between the two comes down to basics such as the quality improvements in Gackpoid V4. There may therefore be no reason to chase after both versions at all.

In short, voicebank updates are not always as good value for money as they may at first appear, and it is possible to make do with the same voicebank for years. Backwards compatibility from VOCALOID3 onward exists for a reason: producers have been known to use the same voicebank for years without ever updating to the newer version.

See Status for a list of all releases.

Extra Resources?

Some VOCALOID voicebanks come with additional resources, which are often mentioned in the VOCALOID voicebanks release pages here on the VOCALOID wiki.

One such example is "Talk" vocals.

Such extra resources often complement the VOCALOID voicebanks and are worth checking out. Used correctly, they can enhance the way a voice can be used, simply by mixing the VOCALOID voicebank or voicebanks with their respective resources. "Talk" voicebanks, for example, give the user the ability to switch between singing and talking, which benefits some styles of music such as Singspiel or rap.

Other resources such as the AH-Software Co. Ltd. "exVoice" sample releases add phrases that can be used for natural sounding effects. They can be used as loops or cut and spliced with the VOCALOID voicebank. For example, the Kizuna Akari "exVoice" was recommended for splicing consonants and using them to replace VOCALOID ones, as VOCALOID is particularly weak when it comes to the strength of its consonants.

At the same time, while these are useful, a producer should also consider whether they will actually use them. Not everything released to complement a VOCALOID voicebank will be useful to them, and if the producer is only focused on the basic VOCALOID voices, there may be no point in chasing that "Talk" voicebank or those extra samples.

Free Resources?

In addition, due to how pricey a VOCALOID set up can be, taking advantage of "freebies" can be a help when starting out.

For example,

  • The VOCALOID PHRASE PACK SERIES for VOCALOID5 is a free download.
  • Haruno Sora later received a free "Attack Release" function which, while made mainly for her vocal, was revealed to also work with other vocals.
  • Clara, Bruno and MAIKA all had language support plug-ins for VOCALOID3 and VOCALOID4 that converted VSQ and VSQX files from Japanese phonetics to Spanish, allowing the Spanish VOCALOIDs to sing "Japanese" via Japanese-to-Spanish conversion.

These freebies may often be minor, but they should not be skipped over even if the producer does not plan to use them.

Some freebies also come with the voicebank itself, such as Sachiko's "Sachikobushi" plug-in, Zola Project's "ZOLA_Unison" plug-in or the V4X range's E.V.E.C. functions. It is always worth reading what comes with each voicebank, as releases often include resources such as VSQ/VSQX/VPR files or even samples within the software download or on the CD/DVD itself.


A producer looking to publish their work in any way should never turn to pirated software, whether the work is for YouTube, Nico Video, SoundCloud, a potential client or employer, a bandmate who is going to perform on stage, etc. The simple reason is that the impact can be devastating when others find out that a producer has been using illegal versions of software: a producer's reputation can easily be ruined as a consequence.

For more information, see POCALOID.

Piracy is not the best way to support a music career or VOCALOID itself, and it has been known to cause devastating backlashes against the producer. In the end, no one can stop someone from using illegal versions of software, but the fault for using it then lies with the producer and no one else.

There are always alternatives such as UTAU, Alter/Ego or Synthesizer V, all of which have access to free voicebanks and can act as alternatives to VOCALOID. Because VOCALOID is made with professionals in mind, its market is small and its price is high. Not everyone can afford it, and in that case it helps to remember that VOCALOID is a luxury item, not an essential piece of software for a set up. VOCALOID can therefore be out of reach for some producers, and that is unlikely to change, as Yamaha does not produce VOCALOID for charitable reasons.

General knowledge

Music terms and music theory

Music theory is the collective knowledge of the techniques and technicalities of how music works. For example, "silence" and "pregnant pause" are two basic terms; though both involve a lack of sound, they are different from each other. When someone learns about music theory, they are taught these basic differences, which will help when writing music or just learning about it in general.

Music theory becomes essential in writing and making music. Each of the 7 basic notes (A, B, C, D, E, F, G) has a specific frequency that can be picked out from all the others. For example, a violin string playing a B4 note vibrates at about 494 Hz. If a musician works with that instrument, over time they can learn how a "B" sounds, recreate that note on demand whenever needed, and correct the placement of their finger on the string if the note is out of tune. Being a producer of music takes this a step further and requires more than just learning how to recreate that sound on demand: the producer also has to piece it together with other sounds and construct something that is pleasing to the ear.
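The link between note names and frequencies can be sketched in a few lines of Python. This sketch assumes twelve-tone equal temperament and the common A4 = 440 Hz tuning reference, neither of which is stated in this tutorial, and the names `note_frequency` and `SEMITONES_FROM_C` are illustrative. Under those assumptions, the note at about 494 Hz is B4, and the B one octave higher (B5) sits at twice that frequency.

```python
A4_MIDI = 69    # MIDI note number of A4
A4_HZ = 440.0   # assumed tuning reference

# Semitone offsets of the 7 natural notes above C within one octave.
SEMITONES_FROM_C = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_frequency(name, octave):
    """Frequency in Hz of a natural note, e.g. ("B", 4), in equal temperament."""
    midi = 12 * (octave + 1) + SEMITONES_FROM_C[name]
    # Each semitone multiplies the frequency by the twelfth root of 2.
    return A4_HZ * 2 ** ((midi - A4_MIDI) / 12)


for name, octave in [("A", 4), ("B", 4), ("B", 5)]:
    print(f"{name}{octave}: {note_frequency(name, octave):.2f} Hz")
```

The same MIDI note numbers are what a piano roll or MIDI keyboard uses to identify each key.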

Music theory therefore teaches the producer how to write at a given tempo, what the terminology of music is and, essentially, the difference between general noise and something that sounds like it has direction. In short, they have to learn how the instruments work together to create a song, but they also have to learn how to write the song to begin with.

As VOCALOID is music software designed to give musicians a professional singer at their disposal, it is important that the user understands at least the basics of how to write music. Since VOCALOID is considered a vocal instrument, they should understand what scales and vocal ranges are and how they work (such as bass, baritone, tenor, countertenor, contralto, alto, mezzo-soprano, soprano, boy soprano). Music theory relating specifically to the human voice may thus be the most helpful of all.

The better the idea a user going into VOCALOID has of how music works, the better they will understand how to make it. If they do not know at least some basic theory, it is recommended that they take a course with a local educational institute such as a high school or college, read theory books, or look up online tutorials, lectures and other informational pages. Many of the mistakes new musicians make can easily be fixed with the right knowledge, and learning it is part of becoming a musician.

Software knowledge


While most VOCALOID guides are in Japanese, magazines such as ボカロPになりたい! (Vocalo-P ni Naritai) are useful for teaching a producer how to use the software

In addition to musical terms and theory, the user will have to know at least the basics of how all the software they are planning to use works. There are often online tutorials, and in some cases the software comes with a tutorial or example works to examine.

The user should reach a point, when they begin, where they know how to use all their software for basic music production, i.e. writing a song with the bare minimum of knowledge. Since many things within a piece of software are learnt through use, a producer does not have to know every last thing it does, though they should work towards learning many of its available tools. For example, within VOCALOID itself, it is not strictly essential that they learn how to use phonemes, and it is technically possible to use VOCALOID without lyrics at all. What is essential is that they learn how to create the vocal arrangement and use it in conjunction with their DAW, with a basic understanding of how to mix it correctly with the other instrumentals of the song (VOCALOID being considered a musical instrument itself).

Besides non-lyric arrangements, if the producer has trouble writing lyrics, VOCALOID can be used for basic loop arrangement. VOCALOID5 includes 1,000 phrases in Japanese and English for such arrangements, and the VOCALOID PHRASE PACK SERIES was created to add more.

Users need a basic understanding of how save files are made within VOCALOID and all relevant software, and how to load them. Saving a work in progress is essential and a habit worth learning. For VOCALOID3, the VOCALOID-P data series was created to act partly as examples of different VSQ and VSQX arrangements, with various materials being made available within its releases.

Magazines such as DTM MAGAZINE have included tutorials while other books and mooks have existed such as the ボカロPになりたい! (Vocalo-P ni Naritai) magazine. Almost all examples, however, are in Japanese.


The piano roll as it can be seen in LEON's interface

One other thing to note is that even if a producer is not leaning towards music involving a piano, VOCALOID and most modern music software have a piano roll. This is because most modern music software is designed to be used with a MIDI keyboard, or expects its users to have basic knowledge of the piano because of how common MIDI keyboards are in music making. In short, whether or not a producer wants to, they are going to have to learn a bit about the piano or keyboard just to know where the keys are within VOCALOID's piano roll. VOCALOID5 also supports MIDI keyboards, allowing real-time editing, and anyone working with that version is strongly advised to purchase a MIDI keyboard if they do not already have one integrated with their DAW.

Choosing a voicebank

While there are many ways to approach the purchase of the user's first VOCALOID, it is best to ask basic questions and form an idea of basic planning especially if wishing to buy further vocals in the future. Several of these factors can limit the direction of VOCALOID purchases and usage.

Engine Version

One of the factors at play when choosing a voicebank is the engine the producer is looking at and the options available for that engine. The engine version is a huge factor in which voicebanks can be purchased. For example, if a user owns VOCALOID3, it is not possible at all to use VOCALOID4 or VOCALOID5 vocals, so they must purchase voicebanks that work with that engine, which limits them to VOCALOID2 and VOCALOID3 vocals. However, VOCALOID4 allows the use of VOCALOID2, VOCALOID3 and VOCALOID4 vocals. It is important to plan purchases according to the version a producer wants to invest in.
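The compatibility rules above can be written down as a small lookup table. This is only a sketch of the two engines described in this section; the names `ENGINE_COMPAT` and `can_load` are hypothetical, and other engines are deliberately left out because their rules are not spelled out here.

```python
# Engine version -> voicebank versions it can load, as stated in this section.
ENGINE_COMPAT = {
    3: {2, 3},
    4: {2, 3, 4},
}


def can_load(engine_version, voicebank_version):
    """Return True if the engine can use a voicebank of the given version."""
    if engine_version not in ENGINE_COMPAT:
        # Not covered by this sketch; check the official requirements instead.
        raise ValueError(f"no compatibility data for VOCALOID{engine_version}")
    return voicebank_version in ENGINE_COMPAT[engine_version]


print(can_load(3, 4))  # a VOCALOID4 voicebank on the VOCALOID3 engine
print(can_load(4, 2))  # a VOCALOID2 voicebank on the VOCALOID4 engine
```

Checking a planned purchase against a table like this before buying avoids ending up with a voicebank the owned engine cannot load.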

The core engines are not the only options, some others including:

Note that an older version of VOCALOID is not worth more than the price of the most recent release of the software. For example, paying $1,000 for the VOCALOID engine is strongly discouraged if the current version is $400. Unless the user is a collector looking for older versions of the engine, these should not be sought out. For example, while XSY may be absent in VOCALOID5, the function is not worth $1,000, the price at which some units of VOCALOID4 were sold after the release of VOCALOID5. This can be an issue when seeking older versions of VOCALOID.

Language + Phonetics

One of the most common mistakes is not taking language into account. As much of the western VOCALOID fandom may also be into J-Pop and anime/manga, it is common for fans to snap up the chance to own a Japanese voicebank. However, without a basic understanding of Japanese, the producer is limited to cover songs, basic vocal arrangements or "English" sung with a Japanese voicebank. Machine translation from English to Japanese or vice versa is not recommended either, as it produces results that are nonsensical to both a native Japanese speaker and an English speaker.

The following is a list of available Phonetic information that the Wiki supplies on the 5 major languages Vocaloid has produced vocals for;

Other than this, it is notable that Japanese vocals were greatly improved after VOCALOID2. Japanese VOCALOID3 and later vocals are superior to their VOCALOID2 counterparts in every way. The next major improvement to Japanese vocals came in VOCALOID4: from Chika onwards, VOCALOIDs began to be recorded in a new style that captured more traits of the vocalist. This addressed an issue with VOCALOID3 vocals, which were reported to sound too alike.

English vocals saw a vast improvement in VOCALOID4, having been set back by Yamaha selling a faulty Dev Kit which contained errors. Pre-VOCALOID4 vocals in general have more errors as a result. English is the 2nd most supported language.

Spanish vocals have not seen a release since MAIKA in the VOCALOID3 engine.

Korean has only two vocals, SeeU and Uni, with SeeU released for VOCALOID3 and Uni for VOCALOID4.

For Chinese, some vocals are currently available for VOCALOID3 and VOCALOID4, and this is currently the 3rd most well supported language.

When buying a VOCALOID in a language the user is not familiar with, it is recommended that they take time to learn that language and improve their skills in this area. Otherwise the producer has essentially bought software for a lot of money that they can never fully use correctly.

However, not everyone can pick up phonetics straight away, even in a language they are familiar with, as understanding language input via phonemes can be hard. This is especially true for more complex languages such as English and Spanish. Despite this, VOCALOID can usually be used with limited knowledge of phonetics and of how each phoneme works, as there is often a library of stored words for languages such as English. VOCALOIDs in those languages usually have several thousand of the most common words within their libraries, and the limit is mostly encountered when a producer has to write an uncommon word, which requires working out how to get the voicebank to say that word using its phonetics.

See Phoneme List for more details.


If a user is unsure about things, there are a number of multilingual options. These are recommended especially for those who wish to buy a VOCALOID for one language but want a back-up language in case they cannot use the desired one. In short, these vocals are a form of safety net that ensures a mistaken purchase is not a complete loss. However, in many cases the vocals in one language are not equal to those in the other, and the two will not work in the same way. For example, the Kagamine Rin & Len V4X release is quite a bit different to their English release.

The following is a list of options for those who want Japanese and English;

Both the standard and premium versions of VOCALOID5 also come with voicebanks in English and Japanese. The standard release includes Amy, Chris, Kaori and Ken, while the premium version adds VY1, VY2, CYBER DIVA II and CYBER SONGMAN II.

Other language options;

Phonetic conversion

As previously mentioned, it is also possible to use voicebanks for languages they were not made for.

See the following pages to get an idea of how to convert phonetics from one language to another:

Important: despite the myth, it is not possible to get high quality English results from Japanese VOCALOID voicebanks. Some understandable results can be obtained, but each VOCALOID is initially meant to work only with its respective language. In all cases, there are always missing sounds, including basic essential phonemes, needed for high quality results. It is important to remember this, as a number of even veteran fans have over time been seen recommending purchases such as Japanese voicebanks for English results.

As a beginner, the user would not be expected to refer to these charts, as they are considered "advanced" techniques. It is not recommended to rely on this knowledge when purchasing a VOCALOID, as it requires a general grasp of phonetic manipulation. In short, it is generally not recommended to buy VOCALOIDs in languages such as Japanese on the presumption that they can simply be used for "English" later, as the user may not be skilled enough at this manipulation.

A few plug-ins that work with VOCALOID3 and VOCALOID4 can automatically convert languages, such as MAIKA's. There are several unofficial ones as well, but unofficial plug-ins always carry a risk, and only those from recommended sources should be downloaded. Neither official nor unofficial plug-ins are perfect; they will make mistakes from time to time, so it is recommended to check all results.

Haruno Sora also has a supplied function called "Attack Release", allowing her to convert phonetics into "V" and "R", two sounds Japanese speakers generally struggle to pronounce. This allows her to say some words from foreign languages, particularly European languages such as English or Spanish, though it is not a substitute for either of them.

SeeU's Korean voicebank and MAIKA also both come with extra phonemes that can expand their language capabilities.

MAIKA's extra phonetic data allows her to say Catalan words, though it can be extended to words in other languages such as English. SeeU's extra phonetics are designed specifically for English recreation. In both cases, there is no supporting library of words, so even though SeeU has extra phonemes for English, she doesn't have a library of the most common English words. Knowledge of phonemes and how phonetic data works becomes essential for using these voicebanks' extra data. Other VOCALOIDs such as Prima or Sonika also have a trilled "R" sound.

None of these techniques is a substitute for a full voicebank in the target language, but they do give a voicebank a better chance of saying words outside its intended language. They simply offer a way of expanding the basic capabilities of a voicebank beyond the languages it was intended for. In short, they ensure that a producer can get more out of their VOCALOID voicebanks than the limited linguistic scope of the core voicebank.

The Voice

One thing to note about VOCALOID voicebanks is that each has its strengths and weaknesses and each is catered to a particular direction. Even if the user narrows the choice down to half a dozen voicebanks, they may still have trouble choosing the first vocal to invest in. A new producer should really look at only a handful of VOCALOIDs and not seek to own every voicebank that has been released. The typical VOCALOID song focuses on a solo or duet, so even though a producer may have access to 8 voicebanks, they may commonly use only 1 or 2 of them. For example, though Hatsune Miku has 7 voicebanks in Japanese, her two most commonly used vocals are "Original" and "Dark". This is partly why "Light" and "Vivid" were dropped for her Hatsune Miku V4X update.

It is not safe to presume that because a user likes a voicebank they can work with it, as not every VOCALOID is easy to use and each has its own personality or vocal traits. It is very easy to hear a good result from a VOCALOID they like and, through confirmation bias, take it as proof that this is the vocal they should go after. Likewise, it is very easy to listen to a bad result and have it confirm that the VOCALOID is bad. In both cases, good or bad, the result simply confirms what the user already believed about the vocal.

The Kagamines' original release was known for its low quality results and was not beginner friendly. For a producer who wanted their vocals, this was a rough purchase.

One argument is that if a user likes the VOCALOID, they will easily enjoy using it, since the vocal appeals to them. However, the downside of this idea is that if they cannot get a good result from their favourite VOCALOID, the user may feel they have made a mistake and may stop using that vocal, or VOCALOID altogether. It is better to have a balanced view of a VOCALOID before buying it, even if they like it, and to understand as well as they can what to expect from it. In the end, the VOCALOID that suits the style of music the user wishes to create may not be their favourite, so it is best to keep an open mind and not dismiss every VOCALOID they dislike, or they may miss some diamonds in the rough. There are plenty of other things that can be considered.

VOCALOIDs usually come with a list of recommended tempo and vocal ranges. The tempo range indicates the genres of music the VOCALOID is considered best at. For example, country and rock music generally fall in a 70-140bpm range, and GUMI can cover this easily as her own tempo range is 60-175bpm, making her a suitable option for these types of songs. After tempo, the next thing to consider is tone, as each tonal voicebank has a different vocal timbre. GUMI's "Natural" has a tone more suited to pop music, so for country and rock a user would be looking at "Adult" or "Power", while ignoring her "Whisper" and "Sweet" vocals.

This approach works for all VOCALOIDs. If a user were to choose Yukari instead for country and rock music, they would have to consider what her VOCALOID4 voicebanks have to offer and work from there. Her normal "Jun" vocal is recommended for ballads as it has a 60bpm-120bpm range. While it can cover rock and country music, it cannot do double tempo (140bpm) so well and will only be able to handle the genres' slower tempo of 70bpm. "ONN" may theoretically be better for moodier slow songs and is ideal for slow-medium tempo jazz genres such as smooth jazz, because it is even softer in tone than the "Jun" vocal. It can also handle slower styles of ballad such as the sentimental ballad. But if the user really wishes to pursue country and rock music with Yukari, then her "LIN" vocal with its 80-200bpm range would be the better option, as it can handle these genres the best overall.
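The range comparison described above can be sketched in a few lines of Python. This is only an illustration: the names `TempoRange`, `VOICEBANKS`, `COUNTRY_ROCK` and `report` are made up for this example and are not part of any VOCALOID software, and the numbers are simply the ones quoted in this tutorial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TempoRange:
    """A tempo range in BPM (hypothetical helper for this example)."""
    low: int
    high: int

    def covers(self, other):
        # True when this range contains the whole of the other range.
        return self.low <= other.low and other.high <= self.high

    def overlap(self, other):
        # The shared part of both ranges, or None if they do not meet.
        low = max(self.low, other.low)
        high = min(self.high, other.high)
        return TempoRange(low, high) if low <= high else None


# Recommended tempo ranges as quoted in this tutorial.
VOICEBANKS = {
    "GUMI": TempoRange(60, 175),
    "Yukari Jun": TempoRange(60, 120),
    "Yukari LIN": TempoRange(80, 200),
}

# Rough genre range as quoted in this tutorial.
COUNTRY_ROCK = TempoRange(70, 140)


def report(genre):
    """Describe how much of the genre's tempo range each voicebank covers."""
    result = {}
    for name, tempo in VOICEBANKS.items():
        if tempo.covers(genre):
            result[name] = "full"
        else:
            shared = tempo.overlap(genre)
            if shared is None:
                result[name] = "none"
            else:
                result[name] = f"partial ({shared.low}-{shared.high} BPM)"
    return result


if __name__ == "__main__":
    for name, verdict in report(COUNTRY_ROCK).items():
        print(f"{name}: {verdict}")
```

With these figures GUMI covers the whole range, "Jun" shares only 70-120bpm of it and "LIN" shares 80-140bpm, which matches the reasoning above that "LIN" is the stronger Yukari option for faster songs. A check like this only compares numbers; tone and ease of use still have to be judged by ear.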

Music tempo ranges are not as rigid as this, and live performances do vary at times (a song can drift between 73bpm and 78bpm when the musicians are aiming for 75bpm, due to natural errors and variations). For DTM and EDM musicians, however, the tempo is much more fixed and the user does not have to worry about this.

When it comes to constructing a song's layers of music, it is also important to remember what a VOCALOID's recommended vocal range is. This tells whether the voicebank is best suited to a soprano, mezzo-soprano, bass, etc. vocal range. Often a song has to be catered to the vocal, and it is important to be careful when mixing the vocal with music. The VOCALOID's range determines the frequencies the vocal occupies, which means that the user will have to pick out instruments that suit it, and this also affects mixing. The vocal range also affects the role the vocal can take at times; for example, a soprano singer can be used for haunting atmospheric effects, while a bass vocal singing lyrics may sound powerful and deep.

Note that even if the user chooses a vocal that doesn't fit their style, once their skills are good it is possible to push VOCALOIDs beyond their limitations. However, this requires knowing where a voicebank's shortcomings are in order to fix them. In addition, though a VOCALOID's recommended tempo range is its best range, it does not immediately fail when drifting just 10bpm outside of it; going outside the range normally just means the tone is wrong for the genre of music. However, some VOCALOIDs do experience pronunciation errors outside of their recommended tempo and vocal ranges. It may often be better to make music that fits the VOCALOID, rather than making the VOCALOID fit the music.

Older voicebanks versus Newer

  • Gackpoid vocal comparison: VOCALOID2 (top) and VOCALOID3 (bottom)
  • Lily vocal comparison: VOCALOID2 (top) and VOCALOID3 (bottom)
  • Gachapoid vocal comparison: VOCALOID2 (top) and VOCALOID3 (bottom)

One last thing to consider is older voicebanks, which can have issues. They can be of lower quality overall, either because the recording technology of the time was less advanced or because standards were lower. They were also made by developers who may not have been as experienced at producing VOCALOID voicebanks.

Generally a number of these issues can be minor, but the best example is the comparison between Megpoid and V3 Megpoid - Native. The VOCALOID3 version not only has diaphonetic data, but was cleaned up greatly, and many of the VOCALOID2 version's errors, glitches and general sound issues were fixed. The VOCALOID3 version therefore became far superior to its VOCALOID2 predecessor, with the added bonus that when VOCALOID4 was released it could use XSY with all other VOCALOID3 and VOCALOID4 Internet Co., Ltd. vocals. Thus, there is little reason to track down the VOCALOID2 Megpoid vocal at all, except as a collector.

As of 2019, the only VOCALOID and VOCALOID2 voicebanks that do not have modern alternatives are the following:

  • Prima and Tonio - the closest "Opera" like VOCALOID is IA, but she is Japanese and they are English.
  • Sweet Ann - there is no perfect match for this voicebank in English, with possibly her closest match being MEIKO English.
  • Big Al - VOCALOID in general has a lack of deep, mature masculine vocals and there are currently no vocals that come close to Big Al at all.
  • MIRIAM - the closest two vocals to her are Macne Nana and Daina, but both lack the mature tone that MIRIAM has in her vocal.

All other voicebanks have modern alternatives.

  • SONiKA is not needed if the producer has both OLIVER and AVANNA.
  • Utatane Piko has several alternatives such as ZOLA PROJECT's "Yuu" vocal or Yohioloid.
  • LEON and LOLA have modern replacements in Amy and Chris.
  • All other voicebanks from these eras were updated, with VOCALOID3 and later engine vocals being much higher quality than their previous VOCALOID and VOCALOID2 versions.

In addition, many VOCALOID3 vocals have issues with sounding similar and not standing out from each other. A new recording style was used in VOCALOID4 for Japanese VOCALOIDs, creating more chance of two similar vocals behaving differently, and voicebanks from VOCALOID4 onwards generally tend to be more unique. The trade-off is that they can often be harder to use because of this recording style.

Lastly, with newer engine releases voicebanks have been able to do more, since VOCALOID can handle many new types of vocals and often handles the voicebank databases better with each subsequent version. This is reflected in new releases, which allow bigger ranges, more types of vocals, larger sample databases, and more functions such as plug-ins and features that work with the voicebanks, like E.V.E.C.

Barring updates, which most commonly keep the ranges of their previous version, the majority of post-VOCALOID3 voicebanks have been able to outperform VOCALOID and VOCALOID2 voicebanks. This shows in the number of ways they can be used for music production, whether because they can cover more roles or more genres, or because they can be manipulated to sound different from how they normally sound. It even extends to the type of vocal the engine can handle; for example, the "Power" voicebank from MEIKO V3 could only be added because of improvements in VOCALOID3. So the issue with older voicebanks covers more than just improved quality.

Seeking Recommendations?

One of the final things to note is that producers and fans will usually answer questions about their opinions on existing VOCALOIDs or give recommendations. It is tempting to ask for advice, because it is normal to want someone to vouch for a product before buying it, and VOCALOID itself owes part of its popularity to word of mouth. Since new information is coming out all the time, the more knowledgeable fans will often know where to start in sifting through the information on voicebanks, and they can be helpful.

However, this has proven to be both a good and a bad idea. While producers can speak from experience about the problems of voicebanks, they can also be biased or misinformed. There have been situations in the past where incorrect recommendations were made due to bias, and even professional reviews may favor a certain type of voicebank.

Prior to the release of Megpoid English, Gumi's Megpoid voicebank was often recommended for her "English".

Earlier in this tutorial, it was noted that there have been incidents such as a Japanese voicebank being recommended for English. This is just one of many examples of a bad recommendation, as a Japanese voicebank can never give the same level of quality in English as an English one. One such case was seen with Megpoid. In the demo "Fly me to the Moon", a Japanese voicebank was used to sing in English, and the vocals were understandable enough to English speakers. This gave the impression that the voicebank was great at English, even though it was made for Japanese, not English. As a result, Gumi's Megpoid voicebank (currently known as Megpoid Native) was put forward in recommendations for this reason.

But there have been plenty of other examples of bad recommendations, and they are mostly owed to the divide in opinions on, or approaches to, the VOCALOIDs.

For example, because so many choices are available and they are the most well supported and developed, the Japanese voicebanks are often viewed as an entirely "safe" recommendation for starter voicebanks. So new producers will often focus on them over all other VOCALOIDs even if they cannot use Japanese phonetics or, for that matter, speak Japanese.

This is especially true for those in the west who are into Japanese manga and anime. In particular, the voicebanks by Crypton Future Media, Inc. are often the most focused on due to their popularity, with other VOCALOIDs not even getting looked at because less is known about them.

However, bearing in mind that VOCALOID can be complex software, it is also not certain that a producer will be able to use it at all. VOCALOID is made to appeal to professional musicians as well as amateur ones, so it was written with that target audience in mind. Some of the issues in the past were simply a result of far less information being available, in particular to overseas fans and producers, with most information prior to 2010 being available only to Japanese fans. From VOCALOID3 onwards, more information has been supplied for overseas readers, both by Yamaha and within VOCALOID communities, so there are more reliable reactions and information available now in the VOCALOID5 era than there were in the VOCALOID2 era.

It is best never to take one person's opinion as speaking for all who have used the same voicebank. Some producers may not respond to questions at all. Not all of them have the time to help a new producer, are able or willing to help, or even notice that a question has been asked, and this does not necessarily mean that they are a bad person. Not everyone feels safe recommending a specific voicebank, and it is not a good idea to harass a producer for an answer if they don't give one. This is why it is recommended to use a trial version of a voicebank if one exists. Beyond this, the best places to find advice are VOCALOID fan specific communities, where a greater diversity of opinions can be found.

Ultimately, no amount of recommendations can force the producer to buy a particular voicebank, and it is the producer's own judgement that decides what they start out with. While this page is designed to help people know what they need to set up a VOCALOID working environment, it too should not be taken literally. Others have differing opinions on what that basic set up should be, and the situations of individual producers vary. The set up described on this page is for an amateur or indie musician working from their private studio or even bedroom and is not meant for a highly professional set up inside a studio. It is also subject to change as other editors add more information to it, and as the conditions for the modern "Indie VOCALOID set up" change with the progress of technology or of VOCALOID itself.

Expectations for Success

One last thing is to keep expectations low and not to expect any real success at all. Not everyone who enters the VOCALOID producer scene is an instant hit; some take a while to get anywhere and others do not get anywhere at all. It may or may not be down to the quality of a producer's music, as it can often be just luck. It is best not to approach VOCALOID aiming to be successful. Instead, aim to use it for fun, and if success comes anyway, it becomes a bonus.

It is worth noting that since 2014, interest in vocal synthesizers has waned in Japan, and VOCALOID itself has been impacted by this. The greatest growth currently comes from overseas interest rather than from within Japan itself. There is also rival software on the market, such as CeVIO Creative Studio.

VOCALOID itself is counted as EDM music and does not have a category of its own. Thus, a producer will not only be competing with other VOCALOID producers, they will also be competing with other EDM producers. In addition, if a producer's music leads to employment or other types of work within the music industry, it is often found that VOCALOID and similar software synthesizers are unwelcome. It is much easier to get somewhere using a real human singer than a VOCALOID singer, so there may come a time when a producer has to abandon VOCALOID as a main instrument. This does not mean abandoning it entirely, as VOCALOID can be used for background loops with a real singer and for other vocal effects, but it is worth bearing in mind that the music industry itself often favours traditional singers over vocal synthesizers.

