Vocal Synthesis Tool UTAU (歌声合成ツール UTAU) (commonly shortened to UTAU) is a voice synthesizer program currently available for Windows and Mac OS X systems (the Mac version being named UTAU-Synth).


UTAU is a shareware vocal synthesizer program, unlike VOCALOID, VOCALOID2, and VOCALOID3, which are commercially sold programs with an accompanying voice bank. Distribution of UTAU began in March 2008.

UTAU, meaning "to sing" in Japanese, has its origin in "Jinriki VOCALOID" (人力ボーカロイド, "Manual Vocaloid"), the act of re-editing an existing singing voice, extracting tones as WAV files, and reassembling them. In December 2007, Ameya/Ayame (飴屋/菖蒲), using LOLI.COM's voice samples, released beta software called Loliedit featuring a simple voice bank called "Loline Com" (a pun on the original voice provider and the "ne" particle originally used on Crypton's VOCALOID). It features a simple interface with a piano roll, has limited mora (Japanese syllables), and works with a primitive beta engine (or "resampler") that was later updated in UTAU. This beta software can still be downloaded from Ameya/Ayame's website. In March 2008, Ameya/Ayame released a free, advanced support tool to aid the "Manual Vocaloid" process, called UTAU.

The program comes with a default voicebank of 142 samples of Japanese syllables generated from the default voice of A-QUEST's text-to-speech software "AquesTalk". Any user can load their own voicebank into UTAU. However, using a person's voice without the explicit permission of the voice donor is a violation of copyright laws. Those laws protect the rights of any vocalist, such as a celebrity, who may not wish for their voice to be used within the program. Any music made through this program can be used in the commercial sector. UTAU can be downloaded for free from the home page. It will not run properly on computers which do not support Japanese text or AppLocale.

Some UTAU voicebanks have been put out as "real" VOCALOIDs, such as the April Fool's joke origins of Teto Kasane. Songs using both UTAU and VOCALOID are also not unheard of. Some users have also begun to enforce their copyright over their voicebanks; UTAU or fanmade VOCALOIDs that plagiarize an UTAU's name or use a voicebank without permission risk violating UTAU software agreements and voicebank copyright ownership.

Usage in Music

UTAU is well supported as an alternative to VOCALOID and is favoured in both the VOCALOID and UTAU fandoms, with VOCALOID fans often supporting it as an alternative to pirating the VOCALOID software itself. This support extends to its ability to create a new vocal from scratch, since this is an ability VOCALOID commonly lacks. Other than this, the principle of both programs is the same, and UTAU and VOCALOID share a multitude of traits and abilities.

For those unsure of their handling of VOCALOID, UTAU can also act as an introduction to synthesized vocals and aid in making the decision to purchase a VOCALOID.

UTAU's popularity is owed to some major differences between it and the VOCALOID software (listed below in the "Strengths" and "Weaknesses" sections). It is for these reasons that there is some debate as to whether this software is overall better or worse than the VOCALOID software. It is able to compete with VOCALOID because there is a sizable gap between the areas each program covers. UTAU has earned a reputation as the closest rival software to VOCALOID for these reasons and has managed to stay competitive over the course of its existence, whereas other software such as Cantor failed to see continued development.

✔ Strengths

UTAU saves data in the .UST (UTAU Sequence Text) format and is capable of converting .VSQ files to .UST. Since few software packages besides VOCALOID itself can read the .VSQ file format, UTAU has been an attractive alternative and partner software to VOCALOID.

UTAU also has the advantage of having its development occur at a faster pace. It has plug-in support, and users have made a number of plug-ins that greatly improve the software's handling and experience. This support was established fairly early in the software's existence, whereas VOCALOID did not gain this ability until VOCALOID3 in late 2011, and even now it only offers limited access to source code and plug-in support. The plug-ins for UTAU can therefore prove invaluable to users, as they can greatly affect the software's results and quality.

UTAU had the advantage of receiving updates that took instant effect, and it was able to adjust to feedback, suggestions, and other such ideas. As a result, triphone ("VCV"; vowel-consonant-vowel) voicebanks were created by 2010, whereas VOCALOID did not gain this capability until 2011, when VOCALOID3 was released. Even in comparison to VOCALOID3, the number of languages offered is much larger, with some vocals able to handle more than 10 languages. For VOCALOID, there are very few VOCALOIDs with bilingual capabilities, and the software only offers 5 languages at the most. Voicebanks work with practically any version of the software, so issues seen between different versions of VOCALOID and VOCALOID2 software (such as those displayed by KAITO and Prima) are usually absent.

The UTAU software is open license, which means that vocals from other software can be used in conjunction with it, so long as this complies with the other software's agreement (VOCALOID cannot be used in UTAU legally for this reason, as its licensing is restricted). There are hundreds of vocals for the software, and the types of vocals are much broader, covering a variety of different genres and vocal types. Most of these vocals can be obtained for free. In VOCALOID, one is restricted to just the vocals offered for sale, with no chance of producing one's own vocals for the software should none of the current releases spark one's interest.

✘ Weaknesses

UTAU is one of the few programs able to convert VOCALOID data files for its own use. However, .UST files themselves do not hold as much data as the VOCALOID engines' VSQ or VSQX files, and UTAU does not try to convert many things into even their rough equivalents, only placing the notes. As a result, loss of data may occur. It currently does not support the VOCALOID5 extension, VPR.
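The kind of data loss described above can be illustrated with a minimal Python sketch. The `SourceNote` and `PlacedNote` records and their fields (`vibrato_depth`, `dynamics`) are hypothetical stand-ins and do not reflect the actual VSQ or UST schema; the sketch only assumes that a notes-only conversion keeps pitch, length, and lyric and drops everything else.

```python
from dataclasses import dataclass, fields


@dataclass
class SourceNote:
    """Hypothetical note from a richer project file (not the real VSQ schema)."""
    pitch: int          # MIDI note number
    length: int         # duration in ticks
    lyric: str
    vibrato_depth: int  # illustrative expression parameter
    dynamics: int       # illustrative expression parameter


@dataclass
class PlacedNote:
    """Hypothetical note after a notes-only conversion (not the real UST schema)."""
    pitch: int
    length: int
    lyric: str


def place_notes_only(notes):
    """Keep pitch, length, and lyric; silently drop everything else."""
    return [PlacedNote(n.pitch, n.length, n.lyric) for n in notes]


def dropped_fields():
    """Names of the source fields that have no counterpart after conversion."""
    kept = {f.name for f in fields(PlacedNote)}
    return [f.name for f in fields(SourceNote) if f.name not in kept]


source = [
    SourceNote(pitch=60, length=480, lyric="a", vibrato_depth=64, dynamics=100),
    SourceNote(pitch=62, length=240, lyric="ka", vibrato_depth=0, dynamics=80),
]
converted = place_notes_only(source)
print(converted)
print("lost:", dropped_fields())
```

In this toy model the melody and lyrics survive the conversion, but any expression data attached to the notes would have to be re-entered by hand afterwards.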

As for the engine itself, there is a level of uncertainty in how to grade the results of the software. The advantage of UTAU being simply an interface has resulted in a large range in quality of UTAU's results, with many engine plug-ins ("resamplers") being created, all with different results.

UTAU is not professional software, while VOCALOID is produced as a professional software package. For this reason it does not, overall, produce results of the same quality as VOCALOID. This gives the software an additional drawback. VOCALOID gives professional singers a much safer means of releasing their vocals: the singers get something out of each sale, and there is a definite structure governing use of the vocals with and without the singer's consent. In contrast, UTAU vocals may not offer any form of commercial distribution security, so there is less chance of a professional singer considering offering their vocals to the engine. As a result, it can at times be difficult to find a standard level of quality within the vocals offered.

UTAU was created for Japanese vocal synthesis and a large majority of the fanbase is in Japan, so finding quality non-Japanese vocals is often harder with more complex languages such as English. Additionally, UTAU does not officially support a way to handle final consonants, which are featured in many languages, such as English, Korean, and Chinese.

A large part of the vocals offered for the engine are of poor quality in comparison to the standard of vocals offered by VOCALOID or other synthesizers. Users creating vocals for the engine may not take full advantage of the tools UTAU offers, which seriously undermines the voicebank creation process. Basic bugs, glitches, and mispronunciations are common, especially among foreign voicebank creators, simply because there is no standard checking process for all vocals.

The standard of practice within the UTAU community varies widely as a result. Technical support may or may not be offered by a voicebank's creator, and most support is found within the UTAU community. Not all vocals are offered freely, and some have to be paid for in order to be used (however, this is very uncommon). Unfinished vocals may never be completed, or may be abandoned altogether. Some vocals are also not recorded with high quality microphones or configured properly, since quality is not certified for any voicebank except a handful such as Kasane Teto. This is also one of the reasons why users will at times fall back on the few reliable vocals, such as her or the Macne Nana 2S vocals, as these are considered the "safest" to work with, having had greater effort put into them. These do not necessarily represent the best of UTAU, but are simply reliable or supported.

UTAU has not seen development since 2013 and is currently subject to problems in terms of stability and OS updates. Also, by 2018 the gap between it and other vocal synthesizers such as VOCALOID5, CeVIO Creative Studio, and Synthesizer V had become notable. While UTAU is still a suitable and popular alternative to these more modern programs, it is now considered dated in comparison.


Under no circumstances is a VOCALOID vocal allowed to be exported into UTAU. However, there have been cases of this occurring.

In 2019, there was a case in which a user claimed a provider had granted them permission to export a VOCALOID into UTAU; however, the illegal nature of this was quickly pointed out. The provider of a VOCALOID cannot grant permission for their VOCALOID to be used within UTAU, as that permission must come from the developer of the VOCALOID voicebank and, most importantly, from YAMAHA Corporation.[4]


While UTAU never had the popularity of Vocaloid over the course of its lifetime, it remained a popular alternative long after it ceased development in 2013.

In 2014, a year after it ceased development, the tag for "UTAU" had received approximately 1/5 of the popularity of "Vocaloid" on Niconico overall.[5]

By 2016 the overall number was between 1/7 and 1/8 that of Vocaloid. However, at this point all vocal synthesizer search results on Niconico's website had fallen greatly, by up to half across the board, which means that interest in vocal synthesizer technology had fallen greatly on the website. UTAU's popularity, however, had taken more than a 60% hit compared to others such as CeVIO and Vocaloid, making it one of the vocal synthesizers hardest hit by the decline.[6]


  • UTAU offered the ability to create a voice, with the advantage of legally owning the voicebank and controlling how it was passed around. UTAU led to a decrease in fanmade VOCALOIDs because their creators did not have this advantage.
  • The best-known voicebank for UTAU is Kasane Teto. She is often recognized as the first UTAUloid, although she is actually the third.



External links