

This sub-root article is a guide that directs readers to information about various software.



AquesTone

AquesTone is a VSTi plugin developed by A-QUEST. It offers four voice options: Female F1, Auto F1, Male HK, and Auto HK. The UTAU character Uta Utane (a.k.a. Defoko) uses Female voice 1 as the source for the default UTAU voicebank.

The female voice is credited in Pop'n Music ("Chilt Featuring AquesTone") and other BEMANI rhythm games like Dance Dance Revolution.

External links

Examples of usage

  • "ちかてつ (Chikatetsu); Subway" by Calmwind01x ft. TILT (Aquatones)
  • "ミルキーポケット (Miruki Poketto); Milky Pocket" by Calmwind01x ft. TILT (Aquatones)
  • "Time's Intersection" by Calmwind01x ft. TILT (Aquatones)


Cadencii

Cadencii is a voice synthesizer program and frontend for several other voice synthesizers: VOCALOID, VOCALOID2, UTAU (or rather, UTAU resamplers), STRAIGHT with UTAU, WORLD, and AquesTone. It has its own engine (written by shuraba-P / HAL) named v.Connect-STAND. The source code used to be hosted on SourceForge.JP, but has moved to GitHub.

Cadencii's interface emulates the VOCALOID interface very closely. The piano roll can also change color depending on the synthesizer engine being used; for example, when VOCALOID2 is selected as the synthesizer, the piano roll becomes grey and green, and when UTAU is selected, it becomes blue and pink.


It is currently officially available for Windows and Mac OS X. The latest version available is v3.5.4 for Windows and v3.4.1 for Mac OS X. There is also an unofficial port in the Debian repositories at v3.3.9.[1]

On Mac OS X and other Unix-based operating systems, Cadencii requires Wine to be useful. For OS X there is also jCadencii, a Java version of Cadencii. The Java frontend is actually compiled from a large number of #ifdef JAVA blocks in the C# code.[2] Inside the OS X .app bundle, there are still minimized Wine bundles that support the DLLs containing native Windows code.

It is not yet possible to use Mono to build and run the C# program cross-platform, since it contains some native Windows code. Workarounds may be possible along the lines of Pipelight, a method of providing better Silverlight/Flash support on Unix(-like) systems using Wine.

Cadencii has its own file format, .xvsq (not to be confused with VOCALOID3's .VSQx). Cadencii can also import and export other synthesizers' project files, such as VOCALOID's VSQ and UTAU's UST file formats. Notably, it can export as MusicXML, making it a popular choice for creating files that work with Sinsy, especially in combination with its ability to import VOCALOID and UTAU project files.

Besides simply being able to import VSQ and UST files, Cadencii can also read the pitchbends (old pitchbend type/Mode1-only for UST), which can be used in combination with Cadencii's ability to use multiple synthesizers, as seen in the example video below (Tori no Uta).

  • License
    • Cadencii is free software.
    • The source code of Cadencii is copyrighted by kbinani.
    • Cadencii is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    • Cadencii is released under the GNU General Public License, version 3.0.

External links

Examples of usage

  • "CadenciiをインストールしてSinsyに歌わせるまでの録画" by nwp8861 (Mimirobo-P) ft. Sinsy; export tutorial
  • "Tori no Uta" by FastSpeedy ft. Nukupoid; UTAU voicebank


Cantor

Cantor (and its successor Cantor 2) is a synthesizer developed by VirSyn that uses formant synthesis. It was released at the same time as MIRIAM and was a rival to the original VOCALOID software. It sold for £199.99 including VAT, which was considered expensive for its time, although the price reflected its far greater selection of vocals.

Unlike VOCALOID, it had 50 voices, far more than VOCALOID offered at its release. Because of its design, however, it was intended to resemble a virtual instrument more than a virtual singer. It supported both German and English and ran on both Windows XP and Mac OS X, whereas VOCALOID was restricted to Windows XP. Like VOCALOID, it worked either standalone or as a plug-in and supported ReWire.


Like the early versions of VOCALOID, Cantor was a victim of changes in the way indie music was produced. By the time of Cantor 2, both it and VOCALOID were affected as demand for synthesized voices began to disappear.

The final version of the software, Cantor 2.1, was released on February 6, 2007. Though updates have since ceased, the software remains on sale and is confirmed compatible with Windows XP/Vista/7 and Mac OS X 10.5/10.6. A demo can still be downloaded from VirSyn's website, although an eLicensed Syncrosoft dongle is required for both the demo and the full version.[3] The dongle was included with the boxed version of the software, as well as with other software sold by VirSyn. It was not bundled with the downloaded versions but could be purchased separately.[4]

Cantor 2 later became available for purchase on the Crypton Future Media website in 2008, and a demo was released on Crypton's YouTube account.[5]

External links

Examples of usage

  • "Tears of an Angel" by Mike Oldfield ft. MIRIAM + CANTOR

Festival Speech Synthesis System


Festival Speech Synthesis System is a free (libre) / open-source speech synthesizer developed at the Centre for Speech Technology Research (CSTR) of the University of Edinburgh. To make Festival sing, a plugin called Flinger (Festival Singer) is required; it was developed at the OGI School of Science and Engineering and released in 2001. Free(b)soft has contributed Czech diphone voices and an accessible editor similar to the VOCALOID editor.

Flinger has been used in the West to create English demo songs.

Most voicebanks for Festival are available under a free license, whereas VOCALOID (and UTAU) voicebanks use proprietary licenses. However, the voicebanks that come with Flinger are still proprietary. The experimental LMMS plugin Singerbot uses Festival for singing.

External links


iVoxel

iVoxel is a singing vocoder with a vocal sequencer developed by VirSyn. It is an application for the iPhone and iPad and has many features beyond the sequencer.

Like other VirSyn products, it is based on the same ideas and concepts as iVOCALOID. However, like earlier software such as Cantor, it is not intended to produce a realistic singing voice.

External links


Realivox

Realivox is a vocal synthesizer by Realitone. There are two Realivox software packages, The Ladies and Blue. The voices run on Kontakt, either the full version or the free Kontakt Player, which comes with fewer features. Kontakt runs on Windows and Mac, so Realivox does as well.


The Ladies consists of five vocals:

  • Cheryl; A voice described as airy and pretty that is perfect for ethereal film cues.
  • Teresa; Described as a soprano opera diva.
  • Patty; A voice suited for pop and ethnic music.
  • Julie; For full range songs.
  • Toni; For smooth R'n'B songs.

The voices have 30 multi-sampled articulations, being Oo, Ah, Ee, Oh, Ey, Hmm, Mmm, La, Bah, Bee, Boh, Boo, Buh, Bop, Bow, Bah Fall, Dah, Dee, Doh, Doo, Duh, Boom, Bom, Hey, Ha, Ho, Hoo, Me, Shoo, and Yeah. On top of that the voices have true sampled legato. These voices can also be stacked to create choirs as well as solos.

The Ladies software comes with Kontakt Player in its installation.

Blue comes with a single voice built from 12,000 vocal samples. Each of the six vowels has thirty-two samples accompanying it. This is because, when different consonants are pronounced before or after a vowel, the mouth opens and closes differently depending on which vowel is being pronounced. This reduces the number of incorrect-sounding pronunciations and replicates sung English more accurately.

Like The Ladies, Blue has true sampled legato, and it also has polyphonic legato.

There is also an Ensemble Mode, which allows the user to switch between 3 voices that come with the package without the need to switch between tracks. It provides settings for volume, pan, tuning, timbre and offset for each activated voice.

Blue produces a crystal clear sound.[6]

External links

Examples of usage

The Ladies

  • "How We Created "Mmmquiring Minds" with Realivox" by Realitone ft. The Ladies
  • "Mmquiring Minds" by Realitone ft. The Ladies
  • "Walking Through a Dark Town" by Realitone; Frank Raschke ft. Cheryl


  • "Realivox Blue Walkthrough" by Realitone ft. Blue
  • "Blue Demos" by Realitone ft. Blue; Kontakt
  • "The Wonderful Blue" by Realitone; Man Parrish ft. Blue


RenoidPlayer

RenoidPlayer is an online synthesizer created by g200kg. It is compatible with various web browsers and also works on the iPad and iPhone as long as they run iOS 6. Note: Safari running on Mac OS X cannot export the final product. Use Chrome/Firefox when exporting.


It has a built-in sequencer, unlike Sinsy. The editor can be a little confusing to new users, while experienced users can use MML instead. RenoidPlayer is similar to AquesTone in that pitchbends and other flags and tunings are ignored; this applies to all voices. Currently, only 8 voicebanks are available, many of which are UTAU voicebanks whose creators gave the author permission to make RenoidPlayer-compatible voice libraries.

  • Data import: RenoidPlayer accepts sequence data by file drop. (Only note information and basic lyric information are read; pitch bends and other additional information are ignored.)
    • VOCALOID: VOCALOID Sequence files (.VSQ/.VSQx)
    • UTAU: UTAU Script files (.UST)
    • CeVIO: CeVIO Creative Studio files (.CCS)
    • MusicXML (.XML)
  • Parameters
    • Volume - Output volume control.
    • Transpose - Output pitch control, in semitone steps.
    • Portamento - Pitch change smoothness control.
    • FormantCorrection - Keeps the formant independent of the output pitch.
    • Formant - Formant control.
    • Humanize - Adds some fluctuation to pitch and dynamics.
    • VibratoDepth - Amount of vibrato. Note that vibrato starts only after the delay set by the 'VibratoDelay' parameter.
    • VibratoRate - Vibrato speed control.
    • VibratoDelay - Delay time to start vibrato.
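
The import behaviour listed above can be pictured with a short Python sketch. It is not RenoidPlayer's own code (RenoidPlayer runs in the browser); the names `IMPORT_FORMATS`, `detect_format` and `keep_basic_fields`, as well as the note field names, are hypothetical and only illustrate the idea of recognising a dropped file by its extension and keeping nothing but note and basic lyric data.

```python
from pathlib import Path

# Extensions from the import list above, mapped to their source software.
IMPORT_FORMATS = {
    ".vsq": "VOCALOID",
    ".vsqx": "VOCALOID",
    ".ust": "UTAU",
    ".ccs": "CeVIO",
    ".xml": "MusicXML",
}


def detect_format(filename):
    """Return the source format for a dropped file, or None if unsupported."""
    return IMPORT_FORMATS.get(Path(filename).suffix.lower())


def keep_basic_fields(note):
    """Keep only pitch, timing and lyric; pitch bends and other data are dropped."""
    kept = ("pitch", "start", "length", "lyric")  # hypothetical field names
    return {key: note[key] for key in kept if key in note}
```

For example, `detect_format("song.VSQx")` yields `"VOCALOID"`, and a note dictionary carrying a `pitch_bend` entry comes back from `keep_basic_fields` without it.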

External links


Sinsy

Sinsy (Singing Voice Synthesis System) (しぃんしぃ) is an online HMM-based singing voice synthesis system by the Nagoya Institute of Technology, released under the Modified BSD license.


The synthesizer is free to use, but will only generate tracks up to 5 minutes. The user uploads data in the MusicXML format, which the Sinsy website reads to output a WAV file of the generated voice. Gender factor, vibrato intensity, and pitch shift can be adjusted prior to output.[7]

MusicXML files can be made in Symphony Pro, Cadencii, MuseScore, and Finale NotePad.
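
To give a rough idea of what such a MusicXML file contains, the following Python sketch builds a one-measure score with lyrics using only the standard library. The helper name `build_score` and the note values are hypothetical, and whether Sinsy accepts a file this minimal has not been verified here; in practice the file would normally be exported from one of the editors listed above.

```python
import xml.etree.ElementTree as ET


def build_score(notes, divisions=1):
    """Build a minimal MusicXML score-partwise tree.

    notes: list of (step, octave, duration_in_divisions, lyric) tuples.
    All notes are placed in a single measure to keep the sketch short.
    """
    score = ET.Element("score-partwise", version="3.1")
    part_list = ET.SubElement(score, "part-list")
    score_part = ET.SubElement(part_list, "score-part", id="P1")
    ET.SubElement(score_part, "part-name").text = "Voice"

    part = ET.SubElement(score, "part", id="P1")
    measure = ET.SubElement(part, "measure", number="1")
    attributes = ET.SubElement(measure, "attributes")
    ET.SubElement(attributes, "divisions").text = str(divisions)

    for step, octave, duration, lyric in notes:
        note = ET.SubElement(measure, "note")
        pitch = ET.SubElement(note, "pitch")
        ET.SubElement(pitch, "step").text = step
        ET.SubElement(pitch, "octave").text = str(octave)
        ET.SubElement(note, "duration").text = str(duration)
        # Each note carries its own lyric syllable.
        lyric_el = ET.SubElement(note, "lyric")
        ET.SubElement(lyric_el, "syllabic").text = "single"
        ET.SubElement(lyric_el, "text").text = lyric
    return score


score = build_score([("C", 4, 1, "la"), ("D", 4, 1, "la"), ("E", 4, 2, "la")])
xml_text = ET.tostring(score, encoding="unicode")
```

The resulting `xml_text` string could be saved with an `.xml` extension; a real score would also declare things like key, time signature and tempo.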

Some users have praised Sinsy for its realism. This can be attributed to Sinsy's voice source being a TTS system (specifically HTS, also by the Nagoya Institute of Technology), an approach known for producing human-like results.

As of December 25, 2013, the official creators of Sinsy are Keiichi Tokuda (Producer and designer), Keiichiro Oura (Design and Development), Nakamura Kazuhiro (Development and Main Maintainer), and Yoshihiko Nankaku.

  • Voices: Sinsy has four known voices: Yoko, Xiang-Ling, Matsuo-P, and Namine Ritsu S. Sinsy supports Japanese and English, and a Chinese version of Xiang-Ling was released on Christmas 2015.
    • Yoko (謡子; f001j) is a Japanese-only voice.
    • Xiang-Ling (香鈴; f002j; f002e; f002m) is a Japanese, English, and Chinese (Mandarin) voice. The English voice was added on Christmas 2012. The Chinese voice was added on Christmas 2015.
    • Matsuo-P (松尾P; m003e_beta) is an English voice that was released to the public on December 25, 2013 along with the version 3.4 release of the Sinsy website. Unlike other banks on the Sinsy website that were created using female voices, Matsuo was voiced by a male voice actor. Its voice can be heard here singing "Who's Crying Now". Matsuo-P's voice actor is a YouTube and Niconico user who goes by the name Koya Matsuo.[8]
    • Namine Ritsu S (波音リツS; f004j_beta) is a Japanese voice. It can be heard here singing RIP=RELEASE. As of December 25, 2013, Namine Ritsu S became available for public use.

External links

Examples of usage

  • "CadenciiをインストールしてSinsyに歌わせるまでの録画" by nwp8861 (Mimirobo-P) ft. Tutorial
  • "MuseScoreで楽譜作成し、Sinsyに歌わせる手順(2010/09/02版)" by nwp8861 (Mimirobo-P) ft. Tutorial
  • "Golden Slumbers" by Koya Matsuo (Matsuo-P) ft. Tutorial

Symphonic Choirs


Symphonic Choirs is a choir synthesizer produced by EastWest/Quantum Leap, able to recreate the effect of an entire choir for any song. It is popular with indie musicians.

External links

Virtual Singer


Virtual Singer is a plug-in module for Melody Assistant or Harmony Assistant, made by Myriad and released in late 2000.


Virtual Singer was a relatively small-time package. It was relatively cheap at only $20; acquiring both it and the Assistant programs cost a total of $50. The software's forum still sees activity despite the software being dated, and users were still producing works with it in October 2011. A work produced in March 2018 also exists, although this seems to be an anomaly rather than a revival.

For its time, the results were comparable in clarity to other software, including Cantor, but a little more realistic than Cantor's since they were based on human results. It was capable of singing in the following languages: British English, American English, French (Northern and Southern), Finnish, German, Latin, Spanish, Italian, Japanese, and Occitan. Users of the software are writing further scripts that would allow it to support more than this.[9]

It generated a "human" voice from the score lyrics and came with the Real Singer II technology. As with UTAU, users could create a new voice from their own. In addition, several "free" voices could be downloaded from Myriad's website, some capable of multilingual results and others made for just one language. Updates were also free of charge; the latest version is 3.2.

External links


SING

SING (upcoming name SOHO) is software by Emvoice for OS X and Windows. It will allow music producers to create vocals without the need for a singer.

External links




Chipspeech

Chipspeech is a retro-styled vocal synthesizer developed by Plogue, created to reproduce vintage vocal synthesizers released in the 20th century. The software works standalone or as a plug-in for various DAWs. It can sing and talk and supports two languages: English and Japanese (though Japanese currently does not have talk capability). There are various means of adjusting the vocal to the user's liking, creating some very unique sounds and results.

The main strength of the engine is that it can have multiple synthesizer styles built into it. While some voices, like Dandy 704 or Lady Parsec, are sample-based recreations much like VOCALOID or UTAU, others, like Dee Klatt, use no samples and are fully synthetic. These are instead based on direct input, recreating their original engines with varying degrees of faithfulness, with Dee Klatt's being a fully rendered "live" feedback. Features have also been added since release: along with 5 new vocals, the Circuit Bending feature was added in version 1.032. It mimics the circuit bending method of getting unique vocals from classic old chips, allowing for 'one of a kind' results.

As they are based on old technology, all of the vocals sound dated and do not reflect modern-sounding ones. They barely sound human at best and make no attempt at realism. This can be off-putting for those inexperienced with synthesizers of the past and for those who want realistic-sounding vocals. The vocals themselves are otherwise difficult to find in their original chip forms, some being impossible to find because of how old their technology is.


Plogue Art et Technologie, Inc is a small company specializing in chip-based technology, aiming to recreate chip effects and sounds and apply them to modern technology. Chipspeech was one of many ideas they had planned for years, but they were held back by the means to execute it: though they had the technology, they lacked the knowledge. The Chipspeech software was born after they added a member with phonetic knowledge to their development team, allowing them to work out how to create a vocal synthesizer technology.

Unlike in VOCALOID, CeVIO or other synthesizers, words are typed as sentences. When used as a plug-in in a DAW, the synthesizer plays each sentence from the keyboard. The software is easy to use but requires some work to master.

The software currently has 12 characters, with an optional 13th, known as "Daisy", that could once be downloaded. Daisy could also be used in Alter/Ego and is the only character able to do so. Daisy is, however, currently retired and unavailable for download. The other 12 characters are based on various synthesizers, such as Rotten.ST, based on Atari ST’s STSPEECH.TOS, or Dandy 704, based upon the IBM 704 computer. With the exception of Dandy 704, all have a cyberpunk-style character illustration representing each voice, with Dandy 704's instead being steampunk. There is a basic "storyline" between the characters and a "canon", so to speak. However, this does not impact the software itself.

External links

Macne Series


Macne Series (Mac音シリーズ) is a series of voice banks designed for Reason and GarageBand, music sequencer software for the Macintosh operating system, developed by MI7 Japan Inc. and distributed by Act2. The idea of releasing a voicebank for Macintosh computers was conceptualized in the Japanese voice actress Haruna Ikezawa's regular column 天声姫語 Vox Reginae, Vox Dei ("voice of princess, voice of god," a spoof of Asahi Shimbun's editorial article 天声人語 Vox Populi, Vox Dei or voice of the people, voice of god) carried in the magazine Mac Fan by Mainichi Communications.

The voicebanks in their original form are considered difficult to use or noted for being time consuming to use as there is no quick input method like with VOCALOID and UTAU. Results are usually heavily robotic when used outside of UTAU and they have almost no capabilities to sound human-like.


At the time of release, VOCALOID and UTAU were both released for the PC, leaving Macintosh computers with no such software for music users in Japan. The Macne series would serve to fill that void. For Mac users, the Macne series is one of the best options for vocal music creation as at the time of their first release there were few alternatives. This made them popular with Japanese software users of Reason and GarageBand. The characters are used in Reason or GarageBand by selecting one of them as an instrument and altering pitches and lengths of sound files.

Since then, a number of vocals for the Macne series have been produced, and there are currently 6 characters in the series. Each new character is created to become a part of the "Macne family", and so some have relationships with each other, such as the twins "Macne Coco White" and "Macne Coco Black" and the Macnes' 'father' "Macne Papa". Some vocals have also been updated to new versions, and older versions have been retired from sale. They can also work normally in UTAU by converting their samples and providing a tuned oto.ini file, although the new files will not work with Reason or GarageBand. Users interested in UTAU conversions simply need to purchase the original versions; they are then allowed to transfer the samples into UTAU. This conversion is permitted because the vocals are sold under an open-source license, unlike VOCALOID. So long as the user uses them for their intended purpose (as sound samples), there is no issue with what software they are used in; UTAU is compatible with the license and terms of usage, as it still counts as using them as "sound samples".

Since the release of the Whisper☆Angel Sasayaki vocal, the Macnes have included already converted UTAU voicebanks. However, upon the release of Nana as a VOCALOID, all the past voicebanks for Reason, GarageBand and UTAU were retired. Some of the voice providers have moved on; the Coco twins' provider, Kikuko Inoue, voiced an entirely new VOCALOID, Haruno Sora. The series is therefore currently in limbo.

External links

Examples of usage

  • "Antenna (Logic + Rewire + Reason)" by moemaniadotorg ft. Macne Nana
  • "Mr. Wonderful (Garageband)" by ChaoFreak1 ft. Macne Nana
  • "Sousei No Aquarion" by Pomicyrus ft. Macne Nana; UTAU voicebank


NIAONiao Virtual Singer

NIAONiao Virtual Singer (袅袅虚拟歌手 Niǎoniǎo xūnǐ gēshǒu) is a Chinese voice synthesizer program developed by dsound.[10]

The default voicebank is named Yu Niaoniao (余袅袅); however, users can create their own voicebanks and take advantage of its larger file feature. NIAONiao can import MIDI files, VSQX files (VOCALOID3 only), and UST files, export tracks in the "Niao" file format (*.nn), and render vocal tracks directly as WAV, MP3, or MIDI files.

The principle is the same as UTAU's. Many Chinese fans have begun producing vocal banks for both programs. The voicebank format for NIAONiao is radically different from UTAU's, the main difference being that the voice samples are packed in one large file. Because it was made for a Chinese audience, NIAONiao can have final consonants in a voice, also unlike UTAU. NIAONiao is not exclusive to singing in Chinese, just as UTAU is not restricted to Japanese. For example, a NIAONiao voicebank for Nagone Mako can be downloaded from the official NIAONiao website.

The interface is much closer to VOCALOID's and (unlike UTAU) there is a panel at the bottom for controlling parameters, pitchbends, and vibrato.

External links


SugarCape

SugarCape is a vocal synthesizer developed by sota, available only for Mac OS X Snow Leopard 10.6.8 or above. The newest version, now called SaltCase Alpha 0.0.2, uses a tripitch voicebank, and it is assumed that it has been given a sort of "", similar to that of an UTAU voicebank. This function allows transitions between pitches to be more natural, rather than giving in to the harsh distortion of the sample as it goes deeper.

In addition to the preset voice in SugarCape, one can add a voice to be compatible with the program. This has been done with Nagone Mako, and several other UTAU voicebanks. Some have thought of importing the Macne Series into SugarCape.

External links

Examples of usage

  • "sugarcape" by talc; なんてこったい on NND ft. N/A
  • "てくてく~地べたのスカイウォーカー~" by 呑気大王 ft. SugarCapePro
  • "DESTINY" by ねこ伯爵P

Synthesizer V


Synthesizer V (also known as SynthV) is a vocal synthesizer created and developed by Dreamtonics Co., Ltd. It is currently available on Windows, Mac, and Linux systems. The software is available in English, Japanese, Chinese and Korean. The engine was fully released on December 28, 2018.


It was first previewed in 2017. According to the developers, the project is a product of 7 years of work and is the fifth revision.

It was made available for download in 2018 with the first vocal "Eleanor Forte", downloadable for free. Chinese and Japanese vocals are in production. More languages and dialects are planned.

On December 28, 2018, the engine was fully released with the option to register three vocals for free, and it also became possible to purchase a permanent license for the engine.

In December 2019, a web version of Synthesizer V was released, for those who cannot or don't wish to use the program version.

  • Vocals:
    • Eleanor Forte (エレノア フォルテ; formerly known as ENG-F1) was the first American English vocal and the first vocal overall to be released for Synthesizer V. Her first name, "Eleanor", has the meaning of light or bright with a sense of nobility, and her last name, "Forte", has the meaning of strength and references the musical dynamic, forte, which literally translates to loud.
    • Yamine Renri (闇音レンリ) is a Japanese female vocal released previously for UTAU. She has since been released for Synthesizer V. Her download can be obtained from her own Japanese website and the Synthesizer V website downloads.[11]
    • Genbu (ゲンブ) is a Japanese male vocal and the first male vocal released for the program. He may seem a bit straightforward, to the point of being called rude, however he cares about the people around him. His voice is supposed to represent his personality: calm, soothing, and somewhat weak.
    • AiKO (艾可) is a female Chinese voice and the first Chinese voice released for the program. AiKO is an enthusiastic girl. She tends to be careless but won't let tough times get her down. She is always happy to make progress, no matter how small. She is a hard worker and likes to wear her work clothes.
    • Chiyu (赤羽) is a female Chinese voice and the first Synthesizer V vocal released from Beijing Photek S&T Development Co., Ltd. She is part of the Medium⁵ series. Chiyu is Xingchen's older sister and is 17 years old. She is based on the element of fire and her representative shape is the tetrahedron.
    • Shian (诗岸) is a female Chinese voice and the second vocal from the Medium⁵ series. She is the youngest of the sisters and is 14 years old. She is based on the element of earth and her representative shape is the cube.
    • Cangqiong (苍穹) is a female Chinese voice and the third vocal from the Medium⁵ series. She made her debut in early 2019 using a different voice synthesizer before officially becoming a Synthesizer V vocal. Cangqiong is the eldest of the sisters and is 18 years old. She is based on the element of air and her representative shape is the octahedron. Her birthday is May 20.
    • MAN-M1 - Chinese (Mandarin) male vocal.
    • MAN-F1 - Chinese (Mandarin) female vocal.
    • JA-F1 - Japanese female vocal.

External links

Examples of usage


UTAU

Vocal Synthesis Tool UTAU (歌声合成ツール UTAU) is a voice synthesizer program developed by Ameya/Ayame, currently available for Windows and Mac OS X systems (the Mac version being named UTAU-Synth). UTAU is a shareware vocal synthesizer program that allows users to create and distribute their own voicebanks, and it is viewed as a well-supported alternative to more expensive software with the same abilities.

UTAU has the advantage of a faster pace of development. It has plug-in support, and users have made a number of plug-ins that greatly improve the software's handling and experience. This support was established fairly early in the software's existence, whereas VOCALOID did not gain this ability until VOCALOID3 in late 2011, and even now it offers only limited access to source code and plug-in support. The plug-ins for UTAU can therefore prove invaluable to users, as they can greatly affect the software's results and quality.

Some UTAU voicebanks have been presented as "real" VOCALOIDs, as in the April Fool's joke origins of Kasane Teto. Songs using both UTAU and VOCALOID are also not unheard of. Some users have also begun to enforce their copyright over their voicebanks; UTAU or fanmade VOCALOIDs that plagiarize an UTAU's name or use a voicebank without permission risk violating UTAU software agreements and voicebank copyright ownership.


UTAU, meaning "to sing" in Japanese, has its origin in "Jinriki VOCALOID" (人力ボーカロイド, "Manual VOCALOID"), the act of re-editing an existing singing voice, extracting tones as WAV files, and reassembling them. In December 2007, Ameya/Ayame (飴屋/菖蒲), using LOLI.COM's voice samples, released a beta software called Loliedit featuring a simple voicebank called "Loline Com" (a pun on the original voice provider and the "ne" particle originally used on Crypton's products). It features a simple interface with a piano roll, has limited mora (Japanese syllables) and works with a primitive beta engine (or "resampler") that was later updated in UTAU. This beta software can still be downloaded from Ameya/Ayame's website. In March 2008, Ameya/Ayame released a free, advanced support tool to aid the "Manual VOCALOID" process, called UTAU. Later, in 2010, user feedback and suggestions, among other ideas, led to the creation of triphone ("VCV"; vowel-consonant-vowel) voicebanks; VOCALOID did not gain this capability until 2011, when VOCALOID3 was released.

The program comes with a default voicebank of 142 samples of Japanese syllables generated from the default voice of A-QUEST's text-to-speech software AquesTalk. Any user can load their own voicebank into UTAU to use. However, without the explicit permission of the voice donor, it is a violation of copyright laws. Those laws protect the rights of any vocalist who may not wish for their voices to be used within the program, such as celebrities. Any music made through this program can be used in the commercial sector. UTAU can be downloaded for free from the home page. It will not run properly on computers which do not support Japanese text or AppLocale.

UTAU is one of the few programs able to convert VOCALOID data files for its own use. It saves data in the .UST (UTAU Sequence Text) format and is capable of converting .VSQ files to .UST. However, .UST files themselves do not hold as much data as the VOCALOID engines' VSQ or VSQX files, and UTAU does not attempt to convert most data even into a rough equivalent, placing only the notes. As a result, loss of data may occur. It currently does not support the VOCALOID5 format, VPR.
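
The notes-only nature of such a conversion can be sketched in Python. This is an illustration rather than UTAU's own converter: the function name `notes_to_ust` and the input dictionary keys are hypothetical, and the section layout written here (`[#VERSION]`, `[#SETTING]`, numbered note sections, `[#TRACKEND]`) follows the plain-text layout commonly seen in UST files but should be treated as an assumption.

```python
def notes_to_ust(notes, tempo=120.0):
    """Serialize notes to UST-style text, keeping only length, lyric and pitch.

    notes: list of dicts. Any key other than 'length', 'lyric' and
    'note_num' (for example dynamics or pitch-bend curves) is dropped,
    which mirrors the data loss described above.
    """
    lines = ["[#VERSION]", "UST Version1.2", "[#SETTING]", f"Tempo={tempo:.2f}"]
    for index, note in enumerate(notes):
        lines.append(f"[#{index:04d}]")           # one numbered section per note
        lines.append(f"Length={note['length']}")  # length in ticks
        lines.append(f"Lyric={note['lyric']}")
        lines.append(f"NoteNum={note['note_num']}")  # MIDI-style note number
    lines.append("[#TRACKEND]")
    return "\n".join(lines) + "\n"


ust_text = notes_to_ust([
    {"length": 480, "lyric": "あ", "note_num": 60, "dynamics": 64},
    {"length": 240, "lyric": "い", "note_num": 62, "dynamics": 80},
])
```

In this example the `dynamics` values never reach `ust_text`, so a round trip back to the original project could not restore them.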

External links


CeVIO Multimedia Studio


CeVIO Creative Studio (pronounced che-ˈvē-ˈo) is a commercial vocal synthesizer product released on September 26, 2013. CeVIO Creative Studio received two awards in 2013, the "Microsoft® Innovation Award 2013" and the "CEDEC Award". Its demo version, CeVIO Creative Studio FREE, is available in trial form. Previously, users could not edit parameters in this version, but they now can, with few restrictions.

CeVIO has two capabilities, speaking and singing, both of which need to be provided by the voice's developers.

The speaking portion offers a large dictionary of words, which the vocalists can pronounce in a variety of ways and emotions. There are usually 3 different types of voices that can be cross-synthesized or isolated to portray a single emotion. If the vocalists misinterpret kanji, the phonemes can be edited. Velocity, Length, Tone, Accent, and Pitch can be edited in this mode.

The singing portion offers Amplitude Timing, Pitchbends, Volume, and Vibrato rate and depth. Gender can also be edited on the side bar of the piano roll. A recent development of the engine also introduced phoneme input, which was previously unavailable. The ability to add and edit phonemes manually allows some "Engrish" words to be made and allows smoother pronunciation of borrowed words. In addition to phoneme editing, up to 5 hiragana/katakana characters can be added onto a single space. This is a unique feature of CeVIO.
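
As a rough picture of how vibrato rate and depth interact, the sketch below models vibrato as a sine-shaped pitch offset whose rate and depth each follow a straight line from a start value to an end value. This is only an illustrative model: the function name `vibrato_offsets`, the units (Hz for rate, cents for depth) and the sampling step are assumptions made here, not CeVIO's actual implementation or scale.

```python
import math


def vibrato_offsets(duration, rate_start, rate_end, depth_start, depth_end, step=0.005):
    """Return pitch offsets (in cents) sampled every `step` seconds.

    Rate (Hz) and depth (cents) are interpolated linearly from their start
    to their end values, like a straight diagonal line in a parameter lane.
    """
    count = int(round(duration / step))
    offsets = []
    phase = 0.0
    for i in range(count):
        t = i / count  # position along the note, from 0.0 up to (not including) 1.0
        rate = rate_start + (rate_end - rate_start) * t
        depth = depth_start + (depth_end - depth_start) * t
        offsets.append(depth * math.sin(phase))
        # Accumulate phase from the current rate so the cycle speed changes smoothly.
        phase += 2.0 * math.pi * rate * step
    return offsets
```

Passing a rising rate, for example from 4 Hz to 7 Hz, gives a vibrato that accelerates over the note, while a rising depth makes it grow more intense; a depth of zero throughout yields no pitch movement at all.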

  • Parameters:
    • Amplitude Timing allows for phoneme editing. Sounds often are split into 6 segments in this section, and the beginning of the note is highlighted with a pink line. Dragging the last purple line back to the pink line of the next note will often help with vowel transitions. If there are lines between notes covering a blank space, this characterizes a breath or static sound. It cannot be deleted.
    • Pitch allows for editing of Pitchbends and addition of Portamento, which most vocals can produce on their own. Many advise using the Line Tool to draw pitchbends, as the Pencil Tool is very sensitive.
    • Volume can be used for dynamics, however it uses a very intense scale and minor adjustments to loudness can cause peaking/clipping, and lower volumes can cause an influx of static. Adjusting the volume of the track may be preferable.
    • Vibrato rate controls how fast vibrato cycles will happen. Within the program, the user isn't supposed to draw the vibrato, but rather a diagonal line/curve upwards to indicate the vibrato is accelerating, or a diagonal line/curve downwards to indicate the vibrato is decelerating.
    • Vibrato depth controls how deep the pitch of the vibrato cycles will be. In other words, the intensity of the vibrato. This is edited the same way as Vibrato rate, with the diagonal lines or curves.
  • Voices:
    • Sato Sasara is a speaking and singing CeVIO product. She is 16 years old.
    • Suzuki Tsudumi is a speaking CeVIO product. She is 17 years old and friends with Sasara.
    • Takahashi is a speaking CeVIO product. He is 20 years old and friends with Sasara.
    • ONE is a speaking and singing CeVIO product. She is the second installment of the "- ARIA ON THE PLANETES -" project.
    • IA is a speaking and singing CeVIO product. She was the first installment of the "- ARIA ON THE PLANETES -" project and a VOCALOID3 character.
    • Akasaki Minato is a singing CeVIO product and part of the Color Voice Series. He is 25 years old.
    • Midorizaki Kasumi is a singing CeVIO product and part of the Color Voice Series. She is 27 years old.
    • Ginsaki Yamato is a singing CeVIO product and part of the Color Voice Series. He is 50 years old.
    • Kinzaki Koharu is a singing CeVIO product and part of the Color Voice Series. She is 52 years old.
    • Shirosaki Yuudai is a singing CeVIO product and part of the Color Voice Series. He is 20 years old.
    • Kizaki Airi is a singing CeVIO product and part of the Color Voice Series. She is 18 years old.
    • HAL-O-ROID is a singing CeVIO product and based on the deceased Enka singer. He is 37.3 years old.

External links

CeVIO Creative Studio

Gynoid Talk


Megpoid Talk




Talk Ex

