Vocaloid Wiki


This article is about the VOCALOID3 software known as a voicebank, not the VOCALOID character.


It is unknown when production switched from VOCALOID2 to VOCALOID3; however, it is known that even Crypton Future Media could not predict which of the two engines the update would be for. Use of the VOCALOID2 version can be found in Project Mirai.


Following a resurgence of interest in KAITO and MEIKO, Crypton Future Media began pre-production of potential updates to their respective voicebanks. This was teased in September 2010, but it was not stated whether these updates would be for VOCALOID2 or VOCALOID3.[1]

In December, MEIKO and KAITO were confirmed for updates, but it was unknown whether they would be released for VOCALOID2 or VOCALOID3, as they were still being planned.[2] The two were not being developed as a pair, however.[3]

Test recordings of Naoto Fuuga's voice started on December 10, 2010.[4][5] Several voicebanks consisting of different expressions, like the Appends, were being produced.[6] Wat confirmed they were still discussing the marketing of the KAITO product, and asked fans for patience.[7] The project timeline was also confirmed in December to be MEIKO → Megurine Luka → KAITO.[8]

A "Power" append that came from "deep within the stomach" was among the expressions being tested.[9] On December 17, 2010, 4 Append vocals were specifically mentioned.[10][11] These appends were the following:

  • "Mellow"[12]
  • "Serious"[13]
  • "Sweet"[14] - this was abandoned for the "Solid" Append.
  • "Solid"[15]

Wat said that he would be asking female producers for their perspectives on KAITO.[16]

When the update was being discussed, it was noted that the differences between Miku's high voice and KAITO's low voice were ridiculously large. This was reflected in the engine itself, leaving Wat to wonder if they were still compatible.[17] There was also the matter of how to handle the male vocal, raising the question of whether the "voice of the character" could be maintained with a voice such as KAITO's.[18] Because KAITO's vocal is an "all-around" vocal, Wat felt that getting the "expression of color" previously seen in the Appends out of him was especially challenging, describing the vocal as already being beautiful enough.[19]

On December 25, Wat mentioned that he had been charged with the sharing of information on KAITO through Twitter.[20] One of the biggest concerns at the time of development was that they felt the KAITO vocals they were working on were failing to meet quality expectations of consumers; more work was deemed necessary.[21]

V3 KAITO: 2011

In January, Crypton Future Media discussed the in-development silhouette for KAITO, noting that his muffler had now become transparent. On February 17, 2011, Crypton released a temporary silhouette of KAITO's Append. Due to backlash from fans, the transparent muffler was being reconsidered.[22][23][24]

In March, Wat mentioned that work on KAITO's vocal had come to a halt due to an unspecified problem. The effect this would have on scheduling and the release date was unknown.[25]

Work on KAITO's voice resumed in April, with various project timelines being adjusted as KAITO was brought into focus.[26][27] In addition to a soft vocal, Wat also expressed the desire to produce a "down-towned" vocal.[28] On April 11, KAITO's vocal was in the α stage and had received negative feedback due to his treble range.[29] On April 13, 2011, it was confirmed that 6 vocal expressions had been recorded; two were dropped, while the remaining four continued development. One of the mentioned expressions had already completed the alpha stage. The Character Vocal (Hatsune Miku, Kagamine Rin/Len and Megurine Luka) series Appends had been created from vocal performances by their voice providers, but the new KAITO Appends were being created by adding echo, force and tension to the samples.[30]

Crypton Future Media noted that several of the voicebanks had been recorded with new high-tech microphones, through a series of trial and error, instead of the standard U87ai.[31] These voicebanks were the following:

  • "Vivid"
  • "Solid"
  • "Power" [32]

The vocals were intended as experiments utilizing the new equipment with Naoto Fuuga's voice.[33]

On May 12, 2011, Crypton met with Yamaha to discuss the current development of KAITO, whose demos were pending for release. Wat apologized for the slow development, and mentioned enhancements to KAITO's Whisper Append were being made.[34]

On May 28, demos of KAITO's Append voicebanks "Whisper" and "Power" were posted to Nico Nico Douga, accompanied by a demonstration of Hatsune Miku's English voicebank.

A demo using the alpha version of KAITO's "Soft" Append was published on June 3, 2011, singing Karakuridokei to Koi no Hanashi.[35]

KAITO's "Soft" Append was considered by Crypton to be KAITO's most natural vocal, being described as requiring little editing to sound good.[36] At this point in time, "Soft" had left α (Alpha) and entered β (Beta).[37] Wat confirmed that the release date would remain unknown for some time, but that development updates would continue to be shared as soon as possible.[38] "Whisper" and other data were still being adjusted in a hurry.[39] It was also noted that lowering the gender factor on KAITO's "Soft" Append made it sound like a female vocalist. Although it was cute, Crypton considered it a mistake to be corrected, which resulted in further changes to the vocal to prevent this outcome.[40]

By June 11, Wat tweeted they were in the middle of recording KAITO and working extra hard in comparison to the past to get the vocals completed.[41] He further mentioned they were prioritizing completion of the main KAITO voice to give them extra time to work on "Whisper".[42] It was then mentioned later that by the end of the month, "Whisper" and "Soft" would be at the same stages of development.[43] The data for "Whisperβ" had been finalized by the end of June.[44]

At the beginning of July, it was stated that KAITO's "Whisper" vocal would receive a demo soon.[45] The process for KAITO's English voicebank was also described as "annoying".[46] In relation to the introduction of Hatsune Miku in America, Wat noted that there was a high proportion of female VOCALOID fans in America, in contrast to Japan, whose demographic was primarily male. Because of this, KAITO was to be tested with this audience to gauge interest and demand.[47] After an interval of suspension, Crypton managed to finish recording KAITO's vocals by the end of June.[48] From that point on, discussions of image and vocal outputs were held with Naoto Fuuga.[49] A classic, VOCALOID-style KAITO recording, matching the vocal tone of the original voicebank that users were familiar with, was also completed.[50]

The first commercial usage of KAITO's Append was produced by yanagi and kaoling for the album "VOCALOID Minzokucho Kyokushu." In this album, he sings the song "Sennen no Dokusou Ka (kaoling mix)".[51] Later, beta versions of KAITO's "Normal", "Soft", and "Whisper" voicebanks would appear on the VOCALO APPEND album, singing "Lost Destination" with Kagamine Len Append.[52]

In October, Crypton updated progress on the project via Twitter, stating that English KAITO would take some time to produce.[53] Not long after, it was announced via Piapro[54] and the Hatsune Miku Facebook that there would be a surprise regarding KAITO being revealed at the 2011 New York Anime Festival. On October 16, the panel played a demo of KAITO's English voicebank. Footage of the presentation was captured and uploaded onto YouTube:


KAITO singing "Top of the World" by the Carpenters (Straight) YouTube

It was specified after this demo that KAITO's English voicebank was in its alpha stage and still needed further work.[55] This work continued into December, when the lower tones of the voicebank were described as needing to be fixed.[56][57]

On December 1, it was mentioned that all previous demos had been rendered with the VOCALOID2 engine, and the project was being redone for the VOCALOID3 engine.[58]

Further exclusive information on KAITO's update was awarded to the winner of a contest at the time based around Kagamine Len Append.[59]

V3 KAITO: 2012

In February 2012, Wat discussed the differences between the CV series (Hatsune Miku, Kagamine Rin/Len and Megurine Luka) and KAITO/MEIKO, stating that the KAITO/MEIKO Appends were being adjusted using editing techniques to achieve the results, whereas the CV series Appends had been done by voice acting.[60] KAITO's English voicebank was reported to have received adjustments to some of its diphone data in the same month.[61]

By this time, Crypton Future Media had begun looking for Japanese producers who could use English-capable VOCALOIDs.[62] During development, KAITO's quality was being compared to overseas male English VOCALOIDs.[63] On February 23, it was noted that progress on both the Japanese and English vocals could be shared moving forward.[64] Through exchanging VSQx files with producers, further comparisons with native English VOCALOIDs were made for testing.[65] Several overseas producers were now taking part in KAITO's production.[66] KAITO's English vocal had received significant feedback on vowel-vowel transitions, as well as diphthong samples, which resulted in further adjustment.[67]

As KAITO's clothes were a point of contention, it was confirmed in March that they would not receive significant changes, only some modernization.[68] A 2.5D style drawing was intended for the physical package's illustration. In early March 2012, Kenmochi Hideki was given the updated KAITO package (including English) for feedback and further adjustments.[69] Crypton noted that the only work remaining was the adjustment of KAITO's vowels.[70]

In April, the use of triphonic data in "Whisper" was described as "wonderful".[71] The previous KAITO "Normal" vocal was confirmed to be a final voicebank in the updated package.[72] A history of VOCALOID KAITO would be included in the update's manual.[73] Further adjustments to KAITO's voice were being made up until May 2012.[74]

In October, checks on KAITO's vocal were being performed once again.[75] Both KAITO and Miku's vocals were adjusted at this point in time.[76] While the previous KAITO "Normal" vocal had been confirmed for the updated package, the naming scheme was still subject to change: names such as "Neutral", "Natural" and "Default" were all in consideration.[77]

Demos using Miku and KAITO's English voicebanks were shown at NYCCon 2012, both being in the beta stage. KAITO's "Whisper" and default vocal were also shown.[78] The expected release for KAITO was to be "by the end of this year or the beginning of the next".[79]

More information regarding the KAITO update was planned for December. Due to the desire to include "Piapro Studio" with the new KAITO package, a slight delay occurred.

V3 KAITO: 2013

In early January 2013, KAITO's voicebanks received their final adjustments and were set to be released in mid-February as scheduled.[80] DTM Magazine ran a series of tutorials on using KAITO V3 for beginners.[81] KAITO V3 was bundled with other products in an effort to give producers everything Crypton felt they would need.[82] KAITO V3 would consist of 4 vocal libraries, in an effort to minimize any potential troubles.[83] Users who had purchased VOCALOID KAITO would receive a special discount equal to the difference between VOCALOID KAITO's price and the price of the CV Series.[84]

KAITO V3 was released on February 15, 2013.[85]

Mac Update

With the release of the VOCALOID NEO engine, KAITO V3 would be updated for Mac compatibility. Those who own the KAITO V3 package for Windows would receive this update free of charge. KAITO V3's Mac version was released on February 15, 2014.[86]

Future Plans

In 2014, Wat mentioned playing with the wavelengths of experimental vocals he hoped to one day release. These 3 vocals he mentioned were KAITO "Light", Miku "Falsetto", and MEIKO "Hard".[87]

KAITO V3: 2024

On July 1, Crypton announced that their virtual singer software product line would be repackaged as part of the celebration of the 20th anniversary of the company's singing synthesis software business. The DAW in the KAITO V3 package was changed to Cubase LE.[88]

Product Information



黄金木の葉が舞う頃に (Ougon Konoha Ga Mau Koro Ni) (Straight) NicoNico YouTube Crypton
からくり時計と恋の話 (Karakuridokei to Koi no Hanashi) (Soft) NicoNico YouTube Crypton
からくり時計と恋の話 (Karakuridokei to Koi no Hanashi) (Whisper) NicoNico YouTube Crypton
CiRCuS MoNSTeR (English) NicoNico YouTube Crypton
月雪花 (Tsuki-Yuki-Hana) (Straight) NicoNico YouTube Crypton
夢懸歌 (Yumekake Uta) (Whisper, Straight) NicoNico YouTube Crypton
水の道化師 (Mizu no Doukeshi) (Soft) YouTube Crypton
Rose + Thorn (English) NicoNico YouTube Crypton
玄冬桜 (Gentouzakura) (Whisper) YouTube Crypton
悪徳のジャッジメント (Akutoku no Judgment) (Straight) YouTube Crypton
Sweet's Beast (Soft) YouTube Crypton
crystal mic (English) YouTube Crypton
逆罪行進曲 (Gyakuzai Koushinkyoku) (Whisper) YouTube Crypton
時計塔のうた (Tokeitou no Uta) (Straight) YouTube Crypton
カンタレラ (Cantarella) (Short ver.) (Soft feat. Hatsune Miku) YouTube Crypton
YES YES (English) YouTube Crypton
海渡る風の唄 (Umi Wataru Kaze no Uta) (Straight, Whisper, Soft) YouTube Crypton
戦唄 (Ikusa Uta) (Straight) YouTube Crypton
島唄 (Shima Uta) (Straight, English) NicoNico YouTube Crypton
万感吟遊 (Bankan Ginyuu) (Straight) YouTube Crypton
恋するアプリ (Koisuru Apuri) (Soft) YouTube Crypton
Pane dhiria (Straight) YouTube Crypton
Anonymous (English) NicoNico YouTube Crypton
モーニングコール (Morning Call) (Whisper) YouTube Crypton
Chillyditty Of February (Straight, Soft) YouTube Crypton
時忘人 (Tokiwasurebito) (Straight) YouTube Crypton
Mirror Rule (Soft) YouTube Crypton


System Requirements

  • OS: Windows XP (32bit) / Vista (32bit) / 7 (32, 64bit) ※ Windows 8 (64bit) in 32bit compatibility mode (WOW64)
  • CPU: Intel Core 2 Duo 1.8 GHz or more recommended
  • RAM: 2GB or more recommended
  • Storage: free space of 25GB or more
  • Other: DVD-ROM drive / sound device / video card with a driver supporting OpenGL 3.0 or more / display of 1280x768px or more / Internet connection environment (when activating)



Product Information
  Trial/Demo Vers?: No  Starter Available?: No
Package details as noted:

This is the package designed to replace the VOCALOID KAITO vocal package. You do not need the previous VOCALOID version to use this pack, unlike the Append packages for the CV series. This update was to satisfy the needs and demands of VOCALOID fans.[89] Its counterpart is the MEIKO V3 product.

This package comes with Piapro Studio.

The software comes with the Piapro Studio Edition of PreSonus's DAW, "StudioOne ARTIST"[90], which includes various vocal effects and other tools. This also includes a resource pack with 200 virtual instruments. The complete package of Piapro Studio and StudioOne ARTIST means that by purchasing this release, musicians can begin producing straight away with no need to purchase any further software, and that all bundled software works together without any problems. The Piapro Studio and StudioOne ARTIST bundle is included in both non-English and English bundled versions of the vocal.

Vocal traits as noted:
  • This product has been repeatedly confirmed to serve a different purpose from the Character Vocal series' Appends. The voicebanks focus on natural-sounding tones rather than expressive tones.
  • The vocals are completely new recordings, using none of the past VOCALOID data.[91][92]
Software issues as noted:
  • This was originally intended to be a VOCALOID2 release.
  • Despite the inclusion of English, there is no technical support for the English Vocal offered. The English vocal does not have access to any additional tones.
Cross-Synthesis as noted:

When imported into VOCALOID4, the cross-synthesis (XSY) function can be used on all 3 Japanese KAITO voicebanks.

  • In total the package offers the equivalent of 6 additional voicebanks achievable via XSY. The total tone variation offered by the package comes to a theoretical 9 voicebanks in total.
  • Note that these vocals were not designed to be used for XSY originally as it was not part of the VOCALOID3 engine functions. Results may be rougher than VOCALOID4 XSY vocals for that reason.

  • From Ver.4.3.0 of the VOCALOID engine onwards, an XSY group "Kaito V3" was added to VOCALOID. All vocals within the "Kaito V3" group can XSY with each other. This vocal release is part of this group. If a user owns one or more vocal releases within the "Kaito V3" group, XSY between them will open up.[93][94]
    • Currently, only one release, KAITO V3, is found within this group, making it a fairly limited XSY group.
    • The group has 3 unique voicebanks; "Straight", "Whisper" and "Soft"
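
The figures quoted above (6 additional voicebanks, 9 in total) are not derived in the sources. One reading that reproduces them, offered here only as an assumption, is that each ordered pairing of two different Japanese voicebanks is counted as its own XSY blend. A minimal Python sketch of that counting, with names chosen purely for illustration:

```python
from itertools import permutations

# The three Japanese voicebanks in the "Kaito V3" XSY group.
JAPANESE_VOICEBANKS = ["Straight", "Whisper", "Soft"]


def xsy_blends(voicebanks):
    """Return every ordered pair of two different voicebanks.

    Assumption: (A, B) and (B, A) are counted as separate blends.
    This is only one way to arrive at the quoted figure of 6.
    """
    return list(permutations(voicebanks, 2))


blends = xsy_blends(JAPANESE_VOICEBANKS)
additional = len(blends)                         # blends reachable via XSY
total = len(JAPANESE_VOICEBANKS) + additional    # originals plus blends

for primary, secondary in blends:
    print(f"XSY ({primary} x {secondary})")
print(f"additional: {additional}, total: {total}")
```

If unordered pairs were counted instead, the same three voicebanks would give only 3 blends, so the quoted figures appear to rely on the ordered reading.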

  Individual DB

    Product Information
      Genre: Folk, Pop, Rock  Optimum Range: B1 ~ C3  Optimum Tempo: 90 ~ 200 BPM
      Total Tempo (min-max): 110 BPM  No. of Keys: W ~ 9, B ~ 5, Total ~ 14
    Package details as noted:

    DB 1 STRAIGHT is designed to hold the same tone as the original VOCALOID vocal.

    Vocal traits as noted:
    • The quality is much higher than the previous VOCALOID vocal, matching the VOCALOID3 standards.
    • The results are much cleaner than the original VOCALOID vocal, but maintain its ability to act as an all-round voice.
    • Out of all 4 vocals, Straight can best handle the fastest tempo.
    • The Straight vocal's optimum vocal range is relatively small in comparison to some of the other VOCALOID3 vocals.
    • At 90 BPM, Straight has the highest minimum optimum tempo of the 4 vocals and is less adept at slower songs than the others.
    Software issues as noted:
    • While the intent was to match the VOCALOID KAITO vocal's tone, due to using new recordings, it does not produce the exact same results.
    Voicebank sample

    KAITO V3 Straight

    Cross-Synthesis as noted:

    KAITO's main vocal is his "Straight" vocal and it is his base vocal for XSY. By using "Soft", extra expression can be added to the vocal, while "Whisper" is for extra tone control to make KAITO sound more sorrowful. Because of the vocal's design, KAITO's "Straight" vocal is not a bridging vocal, as with many main vocals.

    The paths of expression for "Straight" are therefore:

    • "Straight" ⇄ XSY ("Straight" x "Soft")
    • "Straight" ⇄ XSY ("Straight" x "Soft") ⇄ "Soft"
    • "Straight" ⇄ XSY ("Straight" x "Whisper")
    • "Straight" ⇄ XSY ("Straight" x "Whisper") ⇄ "Whisper"

    Product Information
      Genre: Soft Rock, Folk, Ambient  Optimum Range: D2 ~ B2  Optimum Tempo: 80 ~ 180 BPM
      Total Tempo (min-max): 100 BPM  No. of Keys: W ~ 6, B ~ 4, Total ~ 10
    Package details as noted:

    DB 2 SOFT voice is designed to give a softer tone to lyrics.

    Vocal traits as noted:
    • This vocal was considered KAITO's most natural sounding during development according to Crypton Future Media themselves.
    • It has a smaller vocal range than either "Straight" or "English".
    Software issues as noted:
    • Crypton Future Media reported during development that this vocal loses masculinity quite easily when adjusting the GEN factor, making the vocal sound more like a female vocalist than a male.
    Voicebank sample

    KAITO V3 Soft

    Cross-Synthesis as noted:

    "Soft" is a voicebank that in XSY will loosen KAITO's main vocal pronunciation, allowing him to do moodier genres. Because the main vocal "Straight" is not the medium vocal, this role falls upon "Soft". It is the bridge between the "Straight" and "Whisper" vocals.

    Product Information
      Genre: Ballads, Jazz, Soul  Optimum Range: F2 ~ D3  Optimum Tempo: 65 ~ 150 BPM
      Total Tempo (min-max): 85 BPM  No. of Keys: W ~ 6, B ~ 4, Total ~ 10
    Package details as noted:

    DB 3 WHISPER is a gentle and soft "whispery" vocal which is intended for calmer songs.

    Vocal traits as noted:
    • It has a more natural ability to utilize higher pitches.
    • With its optimum range allowing it to go down to a tempo of 65 BPM, it is more adept at slower songs than the other vocals within this pack.
    • Like "Soft", its optimum range is smaller than "Straight" or "English".
    • Whisper itself has the smallest optimum tempo range out of the KAITO vocals within this package.
    Phonetic notes as noted:
    • Crypton themselves commented that Whisper's use of VOCALOID3 triphones was "wonderful", indicating they are more significant in this particular vocal.
    Voicebank sample

    KAITO V3 Whisper

    Cross-Synthesis as noted:

    "Whisper" is the voicebank with the most loose sounding phonemes, thus the most emotionally capable of the 3 voicebanks. Its main role is to add sorrowful or calm tones to the main "Straight" voicebank. In XSY its role is to aid in the addition of emotion to KAITO's other voicebanks.

    Product Information
      Genre: Crossover, Dance, Electronica  Optimum Range: B1 ~ B2  Optimum Tempo: 70 ~ 190 BPM
      Total Tempo (min-max): 120 BPM  No. of Keys: W ~ 8, B ~ 5, Total ~ 13
    Package details as noted:

    DB 4 ENGLISH gives English capabilities to the V3 package.

    Vocal traits as noted:
    • Like "Straight", this is an all-round voice, intended to have a natural tone and to maintain a natural singing voice.
    • It has the largest tempo range capabilities of all vocals within the KAITO V3 package.
    • Like all vocals within the package, the English vocal has a relatively small optimum vocal range compared to some of the other vocals within the VOCALOID3 lineup. It currently ties with Hatsune Miku V3 English as the 2nd smallest tempo range of any VOCALOID3 vocal.
    • Wat mentioned during development that there were many annoyances encountered with this vocal, though he did not specify what they were.
    Phonetic notes as noted:
    • Has a strong Japanese accent, resulting in issues with pronunciation. For example, when the [u:] phoneme follows the [l0] phoneme (Clear L), the [l0] will sound like the [r] phoneme. As a result, typing [r u:] and [l u:] into his English voicebank will produce the same result. One solution is to input the phonetic combination [l w u:], though the [u:] will suffer adverse effects.
    Software issues as noted:
    • Despite "Straight" being its closest Japanese vocal, there are differences in range and tempo that may cause imperfect transitions when switching vocals.
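
    The "Total Tempo" and "No. of Keys" figures in the Product Information blocks of this section can be cross-checked against the listed optimum ranges. The following Python sketch is only an illustration: it assumes that "Total Tempo" is the maximum minus the minimum optimum tempo, and that "W" and "B" count the white and black piano keys between the two range endpoints, inclusive. All function and variable names are hypothetical.

```python
# Optimum tempo (BPM) and optimum range as listed in the Product Information blocks.
VOICEBANKS = {
    "Straight": {"tempo": (90, 200), "range": ("B1", "C3")},
    "Soft": {"tempo": (80, 180), "range": ("D2", "B2")},
    "Whisper": {"tempo": (65, 150), "range": ("F2", "D3")},
    "English": {"tempo": (70, 190), "range": ("B1", "B2")},
}

# Semitone offsets of the natural (white-key) notes within one octave.
NATURAL_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
WHITE_PITCH_CLASSES = set(NATURAL_OFFSETS.values())


def note_number(name):
    """Convert a natural note name such as "B1" to a semitone index."""
    letter, octave = name[0], int(name[1:])
    return octave * 12 + NATURAL_OFFSETS[letter]


def count_keys(low, high):
    """Return (white, black) key counts from low to high, inclusive."""
    notes = range(note_number(low), note_number(high) + 1)
    white = sum(1 for n in notes if n % 12 in WHITE_PITCH_CLASSES)
    return white, len(notes) - white


def tempo_span(bounds):
    """Width of the optimum tempo range in BPM (assumed to be max minus min)."""
    low, high = bounds
    return high - low


spans = {name: tempo_span(vb["tempo"]) for name, vb in VOICEBANKS.items()}
keys = {name: count_keys(*vb["range"]) for name, vb in VOICEBANKS.items()}

widest = max(spans, key=spans.get)        # largest optimum tempo range
narrowest = min(spans, key=spans.get)     # smallest optimum tempo range
fastest = max(VOICEBANKS, key=lambda n: VOICEBANKS[n]["tempo"][1])
slowest = min(VOICEBANKS, key=lambda n: VOICEBANKS[n]["tempo"][0])

for name in VOICEBANKS:
    white, black = keys[name]
    print(f"{name}: span {spans[name]} BPM, keys W {white}, B {black}, total {white + black}")
```

    Under these assumptions the computed spans and key counts agree with the listed values, and the comparisons match the vocal traits noted for each voicebank: "English" has the widest tempo span, "Whisper" the narrowest and the lowest minimum, and "Straight" the highest maximum.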
    Voicebank sample


    1. link
    2. http://twitter.com/vocaloid_cv_cfm/status/10899687460569088
    3. https://twitter.com/vocaloid_cv_cfm/status/13161254973612032
    4. tweet 11:412 PM - Dec 9 10
    5. link
    6. link
    7. link
    8. tweet 5:42 PM - 10 Dec 10
    9. link
    10. tweet 4:26 PM - 17 Dec 10
    11. tweet 7:31 PM - Dec 14 10
    12. tweet 4:37 AM - 14 Dec 10
    13. tweet 1:32 PM - Dec 17 10
    14. tweet 12:22 AM - 17 Dec 10
    15. tweet 3:19 PM - Dec 17 10
    16. link
    17. link
    18. link
    19. link
    20. link
    21. link
    22. tweet 10:44 PM - 8 Jan 11
    23. Piapro Blog - 【プレゼント企画】2月17日でKAITOは発売5周年!ということで、お祝い企画開催決定!! (KAITO's fifth Anniversary - 「KAITO Append(仮)」)
    24. tweet 10:40 PM - 8 Jan 11
    25. link
    26. link
    27. link
    28. link
    29. link
    30. tweet 8:50 PM - Apr 12 11
    31. tweet 21 Apr via web
    32. tweet 12:37 AM - 17 Dec 10
    33. link
    34. tweet 10:05 PM - 10 May 11
    35. Vocaloidism - Kaito Append Alpha Demo: Soft
    36. tweet 1:48 AM - 3 Jun 11
    37. link
    38. link
    39. link
    40. tweet 6:19 PM - 8 Jun 11
    41. link
    42. link
    43. link
    44. link
    45. link
    46. link
    47. link
    48. tweet 1:47 AM - 13 Jun 11
    49. tweet 3:01 AM - 13 Jun 11
    50. tweet 5:04 AM - 13 Jun 11
    51. link
    52. link
    53. tweet 7:06 AM - 13 Oct 11
    54. link
    55. tweet 7:28 AM - 17 Oct 11
    56. link
    57. link
    58. link
    59. link
    60. link
    61. link
    62. tweet 12:47 AM - 16 Feb 12
    63. tweet 12:24 AM - 16 Feb 12
    64. tweet 4:58 PM - 22 Feb 12
    65. tweet 01 AM - 16 Feb 12
    66. tweet 6:02 AM - 22 Feb 12
    67. tweet 5:55 AM - 22 Feb 12
    68. tweet 7:32 PM - 30 Mar 12
    69. tweet 3:39 AM - 2 Mar 12
    70. link
    71. link
    72. link
    73. link
    74. link
    75. link
    76. link
    77. link
    78. link
    79. link
    80. link translation
    81. link translation
    82. link translation
    83. link translation
    84. link translation part1 translation part2
    85. Crypton
    86. link
    87. Tweet about future vocals
    88. https://blog.piapro.net/2024/07/ms2407011.html
    89. link
    90. https://www.presonus.com/products/studio-one
    91. link
    92. link
    93. link
    94. link