Talk:UTAU/@comment-27598213-20190108194820/@comment-53539-20190108214331

I know nothing about UTAU creation. I will note, however, that a lot of UTAUs fall flat with languages. The main issue is that you'll be off in any language you're not great at. There are UTAUs out there with 25 languages, and it often turns out only 2 are good, for example.

I can only talk from Vocaloid... With Vocaloid, you have to use a script to get the best results; in other words, you record yourself saying various words and then extract the sounds to make the samples. Each sample is at least 0.3 seconds for Japanese and at least 0.5 seconds for English. UTAUs tend to be basically random, with some makers only recording the sounds as they are, but some sounds can only be fully and successfully produced in conjunction with word usage and can't actually be recorded correctly on their own. With no sense of quality checking for most UTAUs, this is why UTAUs have a reputation for being poor when it comes to multiple languages, especially when you end up recording only a fraction of the sounds needed for each language. English needs 8,300 sounds within Vocaloid (V2 era) and Japanese needs 1,500 (again V2 era; these amounts are greater now thanks to triphones). So this is why I often argue with people over just how good UTAU is.
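To put those numbers in perspective, here is a rough sketch in Python that works out the minimum amount of usable sample audio implied by the figures above. It only uses the Vocaloid V2-era counts and minimum lengths I quoted, so it says nothing about what UTAU itself requires. The function names and the recorded count in the usage note are made up for illustration.

```python
# Figures quoted in the comment above (Vocaloid, V2 era).
# Durations are the stated per-sample minimums, in milliseconds.
REQUIREMENTS = {
    "Japanese": {"sounds": 1500, "min_ms": 300},
    "English": {"sounds": 8300, "min_ms": 500},
}


def minimum_audio_seconds(language):
    """Lower bound on total sample audio if every sound is at its minimum length."""
    req = REQUIREMENTS[language]
    return req["sounds"] * req["min_ms"] / 1000


def coverage(language, recorded):
    """Fraction of the needed sounds covered by a (hypothetical) recorded count."""
    return recorded / REQUIREMENTS[language]["sounds"]


for language in REQUIREMENTS:
    seconds = minimum_audio_seconds(language)
    print(f"{language}: at least {seconds / 60:.1f} minutes of usable samples")
```

That comes out to about 7.5 minutes for Japanese and about 69 minutes for English, and that is only the finished samples, not the raw recording time. As an example of the "fraction" problem, `coverage("English", 2075)` gives 0.25, meaning a bank with 2,075 English sounds (a number I picked purely as an example) would cover a quarter of what V2 needed.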

The more sounds you leave out that you need, the worse the quality of your vocal is going to be, just bear that in mind. The only time more doesn't always help is with triphones; too many can apparently be bad. But I thought I'd say this because you're skipping sounds and that could cost you.