martinvicanek wrote: My use of it is mainly to analyze the synthesized material in super slow motion in order to test and improve my own algos, but you can also use it just for fun!
After playing around for a while, I wonder whether there might be uses in speech therapy and language learning for a system like this. I noted in particular that the pitch variation control does a very good job of enhancing, removing, or inverting prosodic cues - for example, the phrase in the included sample seems to change from being a statement to being a question when the pitch variation is inverted.
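For anyone curious what a control like that might be doing, here is a minimal sketch of the idea as I understand it. It is only my guess, not martinvicanek's actual algorithm: it assumes you already have a per-frame pitch contour in Hz (with 0 marking unvoiced frames) and simply scales each frame's deviation from the average pitch. The function name `scale_pitch_variation` and the `factor` parameter are made up for illustration, and the resynthesis step that would turn the new contour back into audio is not shown.

```python
import math

def scale_pitch_variation(f0_hz, factor):
    """Scale deviations of a pitch contour around its mean pitch.

    f0_hz  : per-frame fundamental frequency in Hz; 0 marks an unvoiced frame
    factor : 1.0 = unchanged, >1 = exaggerated, 0 = monotone, <0 = inverted
    (Hypothetical helper, not taken from the posted tool.)
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        return list(f0_hz)
    # Work in log-frequency so that equal musical intervals are treated equally.
    mean_log = sum(math.log2(f) for f in voiced) / len(voiced)
    out = []
    for f in f0_hz:
        if f > 0:
            # Move each voiced frame toward or away from the mean pitch.
            out.append(2.0 ** (mean_log + factor * (math.log2(f) - mean_log)))
        else:
            # Leave unvoiced frames untouched.
            out.append(0.0)
    return out

# A falling contour (statement-like) with one unvoiced frame in the middle.
falling = [200.0, 0.0, 100.0]
flat = scale_pitch_variation(falling, 0.0)       # variation removed
inverted = scale_pitch_variation(falling, -1.0)  # variation mirrored
```

With a negative factor the falling contour becomes a rising one, which would fit the statement-to-question effect I heard in the sample.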
Such manipulation and/or analysis of prosodic cues, perhaps combined with visual feedback, might be a useful supplement to sessions with a speech therapist for improving the perception or production of fluent prosody, which is often found difficult by autistic people and by people with various hearing impairments, aphasias, etc. Likewise, I imagine it could be an aid for learning the pronunciation of tonal languages (e.g. Mandarin, Cantonese, Vietnamese, or Thai) for learners whose first language is non-tonal.
As a little experiment, I tried it on some recordings of my own speech. My prosody is often described by other people as very flat (including formally, at my Asperger's Syndrome diagnosis), though it doesn't sound that way to me "inside my head" when I'm speaking. Of course, this is hardly a scientific, blinded experiment, but it was interesting to find that exaggerating the pitch variation does indeed make my voice sound more "typical" of what I hear in other people's voices. Yet the excellent quality of the processing is such that it remains recognisable as my voice rather than that of a different speaker.
I had a great time using it "just for fun" too, of course. But, as ever, I think you are too modest; tools like these, in the right hands, may have the potential to be much more than just "toys" or DSP coding aids!