TTS needs a more comprehensive API (and ASR would be super!)

Hi,
I really appreciate BrowserPlus and have great use for the TTS. But hopefully the next version will have:

1. an event that tells the environment that speech has stopped.
2. a method to stop the speech
3. parameters to set the voice, volume, speed etc.

Of these I would value 1) most.

And if you could do with ASR (Automatic Speech Recognition) what you've done with TTS that would blow my mind!

Best regards,
Torbjörn

8 Replies
  • Thanks for posting!

    Let's see. Some way of JS understanding that speech is complete: how about we delay invocation of the return callback until the speech is complete?

    A method to stop speech: cancelation, pause, or both?

    Both good ideas to round out the API.

    In terms of voice/volume/speed, I'm a bit on the fence. Currently we defer to system settings. I don't know that I'd personally be comfortable if a webpage could override my volume setting, or alter the system voice after I've customized it. What do you think?

    best,
    lloyd
  • Lloyd,
    Thanks for the quick response!

    Delaying invocation of the return callback until the speech is complete would work fine, I think. At the moment the callback is called immediately, which isn't very useful since you already know when speech starts (but not when it is complete).
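
    To make the idea concrete, here is a minimal TypeScript sketch of a callback that fires only once speech has finished. It is not BrowserPlus code: `FakeSpeechEngine`, `speak`, and `finishCurrent` are hypothetical names, and `finishCurrent` merely stands in for the platform reporting that playback has ended.

    ```typescript
    // Hypothetical sketch: none of these names belong to the real BrowserPlus API.
    type DoneCallback = (result: { text: string; completed: boolean }) => void;

    class FakeSpeechEngine {
      private pending: { text: string; onDone: DoneCallback }[] = [];

      // Start "speaking". The callback is stored here, not invoked yet.
      speak(text: string, onDone: DoneCallback): void {
        this.pending.push({ text, onDone });
      }

      // Simulates the platform signalling that the oldest utterance finished.
      finishCurrent(): void {
        const job = this.pending.shift();
        if (job) {
          job.onDone({ text: job.text, completed: true });
        }
      }
    }

    const log: string[] = [];
    const engine = new FakeSpeechEngine();
    engine.speak("Hello", (r) => { log.push("done: " + r.text); });
    log.push("speak() returned");
    engine.finishCurrent(); // only now does the callback run
    ```

    With this ordering the caller learns when speech is complete, which is the part that can't be known otherwise.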

    Methods to stop speech: Your suggestion makes sense: cancel, pause, resume would cover it all.
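
    As a rough illustration of how cancel, pause, and resume could fit together, here is a small state-machine sketch in TypeScript. The class, method, and state names are assumptions made for this example only, not an existing API.

    ```typescript
    // Hypothetical sketch: state and method names are illustrative only.
    type SpeechState = "idle" | "speaking" | "paused";

    class SpeechController {
      private state: SpeechState = "idle";

      getState(): SpeechState {
        return this.state;
      }

      // Begin a new utterance (the actual audio output is omitted here).
      speak(): void {
        this.state = "speaking";
      }

      // pause() only has an effect while speaking.
      pause(): boolean {
        if (this.state !== "speaking") return false;
        this.state = "paused";
        return true;
      }

      // resume() only has an effect while paused.
      resume(): boolean {
        if (this.state !== "paused") return false;
        this.state = "speaking";
        return true;
      }

      // cancel() drops the utterance whether it is speaking or paused.
      cancel(): boolean {
        if (this.state === "idle") return false;
        this.state = "idle";
        return true;
      }
    }
    ```

    Each method returns whether it changed anything, so a page can tell a real pause from a call made at the wrong moment.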

    I see your point about a webpage altering system settings. Couldn't you change these settings without changing the system prefs? Or change them, and then change them back after speech has completed? If this isn't possible, I guess I could live without the ability to change voice, volume, etc.

    Thanks,
    Torbjörn
  • Lloyd,
    There's an added complication: Suppose we send a text in English to the TTS on a computer which has (say) a German or French TTS as default. It will sound funny. Therefore you want to be able to state what language you are using. And maybe you want to be able to check what languages are installed.
    Best,
    Torbjörn
  • QUOTE (torbjorn_laurell @ Nov 3 2008, 07:40 AM) <{POST_SNAPBACK}>
    Lloyd,
    There's an added complication: Suppose we send a text in English to the TTS on a computer which has (say) a German or French TTS as default. It will sound funny. Therefore you want to be able to state what language you are using. And maybe you want to be able to check what languages are installed.
    Best,
    Torbjörn


    another fine point. I guess one should be able to query supported locales for TTS, and supply a locale along with the utterance...
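
    A minimal sketch of what that could look like, in TypeScript. Everything here is assumed for illustration: the `installedLocales` list is a stand-in for a real platform query, and `supportedLocales` and `pickLocale` are hypothetical names.

    ```typescript
    // Hypothetical sketch: the locale list and function names are assumptions.
    const installedLocales: string[] = ["en-US", "de-DE"]; // stand-in for a platform query

    // Return a copy so callers cannot modify the underlying list.
    function supportedLocales(): string[] {
      return installedLocales.slice();
    }

    // Extract the language part of a tag such as "en-US" -> "en".
    function languageOf(tag: string): string {
      const dash = tag.indexOf("-");
      return (dash === -1 ? tag : tag.slice(0, dash)).toLowerCase();
    }

    // Prefer an exact match, then any installed voice for the same language.
    // Returns null when nothing suitable is installed.
    function pickLocale(requested: string): string | null {
      const available = supportedLocales();
      if (available.indexOf(requested) !== -1) return requested;
      for (const candidate of available) {
        if (languageOf(candidate) === languageOf(requested)) return candidate;
      }
      return null;
    }
    ```

    A page could then call something like `pickLocale("en-GB")` before speaking and skip the utterance when the result is null, rather than having English text read by a German voice.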

    gosh, with all of these ideas, it sure would be nice if anyone could go in and implement fixes and new features... Stay tuned on this one... :D

    lloyd
  • QUOTE (torbjorn_laurell @ Nov 3 2008, 07:26 AM) <{POST_SNAPBACK}>
    Lloyd,
    Thanks for the quick response!

    Delaying invocation of the return callback until the speech is complete would work fine, I think. At the moment the callback is called immediately, which isn't very useful since you already know when speech starts (but not when it is complete).

    Methods to stop speech: Your suggestion makes sense: cancel, pause, resume would cover it all.

    I see your point about a webpage altering system settings. Couldn't you change these settings without changing the system prefs? Or change them, and then change them back after speech has completed? If this isn't possible, I guess I could live without the ability to change voice, volume, etc.

    Thanks,
    Torbjörn


    Torbjörn,

    It's certainly possible to temporarily change settings such as volume, or even to allow one to specify the volume and have it limited by the system setting (if system sound is muted, can't play voice). So maybe there is a safe compromise somewhere.

    The case I'm fearful of is when I mute my machine by reducing system volume in a quiet office setting/conference/classroom. I wouldn't want a webpage to have any way to override that.
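
    As a rough sketch of that compromise, in TypeScript: the page may request a volume, but the result never exceeds the system setting, and a muted system always wins. The function name `effectiveVolume` and the convention that volumes are fractions between 0 and 1 are assumptions for this example.

    ```typescript
    // Hypothetical sketch: volumes are assumed to be fractions in the range [0, 1].
    function effectiveVolume(
      requested: number,
      systemVolume: number,
      systemMuted: boolean
    ): number {
      // A muted machine stays silent no matter what the page asks for.
      if (systemMuted) return 0;
      // Clamp the page's request into the valid range first.
      const clamped = Math.min(Math.max(requested, 0), 1);
      // The system setting is the upper bound; a page can only go lower.
      return Math.min(clamped, systemVolume);
    }
    ```

    The page's request is treated purely as a ceiling it sets for itself, so it can make speech quieter but never louder than the user chose.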

    lloyd
  • QUOTE (Lloyd Hilaiel @ Nov 3 2008, 08:09 AM) <{POST_SNAPBACK}>
    The case I'm fearful of is when I mute my machine by reducing system volume in a quiet office setting/conference/classroom. I wouldn't want a webpage to have any way to override that.


    I see what you mean. How about letting the user's setting form the upper bound for volume - so that a webpage can only (temporarily) adjust the volume downwards?

    - Torbjörn
  • Oh, I just noticed that what I wrote simply repeated what you'd already suggested. Sorry about that!
    - Torbjörn
  • I think it would be useful to be able to change the voice too, so I could implement voice fonts (using different voices for things visually represented as bold, grey, etc.)