Text To Speech

Here is a report on various ways to make your computer talk.

 


    Text to Speech (TTS)

    Various companies (AT&T, IBM, and smaller companies) have been working to produce more natural sounding voices. Their work improves prosody -- the rhythm, intonation, and lexical stress in speech not represented in the orthography (written representation) of text.

    Microsoft Text to Speech

  • YOUTUBE: Change Default Voice in Text to Speech
  • Microsoft Windows has a built-in voice engine.

    In the Windows 8 Control Panel, there is a Text to Speech control where you can select voices.

    I prefer the UK_EN British accent. This is not so much because I think British accents are elegant (which I do), but because I am used to hearing American accents. Since British accents are unfamiliar to my brain's autonomic language processing, I am less bothered by unnatural intonation and stress than when listening to a "fake" American computer voice.

    Microsoft Windows comes with a speech program accessible through its Speech API (SAPI 4 or SAPI 5.1). The "Sam" voice supplied by Microsoft sounds like a robot, with unnatural pauses and emphasis.

    TTS Add-on to Windows

    Voices on Android.

    Ken Fallon's blog describes a shell script that turns Wikipedia text into speech.

    Speech for Windows Phone 8

    Phonemes

    A computer may take several seconds to produce speech because it has to look up each word in its database of phonemes, which tells the speech engine how to pronounce that word. The phonemes are then converted into a sound file. All this takes considerable computational power.
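The lookup step described above can be sketched roughly as follows. This is a minimal illustration, not how any particular engine works: the `LEXICON` entries and the function name `words_to_phonemes` are made up for this example, and a real engine would use a far larger dictionary plus letter-to-sound rules for unknown words.

```python
# Toy pronunciation dictionary: word -> list of darpabet phonemes.
# The entries are illustrative only (hypothetical, not from any engine).
LEXICON = {
    "bat": ["b", "ae", "t"],
    "boat": ["b", "ow", "t"],
    "dog": ["d", "ao", "g"],
    "jump": ["jh", "ah", "m", "p"],
}


def words_to_phonemes(text):
    """Look up each word of text; return (phonemes, unknown_words)."""
    phonemes = []
    unknown = []
    for word in text.lower().split():
        word = word.strip(".,!?;:")  # drop surrounding punctuation
        if not word:
            continue
        entry = LEXICON.get(word)
        if entry is None:
            # A real engine would fall back to letter-to-sound rules here.
            unknown.append(word)
        else:
            phonemes.extend(entry)
    return phonemes, unknown


if __name__ == "__main__":
    print(words_to_phonemes("Dog, jump!"))
```

Every word costs a dictionary lookup before any audio can be generated, which is one reason long passages take noticeable time to start speaking.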

    A phoneme string consists of one or more phoneme symbols and stress marks, optionally separated by whitespace.
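Because the whitespace is optional, a reader of phoneme strings has to decide where one symbol ends and the next begins. The sketch below uses the darpabet symbols from the table on this page and prefers the two-letter reading when both are possible. It assumes that a stress mark (0, 1, or 2) follows the phoneme it applies to; the function name `tokenize` is hypothetical and this is not any vendor's actual parser.

```python
# darpabet symbols as listed in the phoneme table on this page.
SYMBOLS = {
    "ey", "ae", "iy", "eh", "ay", "ih", "ow", "aa", "ao", "aw", "oy",
    "ah", "ax", "uw", "uh", "er", "b", "ch", "d", "dx", "f", "g", "hh",
    "jh", "k", "l", "em", "m", "en", "n", "ng", "p", "q", "r", "s", "sh",
    "t", "dh", "th", "v", "w", "y", "z", "zh",
}
STRESS_MARKS = {"0", "1", "2"}


def tokenize(phoneme_string):
    """Split a phoneme string into (symbol, stress) pairs.

    stress is None when no stress mark follows the symbol.
    Assumption: a stress mark attaches to the preceding phoneme.
    """
    tokens = []
    i = 0
    while i < len(phoneme_string):
        ch = phoneme_string[i]
        if ch.isspace():
            i += 1
            continue
        if ch in STRESS_MARKS:
            if not tokens:
                raise ValueError("stress mark with no preceding phoneme")
            symbol, _ = tokens[-1]
            tokens[-1] = (symbol, int(ch))
            i += 1
            continue
        # Try the two-letter symbol first, then the one-letter symbol.
        for width in (2, 1):
            candidate = phoneme_string[i:i + width]
            if candidate in SYMBOLS:
                break
        else:
            raise ValueError(f"unknown phoneme at position {i}: {ch!r}")
        tokens.append((candidate, None))
        i += len(candidate)
    return tokens


if __name__ == "__main__":
    print(tokenize("ax b aw1 t"))
```

Unseparated strings can be ambiguous (for example "t" followed by "hh" looks like "th" plus a stray "h"), so separating symbols with whitespace is the safer habit.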

    The elapsed time from when the client first sends text to the server until the client receives the first audio buffer from the server is measured by a metric named TTFA (time first audio received).
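A client could measure TTFA along these lines. This is only a sketch: `measure_ttfa` and `fake_server` are hypothetical names, and the "server" here is a local generator that sleeps before yielding audio, standing in for a real network connection.

```python
import time


def measure_ttfa(send_text, text):
    """Return (ttfa_seconds, first_buffer).

    send_text(text) is assumed to return an iterable of audio buffers.
    """
    start = time.perf_counter()        # client sends the text
    stream = iter(send_text(text))
    first_buffer = next(stream)        # blocks until the first buffer arrives
    ttfa = time.perf_counter() - start
    return ttfa, first_buffer


def fake_server(text, delay=0.05):
    """Stand-in for a TTS server: waits, then streams two silent buffers."""
    time.sleep(delay)                  # simulated lookup and synthesis time
    yield b"\x00" * 320
    yield b"\x00" * 320


if __name__ == "__main__":
    ttfa, buf = measure_ttfa(fake_server, "hello world")
    print(f"TTFA: {ttfa * 1000:.1f} ms, first buffer: {len(buf)} bytes")
```

Note that the clock stops at the first buffer, not at the end of the stream, so TTFA reflects how soon speech can start rather than how long the whole utterance takes to synthesize.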

    There are several phoneme sets.
    The SAMPA phoneme set is used internationally for German, French, etc.
    The DARPA phoneme set (nicknamed the darpabet) is used by US English voices to represent the sounds in the English language.
    The IPA (International Phonetic Alphabet) was devised by the IPA (International Phonetic Association), established in 1886 in Paris, to represent the sounds of all languages. So, unlike the darpabet, it uses non-Latin characters: 107 distinct letters and 56 diacritics and suprasegmentals, visible in a font of their own design.
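To make the difference between the darpabet and the IPA concrete, here is a small conversion sketch. The mapping is not from this page; it follows the commonly cited broad correspondences between the two alphabets, so treat it as an assumption. The names `DARPABET_TO_IPA`, `SAME_IN_IPA`, and `to_ipa` are made up for this example.

```python
# Commonly cited darpabet -> IPA correspondences (broad transcription).
# This table is an assumption for illustration, not taken from this page.
DARPABET_TO_IPA = {
    "ey": "eɪ", "ae": "æ", "iy": "i", "eh": "ɛ", "ay": "aɪ", "ih": "ɪ",
    "ow": "oʊ", "aa": "ɑ", "ao": "ɔ", "aw": "aʊ", "oy": "ɔɪ", "ah": "ʌ",
    "ax": "ə", "uw": "u", "uh": "ʊ", "er": "ɝ",
    "ch": "tʃ", "jh": "dʒ", "sh": "ʃ", "zh": "ʒ", "th": "θ", "dh": "ð",
    "ng": "ŋ", "hh": "h", "y": "j", "dx": "ɾ", "q": "ʔ",
    "em": "m\u0329", "en": "n\u0329",  # syllabic m and n (combining mark)
}
# Single-letter consonants written the same way in broad IPA.
SAME_IN_IPA = set("bdfgklmnprstvwz")


def to_ipa(phonemes):
    """Convert a list of darpabet symbols to an IPA string."""
    ipa = []
    for p in phonemes:
        if p in DARPABET_TO_IPA:
            ipa.append(DARPABET_TO_IPA[p])
        elif p in SAME_IN_IPA:
            ipa.append(p)
        else:
            raise KeyError(f"no IPA mapping for {p!r}")
    return "".join(ipa)


if __name__ == "__main__":
    print(to_ipa(["th", "ih", "k"]))
```

The darpabet side stays within plain ASCII, while the IPA side needs characters such as θ and ɪ, which is why the darpabet is convenient to type into a text file or markup attribute.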

    Apple Macintosh computers come with MacinTalk, an embedded text-to-speech voice synthesizer that turns ASCII text into speech through the computer's speaker.

    Apple's North American phoneme text symbols represent vowels as pairs of uppercase letters and consonants by single letters. However, the DARPA phoneme set (the "darpabet" used by AT&T for English) does not capitalize vowels:

    Phoneme  Example Word  Example Transcription
    ey       bait          b ey t
    ae       bat           b ae t
    iy       beat          b iy t
    eh       bet           b eh t
    ay       bite          b ay t
    ih       bit           b ih t
    ow       boat          b ow t
    aa       bob           b aa b
    ao       bought        b ao t
    aw       brown         b r aw n
    oy       boy           b oy
    ah       but           b ah t
    ax       about         ax b aw t
    uw       boot          b uw t
    uh       book          b uh k
    er       bird          b er d
    b        bet           b eh t
    ch       church        ch er ch
    d        dog           d ao g
    dx       butter        b ah dx er
    f        fog           f ao g
    g        got           g aa t
    hh       hot           hh aa t
    jh       jump          jh ah m p
    k        kit           k ih t
    l        lot           l aa t
    em       Chatham       ch ae t em
    m        Mom           m aa m
    en       satin         s ae t en
    n        nod           n aa d
    ng       thing         th ih ng
    p        pot           p aa t
    q        button        b ah q en
    r        rat           r ae t
    s        sat           s ae t
    sh       shut          sh ah t
    t        top           t aa p
    dh       that          dh ae t
    th       thick         th ih k
    v        vat           v ae t
    w        won           w ah n
    y        you           y uw
    z        zoo           z uw
    zh       measure       m eh zh er

    Modifiers (also called prosodic control symbols) are used to specify emphasis and pauses. In the DARPA phoneme set and in Apple's symbols, these are:

    Description       darpabet  Apple  SAMPA
    Silence           pau       %
    No stress         0
    Breath intake               @
    Primary stress    1
    Secondary stress  2
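As a rough illustration of how the darpabet modifiers above might be read, the sketch below labels each token in a transcription. It assumes the stress digit is written directly after the phoneme it applies to (for example `ow1`), which this page does not state; the names `STRESS_LABELS` and `describe` are hypothetical.

```python
# darpabet modifiers from the table above.
STRESS_LABELS = {
    "0": "no stress",
    "1": "primary stress",
    "2": "secondary stress",
}
SILENCE = "pau"


def describe(tokens):
    """Label each token as (phoneme, modifier_description or None).

    Assumption: a stress digit is appended to its phoneme, e.g. "ow1".
    """
    described = []
    for token in tokens:
        if token == SILENCE:
            described.append((token, "silence"))
        elif token[-1] in STRESS_LABELS:
            described.append((token[:-1], STRESS_LABELS[token[-1]]))
        else:
            described.append((token, None))  # consonant or unmarked phoneme
    return described


if __name__ == "__main__":
    for phoneme, note in describe(["hh", "ah0", "l", "ow1", "pau"]):
        print(phoneme, note or "")
```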

     

     