The Word module performs grapheme-to-phoneme conversion.
The process is divided into three steps:
- Festival first tries to find the word in the addenda lexicon.
  This is a supplement to the lexicon that typically contains
  user-specific entries. Entries can be added using the Scheme
  function lex.add.entry. For the German lexicons the addenda
  lexicon is called sampa_addenda and is defined in the file
  ims_german_lexicons.scm.
 
- If the word is not found in the addenda lexicon, it is looked up
  in the compiled lexicon by binary search. We currently use two
  different full-form lexicons: the BOMP lexicon [5] from the
  University of Bonn, Germany, which is distributed together with the
  open-source version of IMS German Festival, and the Celex lexicon
  [1]. The German lexicons are in the directory
  festival/lib/german/dicts/.
 
- Since German word formation is highly productive, many words are not
  found in the lexicon. These words are converted to their
  transcription using letter-to-sound (LTS) rules. Since these rules
  often give unsatisfactory results, as many words of the application
  domain as possible should be in the lexicon. The LTS rules can be
  found in the file ims_german_lts.scm.
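
The three-step lookup order described above can be illustrated with a
minimal Python sketch. It is not Festival code: the names ADDENDA,
COMPILED, lts_rules and lookup are hypothetical, the entries and their
transcriptions are made-up placeholders, the lowercasing of the input
is an assumption of the sketch, and the toy LTS function merely spells
the word out letter by letter. Only the order of the steps (addenda,
then binary search in a sorted lexicon, then LTS rules) follows the
description.

```python
from bisect import bisect_left

# Hypothetical addenda: user-specific entries (word -> transcription).
# The transcriptions are illustrative placeholders, not real lexicon data.
ADDENDA = {"festival": "f E s t i v a l"}

# Hypothetical compiled lexicon: (word, transcription) pairs kept sorted
# by word so that a binary search can be used.
COMPILED = sorted([
    ("haus", "h aU s"),
    ("katze", "k a t s @"),
    ("tag", "t a: k"),
])
COMPILED_KEYS = [word for word, _ in COMPILED]


def lts_rules(word):
    """Toy stand-in for letter-to-sound rules: one symbol per letter."""
    return " ".join(word)


def lookup(word):
    """Return (transcription, source) following the three-step order."""
    key = word.lower()  # assumption of this sketch: keys are lowercase

    # Step 1: the addenda takes precedence over the compiled lexicon.
    if key in ADDENDA:
        return ADDENDA[key], "addenda"

    # Step 2: binary search in the sorted compiled lexicon.
    i = bisect_left(COMPILED_KEYS, key)
    if i < len(COMPILED_KEYS) and COMPILED_KEYS[i] == key:
        return COMPILED[i][1], "lexicon"

    # Step 3: fall back to LTS rules for out-of-vocabulary words.
    return lts_rules(key), "lts"
```

Because the addenda is consulted first, an entry placed there overrides
a compiled-lexicon entry for the same word, and only words missing from
both sources reach the LTS fallback.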
 
For a language like German, which is rich in derivations and compounds,
it is very helpful to have a morphological component with a lemma
lexicon instead of a full-form lexicon. Such a component is currently
being developed at the IMS.
 
 
 
  
Gregor Moehler
2001-07-17