In the module Word, grapheme-to-phoneme conversion takes place.
This process is divided into three steps:
- Festival first tries to find the word in the addenda lexicon.
This is a supplement to the lexicon which typically contains some
user-specific entries. Entries can be added using the
Scheme function lex.add.entry. For the German lexicons,
the addenda is called sampa_addenda and is defined in the file
ims_german_lexicons.scm.
- If the word is not found in the addenda lexicon, it is looked up
in the compiled lexicon by binary search.
We currently use two different full-form lexicons:
the BOMP lexicon [5] from the University of Bonn, Germany,
which is distributed together with the Open-Source version of IMS
German Festival, and the Celex lexicon [1]. German
lexicons are in the directory festival/lib/german/dicts/.
- Since German word formation is highly productive, many words are not
found in the lexicon. These words are converted to their
transcription using letter-to-sound (LTS) rules. Since
these rules often give unsatisfactory results, as many words of the
application domain as possible should be in the lexicon. The LTS
rules can be found in the file ims_german_lts.scm.
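
The three-step lookup described above can be sketched roughly as follows. This is an illustrative Python model only, not Festival's actual implementation (which is written in C++ and Scheme). All names (ADDENDA, COMPILED, transcribe, letter_to_sound) and the toy entries are assumptions made for the example; the transcriptions are made-up SAMPA-like strings, not entries taken from BOMP or Celex, and the fallback is a trivial placeholder rather than real LTS rules.

```python
from bisect import bisect_left

# Hypothetical addenda: small, user-specific supplement consulted first.
ADDENDA = {"festival": "f E s t i v a l"}

# Hypothetical stand-in for the compiled lexicon: entries sorted by
# headword so that a binary search can be used.
COMPILED = sorted([
    ("haus", "h aU s"),
    ("katze", "k a t s @"),
    ("wort", "v O 6 t"),
])
COMPILED_KEYS = [word for word, _ in COMPILED]


def lookup_compiled(word):
    """Binary search over the sorted headwords; None if the word is absent."""
    i = bisect_left(COMPILED_KEYS, word)
    if i < len(COMPILED_KEYS) and COMPILED_KEYS[i] == word:
        return COMPILED[i][1]
    return None


def letter_to_sound(word):
    """Placeholder fallback: one symbol per letter.

    Real LTS rules are context-sensitive; this only marks where they apply.
    """
    return " ".join(word)


def transcribe(word):
    """Return (transcription, source), trying addenda, lexicon, then LTS."""
    key = word.lower()
    if key in ADDENDA:
        return ADDENDA[key], "addenda"
    found = lookup_compiled(key)
    if found is not None:
        return found, "lexicon"
    return letter_to_sound(key), "lts"
```

The order of the checks mirrors the list above: an addenda entry overrides the compiled lexicon, and the rule-based fallback is reached only when both lookups fail, which is why adding domain vocabulary to the lexicon keeps words away from the less reliable last step.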
For a language like German, which is rich in derivations and compounds,
it is very helpful to have a morphological component with a lemma
lexicon instead of a full-form lexicon. Currently, such a component is
being developed at the IMS.
Gregor Moehler
2001-07-17