Estonian SpeechDat: a project in progress


    Laboratory of Phonetics and Speech Technology
    Institute of Cybernetics at Tallinn Technical University
    Akadeemia tee 21, Tallinn 12618, Estonia


    A new database project has been launched at the Institute of Cybernetics this year. It aims to collect telephone speech from a large number of speakers for speech and speaker recognition purposes. At least 1000 speakers are expected to participate in the recordings. The SpeechDat databases, especially Finnish SpeechDat, have been chosen as a prototype for the Estonian database. This means that the principles of corpus design, file formats, and recording and labeling methods implemented by the SpeechDat consortium will be followed as closely as possible. The automatic recording system and labeling software developed by SpeechDat partners have been adopted for Estonian as well.

    The main characteristics of the Estonian SpeechDat database will be as follows:

  • Sampling rate: 8 kHz
  • Signal format: 8-bit A-law, mono
  • Signal source: calls from fixed and cellular phones
  • Calling environment: home/office, public place
  • Speakers: at least 1000 (500 female, 500 male)
  • Speech items: isolated digits, connected digits, natural numbers, money amounts, spelled words, time phrases, date phrases, yes/no questions, names, application words, phonetically rich words, application phrases, phonetically rich sentences.
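
    To make the signal format above concrete, the following is a minimal Python sketch of how 8-bit A-law samples could be expanded to linear PCM. It assumes the samples follow the standard ITU-T G.711 A-law encoding and are stored as raw bytes, one byte per sample. The function names are illustrative and are not part of the SpeechDat tools.

```python
SAMPLE_RATE_HZ = 8000  # sampling rate given in the list above


def alaw_to_linear(code: int) -> int:
    """Expand one 8-bit G.711 A-law code to a signed value in 16-bit PCM range."""
    code ^= 0x55                   # undo the even-bit inversion applied by G.711
    mantissa = (code & 0x0F) << 4  # 4-bit step within the segment
    segment = (code & 0x70) >> 4   # 3-bit segment number
    if segment == 0:
        magnitude = mantissa + 8
    else:
        magnitude = (mantissa + 0x108) << (segment - 1)
    # In A-law, a set sign bit marks a positive sample.
    return magnitude if code & 0x80 else -magnitude


def decode_alaw(data: bytes) -> list[int]:
    """Decode a raw A-law byte string (assumed headerless) to linear samples."""
    return [alaw_to_linear(b) for b in data]


def duration_seconds(data: bytes) -> float:
    """Duration of a mono A-law recording: one byte per sample at 8 kHz."""
    return len(data) / SAMPLE_RATE_HZ
```

    At one byte per sample and 8000 samples per second, one minute of speech in this format occupies 480,000 bytes.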

    The project is planned to last 24 months, divided into four main stages:

  • Preparatory activities (6 months)
  • Recordings (4-6 months)
  • Segmentation and labeling (10-12 months)
  • Completion (4-6 months)

    Currently, the preparatory activities have been completed and an intensive recording period is about to begin.

    In our presentation, different aspects of the first stage of the project will be discussed and up-to-date progress will be reported.