E.Meister, einar@ioc.ee
J.Lasn, jyrgen@phon.ioc.ee
L.Meister, lya@phon.ioc.ee
Laboratory of Phonetics and Speech Technology
Institute of Cybernetics at Tallinn Technical University
Akadeemia tee 21, Tallinn 12618, Estonia
ABSTRACT
A new database project has been launched at the Institute of Cybernetics this year. It
aims to collect telephone speech from a large number of speakers for speech and
speaker recognition purposes. At least 1000 speakers are expected to participate in the
recordings. The SpeechDat databases (http://www.speechdat.org), especially Finnish SpeechDat,
have been chosen as a prototype for the Estonian database. This means that the principles of
corpus design, file formats, and recording and labeling methods implemented by the SpeechDat
consortium will be followed as closely as possible. The automatic recording system and
labeling software developed by SpeechDat partners have also been adopted for Estonian.
The main characteristics of the Estonian SpeechDat database will be as follows:
- Sampling rate: 8 kHz
- Signal format: 8-bit A-law, mono
- Signal source: calls from fixed and cellular phones
- Calling environment: home/office, public place
- Speakers: at least 1000 (500 female, 500 male)
- Speech items: isolated digits, connected digits, natural numbers, money amounts,
  spelled words, time phrases, date phrases, yes/no questions, names, application words,
  phonetically rich words, application phrases, phonetically rich sentences.
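As an illustration of the signal format listed above, the following minimal sketch expands
8-bit A-law samples to 16-bit linear PCM using the standard G.711 expansion rule. It
assumes headerless (raw) sample data at 8 kHz; the function names are hypothetical and
are not part of the SpeechDat tools.

```python
def alaw_to_linear(code: int) -> int:
    """Expand one 8-bit G.711 A-law code to a signed 16-bit linear sample."""
    code ^= 0x55                       # undo the even-bit inversion of the encoder
    mantissa = (code & 0x0F) << 4      # 4-bit mantissa, shifted into position
    segment = (code & 0x70) >> 4       # 3-bit segment (exponent)
    if segment == 0:
        magnitude = mantissa + 8
    else:
        magnitude = (mantissa + 0x108) << (segment - 1)
    # In A-law a set sign bit (after inversion) marks a positive sample.
    return magnitude if code & 0x80 else -magnitude


def decode_alaw(data: bytes) -> list:
    """Decode a raw A-law byte string into a list of linear samples."""
    return [alaw_to_linear(b) for b in data]


def duration_seconds(data: bytes, sampling_rate: int = 8000) -> float:
    """Duration of a mono recording stored as one byte per sample (assumed 8 kHz)."""
    return len(data) / sampling_rate
```

With one byte per sample and a sampling rate of 8 kHz, one second of speech occupies
8000 bytes, so `duration_seconds` can also serve as a quick sanity check on file sizes.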
The project is planned to last 24 months, divided into four main stages:
- Preparatory activities (6 months)
- Recordings (4-6 months)
- Segmentation and labeling (10-12 months)
- Completion (4-6 months)
The preparatory activities have now been completed, and an intensive recording period is
about to begin.
In our presentation, different aspects of the first stage of the project will be discussed
and up-to-date progress will be reported.