Research Theme
Communication through speech production and speech perception is one
of the basic ways for humans to exchange information. The research
goal of our laboratory is to fully understand these human mechanisms
and to realize them on a computer. To this end, we are pursuing the
following research topics.
Research Topics
Speech Production Mechanisms and Their Modeling
A number of questions about the mechanisms of speech production remain
unsolved, especially for the production of emotional speech. To answer
these questions, we use a physiological articulatory model, developed
from MRI data by this Lab and ATR, to simulate the process from
articulatory target to speech sound and the inverse process from
speech sound to articulatory target. By iterating between these
forward and inverse simulations, we can approach the ``true''
mechanisms. An additional part of this topic is to refine the
articulatory model based on physiological discoveries.
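
The iteration between forward simulation and inverse estimation can be
illustrated with a minimal sketch. Everything in it is an assumption
made for illustration: synthesize() is a hypothetical one-parameter
linear stand-in for the articulatory model, and the step size and
iteration count are arbitrary. It is not the laboratory's model.

```python
def synthesize(target):
    """Hypothetical forward model: articulatory target -> acoustic value."""
    return 2.0 * target + 1.0


def estimate_target(observed, initial=0.0, step=0.25, iterations=50):
    """Inverse estimation by repeated synthesis and comparison.

    Each pass synthesizes sound from the current target estimate and
    moves the estimate in proportion to the remaining acoustic error.
    """
    target = initial
    for _ in range(iterations):
        error = observed - synthesize(target)
        target += step * error
    return target


true_target = 3.0
observed = synthesize(true_target)      # forward: target -> sound
recovered = estimate_target(observed)   # inverse: sound -> target
```

With this toy forward model the estimate converges to the target that
generated the observed value. A real articulatory model is nonlinear
and many-dimensional, so convergence would not be this simple.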
Speech Cognitive Science
Speech cognition (perception) can be regarded as the inverse of speech
production. Because many different articulatory configurations can
produce the same sound, cognition involves a one-to-many inverse
problem, which is a crucial topic in speech cognition. We will tackle
this problem by using the physiological articulatory model to
investigate its causes, which relate to the stability of the
articulatory configuration and to physiological and morphological
constraints.
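
A toy example may make the one-to-many inverse problem concrete. The
forward map, the parameter names (jaw, tongue), the grid, and the rest
posture below are all hypothetical, chosen only for illustration. The
sketch shows that several articulatory settings yield the same
acoustic value, and that an added constraint can single out one of
them.

```python
from itertools import product


def forward(jaw, tongue):
    """Hypothetical forward map: two articulatory parameters -> one acoustic value."""
    return 500 + 100 * jaw + 100 * tongue


def inverse_candidates(sound, grid=range(0, 5)):
    """Return every (jaw, tongue) pair on the grid that produces the given sound."""
    return [(j, t) for j, t in product(grid, repeat=2) if forward(j, t) == sound]


def pick_constrained(candidates, rest=(1, 1)):
    """Choose the candidate closest to a rest posture.

    The rest posture is an assumed stand-in for a physiological constraint.
    """
    return min(
        candidates,
        key=lambda c: (c[0] - rest[0]) ** 2 + (c[1] - rest[1]) ** 2,
    )


candidates = inverse_candidates(900)   # five different settings, one sound
chosen = pick_constrained(candidates)  # the constraint selects a single setting
```

Here five grid settings map to the same acoustic value, and the
distance-to-rest criterion resolves the ambiguity. Which constraints
actually resolve it in human speech is the open question stated above.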
Speech Communication within the Brain
According to the motor theory of speech perception, a well-known
hypothesis, speech is perceived with reference to an image or
knowledge held in the motor (production) areas (Liberman et al., 1960,
1985). In this research, we aim to test this theory by investigating
the interaction between speech perception and production through
acoustic analysis, EMG measurement, and articulatory observation.
Speech Synthesis with Specific Individuality and Emotion
The individuality of speech depends on physiological (inborn) factors
and social (habit-forming) factors. In this study, we focus on
analyzing and modeling the effects of the former on speech. Emotion is
paralinguistic information that describes the state of the speaker and
cannot be logically produced. We study emotional speech generation by
applying our experience to the articulatory model, and we aim to
clarify the relation between emotion and acoustic parameters other
than the fundamental frequency.
Speech Recognition Considering Auditory, Articulatory and
Physiological Features
We aim to develop novel speech recognition methods that take human
mechanisms into account. We use properties of human hearing to develop
a recognition method that is robust in noisy environments,
coarticulatory mechanisms to recognize speech with missing portions,
and physiological features for speaker identification.