- Robust Online Visual Multi-Target Object Tracking
- MRI Signal Processing
- Audio/speech signal processing
- Automatic speech recognition
- Robust speech recognition
- Speech enhancement
- Speaker adaptation in speech recognition
- Speaker/language recognition
- Audio/speech source localization/tracking/separation
- Audio and speech watermarking
- Image processing
Online multi-object tracking aims at producing complete
tracks of multiple objects using the information accumulated
up to the present moment. It still remains a difficult
problem in complex scenes, because of frequent occlusion
by clutter or other objects, similar appearances of different
objects, and other factors.
Tracking-by-detection has proven to be the most successful strategy to address the task of tracking multiple targets in unconstrained scenarios
Tracking-by-detection has proven to be the most successful strategy to address the task of tracking multiple targets in unconstrained scenarios
Speech recognition systems decode the speech signal into meaningful words, phrases or sentences.
Robustness in speech recognition refers to the need to maintain good recognition accuracy even when
the quality of the input speech is degraded, or when the acoustical, articulatory, or phonetic
characteristics of speech in the training and testing environments differ. Obstacles to robust
recognition include acoustical degradations produced by additive noise, the effects of linear filtering,
nonlinearities in transduction or transmission as well as impulsive interfering sources, and diminished
accuracy caused by changes in articulation produced by the presence of high-intensity noise sources.
Specific activities in this area involve:
Specific activities in this area involve:
- Autocorrelation-based robust speech recognition
- Histogram equalization and other normalizations for robust recognition
- ICA-based robust speech recognition and enhancement
- MVDR and other spectral estimations for robust speech recognition
Speaker adaptation consists of adapting a (usually speaker independent) speech recognizer, to a
new speaker, in order to improve its overall performance for that specific speaker.
Specific activities in this area involve:
Specific activities in this area involve:
- Speaker clustering
- Eigenvoice, MAP and MLLR-based adaptations
Language identification is the process of determining which natural language given content is in.
Traditionally, identification of written language - as practiced, for instance, in library science
- has relied on manually identifying frequent words and letters known to be characteristic of
particular languages. More recently, computational approaches have been applied to the problem.
By viewing language identification as a kind of text categorization, a Natural Language Processing
approach, which relies on statistical methods, may be utilized.
Digital watermarking is the art of embedding useful information into the digital products (such as audio, image, video, text) in a way that does not interfere with normal usage of it. This information is used for different purposes such as copyright protection, content authentication, broadcast monitoring etc. Several issues should be considered in a watermarking system. Three important issues are: transparency of watermark, robustness of the system against attacks and data rate of watermark embedding. A good watermarking system should have all of these requirements at acceptable level.
Specific activities in this area involve
Specific activities in this area involve
- Robust audio watermarking
- Using neural networks in watermarking
- Mathematical modeling of watermarking systems