Supplementary Contents (Audio examples) for IEEE Special Issue
Paper Submitted to Special Issue on Signal Models and Representations for IEEE Transactions on Audio, Speech and Language Processing
Fixed-Dimension Modified Sinusoidal Model: A High Quality Sparse Representation for Mixed Speech and Audio Signals
Pejman Mowlaee, Abolghasem Sayadiyan, and Hamid Sheikhzadeh
Table 1. Sound quality assessment for male and female speakers.
| Category\Description | Adult male | Old man | Old woman | Child | Adult female |
| Original signal | Male | oldman | old woman | Child | woman |
| FDMSM reconstructed using M=21 | | | | | |
| FDMSM reconstructed using M=25 | Male_M25 | oldman_M25 | oldwoman_M25 | Child_M25 | lgwc9_M25 |
| FDMSM reconstructed using M=29 | Male_M29 | oldman_M29 | oldwoman_M29 | Child_M29 | lgwc9_M29 |
| FDMSM reconstructed using M=33 | Male_M33 | oldman_M33 | oldwoman_M33 | Child_M33 | lgwc9_M33 |
| FDMSM reconstructed using M=37 | | | | | |
| FDMSM reconstructed using M=40 | | | | | |
Description of files: Male_M21 denotes the FDMSM-synthesized speech signal for a male speaker using a model order of M=21.
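The naming convention described above can be read mechanically. The following is a minimal sketch, assuming every reconstructed example ends in an `_M<digits>` suffix and that names without such a suffix are original signals. The helper name `parse_example_name` and the regular expression are hypothetical and inferred only from names listed on this page, such as Male_M21 and Male_harmonic_M40.

```python
import re

# Hypothetical pattern inferred from names such as "Male_M21" and
# "Male_harmonic_M40"; it is not part of the original page.
_NAME_RE = re.compile(r"^(?P<label>.+)_M(?P<order>\d+)$")


def parse_example_name(name):
    """Split an example name into (label, model_order).

    Names without an "_M<digits>" suffix are treated as original
    (unprocessed) signals and get a model order of None.
    """
    match = _NAME_RE.match(name)
    if match is None:
        return name, None
    return match.group("label"), int(match.group("order"))


print(parse_example_name("Male_M21"))
print(parse_example_name("Male_harmonic_M40"))
print(parse_example_name("oldman"))
```

Some entries in the tables pair a file name with a description that states a different model order, so the parsed value reflects only the file name, not necessarily the order used for synthesis.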
Table 3. Sound quality assessment for male and female speakers from Cooke database in [1].
| Table 3 (Test 1) Female | Table 3 (Test 2) Male | Category |
| Test 1: “bbaj4n” | Test 2: “bbbe8n” | Description |
| lgwc9 | | Original signal |
| lgwc9_M33 | bbbe8n_M36 | FDMSM reconstructed using M=37 |
| lgwc9n_M40 | bbbe8n_M40 | FDMSM reconstructed using M=40 |
| lgwc9n_M50 | bbbe8n_M50 | FDMSM reconstructed using M=50 |
Table 4. Sound quality assessment for signals of Quartet and Glockenspiel from SQAM database in [2].
| Table 4 (Test 1) | Description | Table 4 (Test 2) | Description |
| Quartet of 28 sec | Original signal | Glockenspiel of 35 sec | Original signal |
| | FDMSM using M=25 | | FDMSM using M=20 |
| | FDMSM using M=29 | | FDMSM using M=24 |
| | FDMSM using M=33 | | FDMSM using M=25 |
| | FDMSM using M=37 | | FDMSM using M=28 |
| | FDMSM using M=40 | | FDMSM using M=37,40 |
Table 5. Sound quality assessment for mixtures.
Table 6. Sound quality assessment for signals corrupted with babble/harmonic noise at SNR = 10 dB.
Harmonic Noise
Description: harmonic noise with a fundamental frequency of 300 Hz and its 10 harmonics
| Male | Female | Mixture | Music |
| Male_harmonic: Original signal of harmonic noise and Male speaker | Female_harmonic: Original signal of harmonic noise and Female speaker | Mix_harmonic: Original signal of harmonic noise and speech mixture | Music_harmonic |
| Male_harmonic_M40: FDMSM reconstructed for the mixture of harmonic noise and Male speaker using M=40 | Female_harmonic_M40: FDMSM reconstructed for the mixture of harmonic noise and Female speaker using M=40 | Mix_harmonic_M37: FDMSM reconstructed for the mixture of harmonic noise and speech mixture using M=45 | |
| Male_harmonic_M45: FDMSM reconstructed for the mixture of harmonic noise and Male speaker using M=45 | Female_harmonic_M45: FDMSM reconstructed for the mixture of harmonic noise and Female speaker using M=45 | Mix_harmonic_M40: FDMSM reconstructed for the mixture of harmonic noise and speech mixture using M=55 | |
| Male_harmonic_M50: FDMSM reconstructed for the mixture of harmonic noise and Male speaker using M=50 | Female_harmonic_M50: FDMSM reconstructed for the mixture of harmonic noise and Female speaker using M=50 | Mix_harmonic_M45: FDMSM reconstructed for the mixture of harmonic noise and speech mixture using M=60 | Music_harmonic_M60 |
| FDMSM reconstructed for the mixture of harmonic noise and Male speaker using M=68 | | FDMSM reconstructed for the mixture of harmonic noise and speech mixture using M=50,60 | |
Babble Noise
Description: babble noise from NOISEX-92
| Male | Female | Mixture | Music |
| Male_babble: Original signal of babble noise and Male speaker | Female_babble: Original signal of babble noise and Female speaker | Mix_babble: Original signal of babble noise and speech mixture | Music_babble |
| FDMSM reconstructed for the mixture of babble noise and Male speaker using M=40 | Female_babble_M37: FDMSM reconstructed for the mixture of babble noise and Female speaker using M=40 | Mix_babble_M40: FDMSM reconstructed for the mixture of babble noise and speech mixture using M=45 | |
| FDMSM reconstructed for the mixture of babble noise and Male speaker using M=45 | Female_babbl_M40: FDMSM reconstructed for the mixture of babble noise and Female speaker using M=45 | Mix_babble_M50: FDMSM reconstructed for the mixture of babble noise and speech mixture using M=55 | |
| FDMSM reconstructed for the mixture of babble noise and Male speaker using M=50 | Female_babble_M50: FDMSM reconstructed for the mixture of babble noise and Female speaker using M=50 | Mix_babble_M68: FDMSM reconstructed for the mixture of babble noise and speech mixture using M=60 | Music_babble_M40 |
| | | FDMSM reconstructed for the mixture of babble noise and speech mixture using M=45,50 | |
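Table 6 describes the harmonic noise as a 300 Hz fundamental with its 10 harmonics, added at SNR = 10 dB. The following is a minimal sketch of how such a corrupted signal could be generated; it is not the authors' procedure. The 16 kHz sampling rate, the equal harmonic amplitudes, the reading of "10 harmonics" as the components at 300 Hz, 600 Hz, ..., 3000 Hz, and the function names are all assumptions made for illustration.

```python
import math


def harmonic_noise(n_samples, fs=16000, f0=300.0, n_harmonics=10):
    """Sum of equal-amplitude sinusoids at f0, 2*f0, ..., n_harmonics*f0.

    fs, the equal amplitudes, and counting the fundamental as the first
    of the 10 components are assumptions, not values from the page.
    """
    return [
        sum(math.sin(2.0 * math.pi * k * f0 * n / fs)
            for k in range(1, n_harmonics + 1))
        for n in range(n_samples)
    ]


def power(x):
    """Mean squared value of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(signal, noise, snr_db=10.0):
    """Scale noise so that 10*log10(P_signal / P_noise) equals snr_db, then add it."""
    gain = math.sqrt(power(signal) / (power(noise) * 10.0 ** (snr_db / 10.0)))
    scaled = [gain * v for v in noise]
    mixed = [s + v for s, v in zip(signal, scaled)]
    return mixed, scaled


# Usage with a stand-in "clean" signal (a 200 Hz tone, 0.1 s at 16 kHz).
clean = [math.sin(2.0 * math.pi * 200.0 * n / 16000) for n in range(1600)]
noisy, scaled_noise = mix_at_snr(clean, harmonic_noise(len(clean)), snr_db=10.0)
achieved_snr_db = 10.0 * math.log10(power(clean) / power(scaled_noise))
```

The same `mix_at_snr` scaling would apply to a babble recording loaded from a file in place of the synthetic noise.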
[1] The Speech Separation Challenge database. [Online]. Available: www.dcs.shef.ac.uk/~martin/SpeechSeparationChallenge.htm
[2] Sound Quality Assessment Material (SQAM) database. [Online]. Available: http://andrew.csie.ncyu.edu.tw/html/mpeg4/sound.media.mit.edu/mpeg4/audio/sqam/index.html
[1] P. Mowlaee and A. Sayadiyan, "A Model Order based Sinusoidal representation for Audio Signals," IEEE/ACS International Conference on Computer Systems and Applications (AICCSA 2008), March 31-April 4, 2008, pp. 501-507.
[2] P. Mowlaee and A. Sayadiyan, "Model-based Monaural Sound Separation by Split-VQ of sinusoidal parameter," 16th European Signal Processing Conference (EUSIPCO 2008), August 24-29, 2008. Available online: http://www.eurasip.org/Proceedings/Eusipco2008/papers/1569096851.pdf
[3] P. Mowlaee, A. Sayadiyan, and M. Rahmati, "Sparse Sinusoidal Representation for Speech and Music Signals," CCIS06, Springer Lecture Notes in Computer Science, Springer-Verlag Berlin Heidelberg, 2008, pp. 469-476. Available online: http://www.springerlink.com/content/u572687721773842/
[4] P. Mowlaee, A. Sayadiyan, and M. Sheikhan, "A Fixed Dimension Modified Sinusoid Model (FD-MSM) for Single Microphone Sound Separation," IEEE International Conference on Signal Processing and Communications (ICSPC 2007), November 24-27, 2007, pp. 1183-1186.
Postal Address: Information Processing Laboratory, 424 Hafez Ave., Tehran, Iran, 13597-45778. Cell: +98 9111834523, Fax: (+9821) 66406469