Enhancing the Quality of Noisy Narrowband Speech by Combining Vector Taylor Series and Bandwidth Extension Algorithms

Subject area: Electrical and Computer Engineering

Sara Pourmohammadi ¹*, Mansour Vali ², Mohsen Ghadyani ³

1 - Shahed University
2 - Electrical Engineering
3 - Shahed University

Received: 1394/09/08    Accepted: 1394/09/08    Published: 1392/03/31

Keywords: vector Taylor series, bandwidth extension, noisy narrowband speech, Gaussian mixture model

Abstract:

This paper presents a new approach to enhancing the quality of noise-corrupted narrowband speech by combining two techniques: vector Taylor series (VTS) and artificial bandwidth extension. First, the MFCC feature parameters extracted from the noisy narrowband speech are compensated with the VTS method. Then, a GMM-based bandwidth extension model is used to estimate wideband speech feature vectors from these compensated parameters. Finally, two measures, PESQ and LSD, are used to assess how closely the estimated wideband spectral envelope and speech signal match the reference wideband spectral envelope and the clean reference speech. The results obtained from implementing this algorithm clearly indicate that the proposed approach is effective at improving the quality of the feature vectors of noise-contaminated narrowband speech and at bringing them closer to the feature vectors of the reference wideband speech signal.
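The GMM-based estimation step described above can be illustrated with a minimal sketch. It assumes one common formulation, a joint Gaussian mixture over narrowband and wideband features with a minimum mean-square error (conditional expectation) estimator; the abstract does not state which formulation the authors used. To stay short, each feature is a single scalar rather than an MFCC vector, and the function names and the dictionary layout of `components` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_mmse_estimate(x, components):
    """Estimate a wideband feature y from a narrowband feature x.

    Assumption (not from the paper): each component of a joint GMM over
    (x, y) is a dict with keys weight, mean_x, mean_y, var_x, cov_yx.
    The estimate is sum_k p(k | x) * E[y | x, k].
    """
    # Unnormalised posterior of each component given x.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)
    if total == 0.0:
        raise ValueError("x has zero likelihood under every component")

    estimate = 0.0
    for lik, c in zip(likelihoods, components):
        posterior = lik / total
        # Conditional mean of y given x for one joint Gaussian component.
        conditional_mean = c["mean_y"] + c["cov_yx"] / c["var_x"] * (x - c["mean_x"])
        estimate += posterior * conditional_mean
    return estimate


# Toy single-component model: E[y | x] = 1.0 + (1.0 / 2.0) * (x - 0.0)
toy = [{"weight": 1.0, "mean_x": 0.0, "mean_y": 1.0, "var_x": 2.0, "cov_yx": 1.0}]
print(gmm_mmse_estimate(2.0, toy))
```

With vector features, the ratio `cov_yx / var_x` becomes the cross-covariance matrix multiplied by the inverse of the narrowband covariance matrix; the structure of the estimator is otherwise unchanged.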

English abstract:

In this paper, we introduce an efficient and previously unreported approach to enhancing the quality of corrupted narrowband speech by jointly applying Vector Taylor Series (VTS) and Bandwidth Extension (BWE) algorithms. First, feature vectors extracted from the noisy narrowband signal are compensated using the VTS technique. Then, the corresponding wideband features are estimated from the compensated parameters using two different artificial BWE methods (envelope prediction with a GMM and with a neural network). Finally, the distance between the wideband feature vectors and their estimated values is evaluated using the Log Spectral Distortion (LSD) measure. The implementation results clearly show the advantage of the proposed idea in improving the quality of the contaminated speech. In addition, we show that artificial BWE of the speech signal based on neural network envelope extension yields better results than the GMM algorithm.
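As a rough illustration of the LSD criterion mentioned above, the sketch below computes a widely used form of log spectral distortion: the root-mean-square difference between two log power spectra in dB, averaged over frames. The abstract does not give the exact definition used in the paper (for example the frequency range or whether spectral envelopes or full spectra are compared), so this is an assumption, and the function name and `eps` floor are illustrative.

```python
import math


def log_spectral_distortion(ref_frames, est_frames, eps=1e-12):
    """Average over frames of the RMS log-spectrum difference, in dB.

    Each argument is a list of frames; each frame is a list of power-spectrum
    values sampled on the same frequency grid. Values are floored at eps
    before taking the logarithm.
    """
    if len(ref_frames) != len(est_frames) or not ref_frames:
        raise ValueError("need the same non-zero number of frames")

    total = 0.0
    for ref, est in zip(ref_frames, est_frames):
        if len(ref) != len(est) or not ref:
            raise ValueError("frames must share the same non-zero number of bins")
        squared_db_diffs = [
            (10.0 * math.log10(max(r, eps)) - 10.0 * math.log10(max(e, eps))) ** 2
            for r, e in zip(ref, est)
        ]
        total += math.sqrt(sum(squared_db_diffs) / len(squared_db_diffs))
    return total / len(ref_frames)


reference = [[1.0, 1.0, 1.0, 1.0]] * 3
scaled = [[10.0, 10.0, 10.0, 10.0]] * 3  # every bin 10 dB above the reference
print(log_spectral_distortion(reference, reference))
print(log_spectral_distortion(reference, scaled))
```

Identical spectra give a distortion of zero, and a uniform 10 dB offset gives a distortion of 10 dB, so lower values mean the estimated wideband spectrum is closer to the reference.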

