Improving Formant and Concatenative Speech Synthesis Techniques through Using Vocoders
Subject Areas: electrical and computer engineering
N. Maghsoodi 1 *, M. M. Homayounpour 2
Keywords: concatenative synthesis, formant synthesis, multiband excitation, STRAIGHT vocoder
Abstract:
This paper describes an approach to improving the quality of synthetic speech in formant and concatenative synthesis techniques. To address this problem, we focused on using vocoders. In concatenative speech synthesis, the idea is to post-process the generated speech to reduce discontinuities. The post-processing consists of integrating the STRAIGHT method into the synthesis system in order to smooth the boundaries between units. In formant synthesis, we used a multi-excitation linear predictive method to replace the simple excitation signal of the Klatt method with a multiband excitation. Our synthesis techniques were evaluated for naturalness, fluidity, and intelligibility using subjective methods. These experiments showed that the naturalness of synthetic speech can be improved by our smoothing methods and by the multiband excitation signal.
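The abstract does not say how the multiband excitation is constructed. As an illustration only, the following Python sketch shows one common way to build an excitation signal whose frequency bands are individually voiced or unvoiced. The function name `multiband_excitation`, the equal-width bands, the sinusoidal noise model, and all parameter values are assumptions made for this sketch and are not taken from the paper.

```python
import math
import random


def multiband_excitation(f0, band_voiced, sample_rate=16000, n_samples=800, seed=0):
    """Return an excitation signal with a voiced/unvoiced decision per band.

    band_voiced holds one boolean per equal-width band covering 0..sample_rate/2.
    Voiced bands contain the harmonics of f0 that fall inside them (zero phase).
    Unvoiced bands contain a dense grid of random-phase sinusoids, which acts
    as band-limited noise. Both choices are illustrative assumptions.
    """
    rng = random.Random(seed)
    nyquist = sample_rate / 2.0
    band_width = nyquist / len(band_voiced)
    bin_spacing = sample_rate / n_samples  # frequency grid used for noise bands

    components = []  # (frequency in Hz, phase in radians)
    for band, voiced in enumerate(band_voiced):
        low, high = band * band_width, (band + 1) * band_width
        if voiced:
            # Periodic part: harmonics of f0 that lie in [low, high).
            k = max(1, math.ceil(low / f0))
            while k * f0 < high:
                components.append((k * f0, 0.0))
                k += 1
        else:
            # Noise part: random-phase sinusoids on the grid in [low, high).
            m = max(1, math.ceil(low / bin_spacing))
            while m * bin_spacing < high:
                components.append((m * bin_spacing, rng.uniform(0.0, 2.0 * math.pi)))
                m += 1

    signal = []
    for n in range(n_samples):
        t = n / sample_rate
        signal.append(sum(math.cos(2.0 * math.pi * f * t + p) for f, p in components))

    # Normalize to a peak magnitude of 1 (guard against an all-zero signal).
    peak = max(abs(x) for x in signal) or 1.0
    return [x / peak for x in signal]


# Example: low bands voiced, high bands noisy, as is typical for breathy vowels.
excitation = multiband_excitation(100.0, [True, True, False, False])
```

With every band voiced, the result is a band-limited pulse train, comparable to the simple voiced source of a formant synthesizer. Marking some bands as unvoiced mixes periodic and noise energy in the same frame, which is the property a multiband excitation adds over a single voiced/unvoiced switch.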