STRAIGHT link page (under construction: Your suggestions are welcome!)
This page consists of links to pages relating to our STRAIGHT speech manipulation procedures.
STRAIGHT and morphing procedures
-
TANDEM-STRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0 and aperiodicity estimation,
by Hideki Kawahara, M. Morise, T. Takahashi, R. Nisimura, T. Irino and H. Banno,
Proc. ICASSP'2008, Las Vegas, pp.3933-3936 (2008).
-
Implementation of realtime STRAIGHT speech manipulation system: Report on its first implementation,
by Hideki Banno, Hiroaki Hata, Masanori Morise, Toru Takahashi, Toshio Irino and Hideki Kawahara,
Acoustical Science and Technology, 28(3), pp.140-146 (2007)
-
STRAIGHT, exploitation of the other aspect of VOCODER:
Perceptually isomorphic decomposition of speech sounds,
by Hideki Kawahara,
Acoust. Sci. & Tech. 27, 6, pp.349-353 (2006) [invited review]
- Auditory Morphing based on an Elastic Perceptual Distance Metric in an Interference-free Time-frequency Representation,
by Hideki Kawahara and Hisami Matsui,
Proc. ICASSP'2003, vol.I, pp.256-259 (2003).
-
Restructuring speech representations using a pitch-adaptive time-frequency
smoothing and an instantaneous-frequency-based F0 extraction:
Possible role of a repetitive structure in sounds,
by Hideki Kawahara, Ikuyo Masuda-Katsuse and Alain de Cheveigne,
Speech Communication,
27, 3-4, pp.187-207 (1999). [EURASIP best paper award]
Journal and conference/workshop papers using STRAIGHT
- 2009
-
The role of f0 and formant frequencies in distinguishing the voices of men and women,
by James M. Hillenbrand and Michael J. Clark,
Attention, Perception & Psychophysics, 71, pp.1150-1166 (2009).
-
Statistical parametric speech synthesis,
by Heiga Zen, Keiichi Tokuda, and Alan W. Black,
Speech Communication, 51(11), pp.1039-1064 (2009).
- Voice conversion based on state-space model for modeling spectral trajectory,
by N. Xu, X. Yang, L.H. Zhang, W.P. Zue, J.Y. Bao,
Electronics Letters, 45(14), pp.763-764 (2009).
-
A codebook compensative voice morphing algorithm based on maximum likelihood estimation,
by N. Xu, Z. Yang, and L. Zhang,
Journal of Electronics, 26(3), pp.346-352 (2009).
-
Estimating vowel formant discrimination thresholds using a single-interval classification task,
by E. Oglesbee and D. Kewley-Port,
Journal of the Acoustical Society of America, 125(4), pp.2323-2335 (2009).
-
A statistical, formant-pattern model for segregating vowel type and vocal-tract length in developmental formant data,
by R.E. Turner, T.C. Walters, J.J.M. Monaghan, and R.D. Patterson,
Journal of the Acoustical Society of America, 125(4), pp.2374-2386 (2009).
-
Visual speech improves the intelligibility of time-expanded auditory speech,
by A. Tanaka, S. Sakamoto, K. Tsumura, and Y. Suzuki,
NeuroReport, 20(5), pp.473-477 (2009).
-
A flexible spectral modification method based on temporal decomposition and Gaussian mixture model,
by Binh Phu Nguyen, and Masato Akagi,
Acoustical Science and Technology, 30(3), pp.170-179 (2009).
-
Perception of size modulated vowel sequence: Can we normalize the size of continuously changing vocal tract?,
by M. Tsuzaki, C. Takeshima, and T. Irino,
Acoustical Science and Technology, 30(2), pp.83-88 (2009).
- 2008
-
Task-Dependent Modulation of Medial Geniculate Body Is Behaviorally Relevant for Speech Recognition,
by Katharina von Kriegstein, Roy D. Patterson, and T.D. Griffiths,
Current Biology, 18(23), pp.1855-1859 (2008).
-
The contribution of obstruent consonants and acoustic landmarks to speech recognition in noise,
by Ning Li and Philipos C. Loizou,
Journal of the Acoustical Society of America, 124(6), pp.3947-3958 (2008).
-
A three-layered model for expressive speech perception,
by Chun-Fang Huang, and Masato Akagi,
Speech Communication, 50(10), pp.810-828 (2008).
- The Nitech-NAIST HMM-Based Speech Synthesis System for the Blizzard Challenge 2006,
by Heiga ZEN, Tomoki TODA, and Keiichi TOKUDA,
IEICE TRANSACTIONS on Information and Systems, E91-D(6), pp.1764-1773 (2008).
-
Auditory Adaptation in Voice Perception,
by Stefan R. Schweinberger, Christoph Casper, Nadine Hauthal, Juergen M. Kaufmann, Hideki Kawahara, Nadine Kloth, David M.C. Robertson, Adrian P. Simpson and Romi Zaeske,
Current Biology 18(9), pp.684-688, May 6, (2008).
-
Morphing rhesus monkey vocalizations,
by Subhojit Chakladar, Nikos K. Logothetis, and Christopher I. Petkov,
Journal of Neuroscience Methods,
170(1), pp.45-55 (2008).
-
Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model,
by Tomoki Toda, Alan W. Black, and Keiichi Tokuda,
Speech Communication, 50(3), pp.215-227 (2008).
-
Statistical approach to vocal tract transfer function estimation based on factor analyzed trajectory HMM,
by T. Toda, and K. Tokuda,
Proc. ICASSP2008, pp. 3925-3928 (2008).
-
Stylization of pitch with syllable-based linear segments,
by S. Ravuri, D.P.W. Ellis,
Proc. ICASSP2008, pp. 3985-3988 (2008).
-
Directional dependency of cepstrum on vocal tract length,
by D. Saito, R. Matsuura, S. Asakawa, N. Minematsu, and K. Hirose,
Proc. ICASSP2008, pp. 4485-4488 (2008).
-
Combining Spectral Representations for Large-Vocabulary Continuous Speech Recognition,
by G. Garau, and S. Renals,
IEEE Trans. Audio, Speech, and Language Processing,
16(3), pp.508-518 (2008).
-
Speaking Rate and Fundamental Frequency as Speech Cues to Perceived Age,
by J. Harnsberger, R. Shrivastav, W. Brown Jr., H. Rothman, and H. Hollien,
Journal of Voice, 22(1), pp.58-69 (2008).
-
Speech recognition with varying numbers and types of competing talkers by normal-hearing, cochlear-implant, and implant simulation subjects
by Helen E. Cullington and Fan-Gang Zeng,
The Journal of the Acoustical Society of America,
123(1), pp.450-461 (2008).
- 2007
-
Discrimination of speaker sex and size when glottal-pulse rate and vocal-tract length are controlled,
by David R. R. Smith and Thomas C. Walters and Roy D. Patterson,
The Journal of the Acoustical Society of America,
122(6), pp.3628-3639 (2007).
-
Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory,
by Toda, T. and Black, A. W. and Tokuda, K.,
IEEE Trans. Audio, Speech, and Language Processing,
15(8), pp.2222-2235 (2007).
-
Differential reductions in acoustic startle document the discrimination of speech sounds in rats,
by Owen R. Floody and Michael P. Kilgard,
The Journal of the Acoustical Society of America,
122(4), pp.1884-1887 (2007).
-
Speech-to-Singing Synthesis: Converting Speaking Voices to Singing Voices by Controlling Acoustic Features Unique to Singing Voices,
by Saitou, Takeshi and Goto, Masataka and Unoki, Masashi and Akagi, Masato,
IEEE workshop on Applications of Signal Processing to Audio and Acoustics,
21-24 Oct., pp.215-218 (2007).
-
Perceptual boundary between a single and a geminate stop in Japanese,
by Shigeaki Amano, Ryoko Mugitani, and Tessei Kobayashi,
Proc. ICPhS XVI, Saarbruecken, 6-10 August 2007, pp. 733-736.
-
Effects of cochlear implant processing and fundamental frequency on the intelligibility of competing sentences,
by Ginger S. Stickney and Peter F. Assmann and Janice Chang and Fan-Gang Zeng,
The Journal of the Acoustical Society of America,
122(2), pp.1069-1078 (2007).
-
Relationship between fundamental and formant frequencies in voice preference,
by Peter F. Assmann and Terrance M. Nearey,
The Journal of the Acoustical Society of America,
122(2), pp.EL35-EL43 (2007).
-
Neural Representation of Auditory Size in the Human Voice and in Sounds from Other Resonant Sources,
by Katharina von Kriegstein, David R.R. Smith, Roy D. Patterson, D. Timothy Ives and Timothy D. Griffiths,
Current Biology, 17(13), pp.1123-1128 (2007).
-
Effects of acoustic modification on perception of speaker characteristics for sustained vowels,
by Tatsuya Kitamura and Takeshi Saitou,
Acoustical Science and Technology,
28(6), pp.434-437 (2007).
-
Detection of temporal modulation of "size" in vowel sequences,
Acoustical Science and Technology,
28(5), pp.349-351 (2007).
-
Perceptual Continuity and Naturalness of Expressive Strength in Singing Voices Based on Speech Morphing,
by Tomoko Yonezawa, Noriko Suzuki, Shinji Abe, Kenji Mase, and Kiyoshi Kogure,
EURASIP Journal on Audio, Speech, and Music Processing, vol. 2007, Article ID 23807, 9 pages, 2007. doi:10.1155/2007/23807
-
Vocal-Tract Resonances as Indexical Cues in Rhesus Monkeys,
by Asif A. Ghazanfar, Hjalmar K. Turesson, Joost X. Maier, Ralph van Dinther, Roy D. Patterson and Nikos K. Logothetis,
Current Biology, Volume 17, Issue 5, 6 March 2007, Pages 425-430.
-
Towards an improved modeling of the glottal source in statistical parametric speech synthesis,
by J. Cabral, S. Renals, K. Richmond, and J. Yamagishi,
Proc. of the 6th ISCA Workshop on Speech Synthesis, Bonn, Germany, 2007
- 2006
- Perception of acoustic scale and size in musical instrument sounds,
by R. van Dinther and R. D. Patterson,
J. Acoust. Soc. Am. 120, pp. 2158-2176, 2006
-
Processing the acoustic effect of size in speech sounds,
by K. von Kriegstein, J.D. Warren, D.T. Ives, R.D. Patterson and T.D. Griffiths,
NeuroImage, Volume 32, Issue 1, 1 August 2006, Pages 368-375
-
An evaluation of cost functions sensitively capturing local degradation of naturalness for segment selection in concatenative speech synthesis,
by Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki and Kiyohiro Shikano,
Speech Communication, Volume 48, Issue 1, January 2006, Pages 45-56
-
Modeling the Global Acoustic Correlates of Expressivity for Chinese Text-to-Speech Synthesis,
by Hongwu Yang, Helen M. Meng, Zhiyong Wu and Lianhong Cai,
IEEE / ACL 2006 Workshop on Spoken Language Technology, Aruba, Dec. 10-13, 2006.
- Slow Speech Enhances Younger But Not Older Infants' Perception of Vocal Emotion,
RESEARCH IN HUMAN DEVELOPMENT, 3(1), pp. 7-19 (2006).
-
The Nitech-NAIST HMM-based speech synthesis system
for the Blizzard Challenge 2006
by Heiga Zen, Tomoki Toda, Keiichi Tokuda,
Proc. Blizzard Challenge 2006 Workshop, Pittsburgh, USA, Sep. 2006.
-
Developing a Test Bed of English Text-to-Speech System XIMERA
for the Blizzard Challenge 2006,
by Tomoki Toda, Hisashi Kawai, Toshio Hirai, Jinfu Ni, Nobuyuki Nishizawa,
Junichi Yamagishi, Minoru Tsuzaki, Keiichi Tokuda, Satoshi Nakamura,
Proc. Blizzard Challenge 2006 Workshop, Pittsburgh, USA, Sep. 2006.
-
USTC System for Blizzard Challenge 2006
an Improved HMM-based Speech Synthesis Method
by Zhen-Hua Ling, Yi-Jian Wu, Yu-Ping Wang, Long Qin, Ren-Hua Wang,
Proc. Blizzard Challenge 2006 Workshop, Pittsburgh, USA, Sep. 2006.
- High Quality Sinusoidal Modeling of Wideband Speech for the Purposes of Speech Synthesis and Modification,
by D. Chazan, R. Hoory, A. Sagi, S. Shechtman, A. Sorin, Zhi Wei Shuang and R. Bakis,
Proc. 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2006),
Volume: 1, pp.877-880 (2006).
(IEEE Xplore)
-
Contribution of pitch-accent information to Japanese spoken-word recognition,
by Ikuyo Masuda-Katsuse, Acoustical Science and Technology,
Vol. 27 (2006), No. 2, pp.97-103
- 2005
-
Effect of speaking rate on the acceptability of change in segment duration,
by Makiko Muto, Hiroaki Kato, Minoru Tsuzaki and Yoshinori Sagisaka,
Speech Communication, Volume 47, Issue 3, November 2005, Pages 277-289
-
Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis,
by Takeshi Saitou, Masashi Unoki and Masato Akagi,
Speech Communication, Volume 46, Issues 3-4, July 2005, Pages 405-417
-
Effect of intra-phrase position on acceptability of change in segment duration in sentence speech,
by Makiko Muto, Hiroaki Kato, Minoru Tsuzaki and Yoshinori Sagisaka,
Speech Communication, Volume 45, Issue 4, April 2005, Pages 361-372
- Speech and melody recognition in binaurally combined acoustic
and electric hearing,
by Ying-Yee Kong, Ginger S. Stickney, Fan-Gang Zeng,
J. Acoust. Soc. Am. 117 (3), pp.1351-1361, March 2005.
(JASA online)
- Discrimination of speaker size from syllable phrases,
by D. T. Ives, D. R. R. Smith and R. D. Patterson,
J. Acoust. Soc. Am. 118(6), pp.3816-3822, 2005.
- The interaction of glottal-pulse rate and vocal-tract length in judgements of speaker size, sex and age,
by D. R. R. Smith and R. D. Patterson,
J. Acoust. Soc. Am. 118, pp.3177-3186, 2005.
- The processing and perception of size information in speech sounds,
by David R. R. Smith, Roy D. Patterson, Richard Turner, Hideki Kawahara and Toshio Irino,
J. Acoust. Soc. Am. 117 (1), pp.305-318, January 2005.
(JASA online)
- Designing Target Cost Function Based on Prosody of Speech Database,
by Kazuki ADACHI, Tomoki TODA, Hiromichi KAWANAMI, Hiroshi SARUWATARI and Kiyohiro SHIKANO,
IEICE Transactions on Information and Systems 2005 E88-D(3):519-524.
(IEICEJ transaction online)
-
An Overview of Nitech HMM-based Speech Synthesis System for Blizzard Challenge 2005,
by Heiga Zen and Tomoki Toda, Proc. INTERSPEECH 2005, pp.93-96 (September 2005).
-
Wideband Speech Coding Algorithm Based on Adaptive Interpolation of Weighted Spectrum,
JOURNAL OF DATA ACQUISITION & PROCESSING 2005 Vol.20 No.1 P.28-33 (Written in Chinese)
-
Spectral Conversion Based on Maximum Likelihood Estimation Considering Global Variance of Converted Parameter,
by Toda, T., Black, A., and Tokuda, K.,
ICASSP, Philadelphia, Pennsylvania, Vol.I, pp.9-12 (2005) (IEEE Xplore)
- 2004
- STRAIGHT: A new speech synthesizer for vowel formant discrimination,
by Chang Liu and Diane Kewley-Port, Acoustics Research Letters Online -- April 2004 -- Volume 5, Issue 2, pp. 31-36
-
Cochlear implant speech recognition with speech maskers,
by Ginger S. Stickney, Fan-Gang Zeng, Ruth Litovsky and Peter Assmann, The Journal of the Acoustical Society of America -- August 2004 -- Volume 116, Issue 2, pp. 1081-1091
-
Vowel formant discrimination for high-fidelity speech,
by Chang Liu and Diane Kewley-Port, The Journal of the Acoustical Society of America -- August 2004 -- Volume 116, Issue 2, pp. 1224-1233
-
Correspondence between the speech production and perception
of lexical accent by Kagoshima Japanese speakers,
by Ayako Shirose, Kazuhiko Kakehi, Ichiro Ota and Shigeru Kiritani,
Acoust. Sci. & Tech. 25, 5, pp.379-381 (2004)
-
Acoustic-to-Articulatory Inversion Mapping with Gaussian Mixture Model,
by Toda, T., and Black, A., and Tokuda, K.,
ICSLP2004, Jeju, Korea, (2004)
-
Mapping from Articulatory Movements to Vocal Tract Spectrum with Gaussian Mixture Model for Articulatory Speech Synthesis,
by T. Toda, A.W. Black, K. Tokuda,
Proc. 5th ISCA Speech Synthesis Workshop (SSW5), pp. 31-36, Pittsburgh, USA, June 2004.
- 2003
-
Frequency Shifts and Vowel Identification,
by P. F. Assmann and T. M. Nearey, Proc. Int. Congr. of Phonetic Sci. (15th ICPhS), Barcelona, Spain, 3-9 Aug 2003.
-
Voice Conversion with Smoothed GMM and MAP Adaptation,
by Yining Chen, Min Chu, Eric Chang, Jia Liu, and Runsheng Liu,
Proc. EUROSPEECH 2003, pp.2413-2416 (2003).
- GMM-based Voice Conversion Applied to Emotional Speech Synthesis,
by H. Kawanami, Y. Iwami, T. Toda, H. Saruwatari, K. Shikano,
Proc. INTERSPEECH2003-EUROSPEECH, pp. 2401-2404, Geneva, Switzerland, Sep. 2003.
-
Investigation of Emotionally Morphed Speech Perception and its Structure
using a High Quality Speech Manipulation System,
Proc. INTERSPEECH2003-EUROSPEECH, pp. 2113-2116, Geneva, Switzerland, Sep. 2003.
- 2002
-
MODELING THE PERCEPTION OF FREQUENCY-SHIFTED VOWELS,
by P. F. Assmann, T. M. Nearey and J. M. Scott, Proceedings of the 7th International Conference on Spoken Language Processing, pp. 425-428 (2002).
-
Extraction of F0 dynamic characteristics and development of F0 control model in singing voice,
by Takeshi Saitou, Masashi Unoki, and Masato Akagi,
Proc. of ICAD2002, pp. 275-278, Kyoto, Japan, July 2002.
-
Plasticity of the Human Auditory Cortex Induced by Discrimination Learning of Non-Native, Mora-Timed Contrasts of the Japanese Language,
by Hans Menning, Satoshi Imaizumi, Pienie Zwitserlood, and Christo Pantev,
LEARNING & MEMORY, Vol. 9, No. 5, pp. 253-267, September/October 2002
- 2001
- 2000
-
STRAIGHT-based voice conversion algorithm based on Gaussian mixture model,
by Tomoki Toda, Jinlin Lu, Hiroshi Saruwatari, Kiyohiro Shikano,
Proc. ICSLP-2000, vol.3, 279-282.
-
Investigation of analysis and synthesis parameters of straight by subjective evaluation,
by Parham Zolfaghari, Yoshinori Atake, Kiyohiro Shikano, Hideki Kawahara,
Proc. ICSLP-2000, vol.3, 498-501.
-
A vocal tract area ratio estimation from spectral parameter extracted by straight,
by Mamoru Iwaki,
Proc. ICSLP-2000, vol.3, 596-599.
- 1999
-
Fundamental frequency and the intelligibility of competing voices,
by P. F. Assmann,
Proceedings of the 14th International Congress of Phonetic Sciences, San Francisco, Aug. 1-7, 1999, pp. 179-182.
-
Speaker conversion through non-linear frequency warping of STRAIGHT spectrum,
by Noriyasu Maeda, Hideki Banno, Shoji Kajita, Kazuya Takeda, Fumitada Itakura,
Proc. Eurospeech'99, pp.827-830, 1999.
Conference/workshop abstracts using STRAIGHT
- 2006
- Matching fundamental and formant frequencies in vowels,
by P. F. Assmann, T. M. Nearey and Derrick Chen,
J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, November 2006 (4aSC5, Fourth Joint Meeting: ASA and ASJ, Honolulu).
- STRAIGHT as a research tool for L2 study: How to manipulate segmental and supra-segmental features,
by Hideki Kawahara and Reiko Akahane-Yamada,
J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, November 2006 (2pSCb2, Fourth Joint Meeting: ASA and ASJ, Honolulu).
- Improving the voice pitch discrimination threshold through cochlear implants,
by Shizuo Hiki and Masae Shiroma,
J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, November 2006 (2pPP47, Fourth Joint Meeting: ASA and ASJ, Honolulu).
- Comparing vowel formant thresholds from two tasks: Classification versus 2-alternative forced choice (2AFC) adaptive tracking,
by Eric Oglesbee and Diane Kewley-Port,
J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, November 2006 (2pPP41, Fourth Joint Meeting: ASA and ASJ, Honolulu).
- Temporal characteristics of extraction of size information in speech sounds,
by Chihiro Takeshima, Minoru Tsuzaki and Toshio Irino,
J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, November 2006 (2pPP38, Fourth Joint Meeting: ASA and ASJ, Honolulu).
- Interaction between the tonotopic and periodic information in recognition of "melodic" contours,
by Toshie Matsui, Chihiro Takeshima and Minoru Tsuzaki,
J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, November 2006 (2pPP9, Fourth Joint Meeting: ASA and ASJ, Honolulu).
- Wonders in perception and manipulation of speech,
by Makio Kashino, Hideki Kawahara and Hiroshi Riquimaroux,
J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, November 2006 (2aED12, Fourth Joint Meeting: ASA and ASJ, Honolulu).
- Tools for speech perception, production, and training studies: Web-based second language training system, and a speech resynthesis system,
by Reiko Akahane-Yamada, Takahiro Adachi and Hideki Kawahara,
J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, November 2006 (2aED9, Fourth Joint Meeting: ASA and ASJ, Honolulu).
- Evaluating naturalness of speeches morphed by independently using the interpolation ratios of the time-frequency axes and amplitude,
by Toru Takahashi, Masanori Morise and Toshio Irino,
J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, November 2006 (1pSC3, Fourth Joint Meeting: ASA and ASJ, Honolulu).
- Voice quality of artistic expression in Noh: An analysis-synthesis study on source-related parameters,
by Hideki Kawahara, Osamu Fujimura and Yasuyuki Konparu,
J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, November 2006 (1pMU1, Fourth Joint Meeting: ASA and ASJ, Honolulu).
- 2005
-
Relationship between fundamental and formant frequencies in speech perception,
by P. F. Assmann and T. M. Nearey,
J. Acoustical Society of America, Volume 117, Issue 4, pp. 2374-2374 (2005).
- Factors affecting the perception of Korean-accented American English,
by Kwansun Cho, John G. Harris and Rahul Shrivastav,
The Journal of the Acoustical Society of America, Volume 118, Issue 3, p. 1899 (September 2005).
- 2004
- Effects of context and frequency shifts in vowel identification,
by P. F. Assmann, Catherine M. Glidden and T. M. Nearey,
The Journal of the Acoustical Society of America, Volume 116, Issue 4, p. 2571 (October 2004).
- Effects of speaking rate on the perception of phonemic length contrast in Japanese,
by H. Kato and K. Tajima,
The Journal of the Acoustical Society of America, Volume 115, Issue 5, pp. 2392-2393 (May 2004).
Technical articles on STRAIGHT subsystems and morphing
- Nearly Defect-free F0 Trajectory Extraction for Expressive Speech Modifications based on STRAIGHT,
by Hideki Kawahara, Alain de Cheveigne, Hideki Banno, Toru Takahashi and Toshio Irino,
Proc. Interspeech2005, Lisboa, pp.537-540, Sept. 2005.
- Accurate vocal event detection method based on a fixed-point analysis of mapping from time to weighted average group delay,
by Hideki Kawahara, Yoshinori Atake, Parham Zolfaghari,
In ICSLP-2000, vol.4, pp.664-667 (2000).
- Fixed Point Analysis of Frequency to Instantaneous Frequency Mapping for Accurate Estimation of F0 and Periodicity,
by Hideki Kawahara, Haruhiro Katayose, Alain de Cheveigné, Roy D. Patterson,
Proc. EUROSPEECH'99, Volume 6, Page 2781-2784 (1999).
kawahara@sys.wakayama-u.ac.jp