This presentation deals with the estimation of the fundamental frequency (f0) of pseudoperiodic sound signals, with important results for polyphonic frequency tracking and voice separation. Given a set of candidate partials in the signal, f0 estimation is understood as finding the optimal period duration(s) according to a maximum-likelihood harmonic-matching criterion. Excellent results have been obtained on large databases of speech (40 min) and music [B. Doval and X. Rodet, Proc. IEEE-ICASSP, Toronto, May (1991)]. The algorithm has been implemented at IRCAM to run in real time for frequency tracking in live performance. Developments are proceeding in several directions. A combined estimation of f0 and of a spectral envelope improves both estimates. Most important is the estimation, on a learning set, of the a priori distributions of the different random variables. Finally, a hidden Markov model tracks f0 trajectories across adjacent frames. The first experiments in polyphonic frequency tracking and voice separation are very promising. The model can be transposed directly to the maximum-likelihood estimation of several harmonic sounds, since it already considers more than one f0 value.

a) Presently on sabbatical at Ctr. for New Music and Audio Technol. (CNMAT), Univ. of California at Berkeley, 1750 Arch St., Berkeley, CA 94709.
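The idea of scoring candidate f0 values against a set of measured partials can be illustrated with a minimal sketch. It is not the likelihood used in the work cited above: the Gaussian weighting of each partial's deviation (in cents) from its nearest harmonic, the penalty for predicted harmonics with no nearby partial, and the names `harmonic_match_score`, `estimate_f0`, `sigma_cents`, and `miss_penalty` are all assumptions chosen for illustration.

```python
import math


def cents(freq, ref):
    """Interval from ref to freq in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(freq / ref)


def harmonic_match_score(f0, partials, sigma_cents=30.0, miss_penalty=0.5):
    """Score how well the harmonic series of f0 explains the partials.

    partials: list of (frequency_hz, amplitude) pairs.
    This is an illustrative stand-in for a likelihood, not the published one.
    """
    score = 0.0
    for freq, amp in partials:
        # Nearest harmonic number for this partial (at least the 1st).
        h = max(1, round(freq / f0))
        dev = cents(freq, h * f0)
        # Amplitude-weighted Gaussian reward for a small deviation.
        score += amp * math.exp(-0.5 * (dev / sigma_cents) ** 2)

    # Penalise harmonics that f0 predicts but that have no partial nearby.
    # Without this term, f0/2, f0/3, ... would match equally well.
    top = max(freq for freq, _ in partials)
    n_harm = max(1, round(top / f0))
    mean_amp = sum(amp for _, amp in partials) / len(partials)
    missing = 0
    for k in range(1, n_harm + 1):
        nearest = min(abs(cents(freq, k * f0)) for freq, _ in partials)
        if nearest > 2.0 * sigma_cents:
            missing += 1
    return score - miss_penalty * mean_amp * missing


def estimate_f0(partials, candidates):
    """Return the candidate f0 (Hz) with the highest matching score."""
    return max(candidates, key=lambda f0: harmonic_match_score(f0, partials))


if __name__ == "__main__":
    # Hypothetical, slightly inharmonic partials of a tone near 220 Hz.
    partials = [(220.5, 1.0), (439.0, 0.8), (661.0, 0.6), (880.2, 0.4)]
    candidates = [float(f) for f in range(60, 501)]  # 1 Hz grid, 60-500 Hz
    print(estimate_f0(partials, candidates))
```

The missing-harmonic penalty is the design choice that matters here: a subharmonic such as 110 Hz matches every partial just as closely as 220 Hz does, so some cost for unobserved harmonics is needed to resolve the octave ambiguity. A statistical model would instead derive these terms from distributions learned on a training set, as the abstract describes.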