IEEE SIGNAL PROCESSING MAGAZINE [144] MARCH 2015
[11] M. D. Plumbley, T. Blumensath, L. Daudet, R. Gribonval, and M. E. Davies,
"Sparse representations in audio and music: From coding to source separation,"
Proc. IEEE, vol. 98, no. 6, pp. 995–1005, 2009.
[12] J. Nikunen and T. Virtanen, “Object-based audio coding using nonnegative matrix
factorization for the spectrogram representation,” in Proc. 128th Audio Engineering
Society Convention, London, 2010.
[13] C. Févotte, N. Bertin, and J.-L. Durrieu, “Nonnegative matrix factorization
with the Itakura-Saito divergence. With application to music analysis,” Neural
Computat., vol. 21, no. 3, pp. 793–830, 2009.
[14] M. Shashanka, B. Raj, and P. Smaragdis, “Sparse overcomplete latent variable
decomposition of counts data,” in Proc. Neural Information Processing Systems,
Vancouver, Canada, 2007, pp. 1313–1320.
[15] C. Ding, T. Li, and W. Ping, “On the equivalence between nonnegative matrix
factorization and probabilistic latent semantic indexing,” Computat. Stat. Data
Anal., vol. 52, no. 8, pp. 3913–3927, 2008.
[16] E. Zwicker and H. Fastl, Psychoacoustics: Facts and Models. Berlin:
Springer-Verlag, 1990.
[17] B. C. J. Moore, Ed., Hearing—Handbook of Perception and Cognition, 2nd
ed. San Diego, CA: Academic Press, 1995.
[18] B. King, C. Févotte, and P. Smaragdis, “Optimal cost function and magnitude
power for NMF-based speech separation and music interpolation,” in Proc. IEEE
Int. Workshop on Machine Learning for Signal Processing, Santander, Spain,
2012, pp. 1–6.
[19] J. Carabias-Orti, F. Rodriguez-Serrano, P. Vera-Candeas, F. Canadas-
Quesada, and N. Ruiz-Reyes, “Constrained nonnegative sparse coding using learnt
instrument templates for real-time music transcription," Engineering Applications
of Artificial Intelligence, 2013, pp. 1671–1680.
[20] F. Weninger and B. Schuller, “Optimization and parallelization of monaural
source separation algorithms in the openBliSSART toolkit,” J. Signal Process.
Syst., vol. 69, no. 3, pp. 267–277, 2012.
[21] C. Févotte and J. Idier, “Algorithms for nonnegative matrix factorization with
the beta-divergence,” Neural Computat., vol. 23, no. 9, pp. 2421–2456, 2011.
[22] D. D. Lee and H. S. Seung, “Algorithms for nonnegative matrix factorization,”
in Proc. Neural Information Processing Systems, Denver, CO, 2000, pp. 556–562.
[23] R. Zdunek and A. Cichocki, “Nonnegative matrix factorization with constrained
second-order optimization,” Signal Process., vol. 87, no. 8, pp. 1904–1916, 2007.
[24] J. Kim and H. Park, “Fast nonnegative matrix factorization: An active-set-like
method and comparisons,” SIAM J. Sci. Comput., vol. 33, no. 6, pp. 3261–3281, 2011.
[25] T. Virtanen, J. Gemmeke, and B. Raj, “Active-set Newton algorithm for overcom-
plete nonnegative representations of audio,” IEEE Trans. Audio, Speech, Lang. Pro-
cessing, vol. 21, no. 11, 2013.
[26] T. Hofmann, "Unsupervised learning by probabilistic latent semantic analysis,"
Mach. Learn., vol. 42, no. 1–2, pp. 177–196, 2001.
[27] M. Shashanka, B. Raj, and P. Smaragdis, “Probabilistic latent variable models
as nonnegative factorizations," Computat. Intell. Neurosci., vol. 2008, 2008.
[28] G. J. Mysore, P. Smaragdis, and B. Raj, “Nonnegative hidden Markov model-
ing of audio with application to source separation,” in Proc. 9th Int. Conf. Latent
Variable Analysis and Signal Separation, St. Malo, France, 2010, pp. 140–148.
[29] P. Smaragdis, B. Raj, and M. Shashanka, “Missing data imputation for time-
frequency representations of audio signals," J. Signal Process. Syst., vol. 11, no. 3,
pp. 361–370, 2011.
[30] H. Laurberg, M. G. Christensen, M. D. Plumbley, L. K. Hansen, and S. H.
Jensen, “Theorems on positive data: On the uniqueness of NMF,” Computat. Intell.
Neurosci., vol. 2008, 2008.
[31] J. Eggert and E. Korner, “Sparse coding and NMF,” in Proc. IEEE Int. Joint
Conf. Neural Networks, Budapest, Hungary, 2004, pp. 2529–2533.
[32] P. O. Hoyer, “Nonnegative matrix factorization with sparseness constraints,” J.
Mach. Learn. Res., vol. 5, pp. 1457–1469, 2004.
[33] P. D. O'Grady, "Sparse separation of underdetermined speech mixtures," Ph.D.
dissertation, Natl. Univ. of Ireland, Maynooth, 2007.
[34] T. Virtanen, “Spectral covariance in prior distributions of nonnegative matrix
factorization based speech separation,” in Proc. European Signal Processing Conf.,
Glasgow, Scotland, 2009, pp. 1933–1937.
[35] P. Smaragdis, M. Shashanka, and B. Raj, “A sparse non-parametric approach
for single channel separation of known sounds,” in Proc. Neural Information Pro-
cessing Systems, Vancouver, Canada, 2009, pp. 1705–1713.
[36] D. Griffin and J. Lim, “Signal estimation from modified short-time Fourier
transform," IEEE Trans. Acoustics, Speech, Signal Processing, vol. 32, no. 2,
pp. 236–242, 1984.
[37] J. Le Roux and E. Vincent, “Consistent Wiener filtering for audio source sepa-
ration,” IEEE Signal Processing Lett., vol. 20, no. 3, pp. 217–220, 2013.
[38] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD and its nonnegative variant
for dictionary design," in Proc. SPIE Conf. Wavelet Applications in Signal and
Image Processing XI, San Diego, CA, 2005, pp. 327–339.
[39] R. G. Baraniuk, "Compressive sensing," IEEE Signal Processing Mag., vol. 24,
no. 4, pp. 118–121, 2007.
[40] A. Lefèvre, F. Bach, and C. Févotte, “Itakura-Saito nonnegative matrix factor-
ization with group sparsity,” in Proc. IEEE Int. Conf. Audio, Speech and Signal
Processing, Prague, Czech Republic, 2011, pp. 21–24.
[41] A. T. Cemgil, “Bayesian inference for nonnegative matrix factorisation models,”
Computat. Intell. Neurosci., vol. 2009, 2009.
[42] M. N. Schmidt and M. Mørup, “Infinite nonnegative matrix factorizations,” in
Proc. European Signal Processing Conf., Aalborg, Denmark, 2010.
[43] M. N. Schmidt, O. Winther, and L. K. Hansen, "Bayesian nonnegative matrix
factorization," in Proc. 8th Int. Conf. Independent Component Analysis and Blind
Signal Separation, Paraty, Brazil, 2009, pp. 540–547.
[44] A. Hurmalainen, R. Saeidi, and T. Virtanen, “Group sparsity for speaker identity
discrimination in factorisation-based speech recognition,” in Proc. Interspeech 2012,
Portland, OR.
[45] T. N. Sainath, A. Carmi, D. Kanevsky, and B. Ramabhadran, “Bayesian
compressive sensing for phonetic classification,” in Proc. IEEE Int. Conf. Audio,
Speech and Signal Processing, Dallas, TX, 2010, pp. 4370–4373.
[46] J. Gemmeke, L. ten Bosch, L. Boves, and B. Cranen, “Using sparse representa-
tions for exemplar based continuous digit recognition,” in Proc. European Signal
Processing Conf., Glasgow, Scotland, 2009, pp. 24–28.
[47] K. Mahkonen, A. Hurmalainen, T. Virtanen, and J. F. Gemmeke, “Mapping
sparse representation to state likelihoods in noise-robust automatic speech recogni-
tion,” in Proc. Interspeech 2011, Florence, Italy, pp. 465–468.
[48] Y. Sun, B. Cranen, J. F. Gemmeke, L. Boves, L. ten Bosch, and M. M. Doss,
“Using sparse classification outputs as feature observations for noise-robust ASR,”
in Proc. Interspeech 2012, Portland, OR.
[49] T. N. Sainath, D. Nahamoo, D. Kanevsky, and B. Ramabhadran, “Enhancing exem-
plar-based posteriors for speech recognition tasks,” in Proc. Interspeech 2012, Portland,
OR.
[50] B. Raj, R. Singh, M. Shashanka, and P. Smaragdis, "Bandwidth expansion
with a Polya Urn model," in Proc. IEEE Int. Conf. Audio, Speech and Signal
Processing, Honolulu, HI, 2007, pp. IV-597–IV-600.
[51] R. Takashima, T. Takiguchi, and Y. Ariki, “Exemplar-based voice conversion
in noisy environment,” in Proc. IEEE Spoken Language Technology Workshop,
2012, pp. 313–317.
[52] Z. Wu, T. Virtanen, T. Kinnunen, E. S. Chng, and H. Li, “Exemplar-based
voice conversion using nonnegative spectrogram deconvolution,” in Proc. 8th ISCA
Speech Synthesis Workshop, Barcelona, Spain, 2013, pp. 201–206.
[53] J. F. Gemmeke, H. Van hamme, B. Cranen, and L. Boves, “Compressive sens-
ing for missing data imputation in noise robust speech recognition,” IEEE J. Sel.
Top. Signal Processing, vol. 4, no. 2, pp. 272–287, 2010.
[54] J. Le Roux, H. Kameoka, N. Ono, A. de Cheveigné, and S. Sagayama, “Compu-
tational auditory induction as a missing-data model-fitting problem with Bregman
divergence,” SIAM J. Sci. Comput., vol. 54, no. 5, pp. 658–676, 2011.
[55] J.-L. Durrieu, B. David, and G. Richard, “A musically motivated mid-level rep-
resentation for pitch estimation and musical audio source separation,” IEEE J. Sel.
Top. Signal Processing, vol. 5, no. 6, pp. 1180–1191, 2011.
[56] J. Carabias-Orti, T. Virtanen, P. Vera-Candeas, N. Ruiz-Reyes, and F. Canadas-
Quesada, "Musical instrument sound multi-excitation model for nonnegative
spectrogram factorization," IEEE J. Sel. Top. Signal Processing, vol. 5, no. 6,
pp. 1144–1158, 2011.
[57] Y. K. Yilmaz, A. T. Cemgil, and U. Simsekli, “Generalized coupled tensor fac-
torization," in Proc. Neural Information Processing Systems, Granada, Spain,
2011, pp. 2151–2159.
[58] A. Ozerov, E. Vincent, and F. Bimbot, “A general flexible framework for the
handling of prior information in audio source separation,” IEEE Trans. Audio,
Speech, Lang. Process., vol. 20, no. 4, pp. 1118–1133, 2012.
[59] N. Yasuraoka, H. Kameoka, T. Yoshioka, and H. G. Okuno, “I-divergence-
based dereverberation method with auxiliary function approach,” in Proc. IEEE
Int. Conf. Audio, Speech and Signal Processing, Prague, Czech Republic, 2011,
pp. 369–372.
[60] R. Singh, B. Raj, and P. Smaragdis, “Latent-variable decomposition based
dereverberation of monaural and multi-channel signals,” in Proc. IEEE Int. Conf.
Audio, Speech and Signal Processing, Dallas, TX, 2010, pp. 1914–1917.
[61] F. Weninger, J. Geiger, M. Wöllmer, B. Schuller, and G. Rigoll, “The Munich
2011 CHiME challenge contribution: NMF-BLSTM speech enhancement and recog-
nition for reverberated multisource environments,” in Proc. Int. Workshop on Ma-
chine Listening in Multisource Environments, Florence, Italy, 2011, pp. 24–29.
[62] D. FitzGerald, M. Cranitch, and E. Coyle, “Extended nonnegative tensor factori-
sation models for musical source separation,” Computat. Intell. Neurosci., vol. 2008,
2008.
[63] R. A. Harshman, "Foundations of the PARAFAC procedure: Models and conditions
for an 'explanatory' multimodal factor analysis," in UCLA Working Papers in
Phonetics, vol. 16, pp. 1–84, 1970.
[64] H. Sawada, H. Kameoka, S. Araki, and N. Ueda, “Formulations and algo-
rithms for multichannel complex NMF,” in Proc. IEEE Int. Conf. Audio, Speech
and Signal Processing, Prague, Czech Republic, 2011, pp. 229–232.
[SP]