Distinguished Lecturer

Prof. Petros Maragos

Biosketch

Petros Maragos received the M.Eng. Diploma in E.E. from the National Technical University of Athens (NTUA) in 1980 and the M.Sc. and Ph.D. degrees from Georgia Tech, Atlanta, in 1982 and 1985. In 1985, he joined the faculty of the Division of Applied Sciences at Harvard University, Cambridge, MA, where he worked for eight years as a professor of electrical engineering, affiliated with the Harvard Robotics Lab. He was also a consultant to industry research groups, including Xerox's research group on image analysis. In 1993, he joined the faculty of the School of ECE at Georgia Tech, affiliated with its Center for Signal and Image Processing. During parts of 1996-98 he held a joint appointment as director of research at the Institute of Language and Speech Processing in Athens. Since 1999, he has been working as a professor at the NTUA School of ECE, where he is currently the Director of the Intelligent Robotics and Automation Lab. He has held visiting scientist positions at MIT in fall 2012 and at the University of Pennsylvania in fall 2016. His research and teaching interests include signal processing, systems theory, machine learning, image processing and computer vision, audio and speech/language processing, cognitive systems, and robotics. In these areas he has published numerous papers and book chapters, and has co-edited three Springer research books, one on multimodal processing and two on shape analysis.
He has served as: Associate Editor for the IEEE Transactions on Acoustics, Speech & Signal Processing and the IEEE Transactions on Pattern Analysis and Machine Intelligence, as well as editorial board member and guest editor for several journals on signal processing, image analysis and vision; co-organizer of several conferences and workshops, including VCIP'92 (GC), ISMM'96 (GC), MMSP'07 (GC), ECCV'10 (PC), the ECCV'10 Workshop on Sign, Gesture and Activity, the 2011 & 2014 Dagstuhl Symposia on Shape, the IROS'15 Workshop on Cognitive Mobility Assistance Robots, and EUSIPCO 2017 (GC); and member of the IEEE SPS committees on DSP, IMDSP and MMSP. He has also served as a member of the Greek National Council for Research and Technology.
He is the recipient or co-recipient of several awards for his academic work, including: a 1987-1992 US NSF Presidential Young Investigator Award; the 1988 IEEE ASSP Young Author Best Paper Award; the 1994 IEEE SPS Senior Best Paper Award; the 1995 IEEE W.R.G. Baker Prize for the most outstanding original paper; the 1996 Pattern Recognition Society's Honorable Mention best paper award; and the best paper award of the CVPR-2011 Workshop on Gesture Recognition. Several papers co-authored with his students have also received student best paper awards at conferences. In 1995 he was elected IEEE Fellow for his research contributions. He received the 2007 EURASIP Technical Achievement Award for contributions to nonlinear signal processing, systems theory, and image and speech processing. In 2010 he was elected Fellow of EURASIP for his research contributions. He has been elected IEEE SPS Distinguished Lecturer for 2017-2018.

Multimodal Spatio-Temporal Signal Processing and Audio-Visual Perception

In this talk we will present an overview of ideas, methods and research results in multimodal spatio-temporal sensory processing, with emphasis on audio-visual signal processing and fusion as applied to problems of attention, inversion and recognition. We shall begin with a brief synopsis of important findings from audio-visual (A-V) perception. Then we shall outline efficient signal processing front-ends and fusion schemes for the problems of A-V speech recognition as well as A-V speech inversion to geometry. Spatio-temporal processing ideas are also applied to a multi-sensor smart home environment. Afterwards, emphasis will be given to problems of attention, where we will present improved computational saliency models for audio and visual salient event detection, followed by multimodal saliency estimation; this leads to movie video summarization based on audio, visual, and text modalities. Finally, we will outline the application of some of the above ideas and methods to audio-gestural (visual gesture and spoken command) recognition for problems of human-robot interaction.
More information and related papers can be found at http://cvsp.cs.ntua.gr, http://cognimuse.cs.ntua.gr and http://robotics.ntua.gr.