Prof. Aggelos Katsaggelos

Machine Learning Approaches for Solving Inverse Imaging Problems

August 29, 2017 9:30 am - 10:30 am

Prof. Aggelos Katsaggelos

Joseph Cummings Chair, Department of EECS, Northwestern University, USA

Abstract

The solution of inverse problems in imaging applications such as image recovery, super-resolution, and compressive sensing, to name a few, has a long history of research and development. Analytical approaches have been very successful in providing solutions to such problems. Learning approaches, and more specifically approaches based on deep neural networks, have recently been developed and are challenging analytical approaches in establishing the state of the art. One of the first questions we will address is: what are the relative advantages of analytical and learning approaches for solving inverse problems, and, for a specific problem at hand, which of the available approaches in our toolbox should one use? We will also discuss under what circumstances we should expect learning approaches to provide more accurate solutions than analytical approaches. We will present specific examples from our work on video super-resolution and temporal compressive sampling and draw conclusions.
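As a minimal illustration of the analytical side of such inverse problems (not taken from the talk itself), Tikhonov-regularised deconvolution recovers a signal from a known circular blur in the Fourier domain; the boxcar kernel and the regularisation weight alpha below are illustrative choices:

```python
import numpy as np

def tikhonov_deblur(y, h, alpha=1e-2):
    """Analytical recovery of x from y = h * x (circular convolution):
    minimise ||h * x - y||^2 + alpha * ||x||^2, solved per Fourier frequency."""
    n = len(y)
    H = np.fft.fft(h, n)  # zero-pad the kernel to the signal length
    X_hat = np.conj(H) * np.fft.fft(y) / (np.abs(H) ** 2 + alpha)
    return np.real(np.fft.ifft(X_hat))

# Blur a sinusoid with a length-5 boxcar, then recover it.
n = 128
x = np.sin(2 * np.pi * np.arange(n) / n)
h = np.ones(5) / 5
y = np.real(np.fft.ifft(np.fft.fft(h, n) * np.fft.fft(x)))
x_rec = tikhonov_deblur(y, h, alpha=1e-3)
```

The regularisation term keeps the division stable at frequencies where the blur response is small, which is exactly where purely analytical inversion struggles and learned priors can help.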

Bio

Aggelos K. Katsaggelos received the Diploma degree in electrical and mechanical engineering from the Aristotle University of Thessaloniki, Greece, in 1979, and the M.S. and Ph.D. degrees in electrical engineering from the Georgia Institute of Technology in 1981 and 1985, respectively. In 1985 he joined the Department of Electrical Engineering and Computer Science at Northwestern University, where he is currently a Professor and holder of the Joseph Cummings Chair. He previously held the Ameritech Chair of Information Technology and the AT&T Chair. He is also a member of the academic staff of NorthShore University Health System, an affiliated faculty member in the Department of Linguistics, and holds an appointment with Argonne National Laboratory. He has published extensively in the areas of multimedia signal processing and communications (over 250 journal papers, 600 conference papers, and 40 book chapters) and holds 25 international patents. He is the co-author of Rate-Distortion Based Video Compression (Kluwer, 1997), Super-Resolution for Images and Video (Claypool, 2007), Joint Source-Channel Video Transmission (Claypool, 2007), and Machine Learning Refined (Cambridge University Press, 2016). He has supervised 55 Ph.D. theses so far.

Among his many professional activities Prof. Katsaggelos was Editor-in-Chief of the IEEE Signal Processing Magazine (1997-2002), a BOG Member of the IEEE Signal Processing Society (1999-2001), a member of the Publication Board of the IEEE Proceedings (2003-2007), and a Member of the Award Board of the IEEE Signal Processing Society. He is a Fellow of the IEEE (1998), SPIE (2009), and EURASIP (2017), and the recipient of the IEEE Third Millennium Medal (2000), the IEEE Signal Processing Society Meritorious Service Award (2001), the IEEE Signal Processing Society Technical Achievement Award (2010), an IEEE Signal Processing Society Best Paper Award (2001), an IEEE ICME Paper Award (2006), an IEEE ICIP Paper Award (2007), an ISPA Paper Award (2009), and a EUSIPCO paper award (2013). He was a Distinguished Lecturer of the IEEE Signal Processing Society (2007-2008).


Prof. Ron Kimmel

On Learning Invariants and Representation Spaces of Shapes and Forms

August 30, 2017 8:40 am - 9:40 am

Prof. Ron Kimmel

Department of Computer Science Technion, Israel Institute of Technology, Israel

Abstract

We study the power of the Laplace-Beltrami Operator (LBO) in processing and analyzing geometric information. The decomposition of the LBO at one end, and the heat operator at the other, provide us with efficient tools for dealing with images and shapes. Denoising, segmentation, filtering, and exaggeration are just a few of the problems for which the LBO provides an efficient solution. We review the optimality of a truncated basis provided by the LBO, and a selection of relevant metrics by which such optimal bases are constructed. A specific example is the scale-invariant metric for surfaces, which we argue to be a natural selection for the study of articulated shapes and forms.
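A toy 1-D analogue of these ideas (not from the talk; the path-graph Laplacian stands in for the LBO, and the parameters are illustrative) shows both the heat-operator smoothing and the truncated-basis projection:

```python
import numpy as np

def heat_smooth(signal, t=5.0, k=None):
    """Smooth a 1-D signal with the heat operator exp(-t*L), where L is the
    path-graph Laplacian (a discrete stand-in for the LBO).
    If k is given, keep only the k lowest-frequency eigenvectors (truncated basis)."""
    n = len(signal)
    L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    L[0, 0] = L[-1, -1] = 1              # Neumann-style boundary rows
    lam, U = np.linalg.eigh(L)           # analogue of the LBO eigen-decomposition
    coeff = U.T @ signal                 # expand in the Laplacian eigenbasis
    if k is not None:
        coeff[k:] = 0                    # truncated basis: drop high frequencies
    return U @ (np.exp(-t * lam) * coeff)
```

Applying `heat_smooth` to a noisy signal attenuates each eigen-component by exp(-t*lambda), so high-frequency noise decays much faster than the smooth underlying shape, which is the essence of LBO-based denoising.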

In contrast to geometric understanding stands the newly emerging field of deep learning. Learning systems are rapidly dominating the areas of audio, textual, and visual analysis. Recent efforts to carry these successes over to geometry processing indicate that encoding geometric intuition into modeling, training, and testing is a non-trivial task. It appears as if approaches based on geometric understanding are orthogonal to those of data-heavy computational learning. We propose to unify these two methodologies by computationally learning geometric representations and invariants, and thereby take a small step towards a new perspective on geometry processing.

I will present examples of shape matching, facial surface reconstruction from a single image, reading facial expressions, shape representation, and finally definition and computation of invariant operators and signatures.

Bio

Ron Kimmel is a Professor of Computer Science at the Technion, where he holds the Montreal Chair in Sciences. He held a post-doctoral position at UC Berkeley and a visiting professorship at Stanford University. He has worked in various areas of image and shape analysis in computer vision, image processing, and computer graphics. In recent years Kimmel's interests have been non-rigid shape processing and analysis, medical imaging, learning and understanding, numerical optimization of problems with a geometric flavor, and applications of metric and differential geometry. Kimmel is an IEEE Fellow for his contributions to image processing and non-rigid shape analysis. He is an author of two books and numerous articles. He is the founder of the Geometric Image Processing Lab and a founder and advisor of several successful image processing and analysis companies. Since the acquisition of InVision, a company he co-founded, five years ago, he has served as a part-time senior principal engineer at Intel's Perceptual Computing, where he now co-directs a research team.


Dr. Francis Bach

Linearly-Convergent Stochastic Gradient Algorithms

August 30, 2017 1:30 pm - 2:30 pm

Dr. Francis Bach

Departement d'Informatique de l'Ecole Normale Superieure, Centre de Recherche INRIA de Paris, France

Abstract

Many machine learning and signal processing problems are traditionally cast as convex optimization problems where the objective function is a sum of many simple terms. In this situation, batch algorithms compute gradients of the objective function by summing all individual gradients at every iteration and exhibit a linear convergence rate for strongly-convex problems. Stochastic methods rather select a single function at random at every iteration, classically leading to cheaper iterations but with a convergence rate which decays only as the inverse of the number of iterations. In this talk, I will present the stochastic averaged gradient (SAG) algorithm which is dedicated to minimizing finite sums of smooth functions; it has a linear convergence rate for strongly-convex problems, but with an iteration cost similar to stochastic gradient descent, thus leading to faster convergence for machine learning and signal processing problems. I will also mention several extensions, in particular to saddle-point problems, showing that this new class of incremental algorithms applies more generally.
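A bare-bones sketch of the idea behind SAG, for a ridge-regularised least-squares objective (an illustration under standard conservative constants, not the authors' reference implementation): the algorithm keeps a table of the most recent gradient of every term and steps along their average, so each iteration touches a single data point yet the direction uses information from the whole sum.

```python
import numpy as np

def sag_least_squares(A, b, lam=0.1, n_iters=5000, seed=0):
    """Minimise f(w) = (1/n) * sum_i 0.5*(a_i.w - b_i)^2 + 0.5*lam*||w||^2
    with the stochastic averaged gradient (SAG) method."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    grads = np.zeros((n, d))        # most recent gradient of each term
    grad_sum = np.zeros(d)          # running sum of the stored gradients
    L = np.max(np.sum(A ** 2, axis=1)) + lam   # per-term Lipschitz constant
    step = 1.0 / (16 * L)           # conservative step size from the SAG analysis
    for _ in range(n_iters):
        i = rng.integers(n)         # sample one term, as in plain SGD
        g_new = (A[i] @ w - b[i]) * A[i] + lam * w
        grad_sum += g_new - grads[i]   # refresh the table and its running sum
        grads[i] = g_new
        w -= step * grad_sum / n    # step along the average of stored gradients
    return w
```

For this strongly convex problem the iterates approach the closed-form minimiser of the regularised normal equations at a linear rate, while each iteration costs the same as one SGD step.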

Bio

Francis Bach is a researcher at INRIA, leading since 2011 the Sierra project-team, which is part of the Computer Science Department at Ecole Normale Superieure. He completed his Ph.D. in Computer Science at U.C. Berkeley, working with Professor Michael Jordan, spent two years in the Mathematical Morphology group at Ecole des Mines de Paris, and then joined the Willow project-team at INRIA/Ecole Normale Superieure from 2007 to 2010. Francis Bach is interested in statistical machine learning, and especially in graphical models, sparse methods, kernel-based learning, convex optimization, vision, and signal processing. He obtained a Starting Grant from the European Research Council in 2009 and received the INRIA young researcher prize in 2012. In 2015 he was program co-chair of the International Conference on Machine Learning (ICML).


Prof. Sabine Van Huffel

The Power of Low Rank Tensor Approximations in Smart Diagnostics

August 31, 2017 8:40 am - 9:40 am

Prof. Sabine Van Huffel

Stadius Centre for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Belgium

Abstract

An overview of applications in Smart Diagnostics is presented in which low rank tensor approximations form the computational core.

Accurate and automated extraction of clinically relevant diagnostic information from patient recordings requires an ingenious combination of adequate pretreatment of the data (e.g. artefact removal), feature selection, pattern recognition, and decision support, through to their embedding into user-friendly interfaces. The underlying computational problems can be solved by using low rank tensor approximations as building blocks of higher-level signal processing algorithms. A major challenge here is how to make these mathematical decompositions "interpretable", so that they reveal the underlying clinically relevant information and improve medical diagnosis. The addition of relevant constraints and source models can help to achieve this. Multimodal data fusion poses the additional challenge of how to couple the associated tensor decompositions by imposing appropriate constraints that translate underlying relationships and common dynamics.

The application of these low rank tensor approximations and their benefits will be illustrated in a variety of case studies. In particular, their emerging power in cardiac monitoring using multilead Electrocardiography (ECG) will be shown in T-wave alternans and irregular heartbeat detection. In addition, their added value in Magnetic Resonance Spectroscopic Imaging combined with Magnetic Resonance Imaging is shown to improve brain tumour recognition. Although Canonical Polyadic Decompositions (CPD) and Multilinear Singular Value Decompositions (MLSVD) are most popular, their extensions are emerging in clinical applications, e.g. Block Term Decompositions (BTD) and multiscale MLSVD, as well as coupled tensor decompositions.
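As a minimal illustration of the simplest of these decompositions, a rank-R CPD of a 3-way tensor can be fitted by alternating least squares (a bare-bones sketch; practical toolboxes such as Tensorlab add initialisation strategies, constraints, and convergence checks):

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding of a 3-way tensor (C-order flattening of the other modes)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def kr(U, V):
    """Column-wise Khatri-Rao product, rows ordered with U's index slowest."""
    return np.einsum('ir,jr->ijr', U, V).reshape(-1, U.shape[1])

def cpd_als(T, rank, n_iters=500, seed=0):
    """Fit T[i,j,k] ~ sum_r A[i,r]*B[j,r]*C[k,r] by alternating least squares:
    each factor is updated by a linear least-squares solve with the others fixed."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((T.shape[0], rank))
    B = rng.standard_normal((T.shape[1], rank))
    C = rng.standard_normal((T.shape[2], rank))
    for _ in range(n_iters):
        A = unfold(T, 0) @ np.linalg.pinv(kr(B, C)).T
        B = unfold(T, 1) @ np.linalg.pinv(kr(A, C)).T
        C = unfold(T, 2) @ np.linalg.pinv(kr(A, B)).T
    return A, B, C
```

In the clinical applications above, the recovered factor matrices are what must be made interpretable, for example by adding nonnegativity or smoothness constraints to the per-mode least-squares updates.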

In conclusion, tensor decompositions can be highly relevant in biomedical data processing. Nevertheless, their use in smart diagnostics is still largely unexplored.

Bio

Sabine Van Huffel received the Master's degree in computer science engineering in June 1981, the Master's degree in biomedical engineering in July 1985, and the Ph.D. in electrical engineering in June 1987, all from KU Leuven, Belgium. She is a full professor at the Department of Electrical Engineering, KU Leuven, Belgium, where she heads the Biomedical Data Processing Research Group (about 25 researchers). She is programme director of the Master of Science in Biomedical Engineering at KU Leuven, and a Fellow of IEEE, SIAM, and EAMBES. In April 2013 she received an honorary doctorate from Eindhoven University of Technology, together with an appointment as a Distinguished Professor.

She performs research - fundamental and theoretical as well as application oriented - in the domains of (multi)linear algebra, (non)linear signal analysis, classification, and system identification, with a special focus on the development of numerically reliable and robust algorithms for improving medical diagnostics in numerous areas such as epilepsy, stress, and neonatal brain monitoring. In particular, she is well known for her contributions to Total Least Squares (TLS) fitting and, more recently, for her expertise in matrix/tensor-based biomedical multimodal and multichannel data processing, with applications in fusing EEG and functional MRI as well as in multiparametric MRI.

She has published two SIAM monographs, about 420 articles in refereed international journals, and about 300 articles in international peer-reviewed conference proceedings (over 20,000 Google Scholar citations). For more information, see http://www.esat.kuleuven.be/stadius/. She is the holder of ERC Advanced Grant 339804 BIOTENSORS, "Biomedical Data Fusion using Tensor based Blind Source Separation" (1 April 2014 - 31 March 2019).

She is (co)supervisor of 52 finished and 20 ongoing PhDs, mostly interdisciplinary and in co-supervision with medical colleagues. In addition, she has mentored 30 postdocs.


Prof. Simon King

Speech Synthesis: Where Did the Signal Processing Go?

August 31, 2017 1:30 pm - 2:30 pm

Prof. Simon King

The Centre for Speech Technology Research, University of Edinburgh, UK

Abstract

In current approaches to speech synthesis, we generally obtain the best sounding output by concatenating recordings of natural speech, in the waveform domain. The competing method, in which a statistical model drives a vocoder, has until recently suffered from significant artefacts that reduced perceived naturalness.

Now, statistical parametric methods are suddenly and rapidly improving, and are finally good enough for commercial deployment. The most recent boost to quality has come about through a convergence of acoustic modelling and waveform generation, in which the model directly generates a waveform, or a very closely-related representation.

But, in this rush to use statistical models to directly generate waveforms, much of what we know about speech signal processing - whether source-filter modeling, the cepstrum, or something as simple as perceptually-motivated frequency scale warping - is now being questioned or simply forgotten. We see models generating exceedingly naive representations, such as 8-bit quantised waveform samples.
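For reference, the 8-bit quantised waveform representation in question is typically mu-law companded (as in WaveNet-style models); a minimal round trip shows both the coarseness of 256 levels and the perceptual motivation behind the companding:

```python
import numpy as np

def mulaw_encode(x, mu=255):
    """Map samples in [-1, 1] to 256 discrete levels via mu-law companding:
    a logarithmic warp gives finer resolution to low-amplitude samples."""
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    return np.clip(((y + 1) / 2 * mu + 0.5).astype(int), 0, mu)

def mulaw_decode(q, mu=255):
    """Invert the companding: integer level -> approximate sample in [-1, 1]."""
    y = 2 * q.astype(float) / mu - 1
    return np.sign(y) * ((1 + mu) ** np.abs(y) - 1) / mu
```

The round-trip error is largest for loud samples and much smaller near zero, a crude stand-in for the perceptually-motivated scales the talk argues are being forgotten.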

In my talk, I will ask whether this is just the inevitable march of Machine Learning, or if there is a missed opportunity. It seems rather unlikely that an 8-bit quantised waveform is the best domain for an objective function aiming to maximise perceived naturalness. Surely, experts in signal processing (by which, I mean YOU!) can do better?

Bio

A fundamental question is: What are the basic building blocks of speech? To answer this question, I have worked on a number of problems in speech technology. In recent years, I have concentrated mainly on speech synthesis, working in both unit selection and statistical parametric paradigms (HMM-based and DNN-based). I have been considering the "building blocks question" in text processing, acoustic modelling and waveform generation.

In text processing, building blocks co-exist at many different levels of representation. Some, such as phonemes or syllables, need reliable linguistic knowledge of the language. Other unit types, such as graphemes, can be used in a wider range of practical situations.

In acoustic modelling, the definition of the unit of speech is crucial. Both unit selection and statistical parametric approaches typically use small, naively-defined units such as phones or diphones, but then adorn them with many contextual features, leading to severe sparsity. This sparsity is solved by finding units-in-context that are somehow "equivalent" or "perceptually interchangeable".

In waveform generation, we still obtain the best quality by concatenating recordings, but parametric methods are improving rapidly. There is a gradual convergence of acoustic modelling and waveform generation, which may overcome the limitations of current systems that couple a statistical model with a vocoder. But, in this convergence, much of what we know about the building blocks of speech signals - such as source and filter - is now being questioned or simply ignored.