Deep Learning Scientist
Co-Founder of Google Research Amsterdam, Brain Team
I have been interested in the notion of "how one learns" since my early studies. Over the years I have looked at this problem from a multitude of perspectives: philosophical, cognitive, logico-mathematical, computational and statistical. Eventually I came to regard the distributed representation as the most powerful representation then available for knowledge and learning. Nowadays the world's most advanced learning systems rely on distributed representations.
This page contains a selection of my research on this topic: papers, talks and theses. After attending classical high school in Switzerland, I graduated with a Bachelor of Arts with Honors in Philosophy and a Bachelor of Science in Symbolic Systems from Stanford University. I then did my Master's in Logic at the University of Amsterdam and my PhD in Computer Science at Oxford University. After working for three years at DeepMind in London as a Research Scientist, I returned to Amsterdam to create a research team within Google focused on advancing the limits of distributed learning. Check out my works below, my Scholar page and my Twitter feed.
MetNet: A Neural Weather Model for Precipitation Forecasting, March 2020
MetNet is the first Neural Weather Model to show performance comparable to physics-based systems at granular precipitation prediction up to 8 hours ahead, without using any physics. Check out the Google AI blog post.
Axial Attention in Multidimensional Transformers, September 2019
Axial attention is a simple way to efficiently and effectively use self-attention on large images or videos by factorising regular self-attention over the axes.
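The factorisation can be illustrated with a minimal NumPy sketch (my own illustration, not the paper's code): plain single-head attention without learned projections, applied first along the width axis and then along the height axis of an image-shaped tensor, so the cost scales with H+W rather than H*W per position.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention over the second-to-last (sequence) axis.
    d = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ v

def axial_attention(x):
    """Factorise full 2-D self-attention into one pass along each axis.

    x has shape (height, width, channels). Full self-attention over the
    H*W positions costs O((H*W)^2); attending along rows and then along
    columns costs O(H*W*(H+W)) instead.
    """
    # Attend along the width axis: each row is an independent sequence.
    x = attention(x, x, x)            # (H, W, C)
    # Attend along the height axis: transpose so columns become sequences.
    xt = np.swapaxes(x, 0, 1)         # (W, H, C)
    xt = attention(xt, xt, xt)
    return np.swapaxes(xt, 0, 1)      # back to (H, W, C)
```

A real implementation adds learned query/key/value projections, multiple heads and positional encodings per axis, but the axis-wise factorisation is the whole trick.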
Neural Machine Translation in Linear Time, October 2016
The first fully convolutional encoder-decoder model with state-of-the-art results on character-level language modelling and translation. Its structure is a precursor to that of the Transformer.
Video Pixel Networks, October 2016
Introduces an autoregressive video model and demonstrates the ability of neural nets to learn and sample from complex natural video distributions.
Recurrent Continuous Translation Models, August 2013
This project introduces a sentence-to-sentence encoder-decoder neural network trained end-to-end for machine translation. The model marks the beginning of neural machine translation. Beam search is used to decode sample translations from the neural network. Neural machine translation is nowadays widely adopted in Google Translate and other translation systems.
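The beam-search decoding step can be sketched as follows. This is a generic toy implementation, not the project's code: the `step_logprobs` callable standing in for the neural network is a hypothetical interface, and real systems also length-normalise scores and batch the model calls.

```python
import math

def beam_search(step_logprobs, beam_size, max_len, eos):
    """Keep the beam_size highest-scoring partial translations per step.

    step_logprobs(prefix) returns a dict mapping each candidate next token
    to its log-probability given the prefix. Returns the best (sequence,
    cumulative log-probability) pair found.
    """
    beams = [((), 0.0)]   # (token tuple, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, lp in step_logprobs(seq).items():
                candidates.append((seq + (tok,), score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            # Sequences that emit the end token are done; others stay live.
            (finished if seq[-1] == eos else beams).append((seq, score))
        if not beams:
            break
    return max(finished or beams, key=lambda c: c[1])
```

With beam_size=1 this reduces to greedy decoding; a wider beam can recover translations whose first token is not the locally most probable one.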