About Me

Deep Learning Scientist
Co-Founder of Google Research Amsterdam, Brain Team

I have been interested in the notion of "how one learns" since my early studies. Over the years I have looked at this problem from a multitude of perspectives, including the philosophical, cognitive, logico-mathematical, computational and statistical ones. Eventually I arrived at the concept of a distributed representation as the most powerful representation for knowledge and learning available at the time. Nowadays the world's most advanced learning systems rely on distributed learning.

This page contains a selection of my research on this topic: papers, talks and theses. After attending classical high school in Switzerland, I graduated with a Bachelor of Arts with Honors in Philosophy and a Bachelor of Science in Symbolic Systems from Stanford University. I then did my Master's in Logic at the University of Amsterdam and my PhD in Computer Science at Oxford University. After working for three years at DeepMind in London as a Research Scientist, I returned to Amsterdam to create a research team within Google focused on advancing the limits of distributed learning. Check out my work below, my scholar page and my Twitter feed.


Selected Works


MetNet is the first Neural Weather Model to perform comparably to physics-based forecasting systems at granular precipitation prediction up to 8 hours ahead, without using physics. Check out the Google AI blog post.

Axial Attention

Axial attention is a simple way to efficiently and effectively use self-attention on large images or videos by factorising regular self-attention over the axes.
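
To make the idea of factorising over the axes concrete, here is a minimal sketch in NumPy. It is an illustration under simplifying assumptions, not the published implementation: it uses a single head with no learned projections (queries, keys and values are all the input itself), and the function names `attend` and `axial_attention` are made up for this example. Each pass attends along one axis only, so an H x W image costs on the order of H*W*(H+W) score computations instead of (H*W)^2 for full self-attention.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(x):
    """Self-attention over the second-to-last axis of x with shape [..., n, d].

    Simplifying assumption: queries, keys and values are all x itself.
    Leading axes are treated as independent batch dimensions.
    """
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)  # [..., n, n]
    return softmax(scores, axis=-1) @ x               # [..., n, d]

def axial_attention(image):
    """image has shape [H, W, D]; returns an array of the same shape."""
    # Pass 1: each row attends over its W positions (rows are independent).
    rows = attend(image)
    # Pass 2: swap axes so each column attends over its H positions.
    cols = attend(np.swapaxes(rows, 0, 1))
    return np.swapaxes(cols, 0, 1)
```

After the two passes every position has been able to receive information from every other position, first along its row and then along its column.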


Audio synthesis using an extremely lightweight RNN in real time on mobile devices!


The first fully convolutional encoder-decoder language and translation model with state-of-the-art results on character-level language and translation modelling. The structure is a precursor to that of Transformers.

Video Pixel Networks

Video Pixel Networks, October 2016

Introduces an autoregressive video model and demonstrates the ability of neural nets to learn and sample from complex natural video distributions.


The first computer program to ever beat a top professional player at the ancient game of Go.

Recurrent Neural Translation Models

This project introduces a sentence-to-sentence encoder-decoder neural network trained end-to-end for machine translation. This model signals the beginnings of neural machine translation. Beam search is used to decode sample translations from the neural network. Neural machine translation is nowadays widely adopted in Google Translate and other translation systems.

Convolutional Sentence Encoders

Uses convolutional neural networks to encode full sentences for natural language understanding tasks.



Claude Debussylaan 34, 1082 MD Amsterdam


©2020 by Nal Kalchbrenner