Sjoerd van Steenkiste
sjoerd at idsia dot ch

I am a PhD student at the Swiss AI Lab IDSIA with Jürgen Schmidhuber. I am generally interested in Artificial Intelligence, although my current focus is on unsupervised learning algorithms that learn complex symbol-like world representations. Previously I worked on neuroevolution and multiwavelets.

I received a master's in Artificial Intelligence, a master's in Operations Research, and a bachelor's in Knowledge Engineering from Maastricht University. I have also spent time at Google Brain and NNAISENSE as a research intern.

CV  /  Google Scholar  /  GitHub  /  Twitter

Research

My current research focus is on unsupervised learning algorithms that learn complex symbol-like world representations. Previously I worked on neuroevolution and multiwavelets.

Are Disentangled Representations Helpful for Abstract Visual Reasoning?
Sjoerd van Steenkiste, Francesco Locatello, Jürgen Schmidhuber, Olivier Bachem
Technical Report - Currently under review
pdf / code

We conduct a large-scale study that investigates whether disentangled representations are more suitable for abstract reasoning tasks. Using two new tasks similar to Raven's Progressive Matrices, we evaluate the usefulness of the representations learned by 360 state-of-the-art unsupervised disentanglement models. Based on these representations, we train 3600 abstract reasoning models and observe that disentangled representations do in fact lead to better downstream performance. In particular, they appear to enable quicker learning using fewer samples.

Towards Accurate Generative Models of Video: A New Metric & Challenges
Thomas Unterthiner*, Sjoerd van Steenkiste*, Karol Kurach, Raphaël Marinier, Marcin Michalski, Sylvain Gelly
Technical Report - Currently under review
pdf / code / dataset

We propose Fréchet Video Distance (FVD), a new metric for generative models of video based on the Fréchet Inception Distance (FID), and StarCraft 2 Videos (SCV), a collection of progressively harder datasets that challenge the capabilities of the current iteration of generative models for video. We conduct a large-scale human study, which confirms that FVD correlates well with qualitative human judgment of generated videos, and provide initial benchmark results on SCV.

*Both authors contributed equally

A Case for Object Compositionality in Deep Generative Models of Images
Sjoerd van Steenkiste, Karol Kurach, Sylvain Gelly
NeurIPS workshop on Modeling the Physical World: Perception, Learning, and Control, 2018
NeurIPS workshop on Relational Representation Learning, 2018
pdf / code

We propose to structure the generator of a GAN to consider objects and their relations explicitly, and generate images by means of composition. On several multi-object image datasets, we find that the proposed generator learns to identify and disentangle information corresponding to different objects at a representational level. A human study reveals that the resulting generative model produces images that are more faithful to the reference distribution.

Relational Neural Expectation Maximization: Unsupervised Discovery of Objects and their Interactions
Sjoerd van Steenkiste, Michael Chang, Klaus Greff, Jürgen Schmidhuber
International Conference on Learning Representations (ICLR), 2018
pdf / code / poster

We present a novel method that learns to discover objects and model their physical interactions from raw visual images in a purely unsupervised fashion. It incorporates prior knowledge about the compositional nature of human perception to factor interactions between object-pairs and learn efficiently. On videos of bouncing balls we show the superior modelling capabilities of our method compared to other unsupervised neural approaches that do not incorporate such prior knowledge.

Relational Neural Expectation Maximization
Sjoerd van Steenkiste, Michael Chang, Klaus Greff, Jürgen Schmidhuber
NIPS workshop on Cognitively Informed Artificial Intelligence, 2017
Oral Presentation, Oculus Outstanding Paper Award
pdf / code / slides

We propose a novel approach to common-sense physical reasoning that learns physical interactions between objects from raw visual images in a purely unsupervised fashion. Our method incorporates prior knowledge about the compositional nature of human perception, enabling it to discover objects, factor interactions between object-pairs to learn efficiently, and generalize to new environments without re-training.

Neural Expectation Maximization
Klaus Greff*, Sjoerd van Steenkiste*, Jürgen Schmidhuber
Neural Information Processing Systems (NIPS), 2017
NVAIL Pioneering Research Award
pdf / code / poster

In this paper, we explicitly formalize the problem of automatically discovering distributed symbol-like representations as inference in a spatial mixture model where each component is parametrized by a neural network. Based on the Expectation Maximization framework we then derive a differentiable clustering method that simultaneously learns how to group and represent individual entities.

*Both authors contributed equally

A Wavelet-based Encoding for Neuroevolution
Sjoerd van Steenkiste, Jan Koutník, Kurt Driessens, Jürgen Schmidhuber
Genetic and Evolutionary Computation Conference (GECCO), 2016
pdf / code

A new indirect scheme for encoding neural network connection weights as sets of wavelet-domain coefficients is proposed. It exploits spatial regularities in the weight-space to reduce the gene-space dimension by considering the low-frequency wavelet coefficients only. The wavelet-based encoding builds on top of a frequency-domain encoding, but unlike when using a Fourier-type transform, it offers gene locality while preserving continuity of the genotype-phenotype mapping.
