Keller Colloquium in Computing and Mathematical Sciences
Machine learning algorithms excel primarily in settings where an engineer can first reduce the problem to a particular narrow function (e.g. an image classifier), and then collect a massive number of hand-labeled input-output pairs for that function. In stark contrast, humans and animals are capable of learning from streams of high-dimensional, multimodal sensory data with minimal external instruction. In this talk, I will argue that, in order to build intelligent systems that are as capable as humans, machine learning algorithms should not be developed in the context of one particular application or domain. Instead, we should be designing systems that can be versatile, can learn in unstructured settings without detailed human-provided labels, and can accomplish many tasks, all while processing rich, high-dimensional sensory inputs. To do so, these systems must be able to actively explore and experiment, collecting data for themselves rather than relying on detailed human labels.
My talk will focus on two key aspects of this goal: versatility and self-supervision. First, I will show how we can move away from hand-designed, task-specific representations of a robot's environment by enabling the robot to learn high-capacity models, such as deep neural networks, for representing complex skills from raw pixel input. I will also present an algorithm that learns deep models that can be rapidly adapted to different objects, new visual concepts, or varying goals, leading to versatile behaviors. Beyond versatility, a hallmark of human intelligence is unsupervised learning. I will discuss how we can allow a robot to learn by 'playing' with objects in its environment without any human supervision. From this experience, the robot can acquire a visual predictive model of the physical world that can be used for maneuvering many different objects to varying goals. In all settings, our experiments on simulated and real robot platforms demonstrate the ability to scale to complex, vision-based skills with novel objects in the real world.
This lecture is part of the Young Investigators Lecture Series sponsored by the Caltech Division of Engineering & Applied Science.