Analyzing Latent Representations of
Trained MAMLs
Interpreting a popular meta-learning algorithm via novel methods
Meta-learning is learning how to learn. Model-Agnostic Meta-Learning (MAML) is an extremely popular optimization-based meta-learning algorithm. In a team of three, I spent ten weeks researching the role of the adaptation learning rate in MAML. We mathematically derived a theoretical interpretation and designed a novel Protonet-style classification algorithm (a generic sketch follows below) to empirically analyze the latent representations. We earned a 100% on the project (as sophomores) in a class composed mostly of graduate students.
I contributed to all aspects of the project, from ideation to implementation to visualization.
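For context on the empirical side: Prototypical Networks (Snell et al., 2017) classify by building one prototype per class, the mean of that class's support embeddings, and assigning each query to its nearest prototype. The sketch below is a minimal, generic version of that scheme applied to extracted latent features; it is meant only to illustrate the "Protonet-style" idea, not to reproduce our exact analysis algorithm, and all names in it are illustrative.

```python
import numpy as np

def prototype_classify(support_feats, support_labels, query_feats):
    """Generic Protonet-style classification over latent features.

    Illustrative sketch only; names and structure are not from the
    project's codebase.

    support_feats: (n_support, d) latent features from a trained model
    support_labels: (n_support,) integer class labels
    query_feats: (n_query, d) latent features to classify
    """
    classes = np.unique(support_labels)
    # Class prototype = mean embedding of that class's support examples.
    prototypes = np.stack(
        [support_feats[support_labels == c].mean(axis=0) for c in classes]
    )
    # Squared Euclidean distance from every query to every prototype.
    dists = ((query_feats[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    # Predict the class of the nearest prototype.
    return classes[dists.argmin(axis=1)]
```

A prototype head like this has no trainable parameters of its own, which is what makes it attractive for probing representations: its accuracy directly reflects how well-clustered the latent features already are.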
Context
MAML is a tremendously popular meta-learning algorithm built around a bi-level optimization problem (an inner and an outer loop). While this structure makes the algorithm flexible and effective across a wide range of use cases, it also makes it harder to interpret and understand than typical learning algorithms. Motivated by the mysterious phenomenon that the optimal inner learning rate can actually be negative, we set out to investigate and interpret the role of the inner learning rate in MAML. Given our compute limits, we mainly used a classic meta-learning setup with the Mini-ImageNet dataset (more details in the report).
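Concretely, the standard MAML formulation (Finn et al., 2017) nests the two loops as follows, with alpha the inner (adaptation) learning rate in question and beta the outer learning rate:

```latex
% Inner loop: adapt \theta to task \mathcal{T}_i with one gradient
% step on the task's support loss, at adaptation learning rate \alpha
\theta_i' = \theta - \alpha \, \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)

% Outer loop: update \theta using the post-adaptation (query) losses
\theta \leftarrow \theta - \beta \, \nabla_\theta \sum_{\mathcal{T}_i}
    \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})

% A negative \alpha makes the inner step ascend the task loss:
% the puzzle is why this can still yield good adapted performance.
```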
Given the current literature, how can we interpret MAML's inner learning rate in a way that explains the mysterious negative-learning-rate phenomenon?
Outcome
(Read the paper linked on the cover for a detailed explanation.)
Theoretical Derivation