Image Classification and Detection

Few-Shot Learning

Supervised machine learning relies on the existence of enough labeled data to train a computational algorithm for pattern recognition, classification, and other applications. The most promising models for machine translation, speech recognition, and object recognition are deep neural networks, which require large numbers of training examples to achieve good performance.
Most datasets created for training models have been developed using crowd-sourcing. While cheap and often fast, crowd-sourcing is frequently not possible when working with proprietary or sensitive data. As such, developing a sufficiently large dataset for use cases involving proprietary or sensitive data can be a time- and cost-intensive endeavor.
In addition, machine learning models can be very sensitive to small changes in their operating environment or data sources. As such, updated datasets are needed to keep a machine learning model relevant over time as new technologies become available or use cases evolve.
The DARPA Learning with Less Labels (LwLL) program aims to make the process of training machine learning models more efficient by reducing the amount of labeled data needed to build a model or adapt it to new environments. In the context of this program, we are contributing Probabilistic Model Components to support LwLL. In particular, our most recent contribution has been Simple CNAPS, a visual classification architecture that has up to 9.2% fewer trainable parameters than CNAPS and performs up to 6.1% better than the state of the art on the standard few-shot image classification benchmark dataset.
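The core idea behind Simple CNAPS is to replace a parametric classifier head with a class-conditional Mahalanobis-distance classifier built from the few labeled support examples. Below is a minimal NumPy sketch of that classifier head, assuming feature embeddings have already been extracted by an adapted backbone; the function name and the identity-matrix regularizer are illustrative, not the exact formulation from the paper.

```python
import numpy as np

def mahalanobis_classify(support_feats, support_labels, query_feats, n_classes):
    """Classify query embeddings by squared Mahalanobis distance to class
    means, using regularized class-conditional covariance estimates.
    A simplified sketch of the Simple CNAPS classifier head."""
    n, d = support_feats.shape
    # Task-level covariance, shared across classes.
    centered = support_feats - support_feats.mean(axis=0)
    task_cov = centered.T @ centered / max(n - 1, 1)
    logits = np.empty((query_feats.shape[0], n_classes))
    for k in range(n_classes):
        feats_k = support_feats[support_labels == k]
        n_k = feats_k.shape[0]
        mu_k = feats_k.mean(axis=0)
        centered_k = feats_k - mu_k
        class_cov = centered_k.T @ centered_k / max(n_k - 1, 1)
        # Blend class- and task-level covariance; the class estimate gets
        # more weight as the number of support examples grows.
        lam = n_k / (n_k + 1.0)
        cov_k = lam * class_cov + (1.0 - lam) * task_cov + np.eye(d)
        diff = query_feats - mu_k  # shape (n_query, d)
        # Negative squared Mahalanobis distance serves as the class logit.
        logits[:, k] = -np.einsum('nd,de,ne->n', diff, np.linalg.inv(cov_k), diff)
    return logits.argmax(axis=1)
```

Because the class means and covariances are computed directly from the support set rather than learned, this head adds no trainable parameters, which is where the parameter savings over CNAPS come from.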

Hard Attention in Visual Classification

Hard visual attention is a promising approach to reducing the computational burden of modern computer vision methods. Hard attention mechanisms are typically non-differentiable; they can be trained with reinforcement learning, but the high-variance training this entails hinders wider adoption.
Our group recently framed hard attention for image classification as a Bayesian optimal experimental design (BOED) problem, and used it to generate 'near-optimal' sequences of attention locations. These sequences can be used to partially supervise, and therefore speed up, the training of a hard attention mechanism.
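In the BOED framing, each glimpse is an experiment, and the next attention location is chosen to maximize the expected information gain (EIG) about the class label. The sketch below illustrates that objective under stated assumptions: `sample_glimpse` and `posterior` are hypothetical stand-ins for a learned generative model over unobserved image patches and a classifier conditioned on the glimpses seen so far.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of a categorical distribution (last axis)."""
    return -(p * np.log(p + eps)).sum(axis=-1)

def expected_info_gain(prior_probs, candidate_locs, sample_glimpse, posterior,
                       n_samples=32):
    """Score each candidate attention location by the expected reduction in
    class-label entropy -- the BOED objective for glimpse selection.
    `sample_glimpse(loc)` draws plausible glimpse contents at `loc`;
    `posterior(loc, glimpse)` returns updated class probabilities.
    Both are placeholders for learned models."""
    prior_h = entropy(prior_probs)
    eig = np.empty(len(candidate_locs))
    for i, loc in enumerate(candidate_locs):
        # Monte Carlo estimate of the expected posterior entropy at `loc`.
        post_h = np.mean([entropy(posterior(loc, sample_glimpse(loc)))
                          for _ in range(n_samples)])
        eig[i] = prior_h - post_h
    return eig

# The next glimpse is placed at the highest-scoring location:
#   next_loc = candidate_locs[np.argmax(expected_info_gain(...))]
```

Sequences of locations selected greedily by this criterion are what we use as partial supervision for the hard attention mechanism, sidestepping much of the high-variance reinforcement learning.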

Contributors

PI: Frank Wood (UBC)

Co-PI: Leonid Sigal (UBC)

Students: Peyman Bateni, Vaden Masrani, Will Harvey, Raghav Goyal, Siddhesh Khandelwal

Supported by

Charles River Analytics (CRA)
DARPA

Publications