Research Paper Researchia:202512.25746498 [Computer Vision > Computer Science]

One-shot learning (computer vision)

Prof. Hans Mueller (University of Berlin)

Abstract

One-shot learning is an object categorization problem, found mostly in computer vision. Whereas most machine learning-based object categorization algorithms require training on hundreds or thousands of examples, one-shot learning aims to classify objects from one, or only a few, examples. The term few-shot learning is also used for these problems, especially when more than one example is needed.

== Motivation ==
Humans have been shown to learn object categories quickly and from few examples. It is estimated that a child learns almost all of the 10,000 to 30,000 object categories in the world by age six. This is due not only to the human mind's computational power, but also to its ability to synthesize and learn new object categories from existing information about different, previously learned categories. Consider two examples from two object categories: the first an unknown object composed of familiar shapes, the second an unknown, amorphous shape. Humans find it much easier to recognize the former than the latter, which suggests that they make use of previously learned categories when learning new ones. The key motivation for one-shot learning is that systems, like humans, can use knowledge about object categories to classify new objects.

== Background ==
As with most classification schemes, one-shot learning involves three main challenges:

* Representation: How should objects and categories be described?
* Learning: How can such descriptions be created?
* Recognition: How can a known object be filtered from enveloping clutter, irrespective of occlusion, viewpoint, and lighting?

One-shot learning differs from single object recognition and standard category recognition algorithms in its emphasis on knowledge transfer, which makes use of previously learned categories.

* Model parameters: Reuses model parameters, based on the similarity between old and new categories. Categories are first learned on numerous training examples; new categories are then learned by transforming model parameters from those initial categories or by selecting relevant parameters for a classifier.
* Feature sharing: Shares parts or features of objects across categories. One algorithm extracts "diagnostic information" in patches from already learned categories by maximizing the patches' mutual information, and then applies these features to the learning of a new category. A dog category, for example, may be learned in one shot from previous knowledge of horse and cow categories, because dog objects may contain similar distinguishing patches.
* Contextual information: Appeals to global knowledge of the scene in which the object appears. Such global information can be used as frequency distributions in a conditional random field framework to recognize objects. Alternatively, context can take into account camera height and scene geometry. Algorithms of this type have two advantages: first, they learn object categories that are relatively dissimilar; second, they perform well in ad hoc situations where an image has not been hand-cropped and aligned.
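The feature-sharing idea can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the algorithm referred to above: it assumes each candidate patch has already been reduced to a binary "detected / not detected" value per training image, and it ranks patches by the mutual information between that value and the category label. The patch names and detection values are hypothetical toy data.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in bits) between a discrete feature and class labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, c), count in joint.items():
        p_fc = count / n
        p_f = feature_counts[f] / n
        p_c = label_counts[c] / n
        mi += p_fc * math.log2(p_fc / (p_f * p_c))
    return mi


# Hypothetical toy data: whether each patch was detected (1) or not (0)
# in six training images from two already learned categories.
labels = ["horse", "horse", "horse", "cow", "cow", "cow"]
patch_detections = {
    "leg_patch":  [1, 1, 1, 1, 1, 1],  # present in every image: uninformative
    "mane_patch": [1, 1, 1, 0, 0, 0],  # separates the two categories perfectly
    "spot_patch": [0, 1, 0, 1, 1, 0],  # only weakly informative
}

# Rank patches from most to least informative about the category.
ranked = sorted(
    patch_detections,
    key=lambda name: mutual_information(patch_detections[name], labels),
    reverse=True,
)
print(ranked)
```

In this toy setting the highest-ranked patches are the ones that would be carried over as candidate features when a new category, such as the dog example, has to be learned from a single image.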

== Theory ==
The Bayesian one-shot learning algorithm represents the foreground and background of images as parametrized by a mixture of constellation models. During the learning phase, the parameters of these models are learned using a conjugate density parameter posterior and Variational Bayesian Expectation–Maximization (VBEM). In this stage, the previously learned object categories inform the choice of model parameters via transfer by contextual information. For object recognition on new images, the posterior obtained during the learning phase is used in a Bayesian decision framework to estimate the ratio of p(object | test, train) to p(background clutter | test, train), where p is the probability of the outcome.
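To see why a conjugate prior built from previously learned categories helps when only one example is available, consider the following minimal sketch. It is not VBEM and not a constellation model: it assumes a single scalar feature with Gaussian noise of known variance and a Gaussian prior on its mean, so the posterior has a closed form. The function name `normal_posterior` and all numbers are illustrative assumptions.

```python
def normal_posterior(prior_mean, prior_var, observations, obs_var):
    """Conjugate update for the mean of a Gaussian with known variance.

    Returns the mean and variance of the Gaussian posterior over the mean.
    """
    n = len(observations)
    post_precision = 1.0 / prior_var + n / obs_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + sum(observations) / obs_var)
    return post_mean, post_var


one_example = [2.0]  # a single training observation of the new category

# Informative prior, standing in for knowledge from previously learned categories.
informed_mean, informed_var = normal_posterior(0.0, 1.0, one_example, obs_var=1.0)

# Very broad prior, standing in for learning the category from scratch.
vague_mean, vague_var = normal_posterior(0.0, 100.0, one_example, obs_var=1.0)

print(informed_mean, informed_var)
print(vague_mean, vague_var)
```

With the informative prior, the single example moves the estimate only part of the way and the posterior variance stays small; with the broad prior, the estimate essentially copies the one observation and remains almost as uncertain as the noise itself.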

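=== Bayesian framework ===
Given the task of finding a particular object in a query image, the overall objective of the Bayesian one-shot learning algorithm is to compare the probability that the object is present with the probability that only background clutter is present. If the former probability is higher, the algorithm reports the object's presence; otherwise, it reports its absence. To compute these probabilities, the object class must be modeled from a small set of (1 to 5) training images containing examples.

To formalize these ideas, let $I$ be the query image, which contains either an example of the foreground category $O_{fg}$ or only background clutter of a generic background category $O_{bg}$. Also let $I_{t}$ be the set of training images used as the foreground category. The decision of whether $I$ contains an object from the foreground category or only clutter from the background category is based on the ratio

$$R = \frac{p(O_{fg}|I,I_{t})}{p(O_{bg}|I,I_{t})} = \frac{p(I|I_{t},O_{fg})\,p(O_{fg})}{p(I|I_{t},O_{bg})\,p(O_{bg})},$$

where the class posteriors have been expanded by Bayes' theorem into a ratio of likelihoods and a ratio of category priors. The image $I$ is judged to contain an object from the foreground category if $R$ exceeds a threshold $T$.

We next introduce parametric models for the foreground and background categories with parameters $\theta$ and $\theta_{bg}$, respectively. The foreground model is learned during the learning stage from $I_{t}$, as well as prior information of learned categories. The background model we assume to be uniform across images. Omitting the constant ratio of category priors, $\frac{p(O_{fg})}{p(O_{bg})}$, and parametrizing over $\theta$ and $\theta_{bg}$ yields

$$R \propto \frac{\int p(I|\theta,O_{fg})\,p(\theta|I_{t},O_{fg})\,d\theta}{\int p(I|\theta_{bg},O_{bg})\,p(\theta_{bg}|I_{t},O_{bg})\,d\theta_{bg}} = \frac{\int p(I|\theta)\,p(\theta|I_{t},O_{fg})\,d\theta}{\int p(I|\theta_{bg})\,p(\theta_{bg}|I_{t},O_{bg})\,d\theta_{bg}},$$

having simplified $p(I|\theta,O_{fg})$ and $p(I|\theta_{bg},O_{bg})$ to $p(I|\theta)$ and $p(I|\theta_{bg})$.

The posterior distribution of the model parameters given the training images, $p(\theta|I_{t},O_{fg})$, is estimated in the learning phase. In this estimation, one-shot learning differs sharply from more traditional Bayesian estimation models that approximate the integral as $\delta(\theta^{ML})$. Instead, it uses a variational approach using prior information from previously learned categories. However, the traditional maximum likelihood estimation of the model parameters is used for the background model and the categories learned in advance through training.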
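The decision rule in this section can be illustrated with a deliberately simplified sketch. It assumes a one-dimensional Gaussian feature in place of an image, a Gaussian posterior over the foreground parameter (so the integral over the parameter has a closed form), and a background model with fixed point-estimate parameters, mirroring the maximum likelihood treatment of the background. All function names, parameter names, and numbers are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def likelihood_ratio(x, post_mean, post_var, obs_var, bg_mean, bg_var):
    """Ratio R of foreground to background likelihood for a toy 1-D model."""
    # Foreground: integrating N(x; theta, obs_var) against the Gaussian posterior
    # N(theta; post_mean, post_var) gives N(x; post_mean, obs_var + post_var).
    foreground = gaussian_pdf(x, post_mean, obs_var + post_var)
    # Background: parameters are fixed point estimates, which corresponds to
    # replacing the posterior over the background parameters with a delta function.
    background = gaussian_pdf(x, bg_mean, bg_var)
    return foreground / background


def contains_object(x, threshold, **model):
    """Report the object as present when R exceeds the threshold."""
    return likelihood_ratio(x, **model) > threshold


# Hypothetical model parameters for illustration only.
model = dict(post_mean=1.0, post_var=0.5, obs_var=1.0, bg_mean=0.0, bg_var=4.0)

near_decision = contains_object(1.0, threshold=1.0, **model)  # query close to the foreground model
far_decision = contains_object(6.0, threshold=1.0, **model)   # query far from the foreground model
print(near_decision, far_decision)
```

Keeping the posterior variance in the foreground term is what distinguishes this from a pure point-estimate approach: with few training examples the posterior is wide, and the integral spreads the foreground likelihood accordingly.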

=== Object category model ===
For each query image $I$ and set of training images $I_{t}$, a constellation model is used for representation. To obtain this model for a given image $I$, first a set of N interesting regions is detected in the image using the Kadir–Brady saliency detector. Each region selected is represented by a location in the image, $X_{i}$, and a description of its appearance, $A_{i}$. Letting $X_{t}$ and $A_{t}$ (Article truncated for display)

Source

This content is sourced from Wikipedia, the free encyclopedia.

Submission:12/25/2025
Subjects:Computer Science; Computer Vision