[FoRK] Fast, Accurate Detection of 100,000 Object Classes on a Single Machine

Eugen Leitl eugen at leitl.org
Thu Jul 25 08:50:19 PDT 2013


Fast, Accurate Detection of 100,000 Object Classes on a Single Machine

Abstract: Many object detection systems are constrained by the time required
to convolve a target image with a bank of filters that code for different
aspects of an object's appearance, such as the presence of component parts.
We exploit locality-sensitive hashing to replace the dot-product kernel
operator in the convolution with a fixed number of hash-table probes that
effectively sample all of the filter responses in time independent of the
size of the filter bank. To show the effectiveness of the technique, we apply
it to evaluate 100,000 deformable-part models requiring over a million (part)
filters on multiple scales of a target image in less than 20 seconds using a
single multi-core processor with 20GB of RAM. This represents a speed-up of
approximately 20,000 times - four orders of magnitude - when compared with
performing the convolutions explicitly on the same hardware. While mean
average precision (mAP) over the full set of 100,000 object classes is around
0.16, due in large part to the challenges in gathering training data and
collecting ground truth for so many classes, we achieve a mAP of at least 0.20
on a third of the classes and 0.30 or better on about 20% of the classes.
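
To make the hashing idea concrete, here is a minimal sketch of swapping
exhaustive dot products for a fixed number of hash-table probes. The abstract
does not say which hash family is used, so this sketch assumes
random-hyperplane (sign) hashing as a stand-in. The class HyperplaneLSH, its
parameters (n_tables, n_bits, min_votes) and the toy random "filters" are all
hypothetical names and data chosen for illustration, not the authors' code.

```python
import random
from collections import defaultdict


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    """Scale a vector to unit length."""
    norm = dot(v, v) ** 0.5
    return [a / norm for a in v]


class HyperplaneLSH:
    """Hypothetical index: n_tables hash tables, each keyed by n_bits sign bits."""

    def __init__(self, dim, n_tables=8, n_bits=12, seed=0):
        rng = random.Random(seed)
        # One independent set of random hyperplanes per table.
        self.planes = [
            [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]

    @staticmethod
    def _key(planes, v):
        # Each bit records which side of a hyperplane the vector falls on.
        return tuple(dot(p, v) >= 0 for p in planes)

    def add(self, filter_id, v):
        """Store a filter id in one bucket of every table."""
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, v)].append(filter_id)

    def candidates(self, v, min_votes=2):
        """Probe each table once; keep filters that collide in >= min_votes tables."""
        votes = defaultdict(int)
        for planes, table in zip(self.planes, self.tables):
            for fid in table.get(self._key(planes, v), ()):
                votes[fid] += 1
        return {fid: n for fid, n in votes.items() if n >= min_votes}


# Toy filter bank: 1000 random unit vectors standing in for part filters.
rng = random.Random(1)
filters = [unit([rng.gauss(0, 1) for _ in range(32)]) for _ in range(1000)]

index = HyperplaneLSH(dim=32)
for i, f in enumerate(filters):
    index.add(i, f)

# Pretend this image window looks exactly like filter 42.
window = filters[42]
cands = index.candidates(window)

# Exact dot products are computed only for the few surviving candidates.
best = max(cands, key=lambda fid: dot(filters[fid], window))
```

The number of probes per window is n_tables, whatever the size of the filter
bank; only the filters that collide often enough get an exact dot product. In
this toy version the bucket sizes still grow with the number of filters, so it
illustrates the probing idea rather than the reported speed-up.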
