Abstracts


Real-time Motion Template Gradients using Intel CVLib
James Davis, Gary Bradski
jdavis@media.mit.edu, gary.bradski@intel.com

Abstract
In this paper, we present an extension to the real-time motion
template research for computer vision as previously developed
in (Davis 1997). The underlying representation is a Motion History
Image (MHI) that temporally layers consecutive image silhouettes
(or motion properties) of a moving person into a single template
form. Originally, a global, label-based method was used for
recognition. In this work, we construct a more localized motion
characterization for the MHI that extracts motion orientations in
real-time: directional motion information is recovered
directly from the intensity gradients within the MHI. In addition, we
provide a few simple motion features using these orientations. The
approach presented is implemented in real-time on a standard PC
platform employing optimized routines, developed in part for this
research, from the Intel Computer Vision Library (CVLib). We conclude
with an overview of this library and also a performance evaluation
in terms of this research.
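
As a rough illustration of the gradient step, a minimal NumPy sketch of an
MHI update and orientation extraction follows; the function names, the decay
window TAU, and the thresholds are illustrative and are not the CVLib API:

    import numpy as np

    TAU = 0.5    # temporal extent of the template, in seconds (assumed)

    def update_mhi(mhi, silhouette, timestamp):
        """Layer the current silhouette into the Motion History Image:
        moving pixels take the current timestamp, stale ones decay."""
        mhi[silhouette > 0] = timestamp
        mhi[mhi < timestamp - TAU] = 0.0
        return mhi

    def motion_orientation(mhi):
        """Recover per-pixel motion direction from the MHI gradient."""
        gy, gx = np.gradient(mhi)
        valid = (gx != 0) | (gy != 0)   # mask flat, motion-free regions
        angle = np.degrees(np.arctan2(gy, gx))
        return np.where(valid, angle, np.nan)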



A Robust Recursive Factorization Method for Recovering
Structure and Motion from Live Video Frames
Takeshi Kurata, Jun Fujiki, Masakatsu Kourogi, Katsuhiko Sakaue
kurata@etl.go.jp

Abstract
This paper describes a fast and robust approach for recovering structure
and motion from video frames. It first describes a robust recursive
factorization method for affine projection. Using the Least Median of
Squares (LMedS) criterion, the method estimates the dominant 3D affine
motion and discards feature points regarded as outliers. The
computational cost of the overall procedure is reduced by combining this
robust-statistics-based method with a recursive factorization method that
provides, at each frame, the updated 3D structure of an object at a
fixed computational cost using principal component analysis. This
paper then describes experiments with synthetic data and with real image
sequences, the results of which demonstrate that the method can be used
to estimate the dominant structure and the motion robustly and in
real-time on an off-the-shelf PC.
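
The LMedS step can be sketched as follows; random minimal samples fit a 2D
affine motion between matched feature points, and the hypothesis with the
least median of squared residuals selects the inliers. The recursive
factorization and PCA update are omitted, and the names and trial count are
illustrative:

    import numpy as np

    def lmeds_affine(pts_prev, pts_curr, n_trials=200):
        """pts_prev, pts_curr: (N, 2) matched feature points. Returns
        the 2x3 affine with the least median squared residual, plus an
        inlier mask."""
        rng = np.random.default_rng()
        N = len(pts_prev)
        X = np.hstack([pts_prev, np.ones((N, 1))])  # homogeneous points
        best_med, best_A = np.inf, None
        for _ in range(n_trials):
            idx = rng.choice(N, size=3, replace=False)  # minimal sample
            A, *_ = np.linalg.lstsq(X[idx], pts_curr[idx], rcond=None)
            med = np.median(np.sum((X @ A - pts_curr) ** 2, axis=1))
            if med < best_med:
                best_med, best_A = med, A
        # standard LMedS robust scale -> inlier threshold
        sigma = 1.4826 * (1 + 5.0 / (N - 3)) * np.sqrt(best_med)
        r2 = np.sum((X @ best_A - pts_curr) ** 2, axis=1)
        return best_A.T, r2 <= (2.5 * sigma) ** 2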



Frame-Rate Pupil Detector and Gaze Tracker
C.H. Morimoto, D. Koons, A. Amir, M. Flickner
hitoshi@ime.usp.br, {dkoons, arnon, flick}@almaden.ibm.com

Abstract
We present a robust, frame-rate pupil detection technique, based on an active
illumination scheme, used for gaze estimation. The pupil detector uses two
light sources synchronized with the even and odd fields of the video signal
(interlaced frames), to create bright and dark pupil images. The
retro-reflectivity property of the eye is exploited by placing an infra-red
(IR) light source close to the camera's optical axis, resulting in an image
with bright pupils. A similar off-axis IR source generates an image with dark
pupils. Pupils are detected from the thresholded difference of the bright and
dark pupil images. After a calibration procedure, the vector from the
pupil center to the center of the corneal glints generated by the light
sources is used to estimate the gaze position. The frame-rate gaze estimator
prototype is currently being demonstrated on a docked 300 MHz IBM ThinkPad
with a PCI
frame grabber, using interlaced frames of resolution 640x480x8 bits.
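
A minimal sketch of the differential detection step follows, assuming 8-bit
field images and an illustrative threshold; the authors' exact localization
and calibration procedures are not reproduced here:

    import numpy as np

    def detect_pupil(bright, dark, thresh=40):
        """bright, dark: uint8 field images. Returns the (row, col)
        centroid of the thresholded difference, or None."""
        diff = bright.astype(np.int16) - dark.astype(np.int16)
        mask = diff > thresh          # pupil is bright only on-axis
        if not mask.any():
            return None
        rows, cols = np.nonzero(mask)
        return rows.mean(), cols.mean()

    def gaze_vector(pupil_center, glint_center):
        """The calibrated gaze mapping operates on this vector."""
        return (glint_center[0] - pupil_center[0],
                glint_center[1] - pupil_center[1])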



A Statistical Approach for Real-time Robust Background
Subtraction and Shadow Detection
Thanarat Horprasert, David Harwood, and Larry S. Davis
{thanarat, harwood, lsd}@umiacs.umd.edu

Abstract
This paper presents a novel algorithm that uses color images to detect
moving objects against a static background scene containing shading and
shadows. We develop a robust and efficiently computed background
subtraction algorithm that is able to cope with local illumination changes,
such as shadows and highlights, as well as global illumination changes.  The
algorithm is based on a proposed computational color model which separates
the brightness from the chromaticity component.  We have applied this method
to real image sequences of both indoor and outdoor scenes. We present
results that demonstrate the system's performance, together with the
speed-up techniques employed in our implementation.
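
The brightness/chromaticity decomposition can be sketched as follows; the
observed color I is split into a brightness distortion alpha along the
expected background color E and a chromaticity distortion orthogonal to it.
The thresholds are illustrative stand-ins for the statistically selected
ones described in the paper:

    import numpy as np

    def classify(I, E, tau_cd=15.0, alpha_lo=0.6, alpha_hi=1.2):
        """I, E: (H, W, 3) float arrays (observed frame, background
        mean). Labels: 0 background, 1 shadow, 2 highlight, 3 foreground."""
        alpha = np.sum(I * E, axis=2) / (np.sum(E * E, axis=2) + 1e-6)
        cd = np.linalg.norm(I - alpha[..., None] * E, axis=2)
        labels = np.full(I.shape[:2], 3, dtype=np.uint8)  # foreground
        chroma_ok = cd < tau_cd
        labels[chroma_ok & (alpha >= alpha_lo) & (alpha <= alpha_hi)] = 0
        labels[chroma_ok & (alpha < alpha_lo)] = 1   # darker: shadow
        labels[chroma_ok & (alpha > alpha_hi)] = 2   # brighter: highlight
        return labels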



Real-Time Tracking of Multiple People Using Stereo
David Beymer and Kurt Konolige
{beymer, konolige}@ai.sri.com

Abstract
Recent investigations have shown the advantages of keeping multiple
hypotheses during visual tracking.  In this paper we explore an
alternative method that keeps just a single hypothesis per tracked
object for computational efficiency, but displays robust performance
and recovery from error by using segmentation provided by a stereo
module.  The method is implemented in the domain of people-tracking,
using a novel combination of stereo information for continuous
detection and intensity image correlation for tracking.  Real-time
stereo provides extended information for 3D detection and tracking,
even in the presence of crowded scenes, occluding objects, and large
scale changes.  We are able to reliably detect and track people in
natural environments, on an implemented system that runs at more than
10 Hz on standard PC hardware.
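
As a sketch of the intensity-correlation half of the tracker, a simple SSD
template search follows; the stereo detection and segmentation stages are
assumed and omitted, and the search radius is illustrative:

    import numpy as np

    def track_ssd(frame, template, prev_pos, radius=8):
        """frame: (H, W) grayscale; template: (h, w) patch from the last
        frame; prev_pos: (row, col) of its previous top-left corner."""
        template = template.astype(float)
        h, w = template.shape
        best, best_pos = np.inf, prev_pos
        r0, c0 = prev_pos
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                r, c = r0 + dr, c0 + dc
                if r < 0 or c < 0 or r + h > frame.shape[0] \
                        or c + w > frame.shape[1]:
                    continue
                patch = frame[r:r + h, c:c + w].astype(float)
                ssd = np.sum((patch - template) ** 2)
                if ssd < best:
                    best, best_pos = ssd, (r, c)
        return best_pos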



Non-parametric Model for Background Subtraction
Ahmed Elgammal, David Harwood, Larry Davis
elgammal@cs.umd.edu

Abstract
Background subtraction is a method typically used to segment moving
regions in image sequences by comparing each new frame to a model of the
scene background. We present a non-parametric background model and a
background subtraction approach. The model can handle situations where
the background of the scene is cluttered and not completely static but
contains small motions such as tree branches and bushes. The model
estimates the probability of observing pixel intensity values based on a
sample of intensity values for each pixel. The model adapts quickly to
changes in the scene, which enables very sensitive detection of moving
targets. We also show how the model can use color information to
suppress detection of shadows. The implementation of the model runs in
real-time for both gray level and color imagery. Evaluation shows that
this approach achieves very sensitive detection with very low false
alarm rates.
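
The per-pixel density estimate can be sketched as follows, assuming a
gray-level model with a fixed, illustrative kernel bandwidth and threshold
(the paper estimates the bandwidth from median absolute frame differences):

    import numpy as np

    def background_probability(frame, samples, sigma=10.0):
        """frame: (H, W) gray image; samples: (N, H, W) recent history.
        Returns the per-pixel kernel density estimate of the new value."""
        d = frame[None, ...].astype(float) - samples.astype(float)
        k = np.exp(-0.5 * (d / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        return k.mean(axis=0)

    def foreground_mask(frame, samples, thresh=1e-3):
        """Flag pixels whose observation is unlikely under the model."""
        return background_probability(frame, samples) < thresh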



Implementation of a Real-time Foreground/Background Segmentation System on the Intel Architecture
Fernando C. M. Martins, Brian R. Nickerson, Vareck Bostrom, and Rajeeb Hazra
{Fernando.Martins, Brian.Nickerson, Vareck.Bostrom, Rajeeb.Hazra}@intel.com

Abstract
Foreground/background segmentation shares the goal of blue-screen chroma
keying - separating the foreground from the background - but without the
strong requirement of a known screen behind the subject of interest.
Instead, a model of the background is built using image history and weak
prior knowledge. Because of the computation-intensive nature of model-based
segmentation algorithms, foreground/background segmentation at video rates
is a challenging problem without custom hardware or high-end workstations.
We discuss techniques used in the implementation of a real-time
foreground/background segmentation algorithm on a general-purpose,
consumer-grade PC. In particular, we demonstrate optimization techniques in
the implementation of three critical sections of our algorithm: the binary
morphological filter, the directional morphological filter, and region
flood fill. These techniques exploit the instruction set of the Pentium® II
and Pentium® III processors, allowing video segmentation of 320x240 color
frames at 25 fps. The optimized critical sections may be used immediately
in a plethora of other applications. Moreover, the optimization methodology
provides useful insight into the optimization of other image processing and
computer vision techniques, such as edge detection, object boundary
localization, and morphological pre- and post-processing.
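
As an illustration of one critical section, a scalar NumPy sketch of a 3x3
binary morphological open follows; the paper's implementation uses MMX/SSE
integer SIMD, which this sketch does not attempt to reproduce:

    import numpy as np

    def shift_and(mask):
        """3x3 erosion: a pixel survives only if its whole 3x3
        neighbourhood is set."""
        p = np.pad(mask, 1, constant_values=False)
        out = np.ones_like(mask, dtype=bool)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                out &= p[1 + dr : 1 + dr + mask.shape[0],
                         1 + dc : 1 + dc + mask.shape[1]]
        return out

    def shift_or(mask):
        """3x3 dilation: a pixel is set if any neighbour is set."""
        p = np.pad(mask, 1, constant_values=False)
        out = np.zeros_like(mask, dtype=bool)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                out |= p[1 + dr : 1 + dr + mask.shape[0],
                         1 + dc : 1 + dc + mask.shape[1]]
        return out

    def binary_open(mask):
        """Remove speckle noise from a segmentation mask."""
        return shift_or(shift_and(mask))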



A Real-Time Video Stabilizer Based on Linear Programming
Moshe Ben-Ezra, Shmuel Peleg, Michael Werman
{moshe, peleg, werman}@cs.huji.ac.il

Abstract
Real-time video stabilization is computed from point-to-line
correspondences using linear programming. The implementation of the
stabilizer requires special techniques for (i) frame grabbing, (ii)
computing point-to-line correspondences, (iii) linear-program solving
and (iv) image warping.  Timing and real-time profiling are also
addressed.
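
One plausible L1 formulation of the point-to-line problem is sketched below,
with scipy.optimize.linprog standing in for the authors' solver; the
variable layout and the affine motion model are assumptions, not the
paper's exact formulation:

    import numpy as np
    from scipy.optimize import linprog

    def stabilize_affine(points, normals, offsets):
        """points: (N, 2); normals: (N, 2) unit line normals; offsets:
        (N,) so each target line is {q : n . q = d}. Returns (M, t)."""
        N = len(points)
        # variables: [m11, m12, m21, m22, tx, ty, s_1..s_N]
        c = np.concatenate([np.zeros(6), np.ones(N)])  # min sum of slacks
        A_ub = np.zeros((2 * N, 6 + N))
        b_ub = np.zeros(2 * N)
        for i in range(N):
            (px, py), (nx, ny), d = points[i], normals[i], offsets[i]
            row = np.array([nx * px, nx * py, ny * px, ny * py, nx, ny])
            A_ub[2 * i, :6], A_ub[2 * i, 6 + i] = row, -1.0    # r_i <= s_i
            b_ub[2 * i] = d
            A_ub[2 * i + 1, :6], A_ub[2 * i + 1, 6 + i] = -row, -1.0
            b_ub[2 * i + 1] = -d                              # -r_i <= s_i
        bounds = [(None, None)] * 6 + [(0, None)] * N
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
        return res.x[:4].reshape(2, 2), res.x[4:6]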



Developing Real-Time Computer Vision Applications for Intel Pentium III based Windows NT Workstations
Ross Cutler and Larry Davis
rgc@cs.umd.edu

Abstract
In this paper, we describe our experiences in developing real-time computer
vision applications for Intel Pentium III-based Windows NT workstations.
Specifically, we discuss how to optimize code, efficiently utilize
memory and the file system, utilize multiple CPUs, acquire video input, and
benchmark code. Intrinsic soft real-time features of Windows NT are
discussed, as well as hard real-time extensions. An optimized real-time
optical flow application is presented as an example. Empirical results on
memory subsystem and cache scheduling issues are also reported.
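
As a generic illustration of the benchmarking advice, a portable timing
harness follows; the paper targets Win32 facilities such as
QueryPerformanceCounter, for which time.perf_counter is the stand-in here:

    import time

    def benchmark(fn, *args, warmup=3, trials=20):
        """Return the minimum wall-clock time over `trials` runs."""
        for _ in range(warmup):
            fn(*args)                # prime caches and lazy allocations
        best = float("inf")
        for _ in range(trials):
            t0 = time.perf_counter()
            fn(*args)
            best = min(best, time.perf_counter() - t0)
        return best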



Fast Image-Based Tracking by Selective Pixel Integration
Frank Dellaert and Robert Collins
dellaert@ux2.sp.cs.cmu.edu

Abstract
We provide a fast algorithm to perform image-based tracking, which relies on
the selective integration of a small subset of pixels that carry a large
amount of information about the state variables to be estimated. The
resulting dramatic decrease in the number of pixels to process yields a
substantial speedup of the basic tracking algorithm. We have used this new
method within a surveillance application, where it will enable new
capabilities of the system, namely real-time, dynamic background subtraction
from a panning and tilting camera.
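
A minimal sketch of the selection idea follows, using image gradient
magnitude as an illustrative stand-in for the true sensitivity of each
pixel to the state variables; the tracking update itself is omitted:

    import numpy as np

    def select_pixels(reference, k=500):
        """Return (row, col) indices of the k most informative pixels."""
        gy, gx = np.gradient(reference.astype(float))
        score = np.hypot(gx, gy)      # steep pixels constrain motion best
        flat = np.argsort(score, axis=None)[-k:]   # top-k scores
        return np.unravel_index(flat, reference.shape)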



Local Application of Optic Flow to Analyse Rigid versus Non-Rigid Motion
Alan J. Lipton
ajl@cs.cmu.edu

Abstract
Optic flow has been a research topic of interest for many years. It
has, until recently, been largely inapplicable to real-time video
applications due to its computationally expensive nature. This paper presents a new,
reliable flow technique called dynamic region matching, based on
the work of Anandan, Lucas and Kanade, and Okutomi and Kanade, which can
be combined with a motion detection algorithm (from stationary or
stabilised camera image streams) to allow flow-based analyses of moving entities in
real-time. If flow vectors need only be calculated for "moving" pixels,
then the computation time is greatly reduced, making it applicable to
real-time implementation on modest computational platforms (such as standard Pentium
II based PCs). Applying this flow technique to moving entities provides
some straightforward primitives for analysing the motion of those objects.
Specifically, in this paper, methods are presented for: analysing rigidity
and cyclic motion using residual flow; and determining
self-occlusion and disambiguating multiple, mutually occluding entities using pixel
contention.
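
The central speedup can be sketched as follows, with simple SSD block
matching standing in for dynamic region matching; window and search sizes
are illustrative:

    import numpy as np

    def sparse_flow(prev, curr, moving_mask, win=3, search=4):
        """Map (row, col) of each moving pixel to its best (dr, dc)
        displacement found by SSD block matching."""
        H, W = prev.shape
        flow = {}
        for r, c in zip(*np.nonzero(moving_mask)):
            if (r < win + search or c < win + search or
                    r >= H - win - search or c >= W - win - search):
                continue
            patch = prev[r - win:r + win + 1,
                         c - win:c + win + 1].astype(float)
            best, best_v = np.inf, (0, 0)
            for dr in range(-search, search + 1):
                for dc in range(-search, search + 1):
                    cand = curr[r + dr - win:r + dr + win + 1,
                                c + dc - win:c + dc + win + 1]
                    ssd = np.sum((patch - cand.astype(float)) ** 2)
                    if ssd < best:
                        best, best_v = ssd, (dr, dc)
            flow[(r, c)] = best_v
        return flow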



Plausible reality for real-time immersion in the virtual arena
Simon M. Rowe
smr@cre.canon.co.uk

Abstract
Over the last three years there has been increased interest in
photo-realistic modelling and rendering techniques and a surge in popularity
of image-based rendering. These techniques aim to accurately model reality
in order to generate virtual imagery that is indistinguishable from it. In
this paper we introduce the concept of plausible reality. The aim of
plausible reality is also to generate virtual imagery that is
indistinguishable from reality, but without necessarily duplicating it. The
key benefit of plausible reality over duplicating reality is that it can be
done in real-time with simple computer vision techniques. We demonstrate a
plausible reality system running in real time on an off-the-shelf PC.