A Non-cooperative Long-range Biometric System for Maritime Surveillance

Xiaokun Li (a), Genshe Chen (a), Qiang Ji (b), Erik Blasch (c)
(a) DCM Research Resources, LLC, Germantown, MD, USA
(b) Rensselaer Polytechnic Institute, Troy, NY, USA
(c) RYAA Evaluation Branch, Air Force Research Laboratories, Dayton, OH, USA


To address the challenges of non-cooperative long-distance human identification and verification, we propose an innovative cost-efficient system for automatic long-range biometric recognition of non-cooperative individuals in 24/7 operations. The system has three cameras. One is a wide field of view (WFOV) CCD video camera with an infrared (IR) filter and powerful IR illuminators for scanning humans over a wide area at a long distance. The other two are high-resolution video cameras with a narrow field of view (NFOV) and IR filters and illuminators, mounted on a pan-tilt unit (PTU) to capture the frontal view of the human face and iris, respectively. Once the frontal views of moving individuals are captured by the NFOV cameras, the face/iris models are extracted and classified by state-of-the-art face/iris recognizers. The hardware of the biometric system also includes one FPGA, three DSP processors, and one Zigbee module for fast bio-data analysis and wireless data transmission.

1. Introduction

Reliable personnel authentication, identification, and verification methods are becoming a necessity in today’s life for security and device control. Biometrics can be defined as the automatic identification of an individual based on one or more of his/her intrinsic physical or behavioral traits. Biometric technologies measure and recognize human physical and behavioral characteristics for authentication purposes. Some of the most common physical characteristics include fingerprints, irises, and facial patterns. For instance, the human face has proven to be a reliable biometric in e-passports (Singapore, Russia, Germany), airports (UK, France, USA), and commercial agencies (banks, etc.). It is not difficult to foresee that in the near future, activities in everyone’s life, not just computer users’ lives, will require biometric authentication. Because of the heightened importance of security, there has been considerable work on non-cooperative biometric system development, especially on face/iris recognition at a distance [1], [2]. However, most existing face/iris recognition systems, especially those for iris recognition [3], [4], require a cooperative individual in a controlled environment. While these techniques are capable of identifying cooperative subjects, they have limited capability for identifying non-cooperative subjects in applications such as surveillance, where the observed individuals are non-cooperating and non-habituated.

In this paper, we propose an innovative scheme for developing an automatic long-range biometric recognition system that combines face recognition (the distance between the subject and the camera can be as far as 50m) and iris recognition (up to 3m) of non-cooperative individuals in 24/7 operations, especially for ship-to-ship surveillance.

2. Biometric system design

In this section, we briefly introduce the overall system design. The sensing (imaging) unit used in this biometric system consists of three cameras. One is a wide field of view (WFOV) video camera for scanning individuals at a long distance (up to 50m) over a wide area in 24/7 operations. The other two are high-resolution video cameras with a narrow field of view (NFOV), mounted on a pan-tilt unit (PTU) to capture the frontal view of the human face and iris, respectively. Once a human subject has been detected in the WFOV images, the PTU points the NFOV video camera for face identification at the person. The detected face is then either entered into a database for modeling or automatically identified by a state-of-the-art face recognizer that compares it with the face models available in the database. A non-cooperative individual is recognized when the recognition has sufficient confidence. When the person is close enough to the imaging system (say, within 3 meters), the other NFOV video camera is activated and pointed at the person’s face to capture and recognize the iris images, which verifies the facial recognition results.
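The camera hand-off described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the 50m and 3m ranges come from the text, while the confidence threshold, the availability of a range estimate, and all names (`Observation`, `next_action`, the action labels) are hypothetical and not part of the reported system.

```python
from dataclasses import dataclass

FACE_RANGE_M = 50.0    # from the text: face recognition up to 50 m
IRIS_RANGE_M = 3.0     # from the text: iris capture within about 3 m
CONF_THRESHOLD = 0.5   # hypothetical: the paper does not state a value


@dataclass
class Observation:
    distance_m: float        # assumed to be available from some range estimate
    face_confidence: float   # face recognizer score in [0, 1]


def next_action(obs: Observation) -> str:
    """Choose the next step for a subject already detected in the WFOV image."""
    if obs.distance_m > FACE_RANGE_M:
        return "scan_wfov"            # too far: keep scanning with the WFOV camera
    if obs.distance_m <= IRIS_RANGE_M:
        return "verify_iris"          # close enough: iris camera verifies the face result
    if obs.face_confidence >= CONF_THRESHOLD:
        return "report_identity"      # face match is confident enough
    return "track_face"               # keep the face NFOV camera on the subject
```

For example, `next_action(Observation(20.0, 0.3))` keeps tracking the face, and the same subject at 2.5m would trigger iris verification.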

Figure 1 Bio-data acquisition and recognition

The imaging sensors of the system include one WFOV camera (PC810IR) and two NFOV cameras (Pulnix TM-4000CL and Pulnix TM-9700). The PC810IR has a durable metal housing rated IP66 weatherproof and a built-in auto-activation IR array that can reach a range of 100m through the clear impact-resistant front dome lens for large-area scanning and target detection. The camera has an auto-iris 7.5mm to 50mm zoom lens and a Sony CCD chipset with over 540 lines of resolution. The resolution of the NFOV camera for face detection, tracking, and recognition (Pulnix TM-9700) can be up to 525 lines at 30 fps. The resolution of the other NFOV camera (Pulnix TM-4000CL), used for iris recognition, can reach 2048 x 2048 at 15 fps, and its DOV (depth of view) is 0.1m. The two NFOV cameras are paired with powerful IR filters and illuminators for night vision. The use of the proposed imaging sensors for bio-data collection is illustrated in Figure 1.

In the system, the function of the PTU is to control and adjust (i.e. image zoom-in/zoom-out) the two NFOV cameras so that they track the human body and head for face and iris image acquisition, respectively. Many commercial PTU products are available on the market. We use a wire-controlled pan-tilt unit [5] as the platform on which the two NFOV cameras are mounted. The control signals of the PTU are generated automatically by the central signal processor of the system.
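As a rough illustration of how a detection in the WFOV image could be turned into PTU commands, the sketch below converts a pixel position into pan and tilt angles. The paper does not describe this computation; the pinhole-camera model, the co-located cameras, the field-of-view values in the usage note, and the function name `pan_tilt_deg` are all assumptions made for the example.

```python
import math


def pan_tilt_deg(px, py, width, height, hfov_deg, vfov_deg):
    """Angles (degrees) that would centre pixel (px, py) under a pinhole model.

    Assumes the PTU axes pass through the WFOV camera centre and that
    hfov_deg / vfov_deg are the full horizontal / vertical fields of view.
    """
    # Focal lengths expressed in pixels, derived from the fields of view.
    fx = (width / 2) / math.tan(math.radians(hfov_deg) / 2)
    fy = (height / 2) / math.tan(math.radians(vfov_deg) / 2)
    pan = math.degrees(math.atan((px - width / 2) / fx))
    # Image rows grow downward, so a target above centre gives a positive tilt.
    tilt = -math.degrees(math.atan((py - height / 2) / fy))
    return pan, tilt
```

With a hypothetical 40 by 30 degree field of view on a 512 x 480 image, a target at the right-hand image edge and vertical centre maps to a pan of 20 degrees and no tilt.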

The function of the central processor used in the system is to perform real-time and fully automatic image enhancement, human detection, face/eye tracking, and face/iris recognition. All bio-image/video processing and analysis algorithms are optimized and integrated into a high-speed and cost-efficient processing unit, which includes three Digital Signal Processing (DSP) chips and one FPGA chip. The architecture of the central processor is illustrated in Figure 2. The processing unit consists of a PCB system board which integrates video A/D converters, high-speed data-bus controllers, an FPGA chip for fast video enhancement and human detection and tracking, a DSP chip for face detection and tracking, and two DSP chips for face recognition and iris recognition, respectively. The input video stream is first processed by the FPGA chip for image enhancement and human detection and tracking. Then, a DSP chip is used for frontal view extraction and face tracking. The other two DSP chips process the region of interest (facial area) for face and iris recognition simultaneously. In our prototype system, the TI TMS320DM642 is chosen for face detection and face/iris recognition, and the Xilinx LX85 is selected for image enhancement and human subject detection.

Figure 2 Central processor

As shown in Figure 1, the proposed bio-recognition system can be integrated into a portable device and placed on a small vessel for long-distance maritime surveillance and bio-data collection. It is impractical to install a large bio-database and a high-speed bio-search engine in the portable device. The collected bio-data and processing results (e.g. the extracted bio-features and the recognition results based on a small database stored in the portable device) therefore need to be transmitted to a remote center or other distributed bio-processing nodes for further identification and verification. Zigbee, a reliable and efficient wireless network module, is selected for wireless bio-data transmission in the system.
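A back-of-the-envelope calculation suggests why extracted features and results, rather than raw video, are the natural payload for such a link. The sketch below uses the 512 × 480, 25 fps video reported in the tests later in this paper and the nominal 250 kbit/s data rate of IEEE 802.15.4 at 2.4 GHz, on which Zigbee is built. The 8-bit uncompressed pixel format and the 512-value feature vector are assumptions chosen only for illustration.

```python
# Video format reported for the tests in this paper.
WIDTH, HEIGHT, FPS = 512, 480, 25
BITS_PER_PIXEL = 8       # assumption: 8-bit grayscale, uncompressed
ZIGBEE_BPS = 250_000     # nominal IEEE 802.15.4 rate at 2.4 GHz; usable throughput is lower

FEATURE_DIM = 512        # hypothetical feature-vector length
BITS_PER_VALUE = 32      # hypothetical: 32-bit floating-point values


def raw_video_bps(width, height, bits_per_pixel, fps):
    """Bit rate of uncompressed video."""
    return width * height * bits_per_pixel * fps


def seconds_to_send(bits, link_bps):
    """Ideal transfer time, ignoring protocol overhead and retransmissions."""
    return bits / link_bps


raw_bps = raw_video_bps(WIDTH, HEIGHT, BITS_PER_PIXEL, FPS)
feature_bits = FEATURE_DIM * BITS_PER_VALUE

print("raw video bit rate (bit/s):", raw_bps)
print("ratio to nominal link rate:", raw_bps / ZIGBEE_BPS)
print("time to send one feature vector (s):", seconds_to_send(feature_bits, ZIGBEE_BPS))
```

Under these assumptions the raw stream exceeds the nominal link rate by a factor of roughly 200, while a single feature vector fits in well under a tenth of a second.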

3. Biometric image tracking and recognition (BITAR) scheme

In our BITAR scheme, as shown in Figure 3, human face and eye detection are processed in one loop, which makes the recognition system faster and more accurate. Also, during the head tracking procedure, we only check whether the current frontal view of the human subject is valid for face/iris recognition. Only recognizable face and iris images are selected for human authentication, identification, and verification. To guarantee fast and accurate face/iris recognition, we select state-of-the-art face and iris recognizers and focus on obtaining high-quality face/iris images by applying advanced video stabilization and deblurring algorithms, which is especially important for outdoor, real-life applications.


Figure 3 Block diagram of the BITAR non-cooperative long-range recognition scheme

3.1. Video stabilization and deblurring

Camera/platform motion and/or the human subject’s motion induce video view-jitter and image blur, which cause many difficulties for human face/iris detection, tracking, and recognition. To meet these biometrics-specific challenges, we developed an efficient stabilization and deblurring algorithm, a simplified version of the recent work [6], to stabilize and enhance the incoming video. To achieve a steadier and cleaner video, our stabilization and deblurring method consists of the following four steps: global motion estimation, local motion estimation, undesired motion removal, and image deblurring. Image blur caused by sensor/human-subject motion during biometric imaging decreases the image quality and thus reduces the human identification rate. Exploiting the spatial-spectral characteristics of the image for quality improvement, we employ a Wiener filter to remove the blur from the corrupted images once we have estimated the motion pattern of the camera and platform.
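To make the final deblurring step concrete, the following is a minimal one-dimensional sketch of Wiener deconvolution, in which the restored spectrum is G·conj(H) / (|H|² + NSR) for a blurred spectrum G and blur transfer function H. It is not the authors' implementation: the real system works on two-dimensional video frames and estimates the blur from the motion pattern, whereas here the blur kernel is assumed known, the blur is circular, and the constant noise-to-signal ratio `nsr` and all function names are illustrative choices.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def circular_blur(signal, kernel):
    """Circular convolution, standing in for linear motion blur."""
    n = len(signal)
    return [sum(kernel[j] * signal[(i - j) % n] for j in range(len(kernel)))
            for i in range(n)]


def wiener_deblur(blurred, kernel, nsr=1e-3):
    """Wiener deconvolution with a constant noise-to-signal ratio (assumed)."""
    n = len(blurred)
    h = list(kernel) + [0.0] * (n - len(kernel))   # zero-pad kernel to signal length
    H = dft(h)
    G = dft(blurred)
    F = [g * hk.conjugate() / (abs(hk) ** 2 + nsr) for g, hk in zip(G, H)]
    return [v.real for v in dft(F, inverse=True)]


# A sharp edge pattern smeared by a 3-sample uniform motion blur.
sharp = [0.0] * 3 + [1.0] * 4 + [0.0] * 9
kernel = [1 / 3, 1 / 3, 1 / 3]
blurred = circular_blur(sharp, kernel)
restored = wiener_deblur(blurred, kernel)
```

A larger `nsr` suppresses noise amplification at frequencies where the blur response is weak, at the cost of a softer restoration.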

3.2. Target detection and tracking in WFOV images

For accurate target tracking, both the spatial context and the temporal context should be taken into account. A good detection scheme in individual frames cannot last long with a poor memory of the targets’ appearance. This is why the temporal context is needed in addition to the spatial context. To incorporate the temporal context, for each target we use an appearance model (see [7] for more details) that summarizes the appearances of the object as seen in the past. The contribution of past appearances makes the model robust to occlusions and illumination changes. The model should then be able to recognize the object when past appearances return in the future. While the model should adapt to new appearances of the object, a long-term memory of all appearances also helps to reduce the drift that usually occurs in adaptive tracking. In the target detection and tracking algorithm, we use probabilistic principal component analysis (PCA) for feature extraction and person recognition.
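The role of a long-term appearance memory can be illustrated with a deliberately simplified sketch. It is not the model of [7] and does not use probabilistic PCA: it merely stores distinct past appearance vectors and scores a candidate by its best cosine similarity to any of them, so that a returning past appearance is still recognized. The class name `AppearanceMemory`, the cosine measure, and the novelty threshold are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class AppearanceMemory:
    """Keeps every sufficiently distinct appearance seen so far for one target."""

    def __init__(self, novelty_threshold=0.9):
        self.templates = []
        self.novelty_threshold = novelty_threshold  # hypothetical value

    def score(self, feature):
        """Best match against all remembered appearances (0.0 if none yet)."""
        return max((cosine(feature, t) for t in self.templates), default=0.0)

    def update(self, feature):
        """Adapt to a new appearance without forgetting the old ones."""
        if self.score(feature) < self.novelty_threshold:
            self.templates.append(list(feature))


memory = AppearanceMemory()
memory.update([1.0, 0.0])    # first appearance of the target
memory.update([0.0, 1.0])    # very different look, e.g. after an illumination change
memory.update([1.0, 0.01])   # nearly identical to the first one, so not stored again
```

Because old templates are never overwritten, the first appearance still scores highly after the model has adapted to the second, which is the property that limits drift.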

3.3. Face detection, tracking, and recognition in NFOV images

For multi-view face detection and tracking in NFOV images, we apply Fisher Discriminant Analysis (FDA) and Recursive Non-parametric Discriminant Analysis (RNDA) to extract the statistically significant discriminant features and to minimize the misclassification errors. RNDA relaxes the Gaussian assumptions of FDA and thus can handle more general class distributions. The resulting RNDA features provide better accuracy than the commonly used Haar features. The selected features are then used to construct a piecewise linear classifier. Experiments with real video data (see our recent work in [8], [9]) show that a classifier constructed in this way can correctly detect both frontal and profile faces and eyes.

3.4. Iris recognition in NFOV images

Commercial iris recognition systems based on the algorithms developed by John Daugman have been available since 1995 and have been used in a variety of practical applications. However, all currently available systems impose substantial constraints on subject position and motion during the iris imaging and recognition process. These constraints come largely from the image acquisition process rather than from the particular pattern-matching algorithm, and they greatly limit the use of iris recognition in maritime environments. Among current iris imaging and recognition systems, the system developed by Sarnoff Corporation [10] substantially relaxes the constraints on position and motion by means of a new imaging system based on NFOV high-resolution cameras and video synchronized with IR illumination. The system can capture iris images and correctly recognize the iris of a moving human subject at a distance (as far as 3m) with no or only minor cooperation. In our research, we will incorporate the iris imaging sensor and the iris recognition utility developed by Sarnoff Corporation into our biometric system.

4. Preliminary results

Some preliminary experimental results on video preprocessing (video stabilization and deblurring) and on single- and multi-target detection, tracking, and recognition have been obtained. More than eight scenarios have been designed and tested. In these tests, the human subjects were observed at distances between 20m and 100m. One example, shown in Figure 4, is selected to illustrate the outputs of the current prototype system.

In the test, the incoming images were captured by a WFOV and an NFOV video camera with 512 × 480 resolution at 25 fps. The current testing database stored in the system contains 50 human templates (models) in total, named P1, P2, ..., P50. The selected test was set on two small vessels on a lake. The WFOV and NFOV video cameras were mounted on one vessel and observed the people standing on the other vessel. In the test, three people walked on the deck of one vessel in a non-cooperative manner relative to the cameras mounted on the other vessel. The results of human detection, face detection, and recognition are shown in Figure 4. After human detection in the WFOV images, as shown in (a), the NFOV camera was zoomed in and pointed at the human subjects (region of interest) automatically, controlled by the PTU. Compared with (b), all people in (c) are correctly recognized, and the confidence score of face detection increases from 0.58 to 0.64 for P1, from 0.43 to 0.63 for P2, and from 0 to 0.58 for P3. Even in adverse conditions (e.g. windy weather, strong ocean waves, dark environments), an acceptable recognition rate can still be achieved.

5. Conclusions

We have presented a novel scheme and system design for automatically detecting and recognizing human subjects via face/iris traits at a long distance without cooperation, applied in a maritime scenario. The effectiveness and efficiency of our video stabilization and deblurring, human detection, and face detection, tracking, and recognition algorithms have been validated by our preliminary studies.


(a) Human detection and tracking in a WFOV image


(b) Face detection and recognition in an original (“raw”) NFOV image


(c) Face detection and recognition in a stabilized and deblurred NFOV image

Figure 4 Multi-target detection, tracking, and recognition in NFOV images


The authors would like to thank Prof. James Matey at the US Naval Academy and the people at Sarnoff and the Naval Research Office for their helpful comments.


[1] G. Medioni, D. Fidaleo, D. Choi, L. Zhang, C.-H. Kuo, and K. Kim, “Recognition of Non-Cooperative Individuals at a Distance with 3D Face Modeling,” IEEE Workshop on Automatic Identification Advanced Technologies, pp. 112-117, 2007.

[2] S. V. Duhn, L. Yin, M-Y. Ko, and T. Hung, “Multiple-View Face Tracking For Modeling and Analysis Based On Non-Cooperative Video Imagery,” IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1-8, 2007.

[3] L. Ma, T. Tan, D. Zhang, and Y. Wang, “Local Intensity Variation Analysis for Iris Recognition,” Pattern Recognition, vol. 37, pp. 1287-1298, 2004.

[4] R. P. Wildes, “Iris Recognition: An Emerging Biometric Technology,” Proc. IEEE, vol. 85, pp. 1348-1363, 1997.

[5] http://www.imagewest.tv

[6] A. Foi, V. Katkovnik, and K. Egiazarian, “Pointwise Shape-Adaptive DCT for High-Quality Denoising and Deblocking of Grayscale and Color Images,” IEEE Trans. Image Process., vol. 16, pp. 1395-1411, 2007.

[7] H. T. Nguyen, Q. Ji, and A. W. M. Smeulders, “Spatio-temporal context for robust multitarget tracking,” IEEE PAMI, vol. 29, no. 1, Jan. 2007.

[8] P. Wang and Q. Ji, “Multi-view face tracking with factorial and switching hmm,” in Workshop on the Applications of Computer Vision (WACV), 2005.

[9] J. Zou, Q. Ji, and G. Nagy, “A comparative study of local matching approach for face recognition,” IEEE Transactions on Image Processing, 2007.

[10] J. R. Matey, et al., “Iris on the move: acquisition of images for iris recognition in less constrained environments,” Proceedings of the IEEE, vol. 94, no. 11, 2006.