The Prospects of Face, Silhouette, and Gait Recognition: RecFaces gives a comprehensive analysis
Facial recognition is the most widespread and reliable biometric technology in large-scale security systems. However, in light of dwindling investments in AI-driven solutions, both solution providers and customers have turned their attention towards the development of new algorithms that can reduce the cost of such comprehensive systems.
What algorithms are these? And are there any solutions already available for autonomous implementation?
The market for facial biometrics is witnessing remarkable growth. The analytical agency MarketsandMarkets predicts an increase in the global biometrics market from USD 42.9 billion in 2022 to USD 82.9 billion by 2027, with an average annual growth rate of 14.1%. According to GrandViewResearch, the revenue from biometric technologies will grow to USD 150.58 billion by 2030, with the facial recognition segment reaching USD 19.68 billion. These figures indicate that analysts view facial biometrics technologies as a promising area for investment and business development. The projected average annual growth rates of 29.25% in revenue and 15.4% in market volume indicate both a continuous advancement of facial identification technologies and a strong demand for them.
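As a quick sanity check on the MarketsandMarkets forecast, the quoted growth rate can be recomputed from the two market sizes. The sketch below assumes the 14.1% figure is a compound annual growth rate over the five years from 2022 to 2027; the function name `cagr` is illustrative and not taken from any source.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.141 means 14.1%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article: 42.9 billion (2022) to 82.9 billion (2027).
rate = cagr(42.9, 82.9, 2027 - 2022)
print(f"Implied CAGR: {rate:.1%}")  # prints "Implied CAGR: 14.1%"
```

Under that assumption the start and end values are consistent with the stated 14.1% rate.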
This growing demand can be attributed to several factors. These include the development of e-commerce and banking; the widespread adoption of cloud technologies; the need for security in public spaces and transportation; the requirement for identification when accessing government services; and businesses transitioning to hybrid and remote work modes via personal devices.
- What Drives Technology Development: Cost-Savings and Marketing
- The Cost of Cameras Increases as Crowd Density Grows
- What Are the Alternatives?
- Identification Challenges and Vulnerability of Algorithms
- Is There a Working Solution to the Problem of Identifying People in a Flow?
- To Conclude
What Drives Technology Development: Cost-Savings and Marketing
It’s both natural and understandable that customers who use digital systems (government entities in particular) want to achieve significant cost savings on infrastructure. Likewise, developers of these digital systems strive to offer market solutions that make such savings possible and technically justified. However, this is not as straightforward as one might expect. One obvious way to optimize a large-scale video identification system is to combine facial recognition with technologies that are less demanding in terms of camera resolution. This approach offers a compelling advantage: it allows the maximum utilization of devices already installed at the customer's site, even if their resolution is insufficient for facial identification in a crowd. These considerations, as well as the desire to differentiate from competitors, push developers to experiment with algorithms that do not require high-resolution cameras.
The Cost of Cameras Increases as Crowd Density Grows
Facial biometric systems can achieve nearly 100% accuracy in facial identification, but such systems require high-resolution megapixel cameras. What resolution is sufficient? The choice depends on the system's objectives, the area covered in the camera frame, and the density of people passing through.
For successful identification, a face needs to occupy 128×128 pixels in the frame. A 2-megapixel camera (1920×1080 pixels) can fit approximately 120 densely arranged face images, while an 8-megapixel camera (3840×2160 pixels) can capture more than 480 faces. A 2-megapixel resolution is sufficient for video surveillance in stores or offices. However, crowded places such as stadiums or public transportation require more expensive devices and additional computational power to process the heavier video stream. Both necessitate additional capital investments that increase as the system expands. For industrial security systems, facial biometrics, such as the Id-Guard system, is the most reasonable choice; however, integrating such systems into “Safe City” infrastructure requires making cost reduction a priority in order to ease the overall budget burden. This is particularly important because the foundations of such infrastructure were typically laid 5-10 years ago and may not be sufficient for current facial identification requirements.
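The face counts above follow from simple frame arithmetic. The sketch below reproduces them under two assumptions that are ours rather than the article's: faces are packed as non-overlapping 128×128 tiles, either on a regular grid (a conservative count) or by raw pixel area (an optimistic upper bound). The function names are illustrative.

```python
FACE_SIDE_PX = 128  # face size required for identification, as stated in the article


def faces_on_grid(width_px: int, height_px: int, side_px: int = FACE_SIDE_PX) -> int:
    """Whole face-sized tiles that fit on a regular, non-overlapping grid."""
    return (width_px // side_px) * (height_px // side_px)


def faces_by_area(width_px: int, height_px: int, side_px: int = FACE_SIDE_PX) -> int:
    """Upper bound: total frame area divided by the area of one face tile."""
    return (width_px * height_px) // (side_px * side_px)


for label, (w, h) in {"2 MP": (1920, 1080), "8 MP": (3840, 2160)}.items():
    print(f"{label}: grid={faces_on_grid(w, h)}, area bound={faces_by_area(w, h)}")
# 2 MP: grid=120, area bound=126
# 8 MP: grid=480, area bound=506
```

The grid count matches the "approximately 120" figure for a 2-megapixel frame, and the area bound is consistent with "more than 480" for an 8-megapixel frame. Real scenes will hold far fewer usable faces, since people are never packed edge to edge.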
What Are the Alternatives?
The development of technology to complement facial biometrics within a unified video identification system has been ongoing for over 10 years. For example, in 2012 the National Physical Laboratory in the United Kingdom claimed that it was possible to identify individuals by their gait.
A system utilizing an alternative identifier must meet several requirements:
- Low-resolution images are sufficient for identification;
- The biometric feature is accessible without the person's involvement or consent;
- The system complies with data protection regulations;
- The algorithm «tracks» a person from one high-resolution camera to another.
Currently, there are two groups of identification and tracking algorithms that can work with low-quality images, including small, overexposed, and dark images. These algorithms use static and dynamic human features as identifiers.
Silhouette recognition identifies a person by their outline, including their clothing and headwear. An algorithm using silhouettes as identifiers can track a person's movements within the video surveillance system’s field of view. Periodically, the «tracked» individual enters the visibility range of a high-resolution camera, which then identifies the person's face. This approach can be used to trace the route of a street thief, locate a lost child in a crowded area, or monitor suspicious individuals. The claimed accuracy of such algorithms ranges from 90% to 99%, but developers of these solutions do not disclose how this was estimated or whether the technology can be integrated with other systems. Silhouette recognition also has several drawbacks:
- The system can be easily deceived since a silhouette is a temporary feature that can be quickly altered or disguised, e.g. by entering a «blind spot» and changing clothes;
- Silhouettes are challenging to distinguish in dense crowds;
- The algorithm’s performance is sensitive to the quality of the image. The accuracy of recognition when transitioning from one camera to another heavily depends on the camera's viewing angle, resolution, and color representation.
Dynamic features, such as gait and arm movements during walking and running, are unique to each person and more difficult to counterfeit than a silhouette. For example, a gait identification algorithm analyzes a combination of parameters such as height, stride length, walking speed, joint movements (hip, knee, ankle), body posture, foot position, and joint angles. Researchers test different combinations of neural networks and training datasets with cameras positioned at various viewing angles. Depending on these factors, accuracy ranges from 95.26% to 99.6% in laboratory conditions. These algorithms have two advantages over silhouette recognition: it is more difficult to accurately imitate someone else's gait than to replicate a silhouette, and certain gait parameters, such as stride length and walking speed, remain unchanged when a person moves into the field of view of another camera.
Interest in silhouette and dynamic features as «temporary biometric identifiers» is also driven by privacy considerations, as many countries impose legislative restrictions on biometrics. However, there are examples of facial biometric systems that store a biometric template in their database which cannot be used to reconstruct an individual's personal data.
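To illustrate how gait parameters that stay stable between cameras, such as stride length and walking speed, might be used to re-associate a person, here is a minimal sketch. It is not the method of any vendor or research group mentioned here: the `GaitSample` type, the relative-difference distance, the 5% threshold, and all sample values are hypothetical choices made purely for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GaitSample:
    stride_length_m: float    # hypothetical measurement, metres
    walking_speed_mps: float  # hypothetical measurement, metres per second


def gait_distance(a: GaitSample, b: GaitSample) -> float:
    """Euclidean norm of the relative differences in stride length and speed."""
    d_stride = (a.stride_length_m - b.stride_length_m) / max(a.stride_length_m, b.stride_length_m)
    d_speed = (a.walking_speed_mps - b.walking_speed_mps) / max(a.walking_speed_mps, b.walking_speed_mps)
    return math.hypot(d_stride, d_speed)


def best_match(probe: GaitSample, gallery: dict[str, GaitSample],
               max_distance: float = 0.05) -> Optional[str]:
    """Return the closest known track id, or None if nothing is close enough."""
    if not gallery:
        return None
    track_id, sample = min(gallery.items(), key=lambda kv: gait_distance(probe, kv[1]))
    return track_id if gait_distance(probe, sample) <= max_distance else None


# Tracks seen by the first camera (made-up values).
gallery = {"track-1": GaitSample(0.72, 1.35), "track-2": GaitSample(0.81, 1.60)}
# A person appearing in the second camera's field of view.
probe = GaitSample(0.73, 1.37)
print(best_match(probe, gallery))  # prints "track-1"
```

A production system would use many more features and a learned model; the point of the sketch is only that parameters unaffected by a change of camera give the matcher something stable to compare.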
Identification Challenges and Vulnerability of Algorithms
Before silhouette and gait recognition algorithms can become cost-effective additions to facial biometrics within multifactor video analytics systems, certain limitations arising from the specific nature of the technology need to be addressed. First, the system needs to unambiguously identify a person using video footage from cameras with:
- A — varying technical characteristics;
- B — different angles.
In other words, the algorithm needs to «understand» that a side-view full-body color image and a dark circle next to similar circles (from a top view) belong to the same person. According to the algorithm developers from NtechLab, «the main challenge in silhouette recognition lies not in identifying and capturing it, but in the technical peculiarities of the diverse 'zoo' of cameras in 'Safe City' systems.» Partially, this problem can be solved through strategic camera placement and elimination of «blind spots» where the top view is not duplicated either by cameras capable of capturing a full silhouette or, at the very least, those providing a facial image sufficient for identification. However, this partial solution makes both the algorithm developer and the client dependent on the organization responsible for designing and installing the video surveillance system. The algorithm developer must either ensure the algorithm's stability against changes both in camera angles and characteristics or simply accept poor accuracy that may result in case of a camera malfunction. At the same time, the client becomes reliant on the urban infrastructure or the expertise of the system designers. Additionally, the video surveillance integration company must ensure a high level of professionalism in design and installation. This is a no-win situation for all parties involved. The solution may lie in a better, more accurate mathematical description of the «transition» between cameras and expanding the size of the training dataset to include various camera angles and characteristics. Still, the main problem for algorithms working with silhouette or gait is that a person's full-body visibility is required for most of the time. In dense crowds, these systems are prone to errors and need to be supplemented with a facial recognition system. 
To effectively track individuals based on their specific silhouette or gait, the system should be equipped with high-resolution cameras in areas where large numbers of people are typically expected, such as metro vestibules, escalators, and train platforms. This will help to prevent losing the tracked person in a crowd. However, there is a higher risk of losing the person if a crowd spontaneously gathers in a usually quiet area where only inexpensive cameras for silhouette or gait recognition are installed. There are other challenges as well. For example, how can the system differentiate between the silhouettes of two workers of similar build wearing identical uniforms across different cameras? What if a person steps behind a pillar into a blind spot and puts on a bulky jacket? Although the algorithm can be effective against minor troublemakers, it can still be easily deceived. In terms of accuracy, systems based on gait or gesture identification are much more promising, as they rely on individually unique data that is not consciously controlled. With further development, gait tracking algorithms can become an interesting component of a «comprehensive therapy» in combination with facial biometrics. However, the independent use of such systems still raises many questions.
Is There a Working Solution to the Problem of Identifying People in a Flow?
Facial biometrics remains the only proven and highly accurate solution for commercial and municipal implementations. Its cost-efficiency, scalability, compliance with legislation, and other properties depend entirely on how each project is implemented.
For example, the Id-Guard facial recognition system is designed for use in large and geographically distributed facilities, such as industrial plants, transportation infrastructure, and shopping centers. By evenly distributing video preprocessing, the system reduces the load on the central server and communication channels, resulting in significant cost savings in installation and operation. When integrated with video management systems (VMS), Id-Guard enhances their capabilities with biometric functions. The solution offers ready-made certified integration modules for a range of VMS products from leading manufacturers. The movement of individuals within a facility can be tracked on graphical maps, enabling security services to add suspicious individuals to a temporary observation list and receive notifications about their movements. Id-Guard securely stores periodically updated biometric templates of users in the customer's environment, ensuring that they are not linked to personal data such as names and surnames. It is not possible to reconstruct photos from these templates. This storage approach fully complies with the EU General Data Protection Regulation (GDPR) for personal data protection. The RecFaces system encrypts critical data, such as photos, using the AES-256 standard. In addition, the Id-Guard architecture allows integration with additional neural networks and algorithms to expand the system's capabilities, e.g. by including new identification methods. However, to develop the product further and start its real-life implementation, it is essential to ensure that it will truly benefit the customers. The functional capabilities of the Id-Guard solution can be evaluated free of charge via a 3-month demonstration license.
To Conclude
Research on algorithms that can complement facial recognition systems with other biometric features has been ongoing for over a decade. The goal is to differentiate from competitors, expand the functionality of video identification and analytics systems, and reduce costs by utilizing low-resolution cameras. New algorithms based on gait analysis demonstrate laboratory accuracy of up to 99.6%. However, in real-world scenarios, it is facial biometrics that remains the most reliable method of identification within a flow.