Emotion recognition in security: should it be implemented?

According to a report by the FutureWise analytical agency [1] published in October 2022, the market for emotion recognition technologies will grow by an average of 12.9% per year and reach almost $49 billion by 2028. Today, emotion recognition is widely used in customer service, neuromarketing, and education. Should it be introduced into security as well?
Contents
- What emotions are and why to recognize them?
- Technical implementation of emotion recognition
- Where and how emotion recognition is used
- Smartphones
- Multimodal systems
- Neuromarketing, advertising and customer service
- Education and video game industry
- So... Should we implement...?
- Technical limitations
- Cultural limitations
- Psychological limitations
- Facial expressions
- Ethical and legal issues
- Limitations of standardization and replication
- Conclusion
- Sources:
What emotions are and why to recognize them?
Emotions are personal data about an individual’s states and feelings, their thoughts and intentions (even subconscious ones), and their responses to stimuli, people and the environment.
Psychologists also distinguish other emotional processes: affect, mood and feeling. They differ in duration and degree of involvement. In an affective state, a person loses their will for a relatively short period of time, while moods and feelings last longer and are easier to cope with. Moods and feelings serve as a background for emotions, which are a direct reaction to what is happening.
Emotions influence behavior. Of course, a person can control how they show emotions to a certain degree. However, M. Bradley and P.J. Lang [3] confirmed experimentally that negative emotions are much more difficult to control than positive ones.
This partly explains why emotion recognition is a popular option in security: poorly controlled negative emotions can easily “fuel” illegal actions. Such actions can be prevented if the emotions behind them are detected in time.
Another explanation lies in the seeming simplicity of capturing and recognizing emotions. An emotion triggers a chain of physiological reactions, including movement of facial muscles. Certain facial expressions are characteristic of particular emotions, and for a long time it was believed [4] that they are nearly universal across ethnic, age and social groups.
Different models distinguish up to 28 basic emotions. Paul Ekman [4] identifies “microexpressions”: contractions of facial muscles that reveal our true emotions even if we otherwise hide them successfully. An average person can identify an emotion with an accuracy of up to 70% (while actively using contextual information such as voice and gestures), whereas some commercial emotion recognition systems based on neural networks reach an accuracy of up to 90%. However, it turns out that this number is not enough for security purposes.
Why? First, let’s talk theory.
Technical implementation of emotion recognition
Emotion recognition can be performed in different ways. For example, it can be based on classifying key points tied to the position of the eyes, eyebrows, lips, nose, and jaw. Descriptors based on this visual information are attached to each point, producing a feature vector that helps to identify emotions.
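The key-point approach can be sketched in a few lines of Python. This is a minimal illustration, not a production method: the landmark names, the three distance ratios, and the values in `CENTROIDS` are all assumptions invented for the example, and a nearest-centroid rule stands in for whatever classifier a real system would use.

```python
import math

def feature_vector(pts):
    """Turn 2D key points into distance ratios that do not depend on face size."""
    eye_dist = math.dist(pts["left_eye"], pts["right_eye"])  # normalising scale
    mouth_width = math.dist(pts["mouth_left"], pts["mouth_right"]) / eye_dist
    mouth_open = math.dist(pts["lip_top"], pts["lip_bottom"]) / eye_dist
    brow_raise = math.dist(pts["left_brow"], pts["left_eye"]) / eye_dist
    return [mouth_width, mouth_open, brow_raise]

# Hypothetical "typical" vectors per emotion; invented numbers, not measured data.
CENTROIDS = {
    "neutral": [0.80, 0.05, 0.30],
    "happy": [1.10, 0.15, 0.30],
    "surprised": [0.75, 0.45, 0.45],
}

def classify(vec):
    """Return the label whose centroid is closest to the feature vector."""
    return min(CENTROIDS, key=lambda label: math.dist(vec, CENTROIDS[label]))

# Hypothetical key points in pixel coordinates: open mouth, raised brow.
face = {
    "left_eye": (100, 100), "right_eye": (160, 100),
    "left_brow": (100, 74),
    "mouth_left": (108, 160), "mouth_right": (152, 160),
    "lip_top": (130, 150), "lip_bottom": (130, 176),
}
print(classify(feature_vector(face)))  # prints "surprised"
```

Dividing every distance by the distance between the eyes keeps the vector comparable between faces that are closer to or farther from the camera.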
A more advanced method uses highly accurate deep neural networks trained on large datasets. To increase accuracy, they use not just individual images but series of images that show the positions of facial muscles in motion. Large datasets also make it possible to include representatives of different ethnic and cultural groups, as well as people of different temperaments who express emotions and react to stress differently.
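Why a series of images helps can be shown with a deliberately simple stand-in. Real networks model the motion itself; the sketch below only averages per-frame scores over a clip, which is a much cruder technique, and the scores in `clip` are made-up values assumed to come from some per-frame classifier.

```python
def aggregate_sequence(frame_scores):
    """Average per-frame emotion scores over a clip; return the top label and the means."""
    if not frame_scores:
        raise ValueError("clip must contain at least one frame")
    totals = {}
    for scores in frame_scores:
        for label, p in scores.items():
            totals[label] = totals.get(label, 0.0) + p
    means = {label: total / len(frame_scores) for label, total in totals.items()}
    return max(means, key=means.get), means

# Hypothetical scores for three consecutive frames of the same face.
clip = [
    {"neutral": 0.6, "anger": 0.3, "joy": 0.1},
    {"neutral": 0.3, "anger": 0.6, "joy": 0.1},
    {"neutral": 0.2, "anger": 0.7, "joy": 0.1},
]
label, means = aggregate_sequence(clip)
print(label)  # prints "anger"
```

Judged on the first frame alone, this face would be labelled neutral; the label changes only once the whole sequence is taken into account.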
Where and how emotion recognition is used
Smartphones
Gartner claims that in the near future our smartphones will know us better than our friends and relatives do, and will interact with us on a new level of emotional subtlety. For example, the iPhone X's face-scanning hardware, introduced for Face ID, not only unlocks the phone but can also animate emojis that imitate our facial expressions. Companies can combine open scientific data on emotion recognition with their own technology stacks, thus giving rise to the field of affective computing.
Multimodal systems
The only precedent for the widespread introduction of behaviour pattern recognition, including emotion recognition, is in Chinese smart cities. The British newspaper The Guardian covered it with a certain degree of alarm in 2021, referring to a report by human rights activists [5]. A system of CCTV cameras and other equipment was connected to police servers, recording the voice, body temperature, and movements of people coming into view. Such a multimodal system identifies and indicates the source of a potential threat quite accurately. In some locations (for example, in elevators) it helped to eliminate violations. However, it should be noted that this method recognizes not just emotions but whole patterns of human behavior.
On-board car safety systems also use multimodality. They record emotions, the frequency of blinking and yawning, and changes in the voice and movement patterns. If the system decides that the driver is tired, it can turn on music or use a voice prompt to draw the driver’s attention back to the road.
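A rule-based sketch shows how several such signals might be combined into one decision. Everything here is an assumption made for illustration: the `DriverWindow` fields, the thresholds, and the mapping from score to action are hypothetical and not taken from any real vehicle system.

```python
from dataclasses import dataclass

@dataclass
class DriverWindow:
    """Signals aggregated over one observation window (hypothetical fields)."""
    blinks_per_min: float
    yawns_per_min: float
    voice_energy_drop: float  # relative drop against the driver's baseline, 0..1

def fatigue_score(w):
    """Count how many signals exceed their threshold (illustrative values)."""
    score = 0
    if w.blinks_per_min > 25:
        score += 1
    if w.yawns_per_min >= 2:
        score += 1
    if w.voice_energy_drop > 0.3:
        score += 1
    return score

def choose_action(w):
    """Escalate from no action, to music, to a voice alert as evidence accumulates."""
    score = fatigue_score(w)
    if score >= 2:
        return "voice_alert"
    if score == 1:
        return "play_music"
    return "none"

print(choose_action(DriverWindow(blinks_per_min=30, yawns_per_min=3, voice_energy_drop=0.0)))
```

Requiring two signals before the stronger intervention is one way to keep a single noisy sensor from triggering alerts on its own.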
Neuromarketing, advertising and customer service
These are typical tasks for emotion recognition systems. Neural networks track the direction of subjects’ gaze to identify the most engaging parts of a commercial, as well as the emotional response to the advertising message. With cameras installed at the entrance and exit of a store or cafe, it is possible to determine how people's emotions changed during the visit. This gives marketing specialists material for analysis and reflection.
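The entrance and exit comparison comes down to comparing two distributions of labels. The sketch below assumes each camera already yields one emotion label per visitor; the label lists are invented sample data.

```python
from collections import Counter

def emotion_shares(labels):
    """Share of each emotion label in a list of per-visitor labels."""
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}

def share_shift(entrance, exit_):
    """Change in each emotion's share between the entrance and exit cameras."""
    before, after = emotion_shares(entrance), emotion_shares(exit_)
    all_labels = sorted(set(before) | set(after))
    return {label: after.get(label, 0.0) - before.get(label, 0.0) for label in all_labels}

# Hypothetical labels for ten visitors at each camera.
entrance = ["neutral"] * 6 + ["joy"] * 2 + ["anger"] * 2
exit_ = ["neutral"] * 4 + ["joy"] * 5 + ["anger"] * 1
shift = share_shift(entrance, exit_)
for label, delta in shift.items():
    print(f"{label}: {delta:+.0%}")
```

Working with shares rather than raw counts keeps the comparison valid when the two cameras see different numbers of people.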
Education and video game industry
An emotion recognition system can also increase the effectiveness of the learning process by monitoring, for example, a student’s level of attention or how well they process new information. If the student is tired, the system can offer a break. Similarly, “advanced” games can adjust the tension of the game plot.
So... Should we implement...?
AI solutions serve primarily to automate work, reduce workload and eliminate the human factor. For example, watching for unusual behavior in a crowd puts a heavy workload on security personnel, which may lead to oversights. Emotion recognition systems facilitate this process by automatically classifying people in a crowd and minimizing the amount of data that personnel have to go through manually (e.g. by tagging suspicious people).
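The tagging step can be pictured as a filter over the classifier's output. The record layout, the watched emotions, and the 0.8 confidence threshold below are assumptions chosen for the example, not values from any deployed system.

```python
def tag_for_review(detections, watched=("anger", "fear"), threshold=0.8):
    """Keep only detections of watched emotions that the classifier is confident about."""
    return [
        d for d in detections
        if d["emotion"] in watched and d["confidence"] >= threshold
    ]

# Hypothetical classifier output for five people in a camera frame.
detections = [
    {"id": 1, "emotion": "neutral", "confidence": 0.95},
    {"id": 2, "emotion": "anger", "confidence": 0.91},
    {"id": 3, "emotion": "joy", "confidence": 0.88},
    {"id": 4, "emotion": "anger", "confidence": 0.55},
    {"id": 5, "emotion": "fear", "confidence": 0.83},
]
tagged = tag_for_review(detections)
reduction = 1 - len(tagged) / len(detections)
print([d["id"] for d in tagged], f"{reduction:.0%} fewer items to review")
```

The threshold trades workload against missed cases: raising it shrinks the review queue, but it also drops uncertain detections such as person 4 here.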
However, it is not as easy as it seems. There are so many nuances to emotion recognition that its successful implementation depends on a particular business situation.
Technical limitations
Accurate emotion recognition requires a series of high-quality images. A car safety system, for example, works only with images of its driver, while to monitor a crowd, a system would have to process images of dozens or even hundreds of people. This requires extremely expensive and sophisticated technology.
Cultural limitations
Different cultural and ethnic groups are characterized by different degrees and patterns of emotional expression. As a rule, one ethnic group prevails in the training dataset, and which group that is differs from region to region, while a real crowd will contain people of many different nationalities. The neural network will produce a response tuned to the “national majority”, which will inevitably be less accurate for everyone else. In addition, many people will try their best to keep a neutral face, considering it impolite to show emotions in public.
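One way to make this effect visible is to report accuracy per group instead of a single overall figure. The sketch below uses invented evaluation records; the group names and results are hypothetical and only show how a good average can hide a weak minority result.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """records: (group, true_label, predicted_label) tuples -> accuracy per group."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, true_label, predicted in records:
        totals[group] += 1
        hits[group] += int(true_label == predicted)
    return {group: hits[group] / totals[group] for group in totals}

# Hypothetical test set: group_a dominates the data, group_b is under-represented.
records = (
    [("group_a", "joy", "joy")] * 7 + [("group_a", "anger", "neutral")] * 1
    + [("group_b", "joy", "joy")] * 2 + [("group_b", "anger", "neutral")] * 2
)
overall = sum(t == p for _, t, p in records) / len(records)
per_group = accuracy_by_group(records)
print(f"overall: {overall:.1%}")
for group, acc in per_group.items():
    print(f"{group}: {acc:.1%}")
```

In this made-up data the overall accuracy is 75%, while the under-represented group is classified correctly only half of the time.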
Psychological limitations
The intensity with which people show their emotions, as well as the content of the emotional reaction itself, depends on many factors: temperament, upbringing, education, age, social status, etc. People react to stress in different ways: one falls into a stupor, another becomes hysterical, a third retains a neutral expression. A person's emotions can flip to the opposite within a short time in reaction to a telephone conversation or a random poster. Cameras will also record different background emotions depending on where they are installed. This confirms the importance of tailoring the system to the specific use case.
An emotion recognition system will work well with offenders who act spontaneously, for example, in an unplanned fight provoked by rudeness. But in the case of a planned crime, when the offender has good self-control, mistakes are possible.
The very fact that people know emotion recognition systems are used to monitor crowds will make them more careful about expressing their emotions.
Facial expressions
Wrinkles and overall facial structure (especially at a certain age) may form a so-called “mask” typical of a particular emotion that appears even in a neutral state. It may have nothing to do with a person’s current emotional state and instead reflect their lifestyle, age, sagging muscles (e.g. at the corners of the mouth), etc. A neural network may mistake such a “mask” for a true emotion. Insufficient lighting, low camera resolution or improper camera placement can also significantly increase the probability of error.
Ethical and legal issues
Control over people's emotions by law enforcement and security services erases personal boundaries and infringes upon international human rights.
Such control also undermines a fundamental legal principle: the presumption of innocence. A person can fall under suspicion only because they have the “wrong” facial expression, even though they never even thought about violating the law. This was noted by ARTICLE 19, which released a report in January 2021 [5] on the use of multimodal emotion recognition systems by Chinese law enforcement agencies and commercial structures.
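A base-rate calculation gives a sense of how many innocent people could be flagged. It takes the 90% accuracy figure quoted earlier and, as a simplifying assumption, applies it both to detecting real offenders (sensitivity) and to clearing everyone else (specificity). The crowd size and the number of offenders are hypothetical.

```python
def screening_outcome(crowd_size, offenders, sensitivity, specificity):
    """Expected true alarms, false alarms, and the share of alarms that are genuine."""
    innocents = crowd_size - offenders
    true_alarms = offenders * sensitivity
    false_alarms = innocents * (1 - specificity)
    precision = true_alarms / (true_alarms + false_alarms)
    return true_alarms, false_alarms, precision

# Assumed scenario: 10 people with hostile intent in a crowd of 10,000.
true_alarms, false_alarms, precision = screening_outcome(
    crowd_size=10_000, offenders=10, sensitivity=0.9, specificity=0.9
)
print(f"true alarms: {true_alarms:.0f}")
print(f"false alarms: {false_alarms:.0f}")
print(f"share of alarms that are genuine: {precision:.1%}")
```

Under these assumptions, about 9 genuine alarms are buried among roughly 1,000 false ones, so fewer than one flagged person in a hundred is an actual offender.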
If emotion recognition systems operate in commercial enterprises, they can lead to psychological stress and burnout, as can other methods of control.
Limitations of standardization and replication
Each commercial deployment is tied to a specific business case, and another enterprise will most likely not be able to use it without serious reconfiguration. Monitored emotions, their intensity, recognition accuracy, interpretation of results: all these characteristics of the system vary from implementation to implementation. Therefore, for now we are talking about developing customized solutions only. Sometimes such a solution has commercial potential, but more often it doesn't.
Conclusion
The widespread use of multimodal emotion recognition systems raises ethical and legal issues. Moreover, such systems require significant computing power and technical equipment, and are therefore extremely expensive and of no interest to either private customers or the public sector.
Emotion recognition systems, which are already common in marketing and the gaming industry, are not actively used in security. Currently, they are a product of “individual tailoring” for specific requirements and scenarios. In addition, because of the specificity of facial reactions, the accuracy of such systems remains low.
The use of emotion recognition systems in security is still an open question. Such systems will be in demand when the commercial effect of their implementation surpasses the costs of development, installation and configuration.
Sources:
- [1] https://www.futurewiseresearch.com/healthcare-market-research/Emotion-Detection-and/9743 Report. Emotion Detection and Recognition Market By Technology, By Software Tools, By Application Areas, By Component, By Verticals and By Region: Industry Analysis, Market Share, Revenue Opportunity, Competitive Analysis and Forecast 2022-2028.
- [2] APA Dictionary of Psychology. American Psychological Association.
- [3] Bradley M., Lang P.J. The International affective digitized sounds (IADS): stimuli, instruction manual and affective ratings. NIMH, Center for the Study of Emotion and Attention, 1999.
- [4] Paul Ekman, Richard J. Davidson. The Nature of Emotion: Fundamental Questions (Series in Affective Science). Oxford University Press; 1st edition (December 22, 1994).
- [5] https://www.article19.org/wp-content/uploads/2021/01/ER-Tech-China-Report.pdf ARTICLE 19, January 2021, Emotional Entanglement: China’s emotion recognition market and its implications for human rights.