Fraunhofer to introduce simultaneous face and speaker recognition at IBC 2023

The Fraunhofer Institute will present a new system at IBC 2023 that automatically locates and identifies people in large media archives by face and voice.

The Audiovisual Identity Suite combines technologies for face and speaker recognition, using artificial intelligence to analyse media content for the
presence of specific individuals.

The system gives program planners a comprehensive view of who appears in TV broadcasts. Its user-friendly interface supports in-depth insights,
trend analyses and statistics on specific individuals.

The tool uses a ‘heatmap’ to show when and how often an individual is visible or audible on different TV channels, including when the
individual is speaking but not shown in the picture.
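
To make the idea concrete, the counting behind such a heatmap could be sketched roughly as follows. This is not Fraunhofer's implementation, which the announcement does not describe. It is a minimal illustration that assumes face and voice detections arrive as (start, end) segments in whole seconds per channel, and every name in it (`covered_seconds`, `presence_summary`, `channel_heatmap`) is hypothetical.

```python
def covered_seconds(segments):
    """Return the set of whole seconds covered by (start, end) segments."""
    covered = set()
    for start, end in segments:
        covered.update(range(start, end))  # end is exclusive
    return covered


def presence_summary(face_segments, voice_segments):
    """Summarise one person's presence on one channel, in seconds."""
    seen = covered_seconds(face_segments)
    heard = covered_seconds(voice_segments)
    return {
        "visible": len(seen),
        "audible": len(heard),
        # Speaking while not shown in the picture.
        "heard_only": len(heard - seen),
    }


def channel_heatmap(detections):
    """Build one summary row per channel.

    `detections` maps a channel name to a dict with "face" and "voice"
    segment lists (an assumed input format, chosen for illustration).
    """
    return {
        channel: presence_summary(d.get("face", []), d.get("voice", []))
        for channel, d in detections.items()
    }


if __name__ == "__main__":
    example = {
        "Channel A": {"face": [(0, 10), (20, 30)], "voice": [(5, 25)]},
        "Channel B": {"face": [], "voice": [(100, 130)]},
    }
    for channel, row in channel_heatmap(example).items():
        print(channel, row)
```

Each row could then be rendered as one cell group of a heatmap, with the `heard_only` figure capturing the off-screen speech that face recognition alone would miss.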

A cross-modal analysis tool is also included to increase the validity and quality
of search results, relying on AI-based algorithms for speaker recognition, gender classification and speech quality analysis.

In the future, Fraunhofer intends to add age estimation based on visual analysis, along with audio enhancements such as language recognition,
speech-to-text conversion and keyword analytics.

Christian Rollwage, speaker recognition specialist, Fraunhofer Institute, commented: “Our planned enhancements will provide deeper opportunities for analysis. With the
addition of text transcription, we can not only determine how often certain people appear but also which topics they are talking about.”