Multimodal Emotion Recognition Using Multiple AI Algorithms
Abstract
This project aims to develop a real-time multimodal emotion recognition system that detects emotions from video, speech, and text inputs. The system activates the camera and identifies emotions as they occur. To achieve this, we train a robust model using four state-of-the-art deep learning architectures: VGG19, a deep convolutional neural network known for capturing fine-grained features; ResNet50, a 50-layer network whose residual connections mitigate the vanishing gradient problem; MobileNetV2, a lightweight model optimized for mobile and edge devices; and Xception, which uses depthwise separable convolutions for high performance. We compare the accuracy of these architectures to determine the most effective approach for real-time emotion recognition.
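To make the comparison concrete, the sketch below shows one plausible way to set it up with Keras Applications, training each of the four backbones under identical conditions and recording validation accuracy. It is a minimal illustration, not the project's actual pipeline: the 224x224 input size, the seven-class emotion set, the classification head, and the train_ds/val_ds dataset names are all assumptions made for the example.

import tensorflow as tf
from tensorflow.keras import layers, models, applications

NUM_CLASSES = 7  # assumed emotion label set (e.g., FER-2013 style); the project's labels may differ

def build_classifier(backbone_name):
    # Map each architecture name to its Keras Applications constructor.
    backbones = {
        "VGG19": applications.VGG19,
        "ResNet50": applications.ResNet50,
        "MobileNetV2": applications.MobileNetV2,
        "Xception": applications.Xception,
    }
    # ImageNet weights as a transfer-learning starting point; the 224x224
    # input resolution is an assumption, not a detail from the project.
    base = backbones[backbone_name](
        weights="imagenet", include_top=False, input_shape=(224, 224, 3)
    )
    base.trainable = False  # freeze the backbone for the initial training phase
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(256, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Train each backbone on the same data and keep its best validation accuracy.
# `train_ds` and `val_ds` stand in for tf.data.Dataset pipelines of labeled face images.
results = {}
for name in ["VGG19", "ResNet50", "MobileNetV2", "Xception"]:
    model = build_classifier(name)
    history = model.fit(train_ds, validation_data=val_ds, epochs=5)
    results[name] = max(history.history["val_accuracy"])
print(results)

Holding the data, head, and training schedule fixed across all four backbones, as above, keeps the accuracy comparison attributable to the architectures themselves rather than to differences in the training setup.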