Multimodal fusion to obtain scores by audio and video `FI V2`

Import required packages

[2]:

from oceanai.modules.lab.build import Run

Build

[3]:

_b5 = Run(
    lang = 'en',              # Inference language
    color_simple = '#333',    # Plain text color (hexadecimal code)
    color_info = '#1776D2',   # The color of the text containing the information (hexadecimal code)
    color_err = '#FF0000',    # Error text color (hexadecimal code)
    color_true = '#008001',   # Text color containing positive information (hexadecimal code)
    bold_text = True,         # Bold text
    num_to_df_display = 30,   # Number of rows to display in tables
    text_runtime = 'Runtime', # Runtime text
    metadata = True           # Displaying information about library
)

[2023-12-14 22:46:31] OCEANAI - personal traits:
    Authors:
        Elena Ryumina [ryumina_ev@mail.ru]
        Dmitry Ryumin [dl_03.03.1991@mail.ru]
        Alexey Karpov [karpov@iias.spb.su]
    Maintainers:
        Elena Ryumina [ryumina_ev@mail.ru]
        Dmitry Ryumin [dl_03.03.1991@mail.ru]
    Version: 1.0.0a16
    License: BSD License

Getting and displaying versions of installed libraries

_b5.df_pkgs_ - DataFrame with versions of installed libraries

[4]:

_b5.libs_vers(runtime = True, run = True)

	Package	Version
1	TensorFlow	2.15.0
2	Keras	2.15.0
3	OpenCV	4.8.1
4	MediaPipe	0.9.0
5	NumPy	1.26.2
6	SciPy	1.11.4
7	Pandas	2.1.3
8	Scikit-learn	1.3.2
9	OpenSmile	2.5.0
10	Librosa	0.10.1
11	AudioRead	3.0.1
12	IPython	8.18.1
13	PyMediaInfo	6.1.0
14	Requests	2.31.0
15	JupyterLab	4.0.9
16	LIWC	0.5.0
17	Transformers	4.36.0
18	Sentencepiece	0.1.99
19	Torch	2.0.1+cpu
20	Torchaudio	2.0.2+cpu

— Runtime: 0.006 sec. —

Analysing audio information (forming model and loading model weights)

Formation of the neural network architecture of the model for obtaining scores by hand-crafted features (audio modality)

_b5.audio_model_hc_ - Neural network model tf.keras.Model for obtaining scores by hand-crafted features

[5]:

res_load_audio_model_hc = _b5.load_audio_model_hc(
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-14 22:46:31] Formation of the neural network architecture of the model for obtaining scores by hand-crafted features (audio modality) …

— Runtime: 0.322 sec. —

Downloading the weights of the neural network model to obtain scores by hand-crafted features (audio modality)

_b5.audio_model_hc_ - Neural network model tf.keras.Model for obtaining scores by hand-crafted features

[6]:

# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000      # Size of each chunk downloaded from the network per step

url = _b5.weights_for_big5_['audio']['fi']['hc']['sberdisk']

res_load_audio_model_weights_hc = _b5.load_audio_model_weights_hc(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = True, # Forced download of a file with weights of a neural network model from the network
    out = True,          # Display
    runtime = True,      # Runtime count
    run = True           # Run blocking
)

[2023-12-14 22:46:31] Downloading the weights of the neural network model to obtain scores by hand-crafted features (audio modality) …

[2023-12-14 22:46:32] File download “weights_2022-05-05_11-27-55.h5” (100.0%) …

— Runtime: 0.277 sec. —
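
The settings `path_to_save_` and `chunk_size_`, together with `force_reload`, describe a chunked download that can optionally reuse an already saved file. A minimal sketch of that pattern using only the standard library, assuming the chunk size is given in bytes; `download_file` is a hypothetical helper for illustration, not OCEANAI's implementation:

```python
from pathlib import Path
from urllib.request import urlopen


def download_file(url, save_dir, file_name, chunk_size=2_000_000, force_reload=True):
    """Save `url` as save_dir/file_name, reading at most `chunk_size` bytes per step.

    Hypothetical helper (not part of OCEANAI). Returns the local file path.
    """
    target = Path(save_dir) / file_name

    # Reuse an existing local file unless a fresh download is forced
    if target.exists() and not force_reload:
        return target

    target.parent.mkdir(parents=True, exist_ok=True)
    with urlopen(url) as response, open(target, "wb") as out_file:
        while True:
            chunk = response.read(chunk_size)  # one download step
            if not chunk:                      # empty read means end of stream
                break
            out_file.write(chunk)
    return target
```

With `force_reload = False` the function returns immediately when the file is already present, which is one plausible reading of the flag used in the cells of this notebook.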

Formation of the neural network architecture of the model for obtaining scores by deep features (audio modality)

_b5.audio_model_nn_ - Neural network model tf.keras.Model for obtaining scores by deep features

[7]:

res_load_audio_model_nn = _b5.load_audio_model_nn(
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-14 22:46:32] Formation of a neural network architecture for obtaining scores by deep features (audio modality) …

— Runtime: 0.244 sec. —

Downloading the weights of the neural network model to obtain scores by deep features (audio modality)

_b5.audio_model_nn_ - Neural network model tf.keras.Model for obtaining scores by deep features

[8]:

# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000      # Size of each chunk downloaded from the network per step

url = _b5.weights_for_big5_['audio']['fi']['nn']['sberdisk']

res_load_audio_model_weights_nn = _b5.load_audio_model_weights_nn(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = False, # Forced download of a file with weights of a neural network model from the network
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-14 22:46:32] Downloading the weights of the neural network model to obtain scores for deep features (audio modality) …

[2023-12-14 22:46:32] File download “weights_2022-05-03_07-46-14.h5”

— Runtime: 0.389 sec. —

Analysing video information (forming model and loading model weights)

Formation of the neural network architecture of the model for obtaining scores by hand-crafted features (video modality)

_b5.video_model_hc_ - Neural network model tf.keras.Model for obtaining scores by hand-crafted features

[9]:

res_load_video_model_hc = _b5.load_video_model_hc(
    lang = 'en', # Language: 'en' for models trained on First Impressions V2, 'ru' for models trained on MuPTA
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-14 22:46:32] Formation of the neural network architecture of the model for obtaining scores by hand-crafted features (video modality) …

— Runtime: 0.257 sec. —

Downloading the weights of the neural network model to obtain scores by hand-crafted features (video modality)

_b5.video_model_hc_ - Neural network model tf.keras.Model for obtaining scores by hand-crafted features

[10]:

# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000 # Size of each chunk downloaded from the network per step

url = _b5.weights_for_big5_['video']['fi']['hc']['sberdisk']

res_load_video_model_weights_hc = _b5.load_video_model_weights_hc(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = True, # Forced download of a file with weights of a neural network model from the network
    out = True,          # Display
    runtime = True,      # Runtime count
    run = True           # Run blocking
)

[2023-12-14 22:46:32] Downloading the weights of the neural network model to obtain scores by hand-crafted features (video modality) …

[2023-12-14 22:46:33] File download “weights_2022-08-27_18-53-35.h5” (100.0%) …

— Runtime: 0.226 sec. —

Formation of neural network architecture for obtaining deep features

_b5.video_model_deep_fe_ - Neural network model tf.keras.Model for obtaining deep features

[11]:

res_load_video_model_deep_fe = _b5.load_video_model_deep_fe(
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-14 22:46:34] Formation of neural network architecture for obtaining deep features (video modality) …

— Runtime: 0.783 sec. —

Downloading weights of a neural network model to obtain deep features

_b5.video_model_deep_fe_ - Neural network model tf.keras.Model for obtaining deep features

[12]:

# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000 # Size of each chunk downloaded from the network per step

url = _b5.weights_for_big5_['video']['fi']['fe']['sberdisk']

res_load_video_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = True, # Forced download of a file with weights of a neural network model from the network
    out = True,          # Display
    runtime = True,      # Runtime count
    run = True           # Run blocking
)

[2023-12-14 22:46:35] Downloading weights of a neural network model to obtain deep features (video modality) …

[2023-12-14 22:46:40] File download “weights_2022-11-01_12-27-07.h5” (100.0%) …

— Runtime: 4.311 sec. —

Formation of the neural network architecture of the model for obtaining scores by deep features (video modality)

_b5.video_model_nn_ - Neural network model tf.keras.Model for obtaining scores by deep features

[13]:

res_load_video_model_nn = _b5.load_video_model_nn(
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-14 22:46:40] Formation of a neural network architecture for obtaining scores by deep features (video modality) …

— Runtime: 0.724 sec. —

Downloading the weights of the neural network model to obtain scores by deep features (video modality)

_b5.video_model_nn_ - Neural network model tf.keras.Model for obtaining scores by deep features

[14]:

# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000      # Size of each chunk downloaded from the network per step

url = _b5.weights_for_big5_['video']['fi']['nn']['sberdisk']

res_load_video_model_weights_nn = _b5.load_video_model_weights_nn(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = False, # Forced download of a file with weights of a neural network model from the network
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-14 22:46:40] Downloading the weights of the neural network model to obtain scores by deep features (video modality) …

[2023-12-14 22:46:42] File downloading “weights_2022-03-22_16-31-48.h5”

— Runtime: 1.355 sec. —

Analysing multimodal information (forming model, loading model weights, obtaining personality traits scores)

Formation of neural network architectures of models for obtaining the personality traits scores (multimodal fusion)

_b5.av_models_b5_ - Neural network models tf.keras.Model for obtaining the personality traits scores

[15]:

res_load_av_models_b5 = _b5.load_av_models_b5(
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-14 22:46:42] Formation of neural network architectures of models for obtaining the personality traits scores (multimodal fusion) …

— Runtime: 0.048 sec. —

Downloading the weights of neural network models to obtain the personality traits scores (multimodal fusion)

_b5.av_models_b5_ - Neural network models tf.keras.Model for obtaining the personality traits scores

[16]:

# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000      # Size of each chunk downloaded from the network per step

url_openness = _b5.weights_for_big5_['av']['fi']['b5']['openness']['sberdisk']
url_conscientiousness = _b5.weights_for_big5_['av']['fi']['b5']['conscientiousness']['sberdisk']
url_extraversion = _b5.weights_for_big5_['av']['fi']['b5']['extraversion']['sberdisk']
url_agreeableness = _b5.weights_for_big5_['av']['fi']['b5']['agreeableness']['sberdisk']
url_non_neuroticism = _b5.weights_for_big5_['av']['fi']['b5']['non_neuroticism']['sberdisk']

res_load_av_models_weights_b5 = _b5.load_av_models_weights_b5(
    url_openness = url_openness,                   # Openness
    url_conscientiousness = url_conscientiousness, # Conscientiousness
    url_extraversion = url_extraversion,           # Extraversion
    url_agreeableness = url_agreeableness,         # Agreeableness
    url_non_neuroticism = url_non_neuroticism,     # Non-Neuroticism
    force_reload = True, # Forced download of a file with weights of a neural network model from the network
    out = True,          # Display
    runtime = True,      # Runtime count
    run = True           # Run blocking
)

[2023-12-14 22:46:47] Downloading the weights of neural network models to obtain the personality traits scores (multimodal fusion) …

[2023-12-14 22:46:47] File download “weights_2022-08-28_11-14-35.h5” Openness

[2023-12-14 22:46:47] File download “weights_2022-08-28_11-08-10.h5” Conscientiousness

[2023-12-14 22:46:47] File download “weights_2022-08-28_11-17-57.h5” Extraversion

[2023-12-14 22:46:47] File download “weights_2022-08-28_11-25-11.h5” Agreeableness

[2023-12-14 22:46:47] File download “weights_2022-06-14_21-44-09.h5” Non-Neuroticism

— Runtime: 0.785 sec. —
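
The five per-trait lookups in the cell above differ only in the trait key, so they could also be collected in a loop. A minimal sketch, assuming `weights_for_big5_` is a nested dict with the key layout used above; the stand-in dict, its placeholder URLs, and `collect_trait_urls` are hypothetical so that the snippet runs without OCEANAI:

```python
TRAITS = [
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "non_neuroticism",
]

# Stand-in for _b5.weights_for_big5_ (placeholder URLs, not real download links)
weights_for_big5 = {
    "av": {"fi": {"b5": {
        trait: {"sberdisk": f"https://example.invalid/{trait}.h5"} for trait in TRAITS
    }}}
}


def collect_trait_urls(weights, corpus="fi", source="sberdisk"):
    """Return {trait: url} for the multimodal ('av') models of one corpus."""
    per_trait = weights["av"][corpus]["b5"]
    return {trait: per_trait[trait][source] for trait in TRAITS}


trait_urls = collect_trait_urls(weights_for_big5)
for trait, url in trait_urls.items():
    print(f"{trait}: {url}")
```

Keeping the trait names in one list means that a misspelled key fails at the lookup with a `KeyError` instead of surfacing later as an undefined variable.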

Getting scores (multimodal fusion)

_b5.df_files_ - DataFrame with data

_b5.df_accuracy_ - DataFrame with accuracy calculation results

[17]:

# Core settings
_b5.path_to_dataset_ = 'E:/Databases/FirstImpressionsV2/test' # Dataset directory
# Directories not included in the selection
_b5.ignore_dirs_ = []
# Key names for DataFrame dataset
_b5.keys_dataset_ = ['Path', 'Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Non-Neuroticism']
_b5.ext_ = ['.mp4'] # Search file extensions

# Full path to the file containing the ground truth scores for the accuracy calculation
url_accuracy = _b5.true_traits_['fi']['sberdisk']

_b5.get_av_union_predictions(
    depth = 2,         # Hierarchy depth for receiving audio and video data
    recursive = False, # Recursive data search
    sr = 44100,        # Sampling rate (in Hz)
    window_audio = 2,  # Audio segment window size (in seconds)
    step_audio = 1,    # Audio segment window shift step (in seconds)
    reduction_fps = 5, # Frame rate reduction
    window_video = 10, # Video segment window size (in seconds)
    step_video = 5,    # Video segment window shift step (in seconds)
    lang = 'en',       # Language: 'en' for models trained on First Impressions V2, 'ru' for models trained on MuPTA
    accuracy = True,   # Calculate accuracy
    url_accuracy = url_accuracy, # Full path to the file with the ground truth scores
    logs = True,       # Save a log file
    out = True,        # Display
    runtime = True,    # Runtime count
    run = True         # Run blocking
)
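
The window and step parameters in the call above describe a sliding window over each recording. A minimal sketch of how such segment boundaries could be enumerated, assuming windows start at second 0 and only full windows are kept; how OCEANAI treats a trailing remainder is not stated here, and `segment_bounds` is a hypothetical helper:

```python
def segment_bounds(duration, window, step):
    """Return (start, end) pairs in seconds for full windows of length `window`,
    shifted by `step`, over a recording of length `duration`."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    bounds = []
    start = 0
    while start + window <= duration:
        bounds.append((start, start + window))
        start += step
    return bounds


# A hypothetical 15-second clip with the settings from the call above
audio_segments = segment_bounds(15, window=2, step=1)   # window_audio = 2, step_audio = 1
video_segments = segment_bounds(15, window=10, step=5)  # window_video = 10, step_video = 5

print(len(audio_segments), audio_segments[:3])
print(video_segments)
```

Under these assumptions the audio settings give overlapping 2-second segments every second, and the video settings give 10-second segments that overlap by half.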

[2023-12-15 01:11:04] Getting scores and accuracy calculation (multimodal fusion) …

2000 from 2000 (100.0%) … test80_25_Q4wOgixh7E.004.mp4 …

	Path	Openness	Conscientiousness	Extraversion	Agreeableness	Non-Neuroticism
ID
1	E:\Databases\FirstImpressionsV2\test\test80_01...	0.554249	0.506548	0.440194	0.540235	0.48605
2	E:\Databases\FirstImpressionsV2\test\test80_01...	0.558823	0.442357	0.50397	0.558767	0.521587
3	E:\Databases\FirstImpressionsV2\test\test80_01...	0.477549	0.568616	0.333939	0.491873	0.458966
4	E:\Databases\FirstImpressionsV2\test\test80_01...	0.662656	0.621852	0.58996	0.599038	0.636035
5	E:\Databases\FirstImpressionsV2\test\test80_01...	0.645876	0.532378	0.551939	0.589174	0.552269
6	E:\Databases\FirstImpressionsV2\test\test80_01...	0.67497	0.666972	0.617604	0.610567	0.641452
7	E:\Databases\FirstImpressionsV2\test\test80_01...	0.39908	0.397298	0.335823	0.497966	0.39729
8	E:\Databases\FirstImpressionsV2\test\test80_01...	0.577705	0.597157	0.498064	0.640584	0.600152
9	E:\Databases\FirstImpressionsV2\test\test80_01...	0.543675	0.451197	0.449555	0.482371	0.415256
10	E:\Databases\FirstImpressionsV2\test\test80_01...	0.54876	0.51097	0.433856	0.579709	0.536171
11	E:\Databases\FirstImpressionsV2\test\test80_01...	0.546634	0.398485	0.443701	0.518107	0.492343
12	E:\Databases\FirstImpressionsV2\test\test80_01...	0.459302	0.427114	0.315686	0.495817	0.457954
13	E:\Databases\FirstImpressionsV2\test\test80_01...	0.309097	0.317028	0.218514	0.372315	0.241697
14	E:\Databases\FirstImpressionsV2\test\test80_01...	0.643403	0.509414	0.483608	0.503154	0.550979
15	E:\Databases\FirstImpressionsV2\test\test80_01...	0.65016	0.840148	0.535299	0.710939	0.743357
16	E:\Databases\FirstImpressionsV2\test\test80_01...	0.598313	0.520505	0.450767	0.486345	0.561532
17	E:\Databases\FirstImpressionsV2\test\test80_01...	0.571537	0.673989	0.472203	0.615608	0.621064
18	E:\Databases\FirstImpressionsV2\test\test80_01...	0.552433	0.568787	0.457108	0.613188	0.570902
19	E:\Databases\FirstImpressionsV2\test\test80_01...	0.658695	0.625194	0.634877	0.612277	0.626052
20	E:\Databases\FirstImpressionsV2\test\test80_01...	0.660076	0.544358	0.64178	0.604572	0.628259
21	E:\Databases\FirstImpressionsV2\test\test80_01...	0.543881	0.477881	0.407731	0.555772	0.499664
22	E:\Databases\FirstImpressionsV2\test\test80_01...	0.537325	0.46375	0.419255	0.499785	0.455146
23	E:\Databases\FirstImpressionsV2\test\test80_01...	0.464761	0.434816	0.346836	0.428429	0.358087
24	E:\Databases\FirstImpressionsV2\test\test80_01...	0.633951	0.63333	0.584644	0.615227	0.608006
25	E:\Databases\FirstImpressionsV2\test\test80_01...	0.4517	0.574346	0.350136	0.526873	0.468283
26	E:\Databases\FirstImpressionsV2\test\test80_01...	0.602848	0.592382	0.494679	0.539232	0.505865
27	E:\Databases\FirstImpressionsV2\test\test80_01...	0.586638	0.521421	0.485391	0.530296	0.535499
28	E:\Databases\FirstImpressionsV2\test\test80_01...	0.689552	0.643902	0.695799	0.646209	0.686243
29	E:\Databases\FirstImpressionsV2\test\test80_01...	0.583505	0.564313	0.502263	0.554502	0.539899
30	E:\Databases\FirstImpressionsV2\test\test80_01...	0.642695	0.588222	0.617706	0.615312	0.626649

[2023-12-15 01:11:04] Trait-wise accuracy …

	Openness	Conscientiousness	Extraversion	Agreeableness	Non-Neuroticism	Mean
Metrics
MAE	0.0845	0.0802	0.0793	0.0858	0.0847	0.0829
Accuracy	0.9155	0.9198	0.9207	0.9142	0.9153	0.9171

[2023-12-15 01:11:04] Mean absolute error: 0.0829, Accuracy: 0.9171 …

Log files saved successfully …

— Runtime: 8654.754 sec. —

[17]:

True
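
In the trait-wise table above, each Accuracy value equals 1 minus the corresponding MAE (for example, 1 - 0.0845 = 0.9155). The sketch below checks that relation against the reported numbers and shows how a mean absolute error would be computed; `mean_absolute_error` and the two-sample toy scores are illustrative assumptions, not OCEANAI code or data:

```python
def mean_absolute_error(y_true, y_pred):
    """Mean absolute difference between ground truth and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# MAE values copied from the trait-wise table above
reported_mae = {
    "Openness": 0.0845,
    "Conscientiousness": 0.0802,
    "Extraversion": 0.0793,
    "Agreeableness": 0.0858,
    "Non-Neuroticism": 0.0847,
}

# Accuracy as 1 - MAE, rounded to the four decimals used in the table
accuracy = {trait: round(1 - mae, 4) for trait, mae in reported_mae.items()}
mean_mae = round(sum(reported_mae.values()) / len(reported_mae), 4)
mean_accuracy = round(1 - mean_mae, 4)

print(accuracy)
print(mean_mae, mean_accuracy)

# Toy example with made-up scores for a single trait
toy_mae = mean_absolute_error([0.5, 0.6], [0.55, 0.5])
print(round(toy_mae, 4))
```

The values computed from the reported MAE figures are consistent with the Accuracy row and the Mean column of the table.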
