Multimodal fusion to obtain scores from audio, video and text (First Impressions V2)


Import required packages

[2]:
from oceanai.modules.lab.build import Run

Build

[3]:
_b5 = Run(
    lang = 'en',              # Inference language
    color_simple = '#333',    # Plain text color (hexadecimal code)
    color_info = '#1776D2',   # The color of the text containing the information (hexadecimal code)
    color_err = '#FF0000',    # Error text color (hexadecimal code)
    color_true = '#008001',   # Text color containing positive information (hexadecimal code)
    bold_text = True,         # Bold text
    num_to_df_display = 30,   # Number of rows to display in tables
    text_runtime = 'Runtime', # Runtime text
    metadata = True           # Displaying information about library
)

[2023-12-15 07:01:44] OCEANAI - personal traits:
    Authors:
        Elena Ryumina [ryumina_ev@mail.ru]
        Dmitry Ryumin [dl_03.03.1991@mail.ru]
        Alexey Karpov [karpov@iias.spb.su]
    Maintainers:
        Elena Ryumina [ryumina_ev@mail.ru]
        Dmitry Ryumin [dl_03.03.1991@mail.ru]
    Version: 1.0.0a16
    License: BSD License

Getting and displaying versions of installed libraries

  • _b5.df_pkgs_ - DataFrame with versions of installed libraries

[4]:
_b5.libs_vers(runtime = True, run = True)
Package Version
1 TensorFlow 2.15.0
2 Keras 2.15.0
3 OpenCV 4.8.1
4 MediaPipe 0.9.0
5 NumPy 1.26.2
6 SciPy 1.11.4
7 Pandas 2.1.3
8 Scikit-learn 1.3.2
9 OpenSmile 2.5.0
10 Librosa 0.10.1
11 AudioRead 3.0.1
12 IPython 8.18.1
13 PyMediaInfo 6.1.0
14 Requests 2.31.0
15 JupyterLab 4.0.9
16 LIWC 0.5.0
17 Transformers 4.36.0
18 Sentencepiece 0.1.99
19 Torch 2.0.1+cpu
20 Torchaudio 2.0.2+cpu

— Runtime: 0.004 sec. —
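The version table above can be approximated outside the library. The following is a minimal sketch that uses only the standard library; the helper name `installed_versions` and the package list are illustrative assumptions and are not part of OCEANAI.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment
            result[name] = None
    return result


# Illustrative list; the last name is deliberately one that should not exist
versions = installed_versions(["numpy", "pandas", "a-package-that-does-not-exist"])
for name, ver in versions.items():
    print(f"{name:32s} {ver or 'not installed'}")
```

Note that distribution names can differ from the display names in the table (for example, OpenCV is usually distributed as opencv-python or a variant of it).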

Analysing audio information (forming model and loading model weights)

Formation of the neural network architecture of the model for obtaining features / scores by hand-crafted features (audio modality)

  • _b5.audio_model_hc_ - Neural network model tf.keras.Model for obtaining features / scores by hand-crafted features

[5]:
res_load_audio_model_hc = _b5.load_audio_model_hc(
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-15 07:01:44] Formation of the neural network architecture of the model for obtaining scores by hand-crafted features (audio modality) …

— Runtime: 0.326 sec. —

Downloading the weights of the neural network model to obtain features / scores by hand-crafted features (audio modality)

  • _b5.audio_model_hc_ - Neural network model tf.keras.Model for obtaining features / scores by hand-crafted features

[6]:
# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000      # File download size from network in 1 step
corpus = 'fi'
lang = 'en'

url = _b5.weights_for_big5_['audio'][corpus]['hc']['sberdisk']

res_load_audio_model_weights_hc = _b5.load_audio_model_weights_hc(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = True, # Forced download of a file with weights of a neural network model from the network
    out = True,          # Display
    runtime = True,      # Runtime count
    run = True           # Run blocking
)

[2023-12-15 07:01:45] Downloading the weights of the neural network model to obtain scores by hand-crafted features (audio modality) …

[2023-12-15 07:01:45] File download “weights_2022-05-05_11-27-55.h5” (100.0%) …

— Runtime: 0.226 sec. —
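The settings chunk_size_ and force_reload describe a chunked download with a local cache. The sketch below illustrates that idea with a generic binary stream instead of a network connection. It is not the library's implementation: the function name `save_in_chunks` is hypothetical, and it assumes that force_reload = False means an existing local file is reused.

```python
import io
import tempfile
from pathlib import Path


def save_in_chunks(stream, target, chunk_size=2_000_000, force_reload=False):
    """Copy a binary stream to `target` in fixed-size chunks.

    Returns True if data was written, False if an existing file was reused.
    """
    target = Path(target)
    if target.exists() and not force_reload:
        return False  # Assumption: an existing file is reused without downloading
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, "wb") as fh:
        while True:
            chunk = stream.read(chunk_size)  # At most chunk_size bytes per step
            if not chunk:
                break
            fh.write(chunk)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "models" / "weights.h5"    # Hypothetical file name
    payload = io.BytesIO(b"\x00" * 5_000_000)       # Stands in for a network response
    print(save_in_chunks(payload, target))          # First call writes the file
    print(save_in_chunks(io.BytesIO(b""), target))  # Second call reuses the file
```

With a real HTTP response, the same loop would read from the response body rather than from an in-memory buffer.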

Formation of the neural network architecture of the model for obtaining features / scores by deep features (audio modality)

  • _b5.audio_model_nn_ - Neural network model tf.keras.Model for obtaining features / scores by deep features

[7]:
res_load_audio_model_nn = _b5.load_audio_model_nn(
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-15 07:01:45] Formation of a neural network architecture for obtaining scores by deep features (audio modality) …

— Runtime: 0.219 sec. —

Downloading the weights of the neural network model to obtain features / scores by deep features (audio modality)

  • _b5.audio_model_nn_ - Neural network model tf.keras.Model for obtaining features / scores by deep features

[8]:
# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000      # File download size from network in 1 step

url = _b5.weights_for_big5_['audio'][corpus]['nn']['sberdisk']

res_load_audio_model_weights_nn = _b5.load_audio_model_weights_nn(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = False, # Forced download of a file with weights of a neural network model from the network
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-15 07:01:45] Downloading the weights of the neural network model to obtain scores for deep features (audio modality) …

[2023-12-15 07:01:45] File download “weights_2022-05-03_07-46-14.h5”

— Runtime: 0.328 sec. —

Analysing video information (forming model and loading model weights)

Formation of the neural network architecture of the model for obtaining features / scores by hand-crafted features (video modality)

  • _b5.video_model_hc_ - Neural network model tf.keras.Model for obtaining features / scores by hand-crafted features

[9]:
res_load_video_model_hc = _b5.load_video_model_hc(
    lang = lang,          # Language: 'en' for models trained on First Impressions V2, 'ru' for models trained on MuPTA
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-15 07:01:45] Formation of the neural network architecture of the model for obtaining scores by hand-crafted features (video modality) …

— Runtime: 0.252 sec. —

Downloading the weights of the neural network model to obtain features / scores by hand-crafted features (video modality)

  • _b5.video_model_hc_ - Neural network model tf.keras.Model for obtaining features / scores by hand-crafted features

[10]:
# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000 # File download size from network in 1 step

url = _b5.weights_for_big5_['video'][corpus]['hc']['sberdisk']

res_load_video_model_weights_hc = _b5.load_video_model_weights_hc(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = True, # Forced download of a file with weights of a neural network model from the network
    out = True,          # Display
    runtime = True,      # Runtime count
    run = True           # Run blocking
)

[2023-12-15 07:01:46] Downloading the weights of the neural network model to obtain scores by hand-crafted features (video modality) …

[2023-12-15 07:01:46] File download “weights_2022-08-27_18-53-35.h5” (100.0%) …

— Runtime: 0.24 sec. —

Formation of neural network architecture for obtaining deep features

  • _b5.video_model_deep_fe_ - Neural network model tf.keras.Model for obtaining deep features

[11]:
res_load_video_model_deep_fe = _b5.load_video_model_deep_fe(
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-15 07:01:46] Formation of neural network architecture for obtaining deep features (video modality) …

— Runtime: 0.794 sec. —

Downloading weights of a neural network model to obtain deep features (video modality)

  • _b5.video_model_deep_fe_ - Neural network model tf.keras.Model for obtaining deep features

[12]:
# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000 # File download size from network in 1 step

url = _b5.weights_for_big5_['video'][corpus]['fe']['sberdisk']

res_load_video_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = True, # Forced download of a file with weights of a neural network model from the network
    out = True,          # Display
    runtime = True,      # Runtime count
    run = True           # Run blocking
)

[2023-12-15 07:01:47] Downloading weights of a neural network model to obtain deep features (video modality) …

[2023-12-15 07:01:50] File download “weights_2022-11-01_12-27-07.h5” (100.0%) …

— Runtime: 3.937 sec. —

Formation of the neural network architecture of the model for obtaining features / scores by deep features (video modality)

  • _b5.video_model_nn_ - Neural network model tf.keras.Model for obtaining features / scores by deep features

[13]:
res_load_video_model_nn = _b5.load_video_model_nn(
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-15 07:01:51] Formation of a neural network architecture for obtaining scores by deep features (video modality) …

— Runtime: 0.707 sec. —

Downloading the weights of the neural network model to obtain features / scores by deep features (video modality)

  • _b5.video_model_nn_ - Neural network model tf.keras.Model for obtaining features / scores by deep features

[14]:
# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000      # File download size from network in 1 step

url = _b5.weights_for_big5_['video'][corpus]['nn']['sberdisk']

res_load_video_model_weights_nn = _b5.load_video_model_weights_nn(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = False, # Forced download of a file with weights of a neural network model from the network
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-15 07:01:51] Downloading the weights of the neural network model to obtain scores by deep features (video modality) …

[2023-12-15 07:01:51] File downloading “weights_2022-03-22_16-31-48.h5”

— Runtime: 0.166 sec. —

Analysing text information (forming model and loading model weights)

Loading a dictionary with hand-crafted features

[15]:
# Core setup
_b5.path_to_save_ = './models' # Directory to save the models
_b5.chunk_size_ = 2000000      # File download size from network in one step

res_load_text_features = _b5.load_text_features(
    force_reload = True,       # Forced download file
    out = True,                # Display
    runtime = True,            # Runtime calculation
    run = True                 # Run blocking
)

[2023-12-15 07:01:51] Loading a dictionary with hand-crafted features …

[2023-12-15 07:01:52] Loading the “LIWC2007.txt” file 100.0% …

— Runtime: 0.166 sec. —

Building tokenizer and translation model (RU -> EN)

[16]:
res_setup_translation_model = _b5.setup_translation_model(
    out = True,     # Display
    runtime = True, # Runtime calculation
    run = True      # Run blocking
)

[2023-12-15 07:01:52] Building tokenizer and translation model …

— Runtime: 1.763 sec. —

Building tokenizer and BERT model (for word encoding)

[17]:
# Core setup
_b5.path_to_save_ = './models' # Directory to save the models
_b5.chunk_size_ = 2000000      # File download size from network in one step

res_setup_bert_encoder = _b5.setup_bert_encoder(
    force_reload = True,       # Forced download file
    out = True,                # Display
    runtime = True,            # Runtime calculation
    run = True                 # Run blocking
)

[2023-12-15 07:01:53] Building tokenizer and BERT model …

[2023-12-15 07:01:55] Loading the “bert-base-multilingual-cased.zip” file

[2023-12-15 07:01:55] Unzipping an archive “bert-base-multilingual-cased.zip” …

— Runtime: 5.269 sec. —

Formation of the neural network architecture of the model for obtaining features / scores by hand-crafted features (text modality)

  • _b5.text_model_hc_ - Neural network model tf.keras.Model for obtaining features / scores by hand-crafted features

[18]:
res_load_text_model_hc_fi = _b5.load_text_model_hc(
    corpus = corpus, # Corpus: 'fi' for models trained on First Impressions V2, 'mupta' for models trained on MuPTA
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True, # Display
    runtime = True, # Runtime count
    run = True # Run blocking
)

[2023-12-15 07:01:59] Formation of the neural network architecture of the model for obtaining scores by hand-crafted features (text modality) …

— Runtime: 0.701 sec. —

Downloading the weights of the neural network model to obtain features / scores by hand-crafted features (text modality)

  • _b5.text_model_hc_ - Neural network model tf.keras.Model for obtaining features / scores by hand-crafted features

[19]:
# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000      # File download size from network in 1 step

url = _b5.weights_for_big5_['text'][corpus]['hc']['sberdisk']

res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = True, # Forced download of a file with weights of a neural network model from the network
    out = True,     # Display
    runtime = True, # Runtime count
    run = True      # Run blocking
)

[2023-12-15 07:01:59] Downloading the weights of a neural network model to obtain scores by hand-crafted features (text modality) …

[2023-12-15 07:02:00] File download “weights_2023-07-15_10-52-15.h5” 100.0% …

— Runtime: 0.278 sec. —

Formation of the neural network architecture of the model for obtaining features / scores by deep features (text modality)

  • _b5.text_model_nn_ - Neural network model tf.keras.Model for obtaining features / scores by deep features

[20]:
res_load_text_model_nn_fi = _b5.load_text_model_nn(
    corpus = corpus, # Corpus: 'fi' for models trained on First Impressions V2, 'mupta' for models trained on MuPTA
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True, # Display
    runtime = True, # Runtime count
    run = True # Run blocking
)

[2023-12-15 07:02:00] Formation of a neural network architecture for obtaining scores by deep features (text modality) …

— Runtime: 0.286 sec. —

Downloading the weights of the neural network model to obtain features / scores by deep features (text modality)

  • _b5.text_model_nn_ - Neural network model tf.keras.Model for obtaining features / scores by deep features

[21]:
# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000      # File download size from network in 1 step

url = _b5.weights_for_big5_['text'][corpus]['nn']['sberdisk']

res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(
    url = url, # Full path to the file with weights of the neural network model
    force_reload = True, # Forced download of a file with weights of a neural network model from the network
    out = True,     # Display
    runtime = True, # Runtime count
    run = True      # Run blocking
)

[2023-12-15 07:02:00] Downloading the weights of a neural network model to obtain deep features (text modality) …

[2023-12-15 07:02:00] File download “weights_2023-07-03_15-01-08.h5” 100.0% …

— Runtime: 0.42 sec. —

Analysing multimodal information (forming model, loading model weights, obtaining personality traits scores)

Formation of neural network architectures of models for obtaining the personality traits scores

  • _b5.avt_model_b5_ - Neural network model tf.keras.Model for obtaining the personality traits scores

[22]:
res_load_avt_model_b5 = _b5.load_avt_model_b5(
    show_summary = False, # Displaying the formed neural network architecture of the model
    out = True,           # Display
    runtime = True,       # Runtime count
    run = True            # Run blocking
)

[2023-12-15 07:02:00] Formation of neural network architectures of models for obtaining the personality traits scores (multimodal fusion) …

— Runtime: 0.212 sec. —

Downloading the weights of neural network models to obtain the personality traits scores (multimodal fusion)

  • _b5.avt_model_b5_ - Neural network model tf.keras.Model for obtaining the personality traits scores

[23]:
# Core settings
_b5.path_to_save_ = './models' # Directory to save the file
_b5.chunk_size_ = 2000000      # File download size from network in 1 step

url = _b5.weights_for_big5_['avt'][corpus]['b5']['sberdisk']

res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(
    url = url,
    force_reload = True, # Forced download of a file with weights of a neural network model from the network
    out = True,          # Display
    runtime = True,      # Runtime count
    run = True           # Run blocking
)

[2023-12-15 07:02:01] Downloading the weights of neural network models to obtain the personality traits scores (multimodal fusion) …

[2023-12-15 07:02:01] File download “avt_fi_2023-12-03_11-36-51.h5”

— Runtime: 0.295 sec. —
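Each loading call above stores its return value in a res_* variable, but the notebook never inspects them. Assuming these values are boolean success flags (which the notebook does not confirm for the loading steps), a small check before starting the long prediction run could look like the sketch below. The helper `failed_steps` and the sample flags are hypothetical.

```python
def failed_steps(results):
    """Return the names of steps whose result flag is not True."""
    return [name for name, ok in results.items() if ok is not True]


# Hypothetical flags; in the notebook these would be the res_* variables
results = {
    "audio_model_hc": True,
    "audio_model_weights_hc": True,
    "video_model_hc": False,
}

failed = failed_steps(results)
if failed:
    print("Steps to re-run:", ", ".join(failed))
else:
    print("All models and weights are loaded")
```

Checking the flags up front avoids discovering a missing model only after the prediction step has already started.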

Getting scores (multimodal fusion)

  • _b5.df_files_ - DataFrame with data

  • _b5.df_accuracy_ - DataFrame with accuracy

[24]:
# Core settings
_b5.path_to_dataset_ = 'E:/Databases/FirstImpressionsV2/test' # Dataset directory
# Directories not included in the selection
_b5.ignore_dirs_ = []
# Key names for DataFrame dataset
_b5.keys_dataset_ = ['Path', 'Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Non-Neuroticism']
_b5.ext_ = ['.mp4'] # Search file extensions

# Full path to the file containing the ground truth scores for the accuracy calculation
url_accuracy = _b5.true_traits_[corpus]['sberdisk']

_b5.get_avt_predictions(
    depth = 1,         # Hierarchy depth for receiving audio and video data
    recursive = False, # Recursive data search
    sr = 44100,        # Sampling frequency
    window_audio = 2,  # Audio segment window size (in seconds)
    step_audio = 1,    # Audio segment window shift step (in seconds)
    reduction_fps = 5, # Frame rate reduction
    window_video = 10, # Video segment window size (in seconds)
    step_video = 5,    # Video segment window shift step (in seconds)
    asr = False,       # Using a model for ASR
    lang = lang,       # Language: 'en' for models trained on First Impressions V2, 'ru' for models trained on MuPTA
    accuracy = True,   # Accuracy
    url_accuracy = url_accuracy,
    logs = True,       # If necessary, generate a LOG file
    out = True,        # Display
    runtime = True,    # Runtime count
    run = True         # Run blocking
)

[2023-12-15 10:22:11] Feature extraction (hand-crafted and deep) from text …

[2023-12-15 10:22:14] Getting scores and accuracy calculation (multimodal fusion) …

2000 from 2000 (100.0%) … test80_25_Q4wOgixh7E.004.mp4 …

Path Openness Conscientiousness Extraversion Agreeableness Non-Neuroticism
ID
1 E:\Databases\FirstImpressionsV2\test\test80_01... 0.545377 0.523155 0.456685 0.533811 0.516093
2 E:\Databases\FirstImpressionsV2\test\test80_01... 0.520572 0.396216 0.478419 0.528622 0.459169
3 E:\Databases\FirstImpressionsV2\test\test80_01... 0.450715 0.491121 0.36674 0.510387 0.414304
4 E:\Databases\FirstImpressionsV2\test\test80_01... 0.665193 0.648017 0.640581 0.580625 0.596675
5 E:\Databases\FirstImpressionsV2\test\test80_01... 0.669463 0.606313 0.619956 0.653291 0.618665
6 E:\Databases\FirstImpressionsV2\test\test80_01... 0.632529 0.722035 0.583922 0.63653 0.603358
7 E:\Databases\FirstImpressionsV2\test\test80_01... 0.489579 0.453927 0.373339 0.486156 0.421787
8 E:\Databases\FirstImpressionsV2\test\test80_01... 0.59544 0.615519 0.514064 0.627394 0.601345
9 E:\Databases\FirstImpressionsV2\test\test80_01... 0.559325 0.50692 0.442211 0.537979 0.499341
10 E:\Databases\FirstImpressionsV2\test\test80_01... 0.509495 0.526581 0.406979 0.565923 0.54616
11 E:\Databases\FirstImpressionsV2\test\test80_01... 0.599391 0.516418 0.516382 0.589003 0.558064
12 E:\Databases\FirstImpressionsV2\test\test80_01... 0.458006 0.496319 0.345605 0.48779 0.448027
13 E:\Databases\FirstImpressionsV2\test\test80_01... 0.377578 0.410694 0.283698 0.384478 0.313993
14 E:\Databases\FirstImpressionsV2\test\test80_01... 0.563649 0.499573 0.445833 0.454925 0.463903
15 E:\Databases\FirstImpressionsV2\test\test80_01... 0.7302 0.784698 0.51636 0.698729 0.713016
16 E:\Databases\FirstImpressionsV2\test\test80_01... 0.620163 0.564576 0.556421 0.563072 0.543618
17 E:\Databases\FirstImpressionsV2\test\test80_01... 0.603495 0.644997 0.440616 0.603712 0.578639
18 E:\Databases\FirstImpressionsV2\test\test80_01... 0.543104 0.489751 0.452691 0.566111 0.520961
19 E:\Databases\FirstImpressionsV2\test\test80_01... 0.624445 0.574276 0.609165 0.582815 0.560111
20 E:\Databases\FirstImpressionsV2\test\test80_01... 0.658763 0.545697 0.627865 0.61989 0.609391
21 E:\Databases\FirstImpressionsV2\test\test80_01... 0.562814 0.493076 0.430422 0.539134 0.502142
22 E:\Databases\FirstImpressionsV2\test\test80_01... 0.472688 0.417943 0.423233 0.472491 0.392815
23 E:\Databases\FirstImpressionsV2\test\test80_01... 0.43985 0.429655 0.319237 0.420569 0.414306
24 E:\Databases\FirstImpressionsV2\test\test80_01... 0.638308 0.632067 0.580016 0.642938 0.603159
25 E:\Databases\FirstImpressionsV2\test\test80_01... 0.506815 0.57838 0.367448 0.523856 0.481819
26 E:\Databases\FirstImpressionsV2\test\test80_01... 0.517949 0.562723 0.383299 0.483178 0.467141
27 E:\Databases\FirstImpressionsV2\test\test80_01... 0.570406 0.441804 0.454944 0.530368 0.512669
28 E:\Databases\FirstImpressionsV2\test\test80_01... 0.637813 0.611132 0.607629 0.636313 0.620745
29 E:\Databases\FirstImpressionsV2\test\test80_01... 0.572268 0.532781 0.504937 0.575169 0.518609
30 E:\Databases\FirstImpressionsV2\test\test80_01... 0.658128 0.598394 0.59656 0.621783 0.612908

[2023-12-15 10:22:14] Trait-wise accuracy …

Openness Conscientiousness Extraversion Agreeableness Non-Neuroticism Mean
Metrics
MAE 0.0758 0.0716 0.0688 0.0752 0.0731 0.0729
Accuracy 0.9242 0.9284 0.9312 0.9248 0.9269 0.9271

[2023-12-15 10:22:14] Mean absolute error: 0.0729, Accuracy: 0.9271 …

Log files saved successfully …

— Runtime: 12013.03 sec. —
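In the trait-wise table above, every Accuracy value equals 1 minus the corresponding MAE. This relation is inferred from the printed numbers and not from the library's source code. A minimal sketch that reproduces the Accuracy row and the Mean column from the rounded MAE values:

```python
# Trait-wise MAE values copied from the table above
mae = {
    "Openness": 0.0758,
    "Conscientiousness": 0.0716,
    "Extraversion": 0.0688,
    "Agreeableness": 0.0752,
    "Non-Neuroticism": 0.0731,
}

# Assumption inferred from the table: Accuracy = 1 - MAE
accuracy = {trait: round(1 - value, 4) for trait, value in mae.items()}

mean_mae = round(sum(mae.values()) / len(mae), 4)
mean_accuracy = round(1 - mean_mae, 4)

for trait in mae:
    print(f"{trait:20s} MAE = {mae[trait]:.4f}  Accuracy = {accuracy[trait]:.4f}")
print(f"{'Mean':20s} MAE = {mean_mae:.4f}  Accuracy = {mean_accuracy:.4f}")
```

The library presumably averages unrounded values, so results recomputed from rounded table entries could differ in the last digit for other data.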

[24]:
True
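The parameters window_audio / step_audio and window_video / step_video describe overlapping segments measured in seconds. The sketch below shows how such segment boundaries can be enumerated. The function `window_bounds` is hypothetical, and truncating the last window at the end of the recording is an assumption; the library may handle the tail differently.

```python
def window_bounds(duration, window, step):
    """Return (start, end) times in seconds of overlapping windows over a clip."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    bounds = []
    start = 0.0
    while start < duration:
        end = min(start + window, duration)  # Assumption: the last window is truncated
        bounds.append((start, end))
        if end >= duration:
            break
        start += step
    return bounds


# Audio settings from the cell above: 2-second window, 1-second step
print(window_bounds(5, window=2, step=1))
# Video settings from the cell above: 10-second window, 5-second step
print(window_bounds(15, window=10, step=5))
```

With a step smaller than the window, consecutive segments overlap, so each moment of the recording contributes to more than one segment.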