Solution of practical task 3
Task: Forming effective work teams
The practical task is solved in two stages. In the first stage, the OCEAN-AI library is used to obtain predictions (personality trait scores). In the second stage, the _colleague_ranking method from the OCEAN-AI library is used to solve the task, illustrated by finding suitable colleagues for a target person. Implementation examples and their results are presented below.
Thus, the OCEAN-AI library provides tools for analyzing colleagues' personality traits and can help to form effective work teams, improve communication, and reduce team conflicts.
FI V2
[2]:
# Import required tools
import os
import pandas as pd
# Module import
from oceanai.modules.lab.build import Run
# Creating an instance of a class
_b5 = Run(lang = 'en')
# Core setup
_b5.path_to_save_ = './models' # Directory to save the models
_b5.chunk_size_ = 2000000 # File download size from network in one step
corpus = 'fi'
# Building audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()
# Loading audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url)
url = _b5.weights_for_big5_['audio'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url)
# Building video models
res_load_model_hc = _b5.load_video_model_hc(lang='en')
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()
# Loading video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url)
url = _b5.weights_for_big5_['video'][corpus]['fe']['sberdisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url)
url = _b5.weights_for_big5_['video'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url)
# Loading a dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()
# Building text models
res_setup_translation_model = _b5.setup_translation_model()
res_setup_bert_encoder = _b5.setup_bert_encoder()
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)
# Loading text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['sberdisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url)
url = _b5.weights_for_big5_['text'][corpus]['nn']['sberdisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url)
# Building model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()
# Loading model weights for multimodal information fusion
url = _b5.weights_for_big5_['avt'][corpus]['b5']['sberdisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url)
PATH_TO_DIR = './video_FI/'
PATH_SAVE_VIDEO = './video_FI/test/'
_b5.path_to_save_ = PATH_SAVE_VIDEO
# Loading 10 test files from the First Impressions V2 corpus
# URL: https://chalearnlap.cvc.uab.cat/dataset/24/description/
domain = 'https://download.sberdisk.ru/download/file/'
tets_name_files = [
'429713680?token=FqHdMLSSh7zYSZt&filename=_plk5k7PBEg.003.mp4',
'429713681?token=Hz9b4lQkrLfic33&filename=be0DQawtVkE.002.mp4',
'429713683?token=EgUXS9Xs8xHm5gz&filename=2d6btbaNdfo.000.mp4',
'429713684?token=1U26753kmPYdIgt&filename=300gK3CnzW0.003.mp4',
'429713685?token=LyigAWLTzDNwKJO&filename=300gK3CnzW0.001.mp4',
'429713686?token=EpfRbCKHyuc4HPu&filename=cLaZxEf1nE4.004.mp4',
'429713687?token=FNTkwqBr4jOS95l&filename=g24JGYuT74A.004.mp4',
'429713688?token=qDT95nz7hfm2Nki&filename=JZNMxa3OKHY.000.mp4',
'429713689?token=noLguEGXDpbcKhg&filename=nvlqJbHk_Lc.003.mp4',
'429713679?token=9L7RQ0hgdJlcek6&filename=4vdJGgZpj4k.003.mp4'
]
for curr_files in tets_name_files:
    _b5.download_file_from_url(url = domain + curr_files, out = True)
# Getting scores
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mp4'] # Search file extensions
# Full path to the file with ground truth scores for accuracy calculation
url_accuracy = _b5.true_traits_[corpus]['sberdisk']
_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = 'en')
[2023-12-16 19:24:17] Feature extraction (hand-crafted and deep) from text …
[2023-12-16 19:24:19] Getting scores and accuracy calculation (multimodal fusion) …
10 from 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_FI\test\_plk5k7PBEg.003.mp4 …
Person ID | Path | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
---|---|---|---|---|---|---|
1 | 2d6btbaNdfo.000.mp4 | 0.581159 | 0.628822 | 0.466609 | 0.622129 | 0.553832 |
2 | 300gK3CnzW0.001.mp4 | 0.463991 | 0.418851 | 0.41301 | 0.493329 | 0.423093 |
3 | 300gK3CnzW0.003.mp4 | 0.454281 | 0.415049 | 0.39189 | 0.485114 | 0.420741 |
4 | 4vdJGgZpj4k.003.mp4 | 0.588461 | 0.643233 | 0.530789 | 0.603038 | 0.593398 |
5 | be0DQawtVkE.002.mp4 | 0.633433 | 0.533295 | 0.523742 | 0.608591 | 0.588456 |
6 | cLaZxEf1nE4.004.mp4 | 0.636944 | 0.542386 | 0.558461 | 0.570975 | 0.558983 |
7 | g24JGYuT74A.004.mp4 | 0.531518 | 0.376987 | 0.393309 | 0.4904 | 0.447881 |
8 | JZNMxa3OKHY.000.mp4 | 0.610342 | 0.541418 | 0.563163 | 0.595013 | 0.569461 |
9 | nvlqJbHk_Lc.003.mp4 | 0.495809 | 0.458526 | 0.414436 | 0.469152 | 0.435461 |
10 | _plk5k7PBEg.003.mp4 | 0.60707 | 0.591893 | 0.520662 | 0.603938 | 0.565726 |
[2023-12-16 19:24:19] Trait-wise accuracy …
Metrics | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism | Mean |
---|---|---|---|---|---|---|
MAE | 0.0589 | 0.0612 | 0.0864 | 0.0697 | 0.0582 | 0.0669 |
Accuracy | 0.9411 | 0.9388 | 0.9136 | 0.9303 | 0.9418 | 0.9331 |
[2023-12-16 19:24:19] Mean absolute errors: 0.0669, average accuracy: 0.9331 …
Log files saved successfully …
--- Runtime: 67.109 sec. ---
[2]:
True
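The metrics table above is consistent with accuracy being reported as 1 minus the mean absolute error (for example, 1 - 0.0589 = 0.9411 for Openness). The following minimal sketch shows that relation on made-up numbers. The function name `mae_and_accuracy` and the sample scores are hypothetical, and the way the library computes these metrics internally is an assumption.

```python
import numpy as np


def mae_and_accuracy(y_true, y_pred):
    """Return per-trait MAE and accuracy, assuming accuracy = 1 - MAE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Average the absolute errors over people (rows), separately for each trait
    mae = np.abs(y_true - y_pred).mean(axis=0)
    return mae, 1.0 - mae


# Hypothetical ground truth and predictions: two people, two traits
y_true = [[0.60, 0.50], [0.40, 0.70]]
y_pred = [[0.55, 0.60], [0.45, 0.60]]

mae, accuracy = mae_and_accuracy(y_true, y_pred)
print('MAE:', mae)
print('Accuracy:', accuracy)
```

Averaging the per-trait values then gives the figures in the Mean column.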
To find a suitable colleague, you need two correlation coefficients for each personality trait. These coefficients should describe the effect of one person's trait score being higher or lower than the same trait score of the other person.
As an example, we use the correlation coefficients for a manager-employee relationship reported in the following article:
Kuroda S., Yamamoto I. Good boss, bad boss, workers’ mental health and productivity: Evidence from Japan // Japan & The World Economy. – 2018. – vol. 48. – pp. 106-118.
The user can set their own correlation coefficients.
[3]:
# Loading dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675819?token=LuB7L1QsEY0UuSs&filename=colleague_ranking.csv'
df_correlation_coefficients = pd.read_csv(url)
df_correlation_coefficients = pd.DataFrame(
df_correlation_coefficients.drop(['ID'], axis = 1)
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)
df_correlation_coefficients
[3]:
ID | Score_comparison | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
---|---|---|---|---|---|---|
1 | higher | -0.0602 | 0.0471 | -0.1070 | -0.0832 | 0.190 |
2 | lower | -0.1720 | -0.1050 | 0.0772 | 0.0703 | -0.229 |
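Since the coefficients can be user-defined, the same table can also be built by hand instead of being downloaded. The sketch below reproduces the layout displayed above with the same values. It assumes that this layout (a `Score_comparison` column, one column per trait, and a string index named `ID`) is all the ranking method needs, which is not verified here. The name `custom_coefficients` is hypothetical.

```python
import pandas as pd

traits = ['Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Non-Neuroticism']

# One row for the "higher" case and one for the "lower" case
custom_coefficients = pd.DataFrame(
    [
        ['higher', -0.0602, 0.0471, -0.1070, -0.0832, 0.190],
        ['lower', -0.1720, -0.1050, 0.0772, 0.0703, -0.229],
    ],
    columns = ['Score_comparison'] + traits,
)

# Mirror the index handling of the cell above: 1-based string IDs
custom_coefficients.index = (custom_coefficients.index + 1).map(str)
custom_coefficients.index.name = 'ID'

print(custom_coefficients)
```

Replacing the numbers in the two rows is then enough to try other coefficients.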
Finding a suitable senior colleague
[4]:
# List of personality traits scores of the target person
target_scores = [0.527886, 0.522337, 0.458468, 0.51761, 0.444649]
_b5._colleague_ranking(
correlation_coefficients = df_correlation_coefficients,
target_scores = target_scores,
colleague = 'major',
equal_coefficients = 0.5,
out = False
)
_b5._save_logs(df = _b5.df_files_colleague_, name = 'major_colleague_ranking_fi_en', out = True)
# Optional
df = _b5.df_files_colleague_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = df.columns[1:]
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[4]:
Person ID | Path | OPE | CON | EXT | AGR | NNEU | Match |
---|---|---|---|---|---|---|---|
7 | g24JGYuT74A.004.mp4 | 0.532 | 0.377 | 0.393 | 0.490 | 0.448 | 0.078 |
4 | 4vdJGgZpj4k.003.mp4 | 0.588 | 0.643 | 0.531 | 0.603 | 0.593 | 0.001 |
1 | 2d6btbaNdfo.000.mp4 | 0.581 | 0.629 | 0.467 | 0.622 | 0.554 | -0.002 |
10 | _plk5k7PBEg.003.mp4 | 0.607 | 0.592 | 0.521 | 0.604 | 0.566 | -0.007 |
5 | be0DQawtVkE.002.mp4 | 0.633 | 0.533 | 0.524 | 0.609 | 0.588 | -0.008 |
8 | JZNMxa3OKHY.000.mp4 | 0.610 | 0.541 | 0.563 | 0.595 | 0.569 | -0.013 |
6 | cLaZxEf1nE4.004.mp4 | 0.637 | 0.542 | 0.558 | 0.571 | 0.559 | -0.014 |
3 | 300gK3CnzW0.003.mp4 | 0.454 | 0.415 | 0.392 | 0.485 | 0.421 | -0.154 |
2 | 300gK3CnzW0.001.mp4 | 0.464 | 0.419 | 0.413 | 0.493 | 0.423 | -0.154 |
9 | nvlqJbHk_Lc.003.mp4 | 0.496 | 0.459 | 0.414 | 0.469 | 0.435 | -0.168 |
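The Match values in the table above are consistent with a simple weighted sum, sketched below. This is inferred from the displayed numbers and is not taken from the library's source code. For a senior colleague, each trait score of the candidate appears to be multiplied by the "higher" coefficient when it exceeds the target's score and by the "lower" coefficient otherwise, and the products are summed. The function name `match_score` is hypothetical, and the handling of tied scores (the `equal_coefficients` argument) is not modeled.

```python
def match_score(candidate, target, higher, lower):
    """Sum of candidate scores weighted by the coefficient chosen per trait."""
    total = 0.0
    for c, t, h, l in zip(candidate, target, higher, lower):
        # Assumption: ties fall into the "lower" branch in this sketch
        coefficient = h if c > t else l
        total += c * coefficient
    return total


# Coefficients and target scores as used in this section
higher = [-0.0602, 0.0471, -0.1070, -0.0832, 0.190]
lower = [-0.1720, -0.1050, 0.0772, 0.0703, -0.229]
target_scores = [0.527886, 0.522337, 0.458468, 0.51761, 0.444649]

# Unrounded predictions for persons 7 and 4 from the scores table
person_7 = [0.531518, 0.376987, 0.393309, 0.4904, 0.447881]
person_4 = [0.588461, 0.643233, 0.530789, 0.603038, 0.593398]

print(round(match_score(person_7, target_scores, higher, lower), 3))
print(round(match_score(person_4, target_scores, higher, lower), 3))
```

For these two people the sketch gives 0.078 and 0.001, the same as the first two rows of the ranking.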
Finding a suitable junior colleague
[5]:
# List of personality traits scores of the target person
target_scores = [0.527886, 0.522337, 0.458468, 0.51761, 0.444649]
_b5._colleague_ranking(
correlation_coefficients = df_correlation_coefficients,
target_scores = target_scores,
colleague = 'minor',
equal_coefficients = 0.5,
out = False
)
_b5._save_logs(df = _b5.df_files_colleague_, name = 'minor_colleague_ranking_fi_en', out = True)
# Optional
df = _b5.df_files_colleague_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = df.columns[1:]
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[5]:
Person ID | Path | OPE | CON | EXT | AGR | NNEU | Match |
---|---|---|---|---|---|---|---|
9 | nvlqJbHk_Lc.003.mp4 | 0.496 | 0.459 | 0.414 | 0.469 | 0.435 | -0.009 |
3 | 300gK3CnzW0.003.mp4 | 0.454 | 0.415 | 0.392 | 0.485 | 0.421 | -0.010 |
2 | 300gK3CnzW0.001.mp4 | 0.464 | 0.419 | 0.413 | 0.493 | 0.423 | -0.013 |
8 | JZNMxa3OKHY.000.mp4 | 0.610 | 0.541 | 0.563 | 0.595 | 0.569 | -0.207 |
6 | cLaZxEf1nE4.004.mp4 | 0.637 | 0.542 | 0.558 | 0.571 | 0.559 | -0.211 |
1 | 2d6btbaNdfo.000.mp4 | 0.581 | 0.629 | 0.467 | 0.622 | 0.554 | -0.213 |
10 | _plk5k7PBEg.003.mp4 | 0.607 | 0.592 | 0.521 | 0.604 | 0.566 | -0.213 |
5 | be0DQawtVkE.002.mp4 | 0.633 | 0.533 | 0.524 | 0.609 | 0.588 | -0.216 |
4 | 4vdJGgZpj4k.003.mp4 | 0.588 | 0.643 | 0.531 | 0.603 | 0.593 | -0.221 |
7 | g24JGYuT74A.004.mp4 | 0.532 | 0.377 | 0.393 | 0.490 | 0.448 | -0.259 |
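For a junior colleague, the displayed Match values are consistent with the same kind of weighted sum but with the comparison reversed: the "higher" coefficient appears to apply when the target's score exceeds the candidate's. The vectorized sketch below is inferred from the numbers in the table, not from the library's implementation. The function name `match_scores` is hypothetical, and ties are not handled the way `equal_coefficients` may handle them.

```python
import numpy as np


def match_scores(candidates, target, higher, lower, colleague = 'major'):
    """Weighted sums of candidate scores, one value per candidate row."""
    candidates = np.asarray(candidates, dtype = float)
    target = np.asarray(target, dtype = float)
    if colleague == 'major':
        use_higher = candidates > target  # candidate scores above the target
    else:
        use_higher = target > candidates  # target scores above the candidate
    coefficients = np.where(use_higher, higher, lower)
    return (candidates * coefficients).sum(axis = 1)


higher = [-0.0602, 0.0471, -0.1070, -0.0832, 0.190]
lower = [-0.1720, -0.1050, 0.0772, 0.0703, -0.229]
target_scores = [0.527886, 0.522337, 0.458468, 0.51761, 0.444649]

# Unrounded predictions for persons 9 and 7 from the scores table
candidates = [
    [0.495809, 0.458526, 0.414436, 0.469152, 0.435461],
    [0.531518, 0.376987, 0.393309, 0.4904, 0.447881],
]

scores = match_scores(candidates, target_scores, higher, lower, colleague = 'minor')
print(np.round(scores, 3))
```

The two results, -0.009 and -0.259, agree with the first and last rows of the junior ranking.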
MuPTA (ru)
[6]:
import os
import pandas as pd
# Module import
from oceanai.modules.lab.build import Run
# Creating an instance of a class
_b5 = Run(lang = 'en')
corpus = 'mupta'
lang = 'ru'
# Core setup
_b5.path_to_save_ = './models' # Directory to save the models
_b5.chunk_size_ = 2000000 # File download size from network in one step
# Building audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()
# Loading audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url)
url = _b5.weights_for_big5_['audio'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url)
# Building video models
res_load_model_hc = _b5.load_video_model_hc(lang=lang)
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()
# Loading video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url)
url = _b5.weights_for_big5_['video'][corpus]['fe']['sberdisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url)
url = _b5.weights_for_big5_['video'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url)
# Loading a dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()
# Building text models
res_setup_translation_model = _b5.setup_translation_model()
res_setup_bert_encoder = _b5.setup_bert_encoder()
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)
# Loading text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['sberdisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url)
url = _b5.weights_for_big5_['text'][corpus]['nn']['sberdisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url)
# Building model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()
# Loading model weights for multimodal information fusion
url = _b5.weights_for_big5_['avt'][corpus]['b5']['sberdisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url)
PATH_TO_DIR = './video_MuPTA/'
PATH_SAVE_VIDEO = './video_MuPTA/test/'
_b5.path_to_save_ = PATH_SAVE_VIDEO
# Loading 10 test files from the MuPTA corpus
# URL: https://hci.nw.ru/en/pages/mupta-corpus
domain = 'https://download.sberdisk.ru/download/file/'
tets_name_files = [
'477995979?token=2cvyk7CS0mHx2MJ&filename=speaker_06_center_83.mov',
'477995980?token=jGPtBPS69uzFU6Y&filename=speaker_01_center_83.mov',
'477995967?token=zCaRbNB6ht5wMPq&filename=speaker_11_center_83.mov',
'477995966?token=B1rbinDYRQKrI3T&filename=speaker_15_center_83.mov',
'477995978?token=dEpVDtZg1EQiEQ9&filename=speaker_07_center_83.mov',
'477995961?token=o1hVjw8G45q9L9Z&filename=speaker_19_center_83.mov',
'477995964?token=5K220Aqf673VHPq&filename=speaker_23_center_83.mov',
'477995965?token=v1LVD2KT1cU7Lpb&filename=speaker_24_center_83.mov',
'477995962?token=tmaSGyyWLA6XCy9&filename=speaker_27_center_83.mov',
'477995963?token=bTpo96qNDPcwGqb&filename=speaker_10_center_83.mov',
]
for curr_files in tets_name_files:
    _b5.download_file_from_url(url = domain + curr_files, out = True)
# Getting scores
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mov'] # Search file extensions
# Full path to the file with ground truth scores for accuracy calculation
url_accuracy = _b5.true_traits_['mupta']['sberdisk']
_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = lang)
[2023-12-16 19:32:56] Feature extraction (hand-crafted and deep) from text …
[2023-12-16 19:33:00] Getting scores and accuracy calculation (multimodal fusion) …
10 from 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_MuPTA\test\speaker_27_center_83.mov …
Person ID | Path | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
---|---|---|---|---|---|---|
1 | speaker_01_center_83.mov | 0.758137 | 0.693356 | 0.650108 | 0.744589 | 0.488671 |
2 | speaker_06_center_83.mov | 0.681602 | 0.654339 | 0.607156 | 0.731282 | 0.417908 |
3 | speaker_07_center_83.mov | 0.666104 | 0.656836 | 0.567863 | 0.685067 | 0.378102 |
4 | speaker_10_center_83.mov | 0.694171 | 0.596195 | 0.571414 | 0.66223 | 0.348639 |
5 | speaker_11_center_83.mov | 0.712885 | 0.594764 | 0.571709 | 0.716696 | 0.37802 |
6 | speaker_15_center_83.mov | 0.664158 | 0.670411 | 0.60421 | 0.696056 | 0.399842 |
7 | speaker_19_center_83.mov | 0.761213 | 0.652635 | 0.651028 | 0.788677 | 0.459676 |
8 | speaker_23_center_83.mov | 0.692788 | 0.68324 | 0.616737 | 0.795205 | 0.447242 |
9 | speaker_24_center_83.mov | 0.705923 | 0.658382 | 0.610645 | 0.697415 | 0.411988 |
10 | speaker_27_center_83.mov | 0.753417 | 0.708372 | 0.654608 | 0.816416 | 0.504743 |
[2023-12-16 19:33:00] Trait-wise accuracy …
Metrics | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism | Mean |
---|---|---|---|---|---|---|
MAE | 0.0673 | 0.0789 | 0.1325 | 0.102 | 0.1002 | 0.0962 |
Accuracy | 0.9327 | 0.9211 | 0.8675 | 0.898 | 0.8998 | 0.9038 |
[2023-12-16 19:33:00] Mean absolute errors: 0.0962, average accuracy: 0.9038 …
Log files saved successfully …
--- Runtime: 444.191 sec. ---
[6]:
True
To find a suitable colleague, you need two correlation coefficients for each personality trait. These coefficients should describe the effect of one person's trait score being higher or lower than the same trait score of the other person.
As an example, we use the correlation coefficients for a manager-employee relationship reported in the following article:
Kuroda S., Yamamoto I. Good boss, bad boss, workers’ mental health and productivity: Evidence from Japan // Japan & The World Economy. – 2018. – vol. 48. – pp. 106-118.
The user can set their own correlation coefficients.
[7]:
# Loading dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675819?token=LuB7L1QsEY0UuSs&filename=colleague_ranking.csv'
df_correlation_coefficients = pd.read_csv(url)
df_correlation_coefficients = pd.DataFrame(
df_correlation_coefficients.drop(['ID'], axis = 1)
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)
df_correlation_coefficients
[7]:
ID | Score_comparison | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
---|---|---|---|---|---|---|
1 | higher | -0.0602 | 0.0471 | -0.1070 | -0.0832 | 0.190 |
2 | lower | -0.1720 | -0.1050 | 0.0772 | 0.0703 | -0.229 |
Finding a suitable senior colleague
[8]:
# List of personality traits scores of the target person
target_scores = [0.527886, 0.522337, 0.458468, 0.51761, 0.444649]
_b5._colleague_ranking(
correlation_coefficients = df_correlation_coefficients,
target_scores = target_scores,
colleague = 'major',
equal_coefficients = 0.5,
out = False
)
_b5._save_logs(df = _b5.df_files_colleague_, name = 'major_colleague_ranking_mupta_ru', out = True)
# Optional
df = _b5.df_files_colleague_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = df.columns[1:]
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[8]:
Person ID | Path | OPE | CON | EXT | AGR | NNEU | Match |
---|---|---|---|---|---|---|---|
1 | speaker_01_center_83.mov | 0.758 | 0.693 | 0.650 | 0.745 | 0.489 | -0.052 |
10 | speaker_27_center_83.mov | 0.753 | 0.708 | 0.655 | 0.816 | 0.505 | -0.054 |
8 | speaker_23_center_83.mov | 0.693 | 0.683 | 0.617 | 0.795 | 0.447 | -0.057 |
7 | speaker_19_center_83.mov | 0.761 | 0.653 | 0.651 | 0.789 | 0.460 | -0.063 |
4 | speaker_10_center_83.mov | 0.694 | 0.596 | 0.571 | 0.662 | 0.349 | -0.210 |
3 | speaker_07_center_83.mov | 0.666 | 0.657 | 0.568 | 0.685 | 0.378 | -0.214 |
5 | speaker_11_center_83.mov | 0.713 | 0.595 | 0.572 | 0.717 | 0.378 | -0.222 |
6 | speaker_15_center_83.mov | 0.664 | 0.670 | 0.604 | 0.696 | 0.400 | -0.223 |
9 | speaker_24_center_83.mov | 0.706 | 0.658 | 0.611 | 0.697 | 0.412 | -0.229 |
2 | speaker_06_center_83.mov | 0.682 | 0.654 | 0.607 | 0.731 | 0.418 | -0.232 |
Finding a suitable junior colleague
[9]:
# List of personality traits scores of the target person
target_scores = [0.527886, 0.522337, 0.458468, 0.51761, 0.444649]
_b5._colleague_ranking(
correlation_coefficients = df_correlation_coefficients,
target_scores = target_scores,
colleague = 'minor',
equal_coefficients = 0.5,
out = False
)
_b5._save_logs(df = _b5.df_files_colleague_, name = 'minor_colleague_ranking_mupta_ru', out = True)
# Optional
df = _b5.df_files_colleague_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = df.columns[1:]
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[9]:
Person ID | Path | OPE | CON | EXT | AGR | NNEU | Match |
---|---|---|---|---|---|---|---|
2 | speaker_06_center_83.mov | 0.682 | 0.654 | 0.607 | 0.731 | 0.418 | -0.008 |
6 | speaker_15_center_83.mov | 0.664 | 0.670 | 0.604 | 0.696 | 0.400 | -0.013 |
9 | speaker_24_center_83.mov | 0.706 | 0.658 | 0.611 | 0.697 | 0.412 | -0.016 |
5 | speaker_11_center_83.mov | 0.713 | 0.595 | 0.572 | 0.717 | 0.378 | -0.019 |
3 | speaker_07_center_83.mov | 0.666 | 0.657 | 0.568 | 0.685 | 0.378 | -0.020 |
4 | speaker_10_center_83.mov | 0.694 | 0.596 | 0.571 | 0.662 | 0.349 | -0.025 |
8 | speaker_23_center_83.mov | 0.693 | 0.683 | 0.617 | 0.795 | 0.447 | -0.190 |
7 | speaker_19_center_83.mov | 0.761 | 0.653 | 0.651 | 0.789 | 0.460 | -0.199 |
10 | speaker_27_center_83.mov | 0.753 | 0.708 | 0.655 | 0.816 | 0.505 | -0.212 |
1 | speaker_01_center_83.mov | 0.758 | 0.693 | 0.650 | 0.745 | 0.489 | -0.213 |
MuPTA (en)
[10]:
import os
import pandas as pd
# Module import
from oceanai.modules.lab.build import Run
# Creating an instance of a class
_b5 = Run(lang = 'en')
corpus = 'fi' # Weights for the First Impressions V2 corpus are used on the MuPTA videos
lang = 'en'
# Core setup
_b5.path_to_save_ = './models' # Directory to save the models
_b5.chunk_size_ = 2000000 # File download size from network in one step
# Building audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()
# Loading audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url)
url = _b5.weights_for_big5_['audio'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url)
# Building video models
res_load_model_hc = _b5.load_video_model_hc(lang=lang)
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()
# Loading video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url)
url = _b5.weights_for_big5_['video'][corpus]['fe']['sberdisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url)
url = _b5.weights_for_big5_['video'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url)
# Loading a dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()
# Building text models
res_setup_translation_model = _b5.setup_translation_model()
res_setup_bert_encoder = _b5.setup_bert_encoder()
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)
# Loading text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['sberdisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url)
url = _b5.weights_for_big5_['text'][corpus]['nn']['sberdisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url)
# Building model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()
# Loading model weights for multimodal information fusion
url = _b5.weights_for_big5_['avt'][corpus]['b5']['sberdisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url)
PATH_TO_DIR = './video_MuPTA/'
PATH_SAVE_VIDEO = './video_MuPTA/test/'
_b5.path_to_save_ = PATH_SAVE_VIDEO
# Loading 10 test files from the MuPTA corpus
# URL: https://hci.nw.ru/en/pages/mupta-corpus
domain = 'https://download.sberdisk.ru/download/file/'
tets_name_files = [
'477995979?token=2cvyk7CS0mHx2MJ&filename=speaker_06_center_83.mov',
'477995980?token=jGPtBPS69uzFU6Y&filename=speaker_01_center_83.mov',
'477995967?token=zCaRbNB6ht5wMPq&filename=speaker_11_center_83.mov',
'477995966?token=B1rbinDYRQKrI3T&filename=speaker_15_center_83.mov',
'477995978?token=dEpVDtZg1EQiEQ9&filename=speaker_07_center_83.mov',
'477995961?token=o1hVjw8G45q9L9Z&filename=speaker_19_center_83.mov',
'477995964?token=5K220Aqf673VHPq&filename=speaker_23_center_83.mov',
'477995965?token=v1LVD2KT1cU7Lpb&filename=speaker_24_center_83.mov',
'477995962?token=tmaSGyyWLA6XCy9&filename=speaker_27_center_83.mov',
'477995963?token=bTpo96qNDPcwGqb&filename=speaker_10_center_83.mov',
]
for curr_files in tets_name_files:
    _b5.download_file_from_url(url = domain + curr_files, out = True)
# Getting scores
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mov'] # Search file extensions
# Full path to the file with ground truth scores for accuracy calculation
url_accuracy = _b5.true_traits_['mupta']['sberdisk']
_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = lang)
[2023-12-16 19:40:25] Feature extraction (hand-crafted and deep) from text …
[2023-12-16 19:40:28] Getting scores and accuracy calculation (multimodal fusion) …
10 from 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_MuPTA\test\speaker_27_center_83.mov …
Person ID | Path | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
---|---|---|---|---|---|---|
1 | speaker_01_center_83.mov | 0.564985 | 0.539052 | 0.440615 | 0.59251 | 0.488763 |
2 | speaker_06_center_83.mov | 0.650774 | 0.663849 | 0.607308 | 0.643847 | 0.620627 |
3 | speaker_07_center_83.mov | 0.435976 | 0.486683 | 0.313828 | 0.415446 | 0.396618 |
4 | speaker_10_center_83.mov | 0.498542 | 0.511243 | 0.412592 | 0.468947 | 0.44399 |
5 | speaker_11_center_83.mov | 0.394776 | 0.341608 | 0.327082 | 0.427304 | 0.354936 |
6 | speaker_15_center_83.mov | 0.566107 | 0.543811 | 0.492766 | 0.587411 | 0.499433 |
7 | speaker_19_center_83.mov | 0.506271 | 0.438215 | 0.430894 | 0.456177 | 0.44075 |
8 | speaker_23_center_83.mov | 0.486463 | 0.521755 | 0.309894 | 0.432291 | 0.433601 |
9 | speaker_24_center_83.mov | 0.417404 | 0.473339 | 0.320714 | 0.445086 | 0.414649 |
10 | speaker_27_center_83.mov | 0.526112 | 0.661107 | 0.443167 | 0.558965 | 0.554224 |
[2023-12-16 19:40:28] Trait-wise accuracy …
Metrics | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism | Mean |
---|---|---|---|---|---|---|
MAE | 0.1727 | 0.1672 | 0.1661 | 0.2579 | 0.107 | 0.1742 |
Accuracy | 0.8273 | 0.8328 | 0.8339 | 0.7421 | 0.893 | 0.8258 |
[2023-12-16 19:40:28] Mean absolute errors: 0.1742, average accuracy: 0.8258 …
Log files saved successfully …
--- Runtime: 377.119 sec. ---
[10]:
True
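The mean metrics reported by the three runs in this document can be collected into one table to compare them, as in the sketch below. The values are copied from the log lines of each run. The name `runs` and the extra column are illustrative only, and each run covers just 10 test files, so the differences should be read with caution.

```python
import pandas as pd

# Mean absolute error and mean accuracy as printed at the end of each run
runs = pd.DataFrame(
    {
        'MAE': [0.0669, 0.0962, 0.1742],
        'Accuracy': [0.9331, 0.9038, 0.8258],
    },
    index = ['FI V2', 'MuPTA (ru)', 'MuPTA (en)'],
)

# Difference in mean accuracy relative to the FI V2 run
baseline = runs.loc['FI V2', 'Accuracy']
runs['Accuracy drop vs FI V2'] = (baseline - runs['Accuracy']).round(4)

print(runs)
```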
To find a suitable colleague, you need two correlation coefficients for each personality trait. These coefficients should describe the effect of one person's trait score being higher or lower than the same trait score of the other person.
As an example, we use the correlation coefficients for a manager-employee relationship reported in the following article:
Kuroda S., Yamamoto I. Good boss, bad boss, workers’ mental health and productivity: Evidence from Japan // Japan & The World Economy. – 2018. – vol. 48. – pp. 106-118.
The user can set their own correlation coefficients.
[11]:
# Loading dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675819?token=LuB7L1QsEY0UuSs&filename=colleague_ranking.csv'
df_correlation_coefficients = pd.read_csv(url)
df_correlation_coefficients = pd.DataFrame(
df_correlation_coefficients.drop(['ID'], axis = 1)
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)
df_correlation_coefficients
[11]:
ID | Score_comparison | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
---|---|---|---|---|---|---|
1 | higher | -0.0602 | 0.0471 | -0.1070 | -0.0832 | 0.190 |
2 | lower | -0.1720 | -0.1050 | 0.0772 | 0.0703 | -0.229 |
Finding a suitable senior colleague
[12]:
# List of personality traits scores of the target person
target_scores = [0.527886, 0.522337, 0.458468, 0.51761, 0.444649]
_b5._colleague_ranking(
correlation_coefficients = df_correlation_coefficients,
target_scores = target_scores,
colleague = 'major',
equal_coefficients = 0.5,
out = False
)
_b5._save_logs(df = _b5.df_files_colleague_, name = 'major_colleague_ranking_mupta_en', out = True)
# Optional
df = _b5.df_files_colleague_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = df.columns[1:]
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[12]:
Person ID | Path | OPE | CON | EXT | AGR | NNEU | Match |
---|---|---|---|---|---|---|---|
1 | speaker_01_center_83.mov | 0.565 | 0.539 | 0.441 | 0.593 | 0.489 | 0.069 |
10 | speaker_27_center_83.mov | 0.526 | 0.661 | 0.443 | 0.559 | 0.554 | 0.034 |
2 | speaker_06_center_83.mov | 0.651 | 0.664 | 0.607 | 0.644 | 0.621 | -0.009 |
6 | speaker_15_center_83.mov | 0.566 | 0.544 | 0.493 | 0.587 | 0.499 | -0.015 |
5 | speaker_11_center_83.mov | 0.395 | 0.342 | 0.327 | 0.427 | 0.355 | -0.130 |
9 | speaker_24_center_83.mov | 0.417 | 0.473 | 0.321 | 0.445 | 0.415 | -0.160 |
3 | speaker_07_center_83.mov | 0.436 | 0.487 | 0.314 | 0.415 | 0.397 | -0.163 |
7 | speaker_19_center_83.mov | 0.506 | 0.438 | 0.431 | 0.456 | 0.441 | -0.169 |
4 | speaker_10_center_83.mov | 0.499 | 0.511 | 0.413 | 0.469 | 0.444 | -0.176 |
8 | speaker_23_center_83.mov | 0.486 | 0.522 | 0.310 | 0.432 | 0.434 | -0.183 |
Finding a suitable junior colleague
[13]:
# List of personality traits scores of the target person
target_scores = [0.527886, 0.522337, 0.458468, 0.51761, 0.444649]
_b5._colleague_ranking(
correlation_coefficients = df_correlation_coefficients,
target_scores = target_scores,
colleague = 'minor',
equal_coefficients = 0.5,
out = False
)
_b5._save_logs(df = _b5.df_files_colleague_, name = 'minor_colleague_ranking_mupta_en', out = True)
# Optional
df = _b5.df_files_colleague_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = df.columns[1:]
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[13]:
Person ID | Path | OPE | CON | EXT | AGR | NNEU | Match |
---|---|---|---|---|---|---|---|
8 | speaker_23_center_83.mov | 0.486 | 0.522 | 0.310 | 0.432 | 0.434 | 0.009 |
9 | speaker_24_center_83.mov | 0.417 | 0.473 | 0.321 | 0.445 | 0.415 | 0.005 |
3 | speaker_07_center_83.mov | 0.436 | 0.487 | 0.314 | 0.415 | 0.397 | 0.004 |
4 | speaker_10_center_83.mov | 0.499 | 0.511 | 0.413 | 0.469 | 0.444 | -0.005 |
7 | speaker_19_center_83.mov | 0.506 | 0.438 | 0.431 | 0.456 | 0.441 | -0.010 |
5 | speaker_11_center_83.mov | 0.395 | 0.342 | 0.327 | 0.427 | 0.355 | -0.011 |
6 | speaker_15_center_83.mov | 0.566 | 0.544 | 0.493 | 0.587 | 0.499 | -0.189 |
2 | speaker_06_center_83.mov | 0.651 | 0.664 | 0.607 | 0.644 | 0.621 | -0.232 |
10 | speaker_27_center_83.mov | 0.526 | 0.661 | 0.443 | 0.559 | 0.554 | -0.236 |
1 | speaker_01_center_83.mov | 0.565 | 0.539 | 0.441 | 0.593 | 0.489 | -0.271 |
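Once a ranking is available, a shortlist can be taken from it with ordinary pandas operations. The sketch below uses a stand-in DataFrame holding the first five rows of the table above. In a real session the ranking would come from `_b5.df_files_colleague_`. The positive-Match threshold and the shortlist size are arbitrary choices for illustration, not a recommendation from the library.

```python
import pandas as pd

# Stand-in for the ranking: first five rows of the junior-colleague table
ranking = pd.DataFrame(
    {
        'Path': [
            'speaker_23_center_83.mov',
            'speaker_24_center_83.mov',
            'speaker_07_center_83.mov',
            'speaker_10_center_83.mov',
            'speaker_19_center_83.mov',
        ],
        'Match': [0.009, 0.005, 0.004, -0.005, -0.010],
    },
    index = pd.Index([8, 9, 3, 4, 7], name = 'Person ID'),
)

# Keep candidates with a positive Match, then take the two best
shortlist = ranking[ranking['Match'] > 0].nlargest(2, 'Match')

print(shortlist)
```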