Solution to practical task 2

Task: Predicting consumer preferences for industrial goods

The practical task is solved in two stages. In the first stage, the OCEAN-AI library is used to obtain predictions (personality trait scores). In the second stage, the _priority_calculation method from the OCEAN-AI library is applied to these scores to solve the task. Examples of the implementation and its results are presented below.

The OCEAN-AI library thus provides tools for analyzing consumers' personality traits, which helps predict their interests. This enables companies to tailor products and services more closely to consumer preferences and to personalize their offerings.


FI V2

[2]:
# Import required tools
import os
import pandas as pd

# Module import
from oceanai.modules.lab.build import Run

# Creating an instance of a class
_b5 = Run(lang = 'en')

# Core setup
_b5.path_to_save_ = './models' # Directory to save the models
_b5.chunk_size_ = 2000000      # File download size from network in one step

corpus = 'fi'

# Building audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()

# Loading audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url)

url = _b5.weights_for_big5_['audio'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url)

# Building video models
res_load_model_hc = _b5.load_video_model_hc(lang='en')
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()

# Loading video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url)

url = _b5.weights_for_big5_['video'][corpus]['fe']['sberdisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url)

url = _b5.weights_for_big5_['video'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url)

# Loading a dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()

# Building text models
res_setup_translation_model = _b5.setup_translation_model()
res_setup_bert_encoder = _b5.setup_bert_encoder()
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)

# Loading text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['sberdisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url)

url = _b5.weights_for_big5_['text'][corpus]['nn']['sberdisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url)

# Building model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()

# Loading model weights for multimodal information fusion
url = _b5.weights_for_big5_['avt'][corpus]['b5']['sberdisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url)

PATH_TO_DIR = './video_FI/'
PATH_SAVE_VIDEO = './video_FI/test/'

_b5.path_to_save_ = PATH_SAVE_VIDEO

# Loading 10 test files from the First Impressions V2 corpus
# URL: https://chalearnlap.cvc.uab.cat/dataset/24/description/
domain = 'https://download.sberdisk.ru/download/file/'
tets_name_files = [
    '429713680?token=FqHdMLSSh7zYSZt&filename=_plk5k7PBEg.003.mp4',
    '429713681?token=Hz9b4lQkrLfic33&filename=be0DQawtVkE.002.mp4',
    '429713683?token=EgUXS9Xs8xHm5gz&filename=2d6btbaNdfo.000.mp4',
    '429713684?token=1U26753kmPYdIgt&filename=300gK3CnzW0.003.mp4',
    '429713685?token=LyigAWLTzDNwKJO&filename=300gK3CnzW0.001.mp4',
    '429713686?token=EpfRbCKHyuc4HPu&filename=cLaZxEf1nE4.004.mp4',
    '429713687?token=FNTkwqBr4jOS95l&filename=g24JGYuT74A.004.mp4',
    '429713688?token=qDT95nz7hfm2Nki&filename=JZNMxa3OKHY.000.mp4',
    '429713689?token=noLguEGXDpbcKhg&filename=nvlqJbHk_Lc.003.mp4',
    '429713679?token=9L7RQ0hgdJlcek6&filename=4vdJGgZpj4k.003.mp4'
]

for curr_files in tets_name_files:
    _b5.download_file_from_url(url = domain + curr_files, out = True)

# Getting scores
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mp4'] # Search file extensions

# Full path to the file with ground truth scores for accuracy calculation
url_accuracy = _b5.true_traits_[corpus]['sberdisk']

_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = 'en')

[2023-12-16 19:05:15] Feature extraction (hand-crafted and deep) from text …

[2023-12-16 19:05:17] Getting scores and accuracy calculation (multimodal fusion) …

10 from 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_FI\test\_plk5k7PBEg.003.mp4 …

Path Openness Conscientiousness Extraversion Agreeableness Non-Neuroticism
Person ID
1 2d6btbaNdfo.000.mp4 0.581159 0.628822 0.466609 0.622129 0.553832
2 300gK3CnzW0.001.mp4 0.463991 0.418851 0.41301 0.493329 0.423093
3 300gK3CnzW0.003.mp4 0.454281 0.415049 0.39189 0.485114 0.420741
4 4vdJGgZpj4k.003.mp4 0.588461 0.643233 0.530789 0.603038 0.593398
5 be0DQawtVkE.002.mp4 0.633433 0.533295 0.523742 0.608591 0.588456
6 cLaZxEf1nE4.004.mp4 0.636944 0.542386 0.558461 0.570975 0.558983
7 g24JGYuT74A.004.mp4 0.531518 0.376987 0.393309 0.4904 0.447881
8 JZNMxa3OKHY.000.mp4 0.610342 0.541418 0.563163 0.595013 0.569461
9 nvlqJbHk_Lc.003.mp4 0.495809 0.458526 0.414436 0.469152 0.435461
10 _plk5k7PBEg.003.mp4 0.60707 0.591893 0.520662 0.603938 0.565726

[2023-12-16 19:05:17] Trait-wise accuracy …

Openness Conscientiousness Extraversion Agreeableness Non-Neuroticism Mean
Metrics
MAE 0.0589 0.0612 0.0864 0.0697 0.0582 0.0669
Accuracy 0.9411 0.9388 0.9136 0.9303 0.9418 0.9331

[2023-12-16 19:05:17] Mean absolute errors: 0.0669, average accuracy: 0.9331 …

Log files saved successfully …

— Runtime: 64.147 sec. —

[2]:
True
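
The trait-wise accuracy table in the log above is consistent with accuracy being reported as 1 - MAE for each trait, with the "Mean" column averaged over the five traits. The short sketch below recomputes those values from the per-trait MAE numbers printed in the log. The relation accuracy = 1 - MAE is inferred from the printed numbers rather than taken from the library source, and the variable names are illustrative only.

```python
# Per-trait mean absolute errors copied from the log output above (FI V2 run)
mae_per_trait = {
    "Openness": 0.0589,
    "Conscientiousness": 0.0612,
    "Extraversion": 0.0864,
    "Agreeableness": 0.0697,
    "Non-Neuroticism": 0.0582,
}

# Assumption inferred from the log: accuracy is reported as 1 - MAE
accuracy_per_trait = {trait: round(1 - mae, 4) for trait, mae in mae_per_trait.items()}

# Mean over the five traits, rounded to four decimals as in the log
mean_mae = round(sum(mae_per_trait.values()) / len(mae_per_trait), 4)
mean_accuracy = round(1 - mean_mae, 4)

print(accuracy_per_trait)
print(mean_mae, mean_accuracy)
```

The recomputed mean values agree with the summary line of the log (mean absolute error 0.0669, average accuracy 0.9331).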

Predicting consumer preferences for industrial goods requires correlation coefficients that quantify the relationship between personality traits and preferences for goods or services.

As an example, the correlation coefficients between personality traits and car characteristics are taken from the following article:

  1. O’Connor P. J. et al. What Drives Consumer Automobile Choice? Investigating Personality Trait Predictors of Vehicle Preference Factors // Personality and Individual Differences. – 2022. – Vol. 184. – Art. 111220.

The user can set their own correlation coefficients.
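
The coefficient tables used in this notebook are loaded from CSV files that have a Trait column followed by one column per characteristic, so custom coefficients can be prepared in the same layout. The sketch below builds such a CSV as text using only the standard library. The category names and all numeric values are hypothetical placeholders, and the helper build_coefficients_csv is not part of OCEAN-AI.

```python
import csv
import io

# Trait names in the order used by the coefficient tables in this notebook
TRAITS = ["Openness", "Conscientiousness", "Extraversion", "Agreeableness", "Non-Neuroticism"]


def build_coefficients_csv(coefficients):
    """Serialize {category: [one coefficient per trait]} to CSV text
    with a 'Trait' column followed by one column per category."""
    for category, values in coefficients.items():
        if len(values) != len(TRAITS):
            raise ValueError(f"{category!r} needs {len(TRAITS)} coefficients")

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row: 'Trait' plus the category names (dict keys)
    writer.writerow(["Trait", *coefficients])
    # One row per trait, taking the matching coefficient from every category
    for row_idx, trait in enumerate(TRAITS):
        writer.writerow([trait, *(values[row_idx] for values in coefficients.values())])
    return buffer.getvalue()


# Hypothetical categories and values, for illustration only
custom = {
    "Durability": [0.05, 0.21, -0.02, 0.10, 0.04],
    "Design": [0.18, -0.03, 0.12, 0.01, -0.05],
}
csv_text = build_coefficients_csv(custom)
print(csv_text)
```

Once saved to a file, a table in this layout can presumably be read with pd.read_csv and prepared in the same way as the downloaded tables in this notebook before being passed as correlation_coefficients with col_name_ocean = 'Trait'.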

Predicting consumer preferences for industrial goods using car characteristics as an example

[3]:
# Loading dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675818?token=EjfLMqOeK8cfnOu&filename=auto_characteristics.csv'
df_correlation_coefficients = pd.read_csv(url)
df_correlation_coefficients = pd.DataFrame(
    df_correlation_coefficients.drop(['Style and performance', 'Safety and practicality'], axis = 1)
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)

df_correlation_coefficients
[3]:
ID  Trait              Performance  Classic car features  Luxury additions  Fashion and attention  Recreation  Technology  Family friendly  Safe and reliable  Practical and easy to use  Economical/low cost  Basic features
1   Openness           0.020000     -0.033333             -0.030000         -0.050000              0.033333    0.013333    -0.030000        0.136667           0.106667                   0.093333             0.006667
2   Conscientiousness  0.013333     -0.193333             -0.063333         -0.096667              -0.096667   0.086667    -0.063333        0.280000           0.180000                   0.130000             0.143333
3   Extraversion       0.133333     0.060000              0.106667          0.123333               0.126667    0.120000    0.090000         0.136667           0.043333                   0.073333             0.050000
4   Agreeableness      -0.036667    -0.193333             -0.133333         -0.133333              -0.090000   0.046667    -0.016667        0.240000           0.160000                   0.120000             0.083333
5   Non-Neuroticism    0.016667     -0.006667             -0.010000         -0.006667              -0.033333   0.046667    -0.023333        0.093333           0.046667                   0.046667             -0.040000
[4]:
_b5._priority_calculation(
    correlation_coefficients = df_correlation_coefficients,
    col_name_ocean = 'Trait',
    threshold = 0.55,
    number_priority = 3,
    number_importance_traits = 3,
    out = False
)

_b5._save_logs(df = _b5.df_files_priority_, name = 'auto_characteristics_priorities_fi_en', out = True)

# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[4]:
Person ID  Path                 OPE    CON    EXT    AGR    NNEU   Priority 1                 Priority 2                 Priority 3                 Trait importance 1  Trait importance 2  Trait importance 3
1          2d6btbaNdfo.000.mp4  0.581  0.629  0.467  0.622  0.554  Safe and reliable          Practical and easy to use  Economical/low cost        Conscientiousness   Agreeableness       Openness
2          300gK3CnzW0.001.mp4  0.464  0.419  0.413  0.493  0.423  Classic car features       Fashion and attention      Luxury additions           Agreeableness       Conscientiousness   Openness
3          300gK3CnzW0.003.mp4  0.454  0.415  0.392  0.485  0.421  Classic car features       Fashion and attention      Luxury additions           Agreeableness       Conscientiousness   Openness
4          4vdJGgZpj4k.003.mp4  0.588  0.643  0.531  0.603  0.593  Safe and reliable          Practical and easy to use  Economical/low cost        Conscientiousness   Agreeableness       Openness
5          be0DQawtVkE.002.mp4  0.633  0.533  0.524  0.609  0.588  Practical and easy to use  Safe and reliable          Economical/low cost        Agreeableness       Openness            Non-Neuroticism
6          cLaZxEf1nE4.004.mp4  0.637  0.542  0.558  0.571  0.559  Safe and reliable          Economical/low cost        Practical and easy to use  Agreeableness       Openness            Extraversion
7          g24JGYuT74A.004.mp4  0.532  0.377  0.393  0.490  0.448  Classic car features       Fashion and attention      Luxury additions           Agreeableness       Conscientiousness   Openness
8          JZNMxa3OKHY.000.mp4  0.610  0.541  0.563  0.595  0.569  Safe and reliable          Economical/low cost        Practical and easy to use  Agreeableness       Openness            Extraversion
9          nvlqJbHk_Lc.003.mp4  0.496  0.459  0.414  0.469  0.435  Classic car features       Fashion and attention      Luxury additions           Agreeableness       Conscientiousness   Openness
10         _plk5k7PBEg.003.mp4  0.607  0.592  0.521  0.604  0.566  Safe and reliable          Practical and easy to use  Economical/low cost        Conscientiousness   Agreeableness       Openness
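
The text does not describe how _priority_calculation derives the priority and trait-importance columns. The sketch below shows one simple scoring rule that is consistent with the rows for persons 2 and 5 in the table above: each trait contributes its score multiplied by the correlation coefficient, with the sign flipped when the score is below the threshold. This is an assumption made for illustration and may differ from the library's actual implementation. The names rank_priorities, CATEGORIES and COEFFS are hypothetical, as is the choice of >= for the threshold comparison.

```python
# Characteristics and coefficients copied from the car correlation table in this notebook
CATEGORIES = [
    "Performance", "Classic car features", "Luxury additions",
    "Fashion and attention", "Recreation", "Technology", "Family friendly",
    "Safe and reliable", "Practical and easy to use", "Economical/low cost",
    "Basic features",
]
COEFFS = {
    "Openness": [0.020000, -0.033333, -0.030000, -0.050000, 0.033333, 0.013333,
                 -0.030000, 0.136667, 0.106667, 0.093333, 0.006667],
    "Conscientiousness": [0.013333, -0.193333, -0.063333, -0.096667, -0.096667, 0.086667,
                          -0.063333, 0.280000, 0.180000, 0.130000, 0.143333],
    "Extraversion": [0.133333, 0.060000, 0.106667, 0.123333, 0.126667, 0.120000,
                     0.090000, 0.136667, 0.043333, 0.073333, 0.050000],
    "Agreeableness": [-0.036667, -0.193333, -0.133333, -0.133333, -0.090000, 0.046667,
                      -0.016667, 0.240000, 0.160000, 0.120000, 0.083333],
    "Non-Neuroticism": [0.016667, -0.006667, -0.010000, -0.006667, -0.033333, 0.046667,
                        -0.023333, 0.093333, 0.046667, 0.046667, -0.040000],
}


def rank_priorities(scores, threshold=0.55, n_priorities=3, n_traits=3):
    """Hypothetical scoring rule, not the library's code: each trait contributes
    score * coefficient, with the sign flipped when the score is below the threshold."""
    signed = {t: (s if s >= threshold else -s) for t, s in scores.items()}

    # Total contribution of all traits to each characteristic
    totals = [sum(signed[t] * COEFFS[t][j] for t in signed) for j in range(len(CATEGORIES))]
    top = sorted(range(len(CATEGORIES)), key=lambda j: totals[j], reverse=True)[:n_priorities]

    # Contribution of each trait to the selected characteristics
    importance = {t: sum(signed[t] * COEFFS[t][j] for j in top) for t in signed}
    traits = sorted(importance, key=importance.get, reverse=True)[:n_traits]
    return [CATEGORIES[j] for j in top], traits


# Scores of person 5 (be0DQawtVkE.002.mp4) from the predictions above
person_5 = {"Openness": 0.633433, "Conscientiousness": 0.533295, "Extraversion": 0.523742,
            "Agreeableness": 0.608591, "Non-Neuroticism": 0.588456}
priorities, traits = rank_priorities(person_5)
print(priorities)
print(traits)
```

For person 5 this rule yields the priorities "Practical and easy to use", "Safe and reliable", "Economical/low cost" and the traits Agreeableness, Openness, Non-Neuroticism, which agrees with the corresponding row of the table.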

Predicting consumer preferences for industrial goods using mobile device application categories as an example

As an example, the correlation coefficients between personality traits and mobile device application categories are taken from the following article:

  1. Peltonen E., Sharmila P., Asare K. O., Visuri A., Lagerspetz E., Ferreira D. When phones get personal: Predicting Big Five personality traits from application usage // Pervasive and Mobile Computing. – 2020. – Vol. 69. – Art. 101269.

[5]:
# Loading a dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478676690?token=7KcAxPqMpWiYQnx&filename=divice_characteristics.csv'
df_divice_characteristics = pd.read_csv(url)

df_divice_characteristics.index.name = 'ID'
df_divice_characteristics.index += 1
df_divice_characteristics.index = df_divice_characteristics.index.map(str)

df_divice_characteristics
[5]:
ID  Trait              Communication  Game Action  Game Board  Game Casino  Game Educational  Game Simulation  Game Trivia  Entertainment  Finance  Health and Fitness  Media and Video  Music and Audio  News and Magazines  Personalisation  Travel and Local  Weather
1   Openness           0.118          0.056        0.079       0.342        0.027             0.104            0.026        0.000          0.006    0.002               0.000            0.000            0.001               0.004            0.002             0.004
2   Conscientiousness  0.119          0.043        0.107       0.448        0.039             0.012            0.119        0.000          0.005    0.001               0.000            0.002            0.002               0.001            0.001             0.003
3   Extraversion       0.246          0.182        0.211       0.311        0.102             0.165            0.223        0.001          0.003    0.000               0.001            0.001            0.001               0.004            0.009             0.003
4   Agreeableness      0.218          0.104        0.164       0.284        0.165             0.122            0.162        0.000          0.003    0.001               0.000            0.002            0.002               0.001            0.004             0.003
5   Non-Neuroticism    0.046          0.047        0.125       0.515        0.272             0.179            0.214        0.002          0.030    0.001               0.000            0.005            0.003               0.008            0.004             0.007
[6]:
_b5._priority_calculation(
    correlation_coefficients = df_divice_characteristics,
    col_name_ocean = 'Trait',
    threshold = 0.55,
    number_priority = 3,
    number_importance_traits = 3,
    out = True
)

_b5._save_logs(df = _b5.df_files_priority_, name = 'divice_characteristics_priorities_fi_en', out = True)

# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[6]:
Person ID  Path                 OPE    CON    EXT    AGR    NNEU   Priority 1       Priority 2        Priority 3          Trait importance 1  Trait importance 2  Trait importance 3
1          2d6btbaNdfo.000.mp4  0.581  0.629  0.467  0.622  0.554  Game Casino      Game Educational  Game Trivia         Non-Neuroticism     Conscientiousness   Agreeableness
2          300gK3CnzW0.001.mp4  0.464  0.419  0.413  0.493  0.423  Media and Video  Entertainment     Health and Fitness  Conscientiousness   Agreeableness       Extraversion
3          300gK3CnzW0.003.mp4  0.454  0.415  0.392  0.485  0.421  Media and Video  Entertainment     Health and Fitness  Conscientiousness   Agreeableness       Extraversion
4          4vdJGgZpj4k.003.mp4  0.588  0.643  0.531  0.603  0.593  Game Casino      Game Educational  Game Trivia         Non-Neuroticism     Conscientiousness   Agreeableness
5          be0DQawtVkE.002.mp4  0.633  0.533  0.524  0.609  0.588  Game Casino      Game Educational  Game Simulation     Non-Neuroticism     Agreeableness       Openness
6          cLaZxEf1nE4.004.mp4  0.637  0.542  0.558  0.571  0.559  Game Casino      Game Simulation   Game Educational    Non-Neuroticism     Agreeableness       Extraversion
7          g24JGYuT74A.004.mp4  0.532  0.377  0.393  0.490  0.448  Media and Video  Entertainment     Health and Fitness  Conscientiousness   Agreeableness       Extraversion
8          JZNMxa3OKHY.000.mp4  0.610  0.541  0.563  0.595  0.569  Game Casino      Game Simulation   Game Educational    Non-Neuroticism     Agreeableness       Extraversion
9          nvlqJbHk_Lc.003.mp4  0.496  0.459  0.414  0.469  0.435  Media and Video  Entertainment     Health and Fitness  Conscientiousness   Agreeableness       Extraversion
10         _plk5k7PBEg.003.mp4  0.607  0.592  0.521  0.604  0.566  Game Casino      Game Educational  Game Trivia         Non-Neuroticism     Agreeableness       Conscientiousness

MuPTA (ru)

[7]:
import os
import pandas as pd

# Module import
from oceanai.modules.lab.build import Run

# Creating an instance of a class
_b5 = Run(lang = 'en')

corpus = 'mupta'
lang = 'ru'

# Core setup
_b5.path_to_save_ = './models' # Directory to save the models
_b5.chunk_size_ = 2000000      # File download size from network in one step

# Building audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()

# Loading audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url)

url = _b5.weights_for_big5_['audio'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url)

# Building video models
res_load_model_hc = _b5.load_video_model_hc(lang=lang)
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()

# Loading video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url)

url = _b5.weights_for_big5_['video'][corpus]['fe']['sberdisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url)

url = _b5.weights_for_big5_['video'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url)

# Loading a dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()

# Building text models
res_setup_translation_model = _b5.setup_translation_model()
res_setup_bert_encoder = _b5.setup_bert_encoder()
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)

# Loading text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['sberdisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url)

url = _b5.weights_for_big5_['text'][corpus]['nn']['sberdisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url)

# Building model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()

# Loading model weights for multimodal information fusion
url = _b5.weights_for_big5_['avt'][corpus]['b5']['sberdisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url)

PATH_TO_DIR = './video_MuPTA/'
PATH_SAVE_VIDEO = './video_MuPTA/test/'

_b5.path_to_save_ = PATH_SAVE_VIDEO

# Loading 10 test files from the MuPTA corpus
# URL: https://hci.nw.ru/en/pages/mupta-corpus
domain = 'https://download.sberdisk.ru/download/file/'
tets_name_files = [
    '477995979?token=2cvyk7CS0mHx2MJ&filename=speaker_06_center_83.mov',
    '477995980?token=jGPtBPS69uzFU6Y&filename=speaker_01_center_83.mov',
    '477995967?token=zCaRbNB6ht5wMPq&filename=speaker_11_center_83.mov',
    '477995966?token=B1rbinDYRQKrI3T&filename=speaker_15_center_83.mov',
    '477995978?token=dEpVDtZg1EQiEQ9&filename=speaker_07_center_83.mov',
    '477995961?token=o1hVjw8G45q9L9Z&filename=speaker_19_center_83.mov',
    '477995964?token=5K220Aqf673VHPq&filename=speaker_23_center_83.mov',
    '477995965?token=v1LVD2KT1cU7Lpb&filename=speaker_24_center_83.mov',
    '477995962?token=tmaSGyyWLA6XCy9&filename=speaker_27_center_83.mov',
    '477995963?token=bTpo96qNDPcwGqb&filename=speaker_10_center_83.mov',
]

for curr_files in tets_name_files:
    _b5.download_file_from_url(url = domain + curr_files, out = True)

# Getting scores
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mov'] # Search file extensions

# Full path to the file with ground truth scores for accuracy calculation
url_accuracy = _b5.true_traits_['mupta']['sberdisk']

_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = lang)

[2023-12-16 19:13:25] Feature extraction (hand-crafted and deep) from text …

[2023-12-16 19:13:30] Getting scores and accuracy calculation (multimodal fusion) …

10 from 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_MuPTA\test\speaker_27_center_83.mov …

Path Openness Conscientiousness Extraversion Agreeableness Non-Neuroticism
Person ID
1 speaker_01_center_83.mov 0.758137 0.693356 0.650108 0.744589 0.488671
2 speaker_06_center_83.mov 0.681602 0.654339 0.607156 0.731282 0.417908
3 speaker_07_center_83.mov 0.666104 0.656836 0.567863 0.685067 0.378102
4 speaker_10_center_83.mov 0.694171 0.596195 0.571414 0.66223 0.348639
5 speaker_11_center_83.mov 0.712885 0.594764 0.571709 0.716696 0.37802
6 speaker_15_center_83.mov 0.664158 0.670411 0.60421 0.696056 0.399842
7 speaker_19_center_83.mov 0.761213 0.652635 0.651028 0.788677 0.459676
8 speaker_23_center_83.mov 0.692788 0.68324 0.616737 0.795205 0.447242
9 speaker_24_center_83.mov 0.705923 0.658382 0.610645 0.697415 0.411988
10 speaker_27_center_83.mov 0.753417 0.708372 0.654608 0.816416 0.504743

[2023-12-16 19:13:30] Trait-wise accuracy …

Openness Conscientiousness Extraversion Agreeableness Non-Neuroticism Mean
Metrics
MAE 0.0673 0.0789 0.1325 0.102 0.1002 0.0962
Accuracy 0.9327 0.9211 0.8675 0.898 0.8998 0.9038

[2023-12-16 19:13:30] Mean absolute errors: 0.0962, average accuracy: 0.9038 …

Log files saved successfully …

— Runtime: 416.453 sec. —

[7]:
True

Predicting consumer preferences for industrial goods requires correlation coefficients that quantify the relationship between personality traits and preferences for goods or services.

As an example, the correlation coefficients between personality traits and car characteristics are taken from the following article:

  1. O’Connor P. J. et al. What Drives Consumer Automobile Choice? Investigating Personality Trait Predictors of Vehicle Preference Factors // Personality and Individual Differences. – 2022. – Vol. 184. – Art. 111220.

The user can set their own correlation coefficients.

Predicting consumer preferences for industrial goods using car characteristics as an example

[8]:
# Loading dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675818?token=EjfLMqOeK8cfnOu&filename=auto_characteristics.csv'
df_correlation_coefficients = pd.read_csv(url)
df_correlation_coefficients = pd.DataFrame(
    df_correlation_coefficients.drop(['Style and performance', 'Safety and practicality'], axis = 1)
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)

df_correlation_coefficients
[8]:
ID  Trait              Performance  Classic car features  Luxury additions  Fashion and attention  Recreation  Technology  Family friendly  Safe and reliable  Practical and easy to use  Economical/low cost  Basic features
1   Openness           0.020000     -0.033333             -0.030000         -0.050000              0.033333    0.013333    -0.030000        0.136667           0.106667                   0.093333             0.006667
2   Conscientiousness  0.013333     -0.193333             -0.063333         -0.096667              -0.096667   0.086667    -0.063333        0.280000           0.180000                   0.130000             0.143333
3   Extraversion       0.133333     0.060000              0.106667          0.123333               0.126667    0.120000    0.090000         0.136667           0.043333                   0.073333             0.050000
4   Agreeableness      -0.036667    -0.193333             -0.133333         -0.133333              -0.090000   0.046667    -0.016667        0.240000           0.160000                   0.120000             0.083333
5   Non-Neuroticism    0.016667     -0.006667             -0.010000         -0.006667              -0.033333   0.046667    -0.023333        0.093333           0.046667                   0.046667             -0.040000
[9]:
_b5._priority_calculation(
    correlation_coefficients = df_correlation_coefficients,
    col_name_ocean = 'Trait',
    threshold = 0.55,
    number_priority = 3,
    number_importance_traits = 3,
    out = False
)

_b5._save_logs(df = _b5.df_files_priority_, name = 'auto_characteristics_priorities_mupta_ru', out = True)

# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[9]:
Person ID  Path                      OPE    CON    EXT    AGR    NNEU   Priority 1         Priority 2                 Priority 3           Trait importance 1  Trait importance 2  Trait importance 3
1          speaker_01_center_83.mov  0.758  0.693  0.650  0.745  0.489  Safe and reliable  Practical and easy to use  Economical/low cost  Conscientiousness   Agreeableness       Openness
2          speaker_06_center_83.mov  0.682  0.654  0.607  0.731  0.418  Safe and reliable  Practical and easy to use  Economical/low cost  Conscientiousness   Agreeableness       Openness
3          speaker_07_center_83.mov  0.666  0.657  0.568  0.685  0.378  Safe and reliable  Practical and easy to use  Economical/low cost  Conscientiousness   Agreeableness       Openness
4          speaker_10_center_83.mov  0.694  0.596  0.571  0.662  0.349  Safe and reliable  Practical and easy to use  Economical/low cost  Conscientiousness   Agreeableness       Openness
5          speaker_11_center_83.mov  0.713  0.595  0.572  0.717  0.378  Safe and reliable  Practical and easy to use  Economical/low cost  Agreeableness       Conscientiousness   Openness
6          speaker_15_center_83.mov  0.664  0.670  0.604  0.696  0.400  Safe and reliable  Practical and easy to use  Economical/low cost  Conscientiousness   Agreeableness       Openness
7          speaker_19_center_83.mov  0.761  0.653  0.651  0.789  0.460  Safe and reliable  Practical and easy to use  Economical/low cost  Agreeableness       Conscientiousness   Openness
8          speaker_23_center_83.mov  0.693  0.683  0.617  0.795  0.447  Safe and reliable  Practical and easy to use  Economical/low cost  Agreeableness       Conscientiousness   Openness
9          speaker_24_center_83.mov  0.706  0.658  0.611  0.697  0.412  Safe and reliable  Practical and easy to use  Economical/low cost  Conscientiousness   Agreeableness       Openness
10         speaker_27_center_83.mov  0.753  0.708  0.655  0.816  0.505  Safe and reliable  Practical and easy to use  Economical/low cost  Agreeableness       Conscientiousness   Openness

Predicting consumer preferences for industrial goods using mobile device application categories as an example

As an example, the correlation coefficients between personality traits and mobile device application categories are taken from the following article:

  1. Peltonen E., Sharmila P., Asare K. O., Visuri A., Lagerspetz E., Ferreira D. When phones get personal: Predicting Big Five personality traits from application usage // Pervasive and Mobile Computing. – 2020. – Vol. 69. – Art. 101269.

[10]:
# Loading a dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478676690?token=7KcAxPqMpWiYQnx&filename=divice_characteristics.csv'
df_divice_characteristics = pd.read_csv(url)

df_divice_characteristics.index.name = 'ID'
df_divice_characteristics.index += 1
df_divice_characteristics.index = df_divice_characteristics.index.map(str)

df_divice_characteristics
[10]:
ID  Trait              Communication  Game Action  Game Board  Game Casino  Game Educational  Game Simulation  Game Trivia  Entertainment  Finance  Health and Fitness  Media and Video  Music and Audio  News and Magazines  Personalisation  Travel and Local  Weather
1   Openness           0.118          0.056        0.079       0.342        0.027             0.104            0.026        0.000          0.006    0.002               0.000            0.000            0.001               0.004            0.002             0.004
2   Conscientiousness  0.119          0.043        0.107       0.448        0.039             0.012            0.119        0.000          0.005    0.001               0.000            0.002            0.002               0.001            0.001             0.003
3   Extraversion       0.246          0.182        0.211       0.311        0.102             0.165            0.223        0.001          0.003    0.000               0.001            0.001            0.001               0.004            0.009             0.003
4   Agreeableness      0.218          0.104        0.164       0.284        0.165             0.122            0.162        0.000          0.003    0.001               0.000            0.002            0.002               0.001            0.004             0.003
5   Non-Neuroticism    0.046          0.047        0.125       0.515        0.272             0.179            0.214        0.002          0.030    0.001               0.000            0.005            0.003               0.008            0.004             0.007
[11]:
_b5._priority_calculation(
    correlation_coefficients = df_divice_characteristics,
    col_name_ocean = 'Trait',
    threshold = 0.55,
    number_priority = 3,
    number_importance_traits = 3,
    out = True
)

_b5._save_logs(df = _b5.df_files_priority_, name = 'divice_characteristics_priorities_mupta_ru', out = True)

# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[11]:
Person ID  Path                      OPE    CON    EXT    AGR    NNEU   Priority 1   Priority 2     Priority 3  Trait importance 1  Trait importance 2  Trait importance 3
1          speaker_01_center_83.mov  0.758  0.693  0.650  0.745  0.489  Game Casino  Communication  Game Board  Extraversion        Agreeableness       Conscientiousness
2          speaker_06_center_83.mov  0.682  0.654  0.607  0.731  0.418  Game Casino  Communication  Game Board  Agreeableness       Extraversion        Conscientiousness
3          speaker_07_center_83.mov  0.666  0.657  0.568  0.685  0.378  Game Casino  Communication  Game Board  Agreeableness       Conscientiousness   Extraversion
4          speaker_10_center_83.mov  0.694  0.596  0.571  0.662  0.349  Game Casino  Communication  Game Board  Agreeableness       Extraversion        Conscientiousness
5          speaker_11_center_83.mov  0.713  0.595  0.572  0.717  0.378  Game Casino  Communication  Game Board  Agreeableness       Extraversion        Conscientiousness
6          speaker_15_center_83.mov  0.664  0.670  0.604  0.696  0.400  Game Casino  Communication  Game Board  Extraversion        Agreeableness       Conscientiousness
7          speaker_19_center_83.mov  0.761  0.653  0.651  0.789  0.460  Game Casino  Communication  Game Board  Agreeableness       Extraversion        Conscientiousness
8          speaker_23_center_83.mov  0.693  0.683  0.617  0.795  0.447  Game Casino  Communication  Game Board  Agreeableness       Extraversion        Conscientiousness
9          speaker_24_center_83.mov  0.706  0.658  0.611  0.697  0.412  Game Casino  Communication  Game Board  Extraversion        Agreeableness       Conscientiousness
10         speaker_27_center_83.mov  0.753  0.708  0.655  0.816  0.505  Game Casino  Communication  Game Board  Agreeableness       Extraversion        Conscientiousness

MuPTA (en)

[12]:
import os
import pandas as pd

# Module import
from oceanai.modules.lab.build import Run

# Creating an instance of a class
_b5 = Run(lang = 'en')

corpus = 'fi'
lang = 'en'

# Core setup
_b5.path_to_save_ = './models' # Directory to save the models
_b5.chunk_size_ = 2000000      # File download size from network in one step

# Building audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()

# Loading audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url)

url = _b5.weights_for_big5_['audio'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url)

# Building video models
res_load_model_hc = _b5.load_video_model_hc(lang=lang)
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()

# Loading video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['sberdisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url)

url = _b5.weights_for_big5_['video'][corpus]['fe']['sberdisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url)

url = _b5.weights_for_big5_['video'][corpus]['nn']['sberdisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url)

# Loading a dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()

# Building text models
res_setup_translation_model = _b5.setup_translation_model()
res_setup_bert_encoder = _b5.setup_bert_encoder()
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)

# Loading text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['sberdisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url)

url = _b5.weights_for_big5_['text'][corpus]['nn']['sberdisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url)

# Building model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()

# Loading model weights for multimodal information fusion
url = _b5.weights_for_big5_['avt'][corpus]['b5']['sberdisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url)

PATH_TO_DIR = './video_MuPTA/'
PATH_SAVE_VIDEO = './video_MuPTA/test/'

_b5.path_to_save_ = PATH_SAVE_VIDEO

# Loading 10 test files from the MuPTA corpus
# URL: https://hci.nw.ru/en/pages/mupta-corpus
domain = 'https://download.sberdisk.ru/download/file/'
tets_name_files = [
    '477995979?token=2cvyk7CS0mHx2MJ&filename=speaker_06_center_83.mov',
    '477995980?token=jGPtBPS69uzFU6Y&filename=speaker_01_center_83.mov',
    '477995967?token=zCaRbNB6ht5wMPq&filename=speaker_11_center_83.mov',
    '477995966?token=B1rbinDYRQKrI3T&filename=speaker_15_center_83.mov',
    '477995978?token=dEpVDtZg1EQiEQ9&filename=speaker_07_center_83.mov',
    '477995961?token=o1hVjw8G45q9L9Z&filename=speaker_19_center_83.mov',
    '477995964?token=5K220Aqf673VHPq&filename=speaker_23_center_83.mov',
    '477995965?token=v1LVD2KT1cU7Lpb&filename=speaker_24_center_83.mov',
    '477995962?token=tmaSGyyWLA6XCy9&filename=speaker_27_center_83.mov',
    '477995963?token=bTpo96qNDPcwGqb&filename=speaker_10_center_83.mov',
]

for curr_files in tets_name_files:
    _b5.download_file_from_url(url = domain + curr_files, out = True)

# Getting scores
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mov'] # Search file extensions

# Full path to the file with ground truth scores for accuracy calculation
url_accuracy = _b5.true_traits_['mupta']['sberdisk']

_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = lang)

[2023-12-16 19:20:55] Feature extraction (hand-crafted and deep) from text …

[2023-12-16 19:20:57] Getting scores and accuracy calculation (multimodal fusion) …

10 from 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_MuPTA\test\speaker_27_center_83.mov …

Person ID  Path                      Openness  Conscientiousness  Extraversion  Agreeableness  Non-Neuroticism
1          speaker_01_center_83.mov  0.564985  0.539052           0.440615      0.59251        0.488763
2          speaker_06_center_83.mov  0.650774  0.663849           0.607308      0.643847       0.620627
3          speaker_07_center_83.mov  0.435976  0.486683           0.313828      0.415446       0.396618
4          speaker_10_center_83.mov  0.498542  0.511243           0.412592      0.468947       0.44399
5          speaker_11_center_83.mov  0.394776  0.341608           0.327082      0.427304       0.354936
6          speaker_15_center_83.mov  0.566107  0.543811           0.492766      0.587411       0.499433
7          speaker_19_center_83.mov  0.506271  0.438215           0.430894      0.456177       0.44075
8          speaker_23_center_83.mov  0.486463  0.521755           0.309894      0.432291       0.433601
9          speaker_24_center_83.mov  0.417404  0.473339           0.320714      0.445086       0.414649
10         speaker_27_center_83.mov  0.526112  0.661107           0.443167      0.558965       0.554224

[2023-12-16 19:20:57] Trait-wise accuracy …

Metrics   Openness  Conscientiousness  Extraversion  Agreeableness  Non-Neuroticism  Mean
MAE       0.1727    0.1672             0.1661        0.2579         0.107            0.1742
Accuracy  0.8273    0.8328             0.8339        0.7421         0.893            0.8258

[2023-12-16 19:20:57] Mean absolute errors: 0.1742, average accuracy: 0.8258 …

Log files saved successfully …

— Runtime: 379.936 sec. —

[12]:
True
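
The Accuracy row in the log above is consistent with accuracy = 1 − MAE for every trait, and the Mean column with a plain average over the five traits. The short sketch below rechecks that arithmetic from the reported MAE values. It is an observation about the printed numbers, not a statement about how the library computes them internally.

```python
# MAE values copied from the "Trait-wise accuracy" log above
mae = {
    'Openness': 0.1727,
    'Conscientiousness': 0.1672,
    'Extraversion': 0.1661,
    'Agreeableness': 0.2579,
    'Non-Neuroticism': 0.107,
}

# Assumption: accuracy is defined as 1 - MAE (this matches the printed table)
accuracy = {trait: round(1 - value, 4) for trait, value in mae.items()}

# Plain averages over the five traits
mean_mae = round(sum(mae.values()) / len(mae), 4)
mean_accuracy = round(sum(accuracy.values()) / len(accuracy), 4)

print(accuracy)
print(mean_mae, mean_accuracy)
```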

Predicting consumer preferences for industrial goods requires correlation coefficients that quantify the relationship between personality traits and preferences for goods or services.

As an example, we use the correlation coefficients between personality traits and car characteristics reported in the following article:

  1. O’Connor P. J. et al. What Drives Consumer Automobile Choice? Investigating Personality Trait Predictors of Vehicle Preference Factors // Personality and Individual Differences. – 2022. – Vol. 184. – 111220.

Users can supply their own correlation coefficients instead.
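
A minimal sketch of what a user-supplied table might look like, assuming it has to follow the same layout as the tables loaded in this tutorial: one row per trait, a `Trait` column holding the trait names, and one numeric column per product characteristic. The column names `Feature A` and `Feature B` and all coefficient values are hypothetical placeholders, not results from any study.

```python
import pandas as pd

traits = ['Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Non-Neuroticism']

# Hypothetical placeholder coefficients; replace them with values from your own study
df_custom_coefficients = pd.DataFrame({
    'Trait': traits,
    'Feature A': [0.10, 0.25, -0.05, 0.15, 0.00],
    'Feature B': [-0.20, 0.05, 0.30, -0.10, 0.05],
})

# Same index convention as the tutorial tables: string IDs starting at '1'
df_custom_coefficients.index.name = 'ID'
df_custom_coefficients.index += 1
df_custom_coefficients.index = df_custom_coefficients.index.map(str)

print(df_custom_coefficients)
```

Such a table would then be passed as `correlation_coefficients`, with `col_name_ocean = 'Trait'`, in the same way as the tables used in the `_priority_calculation` calls of this tutorial.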

Predicting consumer preferences for industrial goods, using car characteristics as an example

[13]:
# Loading dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675818?token=EjfLMqOeK8cfnOu&filename=auto_characteristics.csv'
df_correlation_coefficients = pd.read_csv(url)
# Drop the two columns that are not used in this example
df_correlation_coefficients = df_correlation_coefficients.drop(
    columns = ['Style and performance', 'Safety and practicality']
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)

df_correlation_coefficients
[13]:
ID  Trait              Performance  Classic car features  Luxury additions  Fashion and attention  Recreation  Technology  Family friendly  Safe and reliable  Practical and easy to use  Economical/low cost  Basic features
1   Openness           0.020000     -0.033333             -0.030000         -0.050000              0.033333    0.013333    -0.030000        0.136667           0.106667                   0.093333             0.006667
2   Conscientiousness  0.013333     -0.193333             -0.063333         -0.096667              -0.096667   0.086667    -0.063333        0.280000           0.180000                   0.130000             0.143333
3   Extraversion       0.133333     0.060000              0.106667          0.123333               0.126667    0.120000    0.090000         0.136667           0.043333                   0.073333             0.050000
4   Agreeableness      -0.036667    -0.193333             -0.133333         -0.133333              -0.090000   0.046667    -0.016667        0.240000           0.160000                   0.120000             0.083333
5   Non-Neuroticism    0.016667     -0.006667             -0.010000         -0.006667              -0.033333   0.046667    -0.023333        0.093333           0.046667                   0.046667             -0.040000
[14]:
_b5._priority_calculation(
    correlation_coefficients = df_correlation_coefficients,
    col_name_ocean = 'Trait',
    threshold = 0.55,
    number_priority = 3,
    number_importance_traits = 3,
    out = False
)

_b5._save_logs(df = _b5.df_files_priority_, name = 'auto_characteristics_priorities_mupta_en', out = True)

# Optional: shorten the trait column names and round the scores for display
df = _b5.df_files_priority_.rename(columns = {
    'Openness': 'OPE', 'Conscientiousness': 'CON', 'Extraversion': 'EXT',
    'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU',
})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].round(3)
df
[14]:
Person ID  Path                      OPE    CON    EXT    AGR    NNEU   Priority 1                 Priority 2                 Priority 3           Trait importance 1  Trait importance 2  Trait importance 3
1          speaker_01_center_83.mov  0.565  0.539  0.441  0.593  0.489  Practical and easy to use  Economical/low cost        Family friendly      Agreeableness       Openness            Non-Neuroticism
2          speaker_06_center_83.mov  0.651  0.664  0.607  0.644  0.621  Safe and reliable          Practical and easy to use  Economical/low cost  Conscientiousness   Agreeableness       Openness
3          speaker_07_center_83.mov  0.436  0.487  0.314  0.415  0.397  Classic car features       Fashion and attention      Luxury additions     Agreeableness       Conscientiousness   Openness
4          speaker_10_center_83.mov  0.499  0.511  0.413  0.469  0.444  Classic car features       Fashion and attention      Luxury additions     Agreeableness       Conscientiousness   Openness
5          speaker_11_center_83.mov  0.395  0.342  0.327  0.427  0.355  Classic car features       Fashion and attention      Luxury additions     Agreeableness       Conscientiousness   Openness
6          speaker_15_center_83.mov  0.566  0.544  0.493  0.587  0.499  Practical and easy to use  Economical/low cost        Family friendly      Agreeableness       Openness            Non-Neuroticism
7          speaker_19_center_83.mov  0.506  0.438  0.431  0.456  0.441  Classic car features       Fashion and attention      Luxury additions     Agreeableness       Conscientiousness   Openness
8          speaker_23_center_83.mov  0.486  0.522  0.310  0.432  0.434  Classic car features       Fashion and attention      Luxury additions     Agreeableness       Conscientiousness   Openness
9          speaker_24_center_83.mov  0.417  0.473  0.321  0.445  0.415  Classic car features       Fashion and attention      Luxury additions     Agreeableness       Conscientiousness   Openness
10         speaker_27_center_83.mov  0.526  0.661  0.443  0.559  0.554  Safe and reliable          Practical and easy to use  Economical/low cost  Conscientiousness   Agreeableness       Non-Neuroticism
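
How `_priority_calculation` arrives at these rankings is not spelled out in this tutorial. The sketch below is an assumption, not the library's implementation. Each trait score gets a positive sign if it reaches the threshold and a negative sign otherwise, the signed scores are multiplied by the correlation coefficients, and the characteristics are ranked by the resulting sums. For the two speakers used here (Person ID 1 and 3) this scheme yields the same three priorities as the table above. The function name `rank_characteristics` is hypothetical.

```python
import numpy as np

characteristics = [
    'Performance', 'Classic car features', 'Luxury additions',
    'Fashion and attention', 'Recreation', 'Technology', 'Family friendly',
    'Safe and reliable', 'Practical and easy to use', 'Economical/low cost',
    'Basic features',
]

# Correlation coefficients copied from the table loaded above.
# Rows: Openness, Conscientiousness, Extraversion, Agreeableness, Non-Neuroticism
coefficients = np.array([
    [0.020000, -0.033333, -0.030000, -0.050000, 0.033333, 0.013333,
     -0.030000, 0.136667, 0.106667, 0.093333, 0.006667],
    [0.013333, -0.193333, -0.063333, -0.096667, -0.096667, 0.086667,
     -0.063333, 0.280000, 0.180000, 0.130000, 0.143333],
    [0.133333, 0.060000, 0.106667, 0.123333, 0.126667, 0.120000,
     0.090000, 0.136667, 0.043333, 0.073333, 0.050000],
    [-0.036667, -0.193333, -0.133333, -0.133333, -0.090000, 0.046667,
     -0.016667, 0.240000, 0.160000, 0.120000, 0.083333],
    [0.016667, -0.006667, -0.010000, -0.006667, -0.033333, 0.046667,
     -0.023333, 0.093333, 0.046667, 0.046667, -0.040000],
])


def rank_characteristics(scores, threshold=0.55, top_n=3):
    """Hypothetical ranking: signed, score-weighted sum of correlations."""
    scores = np.asarray(scores, dtype=float)
    # +1 for traits at or above the threshold, -1 for traits below it
    signs = np.where(scores >= threshold, 1.0, -1.0)
    # One weight per characteristic
    weights = (signs * scores) @ coefficients
    # Indices of the top_n largest weights, in descending order
    order = np.argsort(weights)[::-1][:top_n]
    return [characteristics[i] for i in order]


# Trait scores copied from the prediction output earlier in this tutorial
speaker_01 = [0.564985, 0.539052, 0.440615, 0.59251, 0.488763]
speaker_07 = [0.435976, 0.486683, 0.313828, 0.415446, 0.396618]

print(rank_characteristics(speaker_01))
print(rank_characteristics(speaker_07))
```

Under this reading, `threshold` decides whether a trait pushes a characteristic up or down, which is why speakers whose scores all fall below 0.55 end up with the characteristics that correlate most negatively with the traits.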

Predicting consumer preferences for industrial goods, using mobile device application categories as an example

As an example, we use the correlation coefficients between personality traits and mobile device application categories reported in the following article:

  1. Peltonen E., Sharmila P., Asare K. O., Visuri A., Lagerspetz E., Ferreira D. When phones get personal: Predicting Big Five personality traits from application usage // Pervasive and Mobile Computing. – 2020. – Vol. 69. – 101269.

[15]:
# Loading a dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478676690?token=7KcAxPqMpWiYQnx&filename=divice_characteristics.csv'
df_divice_characteristics = pd.read_csv(url)

df_divice_characteristics.index.name = 'ID'
df_divice_characteristics.index += 1
df_divice_characteristics.index = df_divice_characteristics.index.map(str)

df_divice_characteristics
[15]:
ID  Trait              Communication  Game Action  Game Board  Game Casino  Game Educational  Game Simulation  Game Trivia  Entertainment  Finance  Health and Fitness  Media and Video  Music and Audio  News and Magazines  Personalisation  Travel and Local  Weather
1   Openness           0.118          0.056        0.079       0.342        0.027             0.104            0.026        0.000          0.006    0.002               0.000            0.000            0.001               0.004            0.002             0.004
2   Conscientiousness  0.119          0.043        0.107       0.448        0.039             0.012            0.119        0.000          0.005    0.001               0.000            0.002            0.002               0.001            0.001             0.003
3   Extraversion       0.246          0.182        0.211       0.311        0.102             0.165            0.223        0.001          0.003    0.000               0.001            0.001            0.001               0.004            0.009             0.003
4   Agreeableness      0.218          0.104        0.164       0.284        0.165             0.122            0.162        0.000          0.003    0.001               0.000            0.002            0.002               0.001            0.004             0.003
5   Non-Neuroticism    0.046          0.047        0.125       0.515        0.272             0.179            0.214        0.002          0.030    0.001               0.000            0.005            0.003               0.008            0.004             0.007
[16]:
_b5._priority_calculation(
    correlation_coefficients = df_divice_characteristics,
    col_name_ocean = 'Trait',
    threshold = 0.55,
    number_priority = 3,
    number_importance_traits = 3,
    out = True
)

_b5._save_logs(df = _b5.df_files_priority_, name = 'divice_characteristics_priorities_mupta_en', out = True)

# Optional: shorten the trait column names and round the scores for display
df = _b5.df_files_priority_.rename(columns = {
    'Openness': 'OPE', 'Conscientiousness': 'CON', 'Extraversion': 'EXT',
    'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU',
})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].round(3)
df
[16]:
Person ID  Path                      OPE    CON    EXT    AGR    NNEU   Priority 1          Priority 2          Priority 3          Trait importance 1  Trait importance 2  Trait importance 3
1          speaker_01_center_83.mov  0.565  0.539  0.441  0.593  0.489  Communication       Health and Fitness  Media and Video     Agreeableness       Openness            Non-Neuroticism
2          speaker_06_center_83.mov  0.651  0.664  0.607  0.644  0.621  Game Casino         Communication       Game Trivia         Non-Neuroticism     Extraversion        Conscientiousness
3          speaker_07_center_83.mov  0.436  0.487  0.314  0.415  0.397  Media and Video     Entertainment       Health and Fitness  Agreeableness       Conscientiousness   Extraversion
4          speaker_10_center_83.mov  0.499  0.511  0.413  0.469  0.444  Media and Video     Entertainment       Health and Fitness  Agreeableness       Conscientiousness   Extraversion
5          speaker_11_center_83.mov  0.395  0.342  0.327  0.427  0.355  Media and Video     Entertainment       Health and Fitness  Conscientiousness   Agreeableness       Extraversion
6          speaker_15_center_83.mov  0.566  0.544  0.493  0.587  0.499  Health and Fitness  Media and Video     News and Magazines  Agreeableness       Openness            Extraversion
7          speaker_19_center_83.mov  0.506  0.438  0.431  0.456  0.441  Media and Video     Entertainment       Health and Fitness  Conscientiousness   Agreeableness       Extraversion
8          speaker_23_center_83.mov  0.486  0.522  0.310  0.432  0.434  Media and Video     Entertainment       Health and Fitness  Agreeableness       Conscientiousness   Extraversion
9          speaker_24_center_83.mov  0.417  0.473  0.321  0.445  0.415  Media and Video     Entertainment       Health and Fitness  Agreeableness       Conscientiousness   Extraversion
10         speaker_27_center_83.mov  0.526  0.661  0.443  0.559  0.554  Game Casino         Game Educational    Game Trivia         Non-Neuroticism     Conscientiousness   Agreeableness
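
Once priorities are assigned per person, they can be summarised across the sample. A minimal sketch, assuming the `Priority 1` column of the table above is copied into a plain list. In a live session the same column would be read from `_b5.df_files_priority_` instead.

```python
import pandas as pd

# "Priority 1" values copied from the table above, in Person ID order
priority_1 = pd.Series([
    'Communication', 'Game Casino', 'Media and Video', 'Media and Video',
    'Media and Video', 'Health and Fitness', 'Media and Video',
    'Media and Video', 'Media and Video', 'Game Casino',
], name = 'Priority 1')

# Number of people per first-priority application category
counts = priority_1.value_counts()
print(counts)
```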