Solution of practical task 2
Task: Predicting consumer preferences for industrial goods
The task is solved in two stages. First, the OCEAN-AI library is used to obtain predictions (personality trait scores). Second, the _priority_calculation method of the OCEAN-AI library is applied to those scores to rank product categories for each person. Example results and the implementation are presented below.
Thus, the OCEAN-AI library provides tools for analyzing consumers' personality traits, which helps predict their interests. This enables companies to tailor products and services more closely to consumer preferences and to personalize their offerings.
FI V2
[2]:
# Import required tools
import os
import pandas as pd
# Import the module
from oceanai.modules.lab.build import Run
# Create a class instance
_b5 = Run()
# Core settings
_b5.path_to_save_ = './models' # Directory for saving files
_b5.chunk_size_ = 2000000 # Download chunk size per step
corpus = 'fi'
# Build the audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()
# Load the audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['audio'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url, force_reload = False)
# Build the video models
res_load_model_hc = _b5.load_video_model_hc(lang='en')
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()
# Load the video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['fe']['googledisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url, force_reload = False)
# Load the dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()
# Build the text models
res_setup_translation_model = _b5.setup_translation_model() # only for the Russian language
res_setup_bert_encoder = _b5.setup_bert_encoder(force_reload = False)
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)
# Load the text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['googledisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['text'][corpus]['nn']['googledisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url, force_reload = False)
# Build the model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()
# Load the weights of the multimodal information fusion model
url = _b5.weights_for_big5_['avt'][corpus]['b5']['googledisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url, force_reload = False)
PATH_TO_DIR = './video_FI/'
PATH_SAVE_VIDEO = './video_FI/test/'
_b5.path_to_save_ = PATH_SAVE_VIDEO
# Download 10 test audio-video recordings from the First Impressions V2 corpus
# URL: https://chalearnlap.cvc.uab.cat/dataset/24/description/
domain = 'https://download.sberdisk.ru/download/file/'
tets_name_files = [
'429713680?token=FqHdMLSSh7zYSZt&filename=_plk5k7PBEg.003.mp4',
'429713681?token=Hz9b4lQkrLfic33&filename=be0DQawtVkE.002.mp4',
'429713683?token=EgUXS9Xs8xHm5gz&filename=2d6btbaNdfo.000.mp4',
'429713684?token=1U26753kmPYdIgt&filename=300gK3CnzW0.003.mp4',
'429713685?token=LyigAWLTzDNwKJO&filename=300gK3CnzW0.001.mp4',
'429713686?token=EpfRbCKHyuc4HPu&filename=cLaZxEf1nE4.004.mp4',
'429713687?token=FNTkwqBr4jOS95l&filename=g24JGYuT74A.004.mp4',
'429713688?token=qDT95nz7hfm2Nki&filename=JZNMxa3OKHY.000.mp4',
'429713689?token=noLguEGXDpbcKhg&filename=nvlqJbHk_Lc.003.mp4',
'429713679?token=9L7RQ0hgdJlcek6&filename=4vdJGgZpj4k.003.mp4'
]
for curr_files in tets_name_files:
    _b5.download_file_from_url(url = domain + curr_files, out = True)
# Get predictions
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mp4'] # Extensions of the files to search for
# Full path to the file with ground-truth scores for computing accuracy
url_accuracy = _b5.true_traits_[corpus]['googledisk']
_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = 'en')
[2024-10-10 18:10:50] Extracting features (hand-crafted and deep) from text …
[2024-10-10 18:10:50] Getting predictions and computing accuracy (multimodal fusion) …
10 of 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_FI\test\_plk5k7PBEg.003.mp4 …
| Person ID | Path | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
|---|---|---|---|---|---|---|
| 1 | 2d6btbaNdfo.000.mp4 | 0.618917 | 0.660694 | 0.477656 | 0.654437 | 0.601256 |
| 2 | 300gK3CnzW0.001.mp4 | 0.461732 | 0.413451 | 0.415706 | 0.498301 | 0.431224 |
| 3 | 300gK3CnzW0.003.mp4 | 0.468002 | 0.448618 | 0.371742 | 0.509602 | 0.453739 |
| 4 | 4vdJGgZpj4k.003.mp4 | 0.585348 | 0.616446 | 0.49443 | 0.605614 | 0.587017 |
| 5 | be0DQawtVkE.002.mp4 | 0.680991 | 0.56602 | 0.553915 | 0.646545 | 0.64246 |
| 6 | cLaZxEf1nE4.004.mp4 | 0.66342 | 0.551018 | 0.557912 | 0.585238 | 0.587174 |
| 7 | g24JGYuT74A.004.mp4 | 0.590237 | 0.399273 | 0.409554 | 0.531861 | 0.507134 |
| 8 | JZNMxa3OKHY.000.mp4 | 0.60577 | 0.523617 | 0.531137 | 0.594406 | 0.57984 |
| 9 | nvlqJbHk_Lc.003.mp4 | 0.511002 | 0.464702 | 0.390882 | 0.443663 | 0.438811 |
| 10 | _plk5k7PBEg.003.mp4 | 0.647606 | 0.610466 | 0.524718 | 0.61428 | 0.606428 |
[2024-10-10 18:10:50] Accuracy for individual personality traits …
| Metrics | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism | Mean |
|---|---|---|---|---|---|---|
| MAE | 0.0735 | 0.0631 | 0.0914 | 0.0706 | 0.0691 | 0.0735 |
| Accuracy | 0.9265 | 0.9369 | 0.9086 | 0.9294 | 0.9309 | 0.9265 |
[2024-10-10 18:10:50] Mean of the mean absolute errors: 0.0735, mean accuracy: 0.9265 …
Log files saved successfully …
--- Runtime: 35.449 sec. ---
[2]:
True
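In the metrics table above, each accuracy value equals 1 minus the corresponding MAE (for example, 1 - 0.0735 = 0.9265). The sketch below illustrates that relationship with made-up ground-truth and predicted scores rather than the corpus data; the names y_true and y_pred are illustrative and not part of OCEAN-AI.

```python
traits = ['Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Non-Neuroticism']

# Hypothetical scores for three recordings (rows) and five traits (columns)
y_true = [
    [0.60, 0.70, 0.50, 0.65, 0.55],
    [0.40, 0.45, 0.35, 0.50, 0.40],
    [0.70, 0.60, 0.60, 0.60, 0.65],
]
y_pred = [
    [0.62, 0.66, 0.48, 0.65, 0.60],
    [0.46, 0.41, 0.42, 0.50, 0.43],
    [0.68, 0.57, 0.55, 0.65, 0.64],
]

n_records = len(y_true)

# Mean absolute error per trait, averaged over recordings
mae = [
    sum(abs(t[j] - p[j]) for t, p in zip(y_true, y_pred)) / n_records
    for j in range(len(traits))
]
# Accuracy as reported in the table: 1 - MAE
accuracy = [1.0 - m for m in mae]

for name, m, a in zip(traits, mae, accuracy):
    print(f'{name}: MAE = {m:.4f}, accuracy = {a:.4f}')

mean_mae = sum(mae) / len(mae)
print(f'Mean MAE = {mean_mae:.4f}, mean accuracy = {1.0 - mean_mae:.4f}')
```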
To predict consumer preferences for industrial goods, one needs correlation coefficients that quantify the relationship between personality traits and preferences for goods or services.
As an example, we use the correlation coefficients between personality traits and car characteristics reported in:
O’Connor P. J. et al. What Drives Consumer Automobile Choice? Investigating Personality Trait Predictors of Vehicle Preference Factors // Personality and Individual Differences. – 2022. – Vol. 184. – 111220.
Users can supply their own correlation coefficients.
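A custom coefficient table can follow the same layout as the tables used in this notebook: one row per trait, a Trait column with the trait names, and one numeric column per product category. The sketch below builds such a DataFrame by hand. The category names and values are invented for illustration, and the assumption that _priority_calculation accepts any table in this layout is inferred from the examples here, not from the library's documentation.

```python
import pandas as pd

# Hypothetical coefficients: rows are traits, columns are made-up product categories
df_custom_coefficients = pd.DataFrame({
    'Trait': ['Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Non-Neuroticism'],
    'Category A': [0.10, 0.25, -0.05, 0.15, 0.05],
    'Category B': [0.30, -0.10, 0.20, 0.00, 0.10],
    'Category C': [-0.15, 0.05, 0.35, 0.10, -0.05],
})

# Same index conventions as the notebook tables: string IDs starting at 1
df_custom_coefficients.index.name = 'ID'
df_custom_coefficients.index += 1
df_custom_coefficients.index = df_custom_coefficients.index.map(str)

print(df_custom_coefficients)
```

Such a table would then be passed as correlation_coefficients, with col_name_ocean = 'Trait', in the same way as the downloaded tables in this notebook.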
Predicting consumer preferences for industrial goods: car characteristics
[3]:
# Loading dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675818?token=EjfLMqOeK8cfnOu&filename=auto_characteristics.csv'
df_correlation_coefficients = pd.read_csv(url)
df_correlation_coefficients = pd.DataFrame(
df_correlation_coefficients.drop(['Style and performance', 'Safety and practicality'], axis = 1)
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)
df_correlation_coefficients
[3]:
| ID | Trait | Performance | Classic car features | Luxury additions | Fashion and attention | Recreation | Technology | Family friendly | Safe and reliable | Practical and easy to use | Economical/low cost | Basic features |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.020000 | -0.033333 | -0.030000 | -0.050000 | 0.033333 | 0.013333 | -0.030000 | 0.136667 | 0.106667 | 0.093333 | 0.006667 |
| 2 | Conscientiousness | 0.013333 | -0.193333 | -0.063333 | -0.096667 | -0.096667 | 0.086667 | -0.063333 | 0.280000 | 0.180000 | 0.130000 | 0.143333 |
| 3 | Extraversion | 0.133333 | 0.060000 | 0.106667 | 0.123333 | 0.126667 | 0.120000 | 0.090000 | 0.136667 | 0.043333 | 0.073333 | 0.050000 |
| 4 | Agreeableness | -0.036667 | -0.193333 | -0.133333 | -0.133333 | -0.090000 | 0.046667 | -0.016667 | 0.240000 | 0.160000 | 0.120000 | 0.083333 |
| 5 | Non-Neuroticism | 0.016667 | -0.006667 | -0.010000 | -0.006667 | -0.033333 | 0.046667 | -0.023333 | 0.093333 | 0.046667 | 0.046667 | -0.040000 |
[4]:
_b5._priority_calculation(
correlation_coefficients = df_correlation_coefficients,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = False
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'auto_characteristics_priorities_fi_en', out = True)
# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[4]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2d6btbaNdfo.000.mp4 | 0.619 | 0.661 | 0.478 | 0.654 | 0.601 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 2 | 300gK3CnzW0.001.mp4 | 0.462 | 0.413 | 0.416 | 0.498 | 0.431 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 3 | 300gK3CnzW0.003.mp4 | 0.468 | 0.449 | 0.372 | 0.510 | 0.454 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 4 | 4vdJGgZpj4k.003.mp4 | 0.585 | 0.616 | 0.494 | 0.606 | 0.587 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 5 | be0DQawtVkE.002.mp4 | 0.681 | 0.566 | 0.554 | 0.647 | 0.642 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
| 6 | cLaZxEf1nE4.004.mp4 | 0.663 | 0.551 | 0.558 | 0.585 | 0.587 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 7 | g24JGYuT74A.004.mp4 | 0.590 | 0.399 | 0.410 | 0.532 | 0.507 | Classic car features | Recreation | Luxury additions | Agreeableness | Conscientiousness | Non-Neuroticism |
| 8 | JZNMxa3OKHY.000.mp4 | 0.606 | 0.524 | 0.531 | 0.594 | 0.580 | Practical and easy to use | Safe and reliable | Economical/low cost | Agreeableness | Openness | Non-Neuroticism |
| 9 | nvlqJbHk_Lc.003.mp4 | 0.511 | 0.465 | 0.391 | 0.444 | 0.439 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 10 | _plk5k7PBEg.003.mp4 | 0.648 | 0.610 | 0.525 | 0.614 | 0.606 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
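How _priority_calculation weighs the coefficients is not spelled out here. One plausible reading, shown only as an assumption, is that each trait votes with the sign of (score - threshold): coefficients of traits at or above the threshold are added, those below it are subtracted, and the categories are ranked by the resulting sum. The helper rank_categories is hypothetical and not part of OCEAN-AI. The sketch uses four of the car columns and the rounded scores of g24JGYuT74A.004.mp4 from the tables above.

```python
TRAITS = ['Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Non-Neuroticism']

# Four columns copied from the car coefficient table (rounded to 4 decimals),
# each list ordered as in TRAITS
coefficients = {
    'Classic car features': [-0.0333, -0.1933, 0.0600, -0.1933, -0.0067],
    'Recreation': [0.0333, -0.0967, 0.1267, -0.0900, -0.0333],
    'Safe and reliable': [0.1367, 0.2800, 0.1367, 0.2400, 0.0933],
    'Practical and easy to use': [0.1067, 0.1800, 0.0433, 0.1600, 0.0467],
}


def rank_categories(scores, coefficients, threshold=0.55):
    """Rank categories by the signed sum of their coefficients (assumed rule)."""
    # +1 for traits at or above the threshold, -1 for traits below it
    signs = [1 if s >= threshold else -1 for s in scores]
    totals = {
        name: sum(sign * c for sign, c in zip(signs, column))
        for name, column in coefficients.items()
    }
    # Highest total first
    return sorted(totals, key=totals.get, reverse=True)


# Rounded OPE, CON, EXT, AGR, NNEU scores of g24JGYuT74A.004.mp4
scores = [0.590, 0.399, 0.410, 0.532, 0.507]
ranking = rank_categories(scores, coefficients)
print(ranking[:2])
```

Under this assumed rule the first two entries are 'Classic car features' and 'Recreation', which agrees with Priority 1 and Priority 2 for this recording in the table above. That agreement does not confirm that the library uses exactly this rule.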
Predicting consumer preferences for industrial goods: mobile application categories
As an example, we use the correlation coefficients between personality traits and mobile application categories reported in:
Peltonen E., Sharmila P., Asare K. O., Visuri A., Lagerspetz E., Ferreira D. When phones get personal: Predicting Big Five personality traits from application usage // Pervasive and Mobile Computing. – 2020. – Vol. 69. – 101269.
[5]:
# Loading a dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478676690?token=7KcAxPqMpWiYQnx&filename=divice_characteristics.csv'
df_divice_characteristics = pd.read_csv(url)
df_divice_characteristics.index.name = 'ID'
df_divice_characteristics.index += 1
df_divice_characteristics.index = df_divice_characteristics.index.map(str)
df_divice_characteristics
[5]:
| ID | Trait | Communication | Game Action | Game Board | Game Casino | Game Educational | Game Simulation | Game Trivia | Entertainment | Finance | Health and Fitness | Media and Video | Music and Audio | News and Magazines | Personalisation | Travel and Local | Weather |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.118 | 0.056 | 0.079 | 0.342 | 0.027 | 0.104 | 0.026 | 0.000 | 0.006 | 0.002 | 0.000 | 0.000 | 0.001 | 0.004 | 0.002 | 0.004 |
| 2 | Conscientiousness | 0.119 | 0.043 | 0.107 | 0.448 | 0.039 | 0.012 | 0.119 | 0.000 | 0.005 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.001 | 0.003 |
| 3 | Extraversion | 0.246 | 0.182 | 0.211 | 0.311 | 0.102 | 0.165 | 0.223 | 0.001 | 0.003 | 0.000 | 0.001 | 0.001 | 0.001 | 0.004 | 0.009 | 0.003 |
| 4 | Agreeableness | 0.218 | 0.104 | 0.164 | 0.284 | 0.165 | 0.122 | 0.162 | 0.000 | 0.003 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.004 | 0.003 |
| 5 | Non-Neuroticism | 0.046 | 0.047 | 0.125 | 0.515 | 0.272 | 0.179 | 0.214 | 0.002 | 0.030 | 0.001 | 0.000 | 0.005 | 0.003 | 0.008 | 0.004 | 0.007 |
[6]:
_b5._priority_calculation(
correlation_coefficients = df_divice_characteristics,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'divice_characteristics_priorities_fi_en', out = True)
# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[6]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2d6btbaNdfo.000.mp4 | 0.619 | 0.661 | 0.478 | 0.654 | 0.601 | Game Casino | Game Educational | Game Trivia | Non-Neuroticism | Conscientiousness | Agreeableness |
| 2 | 300gK3CnzW0.001.mp4 | 0.462 | 0.413 | 0.416 | 0.498 | 0.431 | Media and Video | Entertainment | Health and Fitness | Conscientiousness | Agreeableness | Extraversion |
| 3 | 300gK3CnzW0.003.mp4 | 0.468 | 0.449 | 0.372 | 0.510 | 0.454 | Media and Video | Entertainment | Health and Fitness | Conscientiousness | Agreeableness | Extraversion |
| 4 | 4vdJGgZpj4k.003.mp4 | 0.585 | 0.616 | 0.494 | 0.606 | 0.587 | Game Casino | Game Educational | Game Trivia | Non-Neuroticism | Conscientiousness | Agreeableness |
| 5 | be0DQawtVkE.002.mp4 | 0.681 | 0.566 | 0.554 | 0.647 | 0.642 | Game Casino | Communication | Game Trivia | Non-Neuroticism | Extraversion | Agreeableness |
| 6 | cLaZxEf1nE4.004.mp4 | 0.663 | 0.551 | 0.558 | 0.585 | 0.587 | Game Casino | Communication | Game Trivia | Non-Neuroticism | Extraversion | Agreeableness |
| 7 | g24JGYuT74A.004.mp4 | 0.590 | 0.399 | 0.410 | 0.532 | 0.507 | Health and Fitness | Media and Video | Entertainment | Openness | Conscientiousness | Agreeableness |
| 8 | JZNMxa3OKHY.000.mp4 | 0.606 | 0.524 | 0.531 | 0.594 | 0.580 | Game Casino | Game Educational | Game Simulation | Non-Neuroticism | Agreeableness | Openness |
| 9 | nvlqJbHk_Lc.003.mp4 | 0.511 | 0.465 | 0.391 | 0.444 | 0.439 | Media and Video | Entertainment | Health and Fitness | Agreeableness | Conscientiousness | Extraversion |
| 10 | _plk5k7PBEg.003.mp4 | 0.648 | 0.610 | 0.525 | 0.614 | 0.606 | Game Casino | Game Educational | Game Trivia | Non-Neuroticism | Agreeableness | Conscientiousness |
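To turn per-person priorities into a product-level view, the top categories can be counted across recordings. The sketch below uses the Priority 1 column of the table above, retyped by hand; in the notebook the same column is available in _b5.df_files_priority_.

```python
from collections import Counter

# 'Priority 1' values from the table above, in Person ID order
priority_1 = [
    'Game Casino', 'Media and Video', 'Media and Video', 'Game Casino',
    'Game Casino', 'Game Casino', 'Health and Fitness', 'Game Casino',
    'Media and Video', 'Game Casino',
]

# Count how often each category is ranked first
counts = Counter(priority_1)
for category, n in counts.most_common():
    print(f'{category}: {n} of {len(priority_1)}')
```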
Predicting consumer preferences for clothing style
As an example, we use the correlation coefficients between personality traits and clothing style reported in:
Stolovy T. Styling the self: clothing practices, personality traits, and body image among Israeli women // Frontiers in Psychology. – 2022. – Vol. 12. – 719318.
[7]:
# Loading a dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/493644097?token=KGtSGMxjZtWXmBz&filename=df_%D1%81lothing_style_correlation.csv'
df_clothing_styles = pd.read_csv(url)
df_clothing_styles.index.name = 'ID'
df_clothing_styles.index += 1
df_clothing_styles.index = df_clothing_styles.index.map(str)
df_clothing_styles
[7]:
| ID | Trait | Comfort | Camouflage | Assurance | Fashion | Individuality |
|---|---|---|---|---|---|---|
| 1 | Openness | 0.01 | -0.24 | 0.31 | 0.07 | 0.31 |
| 2 | Conscientiousness | -0.03 | -0.24 | 0.17 | 0.09 | 0.15 |
| 3 | Extraversion | -0.01 | -0.19 | 0.30 | 0.13 | 0.14 |
| 4 | Agreeableness | 0.16 | -0.16 | 0.15 | -0.09 | -0.05 |
| 5 | Non-Neuroticism | 0.03 | -0.16 | 0.01 | 0.00 | 0.06 |
[8]:
_b5._priority_calculation(
correlation_coefficients = df_clothing_styles,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'clothing_styles_priorities_fi_en', out = True)
# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[8]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2d6btbaNdfo.000.mp4 | 0.619 | 0.661 | 0.478 | 0.654 | 0.601 | Assurance | Individuality | Comfort | Openness | Conscientiousness | Agreeableness |
| 2 | 300gK3CnzW0.001.mp4 | 0.462 | 0.413 | 0.416 | 0.498 | 0.431 | Camouflage | Fashion | Comfort | Conscientiousness | Openness | Non-Neuroticism |
| 3 | 300gK3CnzW0.003.mp4 | 0.468 | 0.449 | 0.372 | 0.510 | 0.454 | Camouflage | Fashion | Comfort | Conscientiousness | Openness | Non-Neuroticism |
| 4 | 4vdJGgZpj4k.003.mp4 | 0.585 | 0.616 | 0.494 | 0.606 | 0.587 | Assurance | Individuality | Comfort | Openness | Conscientiousness | Agreeableness |
| 5 | be0DQawtVkE.002.mp4 | 0.681 | 0.566 | 0.554 | 0.647 | 0.642 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 6 | cLaZxEf1nE4.004.mp4 | 0.663 | 0.551 | 0.558 | 0.585 | 0.587 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 7 | g24JGYuT74A.004.mp4 | 0.590 | 0.399 | 0.410 | 0.532 | 0.507 | Camouflage | Individuality | Fashion | Agreeableness | Openness | Non-Neuroticism |
| 8 | JZNMxa3OKHY.000.mp4 | 0.606 | 0.524 | 0.531 | 0.594 | 0.580 | Comfort | Individuality | Assurance | Openness | Agreeableness | Non-Neuroticism |
| 9 | nvlqJbHk_Lc.003.mp4 | 0.511 | 0.465 | 0.391 | 0.444 | 0.439 | Camouflage | Comfort | Fashion | Conscientiousness | Openness | Non-Neuroticism |
| 10 | _plk5k7PBEg.003.mp4 | 0.648 | 0.610 | 0.525 | 0.614 | 0.606 | Assurance | Individuality | Comfort | Openness | Conscientiousness | Agreeableness |
MuPTA (ru)
[9]:
import os
import pandas as pd
# Import the module
from oceanai.modules.lab.build import Run
# Create a class instance
_b5 = Run()
corpus = 'mupta'
lang = 'ru'
# Core settings
_b5.path_to_save_ = './models' # Directory for saving files
_b5.chunk_size_ = 2000000 # Download chunk size per step
# Build the audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()
# Load the audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['audio'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url, force_reload = False)
# Build the video models
res_load_model_hc = _b5.load_video_model_hc(lang=lang)
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()
# Load the video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['fe']['googledisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url, force_reload = False)
# Load the dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()
# Build the text models
res_setup_translation_model = _b5.setup_translation_model() # only for the Russian language
res_setup_bert_encoder = _b5.setup_bert_encoder(force_reload = False)
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)
# Load the text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['googledisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['text'][corpus]['nn']['googledisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url, force_reload = False)
# Build the model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()
# Load the weights of the multimodal information fusion model
url = _b5.weights_for_big5_['avt'][corpus]['b5']['googledisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url, force_reload = False)
PATH_TO_DIR = './video_MuPTA/'
PATH_SAVE_VIDEO = './video_MuPTA/test/'
_b5.path_to_save_ = PATH_SAVE_VIDEO
# Download 10 test audio-video recordings from the MuPTA corpus
# URL: https://hci.nw.ru/en/pages/mupta-corpus
domain = 'https://download.sberdisk.ru/download/file/'
tets_name_files = [
'477995979?token=2cvyk7CS0mHx2MJ&filename=speaker_06_center_83.mov',
'477995980?token=jGPtBPS69uzFU6Y&filename=speaker_01_center_83.mov',
'477995967?token=zCaRbNB6ht5wMPq&filename=speaker_11_center_83.mov',
'477995966?token=B1rbinDYRQKrI3T&filename=speaker_15_center_83.mov',
'477995978?token=dEpVDtZg1EQiEQ9&filename=speaker_07_center_83.mov',
'477995961?token=o1hVjw8G45q9L9Z&filename=speaker_19_center_83.mov',
'477995964?token=5K220Aqf673VHPq&filename=speaker_23_center_83.mov',
'477995965?token=v1LVD2KT1cU7Lpb&filename=speaker_24_center_83.mov',
'477995962?token=tmaSGyyWLA6XCy9&filename=speaker_27_center_83.mov',
'477995963?token=bTpo96qNDPcwGqb&filename=speaker_10_center_83.mov',
]
for curr_files in tets_name_files:
    _b5.download_file_from_url(url = domain + curr_files, out = True)
# Get predictions
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mov'] # Extensions of the files to search for
# Full path to the file with ground-truth scores for computing accuracy
url_accuracy = _b5.true_traits_['mupta']['sberdisk']
_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = lang)
[2024-10-10 18:20:29] Extracting features (hand-crafted and deep) from text …
[2024-10-10 18:20:30] Getting predictions and computing accuracy (multimodal fusion) …
10 of 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_MuPTA\test\speaker_27_center_83.mov …
| Person ID | Path | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.765745 | 0.696637 | 0.656309 | 0.75986 | 0.494141 |
| 2 | speaker_06_center_83.mov | 0.686514 | 0.659488 | 0.611838 | 0.749739 | 0.420672 |
| 3 | speaker_07_center_83.mov | 0.671993 | 0.661216 | 0.571759 | 0.704542 | 0.381026 |
| 4 | speaker_10_center_83.mov | 0.69828 | 0.59893 | 0.571893 | 0.674907 | 0.35082 |
| 5 | speaker_11_center_83.mov | 0.718329 | 0.598986 | 0.573518 | 0.73201 | 0.379845 |
| 6 | speaker_15_center_83.mov | 0.670932 | 0.671055 | 0.602337 | 0.708656 | 0.399527 |
| 7 | speaker_19_center_83.mov | 0.767261 | 0.658167 | 0.653367 | 0.801366 | 0.463443 |
| 8 | speaker_23_center_83.mov | 0.699837 | 0.684907 | 0.616671 | 0.806437 | 0.447853 |
| 9 | speaker_24_center_83.mov | 0.710566 | 0.66299 | 0.610562 | 0.711242 | 0.413696 |
| 10 | speaker_27_center_83.mov | 0.759404 | 0.712562 | 0.658357 | 0.830507 | 0.507612 |
[2024-10-10 18:20:30] Accuracy for individual personality traits …
| Metrics | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism | Mean |
|---|---|---|---|---|---|---|
| MAE | 0.0706 | 0.0788 | 0.1328 | 0.1071 | 0.1002 | 0.0979 |
| Accuracy | 0.9294 | 0.9212 | 0.8672 | 0.8929 | 0.8998 | 0.9021 |
[2024-10-10 18:20:30] Mean of the mean absolute errors: 0.0979, mean accuracy: 0.9021 …
Log files saved successfully …
--- Runtime: 324.067 sec. ---
[9]:
True
To predict consumer preferences for industrial goods, one needs correlation coefficients that quantify the relationship between personality traits and preferences for goods or services.
As an example, we use the correlation coefficients between personality traits and car characteristics reported in:
O’Connor P. J. et al. What Drives Consumer Automobile Choice? Investigating Personality Trait Predictors of Vehicle Preference Factors // Personality and Individual Differences. – 2022. – Vol. 184. – 111220.
Users can supply their own correlation coefficients.
Predicting consumer preferences for industrial goods: car characteristics
[10]:
# Loading dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675818?token=EjfLMqOeK8cfnOu&filename=auto_characteristics.csv'
df_correlation_coefficients = pd.read_csv(url)
df_correlation_coefficients = pd.DataFrame(
df_correlation_coefficients.drop(['Style and performance', 'Safety and practicality'], axis = 1)
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)
df_correlation_coefficients
[10]:
| ID | Trait | Performance | Classic car features | Luxury additions | Fashion and attention | Recreation | Technology | Family friendly | Safe and reliable | Practical and easy to use | Economical/low cost | Basic features |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.020000 | -0.033333 | -0.030000 | -0.050000 | 0.033333 | 0.013333 | -0.030000 | 0.136667 | 0.106667 | 0.093333 | 0.006667 |
| 2 | Conscientiousness | 0.013333 | -0.193333 | -0.063333 | -0.096667 | -0.096667 | 0.086667 | -0.063333 | 0.280000 | 0.180000 | 0.130000 | 0.143333 |
| 3 | Extraversion | 0.133333 | 0.060000 | 0.106667 | 0.123333 | 0.126667 | 0.120000 | 0.090000 | 0.136667 | 0.043333 | 0.073333 | 0.050000 |
| 4 | Agreeableness | -0.036667 | -0.193333 | -0.133333 | -0.133333 | -0.090000 | 0.046667 | -0.016667 | 0.240000 | 0.160000 | 0.120000 | 0.083333 |
| 5 | Non-Neuroticism | 0.016667 | -0.006667 | -0.010000 | -0.006667 | -0.033333 | 0.046667 | -0.023333 | 0.093333 | 0.046667 | 0.046667 | -0.040000 |
[11]:
_b5._priority_calculation(
correlation_coefficients = df_correlation_coefficients,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = False
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'auto_characteristics_priorities_mupta_ru', out = True)
# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[11]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.766 | 0.697 | 0.656 | 0.760 | 0.494 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 2 | speaker_06_center_83.mov | 0.687 | 0.659 | 0.612 | 0.750 | 0.421 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
| 3 | speaker_07_center_83.mov | 0.672 | 0.661 | 0.572 | 0.705 | 0.381 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 4 | speaker_10_center_83.mov | 0.698 | 0.599 | 0.572 | 0.675 | 0.351 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 5 | speaker_11_center_83.mov | 0.718 | 0.599 | 0.574 | 0.732 | 0.380 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
| 6 | speaker_15_center_83.mov | 0.671 | 0.671 | 0.602 | 0.709 | 0.400 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 7 | speaker_19_center_83.mov | 0.767 | 0.658 | 0.653 | 0.801 | 0.463 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
| 8 | speaker_23_center_83.mov | 0.700 | 0.685 | 0.617 | 0.806 | 0.448 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
| 9 | speaker_24_center_83.mov | 0.711 | 0.663 | 0.611 | 0.711 | 0.414 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 10 | speaker_27_center_83.mov | 0.759 | 0.713 | 0.658 | 0.831 | 0.508 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
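All ten recordings above receive the same three priorities. A likely reason, assuming the ranking depends on which traits reach the threshold of 0.55, is that every speaker has the same above/below pattern. The check below uses three rows of the rounded scores from the table; the helper name threshold_pattern is illustrative and not part of OCEAN-AI.

```python
THRESHOLD = 0.55
TRAITS = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']

# Rounded scores for three of the ten speakers, copied from the table above
scores = {
    'speaker_01_center_83.mov': [0.766, 0.697, 0.656, 0.760, 0.494],
    'speaker_10_center_83.mov': [0.698, 0.599, 0.572, 0.675, 0.351],
    'speaker_27_center_83.mov': [0.759, 0.713, 0.658, 0.831, 0.508],
}


def threshold_pattern(values, threshold=THRESHOLD):
    """Return a tuple of booleans: True where the score reaches the threshold."""
    return tuple(v >= threshold for v in values)


patterns = {name: threshold_pattern(values) for name, values in scores.items()}
for name, pattern in patterns.items():
    print(name, dict(zip(TRAITS, pattern)))

# A single distinct pattern means the threshold separates all speakers identically
print('Distinct patterns:', len(set(patterns.values())))
```

The remaining seven rows of the table show the same pattern: the first four traits are at or above 0.55 and Non-Neuroticism is below it.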
Predicting consumer preferences for industrial goods: mobile application categories
As an example, we use the correlation coefficients between personality traits and mobile application categories reported in:
Peltonen E., Sharmila P., Asare K. O., Visuri A., Lagerspetz E., Ferreira D. When phones get personal: Predicting Big Five personality traits from application usage // Pervasive and Mobile Computing. – 2020. – Vol. 69. – 101269.
[12]:
# Loading a dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478676690?token=7KcAxPqMpWiYQnx&filename=divice_characteristics.csv'
df_divice_characteristics = pd.read_csv(url)
df_divice_characteristics.index.name = 'ID'
df_divice_characteristics.index += 1
df_divice_characteristics.index = df_divice_characteristics.index.map(str)
df_divice_characteristics
[12]:
| ID | Trait | Communication | Game Action | Game Board | Game Casino | Game Educational | Game Simulation | Game Trivia | Entertainment | Finance | Health and Fitness | Media and Video | Music and Audio | News and Magazines | Personalisation | Travel and Local | Weather |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.118 | 0.056 | 0.079 | 0.342 | 0.027 | 0.104 | 0.026 | 0.000 | 0.006 | 0.002 | 0.000 | 0.000 | 0.001 | 0.004 | 0.002 | 0.004 |
| 2 | Conscientiousness | 0.119 | 0.043 | 0.107 | 0.448 | 0.039 | 0.012 | 0.119 | 0.000 | 0.005 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.001 | 0.003 |
| 3 | Extraversion | 0.246 | 0.182 | 0.211 | 0.311 | 0.102 | 0.165 | 0.223 | 0.001 | 0.003 | 0.000 | 0.001 | 0.001 | 0.001 | 0.004 | 0.009 | 0.003 |
| 4 | Agreeableness | 0.218 | 0.104 | 0.164 | 0.284 | 0.165 | 0.122 | 0.162 | 0.000 | 0.003 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.004 | 0.003 |
| 5 | Non-Neuroticism | 0.046 | 0.047 | 0.125 | 0.515 | 0.272 | 0.179 | 0.214 | 0.002 | 0.030 | 0.001 | 0.000 | 0.005 | 0.003 | 0.008 | 0.004 | 0.007 |
[13]:
_b5._priority_calculation(
correlation_coefficients = df_divice_characteristics,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'divice_characteristics_priorities_mupta_ru', out = True)
# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[13]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.766 | 0.697 | 0.656 | 0.760 | 0.494 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 2 | speaker_06_center_83.mov | 0.687 | 0.659 | 0.612 | 0.750 | 0.421 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 3 | speaker_07_center_83.mov | 0.672 | 0.661 | 0.572 | 0.705 | 0.381 | Game Casino | Communication | Game Board | Agreeableness | Conscientiousness | Extraversion |
| 4 | speaker_10_center_83.mov | 0.698 | 0.599 | 0.572 | 0.675 | 0.351 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 5 | speaker_11_center_83.mov | 0.718 | 0.599 | 0.574 | 0.732 | 0.380 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 6 | speaker_15_center_83.mov | 0.671 | 0.671 | 0.602 | 0.709 | 0.400 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 7 | speaker_19_center_83.mov | 0.767 | 0.658 | 0.653 | 0.801 | 0.463 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 8 | speaker_23_center_83.mov | 0.700 | 0.685 | 0.617 | 0.806 | 0.448 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 9 | speaker_24_center_83.mov | 0.711 | 0.663 | 0.611 | 0.711 | 0.414 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 10 | speaker_27_center_83.mov | 0.759 | 0.713 | 0.658 | 0.831 | 0.508 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
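All ten rows above receive the same three priorities. A quick check on the printed scores shows that, at `threshold = 0.55`, every speaker has the same pattern of traits above and below the threshold, which may explain the uniform result. How the library uses the threshold internally is an assumption here; the snippet only inspects the numbers from the table.

```python
import pandas as pd

# Rounded trait scores copied from the table above (FI V2 models)
df_scores = pd.DataFrame(
    [
        [0.766, 0.697, 0.656, 0.760, 0.494],
        [0.687, 0.659, 0.612, 0.750, 0.421],
        [0.672, 0.661, 0.572, 0.705, 0.381],
        [0.698, 0.599, 0.572, 0.675, 0.351],
        [0.718, 0.599, 0.574, 0.732, 0.380],
        [0.671, 0.671, 0.602, 0.709, 0.400],
        [0.767, 0.658, 0.653, 0.801, 0.463],
        [0.700, 0.685, 0.617, 0.806, 0.448],
        [0.711, 0.663, 0.611, 0.711, 0.414],
        [0.759, 0.713, 0.658, 0.831, 0.508],
    ],
    columns=['OPE', 'CON', 'EXT', 'AGR', 'NNEU'],
    index=pd.Index([str(i) for i in range(1, 11)], name='Person ID'),
)

# True where a trait score reaches the threshold used in the call above
above_threshold = df_scores >= 0.55

# Distinct above/below patterns across the ten speakers
print(above_threshold.drop_duplicates())
```

Only one pattern appears: OPE, CON, EXT and AGR are at or above the threshold for every speaker, and NNEU is below it.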
Predicting consumer preferences for clothing style
As an example, we use the correlation coefficients between personality traits and clothing styles reported in:
Stolovy T. Styling the self: clothing practices, personality traits, and body image among Israeli women // Frontiers in Psychology. – 2022. – Vol. 12. – 719318.
[14]:
# Loading a dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/493644097?token=KGtSGMxjZtWXmBz&filename=df_%D1%81lothing_style_correlation.csv'
df_clothing_styles = pd.read_csv(url)
df_clothing_styles.index.name = 'ID'
df_clothing_styles.index += 1
df_clothing_styles.index = df_clothing_styles.index.map(str)
df_clothing_styles
[14]:
| ID | Trait | Comfort | Camouflage | Assurance | Fashion | Individuality |
|---|---|---|---|---|---|---|
| 1 | Openness | 0.01 | -0.24 | 0.31 | 0.07 | 0.31 |
| 2 | Conscientiousness | -0.03 | -0.24 | 0.17 | 0.09 | 0.15 |
| 3 | Extraversion | -0.01 | -0.19 | 0.30 | 0.13 | 0.14 |
| 4 | Agreeableness | 0.16 | -0.16 | 0.15 | -0.09 | -0.05 |
| 5 | Non-Neuroticism | 0.03 | -0.16 | 0.01 | 0.00 | 0.06 |
[15]:
_b5._priority_calculation(
correlation_coefficients = df_clothing_styles,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'clothing_styles_priorities_mupta_ru', out = True)
# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[15]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.766 | 0.697 | 0.656 | 0.760 | 0.494 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 2 | speaker_06_center_83.mov | 0.687 | 0.659 | 0.612 | 0.750 | 0.421 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 3 | speaker_07_center_83.mov | 0.672 | 0.661 | 0.572 | 0.705 | 0.381 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 4 | speaker_10_center_83.mov | 0.698 | 0.599 | 0.572 | 0.675 | 0.351 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 5 | speaker_11_center_83.mov | 0.718 | 0.599 | 0.574 | 0.732 | 0.380 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 6 | speaker_15_center_83.mov | 0.671 | 0.671 | 0.602 | 0.709 | 0.400 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 7 | speaker_19_center_83.mov | 0.767 | 0.658 | 0.653 | 0.801 | 0.463 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 8 | speaker_23_center_83.mov | 0.700 | 0.685 | 0.617 | 0.806 | 0.448 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 9 | speaker_24_center_83.mov | 0.711 | 0.663 | 0.611 | 0.711 | 0.414 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 10 | speaker_27_center_83.mov | 0.759 | 0.713 | 0.658 | 0.831 | 0.508 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
MuPTA (en)
[16]:
import os
import pandas as pd
# Import the module
from oceanai.modules.lab.build import Run
# Create an instance of the class
_b5 = Run()
corpus = 'fi'
lang = 'en'
# Core configuration
_b5.path_to_save_ = './models' # Directory for saving files
_b5.chunk_size_ = 2000000 # Size of each chunk when downloading a file from the network
# Build the audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()
# Load the audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['audio'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url, force_reload = False)
# Build the video models
res_load_model_hc = _b5.load_video_model_hc(lang=lang)
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()
# Load the video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['fe']['googledisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url, force_reload = False)
# Load the dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()
# Build the text models
res_setup_translation_model = _b5.setup_translation_model() # only needed for Russian
res_setup_bert_encoder = _b5.setup_bert_encoder(force_reload = False)
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)
# Load the text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['googledisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['text'][corpus]['nn']['googledisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url, force_reload = False)
# Build the model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()
# Load the weights of the multimodal fusion model
url = _b5.weights_for_big5_['avt'][corpus]['b5']['googledisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url, force_reload = False)
PATH_TO_DIR = './video_MuPTA/'
PATH_SAVE_VIDEO = './video_MuPTA/test/'
_b5.path_to_save_ = PATH_SAVE_VIDEO
# Download 10 test audio-video recordings from the MuPTA corpus
# URL: https://hci.nw.ru/en/pages/mupta-corpus
domain = 'https://download.sberdisk.ru/download/file/'
test_name_files = [
    '477995979?token=2cvyk7CS0mHx2MJ&filename=speaker_06_center_83.mov',
    '477995980?token=jGPtBPS69uzFU6Y&filename=speaker_01_center_83.mov',
    '477995967?token=zCaRbNB6ht5wMPq&filename=speaker_11_center_83.mov',
    '477995966?token=B1rbinDYRQKrI3T&filename=speaker_15_center_83.mov',
    '477995978?token=dEpVDtZg1EQiEQ9&filename=speaker_07_center_83.mov',
    '477995961?token=o1hVjw8G45q9L9Z&filename=speaker_19_center_83.mov',
    '477995964?token=5K220Aqf673VHPq&filename=speaker_23_center_83.mov',
    '477995965?token=v1LVD2KT1cU7Lpb&filename=speaker_24_center_83.mov',
    '477995962?token=tmaSGyyWLA6XCy9&filename=speaker_27_center_83.mov',
    '477995963?token=bTpo96qNDPcwGqb&filename=speaker_10_center_83.mov',
]
for curr_files in test_name_files:
    _b5.download_file_from_url(url = domain + curr_files, out = True)
# Get predictions
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mov'] # Extensions of the files to search for
# Full path to the file with ground-truth scores for computing accuracy
url_accuracy = _b5.true_traits_['mupta']['sberdisk']
_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = lang)
[2024-10-10 18:29:55] Extracting features (hand-crafted and deep) from text …
[2024-10-10 18:29:56] Getting predictions and computing accuracy (multimodal fusion) …
10 of 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_MuPTA\test\speaker_27_center_83.mov …
| Person ID | Path | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.59561 | 0.542967 | 0.440668 | 0.589769 | 0.515306 |
| 2 | speaker_06_center_83.mov | 0.661347 | 0.673973 | 0.603208 | 0.64543 | 0.6431 |
| 3 | speaker_07_center_83.mov | 0.439868 | 0.465049 | 0.284547 | 0.422551 | 0.396058 |
| 4 | speaker_10_center_83.mov | 0.47715 | 0.502563 | 0.373686 | 0.441372 | 0.424637 |
| 5 | speaker_11_center_83.mov | 0.403292 | 0.344359 | 0.317304 | 0.422228 | 0.384346 |
| 6 | speaker_15_center_83.mov | 0.581837 | 0.562177 | 0.504623 | 0.602169 | 0.522254 |
| 7 | speaker_19_center_83.mov | 0.510444 | 0.448468 | 0.425599 | 0.451861 | 0.447891 |
| 8 | speaker_23_center_83.mov | 0.500526 | 0.541376 | 0.308529 | 0.441178 | 0.452412 |
| 9 | speaker_24_center_83.mov | 0.427677 | 0.511355 | 0.301078 | 0.434281 | 0.442301 |
| 10 | speaker_27_center_83.mov | 0.566414 | 0.659169 | 0.434059 | 0.59122 | 0.579172 |
[2024-10-10 18:29:56] Accuracy for individual personality traits …
| Metrics | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism | Mean |
|---|---|---|---|---|---|---|
| MAE | 0.1632 | 0.1621 | 0.176 | 0.2589 | 0.1122 | 0.1745 |
| Accuracy | 0.8368 | 0.8379 | 0.824 | 0.7411 | 0.8878 | 0.8255 |
[2024-10-10 18:29:56] Mean of the mean absolute errors: 0.1745, mean accuracy: 0.8255 …
Log files saved successfully …
--- Runtime: 320.737 sec. ---
[16]:
True
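In the metrics table above, each accuracy value equals one minus the corresponding MAE. The snippet below only checks that relation on the printed numbers; treating accuracy as `1 - MAE` is inferred from the table and is not taken from the library's code.

```python
# MAE values copied from the metrics table above
mae = {
    'Openness': 0.1632,
    'Conscientiousness': 0.1621,
    'Extraversion': 0.176,
    'Agreeableness': 0.2589,
    'Non-Neuroticism': 0.1122,
}

# Accuracy per trait, assuming accuracy = 1 - MAE
accuracy = {trait: round(1 - value, 4) for trait, value in mae.items()}

# Mean over the five traits, rounded like the values in the log
mean_mae = round(sum(mae.values()) / len(mae), 4)
mean_accuracy = round(1 - mean_mae, 4)

print(accuracy)
print(mean_mae, mean_accuracy)
```

The results agree with the table: a mean MAE of 0.1745 and a mean accuracy of 0.8255.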
To predict consumer preferences for industrial goods, it is necessary to know the correlation coefficients that determine the relationship between personality traits and preferences in goods or services.
As an example, we use the correlation coefficients between personality traits and car characteristics reported in:
O’Connor P. J. et al. What Drives Consumer Automobile Choice? Investigating Personality Trait Predictors of Vehicle Preference Factors // Personality and Individual Differences. – 2022. – Vol. 184. – 111220.
The user can set their own correlation coefficients.
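A custom table would need the same layout as the ones loaded in this notebook: one column with the five trait names (the column passed as `col_name_ocean`) and one column per product characteristic. The sketch below assumes that this layout is all that is required; the two characteristics and their values are placeholders made up for illustration and do not come from any study.

```python
import pandas as pd

# Trait names as they appear in the tables of this notebook
TRAITS = ['Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Non-Neuroticism']

# Hypothetical characteristics with placeholder coefficients
df_custom_coefficients = pd.DataFrame({
    'Trait': TRAITS,
    'Characteristic A': [0.10, 0.25, -0.05, 0.20, 0.00],
    'Characteristic B': [-0.15, 0.05, 0.30, -0.10, 0.05],
})

# Same index preparation as in the cells of this notebook
df_custom_coefficients.index.name = 'ID'
df_custom_coefficients.index += 1
df_custom_coefficients.index = df_custom_coefficients.index.map(str)

# Basic sanity checks: all five traits present, coefficients within [-1, 1]
assert list(df_custom_coefficients['Trait']) == TRAITS
assert df_custom_coefficients.drop(columns='Trait').abs().le(1).all().all()
```

Such a frame would then be passed as `correlation_coefficients = df_custom_coefficients` together with `col_name_ocean = 'Trait'`.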
Predicting consumer preferences for industrial goods using car characteristics as an example
[17]:
# Loading dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675818?token=EjfLMqOeK8cfnOu&filename=auto_characteristics.csv'
df_correlation_coefficients = pd.read_csv(url)
df_correlation_coefficients = pd.DataFrame(
df_correlation_coefficients.drop(['Style and performance', 'Safety and practicality'], axis = 1)
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)
df_correlation_coefficients
[17]:
| ID | Trait | Performance | Classic car features | Luxury additions | Fashion and attention | Recreation | Technology | Family friendly | Safe and reliable | Practical and easy to use | Economical/low cost | Basic features |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.020000 | -0.033333 | -0.030000 | -0.050000 | 0.033333 | 0.013333 | -0.030000 | 0.136667 | 0.106667 | 0.093333 | 0.006667 |
| 2 | Conscientiousness | 0.013333 | -0.193333 | -0.063333 | -0.096667 | -0.096667 | 0.086667 | -0.063333 | 0.280000 | 0.180000 | 0.130000 | 0.143333 |
| 3 | Extraversion | 0.133333 | 0.060000 | 0.106667 | 0.123333 | 0.126667 | 0.120000 | 0.090000 | 0.136667 | 0.043333 | 0.073333 | 0.050000 |
| 4 | Agreeableness | -0.036667 | -0.193333 | -0.133333 | -0.133333 | -0.090000 | 0.046667 | -0.016667 | 0.240000 | 0.160000 | 0.120000 | 0.083333 |
| 5 | Non-Neuroticism | 0.016667 | -0.006667 | -0.010000 | -0.006667 | -0.033333 | 0.046667 | -0.023333 | 0.093333 | 0.046667 | 0.046667 | -0.040000 |
[18]:
_b5._priority_calculation(
correlation_coefficients = df_correlation_coefficients,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = False
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'auto_characteristics_priorities_mupta_en', out = True)
# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[18]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.596 | 0.543 | 0.441 | 0.590 | 0.515 | Practical and easy to use | Economical/low cost | Recreation | Openness | Agreeableness | Non-Neuroticism |
| 2 | speaker_06_center_83.mov | 0.661 | 0.674 | 0.603 | 0.645 | 0.643 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 3 | speaker_07_center_83.mov | 0.440 | 0.465 | 0.285 | 0.423 | 0.396 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 4 | speaker_10_center_83.mov | 0.477 | 0.503 | 0.374 | 0.441 | 0.425 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 5 | speaker_11_center_83.mov | 0.403 | 0.344 | 0.317 | 0.422 | 0.384 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 6 | speaker_15_center_83.mov | 0.582 | 0.562 | 0.505 | 0.602 | 0.522 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 7 | speaker_19_center_83.mov | 0.510 | 0.448 | 0.426 | 0.452 | 0.448 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 8 | speaker_23_center_83.mov | 0.501 | 0.541 | 0.309 | 0.441 | 0.452 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 9 | speaker_24_center_83.mov | 0.428 | 0.511 | 0.301 | 0.434 | 0.442 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 10 | speaker_27_center_83.mov | 0.566 | 0.659 | 0.434 | 0.591 | 0.579 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
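The ranking in the table above can be hard to follow from the numbers alone. The sketch below shows one scoring rule that reproduces the first row; it is an assumption inferred from the printed values, not a description of the internals of `_priority_calculation`. Each trait score multiplies that trait's coefficients, with the sign flipped when the score is below `threshold`. Column sums rank the characteristics, and row sums over the top characteristics rank the traits. The function name `rank_priorities` is hypothetical.

```python
import pandas as pd

TRAITS = ['Openness', 'Conscientiousness', 'Extraversion', 'Agreeableness', 'Non-Neuroticism']

# Coefficients copied from the car characteristics table in this notebook
coefficients = pd.DataFrame({
    'Performance': [0.020000, 0.013333, 0.133333, -0.036667, 0.016667],
    'Classic car features': [-0.033333, -0.193333, 0.060000, -0.193333, -0.006667],
    'Luxury additions': [-0.030000, -0.063333, 0.106667, -0.133333, -0.010000],
    'Fashion and attention': [-0.050000, -0.096667, 0.123333, -0.133333, -0.006667],
    'Recreation': [0.033333, -0.096667, 0.126667, -0.090000, -0.033333],
    'Technology': [0.013333, 0.086667, 0.120000, 0.046667, 0.046667],
    'Family friendly': [-0.030000, -0.063333, 0.090000, -0.016667, -0.023333],
    'Safe and reliable': [0.136667, 0.280000, 0.136667, 0.240000, 0.093333],
    'Practical and easy to use': [0.106667, 0.180000, 0.043333, 0.160000, 0.046667],
    'Economical/low cost': [0.093333, 0.130000, 0.073333, 0.120000, 0.046667],
    'Basic features': [0.006667, 0.143333, 0.050000, 0.083333, -0.040000],
}, index=TRAITS)

# Unrounded multimodal predictions for speaker_01_center_83.mov
scores = pd.Series([0.59561, 0.542967, 0.440668, 0.589769, 0.515306], index=TRAITS)


def rank_priorities(scores, coefficients, threshold=0.55,
                    number_priority=3, number_importance_traits=3):
    """Hypothetical helper: rank characteristics and traits for one person."""
    # Keep scores at or above the threshold, negate the others
    signed = scores.where(scores >= threshold, -scores)
    # One row per trait, one column per characteristic
    weights = coefficients.mul(signed, axis=0)
    # Characteristics with the largest summed weight
    priorities = weights.sum(axis=0).nlargest(number_priority).index.tolist()
    # Traits contributing most to the selected characteristics
    importance = (
        weights[priorities].sum(axis=1).nlargest(number_importance_traits).index.tolist()
    )
    return priorities, importance


priorities, importance = rank_priorities(scores, coefficients)
print(priorities)
print(importance)
```

For speaker_01 this gives Practical and easy to use, Economical/low cost, Recreation as priorities and Openness, Agreeableness, Non-Neuroticism as the most important traits, the same as row 1 of the table.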
Predicting consumer preferences for industrial goods using mobile device application categories as an example
As an example, we use the correlation coefficients between personality traits and mobile device application categories reported in:
Peltonen E., Sharmila P., Asare K. O., Visuri A., Lagerspetz E., Ferreira D. When phones get personal: Predicting Big Five personality traits from application usage // Pervasive and Mobile Computing. – 2020. – Vol. 69. – 101269.
[19]:
# Loading a dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478676690?token=7KcAxPqMpWiYQnx&filename=divice_characteristics.csv'
df_divice_characteristics = pd.read_csv(url)
df_divice_characteristics.index.name = 'ID'
df_divice_characteristics.index += 1
df_divice_characteristics.index = df_divice_characteristics.index.map(str)
df_divice_characteristics
[19]:
| ID | Trait | Communication | Game Action | Game Board | Game Casino | Game Educational | Game Simulation | Game Trivia | Entertainment | Finance | Health and Fitness | Media and Video | Music and Audio | News and Magazines | Personalisation | Travel and Local | Weather |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.118 | 0.056 | 0.079 | 0.342 | 0.027 | 0.104 | 0.026 | 0.000 | 0.006 | 0.002 | 0.000 | 0.000 | 0.001 | 0.004 | 0.002 | 0.004 |
| 2 | Conscientiousness | 0.119 | 0.043 | 0.107 | 0.448 | 0.039 | 0.012 | 0.119 | 0.000 | 0.005 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.001 | 0.003 |
| 3 | Extraversion | 0.246 | 0.182 | 0.211 | 0.311 | 0.102 | 0.165 | 0.223 | 0.001 | 0.003 | 0.000 | 0.001 | 0.001 | 0.001 | 0.004 | 0.009 | 0.003 |
| 4 | Agreeableness | 0.218 | 0.104 | 0.164 | 0.284 | 0.165 | 0.122 | 0.162 | 0.000 | 0.003 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.004 | 0.003 |
| 5 | Non-Neuroticism | 0.046 | 0.047 | 0.125 | 0.515 | 0.272 | 0.179 | 0.214 | 0.002 | 0.030 | 0.001 | 0.000 | 0.005 | 0.003 | 0.008 | 0.004 | 0.007 |
[20]:
_b5._priority_calculation(
correlation_coefficients = df_divice_characteristics,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'divice_characteristics_priorities_mupta_en', out = True)
# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[20]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.596 | 0.543 | 0.441 | 0.590 | 0.515 | Communication | Health and Fitness | Media and Video | Agreeableness | Openness | Non-Neuroticism |
| 2 | speaker_06_center_83.mov | 0.661 | 0.674 | 0.603 | 0.645 | 0.643 | Game Casino | Communication | Game Trivia | Non-Neuroticism | Extraversion | Conscientiousness |
| 3 | speaker_07_center_83.mov | 0.440 | 0.465 | 0.285 | 0.423 | 0.396 | Media and Video | Entertainment | Health and Fitness | Agreeableness | Conscientiousness | Extraversion |
| 4 | speaker_10_center_83.mov | 0.477 | 0.503 | 0.374 | 0.441 | 0.425 | Media and Video | Entertainment | Health and Fitness | Agreeableness | Conscientiousness | Extraversion |
| 5 | speaker_11_center_83.mov | 0.403 | 0.344 | 0.317 | 0.422 | 0.384 | Media and Video | Entertainment | Health and Fitness | Conscientiousness | Agreeableness | Extraversion |
| 6 | speaker_15_center_83.mov | 0.582 | 0.562 | 0.505 | 0.602 | 0.522 | Game Casino | Communication | Game Board | Agreeableness | Conscientiousness | Openness |
| 7 | speaker_19_center_83.mov | 0.510 | 0.448 | 0.426 | 0.452 | 0.448 | Media and Video | Entertainment | Health and Fitness | Conscientiousness | Agreeableness | Extraversion |
| 8 | speaker_23_center_83.mov | 0.501 | 0.541 | 0.309 | 0.441 | 0.452 | Media and Video | Entertainment | Health and Fitness | Agreeableness | Conscientiousness | Extraversion |
| 9 | speaker_24_center_83.mov | 0.428 | 0.511 | 0.301 | 0.434 | 0.442 | Media and Video | Entertainment | Health and Fitness | Agreeableness | Conscientiousness | Extraversion |
| 10 | speaker_27_center_83.mov | 0.566 | 0.659 | 0.434 | 0.591 | 0.579 | Game Casino | Game Educational | Game Trivia | Non-Neuroticism | Conscientiousness | Agreeableness |
Predicting consumer preferences for clothing style
As an example, we use the correlation coefficients between personality traits and clothing styles reported in:
Stolovy T. Styling the self: clothing practices, personality traits, and body image among Israeli women // Frontiers in Psychology. – 2022. – Vol. 12. – 719318.
[21]:
# Loading a dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/493644097?token=KGtSGMxjZtWXmBz&filename=df_%D1%81lothing_style_correlation.csv'
df_clothing_styles = pd.read_csv(url)
df_clothing_styles.index.name = 'ID'
df_clothing_styles.index += 1
df_clothing_styles.index = df_clothing_styles.index.map(str)
df_clothing_styles
[21]:
| ID | Trait | Comfort | Camouflage | Assurance | Fashion | Individuality |
|---|---|---|---|---|---|---|
| 1 | Openness | 0.01 | -0.24 | 0.31 | 0.07 | 0.31 |
| 2 | Conscientiousness | -0.03 | -0.24 | 0.17 | 0.09 | 0.15 |
| 3 | Extraversion | -0.01 | -0.19 | 0.30 | 0.13 | 0.14 |
| 4 | Agreeableness | 0.16 | -0.16 | 0.15 | -0.09 | -0.05 |
| 5 | Non-Neuroticism | 0.03 | -0.16 | 0.01 | 0.00 | 0.06 |
[22]:
_b5._priority_calculation(
correlation_coefficients = df_clothing_styles,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'clothing_styles_priorities_mupta_en', out = True)
# Optional
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[22]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.596 | 0.543 | 0.441 | 0.590 | 0.515 | Comfort | Camouflage | Assurance | Agreeableness | Non-Neuroticism | Conscientiousness |
| 2 | speaker_06_center_83.mov | 0.661 | 0.674 | 0.603 | 0.645 | 0.643 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 3 | speaker_07_center_83.mov | 0.440 | 0.465 | 0.285 | 0.423 | 0.396 | Camouflage | Comfort | Fashion | Conscientiousness | Openness | Non-Neuroticism |
| 4 | speaker_10_center_83.mov | 0.477 | 0.503 | 0.374 | 0.441 | 0.425 | Camouflage | Comfort | Fashion | Conscientiousness | Openness | Non-Neuroticism |
| 5 | speaker_11_center_83.mov | 0.403 | 0.344 | 0.317 | 0.422 | 0.384 | Camouflage | Fashion | Comfort | Openness | Conscientiousness | Non-Neuroticism |
| 6 | speaker_15_center_83.mov | 0.582 | 0.562 | 0.505 | 0.602 | 0.522 | Assurance | Individuality | Comfort | Openness | Conscientiousness | Agreeableness |
| 7 | speaker_19_center_83.mov | 0.510 | 0.448 | 0.426 | 0.452 | 0.448 | Camouflage | Comfort | Fashion | Openness | Conscientiousness | Non-Neuroticism |
| 8 | speaker_23_center_83.mov | 0.501 | 0.541 | 0.309 | 0.441 | 0.452 | Camouflage | Comfort | Fashion | Conscientiousness | Openness | Non-Neuroticism |
| 9 | speaker_24_center_83.mov | 0.428 | 0.511 | 0.301 | 0.434 | 0.442 | Camouflage | Comfort | Fashion | Conscientiousness | Openness | Non-Neuroticism |
| 10 | speaker_27_center_83.mov | 0.566 | 0.659 | 0.434 | 0.591 | 0.579 | Assurance | Individuality | Comfort | Openness | Conscientiousness | Agreeableness |