Solution to Practical Task 2
Task: predicting consumer preferences for industrial goods
The practical task is solved in two stages. In the first stage, the OCEAN-AI library is used to obtain prediction hypotheses (estimates of a person's personality traits). In the second stage, the _priority_calculation method of the OCEAN-AI library is applied to those estimates to solve the task. Example results and an implementation are given below.
Thus, the OCEAN-AI library provides a tool for analyzing consumers' personality traits, which helps predict what will interest them. This allows companies to tailor their products and services more precisely to consumer preferences, making them more distinctive and personalized.
FI V2
[2]:
# Import the required tools
import os
import pandas as pd
# Import the module
from oceanai.modules.lab.build import Run
# Create an instance of the class
_b5 = Run()
# Configure the core
_b5.path_to_save_ = './models' # Directory for saving files
_b5.chunk_size_ = 2000000 # Chunk size for downloading a file from the network in one step
corpus = 'fi'
# Build the audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()
# Load the audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['audio'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url, force_reload = False)
# Build the video models
res_load_model_hc = _b5.load_video_model_hc(lang='en')
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()
# Load the video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['fe']['googledisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url, force_reload = False)
# Load the dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()
# Build the text models
res_setup_translation_model = _b5.setup_translation_model() # needed only for Russian
res_setup_bert_encoder = _b5.setup_bert_encoder(force_reload = False)
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)
# Load the text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['googledisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['text'][corpus]['nn']['googledisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url, force_reload = False)
# Build the model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()
# Load the weights of the multimodal fusion model
url = _b5.weights_for_big5_['avt'][corpus]['b5']['googledisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url, force_reload = False)
PATH_TO_DIR = './video_FI/'
PATH_SAVE_VIDEO = './video_FI/test/'
_b5.path_to_save_ = PATH_SAVE_VIDEO
# Download 10 test audio-video recordings from the First Impressions V2 corpus
# URL: https://chalearnlap.cvc.uab.cat/dataset/24/description/
domain = 'https://download.sberdisk.ru/download/file/'
tets_name_files = [
'429713680?token=FqHdMLSSh7zYSZt&filename=_plk5k7PBEg.003.mp4',
'429713681?token=Hz9b4lQkrLfic33&filename=be0DQawtVkE.002.mp4',
'429713683?token=EgUXS9Xs8xHm5gz&filename=2d6btbaNdfo.000.mp4',
'429713684?token=1U26753kmPYdIgt&filename=300gK3CnzW0.003.mp4',
'429713685?token=LyigAWLTzDNwKJO&filename=300gK3CnzW0.001.mp4',
'429713686?token=EpfRbCKHyuc4HPu&filename=cLaZxEf1nE4.004.mp4',
'429713687?token=FNTkwqBr4jOS95l&filename=g24JGYuT74A.004.mp4',
'429713688?token=qDT95nz7hfm2Nki&filename=JZNMxa3OKHY.000.mp4',
'429713689?token=noLguEGXDpbcKhg&filename=nvlqJbHk_Lc.003.mp4',
'429713679?token=9L7RQ0hgdJlcek6&filename=4vdJGgZpj4k.003.mp4'
]
for curr_files in tets_name_files:
    _b5.download_file_from_url(url = domain + curr_files, out = True)
# Get predictions
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mp4'] # Extensions of the files to search for
# Full path to the file with ground-truth scores for computing accuracy
url_accuracy = _b5.true_traits_[corpus]['googledisk']
_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = 'en')
[2024-10-10 18:10:50] Extracting features (hand-crafted and neural) from text …
[2024-10-10 18:10:50] Getting predictions and computing accuracy (multimodal fusion) …
10 of 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_FI\test\_plk5k7PBEg.003.mp4 …
| Person ID | Path | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
|---|---|---|---|---|---|---|
| 1 | 2d6btbaNdfo.000.mp4 | 0.618917 | 0.660694 | 0.477656 | 0.654437 | 0.601256 |
| 2 | 300gK3CnzW0.001.mp4 | 0.461732 | 0.413451 | 0.415706 | 0.498301 | 0.431224 |
| 3 | 300gK3CnzW0.003.mp4 | 0.468002 | 0.448618 | 0.371742 | 0.509602 | 0.453739 |
| 4 | 4vdJGgZpj4k.003.mp4 | 0.585348 | 0.616446 | 0.49443 | 0.605614 | 0.587017 |
| 5 | be0DQawtVkE.002.mp4 | 0.680991 | 0.56602 | 0.553915 | 0.646545 | 0.64246 |
| 6 | cLaZxEf1nE4.004.mp4 | 0.66342 | 0.551018 | 0.557912 | 0.585238 | 0.587174 |
| 7 | g24JGYuT74A.004.mp4 | 0.590237 | 0.399273 | 0.409554 | 0.531861 | 0.507134 |
| 8 | JZNMxa3OKHY.000.mp4 | 0.60577 | 0.523617 | 0.531137 | 0.594406 | 0.57984 |
| 9 | nvlqJbHk_Lc.003.mp4 | 0.511002 | 0.464702 | 0.390882 | 0.443663 | 0.438811 |
| 10 | _plk5k7PBEg.003.mp4 | 0.647606 | 0.610466 | 0.524718 | 0.61428 | 0.606428 |
[2024-10-10 18:10:50] Accuracy for individual personality traits …
| Metrics | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism | Mean |
|---|---|---|---|---|---|---|
| MAE | 0.0735 | 0.0631 | 0.0914 | 0.0706 | 0.0691 | 0.0735 |
| Accuracy | 0.9265 | 0.9369 | 0.9086 | 0.9294 | 0.9309 | 0.9265 |
[2024-10-10 18:10:50] Mean of the mean absolute errors: 0.0735, mean accuracy: 0.9265 …
Log files saved successfully …
--- Runtime: 35.449 sec. ---
[2]:
True
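In the metrics table above, each Accuracy value equals one minus the corresponding MAE (for example, 1 - 0.0735 = 0.9265), which is what one would expect for scores on a 0 to 1 scale. The following is a minimal sketch of that relationship, assuming this is how the metric is defined. The ground-truth values and the helper name mae_and_accuracy are invented for illustration and are not part of OCEAN-AI.

```python
import numpy as np

def mae_and_accuracy(y_true, y_pred):
    """Return per-trait MAE and accuracy (assumed to be 1 - MAE) for scores in [0, 1]."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mae = np.abs(y_true - y_pred).mean(axis=0)  # average over persons, per trait
    return mae, 1.0 - mae

# Hypothetical ground truth (2 persons x 5 traits), for illustration only
y_true = [[0.60, 0.70, 0.50, 0.60, 0.55],
          [0.50, 0.40, 0.45, 0.50, 0.40]]
# Predictions for persons 1 and 2, rounded from the predictions table above
y_pred = [[0.619, 0.661, 0.478, 0.654, 0.601],
          [0.462, 0.413, 0.416, 0.498, 0.431]]

mae, acc = mae_and_accuracy(y_true, y_pred)
print(mae.round(4), acc.round(4))
print(round(mae.mean(), 4), round(acc.mean(), 4))  # means over the five traits
```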
To predict consumer preferences for industrial goods, one needs correlation coefficients that quantify the relationship between a person's personality traits and their preferences for goods or services.
As an example, we use the correlation coefficients between personality traits and car characteristics reported in:
O’Connor P. J. et al. What Drives Consumer Automobile Choice? Investigating Personality Trait Predictors of Vehicle Preference Factors // Personality and Individual Differences. – 2022. – Vol. 184. – Art. 111220.
Users can supply their own correlation coefficients.
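A custom coefficient table can presumably be any dataframe with the same layout as the one loaded in the next cell: one row per trait, a column holding the trait names (the one passed as col_name_ocean), and one numeric column per product characteristic. The sketch below builds such a table. The characteristic names and values are made up for illustration and do not come from any study.

```python
import pandas as pd

traits = ['Openness', 'Conscientiousness', 'Extraversion',
          'Agreeableness', 'Non-Neuroticism']

# Hypothetical characteristics and coefficients, for illustration only
df_custom = pd.DataFrame({
    'Trait': traits,
    'Battery life': [0.05, 0.20, -0.02, 0.10, 0.08],
    'Design': [0.25, -0.05, 0.15, 0.00, 0.03],
})

# Same index convention as the tutorial dataframes: string IDs starting at 1
df_custom.index.name = 'ID'
df_custom.index += 1
df_custom.index = df_custom.index.map(str)

# Basic sanity checks before handing the table to the library
numeric = df_custom.drop(columns = 'Trait')
assert list(df_custom['Trait']) == traits
assert numeric.abs().le(1).all().all()  # correlation coefficients lie in [-1, 1]
print(df_custom)
```

Such a dataframe could then be passed as correlation_coefficients with col_name_ocean = 'Trait', in the same way as the downloaded tables in the cells that follow.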
Predicting consumer preferences for car characteristics
[3]:
# Load the dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675818?token=EjfLMqOeK8cfnOu&filename=auto_characteristics.csv'
df_correlation_coefficients = pd.read_csv(url)
df_correlation_coefficients = pd.DataFrame(
    df_correlation_coefficients.drop(['Style and performance', 'Safety and practicality'], axis = 1)
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)
df_correlation_coefficients
[3]:
| ID | Trait | Performance | Classic car features | Luxury additions | Fashion and attention | Recreation | Technology | Family friendly | Safe and reliable | Practical and easy to use | Economical/low cost | Basic features |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.020000 | -0.033333 | -0.030000 | -0.050000 | 0.033333 | 0.013333 | -0.030000 | 0.136667 | 0.106667 | 0.093333 | 0.006667 |
| 2 | Conscientiousness | 0.013333 | -0.193333 | -0.063333 | -0.096667 | -0.096667 | 0.086667 | -0.063333 | 0.280000 | 0.180000 | 0.130000 | 0.143333 |
| 3 | Extraversion | 0.133333 | 0.060000 | 0.106667 | 0.123333 | 0.126667 | 0.120000 | 0.090000 | 0.136667 | 0.043333 | 0.073333 | 0.050000 |
| 4 | Agreeableness | -0.036667 | -0.193333 | -0.133333 | -0.133333 | -0.090000 | 0.046667 | -0.016667 | 0.240000 | 0.160000 | 0.120000 | 0.083333 |
| 5 | Non-Neuroticism | 0.016667 | -0.006667 | -0.010000 | -0.006667 | -0.033333 | 0.046667 | -0.023333 | 0.093333 | 0.046667 | 0.046667 | -0.040000 |
[4]:
_b5._priority_calculation(
    correlation_coefficients = df_correlation_coefficients,
    col_name_ocean = 'Trait',
    threshold = 0.55,
    number_priority = 3,
    number_importance_traits = 3,
    out = False
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'auto_characteristics_priorities_fi_en', out = True)
# Optional: shorten the trait column names and round the scores for display
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[4]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2d6btbaNdfo.000.mp4 | 0.619 | 0.661 | 0.478 | 0.654 | 0.601 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 2 | 300gK3CnzW0.001.mp4 | 0.462 | 0.413 | 0.416 | 0.498 | 0.431 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 3 | 300gK3CnzW0.003.mp4 | 0.468 | 0.449 | 0.372 | 0.510 | 0.454 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 4 | 4vdJGgZpj4k.003.mp4 | 0.585 | 0.616 | 0.494 | 0.606 | 0.587 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 5 | be0DQawtVkE.002.mp4 | 0.681 | 0.566 | 0.554 | 0.647 | 0.642 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
| 6 | cLaZxEf1nE4.004.mp4 | 0.663 | 0.551 | 0.558 | 0.585 | 0.587 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 7 | g24JGYuT74A.004.mp4 | 0.590 | 0.399 | 0.410 | 0.532 | 0.507 | Classic car features | Recreation | Luxury additions | Agreeableness | Conscientiousness | Non-Neuroticism |
| 8 | JZNMxa3OKHY.000.mp4 | 0.606 | 0.524 | 0.531 | 0.594 | 0.580 | Practical and easy to use | Safe and reliable | Economical/low cost | Agreeableness | Openness | Non-Neuroticism |
| 9 | nvlqJbHk_Lc.003.mp4 | 0.511 | 0.465 | 0.391 | 0.444 | 0.439 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 10 | _plk5k7PBEg.003.mp4 | 0.648 | 0.610 | 0.525 | 0.614 | 0.606 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
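The text does not describe how _priority_calculation ranks the characteristics. The sketch below shows one scoring rule that reproduces the rows for persons 1 and 5 in the table above. It is an assumption made for illustration, not the library's implementation: trait scores below the threshold are counted with a negative sign, each characteristic is scored by the sum of signed scores times coefficients, and trait importance is the contribution of each trait summed over the selected characteristics. The function name rank_preferences is hypothetical.

```python
import numpy as np

traits = ['Openness', 'Conscientiousness', 'Extraversion',
          'Agreeableness', 'Non-Neuroticism']
characteristics = ['Performance', 'Classic car features', 'Luxury additions',
                   'Fashion and attention', 'Recreation', 'Technology',
                   'Family friendly', 'Safe and reliable',
                   'Practical and easy to use', 'Economical/low cost',
                   'Basic features']
# Rows follow `traits`, columns follow `characteristics`
# (values copied from the car coefficient table shown earlier)
coefficients = np.array([
    [0.020000, -0.033333, -0.030000, -0.050000, 0.033333, 0.013333,
     -0.030000, 0.136667, 0.106667, 0.093333, 0.006667],
    [0.013333, -0.193333, -0.063333, -0.096667, -0.096667, 0.086667,
     -0.063333, 0.280000, 0.180000, 0.130000, 0.143333],
    [0.133333, 0.060000, 0.106667, 0.123333, 0.126667, 0.120000,
     0.090000, 0.136667, 0.043333, 0.073333, 0.050000],
    [-0.036667, -0.193333, -0.133333, -0.133333, -0.090000, 0.046667,
     -0.016667, 0.240000, 0.160000, 0.120000, 0.083333],
    [0.016667, -0.006667, -0.010000, -0.006667, -0.033333, 0.046667,
     -0.023333, 0.093333, 0.046667, 0.046667, -0.040000],
])

def rank_preferences(scores, threshold = 0.55, n_priority = 3, n_traits = 3):
    """Hypothetical scoring rule; not taken from the OCEAN-AI source code."""
    scores = np.asarray(scores, dtype=float)
    # Assumption: a trait below the threshold counts with a negative sign
    signed = np.where(scores >= threshold, scores, -scores)
    contributions = signed[:, None] * coefficients  # traits x characteristics
    totals = contributions.sum(axis=0)              # one score per characteristic
    top = np.argsort(-totals)[:n_priority]
    # Assumption: importance = contribution summed over the selected characteristics
    importance = contributions[:, top].sum(axis=1)
    top_traits = np.argsort(-importance)[:n_traits]
    return [characteristics[i] for i in top], [traits[i] for i in top_traits]

# Unrounded predictions for persons 1 and 5 from the predictions table
person_1 = [0.618917, 0.660694, 0.477656, 0.654437, 0.601256]
person_5 = [0.680991, 0.566020, 0.553915, 0.646545, 0.642460]
print(rank_preferences(person_1))
print(rank_preferences(person_5))
```

Under this rule, a person with mostly low trait scores is steered toward characteristics whose coefficients are negative, which would be consistent with persons 2, 3 and 9 receiving "Classic car features" as their first priority.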
Predicting consumer preferences for mobile device characteristics
As an example, we use the correlation coefficients between personality traits and mobile device characteristics reported in:
Peltonen E., Sharmila P., Asare K. O., Visuri A., Lagerspetz E., Ferreira D. When phones get personal: Predicting Big Five personality traits from application usage // Pervasive and Mobile Computing. – 2020. – Vol. 69. – Art. 101269.
[5]:
# Load the dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478676690?token=7KcAxPqMpWiYQnx&filename=divice_characteristics.csv'
df_divice_characteristics = pd.read_csv(url)
df_divice_characteristics.index.name = 'ID'
df_divice_characteristics.index += 1
df_divice_characteristics.index = df_divice_characteristics.index.map(str)
df_divice_characteristics
[5]:
| ID | Trait | Communication | Game Action | Game Board | Game Casino | Game Educational | Game Simulation | Game Trivia | Entertainment | Finance | Health and Fitness | Media and Video | Music and Audio | News and Magazines | Personalisation | Travel and Local | Weather |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.118 | 0.056 | 0.079 | 0.342 | 0.027 | 0.104 | 0.026 | 0.000 | 0.006 | 0.002 | 0.000 | 0.000 | 0.001 | 0.004 | 0.002 | 0.004 |
| 2 | Conscientiousness | 0.119 | 0.043 | 0.107 | 0.448 | 0.039 | 0.012 | 0.119 | 0.000 | 0.005 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.001 | 0.003 |
| 3 | Extraversion | 0.246 | 0.182 | 0.211 | 0.311 | 0.102 | 0.165 | 0.223 | 0.001 | 0.003 | 0.000 | 0.001 | 0.001 | 0.001 | 0.004 | 0.009 | 0.003 |
| 4 | Agreeableness | 0.218 | 0.104 | 0.164 | 0.284 | 0.165 | 0.122 | 0.162 | 0.000 | 0.003 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.004 | 0.003 |
| 5 | Non-Neuroticism | 0.046 | 0.047 | 0.125 | 0.515 | 0.272 | 0.179 | 0.214 | 0.002 | 0.030 | 0.001 | 0.000 | 0.005 | 0.003 | 0.008 | 0.004 | 0.007 |
[6]:
_b5._priority_calculation(
    correlation_coefficients = df_divice_characteristics,
    col_name_ocean = 'Trait',
    threshold = 0.55,
    number_priority = 3,
    number_importance_traits = 3,
    out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'divice_characteristics_priorities_fi_en', out = True)
# Optional: shorten the trait column names and round the scores for display
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[6]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2d6btbaNdfo.000.mp4 | 0.619 | 0.661 | 0.478 | 0.654 | 0.601 | Game Casino | Game Educational | Game Trivia | Non-Neuroticism | Conscientiousness | Agreeableness |
| 2 | 300gK3CnzW0.001.mp4 | 0.462 | 0.413 | 0.416 | 0.498 | 0.431 | Media and Video | Entertainment | Health and Fitness | Conscientiousness | Agreeableness | Extraversion |
| 3 | 300gK3CnzW0.003.mp4 | 0.468 | 0.449 | 0.372 | 0.510 | 0.454 | Media and Video | Entertainment | Health and Fitness | Conscientiousness | Agreeableness | Extraversion |
| 4 | 4vdJGgZpj4k.003.mp4 | 0.585 | 0.616 | 0.494 | 0.606 | 0.587 | Game Casino | Game Educational | Game Trivia | Non-Neuroticism | Conscientiousness | Agreeableness |
| 5 | be0DQawtVkE.002.mp4 | 0.681 | 0.566 | 0.554 | 0.647 | 0.642 | Game Casino | Communication | Game Trivia | Non-Neuroticism | Extraversion | Agreeableness |
| 6 | cLaZxEf1nE4.004.mp4 | 0.663 | 0.551 | 0.558 | 0.585 | 0.587 | Game Casino | Communication | Game Trivia | Non-Neuroticism | Extraversion | Agreeableness |
| 7 | g24JGYuT74A.004.mp4 | 0.590 | 0.399 | 0.410 | 0.532 | 0.507 | Health and Fitness | Media and Video | Entertainment | Openness | Conscientiousness | Agreeableness |
| 8 | JZNMxa3OKHY.000.mp4 | 0.606 | 0.524 | 0.531 | 0.594 | 0.580 | Game Casino | Game Educational | Game Simulation | Non-Neuroticism | Agreeableness | Openness |
| 9 | nvlqJbHk_Lc.003.mp4 | 0.511 | 0.465 | 0.391 | 0.444 | 0.439 | Media and Video | Entertainment | Health and Fitness | Agreeableness | Conscientiousness | Extraversion |
| 10 | _plk5k7PBEg.003.mp4 | 0.648 | 0.610 | 0.525 | 0.614 | 0.606 | Game Casino | Game Educational | Game Trivia | Non-Neuroticism | Agreeableness | Conscientiousness |
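Once priorities are available per person, it can be useful to see how often each category is recommended across the whole group. The sketch below does this with pandas, using the Priority 1 to Priority 3 columns copied from the table above. In a notebook the same counts could presumably be taken directly from _b5.df_files_priority_, assuming it carries these column names as the displayed output suggests.

```python
import pandas as pd

# Priority columns copied from the table above (persons 1 to 10)
priorities = pd.DataFrame(
    [
        ['Game Casino', 'Game Educational', 'Game Trivia'],
        ['Media and Video', 'Entertainment', 'Health and Fitness'],
        ['Media and Video', 'Entertainment', 'Health and Fitness'],
        ['Game Casino', 'Game Educational', 'Game Trivia'],
        ['Game Casino', 'Communication', 'Game Trivia'],
        ['Game Casino', 'Communication', 'Game Trivia'],
        ['Health and Fitness', 'Media and Video', 'Entertainment'],
        ['Game Casino', 'Game Educational', 'Game Simulation'],
        ['Media and Video', 'Entertainment', 'Health and Fitness'],
        ['Game Casino', 'Game Educational', 'Game Trivia'],
    ],
    columns = ['Priority 1', 'Priority 2', 'Priority 3'],
    index = pd.RangeIndex(1, 11, name = 'Person ID'),
)

# How often each category is the first choice
first_choice = priorities['Priority 1'].value_counts()
# How often each category appears at any of the three ranks
any_rank = pd.Series(priorities.to_numpy().ravel()).value_counts()

print(first_choice)
print(any_rank)
```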
Predicting consumer preferences for clothing style
As an example, we use the correlation coefficients between personality traits and clothing style reported in:
Stolovy T. Styling the self: clothing practices, personality traits, and body image among Israeli women // Frontiers in Psychology. – 2022. – Vol. 12. – Art. 719318.
[7]:
# Load the dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/493644097?token=KGtSGMxjZtWXmBz&filename=df_%D1%81lothing_style_correlation.csv'
df_clothing_styles = pd.read_csv(url)
df_clothing_styles.index.name = 'ID'
df_clothing_styles.index += 1
df_clothing_styles.index = df_clothing_styles.index.map(str)
df_clothing_styles
[7]:
| ID | Trait | Comfort | Camouflage | Assurance | Fashion | Individuality |
|---|---|---|---|---|---|---|
| 1 | Openness | 0.01 | -0.24 | 0.31 | 0.07 | 0.31 |
| 2 | Conscientiousness | -0.03 | -0.24 | 0.17 | 0.09 | 0.15 |
| 3 | Extraversion | -0.01 | -0.19 | 0.30 | 0.13 | 0.14 |
| 4 | Agreeableness | 0.16 | -0.16 | 0.15 | -0.09 | -0.05 |
| 5 | Non-Neuroticism | 0.03 | -0.16 | 0.01 | 0.00 | 0.06 |
[8]:
_b5._priority_calculation(
    correlation_coefficients = df_clothing_styles,
    col_name_ocean = 'Trait',
    threshold = 0.55,
    number_priority = 3,
    number_importance_traits = 3,
    out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'clothing_styles_priorities_fi_en', out = True)
# Optional: shorten the trait column names and round the scores for display
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[8]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2d6btbaNdfo.000.mp4 | 0.619 | 0.661 | 0.478 | 0.654 | 0.601 | Assurance | Individuality | Comfort | Openness | Conscientiousness | Agreeableness |
| 2 | 300gK3CnzW0.001.mp4 | 0.462 | 0.413 | 0.416 | 0.498 | 0.431 | Camouflage | Fashion | Comfort | Conscientiousness | Openness | Non-Neuroticism |
| 3 | 300gK3CnzW0.003.mp4 | 0.468 | 0.449 | 0.372 | 0.510 | 0.454 | Camouflage | Fashion | Comfort | Conscientiousness | Openness | Non-Neuroticism |
| 4 | 4vdJGgZpj4k.003.mp4 | 0.585 | 0.616 | 0.494 | 0.606 | 0.587 | Assurance | Individuality | Comfort | Openness | Conscientiousness | Agreeableness |
| 5 | be0DQawtVkE.002.mp4 | 0.681 | 0.566 | 0.554 | 0.647 | 0.642 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 6 | cLaZxEf1nE4.004.mp4 | 0.663 | 0.551 | 0.558 | 0.585 | 0.587 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 7 | g24JGYuT74A.004.mp4 | 0.590 | 0.399 | 0.410 | 0.532 | 0.507 | Camouflage | Individuality | Fashion | Agreeableness | Openness | Non-Neuroticism |
| 8 | JZNMxa3OKHY.000.mp4 | 0.606 | 0.524 | 0.531 | 0.594 | 0.580 | Comfort | Individuality | Assurance | Openness | Agreeableness | Non-Neuroticism |
| 9 | nvlqJbHk_Lc.003.mp4 | 0.511 | 0.465 | 0.391 | 0.444 | 0.439 | Camouflage | Comfort | Fashion | Conscientiousness | Openness | Non-Neuroticism |
| 10 | _plk5k7PBEg.003.mp4 | 0.648 | 0.610 | 0.525 | 0.614 | 0.606 | Assurance | Individuality | Comfort | Openness | Conscientiousness | Agreeableness |
MuPTA (ru)
[9]:
import os
import pandas as pd
# Import the module
from oceanai.modules.lab.build import Run
# Create an instance of the class
_b5 = Run()
corpus = 'mupta'
lang = 'ru'
# Configure the core
_b5.path_to_save_ = './models' # Directory for saving files
_b5.chunk_size_ = 2000000 # Chunk size for downloading a file from the network in one step
# Build the audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()
# Load the audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['audio'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url, force_reload = False)
# Build the video models
res_load_model_hc = _b5.load_video_model_hc(lang=lang)
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()
# Load the video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['fe']['googledisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url, force_reload = False)
# Load the dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()
# Build the text models
res_setup_translation_model = _b5.setup_translation_model() # needed only for Russian
res_setup_bert_encoder = _b5.setup_bert_encoder(force_reload = False)
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)
# Load the text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['googledisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['text'][corpus]['nn']['googledisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url, force_reload = False)
# Build the model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()
# Load the weights of the multimodal fusion model
url = _b5.weights_for_big5_['avt'][corpus]['b5']['googledisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url, force_reload = False)
PATH_TO_DIR = './video_MuPTA/'
PATH_SAVE_VIDEO = './video_MuPTA/test/'
_b5.path_to_save_ = PATH_SAVE_VIDEO
# Download 10 test audio-video recordings from the MuPTA corpus
# URL: https://hci.nw.ru/en/pages/mupta-corpus
domain = 'https://download.sberdisk.ru/download/file/'
tets_name_files = [
'477995979?token=2cvyk7CS0mHx2MJ&filename=speaker_06_center_83.mov',
'477995980?token=jGPtBPS69uzFU6Y&filename=speaker_01_center_83.mov',
'477995967?token=zCaRbNB6ht5wMPq&filename=speaker_11_center_83.mov',
'477995966?token=B1rbinDYRQKrI3T&filename=speaker_15_center_83.mov',
'477995978?token=dEpVDtZg1EQiEQ9&filename=speaker_07_center_83.mov',
'477995961?token=o1hVjw8G45q9L9Z&filename=speaker_19_center_83.mov',
'477995964?token=5K220Aqf673VHPq&filename=speaker_23_center_83.mov',
'477995965?token=v1LVD2KT1cU7Lpb&filename=speaker_24_center_83.mov',
'477995962?token=tmaSGyyWLA6XCy9&filename=speaker_27_center_83.mov',
'477995963?token=bTpo96qNDPcwGqb&filename=speaker_10_center_83.mov',
]
for curr_files in tets_name_files:
    _b5.download_file_from_url(url = domain + curr_files, out = True)
# Get predictions
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mov'] # Extensions of the files to search for
# Full path to the file with ground-truth scores for computing accuracy
url_accuracy = _b5.true_traits_['mupta']['sberdisk']
_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = lang)
[2024-10-10 18:20:29] Extracting features (hand-crafted and neural) from text …
[2024-10-10 18:20:30] Getting predictions and computing accuracy (multimodal fusion) …
10 of 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_MuPTA\test\speaker_27_center_83.mov …
| Person ID | Path | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.765745 | 0.696637 | 0.656309 | 0.75986 | 0.494141 |
| 2 | speaker_06_center_83.mov | 0.686514 | 0.659488 | 0.611838 | 0.749739 | 0.420672 |
| 3 | speaker_07_center_83.mov | 0.671993 | 0.661216 | 0.571759 | 0.704542 | 0.381026 |
| 4 | speaker_10_center_83.mov | 0.69828 | 0.59893 | 0.571893 | 0.674907 | 0.35082 |
| 5 | speaker_11_center_83.mov | 0.718329 | 0.598986 | 0.573518 | 0.73201 | 0.379845 |
| 6 | speaker_15_center_83.mov | 0.670932 | 0.671055 | 0.602337 | 0.708656 | 0.399527 |
| 7 | speaker_19_center_83.mov | 0.767261 | 0.658167 | 0.653367 | 0.801366 | 0.463443 |
| 8 | speaker_23_center_83.mov | 0.699837 | 0.684907 | 0.616671 | 0.806437 | 0.447853 |
| 9 | speaker_24_center_83.mov | 0.710566 | 0.66299 | 0.610562 | 0.711242 | 0.413696 |
| 10 | speaker_27_center_83.mov | 0.759404 | 0.712562 | 0.658357 | 0.830507 | 0.507612 |
[2024-10-10 18:20:30] Accuracy for individual personality traits …
| Metrics | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism | Mean |
|---|---|---|---|---|---|---|
| MAE | 0.0706 | 0.0788 | 0.1328 | 0.1071 | 0.1002 | 0.0979 |
| Accuracy | 0.9294 | 0.9212 | 0.8672 | 0.8929 | 0.8998 | 0.9021 |
[2024-10-10 18:20:30] Mean of the mean absolute errors: 0.0979, mean accuracy: 0.9021 …
Log files saved successfully …
--- Runtime: 324.067 sec. ---
[9]:
True
To predict consumer preferences for industrial goods, one needs correlation coefficients that quantify the relationship between a person's personality traits and their preferences for goods or services.
As an example, we use the correlation coefficients between personality traits and car characteristics reported in:
O’Connor P. J. et al. What Drives Consumer Automobile Choice? Investigating Personality Trait Predictors of Vehicle Preference Factors // Personality and Individual Differences. – 2022. – Vol. 184. – Art. 111220.
Users can supply their own correlation coefficients.
Predicting consumer preferences for car characteristics
[10]:
# Load the dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675818?token=EjfLMqOeK8cfnOu&filename=auto_characteristics.csv'
df_correlation_coefficients = pd.read_csv(url)
df_correlation_coefficients = pd.DataFrame(
    df_correlation_coefficients.drop(['Style and performance', 'Safety and practicality'], axis = 1)
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)
df_correlation_coefficients
[10]:
| ID | Trait | Performance | Classic car features | Luxury additions | Fashion and attention | Recreation | Technology | Family friendly | Safe and reliable | Practical and easy to use | Economical/low cost | Basic features |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.020000 | -0.033333 | -0.030000 | -0.050000 | 0.033333 | 0.013333 | -0.030000 | 0.136667 | 0.106667 | 0.093333 | 0.006667 |
| 2 | Conscientiousness | 0.013333 | -0.193333 | -0.063333 | -0.096667 | -0.096667 | 0.086667 | -0.063333 | 0.280000 | 0.180000 | 0.130000 | 0.143333 |
| 3 | Extraversion | 0.133333 | 0.060000 | 0.106667 | 0.123333 | 0.126667 | 0.120000 | 0.090000 | 0.136667 | 0.043333 | 0.073333 | 0.050000 |
| 4 | Agreeableness | -0.036667 | -0.193333 | -0.133333 | -0.133333 | -0.090000 | 0.046667 | -0.016667 | 0.240000 | 0.160000 | 0.120000 | 0.083333 |
| 5 | Non-Neuroticism | 0.016667 | -0.006667 | -0.010000 | -0.006667 | -0.033333 | 0.046667 | -0.023333 | 0.093333 | 0.046667 | 0.046667 | -0.040000 |
[11]:
_b5._priority_calculation(
    correlation_coefficients = df_correlation_coefficients,
    col_name_ocean = 'Trait',
    threshold = 0.55,
    number_priority = 3,
    number_importance_traits = 3,
    out = False
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'auto_characteristics_priorities_mupta_ru', out = True)
# Optional: shorten the trait column names and round the scores for display
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[11]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.766 | 0.697 | 0.656 | 0.760 | 0.494 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 2 | speaker_06_center_83.mov | 0.687 | 0.659 | 0.612 | 0.750 | 0.421 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
| 3 | speaker_07_center_83.mov | 0.672 | 0.661 | 0.572 | 0.705 | 0.381 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 4 | speaker_10_center_83.mov | 0.698 | 0.599 | 0.572 | 0.675 | 0.351 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 5 | speaker_11_center_83.mov | 0.718 | 0.599 | 0.574 | 0.732 | 0.380 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
| 6 | speaker_15_center_83.mov | 0.671 | 0.671 | 0.602 | 0.709 | 0.400 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 7 | speaker_19_center_83.mov | 0.767 | 0.658 | 0.653 | 0.801 | 0.463 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
| 8 | speaker_23_center_83.mov | 0.700 | 0.685 | 0.617 | 0.806 | 0.448 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
| 9 | speaker_24_center_83.mov | 0.711 | 0.663 | 0.611 | 0.711 | 0.414 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 10 | speaker_27_center_83.mov | 0.759 | 0.713 | 0.658 | 0.831 | 0.508 | Safe and reliable | Practical and easy to use | Economical/low cost | Agreeableness | Conscientiousness | Openness |
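Every speaker in the table above receives the same three priorities. One factor that may contribute is that all ten speakers fall on the same side of the 0.55 threshold for every trait. The sketch below checks this using the rounded scores from the table. Treating the comparison as "greater than or equal to" is an assumption, since the text does not state how _priority_calculation applies the threshold.

```python
import pandas as pd

threshold = 0.55  # same value as passed to _priority_calculation above

# Rounded trait scores copied from the table above
scores = pd.DataFrame(
    [
        [0.766, 0.697, 0.656, 0.760, 0.494],
        [0.687, 0.659, 0.612, 0.750, 0.421],
        [0.672, 0.661, 0.572, 0.705, 0.381],
        [0.698, 0.599, 0.572, 0.675, 0.351],
        [0.718, 0.599, 0.574, 0.732, 0.380],
        [0.671, 0.671, 0.602, 0.709, 0.400],
        [0.767, 0.658, 0.653, 0.801, 0.463],
        [0.700, 0.685, 0.617, 0.806, 0.448],
        [0.711, 0.663, 0.611, 0.711, 0.414],
        [0.759, 0.713, 0.658, 0.831, 0.508],
    ],
    columns = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU'],
    index = pd.RangeIndex(1, 11, name = 'Person ID'),
)

# True where a trait score reaches the threshold (assumed comparison: >=)
above = scores >= threshold
print(above)
# Number of distinct above/below patterns across the ten speakers
print(len(above.drop_duplicates()))
```

In this sample only Non-Neuroticism stays below the threshold, and it does so for every speaker, so the above/below pattern is identical across the group.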
Predicting consumer preferences for mobile device characteristics
As an example, we use the correlation coefficients between personality traits and mobile device characteristics reported in:
Peltonen E., Sharmila P., Asare K. O., Visuri A., Lagerspetz E., Ferreira D. When phones get personal: Predicting Big Five personality traits from application usage // Pervasive and Mobile Computing. – 2020. – Vol. 69. – Art. 101269.
[12]:
# Load the dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478676690?token=7KcAxPqMpWiYQnx&filename=divice_characteristics.csv'
df_divice_characteristics = pd.read_csv(url)
df_divice_characteristics.index.name = 'ID'
df_divice_characteristics.index += 1
df_divice_characteristics.index = df_divice_characteristics.index.map(str)
df_divice_characteristics
[12]:
| ID | Trait | Communication | Game Action | Game Board | Game Casino | Game Educational | Game Simulation | Game Trivia | Entertainment | Finance | Health and Fitness | Media and Video | Music and Audio | News and Magazines | Personalisation | Travel and Local | Weather |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.118 | 0.056 | 0.079 | 0.342 | 0.027 | 0.104 | 0.026 | 0.000 | 0.006 | 0.002 | 0.000 | 0.000 | 0.001 | 0.004 | 0.002 | 0.004 |
| 2 | Conscientiousness | 0.119 | 0.043 | 0.107 | 0.448 | 0.039 | 0.012 | 0.119 | 0.000 | 0.005 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.001 | 0.003 |
| 3 | Extraversion | 0.246 | 0.182 | 0.211 | 0.311 | 0.102 | 0.165 | 0.223 | 0.001 | 0.003 | 0.000 | 0.001 | 0.001 | 0.001 | 0.004 | 0.009 | 0.003 |
| 4 | Agreeableness | 0.218 | 0.104 | 0.164 | 0.284 | 0.165 | 0.122 | 0.162 | 0.000 | 0.003 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.004 | 0.003 |
| 5 | Non-Neuroticism | 0.046 | 0.047 | 0.125 | 0.515 | 0.272 | 0.179 | 0.214 | 0.002 | 0.030 | 0.001 | 0.000 | 0.005 | 0.003 | 0.008 | 0.004 | 0.007 |
[13]:
_b5._priority_calculation(
correlation_coefficients = df_divice_characteristics,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'divice_characteristics_priorities_mupta_ru', out = True)
# Optional: shorten the trait column names and round the scores to three decimals
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[13]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.766 | 0.697 | 0.656 | 0.760 | 0.494 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 2 | speaker_06_center_83.mov | 0.687 | 0.659 | 0.612 | 0.750 | 0.421 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 3 | speaker_07_center_83.mov | 0.672 | 0.661 | 0.572 | 0.705 | 0.381 | Game Casino | Communication | Game Board | Agreeableness | Conscientiousness | Extraversion |
| 4 | speaker_10_center_83.mov | 0.698 | 0.599 | 0.572 | 0.675 | 0.351 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 5 | speaker_11_center_83.mov | 0.718 | 0.599 | 0.574 | 0.732 | 0.380 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 6 | speaker_15_center_83.mov | 0.671 | 0.671 | 0.602 | 0.709 | 0.400 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 7 | speaker_19_center_83.mov | 0.767 | 0.658 | 0.653 | 0.801 | 0.463 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 8 | speaker_23_center_83.mov | 0.700 | 0.685 | 0.617 | 0.806 | 0.448 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 9 | speaker_24_center_83.mov | 0.711 | 0.663 | 0.611 | 0.711 | 0.414 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
| 10 | speaker_27_center_83.mov | 0.759 | 0.713 | 0.658 | 0.831 | 0.508 | Game Casino | Communication | Game Board | Agreeableness | Extraversion | Conscientiousness |
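The internals of `_priority_calculation` are not shown in this notebook. The sketch below is one scoring rule that reproduces the row for speaker_01 in the table above, and it is an assumption rather than the library's confirmed implementation: each trait score is multiplied by its correlation coefficients, the sign is flipped for scores below `threshold`, categories are ranked by their column sums, and traits are ranked by their summed contribution to the selected categories. The helper `rank_preferences` and its parameter names are hypothetical and not part of OCEAN-AI. Only the first seven app categories are included; the omitted ones have coefficients of at most 0.030 and cannot reach the top three for this row.

```python
import numpy as np
import pandas as pd

TRAITS = ['Openness', 'Conscientiousness', 'Extraversion',
          'Agreeableness', 'Non-Neuroticism']

# Subset of the correlation table shown above (first seven app categories)
corr = pd.DataFrame({
    'Trait': TRAITS,
    'Communication':    [0.118, 0.119, 0.246, 0.218, 0.046],
    'Game Action':      [0.056, 0.043, 0.182, 0.104, 0.047],
    'Game Board':       [0.079, 0.107, 0.211, 0.164, 0.125],
    'Game Casino':      [0.342, 0.448, 0.311, 0.284, 0.515],
    'Game Educational': [0.027, 0.039, 0.102, 0.165, 0.272],
    'Game Simulation':  [0.104, 0.012, 0.165, 0.122, 0.179],
    'Game Trivia':      [0.026, 0.119, 0.223, 0.162, 0.214],
})


def rank_preferences(scores, corr, threshold=0.55, n_priority=3, n_traits=3):
    """Hypothetical helper (not an OCEAN-AI function): rank categories and traits."""
    matrix = corr.drop(columns='Trait').to_numpy()   # shape: traits x categories
    categories = corr.columns.drop('Trait')
    scores = np.asarray(scores, dtype=float)
    # Assumption: scores below the threshold contribute with a negative sign
    signed = np.where(scores >= threshold, scores, -scores)
    weights = matrix * signed[:, None]
    # Categories with the largest summed weight come first
    top_cat = np.argsort(-weights.sum(axis=0))[:n_priority]
    # Traits that contribute most to the selected categories come first
    top_trait = np.argsort(-weights[:, top_cat].sum(axis=1))[:n_traits]
    return list(categories[top_cat]), [TRAITS[i] for i in top_trait]


# Rounded scores of speaker_01_center_83.mov from the table above
priorities, traits = rank_preferences([0.766, 0.697, 0.656, 0.760, 0.494], corr)
print(priorities)  # ['Game Casino', 'Communication', 'Game Board']
print(traits)      # ['Agreeableness', 'Extraversion', 'Conscientiousness']
```

Under this rule the binarisation by `threshold` only decides the sign of a trait's contribution, while the magnitude of the score still matters, which is why two people with the same priorities can have different trait-importance orderings.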
Predicting consumer preferences for clothing style
As an example, we use the correlation coefficients between personality traits and clothing style reported in:
Stolovy T. Styling the self: clothing practices, personality traits, and body image among Israeli women // Frontiers in Psychology. – 2022. – Vol. 12. – 719318.
[14]:
# Load the dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/493644097?token=KGtSGMxjZtWXmBz&filename=df_%D1%81lothing_style_correlation.csv'
df_clothing_styles = pd.read_csv(url)
df_clothing_styles.index.name = 'ID'
df_clothing_styles.index += 1
df_clothing_styles.index = df_clothing_styles.index.map(str)
df_clothing_styles
[14]:
| ID | Trait | Comfort | Camouflage | Assurance | Fashion | Individuality |
|---|---|---|---|---|---|---|
| 1 | Openness | 0.01 | -0.24 | 0.31 | 0.07 | 0.31 |
| 2 | Conscientiousness | -0.03 | -0.24 | 0.17 | 0.09 | 0.15 |
| 3 | Extraversion | -0.01 | -0.19 | 0.30 | 0.13 | 0.14 |
| 4 | Agreeableness | 0.16 | -0.16 | 0.15 | -0.09 | -0.05 |
| 5 | Non-Neuroticism | 0.03 | -0.16 | 0.01 | 0.00 | 0.06 |
[15]:
_b5._priority_calculation(
correlation_coefficients = df_clothing_styles,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'clothing_styles_priorities_mupta_ru', out = True)
# Optional: shorten the trait column names and round the scores to three decimals
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[15]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.766 | 0.697 | 0.656 | 0.760 | 0.494 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 2 | speaker_06_center_83.mov | 0.687 | 0.659 | 0.612 | 0.750 | 0.421 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 3 | speaker_07_center_83.mov | 0.672 | 0.661 | 0.572 | 0.705 | 0.381 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 4 | speaker_10_center_83.mov | 0.698 | 0.599 | 0.572 | 0.675 | 0.351 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 5 | speaker_11_center_83.mov | 0.718 | 0.599 | 0.574 | 0.732 | 0.380 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 6 | speaker_15_center_83.mov | 0.671 | 0.671 | 0.602 | 0.709 | 0.400 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 7 | speaker_19_center_83.mov | 0.767 | 0.658 | 0.653 | 0.801 | 0.463 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 8 | speaker_23_center_83.mov | 0.700 | 0.685 | 0.617 | 0.806 | 0.448 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 9 | speaker_24_center_83.mov | 0.711 | 0.663 | 0.611 | 0.711 | 0.414 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 10 | speaker_27_center_83.mov | 0.759 | 0.713 | 0.658 | 0.831 | 0.508 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
MuPTA (en)
[16]:
import os
import pandas as pd
# Import the module
from oceanai.modules.lab.build import Run
# Create an instance of the class
_b5 = Run()
corpus = 'fi'
lang = 'en'
# Configure the core
_b5.path_to_save_ = './models' # Directory for saving files
_b5.chunk_size_ = 2000000 # Size of each chunk when downloading a file from the network
# Build the audio models
res_load_model_hc = _b5.load_audio_model_hc()
res_load_model_nn = _b5.load_audio_model_nn()
# Load the audio model weights
url = _b5.weights_for_big5_['audio'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_audio_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['audio'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_audio_model_weights_nn(url = url, force_reload = False)
# Build the video models
res_load_model_hc = _b5.load_video_model_hc(lang=lang)
res_load_model_deep_fe = _b5.load_video_model_deep_fe()
res_load_model_nn = _b5.load_video_model_nn()
# Load the video model weights
url = _b5.weights_for_big5_['video'][corpus]['hc']['googledisk']
res_load_model_weights_hc = _b5.load_video_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['fe']['googledisk']
res_load_model_weights_deep_fe = _b5.load_video_model_weights_deep_fe(url = url, force_reload = False)
url = _b5.weights_for_big5_['video'][corpus]['nn']['googledisk']
res_load_model_weights_nn = _b5.load_video_model_weights_nn(url = url, force_reload = False)
# Load the dictionary with hand-crafted features (text modality)
res_load_text_features = _b5.load_text_features()
# Build the text models
res_setup_translation_model = _b5.setup_translation_model() # needed only for Russian
res_setup_bert_encoder = _b5.setup_bert_encoder(force_reload = False)
res_load_text_model_hc_fi = _b5.load_text_model_hc(corpus=corpus)
res_load_text_model_nn_fi = _b5.load_text_model_nn(corpus=corpus)
# Load the text model weights
url = _b5.weights_for_big5_['text'][corpus]['hc']['googledisk']
res_load_text_model_weights_hc_fi = _b5.load_text_model_weights_hc(url = url, force_reload = False)
url = _b5.weights_for_big5_['text'][corpus]['nn']['googledisk']
res_load_text_model_weights_nn_fi = _b5.load_text_model_weights_nn(url = url, force_reload = False)
# Build the model for multimodal information fusion
res_load_avt_model_b5 = _b5.load_avt_model_b5()
# Load the weights of the multimodal fusion model
url = _b5.weights_for_big5_['avt'][corpus]['b5']['googledisk']
res_load_avt_model_weights_b5 = _b5.load_avt_model_weights_b5(url = url, force_reload = False)
PATH_TO_DIR = './video_MuPTA/'
PATH_SAVE_VIDEO = './video_MuPTA/test/'
_b5.path_to_save_ = PATH_SAVE_VIDEO
# Download 10 test audio-video recordings from the MuPTA corpus
# URL: https://hci.nw.ru/en/pages/mupta-corpus
domain = 'https://download.sberdisk.ru/download/file/'
test_name_files = [
'477995979?token=2cvyk7CS0mHx2MJ&filename=speaker_06_center_83.mov',
'477995980?token=jGPtBPS69uzFU6Y&filename=speaker_01_center_83.mov',
'477995967?token=zCaRbNB6ht5wMPq&filename=speaker_11_center_83.mov',
'477995966?token=B1rbinDYRQKrI3T&filename=speaker_15_center_83.mov',
'477995978?token=dEpVDtZg1EQiEQ9&filename=speaker_07_center_83.mov',
'477995961?token=o1hVjw8G45q9L9Z&filename=speaker_19_center_83.mov',
'477995964?token=5K220Aqf673VHPq&filename=speaker_23_center_83.mov',
'477995965?token=v1LVD2KT1cU7Lpb&filename=speaker_24_center_83.mov',
'477995962?token=tmaSGyyWLA6XCy9&filename=speaker_27_center_83.mov',
'477995963?token=bTpo96qNDPcwGqb&filename=speaker_10_center_83.mov',
]
for curr_files in test_name_files:
_b5.download_file_from_url(url = domain + curr_files, out = True)
# Get the predictions
_b5.path_to_dataset_ = PATH_TO_DIR # Dataset directory
_b5.ext_ = ['.mov'] # Extensions of the files to search for
# Full path to the file with ground-truth scores used to compute accuracy
url_accuracy = _b5.true_traits_['mupta']['sberdisk']
_b5.get_avt_predictions(url_accuracy = url_accuracy, lang = lang)
[2024-10-10 18:29:55] Extracting features (hand-crafted and neural network) from text …
[2024-10-10 18:29:56] Obtaining predictions and computing accuracy (multimodal fusion) …
10 of 10 (100.0%) … GitHub\OCEANAI\docs\source\user_guide\notebooks\video_MuPTA\test\speaker_27_center_83.mov …
| Person ID | Path | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism |
|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.59561 | 0.542967 | 0.440668 | 0.589769 | 0.515306 |
| 2 | speaker_06_center_83.mov | 0.661347 | 0.673973 | 0.603208 | 0.64543 | 0.6431 |
| 3 | speaker_07_center_83.mov | 0.439868 | 0.465049 | 0.284547 | 0.422551 | 0.396058 |
| 4 | speaker_10_center_83.mov | 0.47715 | 0.502563 | 0.373686 | 0.441372 | 0.424637 |
| 5 | speaker_11_center_83.mov | 0.403292 | 0.344359 | 0.317304 | 0.422228 | 0.384346 |
| 6 | speaker_15_center_83.mov | 0.581837 | 0.562177 | 0.504623 | 0.602169 | 0.522254 |
| 7 | speaker_19_center_83.mov | 0.510444 | 0.448468 | 0.425599 | 0.451861 | 0.447891 |
| 8 | speaker_23_center_83.mov | 0.500526 | 0.541376 | 0.308529 | 0.441178 | 0.452412 |
| 9 | speaker_24_center_83.mov | 0.427677 | 0.511355 | 0.301078 | 0.434281 | 0.442301 |
| 10 | speaker_27_center_83.mov | 0.566414 | 0.659169 | 0.434059 | 0.59122 | 0.579172 |
[2024-10-10 18:29:56] Accuracy for individual personality traits …
| Metrics | Openness | Conscientiousness | Extraversion | Agreeableness | Non-Neuroticism | Mean |
|---|---|---|---|---|---|---|
| MAE | 0.1632 | 0.1621 | 0.176 | 0.2589 | 0.1122 | 0.1745 |
| Accuracy | 0.8368 | 0.8379 | 0.824 | 0.7411 | 0.8878 | 0.8255 |
[2024-10-10 18:29:56] Mean of the mean absolute errors: 0.1745, mean accuracy: 0.8255 …
Log files saved successfully …
--- Runtime: 320.737 sec. ---
[16]:
True
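In the metrics table above, every accuracy value equals one minus the corresponding MAE, and the reported means are plain averages over the five traits. The short check below recomputes those figures from the MAE row; the relation accuracy = 1 - MAE is inferred from the numbers shown here, not taken from the library's source.

```python
# Per-trait MAE values copied from the metrics table above
mae = {
    'Openness': 0.1632,
    'Conscientiousness': 0.1621,
    'Extraversion': 0.176,
    'Agreeableness': 0.2589,
    'Non-Neuroticism': 0.1122,
}

# Assumption: accuracy is defined as 1 - MAE for each trait
accuracy = {trait: round(1 - value, 4) for trait, value in mae.items()}

# Averages over the five traits
mean_mae = round(sum(mae.values()) / len(mae), 4)
mean_accuracy = round(sum(accuracy.values()) / len(accuracy), 4)

print(accuracy)
print(mean_mae, mean_accuracy)  # 0.1745 0.8255
```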
Predicting consumer preferences for industrial goods requires the correlation coefficients that describe the relationship between personality traits and preferences for goods or services.
As an example, we use the correlation coefficients between personality traits and car characteristics reported in:
O’Connor P. J. et al. What Drives Consumer Automobile Choice? Investigating Personality Trait Predictors of Vehicle Preference Factors // Personality and Individual Differences. – 2022. – Vol. 184. – 111220.
Users can supply their own correlation coefficients.
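A user-defined table presumably needs the same layout as the tables loaded in this notebook: a `Trait` column listing the five traits, followed by one numeric column per product characteristic. The sketch below builds such a table by hand. The category names and coefficient values are made-up placeholders for illustration only and should be replaced with coefficients from a real study.

```python
import pandas as pd

TRAITS = ['Openness', 'Conscientiousness', 'Extraversion',
          'Agreeableness', 'Non-Neuroticism']

# Placeholder categories and coefficients (hypothetical, not from any study)
df_custom = pd.DataFrame({
    'Trait': TRAITS,
    'Category A': [0.10, 0.25, -0.05, 0.15, 0.00],
    'Category B': [-0.20, 0.05, 0.30, 0.10, 0.05],
    'Category C': [0.35, -0.10, 0.20, -0.15, 0.10],
})

# Same index handling as the cells that load the CSV files in this notebook
df_custom.index.name = 'ID'
df_custom.index += 1
df_custom.index = df_custom.index.map(str)

print(df_custom)
```

The resulting dataframe can then be passed in the same way as the downloaded ones, that is as `correlation_coefficients = df_custom` with `col_name_ocean = 'Trait'`.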
Predicting consumer preferences for car characteristics
[17]:
# Load the dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478675818?token=EjfLMqOeK8cfnOu&filename=auto_characteristics.csv'
df_correlation_coefficients = pd.read_csv(url)
df_correlation_coefficients = pd.DataFrame(
df_correlation_coefficients.drop(['Style and performance', 'Safety and practicality'], axis = 1)
)
df_correlation_coefficients.index.name = 'ID'
df_correlation_coefficients.index += 1
df_correlation_coefficients.index = df_correlation_coefficients.index.map(str)
df_correlation_coefficients
[17]:
| ID | Trait | Performance | Classic car features | Luxury additions | Fashion and attention | Recreation | Technology | Family friendly | Safe and reliable | Practical and easy to use | Economical/low cost | Basic features |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.020000 | -0.033333 | -0.030000 | -0.050000 | 0.033333 | 0.013333 | -0.030000 | 0.136667 | 0.106667 | 0.093333 | 0.006667 |
| 2 | Conscientiousness | 0.013333 | -0.193333 | -0.063333 | -0.096667 | -0.096667 | 0.086667 | -0.063333 | 0.280000 | 0.180000 | 0.130000 | 0.143333 |
| 3 | Extraversion | 0.133333 | 0.060000 | 0.106667 | 0.123333 | 0.126667 | 0.120000 | 0.090000 | 0.136667 | 0.043333 | 0.073333 | 0.050000 |
| 4 | Agreeableness | -0.036667 | -0.193333 | -0.133333 | -0.133333 | -0.090000 | 0.046667 | -0.016667 | 0.240000 | 0.160000 | 0.120000 | 0.083333 |
| 5 | Non-Neuroticism | 0.016667 | -0.006667 | -0.010000 | -0.006667 | -0.033333 | 0.046667 | -0.023333 | 0.093333 | 0.046667 | 0.046667 | -0.040000 |
[18]:
_b5._priority_calculation(
correlation_coefficients = df_correlation_coefficients,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = False
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'auto_characteristics_priorities_mupta_en', out = True)
# Optional: shorten the trait column names and round the scores to three decimals
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[18]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.596 | 0.543 | 0.441 | 0.590 | 0.515 | Practical and easy to use | Economical/low cost | Recreation | Openness | Agreeableness | Non-Neuroticism |
| 2 | speaker_06_center_83.mov | 0.661 | 0.674 | 0.603 | 0.645 | 0.643 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 3 | speaker_07_center_83.mov | 0.440 | 0.465 | 0.285 | 0.423 | 0.396 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 4 | speaker_10_center_83.mov | 0.477 | 0.503 | 0.374 | 0.441 | 0.425 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 5 | speaker_11_center_83.mov | 0.403 | 0.344 | 0.317 | 0.422 | 0.384 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 6 | speaker_15_center_83.mov | 0.582 | 0.562 | 0.505 | 0.602 | 0.522 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
| 7 | speaker_19_center_83.mov | 0.510 | 0.448 | 0.426 | 0.452 | 0.448 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 8 | speaker_23_center_83.mov | 0.501 | 0.541 | 0.309 | 0.441 | 0.452 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 9 | speaker_24_center_83.mov | 0.428 | 0.511 | 0.301 | 0.434 | 0.442 | Classic car features | Fashion and attention | Luxury additions | Agreeableness | Conscientiousness | Openness |
| 10 | speaker_27_center_83.mov | 0.566 | 0.659 | 0.434 | 0.591 | 0.579 | Safe and reliable | Practical and easy to use | Economical/low cost | Conscientiousness | Agreeableness | Openness |
Predicting consumer preferences for mobile device characteristics
As an example, we use the correlation coefficients between personality traits and mobile device characteristics reported in:
Peltonen E., Sharmila P., Asare K. O., Visuri A., Lagerspetz E., Ferreira D. When phones get personal: Predicting Big Five personality traits from application usage // Pervasive and Mobile Computing. – 2020. – Vol. 69. – 101269.
[19]:
# Load the dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/478676690?token=7KcAxPqMpWiYQnx&filename=divice_characteristics.csv'
df_divice_characteristics = pd.read_csv(url)
df_divice_characteristics.index.name = 'ID'
df_divice_characteristics.index += 1
df_divice_characteristics.index = df_divice_characteristics.index.map(str)
df_divice_characteristics
[19]:
| ID | Trait | Communication | Game Action | Game Board | Game Casino | Game Educational | Game Simulation | Game Trivia | Entertainment | Finance | Health and Fitness | Media and Video | Music and Audio | News and Magazines | Personalisation | Travel and Local | Weather |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Openness | 0.118 | 0.056 | 0.079 | 0.342 | 0.027 | 0.104 | 0.026 | 0.000 | 0.006 | 0.002 | 0.000 | 0.000 | 0.001 | 0.004 | 0.002 | 0.004 |
| 2 | Conscientiousness | 0.119 | 0.043 | 0.107 | 0.448 | 0.039 | 0.012 | 0.119 | 0.000 | 0.005 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.001 | 0.003 |
| 3 | Extraversion | 0.246 | 0.182 | 0.211 | 0.311 | 0.102 | 0.165 | 0.223 | 0.001 | 0.003 | 0.000 | 0.001 | 0.001 | 0.001 | 0.004 | 0.009 | 0.003 |
| 4 | Agreeableness | 0.218 | 0.104 | 0.164 | 0.284 | 0.165 | 0.122 | 0.162 | 0.000 | 0.003 | 0.001 | 0.000 | 0.002 | 0.002 | 0.001 | 0.004 | 0.003 |
| 5 | Non-Neuroticism | 0.046 | 0.047 | 0.125 | 0.515 | 0.272 | 0.179 | 0.214 | 0.002 | 0.030 | 0.001 | 0.000 | 0.005 | 0.003 | 0.008 | 0.004 | 0.007 |
[20]:
_b5._priority_calculation(
correlation_coefficients = df_divice_characteristics,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'divice_characteristics_priorities_mupta_en', out = True)
# Optional: shorten the trait column names and round the scores to three decimals
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[20]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.596 | 0.543 | 0.441 | 0.590 | 0.515 | Communication | Health and Fitness | Media and Video | Agreeableness | Openness | Non-Neuroticism |
| 2 | speaker_06_center_83.mov | 0.661 | 0.674 | 0.603 | 0.645 | 0.643 | Game Casino | Communication | Game Trivia | Non-Neuroticism | Extraversion | Conscientiousness |
| 3 | speaker_07_center_83.mov | 0.440 | 0.465 | 0.285 | 0.423 | 0.396 | Media and Video | Entertainment | Health and Fitness | Agreeableness | Conscientiousness | Extraversion |
| 4 | speaker_10_center_83.mov | 0.477 | 0.503 | 0.374 | 0.441 | 0.425 | Media and Video | Entertainment | Health and Fitness | Agreeableness | Conscientiousness | Extraversion |
| 5 | speaker_11_center_83.mov | 0.403 | 0.344 | 0.317 | 0.422 | 0.384 | Media and Video | Entertainment | Health and Fitness | Conscientiousness | Agreeableness | Extraversion |
| 6 | speaker_15_center_83.mov | 0.582 | 0.562 | 0.505 | 0.602 | 0.522 | Game Casino | Communication | Game Board | Agreeableness | Conscientiousness | Openness |
| 7 | speaker_19_center_83.mov | 0.510 | 0.448 | 0.426 | 0.452 | 0.448 | Media and Video | Entertainment | Health and Fitness | Conscientiousness | Agreeableness | Extraversion |
| 8 | speaker_23_center_83.mov | 0.501 | 0.541 | 0.309 | 0.441 | 0.452 | Media and Video | Entertainment | Health and Fitness | Agreeableness | Conscientiousness | Extraversion |
| 9 | speaker_24_center_83.mov | 0.428 | 0.511 | 0.301 | 0.434 | 0.442 | Media and Video | Entertainment | Health and Fitness | Agreeableness | Conscientiousness | Extraversion |
| 10 | speaker_27_center_83.mov | 0.566 | 0.659 | 0.434 | 0.591 | 0.579 | Game Casino | Game Educational | Game Trivia | Non-Neuroticism | Conscientiousness | Agreeableness |
Predicting consumer preferences for clothing style
As an example, we use the correlation coefficients between personality traits and clothing style reported in:
Stolovy T. Styling the self: clothing practices, personality traits, and body image among Israeli women // Frontiers in Psychology. – 2022. – Vol. 12. – 719318.
[21]:
# Load the dataframe with correlation coefficients
url = 'https://download.sberdisk.ru/download/file/493644097?token=KGtSGMxjZtWXmBz&filename=df_%D1%81lothing_style_correlation.csv'
df_clothing_styles = pd.read_csv(url)
df_clothing_styles.index.name = 'ID'
df_clothing_styles.index += 1
df_clothing_styles.index = df_clothing_styles.index.map(str)
df_clothing_styles
[21]:
| ID | Trait | Comfort | Camouflage | Assurance | Fashion | Individuality |
|---|---|---|---|---|---|---|
| 1 | Openness | 0.01 | -0.24 | 0.31 | 0.07 | 0.31 |
| 2 | Conscientiousness | -0.03 | -0.24 | 0.17 | 0.09 | 0.15 |
| 3 | Extraversion | -0.01 | -0.19 | 0.30 | 0.13 | 0.14 |
| 4 | Agreeableness | 0.16 | -0.16 | 0.15 | -0.09 | -0.05 |
| 5 | Non-Neuroticism | 0.03 | -0.16 | 0.01 | 0.00 | 0.06 |
[22]:
_b5._priority_calculation(
correlation_coefficients = df_clothing_styles,
col_name_ocean = 'Trait',
threshold = 0.55,
number_priority = 3,
number_importance_traits = 3,
out = True
)
_b5._save_logs(df = _b5.df_files_priority_, name = 'clothing_styles_priorities_mupta_en', out = True)
# Optional: shorten the trait column names and round the scores to three decimals
df = _b5.df_files_priority_.rename(columns = {'Openness':'OPE', 'Conscientiousness':'CON', 'Extraversion': 'EXT', 'Agreeableness': 'AGR', 'Non-Neuroticism': 'NNEU'})
columns_to_round = ['OPE', 'CON', 'EXT', 'AGR', 'NNEU']
df[columns_to_round] = df[columns_to_round].apply(lambda x: [round(i, 3) for i in x])
df
[22]:
| Person ID | Path | OPE | CON | EXT | AGR | NNEU | Priority 1 | Priority 2 | Priority 3 | Trait importance 1 | Trait importance 2 | Trait importance 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | speaker_01_center_83.mov | 0.596 | 0.543 | 0.441 | 0.590 | 0.515 | Comfort | Camouflage | Assurance | Agreeableness | Non-Neuroticism | Conscientiousness |
| 2 | speaker_06_center_83.mov | 0.661 | 0.674 | 0.603 | 0.645 | 0.643 | Assurance | Individuality | Fashion | Openness | Extraversion | Conscientiousness |
| 3 | speaker_07_center_83.mov | 0.440 | 0.465 | 0.285 | 0.423 | 0.396 | Camouflage | Comfort | Fashion | Conscientiousness | Openness | Non-Neuroticism |
| 4 | speaker_10_center_83.mov | 0.477 | 0.503 | 0.374 | 0.441 | 0.425 | Camouflage | Comfort | Fashion | Conscientiousness | Openness | Non-Neuroticism |
| 5 | speaker_11_center_83.mov | 0.403 | 0.344 | 0.317 | 0.422 | 0.384 | Camouflage | Fashion | Comfort | Openness | Conscientiousness | Non-Neuroticism |
| 6 | speaker_15_center_83.mov | 0.582 | 0.562 | 0.505 | 0.602 | 0.522 | Assurance | Individuality | Comfort | Openness | Conscientiousness | Agreeableness |
| 7 | speaker_19_center_83.mov | 0.510 | 0.448 | 0.426 | 0.452 | 0.448 | Camouflage | Comfort | Fashion | Openness | Conscientiousness | Non-Neuroticism |
| 8 | speaker_23_center_83.mov | 0.501 | 0.541 | 0.309 | 0.441 | 0.452 | Camouflage | Comfort | Fashion | Conscientiousness | Openness | Non-Neuroticism |
| 9 | speaker_24_center_83.mov | 0.428 | 0.511 | 0.301 | 0.434 | 0.442 | Camouflage | Comfort | Fashion | Conscientiousness | Openness | Non-Neuroticism |
| 10 | speaker_27_center_83.mov | 0.566 | 0.659 | 0.434 | 0.591 | 0.579 | Assurance | Individuality | Comfort | Openness | Conscientiousness | Agreeableness |
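The clothing-style results above differ sharply between speakers whose scores lie above and below `threshold = 0.55`. The sketch below illustrates one possible explanation, under the assumption (not confirmed by the library's source) that a score below the threshold enters the weighted sum with a negative sign. With all five scores of speaker_07 below 0.55, the negatively correlated style Camouflage rises to the top; lowering the threshold to a hypothetical 0.40 changes the ranking. The function `top_styles` is an illustrative helper, not an OCEAN-AI call.

```python
import numpy as np

styles = ['Comfort', 'Camouflage', 'Assurance', 'Fashion', 'Individuality']

# Correlation coefficients from the clothing-style table above
# (rows: Openness, Conscientiousness, Extraversion, Agreeableness, Non-Neuroticism)
corr = np.array([
    [ 0.01, -0.24, 0.31,  0.07,  0.31],
    [-0.03, -0.24, 0.17,  0.09,  0.15],
    [-0.01, -0.19, 0.30,  0.13,  0.14],
    [ 0.16, -0.16, 0.15, -0.09, -0.05],
    [ 0.03, -0.16, 0.01,  0.00,  0.06],
])

# Rounded scores of speaker_07_center_83.mov from the table above
scores = np.array([0.440, 0.465, 0.285, 0.423, 0.396])


def top_styles(scores, threshold, k=3):
    """Hypothetical helper: rank styles by the sign-adjusted weighted sum."""
    # Assumption: scores below the threshold contribute with a negative sign
    signed = np.where(scores >= threshold, scores, -scores)
    totals = signed @ corr  # one total per clothing style
    return [styles[i] for i in np.argsort(-totals)[:k]]


default_rank = top_styles(scores, threshold=0.55)
lower_rank = top_styles(scores, threshold=0.40)
print(default_rank)  # ['Camouflage', 'Comfort', 'Fashion']
print(lower_rank)    # ['Assurance', 'Individuality', 'Comfort']
```

The first ranking matches the row for speaker_07 in the table above, which suggests that the choice of `threshold` can matter as much as the correlation coefficients themselves when scores cluster near it.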