Sunbelt Computer Software

Put all the files in the same directory

requirements

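sklearn (scikit-learn), numpy, keras, tensorflow 1.12. pickle and sys are also used, but they are part of the Python standard library and need no installation.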

Run the following steps in order. To save time, you can start from any step j in [1, 2, 3, 4] and run only steps[j:], provided the input files for step j are already present.
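Resuming from a later step can be scripted. The sketch below is not part of the repository: the `STEPS` list only mirrors the commands documented under steps, and `commands_from` and `run_from` are hypothetical helper names.

```python
import subprocess

# Commands as documented under "steps"; step numbers are 1-based.
STEPS = [
    ["python", "process_data.py"],
    ["python", "cnn_feature.py"],
    ["python", "tfidf_feature.py"],
    ["python", "classification.py", "OPN.p", "tfidf.p"],
]


def commands_from(j):
    """Return the commands for step j through the last step."""
    if not 1 <= j <= len(STEPS):
        raise ValueError("j must be between 1 and %d" % len(STEPS))
    return STEPS[j - 1:]


def run_from(j):
    """Run the remaining steps in order, stopping at the first failure."""
    for cmd in commands_from(j):
        subprocess.run(cmd, check=True)
```

Calling `run_from(3)`, for example, would run only the TF-IDF and classification steps.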

steps

STEP 1. Data Preprocessing and Word Embedding

Command: python process_data.py
Input: GoogleNews-vectors-negative300.txt, essays.csv, mairesse.csv
Output: essays_mairesse.p
Time Consumption: 5 hours or more
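To illustrate the word embedding part of this step, here is a minimal sketch of reading vectors from a text file. It assumes the word2vec text layout (a header line with the word count and dimension, then one word and its values per line); whether `process_data.py` reads the file this way is not confirmed here, and `load_text_vectors` is a hypothetical helper.

```python
import io

import numpy as np


def load_text_vectors(handle, vocab=None):
    """Read word vectors from a word2vec-style text stream into a dict.

    If `vocab` is given, only words in it are kept, which saves memory
    when the embedding file is much larger than the corpus vocabulary.
    """
    header = handle.readline().split()
    dim = int(header[1])  # header is "<word count> <dimension>"
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            continue  # skip malformed lines
        if vocab is not None and word not in vocab:
            continue
        vectors[word] = np.array([float(v) for v in values], dtype=np.float32)
    return vectors


# Tiny in-memory stand-in for the real embedding file (made-up values).
sample = io.StringIO(
    "3 4\n"
    "happy 0.1 0.2 0.3 0.4\n"
    "sad -0.1 0.0 0.5 0.2\n"
    "calm 0.3 0.3 0.1 0.0\n"
)
vecs = load_text_vectors(sample, vocab={"happy", "calm"})
```

Filtering by vocabulary while reading is one way to keep memory use down on a file as large as the GoogleNews vectors.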

STEP 2. Feature Extraction with CNN

Command: python cnn_feature.py
Input: essays_mairesse.p
Output: EXT.p, NEU.p, AGR.p, OPN.p, CON.p (corresponding to five personality traits)
Time Consumption: 10 hours or more
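The general idea of extracting features from text with a convolutional layer can be sketched without any deep learning framework. The example below is an illustration only, not the architecture in `cnn_feature.py` (which presumably builds its network with keras): it applies untrained random filters over windows of word vectors, then a ReLU and max-over-time pooling. The filter count, filter width, and function name are assumptions.

```python
import numpy as np


def conv_max_features(sentence, filters, bias):
    """Convolve filters over a sentence and max-pool over time.

    sentence: (n_words, dim) word vectors
    filters:  (n_filters, width, dim)
    bias:     (n_filters,)
    Returns one feature per filter, shape (n_filters,).
    """
    n_words = sentence.shape[0]
    width = filters.shape[1]
    if n_words < width:
        raise ValueError("sentence is shorter than the filter width")
    # Every window of `width` consecutive word vectors: (n_windows, width, dim).
    windows = np.stack([sentence[i:i + width] for i in range(n_words - width + 1)])
    # Response of each filter at each window: (n_windows, n_filters).
    responses = np.tensordot(windows, filters, axes=([1, 2], [1, 2])) + bias
    activated = np.maximum(responses, 0.0)  # ReLU
    return activated.max(axis=0)  # max-over-time pooling


rng = np.random.RandomState(0)
sentence = rng.normal(size=(7, 300))    # 7 words, 300-dimensional vectors
filters = rng.normal(size=(5, 3, 300))  # 5 filters spanning 3 words each
features = conv_max_features(sentence, filters, np.zeros(5))
```

In a trained network the filters are learned; here they are random so that the example stays self-contained.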

STEP 3. TF-IDF Feature Extraction

TF-IDF feature extraction consists of three stages: reading and preprocessing, vectorization, and saving.

Command: python tfidf_feature.py
Input: essays_mairesse.p
Output: tfidf.p
Time Consumption: 1 hour
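The three stages can be sketched with scikit-learn's `TfidfVectorizer` and `pickle`. This is a minimal illustration under assumptions: the toy essays, the lowercasing step, and the `tfidf_demo.p` file name are made up, and the actual preprocessing in `tfidf_feature.py` may differ.

```python
import os
import pickle
import tempfile

from sklearn.feature_extraction.text import TfidfVectorizer

# 1. Reading and preprocessing: toy essays stand in for the real data.
essays = [
    "I love meeting new people at parties.",
    "I prefer quiet evenings with a good book.",
    "New ideas and new places excite me.",
]
cleaned = [text.lower().strip() for text in essays]

# 2. Vectorization: one row per essay, one column per vocabulary term.
vectorizer = TfidfVectorizer()
features = vectorizer.fit_transform(cleaned)

# 3. Saving: pickle the sparse matrix (written to a temp directory here).
out_path = os.path.join(tempfile.mkdtemp(), "tfidf_demo.p")
with open(out_path, "wb") as handle:
    pickle.dump(features, handle)

# Reload to confirm the round trip.
with open(out_path, "rb") as handle:
    restored = pickle.load(handle)
```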

STEP 4. Classification

The following command prints train and test accuracies for the openness trait (OPN), using different models and different features. To classify another trait, replace OPN.p with the corresponding data file (AGR.p, CON.p, EXT.p, or NEU.p).

Command: python classification.py OPN.p tfidf.p
Input: tfidf.p, EXT.p, NEU.p, AGR.p, OPN.p, CON.p
Output: printed
Time Consumption: 20 min for each personality trait
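Comparing train and test accuracy across models can be sketched as follows. The data here is synthetic and the two models are illustrative choices; this README does not name the models used in `classification.py`, so none of this should be read as the repository's setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for one trait: 200 samples, 20 features, binary label.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 20))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Illustrative models, not necessarily those in classification.py.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    results[name] = (train_acc, test_acc)
    print("%s: train=%.3f test=%.3f" % (name, train_acc, test_acc))
```

A large gap between train and test accuracy for a model suggests overfitting, which is why both numbers are reported.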
