Process the profile data

We make two changes to profiles.csv. First, we drop the link column, which is not used in this project. Second, we add a "num_favorite" column that records how many favorite anime each user has listed.
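
The two profile steps could be sketched as follows. This is a minimal illustration on a toy DataFrame, not the actual file: the column names `profile` and `favorites_anime`, and the assumption that favorites are stored as a string-encoded list, are guesses for the sake of the example. Only `link` and `num_favorite` come from the description above.

```python
import ast

import pandas as pd

# Toy stand-in for profiles.csv. Every column name except "link" is an assumption.
profiles = pd.DataFrame({
    "profile": ["user_a", "user_b", "user_c"],
    # Assumed format: each cell is a string that looks like a Python list.
    "favorites_anime": ["['1', '20', '300']", "[]", "['5']"],
    "link": ["https://example.com/a", "https://example.com/b", "https://example.com/c"],
})

# Step 1: drop the unused link column.
profiles = profiles.drop(columns=["link"])

# Step 2: parse each string-encoded list and count its entries.
profiles["num_favorite"] = profiles["favorites_anime"].apply(
    lambda cell: len(ast.literal_eval(cell))
)
```

If the favorites are stored in another format (for example a comma-separated string), only the counting lambda would need to change.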

Handle missing values in the rating.csv file

According to the website, entries with rating = 0 are actually missing values, so we replace them with None rather than keeping them as real ratings. Values that fall outside the indicated rating range are treated as missing as well.
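
One way to express this rule is to keep only the ratings inside the valid range and blank out everything else, which covers the rating = 0 case and the out-of-range case in a single step. The sketch below uses a toy DataFrame; the column names `user_id` and `anime_id` and the 1 to 10 range are assumptions, since the text above does not state the range.

```python
import pandas as pd

# Toy stand-in for rating.csv. Column names other than "rating" are assumptions.
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "anime_id": [10, 11, 10, 12, 11],
    "rating": [8, 0, 10, 11, -1],
})

# Assumed valid range; replace with the range indicated by the data source.
VALID_MIN, VALID_MAX = 1, 10

# True where the rating is a real value; 0 and out-of-range values are False.
is_valid = ratings["rating"].between(VALID_MIN, VALID_MAX)

# Series.where keeps values where the mask is True and sets the rest to missing.
ratings["rating"] = ratings["rating"].where(is_valid)
```

Note that pandas stores the missing entries of a numeric column as NaN, so the column becomes floating point after this step.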

For the dataset that stores information about the individual anime, we also drop the columns that are not needed.
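
As a rough sketch of this step, the unused columns can be listed once and dropped together. The text above does not say which columns are removed, so the names in `UNUSED_COLUMNS` and in the toy DataFrame are purely hypothetical.

```python
import pandas as pd

# Toy stand-in for the anime dataset. All column names here are hypothetical.
animes = pd.DataFrame({
    "uid": [1, 2],
    "title": ["Anime A", "Anime B"],
    "img_url": ["https://example.com/a.jpg", "https://example.com/b.jpg"],
    "link": ["https://example.com/a", "https://example.com/b"],
})

# Hypothetical list of columns considered unimportant for the project.
UNUSED_COLUMNS = ["img_url", "link", "synopsis"]

# errors="ignore" skips names that are absent (here "synopsis") instead of raising.
animes = animes.drop(columns=UNUSED_COLUMNS, errors="ignore")
```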

Finally, we need to process the review.csv data

We extract the sentiment scores of each review and flatten the score dictionary, so that each score ends up in its own column.
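
The flattening step might look like the sketch below. It assumes the sentiment scores are already available as one dictionary per review in a column called `sentiment`. That column name, the `uid` column, and the keys `neg`, `neu`, and `pos` are all hypothetical, and how the scores are computed is left out on purpose.

```python
import pandas as pd

# Toy stand-in for review.csv after sentiment extraction.
# The column names and the dictionary keys are hypothetical.
reviews = pd.DataFrame({
    "uid": [101, 102],
    "sentiment": [
        {"neg": 0.1, "neu": 0.6, "pos": 0.3},
        {"neg": 0.0, "neu": 0.5, "pos": 0.5},
    ],
})

# Turn the list of dictionaries into one column per key.
flat = pd.json_normalize(reviews["sentiment"].tolist()).add_prefix("sentiment_")

# Align the new columns with the original rows before joining.
flat.index = reviews.index

# Replace the dictionary column with the flattened columns.
reviews = pd.concat([reviews.drop(columns=["sentiment"]), flat], axis=1)
```

Prefixing the new columns keeps them from clashing with existing column names that might share a key such as `pos`.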