User Bias Removal in Fine Grained Sentiment Analysis
Rahul Wadbude*, Vivek Gupta*, Dheeraj Mekala, Janish Jindal, Harish Karnick
*Equal contribution
The Crux
A major problem for current sentiment classification models is noise caused by user bias in review ratings, i.e., different users map the same expressed sentiment to different star ratings.
We propose two simple statistical methods that remove this user-bias noise and improve fine-grained sentiment classification.
We apply our methods to the SNAP-published Amazon Fine Food Reviews dataset and to two major categories, Electronics and Movies & TV, of the Amazon e-commerce reviews dataset.
After removing user bias, we obtain improvements on the standard evaluation metric (RMSE) for three commonly used feature representations, compared to models trained without bias removal, on the fine-grained sentiment analysis task; a minimal illustration of such bias follows.
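To make the notion of user bias concrete, here is a minimal sketch that measures each user's mean rating offset from the global mean. It assumes a pandas DataFrame with hypothetical `user_id` and `rating` columns; a consistently positive or negative offset is exactly the per-user noise the methods below aim to remove.

```python
import pandas as pd

# Hypothetical schema: one row per review, with a user id and a 1-5 star rating.
reviews = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2", "u2", "u3"],
    "rating":  [5, 5, 4, 2, 1, 3],
})

global_mean = reviews["rating"].mean()

# Per-user bias = how far a user's average rating sits from the global average.
user_bias = reviews.groupby("user_id")["rating"].mean() - global_mean
print(user_bias)
# u1 rates leniently (positive offset), u2 harshly (negative offset).
```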
Models
User Bias Removal - I (UBR-I)
User Bias Removal - II (UBR-II)
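The poster does not spell out the two formulations (see the paper for the exact definitions of UBR-I and UBR-II). As an illustrative assumption only, the sketch below shows one common way to factor out a per-user offset: train a regressor on the residual rating after subtracting each user's mean, then add the offset back at prediction time. The function names, column names, and the choice of Ridge regression are all hypothetical, not the paper's prescribed setup.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

def fit_debiased_regressor(train_df: pd.DataFrame):
    """Illustrative bias-removal pipeline (not necessarily the exact UBR-I/UBR-II).

    Learns to predict the residual rating left after removing a per-user mean
    offset; the offset is restored when predicting for a known user.
    """
    user_mean = train_df.groupby("user_id")["rating"].mean()
    global_mean = train_df["rating"].mean()

    # Residual target: rating minus that user's average rating.
    residual = train_df["rating"] - train_df["user_id"].map(user_mean)

    vectorizer = TfidfVectorizer(ngram_range=(1, 2))
    X = vectorizer.fit_transform(train_df["text"])
    model = Ridge().fit(X, residual)
    return model, vectorizer, user_mean, global_mean

def predict_ratings(model, vectorizer, user_mean, global_mean, test_df: pd.DataFrame):
    X = vectorizer.transform(test_df["text"])
    # Unseen users fall back to the global mean offset.
    offsets = test_df["user_id"].map(user_mean).fillna(global_mean).to_numpy()
    # Predicted rating = text-based residual + the user's offset, clipped to 1-5 stars.
    return np.clip(model.predict(X) + offsets, 1, 5)
```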
Results and Analysis
Classification Results
- We report results on three datasets: Amazon Fine Food Reviews, Amazon Electronics reviews, and Amazon Movies & TV reviews.
- Review text is encoded with three feature representations (tf-idf, LDA, and doc2vec/PV-DBOW) and evaluated with the standard RMSE metric. The tables below show the performance of different classification methods under each representation; a minimal feature-extraction and evaluation sketch follows this list.
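The snippet below is a hedged sketch of how such an evaluation can be run: build tf-idf features (unigrams + bigrams, as in the "bigram" rows) and report RMSE for a direct, non-debiased baseline. The toy texts, ratings, and the Ridge regressor are placeholders, not the experimental configuration behind the tables.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Toy corpus; in the experiments these would be the review texts and star ratings.
texts = ["great food loved it", "terrible battery died fast",
         "okay movie nothing special", "excellent quality would buy again",
         "worst purchase ever", "pretty decent overall"]
ratings = np.array([5, 1, 3, 5, 1, 4])

texts_tr, texts_te, y_tr, y_te = train_test_split(
    texts, ratings, test_size=0.33, random_state=0)

# tf-idf representation with unigrams and bigrams.
tfidf = TfidfVectorizer(ngram_range=(1, 2))
X_tr, X_te = tfidf.fit_transform(texts_tr), tfidf.transform(texts_te)

pred = Ridge().fit(X_tr, y_tr).predict(X_te)
rmse = np.sqrt(mean_squared_error(y_te, pred))
print(f"RMSE: {rmse:.3f}")

# An LDA representation can be built analogously from CountVectorizer output via
# sklearn's LatentDirichletAllocation, and PV-DBOW vectors via gensim's Doc2Vec(dm=0);
# both yield document vectors that feed the same regressor.
```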
FOOD REVIEWS (RMSE, lower is better)
Methods | tf-idf | LDA | PV-DBOW |
---|---|---|---|
Majority Voting | 1.535 | 1.535 | 1.535 |
User Mean | 0.599 | 0.599 | 0.599 |
User Mode | 2.557 | 2.557 | 2.557 |
Product Mean | 1.140 | 1.140 | 1.140 |
Product Mode | 1.746 | 1.746 | 1.746 |
Direct | 0.888 | 1.494 | 1.06 |
Direct(bigram) | 0.737 | - | - |
UBR-I | 0.546 | 0.597 | 0.56 |
UBR-I(bigram) | 0.529 | - | - |
UBR-II | 0.669 | 0.778 | 0.71 |
UBR-II(bigram) | 0.642 | - | - |
ELECTRONICS REVIEWS (RMSE, lower is better)
Methods | tf-idf | LDA | PV-DBOW |
---|---|---|---|
Majority Voting | 1.417 | 1.417 | 1.417 |
User Mean | 1.022 | 1.022 | 1.022 |
User Mode | 1.278 | 1.278 | 1.278 |
Product Mean | 1.095 | 1.095 | 1.095 |
Product Mode | 1.358 | 1.358 | 1.358 |
Direct | 0.932 | 1.434 | 1.1 |
Direct(bigram) | 0.805 | - | - |
UBR-I | 0.815 | 0.988 | 0.86 |
UBR-I(bigram) | 0.763 | - | - |
UBR-II | 0.821 | 1.011 | 0.9 |
UBR-II(bigram) | 0.761 | - | - |
MOVIES & TV (RMSE, lower is better)
Methods | tf-idf | LDA | PV-DBOW |
---|---|---|---|
Majority Voting | 1.494 | 1.494 | 1.494 |
User Mean | 1.005 | 1.005 | 1.005 |
User Mode | 1.258 | 1.258 | 1.258 |
Product Mean | 1.066 | 1.066 | 1.066 |
Product Mode | 1.347 | 1.347 | 1.347 |
Direct | 0.936 | 1.273 | 1.08 |
Direct(bigram) | 0.853 | - | - |
UBR-I | 0.818 | 0.959 | 0.87 |
UBR-I(bigram) | 0.783 | - | - |
UBR-II | 0.814 | 0.982 | 0.87 |
UBR-II(bigram) | 0.775 | - | - |
Conclusion
Our experiments show that user bias is a genuine problem that needs to be handled in fine-grained sentiment analysis.
The proposed methods remove user bias and improve fine-grained sentiment analysis performance.
They also work well with commonly used text feature representations (tf-idf, LDA, and PV-DBOW).