Date of Award

Spring 5-22-2021

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Electrical Engineering and Computer Science

Advisor(s)

Reza Zafarani

Subject Categories

Computer Sciences | Physical Sciences and Mathematics

Abstract

Interactive websites generate terabytes of data every day. This data can be used in many analytical applications to teach computers more about human behavior, and text classification is one such application. Freely available user-generated text can be used to teach computers to identify the sentiment behind a user’s on-screen interactions without human intervention. Sentiment analysis is an interesting problem; solving it would, in theory, bring a computer closer to passing the Turing test. In this thesis, we test the ability of a classifier to accurately identify user sentiment. We do not focus on standard classification settings, however. The aim is to train the classifier so that it is also effective at identifying the sentiment behind user-generated text from a completely new social media platform. To do this, we must first identify behavioral bias based on user interactions on two different social media sites as well as on websites that accept user reviews. This bias must then be mitigated to obtain an unbiased classifier that can be used to identify user sentiment on any social media platform. For the research in this thesis, user-generated text is obtained from the social media sites Reddit and Twitter. We also obtain product review data for both books and wine. Various natural language processing techniques are then employed to process the data and extract similar and dissimilar trends. Vectorized user text is used to train sentiment classifiers. Finally, classification bias is identified and mitigated to obtain classifiers that can identify human sentiment in real time with improved accuracy and limited dependency on source information.
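As a rough illustration of the cross-platform setting the abstract describes, the sketch below trains a sentiment classifier on text from one source and compares its accuracy on held-out text from the same source with its accuracy on text from a different source. It is not the thesis's method: the bag-of-words multinomial naive Bayes model is a simple stand-in, the function names are hypothetical, and the tiny datasets are invented placeholders rather than Reddit, Twitter, or review data.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenization; a stand-in for real preprocessing.
    return text.lower().split()


def train_nb(samples):
    """Fit a multinomial naive Bayes model on (text, label) pairs."""
    word_counts = {}        # label -> Counter of token frequencies
    doc_counts = Counter()  # label -> number of training documents
    vocab = set()
    for text, label in samples:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):  # sorted so ties resolve deterministically
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, samples):
    correct = sum(predict(model, text) == label for text, label in samples)
    return correct / len(samples)


def cross_platform_gap(train, in_domain_test, out_domain_test):
    """Train on one source; report in-domain accuracy, out-of-domain accuracy, and their gap."""
    model = train_nb(train)
    in_acc = accuracy(model, in_domain_test)
    out_acc = accuracy(model, out_domain_test)
    return in_acc, out_acc, in_acc - out_acc


# Invented toy data: "source A" mimics review-style text, "source B" informal posts.
TRAIN_A = [
    ("great book loved it", "pos"),
    ("loved the story great read", "pos"),
    ("terrible plot hated it", "neg"),
    ("hated the ending terrible read", "neg"),
]
TEST_A = [("great story", "pos"), ("terrible ending", "neg")]
TEST_B = [("great wine", "pos"), ("lol this slaps", "pos")]

if __name__ == "__main__":
    in_acc, out_acc, gap = cross_platform_gap(TRAIN_A, TEST_A, TEST_B)
    print(f"in-domain={in_acc:.2f} out-of-domain={out_acc:.2f} gap={gap:.2f}")
```

A positive gap between the two accuracies is one simple signal that a classifier depends on vocabulary specific to its training source. Here the second source B post uses only words absent from the source A training text, so the model has nothing but the class prior to go on.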

Access

Open Access

Recommended Citation

Deshpande, Alpana, "Sentiment Classification Bias In User Generated Content" (2021). Theses - ALL. 478.
https://surface.syr.edu/thesis/478

Included in

Computer Sciences Commons
