Date of Award
5-11-2025
Date Published
June 2025
Degree Type
Thesis
Degree Name
Master of Science (MS)
Department
Information Science & Technology
Advisor(s)
Reza Zafarani
Keywords
Cross-Culture;False Claim;Hate Perception;Hate Speech Detection;LLMs
Subject Categories
Computer Sciences | Physical Sciences and Mathematics
Abstract
Hate is a sentiment, while hate speech refers to the expression of hate in a form that targets and attacks specific groups, such as those defined by race, religion, or gender. With the rise of the internet and social media, hate speech has spread rapidly, gaining wide exposure and posing threats to individual well-being, the profits of major tech companies, and social stability. As a result, both industry and academia have turned their attention to the study of hate speech. One of the most active areas is hate speech detection, which involves training models to predict whether a given piece of content is hateful. However, the choice of models and methods can vary depending on the form in which the hate speech is conveyed.
In this thesis, we address two key issues: 1. In the task of hateful meme classification, many existing approaches focus on stacking model parameters to achieve better performance but lack a deep understanding of how hateful memes are constructed. Furthermore, they have not effectively leveraged large language models (LLMs) for this task. 2. Although the definitions of hate and hate speech are well-established, individuals' perceptions of hate can vary due to differences in cultural background. As a result, judgments about whether a piece of content is hateful may differ from person to person. However, current hate speech detection models typically rely on labels obtained through majority voting, without accounting for the cultural specificity of individual annotators.
To address the first issue, we observe that creators of hateful memes often exaggerate their emotions and reinforce stereotypes, leading to a mismatch between the text and image within the meme. We refer to this phenomenon as a false claim. Based on this insight, we propose FACT (FAlse Claim haTeful meme classification model), a model built upon a large language model (LLM) that identifies false claims in memes to assist the classification process.
To address the second issue, we first evaluate LLMs and find that they cannot effectively utilize cultural background information to support reasoning, whereas historical labels prove useful. Based on this, we hypothesize that an individual's perception of hate is influenced by specific combinations of cultural background factors. To incorporate this insight, we apply matrix factorization techniques from recommender systems to learn interaction features for each cultural background combination. These features are then used to support culture-aware hate speech detection.
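The matrix-factorization idea above can be illustrated with a minimal sketch. This is an assumption-laden toy, not the thesis's actual model: we assume a small annotator-by-post matrix of hate labels (rows indexed by cultural-background combination), with missing entries, and factor it with plain SGD so each row's latent vector serves as an interaction feature for that cultural combination. All names, sizes, and hyperparameters here are hypothetical.

```python
import numpy as np

# Hypothetical toy data: 3 cultural-background combinations x 4 posts.
# Entry (i, j) is whether combination i labeled post j as hateful; np.nan = unobserved.
R = np.array([
    [1.0, 0.0, np.nan, 1.0],
    [np.nan, 0.0, 1.0, 1.0],
    [1.0, np.nan, 1.0, 0.0],
])

rng = np.random.default_rng(0)
k, lr, reg = 2, 0.05, 0.01          # latent dimension, learning rate, L2 weight
U = rng.normal(scale=0.1, size=(R.shape[0], k))  # one latent vector per culture combination
V = rng.normal(scale=0.1, size=(R.shape[1], k))  # one latent vector per post

# SGD over observed entries only, as in recommender-system matrix factorization.
observed = [(i, j) for i in range(R.shape[0])
            for j in range(R.shape[1]) if not np.isnan(R[i, j])]
for _ in range(500):
    for i, j in observed:
        err = R[i, j] - U[i] @ V[j]
        U[i] += lr * (err * V[j] - reg * U[i])
        V[j] += lr * (err * U[i] - reg * V[j])

# pred[i, j] estimates how culture combination i would judge post j;
# U[i] can also be fed as an interaction feature into a downstream detector.
pred = U @ V.T
```

In a culture-aware detector, the learned row vectors `U[i]` would be concatenated with text features, so the classifier conditions its prediction on the annotator's cultural-background combination rather than on a single majority-vote label.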
Access
Open Access
Recommended Citation
Cai, Weibin, "Harnessing LLMs to Detect Hate Speech" (2025). Theses - ALL. 933.
https://surface.syr.edu/thesis/933