AI-powered content moderation has become an essential tool for managing online content. However, when applied to languages like Arabic, with its rich linguistic diversity and cultural nuances, these systems often face serious challenges.
This research paper by Mona Elswah, of Harvard University and the University of Oxford, investigates the biases and limitations of AI content moderation systems in the Arabic context. It scrutinizes the challenges posed by over- and under-moderation, the impact on freedom of expression, and the need for greater transparency and accountability in algorithmic decision-making.
Summary of the Research Paper on AI Content Moderation in Arabic
The paper explores the challenges and biases inherent in AI-powered content moderation systems, focusing specifically on Arabic-language content on platforms like Facebook. Key findings and conclusions include:
Inconsistent Moderation and Algorithmic Biases
Dialectal Bias: The AI models used for moderation often exhibit bias towards specific Arabic dialects, such as over-moderating content in Syrian or Palestinian Arabic while under-moderating content in Maghrebi Arabic.
💡 This suggests that the training data for these models is not representative of the full range of Arabic dialects.
Source Bias: The AI systems tend to over-moderate content from new or unverified pages, while allowing content from established pages with large followings to pass through with fewer restrictions.
💡 This creates a disparity in treatment based on the source of the content.
Contextual Bias: The AI models struggle to understand the nuances of the Arabic language and its cultural contexts, leading to over- or under-moderation based on misinterpretations of context.
📌 For example, names or terms that are common in Arabic-speaking countries might be flagged as extremist or harmful due to their association with specific groups or events.
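One way to make the dialectal-bias finding above concrete is to measure it: compare the moderation model's false-positive rate (benign posts wrongly flagged) across dialect groups. The following is a minimal audit sketch; the dialect labels, the sample records, and the `false_positive_rate` helper are illustrative assumptions, not data or code from the paper:

```python
# Hypothetical bias-audit sketch. Each record is
# (dialect, model_flagged, actually_harmful); the values below are
# fabricated for illustration only.
records = [
    ("levantine", True,  False), ("levantine", True,  False),
    ("levantine", True,  True),  ("levantine", False, False),
    ("maghrebi",  False, False), ("maghrebi",  False, True),
    ("maghrebi",  True,  True),  ("maghrebi",  False, False),
]

def false_positive_rate(rows):
    """Share of benign posts that the model wrongly flagged."""
    benign = [flagged for _, flagged, harmful in rows if not harmful]
    return sum(benign) / len(benign) if benign else 0.0

# Group the audit records by dialect and report per-dialect rates.
by_dialect = {}
for row in records:
    by_dialect.setdefault(row[0], []).append(row)

for dialect, rows in sorted(by_dialect.items()):
    print(dialect, round(false_positive_rate(rows), 2))
```

A large gap between the per-dialect rates (here the fabricated Levantine rows are flagged far more often than the Maghrebi ones) is the kind of disparity the paper attributes to unrepresentative training data.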
Lack of Transparency and Accountability
The algorithms used for content moderation are often shrouded in secrecy, making it difficult for users, researchers, and civil society organizations to understand how decisions are made. This lack of transparency hinders efforts to identify and address biases in the system.
Despite efforts to increase transparency, such as the establishment of oversight boards by platforms like Facebook, these bodies often have limited power to compel the company to make changes to its algorithms or policies.
Consequences for Users and Civil Society
The over-moderation of Arabic content can have a chilling effect on freedom of expression, particularly for activists and journalists who rely on these platforms to share information and ideas.
The inconsistent and biased application of content moderation rules erodes trust in social media platforms among Arabic-speaking users.
The over-reliance on AI for content moderation has not been effective in preventing the spread of harmful content, as bad actors have found ways to evade detection.
Research Recommendations
Civil society members are frustrated that technology companies are not implementing their recommendations, which may discourage them from participating in the future.
Companies should inform civil society about the decisions made based on their feedback to increase trust and foster participation.
There is an urgent need to increase the number of local experts who are proficient in Arabic and its various dialects to improve the performance of AI models.
The current appeal process is inefficient, as users face significant difficulties in recovering content that has been mistakenly removed, leading to a loss of trust in the platforms.
A fast and effective appeal process should be developed to address the issue of content being mistakenly removed and improve the performance of algorithms.
In conclusion, the research highlights the urgent need for tech companies to address the biases and shortcomings of their AI-powered content moderation systems, particularly when it comes to Arabic language content. By taking the steps outlined above, platforms can create a more equitable and inclusive online environment for all users.
For more details, you can read the research paper:
Does AI Understand Arabic? Evaluating The Politics Behind the Algorithmic Arabic Content Moderation. 5 Feb 2024
In testimony for Disinformation Nation: Social Media’s Role in Promoting Extremism and Misinformation, 117th Cong. (2021), Mark Zuckerberg of Facebook stated:
More than 95% of the hate speech that we take down is done by an AI [artificial intelligence] and not by a person. . . . And I think it’s 98 or 99% of the terrorist content that we take down is identified by an AI and not a person.