Are you sure?
That’s the name of Tinder’s newest AI safety feature, and the question it poses before users send a potentially offensive message, prompting them to reconsider before saying something hateful in conversation.
Case study: In early tests, the feature cut the number of inappropriate messages sent by more than 10%, according to Tinder, and users who saw the prompt were less likely to be reported for inappropriate messages in subsequent weeks, which Tinder views as a sign of longer-term behavioral change.
Tinder isn’t the first platform to lean on AI for content moderation. Last year, Twitter started testing a feature that asked users to think twice before posting potentially offensive replies, and in 2019, Instagram began warning users about captions that might violate its terms.
The flip side: AI for content moderation, in particular models trained to flag hate speech, can disproportionately flag posts written by Black users, amplifying racial bias.
- One study found that tweets written by African Americans were 1.5x more likely than other tweets to be categorized as hateful or offensive. AI models were also 2.2x more likely to flag tweets written in African American English dialect.—HF