Twitter is updating its “be nice, think twice” system that prompts users to reconsider when they’re about to tweet a “potentially harmful or offensive” reply. The upgraded feature is now better at spotting “strong language,” claims Twitter; is more aware of vocabulary that has been “reclaimed by underrepresented communities” and is used in non-harmful ways; and also now takes into account your relationship with the person you’re messaging.
In other words, if you’re tweeting at a mutual who you interact with regularly, Twitter will assume “there’s a higher likelihood [you] have a better understanding of preferred tone of communication” and not show you a prompt. So, you can call your friend a **** or a *–* or even a ***– son of a ****-less ***** and Twitter won’t care. That’s freedom, folks.
Twitter first started testing this system in May 2020, paused it a little later, then brought it back to life in February this year. It’s one of a number of prompts the company has been testing to try to shape user behavior, including its “read before you retweet” message.
Improvements to the offensive-tweets prompt will roll out to English-language users of the Twitter iOS app today and to Android users “in the next few days.” The company says the prompt is already making a difference in how people interact on the platform, though.
Twitter claims internal tests show 34 percent of people who were served such a prompt “revised their initial reply or decided to not send their reply at all.” After receiving such a prompt once, people composed, on average, 11 percent “fewer offensive replies.” And people who were prompted about a reply (and therefore may have toned down their language) were themselves “less likely to receive offensive and harmful replies back.”
These “statistics” are as opaque as you would expect from any major internet platform. (How exactly has the company quantified “less likely” in that last example? How many people were included in any of these tests? How do we know that people who revised their reply made it less offensive, rather than just switching to offensive language the system didn’t recognize?) But the ongoing rollout does suggest that the feature is, at least, not making things actively worse on Twitter. That’s probably the best we can hope for.