Moderator Bots: Baby Steps

 

In a world of ever-growing conversations and large forums, moderating manpower is in high demand. Websites turn to bots. Is that really the best idea?

 

Children’s MMOs And Overzealous Bots

 

Poorly configured bots will spot curse words hiding inside other, innocent words – the classic ‘Scunthorpe problem’ – so careful configuration is especially important to prevent kids from reverse-discovering a curse word. Kids’ games with open chat are notorious for this issue, even though their bot moderation should get more attention and care than anywhere else. And that’s exactly the problem: the people programming auto-moderator bots go to extreme lengths to protect these children, telling their bots ‘no exceptions. None.’ Context doesn’t matter: if the bot sees a combination of letters that adds up to a curse word, it has to be removed before other children see it. This, however, causes problems.

If someone tries to type ‘assess the situation’, they may end up with a message that says ‘***ess the situation’. Kids can then confirm or deny words their friends told them were curse words just by bouncing them off the chat filter. Children may be naïve, but they aren’t stupid!
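
To see why this happens, here’s a minimal sketch of a context-blind filter. Everything here is hypothetical – the word list, the function name – and a real bot’s list would be far longer, but the failure mode is the same:

```python
import re

# Hypothetical mini blocklist; real deployments carry hundreds of entries.
BANNED = ["ass"]

def naive_censor(message: str) -> str:
    """Star out every banned string, even inside innocent words."""
    for word in BANNED:
        message = re.sub(re.escape(word), "*" * len(word),
                         message, flags=re.IGNORECASE)
    return message

print(naive_censor("assess the situation"))  # '***ess the situation'
print(naive_censor("class dismissed"))       # 'cl*** dismissed'
```

Matching whole words only would fix ‘assess’ – but, as the next paragraph shows, players don’t always type whole words.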

Later on, moderator bots were also trained to spot curse words separated by spaces l i k e t h i s. This isn’t a bad idea – it just has to be configured more delicately. People will do their best to worm around content filters, and if spaces work, they’ll use spaces to curse out other players. The problem is that a machine collapsing the spaces frequently doesn’t understand the context of the surrounding letters, so one little kid’s typo turns ‘Aya Ssmells weird’ into ‘Ay* **mells weird’.
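
A sketch of that failure mode, under the assumption that the filter simply strips whitespace before matching (names and list hypothetical again):

```python
BANNED = ["ass"]

def despaced_match(message: str) -> bool:
    """Collapse all whitespace and scan the result, so 'a s s' is caught --
    along with any innocent pair of words whose letters happen to join up."""
    collapsed = "".join(message.split()).lower()
    return any(word in collapsed for word in BANNED)

print(despaced_match("a s s"))              # True: the evasion is caught
print(despaced_match("Aya Ssmells weird"))  # True: 'ayassmellsweird' matches too
```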

The irony of all of this is that it creates a reverse censoring effect – clean words seem dirty because the bot has censored them: words like ‘assassinate’ or ‘scattered’, things kids might actually use in a game. Under this system, typos turn into a fount of forbidden knowledge. People will always worm around bot moderators, but – especially on children’s forums – it’s important that the bot understands at least a little context. If it can’t, a human teammate is necessary to whitelist weird word combinations as they appear, as in the sketch below.
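
One way to implement that human-maintained escape hatch is a whitelist checked before the blocklist. This is only a sketch, with placeholder lists and names:

```python
BANNED = {"ass", "scat"}
# Maintained by a human moderator as false positives get reported.
WHITELIST = {"assess", "assassinate", "scattered", "class"}

def censor_with_whitelist(message: str) -> str:
    """Star out tokens containing banned strings, unless a human cleared them."""
    censored = []
    for token in message.split():
        lower = token.lower()
        if lower not in WHITELIST and any(w in lower for w in BANNED):
            censored.append("*" * len(token))
        else:
            censored.append(token)
    return " ".join(censored)

print(censor_with_whitelist("scattered loot everywhere"))  # left intact
print(censor_with_whitelist("you ass"))                    # 'you ***'
```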

 

Paleontology and Oversized Profanity Libraries

 

There are many bones. And if you were going to single out one specific bone (in the context of paleontology) just to cause problems, which would you pick? At a 2020 virtual paleontology conference, the censor library built into the Q&A software picked the pubic bone, alongside a host of other totally normal words like ‘stream’ and ‘crack’. There were real curse words in the library too, but, as at most professional conferences, those appeared far less often than the words being used in completely scientific contexts.

As in the children’s MMO example, saying ‘the bone was found in a stream’ wasn’t an innuendo until the censor library did the equivalent of adding a flirty wink emoji to the end of it. Tone can’t be conveyed over text except by word choice, so when the computer singles out one definition of ‘stream’ and applies it to every use, the computer is what makes it a dirty word. And beyond the words with no connection to actual profanity, pubic bones genuinely come up a lot when talking about fossils, because they provide information about how a fossilized animal walked. The pubic bone is the ‘front’ bone in the pelvis: two-legged animals have a differently shaped one than four-legged ones, and animals that walk fully upright, like humans, have differently shaped ones than animals that ‘lean forward’, like birds.

Why make a moderation bot too strict to hold a conversation around? The organizers didn’t make the bot! The conference was using a pre-made program that shipped with its own profanity library. Buying software with censorship already baked in sounds like a great idea: applied correctly, it saves everyone time and keeps profanity from appearing where it shouldn’t, even anonymously. However, ask two people what counts as profanity and you’ll get two different answers. Everyone has a different threshold for professional language, so it’s better to build a library of the ‘obvious’ words and go from there based on the event, as sketched below. The best censoring software is the kind you don’t have to use: professional events are better off stating their expectations up front than frustrating their attendees with software that causes more harm than good.
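
As a sketch of that ‘obvious core, then adjust per event’ approach – the lists here are placeholders, not a real profanity library:

```python
# Placeholder core of unambiguous terms everyone agrees on.
CORE_BANNED = {"<slur-1>", "<slur-2>"}

def build_blocklist(extra=(), allow=()):
    """Start from the obvious core, add event-specific terms,
    and drop anything the organizers explicitly permit."""
    return (CORE_BANNED | set(extra)) - set(allow)

# A paleontology conference never adds anatomy to the core in the
# first place, so 'pubic', 'bone', and 'stream' stay sayable.
paleo_list = build_blocklist()
assert "stream" not in paleo_list
```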

 

Weaponizing Profanity Filters

 

Twitter had a bit of a kerfuffle involving the city of Memphis: people who used the word ‘Memphis’ in a tweet got a temporary ban. Once word got around, a rash of users baiting each other into tweeting ‘Memphis’ followed. The Memphis ban was the result of a bug, but the incident itself highlights a real issue with profanity filters: it’s possible to bait people into using banned words, especially words that aren’t profane in and of themselves.

For example, some online games will filter out the names of the countries Niger and Nigeria, to prevent misspellings of a racial slur from evading a deserved ban. Why would North Americans ever be discussing African countries over a game set in Russia, after all? But by including them, the developers created a way to troll other players without saying anything profane in context: bait another user into answering questions about those countries, and they get banned, not the question-asker. The person who answered now has to contact the human support line to get unbanned, or wait out their timeout – annoying and inconvenient either way. The anti-profanity filter has been weaponized!

Building a positive culture around a game takes a lot of effort, and profanity filters are an integral part of keeping arsonists and trolls out. Nobody should feel targeted in game chat for reasons outside the game. However, just like with every example mentioned here, humans should be on call to un-ban and un-block users who were genuinely attempting to answer a question. Err on the side of caution, both with the software and customer support.

 

Are Bots a Cure?

 

Short answer: no. Most good moderation teams keep at least one human around in case the bot screws up – preferably one who can respond to ‘deleted comment’ or ‘banned user’ complaints right away. Better yet, if the bots are configured well, they won’t be jumping the gun often enough to need a whole team!

It’s just very difficult to build a bot that understands people well enough to catch every instance of bad language.

If you’re running a forum and you don’t want people using profanity, you’ll censor the profane words. A bot can do that. But then there are things like leetspeak, where users will spell the colloquial name for a donkey with two fives in place of the ‘s’s. Do you ban that too? Sure, you can add it to the bot’s library. But then they change the ‘a’ to a ‘4’. Do you censor that too? If you do, people will keep pushing to figure out what is and isn’t acceptable to your bots, and they will. Not. Stop.
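
The usual counter-move is to normalize common character swaps before matching. Here’s a sketch of that arms race, with an intentionally incomplete substitution table (and, again, hypothetical names):

```python
# Illustrative substitution table; by design it can never be complete.
LEET_MAP = str.maketrans({"4": "a", "3": "e", "1": "i", "0": "o",
                          "5": "s", "7": "t", "$": "s", "@": "a"})
BANNED = ["ass"]

def is_profane(message: str) -> bool:
    """Undo common digit/symbol substitutions, then scan for banned strings."""
    normalized = message.lower().translate(LEET_MAP)
    return any(word in normalized for word in BANNED)

print(is_profane("a55"))  # True: two fives, caught
print(is_profane("455"))  # True: the '4' swap is caught too
# Swap the Latin 'a' for a visually identical Cyrillic one, though,
# and the message sails through -- there is always a next variant.
```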

And then there are soundalikes! TikTok, a popular video app, has a fairly robust profanity filter for text, and videos containing curse words and sensitive topics are noticeably less popular than ones without them, thanks to TikTok’s algorithm. But people making videos on sensitive topics use phrases like ‘Sewer Slide’ and ‘Home of Phobia’ to evade the bots. The bots, then, haven’t stopped anything: these conversations will happen no matter what TikTok’s moderators want, and banning the word ‘sewer’ only displaces the problem. If you don’t want users discussing these things on your site, you’ll need human moderators at some point.

Language is dynamic, and bots simply can’t keep up. It takes real people to study language – why wouldn’t it take real people to moderate it online?

Sources:

https://www.theguardian.com/science/2020/oct/16/profanity-filter-bones-paleontologists-conference

https://www.brennancenter.org/sites/default/files/2019-08/Report_Internet-Filters-2nd-edition.pdf

https://blog.twitter.com/en_us/topics/company/2019/hatefulconductupdate.html

https://www.engadget.com/twitter-bug-memphis-ban-133327641.html

https://www.theguardian.com/technology/2021/mar/15/twitter-accidentally-blocks-users-who-post-the-word-memphis