Remove Accents from Characters in Go

Users are pretty sneaky. They will try dozens of ways to exploit your application and make it do things it was never meant to do. Replacing letters with their accented versions is a prime example of how to slip past a basic profanity filter. Luckily, you can prevent that by removing all accents from a string before comparing it.

While working on a Discord bot in Go, mostly for the sake of learning Go, I had to write a profanity filter, because I don’t want users making the bot do problematic things. Rather than including the profanity filter in the bot’s source, I decided to make it a separate library, to avoid the overhead it would create if implemented directly in the bot. If you’re curious, the profanity filter library is called go-away, which I thought was a pretty clever name, probably.

Microservices are easier to manage than huge, obnoxious services, which is why decoupling was a great idea in this case. Not to mention that a profanity filter has many uses unrelated to bots.

Anyway, users could get around the profanity filter by simply replacing letters with numbers (13375p34k), or, even more easily, by putting accents on some letters. The string needed to be sanitized as much as possible before being checked for profanities, and the following function is what took care of removing accents from characters:

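import (
	"unicode"

	"golang.org/x/text/runes"
	"golang.org/x/text/transform"
	"golang.org/x/text/unicode/norm"
)

// removeAccents strips combining diacritical marks from s, so "é" becomes "e".
func removeAccents(s string) string {
	// NFD splits each accented letter into its base letter plus combining marks,
	// the filter drops the marks (Unicode category Mn), and NFC recomposes the rest.
	t := transform.Chain(norm.NFD, runes.Remove(runes.In(unicode.Mn)), norm.NFC)
	output, _, err := transform.String(t, s)
	if err != nil {
		panic(err)
	}
	return output
}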
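
To see why the NFD step matters, it may help to look at the mark-removal step in isolation. The sketch below uses only the standard library and a hypothetical helper named stripMarks (it is not part of go-away or of golang.org/x/text). It drops every rune in the Mn category, but does no normalization, so it only works on text that is already decomposed:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripMarks is a hypothetical helper for illustration only. It drops every
// rune in the Unicode Mn (nonspacing mark) category and does NOT normalize.
func stripMarks(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.Is(unicode.Mn, r) {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, s)
}

func main() {
	decomposed := "cafe\u0301" // 'e' followed by U+0301 COMBINING ACUTE ACCENT
	precomposed := "caf\u00e9" // the single code point U+00E9

	// The combining accent is a separate rune, so it gets removed.
	fmt.Println(stripMarks(decomposed) == "cafe")
	// The precomposed letter is one rune that is not in Mn, so nothing changes.
	fmt.Println(stripMarks(precomposed) == precomposed)
}
```

Both comparisons print true: the decomposed string loses its accent, while the precomposed one passes through untouched. That is the gap norm.NFD closes in removeAccents, since it turns the second form into the first before the marks are filtered out. Note that letters with no canonical decomposition, such as "ø" or "ł", are left as-is by this approach, so a filter that cares about them would need an extra mapping.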