
AWS fine-tunes BERTweet LLM to classify toxic speech for a large gaming company

A large gaming company needed to automate detection of toxic speech in player interactions but lacked sufficient labeled training data and had cost and time constraints that made training a custom language model from scratch unviable.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Voice/text excerpt submitted

Voice and text excerpts from player interactions are submitted for toxic language detection.

Tools used

Amazon SageMaker, BERTweet, bertweet-base-offensive, bertweet-base-hate, Hugging Face, RoBERTa

Outcome

AWS ProServe MLDT productionized a single-stage bertweet-base-offensive model that met the customer's accuracy threshold while improving ease of maintenance and lowering cost; precision decreased by only 3% compared to the two-stage approach.
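The summary does not say whether the 3% precision decrease is an absolute drop (percentage points) or a relative one. The sketch below shows how the two readings differ. The counts are made up for illustration and are not results from the source; the function and variable names are likewise hypothetical.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of items flagged as toxic that really are toxic."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no positive predictions; precision is undefined")
    return true_positives / flagged


# Made-up counts for illustration only (NOT figures from the source).
two_stage_precision = precision(true_positives=90, false_positives=10)
single_stage_precision = precision(true_positives=87, false_positives=13)

# Reading 1: absolute drop, in percentage points.
absolute_drop_points = (two_stage_precision - single_stage_precision) * 100

# Reading 2: relative drop, as a percentage of the two-stage precision.
relative_drop_percent = (
    (two_stage_precision - single_stage_precision) / two_stage_precision * 100
)

print(f"absolute drop: {absolute_drop_points:.2f} points")
print(f"relative drop: {relative_drop_percent:.2f} %")
```

With these illustrative counts the two readings differ slightly, which is why a reported precision change is easier to compare when the baseline value is stated alongside it.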

What failed first

The PoC two-stage model architecture required double the model monitoring, increased costs from running two models, and slower inference speed, prompting a redesign to a single-stage model before production.
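The trade-off described above can be sketched with stand-in classifiers. This is a minimal illustration, not the AWS implementation: the keyword rules replace real fine-tuned models, and the split of work between the two stages (a toxic/acceptable detector followed by a category model) is an assumption, since this summary does not describe what each stage did. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CountingModel:
    """Wraps a classifier and counts calls as a crude proxy for inference cost."""
    name: str
    predict_fn: Callable[[str], str]
    calls: int = 0

    def predict(self, text: str) -> str:
        self.calls += 1
        return self.predict_fn(text)


# Toy keyword rules standing in for fine-tuned models (assumption).
TOXIC_WORDS = ("idiot", "loser")


def _is_toxic(text: str) -> bool:
    return any(word in text.lower() for word in TOXIC_WORDS)


def toy_detector(text: str) -> str:
    return "toxic" if _is_toxic(text) else "acceptable"


def toy_categorizer(text: str) -> str:
    return "insult"  # a real second stage would choose among several categories


def toy_single(text: str) -> str:
    return "insult" if _is_toxic(text) else "acceptable"


def two_stage(text: str, detector: CountingModel, categorizer: CountingModel) -> str:
    """Stage 1 filters; stage 2 runs only on text flagged as toxic."""
    if detector.predict(text) == "acceptable":
        return "acceptable"
    return categorizer.predict(text)


def single_stage(text: str, model: CountingModel) -> str:
    """One model returns the final label directly."""
    return model.predict(text)


messages = ["good game", "you idiot", "nice shot"]
detector = CountingModel("detector", toy_detector)
categorizer = CountingModel("categorizer", toy_categorizer)
combined = CountingModel("combined", toy_single)

two_stage_labels = [two_stage(m, detector, categorizer) for m in messages]
single_stage_labels = [single_stage(m, combined) for m in messages]
two_stage_calls = detector.calls + categorizer.calls

print("two-stage:", two_stage_labels, "calls:", two_stage_calls, "models to monitor: 2")
print("single-stage:", single_stage_labels, "calls:", combined.calls, "models to monitor: 1")
```

In this toy setup both designs return the same labels, but the two-stage design keeps two models deployed and monitored and adds a second sequential call for every flagged message, which matches the monitoring, cost, and latency concerns listed above.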

Results

Volume.92

Source

https://aws.amazon.com/blogs/machine-learning/aws-performs-fine-tuning-on-a-large-language-model-llm-to-classify-toxic-speech-for-a-large-gaming-company?tag=soumet-20


Grounding & classification

Source type: technical build writeup

30 fields verified against source quotes.

document classification, call recording, chat transcript, failure mode described, metric backed, production runtime claimed, tools described, workflow described, media, accuracy improvement, automation rate, cost reduction, technical build writeup, compliance monitoring, quality assurance, extract classify route, monitor detect alert