compliance_monitoring · media · workflow
AWS fine-tunes BERTweet LLM to classify toxic speech for a large gaming company
A large gaming company needed to automate detection of toxic speech in player interactions but lacked sufficient labeled training data and had cost and time constraints that made training a custom language model from scratch unviable.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Voice/text excerpt submitted
Voice and text excerpts from player interactions are submitted for toxic language detection.
Tools used
Amazon SageMaker, BERTweet, bertweet-base-offensive, bertweet-base-hate, Hugging Face, RoBERTa
Outcome
AWS ProServe MLDT productionized a single-stage bertweet-base-offensive model that met the customer's accuracy threshold while improving ease of maintenance and lowering cost; precision decreased by only 3% compared to the two-stage approach.
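To make the precision comparison concrete, here is a minimal sketch of how precision for a toxic-speech classifier could be computed and compared between two model variants. The labels and predictions are made up for illustration and are not data from the source. The source also does not say whether the 3% figure is an absolute or a relative difference, so the sketch computes both.

```python
from typing import Sequence


def precision(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Share of excerpts flagged toxic that really are toxic: TP / (TP + FP)."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    flagged = true_pos + false_pos
    # With nothing flagged, precision is undefined; this sketch reports 0.0.
    return true_pos / flagged if flagged else 0.0


# Made-up labels for eight excerpts (True = toxic). Not data from the source.
actual = [True, True, True, True, False, False, False, False]
two_stage_flags = [True, True, True, False, False, False, False, False]
single_stage_flags = [True, True, True, True, True, False, False, False]

p_two = precision(two_stage_flags, actual)     # 3 true positives, 0 false positives
p_one = precision(single_stage_flags, actual)  # 4 true positives, 1 false positive

absolute_drop = p_two - p_one          # difference between the two precision values
relative_drop = absolute_drop / p_two  # same difference as a fraction of the baseline

print(f"two-stage={p_two:.2f} single-stage={p_one:.2f} "
      f"absolute drop={absolute_drop:.2f} relative drop={relative_drop:.2f}")
```

In a real evaluation the two prediction lists would come from running each model variant over the same held-out, human-labeled excerpts.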
What failed first
The PoC's two-stage model architecture doubled the model-monitoring burden, raised costs by running two models, and slowed inference, prompting a redesign to a single-stage model before production.
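The trade-off between the two designs can be sketched as follows. This is an illustration under stated assumptions, not the customer's implementation. The source does not describe what each stage does internally, so the sketch assumes a cascade in which the second model re-checks only the excerpts the first one flags. `keyword_scorer` is a hypothetical stand-in for a fine-tuned model such as bertweet-base-offensive, and the threshold and keywords are invented.

```python
from dataclasses import dataclass
from typing import Callable, List

# A classifier maps an excerpt to a toxicity score in [0, 1].
Classifier = Callable[[str], float]


@dataclass
class Verdict:
    text: str
    toxic: bool
    model_calls: int  # number of model invocations this excerpt cost


def keyword_scorer(keywords: List[str]) -> Classifier:
    """Hypothetical stand-in for a fine-tuned model: 1.0 on any keyword hit."""
    def score(text: str) -> float:
        lowered = text.lower()
        return 1.0 if any(k in lowered for k in keywords) else 0.0
    return score


def two_stage(text: str, first: Classifier, second: Classifier,
              threshold: float = 0.5) -> Verdict:
    """Assumed cascade: the second model runs only on excerpts the first flags."""
    if first(text) < threshold:
        return Verdict(text, False, model_calls=1)
    return Verdict(text, second(text) >= threshold, model_calls=2)


def single_stage(text: str, model: Classifier, threshold: float = 0.5) -> Verdict:
    """One model, one call per excerpt, one endpoint to monitor."""
    return Verdict(text, model(text) >= threshold, model_calls=1)


# Invented excerpts and keyword lists, for illustration only.
stage_one = keyword_scorer(["idiot", "trash"])
stage_two = keyword_scorer(["idiot"])
excerpts = ["gg well played", "you idiot", "trash team"]

cascade = [two_stage(t, stage_one, stage_two) for t in excerpts]
single = [single_stage(t, stage_one) for t in excerpts]

print("two-stage calls:", sum(v.model_calls for v in cascade))
print("single-stage calls:", sum(v.model_calls for v in single))
```

In this toy run the cascade spends an extra model call on every flagged excerpt and rejects one excerpt that the single model flags. That mirrors the general shape of the trade-off: the second stage can buy some precision, but it adds inference work and a second model to monitor.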
Results
Volume.92
Grounding & classification
Source type: technical build writeup
30 fields verified against source quotes.
document classification, call recording, chat transcript, failure mode described, metric backed, production runtime claimed, tools described, workflow described, media, accuracy improvement, automation rate, cost reduction, technical build writeup, compliance monitoring, quality assurance, extract classify route, monitor detect alert