compliance_monitoring · saas · workflow

GitHub builds Copilot secret scanning to detect leaked passwords with AI

Regular expressions, while effective for detecting provider-formatted secrets, could not handle the nuanced and varied structures of generic passwords, generating excessive noise for security teams and developers.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases — not tied to any one vendor's stack. Click any stage to read what happens there. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Git push or history scan
Secret scanning triggers on incoming Git pushes and also scans the entire Git history on all branches.
Tools used
GitHub CopilotGPT-3.5-TurboGPT-4GPT-4-TurboGPT-4o-miniMetaReflection
Outcome

After iterative improvements, Copilot secret scanning reached general availability in October 2024, achieving up to a 94% reduction in false positives in some organizations, and is now detecting passwords on nearly 35% of all GitHub Secret Protection repositories.

What failed first

An early iteration using GPT-3.5-Turbo with few-shot prompting worked in offline evaluation but failed for unconventional file types and structures encountered in actual customer repositories.

Results
Volume94%
Running sinceOctober 2024
Source

https://github.blog/engineering/platform-security/finding-leaked-passwords-with-ai-how-we-built-copilot-secret-scanning/

How we source this →

Grounding & classification
Source type: technical build writeup
25 fields verified against source quotes.
anomaly detectiondocument classificationcode diff prfailure mode describedmetric backednamed customerproduction runtime claimedtools describedworkflow describedsoftwareaccuracy improvementerror reductiontechnical build writeupcompliance monitoringquality assurancemonitor detect alert