compliance_monitoring · saas · workflow

GitHub builds Copilot secret scanning to detect leaked passwords with AI

Regular expressions, while effective for detecting provider-formatted secrets, could not handle the nuanced and varied structures of generic passwords, generating excessive noise for security teams and developers.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases — not tied to any one vendor's stack. Click any stage to read what happens there. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Git push or history scan

Secret scanning triggers on incoming Git pushes and also scans the entire Git history on all branches.

Tools used

GitHub CopilotGPT-3.5-TurboGPT-4GPT-4-TurboGPT-4o-miniMetaReflection

Outcome

After iterative improvements, Copilot secret scanning reached general availability in October 2024, achieving up to a 94% reduction in false positives in some organizations, and is now detecting passwords on nearly 35% of all GitHub Secret Protection repositories.

What failed first

An early iteration using GPT-3.5-Turbo with few-shot prompting worked in offline evaluation but failed for unconventional file types and structures encountered in actual customer repositories.

Results

Volume94%

Running sinceOctober 2024

Source

https://github.blog/engineering/platform-security/finding-leaked-passwords-with-ai-how-we-built-copilot-secret-scanning/

How we source this →

Grounding & classification

Source type: technical build writeup

25 fields verified against source quotes.

anomaly detectiondocument classificationcode diff prfailure mode describedmetric backednamed customerproduction runtime claimedtools describedworkflow describedsoftwareaccuracy improvementerror reductiontechnical build writeupcompliance monitoringquality assurancemonitor detect alert