back_office_ops · workflow

Dropbox uses ML model to identify date formats in file names for automated naming conventions

Dropbox's naming conventions feature needed to detect dates already present in file names before renaming them, but the wide variety of date formats, abbreviations, and inconsistent separators across files made reliable date identification very difficult.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases — not tied to any one vendor's stack. Click any stage to read what happens there. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · File upload triggers renaming
Files uploaded to a specific folder are automatically flagged for renaming to match the preferred naming convention.
Tools used
DistilRobertaSentencePieceDoccano
Outcome

The ML model achieved a 40% increase in renamed files over the rule-based baseline, and following rollout in August 2022 naming conventions were applied to more than one million files in the feature's first few weeks.

What failed first

A rule-based approach to date identification was tried first but could not handle the breadth of formats encountered at Dropbox's scale without requiring impractical enumeration of every possible format.

Results
Time savedmore than one million
Volume40%
Running sinceAugust 2022
Source

https://dropbox.tech/machine-learning/using-ml-to-identify-date-formats-in-file-names

How we source this →

Grounding & classification
Source type: technical build writeup
20 fields verified against source quotes.
data extractiondocument classificationbuilder submittedfailure mode describedmetric backednamed customerproduction runtime claimedtools describedworkflow describedsoftwareautomation ratethroughput increasetechnical build writeupback office opsextract classify route