Classifier Format
The classifier CSV should follow this format:
file_name,mime_label,config_type
Cargo.toml,toml,ci_cd
.github/workflows/ci.yml,yaml,ci_cd
.dockerignore,text,non_config
- file_name: file to match against
- mime_label: MIME label from a scanner
- config_type: either ci_cd or non_config
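As a rough illustration of the format, the sketch below parses classifier CSV text with Python's standard csv module and rejects rows whose config_type is not one of the two documented values. The function name load_classifier_rows and the constants are hypothetical, not part of this project, and the strict header check is an assumption about how rigid the format is meant to be.

```python
import csv
import io

# Assumption: only the two values named in the field description are valid.
ALLOWED_CONFIG_TYPES = {"ci_cd", "non_config"}
# Assumption: the header must match the documented column order exactly.
EXPECTED_HEADER = ["file_name", "mime_label", "config_type"]


def load_classifier_rows(text):
    """Parse classifier CSV text into a list of row dicts.

    Raises ValueError on an unexpected header or an unknown config_type.
    """
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if row["config_type"] not in ALLOWED_CONFIG_TYPES:
            raise ValueError(
                f"line {line_no}: bad config_type {row['config_type']!r}"
            )
        rows.append(row)
    return rows


SAMPLE = """file_name,mime_label,config_type
Cargo.toml,toml,ci_cd
.github/workflows/ci.yml,yaml,ci_cd
.dockerignore,text,non_config
"""

rows = load_classifier_rows(SAMPLE)
for row in rows:
    print(row["file_name"], row["mime_label"], row["config_type"])
```

Failing early on an unknown config_type keeps typos such as ci-cd from silently becoming a third class in the dataset.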
Tips:
- Avoid duplicate file names unless necessary
- Normalize paths (e.g. .github/workflows/*.yml)
- Keep MIME labels lowercase and simplified
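Two of these tips can be checked mechanically. The following is a minimal sketch, assuming rows are available as (file_name, mime_label, config_type) tuples; lint_rows is a hypothetical helper, not something this project ships, and it only reports duplicate file names and MIME labels that are not lowercase. Path normalization is left out because the matching rules are not specified here.

```python
from collections import Counter


def lint_rows(rows):
    """Return warning strings for rows that go against the tips.

    rows: iterable of (file_name, mime_label, config_type) tuples.
    """
    rows = list(rows)
    warnings = []

    # Tip: avoid duplicate file names unless necessary.
    counts = Counter(file_name for file_name, _, _ in rows)
    for file_name, count in counts.items():
        if count > 1:
            warnings.append(f"duplicate file_name: {file_name} ({count} rows)")

    # Tip: keep MIME labels lowercase.
    for file_name, mime_label, _ in rows:
        if mime_label != mime_label.lower():
            warnings.append(
                f"mime_label not lowercase: {mime_label} ({file_name})"
            )
    return warnings


example_rows = [
    ("Cargo.toml", "toml", "ci_cd"),
    ("Cargo.toml", "TOML", "ci_cd"),
    (".dockerignore", "text", "non_config"),
]

warnings = lint_rows(example_rows)
for message in warnings:
    print(message)
```

The helper returns warnings rather than raising, since the tips allow duplicates when they are necessary.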
The CSV is extensible. The more diverse your dataset, the more robust your classification becomes.