Classifier Format
The classifier CSV should follow this format:
file_name,mime_label,config_type
Cargo.toml,toml,ci_cd
.github/workflows/ci.yml,yaml,ci_cd
.dockerignore,text,non_config
- file_name: file to match against
- mime_label: MIME label from a scanner
- config_type: either ci_cd or non_config
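As a rough illustration of the format above, the following Python sketch parses classifier CSV text and rejects rows whose config_type is not one of the two documented values. The function name `load_classifier_rows` and the strict header check are assumptions made for this example, not part of any existing tool.

```python
import csv
import io

# The two config_type values named in the format description.
ALLOWED_CONFIG_TYPES = {"ci_cd", "non_config"}
EXPECTED_HEADER = ["file_name", "mime_label", "config_type"]


def load_classifier_rows(csv_text):
    """Parse classifier CSV text into a list of dicts (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    # Data rows start on line 2, after the header line.
    for line_no, row in enumerate(reader, start=2):
        if row["config_type"] not in ALLOWED_CONFIG_TYPES:
            raise ValueError(
                f"line {line_no}: unknown config_type {row['config_type']!r}"
            )
        rows.append(row)
    return rows


SAMPLE = """file_name,mime_label,config_type
Cargo.toml,toml,ci_cd
.github/workflows/ci.yml,yaml,ci_cd
.dockerignore,text,non_config
"""

rows = load_classifier_rows(SAMPLE)
for row in rows:
    print(row["file_name"], row["mime_label"], row["config_type"])
```

Failing fast on an unknown config_type keeps typos such as `cicd` from silently entering the dataset.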
Tips:
- Avoid duplicate file names unless necessary
- Normalize paths (e.g. .github/workflows/*.yml)
- Keep MIME labels lowercase and simplified
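The tips above could be checked automatically with something like the sketch below. It is only an illustration: `lint_rows` is a hypothetical helper, and "normalized" is assumed here to mean forward slashes with no leading `./`, which the tips do not spell out.

```python
from collections import Counter


def lint_rows(rows):
    """Return warnings for a list of (file_name, mime_label) pairs."""
    warnings = []

    # Tip 1: flag file names that appear more than once.
    counts = Counter(name for name, _ in rows)
    for name, count in counts.items():
        if count > 1:
            warnings.append(f"duplicate file_name: {name} ({count} rows)")

    for name, label in rows:
        # Tip 2 (assumed meaning): forward slashes only, no leading "./".
        if "\\" in name or name.startswith("./"):
            warnings.append(f"path not normalized: {name}")
        # Tip 3: MIME labels should be lowercase.
        if label != label.lower():
            warnings.append(f"mime_label not lowercase: {label}")

    return warnings


for warning in lint_rows([("Cargo.toml", "toml"), ("./Cargo.toml", "TOML")]):
    print(warning)
```

Returning warnings rather than raising leaves room for the "unless necessary" case, where a duplicate file name is intentional.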
The CSV is extensible. The more diverse your dataset, the more robust your classification becomes.