API Reference
This page documents the public API of the Confignet library. If you are embedding Confignet into another tool (like Dodo), you’ll primarily interact with the ConfigClassifier type.
Structs
ConfigRecord
A deserialized record from the classifier CSV.
pub struct ConfigRecord {
    pub file_name: String,
    pub mime_label: String,
    pub config_type: String,
}
Fields:
- file_name: The canonical file name used for comparison (e.g. Cargo.toml)
- mime_label: The mime-type label assigned to the file (e.g. toml, yaml)
- config_type: The configuration category, such as ci_cd or build, or non_config
This struct is used internally by the classifier.
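As a rough illustration of how one CSV row might map onto ConfigRecord, here is a minimal sketch using only the standard library. The column order (file_name, mime_label, config_type), the helper name parse_record, and the absence of quoted fields are all assumptions made for this example; the library's actual loader may work differently.

```rust
/// Mirror of the documented struct, repeated so the sketch is self-contained.
#[derive(Debug, PartialEq)]
pub struct ConfigRecord {
    pub file_name: String,
    pub mime_label: String,
    pub config_type: String,
}

/// Hypothetical helper (not part of Confignet): parse one comma-separated row.
/// Assumes exactly three unquoted columns in the order
/// file_name, mime_label, config_type.
pub fn parse_record(line: &str) -> Option<ConfigRecord> {
    let mut cols = line.split(',').map(str::trim);
    let file_name = cols.next()?;
    let mime_label = cols.next()?;
    let config_type = cols.next()?;
    // Reject rows with extra columns or an empty file name.
    if cols.next().is_some() || file_name.is_empty() {
        return None;
    }
    Some(ConfigRecord {
        file_name: file_name.to_string(),
        mime_label: mime_label.to_string(),
        config_type: config_type.to_string(),
    })
}
```

Calling parse_record("Cargo.toml,toml,build") would yield a record whose config_type is build, while a row with too few or too many columns yields None.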
ConfigClassifier
The main classifier struct that loads and queries classification rules.
pub struct ConfigClassifier {
    // Hidden internals
}
Constructor
pub fn from_csv<P: AsRef<Path>>(path: P) -> Result<Self>
Loads a ConfigClassifier from a given CSV file.
- path: The path to the .csv file
- Returns: Result<ConfigClassifier>
Usage:
let classifier = ConfigClassifier::from_csv("data/labeled/ci_cd.csv")?;
Method
pub fn classify(&self, file_name: &str, mime_label: &str) -> Option<ClassifiedResult>
Attempts to classify a file given its name and mime type.
- file_name: Name of the file (e.g., main.rs, Dockerfile)
- mime_label: Mime type label from tools like Magika (e.g., toml, json)
- Returns: Some(ClassifiedResult) on a match, or None if no suitable match is found
Example:
let result = classifier.classify("Cargo.toml", "toml");
Structs
ClassifiedResult
Returned from classify() if a match is found.
pub struct ClassifiedResult {
    pub file_name: String,
    pub is_ci_cd: bool,
}
Fields:
- file_name: The best-matching canonical file name (e.g., from CSV)
- is_ci_cd: Whether this file is used for CI/CD, based on config_type
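To make the relationship between config_type and is_ci_cd concrete, the sketch below converts a matched record into a result. It assumes that is_ci_cd is true exactly when config_type equals the string ci_cd; the helper name to_result is hypothetical, and the library's real mapping is not shown on this page.

```rust
/// Mirrors of the documented structs, repeated so the sketch is self-contained.
pub struct ConfigRecord {
    pub file_name: String,
    pub mime_label: String,
    pub config_type: String,
}

pub struct ClassifiedResult {
    pub file_name: String,
    pub is_ci_cd: bool,
}

/// Hypothetical conversion (not part of Confignet).
/// Assumption: a record counts as CI/CD only when config_type is "ci_cd".
pub fn to_result(record: &ConfigRecord) -> ClassifiedResult {
    ClassifiedResult {
        file_name: record.file_name.clone(),
        is_ci_cd: record.config_type == "ci_cd",
    }
}
```

Under this assumption, a record labeled build or non_config produces is_ci_cd set to false.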
Internal Utilities
Confignet also includes a Levenshtein distance utility for fuzzy file matching:
fn levenshtein(a: &str, b: &str) -> usize
This is used internally in classify() to find the closest filename match in the dataset when multiple candidates exist with the same mime type.
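The idea described above can be sketched as follows. This is a textbook dynamic-programming Levenshtein distance plus a small selection helper, written for illustration and not taken from Confignet's source. The helper name closest_match and the (file_name, mime_label) tuple layout are assumptions, and any distance threshold the library may apply before returning None is not modeled here.

```rust
/// Classic dynamic-programming Levenshtein distance over Unicode scalar
/// values, keeping only one row of the table in memory at a time.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // prev[j] = distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut curr = Vec::with_capacity(b_chars.len() + 1);
        // Turning the first i + 1 chars of `a` into "" takes i + 1 deletions.
        curr.push(i + 1);
        for (j, cb) in b_chars.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != *cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b_chars.len()]
}

/// Hypothetical helper: among (file_name, mime_label) candidates, keep those
/// with the requested mime label and return the name with the smallest
/// edit distance. On ties, the earliest candidate wins.
pub fn closest_match<'a>(
    file_name: &str,
    mime_label: &str,
    candidates: &[(&'a str, &'a str)],
) -> Option<&'a str> {
    candidates
        .iter()
        .filter(|(_, mime)| *mime == mime_label)
        .min_by_key(|(name, _)| levenshtein(file_name, name))
        .map(|(name, _)| *name)
}
```

With candidates Dockerfile and Makefile both labeled text, a query for Dockerfile.ci would select Dockerfile, since it is only three edits away.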
Example Integration
use confignet::{ConfigClassifier, ClassifiedResult};

// The `?` operator requires an enclosing function that returns a compatible Result.
let classifier = ConfigClassifier::from_csv("data/labeled/ci_cd.csv")?;
let result = classifier.classify("Dockerfile.ci", "text");

match result {
    Some(r) => println!("File: {}, Is CI/CD? {}", r.file_name, r.is_ci_cd),
    None => println!("Unrecognized file"),
}