# Performance

Shell completion has a tight latency budget. Users press TAB repeatedly, and visible delays interrupt command-line editing.

## Design for speed

conda-completion avoids the two main sources of latency in shell completion:

1. **No Python on the hot path.** The Rust binary is the only thing that runs on TAB press. Python startup can already be perceptible before conda imports plugins or configuration.
2. **No file re-parsing on repeat presses.** A stat-based file cache tracks which project and global files have changed. If nothing changed since the last TAB press (the common case), the binary skips all TOML/YAML parsing and reads pre-parsed results from a small cache file.

## What happens on each TAB press

**Common case (commands, flags, package names):**

1. `stat()` each source file (one syscall per file)
2. Compare against cached `(mtime, size)` tuples
3. On cache hit: read pre-parsed results from `context_cache.msgpack`
4. Read `completion.msgpack` (command tree + package names)
5. Prefix-filter candidates and output

This path is fast because it does no parsing, just binary file reads and string comparisons.

**Version completion (when `=` is detected):**

Same as above, plus loading `versions.index` and one record from `versions.store` for the requested package. This path does extra I/O and msgpack decoding, so version completion is slower than command or package name completion, but it avoids deserializing every package's versions for one lookup.

**Fuzzy matching (when no prefix or substring match exists):**

Same as the common case, but instead of a simple prefix filter, runs normalized Damerau-Levenshtein similarity scoring over all package names. This is the slowest path, and only runs after prefix and substring matching return no results.

## The stat cache

The stat cache is the key optimization. On each invocation:

1. Call `stat()` on every source file. One syscall per file.
2. Compare each file's `(mtime, size)` tuple against cached values.
3. If all match (the common case), read pre-parsed candidates from `context_cache.msgpack`. No TOML/YAML parsing at all.
4. If any file changed, re-parse only that file. Merge with cached results for unchanged files. Write the updated cache atomically (write to `.tmp`, then rename).

### Why stat and not content hashing?

Content hashing (e.g., xxhash of file contents) requires reading the entire file before deciding whether to parse it. `stat()` answers the same question with a single syscall that reads only filesystem metadata. The only false-negative case (content changes without mtime or size changing) is vanishingly rare in normal editing workflows.

## Fuzzy matching

When no prefix or substring match is found, the binary falls back to normalized Damerau-Levenshtein similarity. This handles common typos like transpositions ("nupmy" for "numpy") and near-misses ("numpie" for "numpy").

The matching uses a three-stage strategy to avoid unnecessary work:

1. **Prefix match**: return immediately if any candidates start with the query. This is the common case and is essentially free.
2. **Substring match**: return if any candidates contain the query. Still fast, one pass over the candidate list.
3. **Similarity**: only runs when the first two stages return nothing. Scores are filtered at a 0.6 threshold and capped at 10 results.

## Comparison with existing tools

| Tool | Approach |
|---|---|
| `conda-bash-completion` | Runs `conda --help` on every TAB press, parses output with sed/awk |
| `conda-zsh-completion` | Hand-written static script with 12-hour package cache |
| Fish built-in `conda.fish` | Static script (based on conda 4.4.11) |
| `argc-completions` | Generic `--help` parser |
| **conda-completion** | **Pre-generated msgpack manifest, Rust binary, stat-cached context** |

The key difference is that conda-completion never starts a Python process on TAB press. Tools that run `conda --help` or similar pay Python's startup cost on every keypress.

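
To make the "stat-cached context" entry concrete, the freshness check described under "The stat cache" can be sketched with the standard library alone. This is a minimal illustration and not the project's actual code: the names `Fingerprint`, `fingerprint`, `changed_files`, and `write_atomically` are hypothetical, and treating a missing file as unchanged unless it was previously cached is an assumption.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// The `(mtime, size)` pair compared on each invocation (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Fingerprint {
    mtime: SystemTime,
    size: u64,
}

/// Reads filesystem metadata only; the file contents are never opened.
fn fingerprint(path: &Path) -> io::Result<Fingerprint> {
    let meta = fs::metadata(path)?;
    Ok(Fingerprint {
        mtime: meta.modified()?,
        size: meta.len(),
    })
}

/// Returns the source files whose fingerprint differs from the cached one.
/// An empty result corresponds to the cache-hit path: nothing is re-parsed.
fn changed_files(
    sources: &[PathBuf],
    cached: &HashMap<PathBuf, Fingerprint>,
) -> Vec<PathBuf> {
    sources
        .iter()
        .filter(|p| match fingerprint(p) {
            Ok(fp) => cached.get(*p) != Some(&fp),
            // Assumption: a file that cannot be stat'ed counts as changed
            // only if it existed when the cache was written.
            Err(_) => cached.contains_key(*p),
        })
        .cloned()
        .collect()
}

/// Writes to a sibling `.tmp` file and renames it over the target, so a
/// concurrent reader sees either the old cache or the new one, never a mix.
fn write_atomically(target: &Path, bytes: &[u8]) -> io::Result<()> {
    let tmp = target.with_extension("tmp");
    fs::write(&tmp, bytes)?;
    fs::rename(&tmp, target)
}
```

Only the files returned by `changed_files` would need re-parsing; the rest keep their cached results, and the merged cache would go back to disk through `write_atomically`.
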
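
The three-stage matching strategy (prefix, then substring, then similarity) can be sketched in the same spirit. The function names here are hypothetical, and the distance shown is the optimal string alignment variant of Damerau-Levenshtein, normalized by the longer string's length; the source does not say which variant or normalization the binary uses, so both are assumptions. The 0.6 threshold and the cap of 10 come from the text.

```rust
use std::cmp::Ordering;

const THRESHOLD: f64 = 0.6; // similarity cutoff stated in the text
const MAX_RESULTS: usize = 10; // result cap stated in the text

/// Optimal string alignment distance: insertions, deletions, substitutions,
/// and transpositions of adjacent characters each cost 1.
fn osa_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (n, m) = (a.len(), b.len());
    let mut d = vec![vec![0usize; m + 1]; n + 1];
    for i in 0..=n {
        d[i][0] = i;
    }
    for j in 0..=m {
        d[0][j] = j;
    }
    for i in 1..=n {
        for j in 1..=m {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            let mut best = (d[i - 1][j] + 1)
                .min(d[i][j - 1] + 1)
                .min(d[i - 1][j - 1] + cost);
            // Adjacent transposition, e.g. "pm" <-> "mp".
            if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1] {
                best = best.min(d[i - 2][j - 2] + 1);
            }
            d[i][j] = best;
        }
    }
    d[n][m]
}

/// Similarity in [0, 1]; 1.0 means identical (assumed normalization).
fn similarity(a: &str, b: &str) -> f64 {
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        return 1.0;
    }
    1.0 - osa_distance(a, b) as f64 / longest as f64
}

/// Each stage runs only when the previous one produced no candidates.
fn complete(query: &str, candidates: &[&str]) -> Vec<String> {
    let prefix: Vec<String> = candidates
        .iter()
        .filter(|c| c.starts_with(query))
        .map(|c| c.to_string())
        .collect();
    if !prefix.is_empty() {
        return prefix;
    }
    let substring: Vec<String> = candidates
        .iter()
        .filter(|c| c.contains(query))
        .map(|c| c.to_string())
        .collect();
    if !substring.is_empty() {
        return substring;
    }
    let mut scored: Vec<(f64, &str)> = candidates
        .iter()
        .map(|c| (similarity(query, c), *c))
        .filter(|(score, _)| *score >= THRESHOLD)
        .collect();
    // Best score first.
    scored.sort_by(|x, y| y.0.partial_cmp(&x.0).unwrap_or(Ordering::Equal));
    scored
        .into_iter()
        .take(MAX_RESULTS)
        .map(|(_, c)| c.to_string())
        .collect()
}
```

Under these assumptions, "nupmy" is one transposition away from "numpy" (similarity 0.8), and "numpie" is two edits away (similarity about 0.67), so both typos from the text clear the 0.6 threshold.
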
## Binary size

The Rust binary is compiled with LTO (link-time optimization), size optimization (`opt-level = "z"`), and symbol stripping. Dependencies are kept minimal:

| Dependency | Purpose |
|---|---|
| `serde` + `rmp-serde` | msgpack manifest and cache deserialization |
| `serde` + `toml` | TOML workspace/project file parsing (conda.toml, pixi.toml) |
| `serde-saphyr` | YAML parsing (environment.yml, .condarc, lockfiles) |
| `fs-err` | Better I/O errors |

Heavier alternatives were evaluated and rejected:

- **rattler crates** (`rattler_conda_types`, `rattler_lock`): pull in nom, regex, simd-json, rayon, purl, fancy-regex. Significant binary size and startup overhead for functionality we don't need.
- **clap_complete**: adds clap's full argument parsing framework. Unnecessary when the completion engine is custom.
- **serde_yml**: unmaintained, has a RUSTSEC advisory (RUSTSEC-2025-0068). Replaced by serde-saphyr (pure Rust, actively maintained).
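
The build settings named at the start of this section map onto standard Cargo profile keys. A minimal sketch, assuming they are set in the `[profile.release]` table of the project's `Cargo.toml`; the actual file is not shown here, and it may use different values (for example `lto = "thin"`) or additional keys.

```toml
# Hypothetical release profile; only the three settings named in the text.
[profile.release]
lto = true        # link-time optimization across all crates
opt-level = "z"   # optimize for binary size rather than speed
strip = true      # strip symbols from the final binary
```
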