# Performance

Shell completion has a tight latency budget. Users press TAB repeatedly,
and visible delays interrupt command-line editing.

## Design for speed

conda-completion avoids the two main sources of latency in shell
completion:

1. **No Python on the hot path.** The Rust binary is the only thing that
   runs on TAB press. Python interpreter startup alone can be
   perceptible, even before conda imports plugins or loads configuration.

2. **No file re-parsing on repeat presses.** A stat-based file cache
   tracks which project and global files have changed. If nothing
   changed since the last TAB press (the common case), the binary skips
   all TOML/YAML parsing and reads pre-parsed results from a small
   cache file.

## What happens on each TAB press

**Common case (commands, flags, package names):**

1. `stat()` each source file (one syscall per file)
2. Compare against cached `(mtime, size)` tuples
3. On cache hit: read pre-parsed results from `context_cache.msgpack`
4. Read `completion.msgpack` (command tree + package names)
5. Prefix-filter candidates and output

This path is fast because it does no TOML/YAML parsing, just binary
file reads (msgpack decoding) and string comparisons.

**Version completion (when `=` is detected):**

Same as above, plus loading `versions.index` and one record from
`versions.store` for the requested package. This path does extra I/O
and msgpack decoding, so version completion is slower than command or
package name completion, but it avoids deserializing every package's
versions for one lookup.
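
The index-plus-store split can be sketched as follows. This is a minimal illustration, not the actual on-disk format: the `Span` type, the offset/length index layout, and the newline-separated record encoding are all assumptions made for the example (the real records are msgpack). The point it shows is that a lookup seeks to one record and never touches the others.

```rust
use std::collections::HashMap;
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical index entry: where one package's record lives in the store.
#[derive(Debug, Clone, Copy)]
pub struct Span {
    pub offset: u64,
    pub len: u64,
}

/// Reads only the bytes belonging to `package`.
/// Records for other packages are never read or decoded.
pub fn read_record<S: Read + Seek>(
    index: &HashMap<String, Span>,
    store: &mut S,
    package: &str,
) -> io::Result<Option<Vec<u8>>> {
    let span = match index.get(package) {
        Some(span) => *span,
        None => return Ok(None), // unknown package: nothing to complete
    };
    store.seek(SeekFrom::Start(span.offset))?;
    let mut buf = vec![0u8; span.len as usize];
    store.read_exact(&mut buf)?;
    Ok(Some(buf))
}

/// Stand-in decoder (assumption): this sketch stores versions as
/// newline-separated UTF-8 instead of msgpack to stay dependency-free.
pub fn decode_versions(record: &[u8]) -> Vec<String> {
    String::from_utf8_lossy(record)
        .lines()
        .map(str::to_owned)
        .collect()
}
```

With a layout like this, the cost of a version lookup depends on the size of the index and of one record, not on the total size of the store.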

**Fuzzy matching (when no prefix or substring match exists):**

Same as the common case, but instead of a simple prefix filter, runs
normalized Damerau-Levenshtein similarity scoring over all package
names. This is the slowest path, and only runs after prefix and
substring matching return no results.

## The stat cache

The stat cache is the key optimization. On each invocation:

1. Call `stat()` on every source file. One syscall per file.
2. Compare each file's `(mtime, size)` tuple against cached values.
3. If all match (the common case), read pre-parsed candidates from
   `context_cache.msgpack`. No TOML/YAML parsing at all.
4. If any file changed, re-parse only that file. Merge with cached
   results for unchanged files. Write the updated cache atomically
   (write to `.tmp`, then rename).
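
The steps above can be sketched with the standard library alone. This is an illustrative sketch, not the binary's actual code: the names `Fingerprint`, `fingerprint`, `changed_files`, and `write_atomically` are hypothetical, and the choice to treat an unreadable file as changed is an assumption.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Cheap change fingerprint for one source file: (mtime, size).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Fingerprint {
    pub mtime: SystemTime,
    pub size: u64,
}

/// One metadata lookup per file; the file contents are never read.
pub fn fingerprint(path: &Path) -> io::Result<Fingerprint> {
    let meta = fs::metadata(path)?;
    Ok(Fingerprint {
        mtime: meta.modified()?,
        size: meta.len(),
    })
}

/// Returns the source files whose fingerprint differs from the cached one.
/// Assumption: a file that is absent from the cache, or that can no longer
/// be stat'ed, counts as changed.
pub fn changed_files(
    sources: &[PathBuf],
    cached: &HashMap<PathBuf, Fingerprint>,
) -> Vec<PathBuf> {
    sources
        .iter()
        .filter(|p| match fingerprint(p.as_path()) {
            Ok(fp) => cached.get(p.as_path()) != Some(&fp),
            Err(_) => true,
        })
        .cloned()
        .collect()
}

/// Write to a sibling `.tmp` file, then rename over the target.
/// On POSIX filesystems the rename replaces the target in one step when
/// both paths are on the same filesystem, so readers see either the old
/// cache or the new one, never a partial write.
pub fn write_atomically(target: &Path, bytes: &[u8]) -> io::Result<()> {
    let tmp = target.with_extension("tmp");
    fs::write(&tmp, bytes)?;
    fs::rename(&tmp, target)
}
```

An empty result from `changed_files` corresponds to the cache-hit path in step 3; a non-empty result lists exactly the files that step 4 would re-parse.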

### Why stat and not content hashing?

Content hashing (e.g., xxhash of file contents) requires reading the
entire file before deciding whether to parse it. `stat()` answers the
same question with a single syscall that reads only filesystem metadata.
The only false-negative case (content changes without mtime or size
changing) is vanishingly rare in normal editing workflows.

## Fuzzy matching

When no prefix or substring match is found, the binary falls back to
normalized Damerau-Levenshtein similarity. This handles common typos
like transpositions ("nupmy" for "numpy") and near-misses ("numpie"
for "numpy").

The matching uses a three-stage strategy to avoid unnecessary work:

1. **Prefix match**: return immediately if any candidates start with the
   query. This is the common case and is essentially free.
2. **Substring match**: return if any candidates contain the query.
   Still fast, one pass over the candidate list.
3. **Similarity**: only runs when the first two stages return nothing. Scores
   are filtered at a 0.6 threshold and capped at 10 results.
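
A minimal sketch of the three stages follows. It is not the binary's implementation: the function names are hypothetical, the distance shown is the restricted (optimal string alignment) variant of Damerau-Levenshtein, and normalizing by the longer string's length is an assumption about how the similarity score is defined. Only the 0.6 threshold and the cap of 10 come from the description above.

```rust
const THRESHOLD: f64 = 0.6;
const MAX_RESULTS: usize = 10;

/// Optimal-string-alignment distance: insertions, deletions, substitutions,
/// and transpositions of adjacent characters each cost 1.
pub fn osa_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (n, m) = (a.len(), b.len());
    // d[i][j] = distance between the first i chars of a and first j of b.
    let mut d = vec![vec![0usize; m + 1]; n + 1];
    for i in 0..=n {
        d[i][0] = i;
    }
    for j in 0..=m {
        d[0][j] = j;
    }
    for i in 1..=n {
        for j in 1..=m {
            let cost = usize::from(a[i - 1] != b[j - 1]);
            let mut best = (d[i - 1][j] + 1)
                .min(d[i][j - 1] + 1)
                .min(d[i - 1][j - 1] + cost);
            // Adjacent transposition, e.g. "pm" <-> "mp".
            if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1] {
                best = best.min(d[i - 2][j - 2] + 1);
            }
            d[i][j] = best;
        }
    }
    d[n][m]
}

/// Similarity in [0, 1]: 1 minus distance over the longer length (assumed).
pub fn similarity(a: &str, b: &str) -> f64 {
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        return 1.0;
    }
    1.0 - osa_distance(a, b) as f64 / longest as f64
}

/// Prefix, then substring, then similarity; each stage runs only if the
/// previous one produced no candidates.
pub fn complete<'a>(query: &str, candidates: &[&'a str]) -> Vec<&'a str> {
    let prefix: Vec<&'a str> = candidates
        .iter()
        .copied()
        .filter(|c| c.starts_with(query))
        .collect();
    if !prefix.is_empty() {
        return prefix;
    }
    let substring: Vec<&'a str> = candidates
        .iter()
        .copied()
        .filter(|c| c.contains(query))
        .collect();
    if !substring.is_empty() {
        return substring;
    }
    let mut scored: Vec<(f64, &'a str)> = candidates
        .iter()
        .copied()
        .map(|c| (similarity(query, c), c))
        .filter(|(score, _)| *score >= THRESHOLD)
        .collect();
    // Best matches first, then keep at most MAX_RESULTS.
    scored.sort_by(|x, y| y.0.total_cmp(&x.0));
    scored.into_iter().take(MAX_RESULTS).map(|(_, c)| c).collect()
}
```

Under these assumptions, "nupmy" is one transposition away from "numpy" (similarity 0.8), and "numpie" is two edits away (similarity about 0.67), so both clear the 0.6 threshold.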

## Comparison with existing tools

| Tool | Approach |
|---|---|
| `conda-bash-completion` | Runs `conda --help` on every TAB press, parses output with sed/awk |
| `conda-zsh-completion` | Hand-written static script with 12-hour package cache |
| Fish built-in `conda.fish` | Static script (based on conda 4.4.11) |
| `argc-completions` | Generic `--help` parser |
| **conda-completion** | **Pre-generated msgpack manifest, Rust binary, stat-cached context** |

The key difference is that conda-completion never starts a Python
process on TAB press. Tools that run `conda --help` or similar pay
Python's startup cost on every TAB press.

## Binary size

The Rust binary is compiled with LTO (link-time optimization), size
optimization (`opt-level = "z"`), and symbol stripping. Dependencies
are kept minimal:

| Dependency | Purpose |
|---|---|
| `serde` + `rmp-serde` | msgpack manifest and cache deserialization |
| `serde` + `toml` | TOML workspace/project file parsing (conda.toml, pixi.toml) |
| `serde-saphyr` | YAML parsing (environment.yml, .condarc, lockfiles) |
| `fs-err` | I/O error messages that include the file path |
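
For reference, the build settings named at the start of this section map onto standard Cargo release-profile keys roughly as shown below. This is a sketch of what such a profile could look like; the project's actual `Cargo.toml` may set additional or different values.

```toml
# Sketch of a size-focused release profile (values assumed from the text above).
[profile.release]
lto = true        # link-time optimization across crates
opt-level = "z"   # optimize for binary size
strip = true      # strip symbols from the final binary
```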

Heavier alternatives were evaluated and rejected:

- **rattler crates** (`rattler_conda_types`, `rattler_lock`): pull in
  nom, regex, simd-json, rayon, purl, fancy-regex. Significant binary
  size and startup overhead for functionality we don't need.
- **clap_complete**: adds clap's full argument parsing framework.
  Unnecessary when the completion engine is custom.
- **serde_yml**: unmaintained, has a RUSTSEC advisory (RUSTSEC-2025-0068).
  Replaced by serde-saphyr (pure Rust, actively maintained).