LLM Evaluation

Evaluated by: xiaomi/mimo-v2-flash:free

Last evaluated: May 17, 2026

Prompt Quality

3.0 /5

Evaluation error: RetryError[]

Usefulness

3.0 /5

Evaluation error: RetryError[]

Overall Rating

3.0 /5

Evaluation failed

Prompt Preview

---
name: monitor-binary-version-baselines
description: >
  Establish and maintain longitudinal baselines of CLI binary contents
  across versions. Covers marker selection by category (API / identity /
  config / telemetry / flag / function), weighted scoring, threshold-based
  system-presence detection, and per-version baseline records. Use when
  tracking a feature's lifecycle across releases, when probing for
  dark-launched or removed capabilities, or when verifying that a scanning
  tool it...

Full prompt length: 14204 characters