LLM Evaluation

Evaluated by: xiaomi/mimo-v2-flash:free

Last evaluated: March 29, 2026

Prompt Quality

3.0 /5

Evaluation error: RetryError[]

Usefulness

3.0 /5

Evaluation error: RetryError[]

Overall Rating

3.0 /5

Evaluation failed

Prompt Preview

---
name: debug-buttercup
description: >
  Debugs the Buttercup CRS (Cyber Reasoning System) running on Kubernetes.
  Use when diagnosing pod crashes, restart loops, Redis failures, resource pressure,
  disk saturation, DinD issues, or any service misbehavior in the crs namespace.
  Covers triage, log analysis, queue inspection, and common failure patterns
  for: redis, fuzzer-bot, coverage-bot, seed-gen, patcher, build-bot, scheduler,
  task-server, task-downloader, program-model, litellm, dind...

Full prompt length: 9921 characters

Tools & Technologies

  • Redis
  • Docker
  • Kubernetes