Training Readiness Checklist

EVALS.md

Evaluation Harness (Offline + Online)

- Without a harness, you **can't compare** prompts, models, retrieval configs, or costs.

ai, llm, rag


DevontiaW

EVALS.md

/godmode:eval

Evaluate, benchmark, and regression-test AI/LLM systems. Covers evaluation framework design, benchmark creation, human evaluation protocols, automated evaluation (LLM-as-judge), regression testing, statistical significance, and continuous evaluation pipelines.

ai, agent, llm


arbazkhan971

EVALS.md

🔬 Open Deep Research

ai, agent, llm


OpenPipe

EVALS.md

EEG-Datasets

A list of public EEG datasets. This list of EEG resources is not exhaustive. If you find something new, or have explored any unfiltered link in depth, please update the repository.

ai, eval


RespectKnowledge
