All Documents — .md Directory | Neura Market

3,528 documents available

EVALS.md

Context-Aware RAG Agent: Flow & Evaluation Plan

This document details the exact execution flow of the system and the offline validation and evaluation framework implemented using RAGAs.

ai, agent, llm

nithin-pulla

MONITORING.md

Comprehensive Evaluation Plan (Version 2)

**Project Name:** Rubby the Duck

ai, eval, workflow

anrj

EVALS.md

Example: LLM-as-judge for answer quality

**Codewords:** evaluation, offline eval, online eval, golden set, LLM-as-judge, rubric, metric, regression tests, RAG evaluation, RAGAS, faithfulness, context precision, answer relevance, task success rate, context recall, answer correctness, hallucination, retrieval quality, generation quality, human eval, automated eval, evaluation dataset, ground truth

ai, agent, llm

Kalmy8

EVALS.md

Comprehensive Evaluation Plan - MVP

This document outlines the complete evaluation strategy for the Aegis AI Video Censoring Platform MVP, covering core metrics, test suites, user testing protocols, regression testing, and a measurement timeline from Weeks 4-15.

ai, eval, safety

daatoo

EVALS.md

evaluation_driven_development

Evaluation Driven AI Development

ai, agent, llm

AOEpeople

EVALS.md

ProtoExtract — Evaluation Approach Using OmniDocBench Methodology

Define a rigorous, evidence-based evaluation framework for the ProtoExtract

ai, eval

cryogenic22

RAG.md

Evaluating Retrieval Augmented Generation - a framework for assessment

*In this first article of a three-part (monthly) series, we introduce RAG evaluation, outline its challenges, propose an effective evaluation framework, and provide a rough overview of the various tools and approaches you can use to evaluate your RAG application.*

ai, llm, rag

superlinked

PROMPTS.md

🤖 Machine Learning and Data Visualization Preview Guide

- ✅ TensorFlow MNIST CNN model training

suetaketakaya

RUNBOOK.md

UI Preview Guide

This guide explains different ways to preview the UI components before deploying your application.

BLKOUTUK

CHECKLIST.md

UX Review Guide for SaaS

This guide provides a systematic approach for auditing User Experience (UX) in commercial SaaS applications, rooted in heuristic evaluation and modern design systems.

ai, eval, workflow

Lithium-Prime

AGENTS.md

17. Evaluation Frameworks

title: 17. Evaluation Frameworks

ai, agent, llm

Frostbite1536

EVALS.md

Evaluating generative systems

LLM evaluation, chapter 2: Evaluating generative systems

ai, llm, prompt

Nebius-Academy

EVALS.md

Evaluating the RAG answer quality

[📺 Watch: (RAG Deep Dive series) Evaluating RAG answer quality](https://www.youtube.com/watch?v=lyCLu53fb3g)

ai, rag, eval

bhavesh-chainani

EVALS.md

Agent Quality & Evaluation

Comprehensive guide to evaluating and improving agent quality. See SKILL.md for core quality principles and operations.md for Agent Ops overview.

ai, agent, llm

mguinada

SKILL.md

LLM Evaluation

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.

ai, agent, llm

majiayu000

ARCHITECTURE.md

System Review Guide (IBS v5)

Audience: Senior software engineers and technical reviewers assessing architecture, code quality, and operational readiness of IBS v5.

ai, rag

edwinjojie

RUNBOOK.md

Document Preview & Download Feature - Complete Guide

I've added document preview and download functionality that retrieves files from MinIO and serves them directly through the application.

ai, prompt

bwalia

DEPLOYMENT.md

SentinelAI Services v2.1 - Alice's Expert Review Guide

- **Hackathon**: Dega-Midnight AI DAO Treasury Management

bytewizard42i

SOP.md

Manual Review Guide for OCR Outputs

**Comprehensive guide for manually verifying and correcting OCR-generated CSV files.**

ai, workflow

JasonCruz18

CHECKLIST.md

🔍 Pull Request Review Guide

Code review is one of the most important quality gates in software development. A well-conducted PR review catches bugs, improves code quality, shares knowledge, and maintains consistency across the codebase.

ai, workflow

eslamfaisal

CHECKLIST.md