3,528 documents available
**AI for Social Good Hackathon – SUST 2026**
The goal of a Qualifying Exam ("qual") is for a student to *effectively demonstrate that they have the knowledge and skills that will be needed to conduct meaningful research in their chosen subfield*. There are a number of key phrases in this sentence:
The **EGG (Environmental, Governance & Goals) Rubric** is a comprehensive evaluation framework for assessing corporate sustainability performance across five critical sustainability themes. This rubric employs a multi-dimensional scoring approach that evaluates both the **quantity** and **quality** of corporate commitments, as well as their **specificity** and **temporal evolution**.
LLMC’s retrieval system must balance context relevance with token limitations, especially for large code
Create a plan to build an n8n workflow that evaluates multiple LLM prompts for generating meal feedback using a **thinking model to generate ground truth** for comparison.
*[Deutsche Version](GLOSSARY_DE.md)*
Complete documentation for the `agent-eval` CLI, metrics, data formats, and customization.
title: RAG Evaluation
A tool to aid researchers in assessing whether research papers adhere to scientific best practices. This application uses AI to automatically generate falsification forms, helping researchers verify the scientific robustness of their work across disciplines including social sciences and natural sciences.
This guide explains how to evaluate the RAG (Retrieval-Augmented Generation) performance of the Clarity and Rigor agents using different retriever configurations.
description: Comprehensive prompt testing and LLM output evaluation skill covering hallucination detection, response quality scoring, regression testing for prompts, A/B testing, and building evaluation pipelines for AI-powered applications.
title: Evaluation Framework
* **Rapid Time to Market:** Easier to implement than fine-tuning a model from scratch.
title: "Data-Driven RAG Evaluation: Testing Qdrant Apps with Relari AI"
module_title: Data Science and Machine Learning
This rubric defines a **standardised metric** for evaluating how well a software repository implements core **kernel** and **operating‑system (OS)** primitives. It is based on the function manifest and status report from the Echo.Kern project and draws on general operating‑system principles ([Wikipedia: Kernel](https://en.wikipedia.org/wiki/Kernel_(operating_system)#:~:text=operating%20system%20%20that%20always,for%20the%20central%20processing%20unit)). The goal is to provide a repeatable method
**Team Name:** [Byte Peeps]
The emphasis of the course is on learning and mastering the skills covered. The grade for the course will be divided into 3 components:
**DUE:** 2022-12-08T15:00 (End of Exam Period)
- A TA / grader will be reviewing your code after the deadline.
Repository link: https://github.com/agupta15k/ncsu_se_fall22_22_pr_2
title: Evaluation Rubric for LLM Outputs
Report marking rubric with level descriptors and respective marks for different
|Score|Notes| Evidence|Self-Assessment|