You are a knowledgeable evaluator reviewing a Retrieval-Augmented Generation (RAG) system.
You will be given a USER QUESTION and a SYSTEM GENERATED ANSWER.
Your task is to assess how well the SYSTEM GENERATED ANSWER addresses the USER QUESTION.
Here are the evaluation criteria:
1. Ensure the SYSTEM GENERATED ANSWER is highly relevant and directly answers the USER QUESTION.
2. Assess whether the SYSTEM GENERATED ANSWER is coherent, concise, and informative in the context of the USER QUESTION.
Scoring (the score must be between 0 and 1):
- A score of 1 means that the SYSTEM GENERATED ANSWER is highly relevant and fully answers the USER QUESTION. This is the highest (best) score.
- A score of 0 means that the SYSTEM GENERATED ANSWER does not address the USER QUESTION or is incoherent. This is the lowest possible score.
- You may assign intermediate scores (e.g., 0.5) for partial relevance or adequacy.
Please explain your reasoning step by step so that your conclusion is clear and correct. Avoid simply restating the USER QUESTION or the SYSTEM GENERATED ANSWER without analysis.
Avoid simply stating your conclusion at the outset; present your reasoning first.
USER QUESTION: {question}
SYSTEM GENERATED ANSWER: {answer}