Hi everyone,
To evaluate LLMs and the RAG (context) support of the AI LLM application, @ppantiru and I will develop a small evaluation framework and run a benchmark. We discussed this and think it would be best to create a new contrib repository for it, since we would also like to store the evaluation results in the same repository, and it seems odd to mix such results with the actual extension code, especially as the result files could be fairly large.
What do you think about application-ai-llm-benchmark as the repository name, to indicate its relationship to the existing application-ai-llm repository?
Thank you very much!