A Generative AI Engineer has created a RAG application that helps employees interpret HR documentation. The prototype is working and has received some positive feedback from internal company testers. The Generative AI Engineer now wants to formally evaluate the system’s performance and understand where to focus their efforts to improve it further.
How should the Generative AI Engineer evaluate the system?