Certified Generative AI Engineer Associate Exam Questions

Certified Generative AI Engineer Associate Exam - Question 44


A Generative AI Engineer has built an LLM-based system that will automatically translate user text between two languages. They now want to benchmark multiple LLMs on this task and pick the best one. They have an evaluation set with known high-quality translation examples. They want to evaluate each LLM using the evaluation set with a performant metric.

Which metric should they choose for this evaluation?

Correct Answer:

Discussion

1 comment
DavidMiller Option: A
May 4, 2025

It's in the name, really: Bilingual Evaluation Understudy (BLEU). It scores a machine translation by its n-gram overlap with a reference translation.
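To make the BLEU idea concrete, here is a rough sketch of how each model's translations could be scored against the evaluation set. It is a simplified, standard-library-only version (one reference per sentence, whitespace tokenization, no smoothing), not the exact scoring used by any particular library; an established implementation such as sacreBLEU would be the safer choice for real benchmarking. The names `corpus_bleu`, `model_a`, and `model_b` and the example sentences are made up for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU (assumptions: one reference per
    hypothesis, whitespace tokens, uniform weights, no smoothing)."""
    matches = [0] * max_n  # clipped n-gram matches per order
    totals = [0] * max_n   # hypothesis n-gram counts per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Penalize outputs that are shorter than the references overall.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


# Hypothetical evaluation set and model outputs, for illustration only.
references = ["the cat sat on the mat", "there is a book on the table"]
candidates = {
    "model_a": ["the cat sat on a mat", "there is a book on a table"],
    "model_b": ["a cat is on the mat", "a book is on the table"],
}

scores = {name: corpus_bleu(outputs, references) for name, outputs in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

Scoring over the whole evaluation set, rather than averaging per-sentence scores, keeps a single short sentence with no 4-gram match from zeroing out the result as easily. On a tiny set like this one that can still happen, which is why real implementations offer smoothing.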