
Dinesh Kalamegam

Stanislav Chistyakov

David Oliver
As generative AI scales, evaluating the quality of its output becomes increasingly complex. This white paper explores LLM-as-a-Judge - using AI to evaluate AI - highlighting methods like pairwise comparison and direct scoring, while discussing benefits, limitations, and why this approach could be key to building trust in AI systems.
The key points:
- Gen AI allows systems to return larger volumes of output to the users. However, traditional ML metrics are harder to apply and Subject Matter Experts (SMEs) won't have time to check the entire output. Therefore we need new means of evaluation that can scale.
- LLM-as-a-Judge is the notion of using LLMs to evaluate LLM generated outputs. The paper goes through two types of LLM-as-a-Judge implementations that can mimic how an SME would do an evaluation: picking the better between two responses, and scoring a single response based on prompts that can be built with own domain knowledge.
- There are some caveats to the method (such as LLM bias and data limitations), which is why it cannot replace SMEs completely. However, the technique can reduce burden on them by being able to leverage the LLM's ability to capture semantic meaning at scale.
Legal Disclaimer
Republication or redistribution of LSE Group content is prohibited without our prior written consent.
The content of this publication is for informational purposes only and has no legal effect, does not form part of any contract, does not, and does not seek to constitute advice of any nature and no reliance should be placed upon statements contained herein. Whilst reasonable efforts have been taken to ensure that the contents of this publication are accurate and reliable, LSE Group does not guarantee that this document is free from errors or omissions; therefore, you may not rely upon the content of this document under any circumstances and you should seek your own independent legal, investment, tax and other advice. Neither We nor our affiliates shall be liable for any errors, inaccuracies or delays in the publication or any other content, or for any actions taken by you in reliance thereon.
Copyright © 2025 London Stock Exchange Group. All rights reserved.
The content of this publication is provided by London Stock Exchange Group plc, its applicable group undertakings and/or its affiliates or licensors (the “LSE Group” or “We”) exclusively.
Neither We nor our affiliates guarantee the accuracy of or endorse the views or opinions given by any third party content provider, advertiser, sponsor or other user. We may link to, reference, or promote websites, applications and/or services from third parties. You agree that We are not responsible for, and do not control such non-LSE Group websites, applications or services.
The content of this publication is for informational purposes only. All information and data contained in this publication is obtained by LSE Group from sources believed by it to be accurate and reliable. Because of the possibility of human and mechanical error as well as other factors, however, such information and data are provided "as is" without warranty of any kind. You understand and agree that this publication does not, and does not seek to, constitute advice of any nature. You may not rely upon the content of this document under any circumstances and should seek your own independent legal, tax or investment advice or opinion regarding the suitability, value or profitability of any particular security, portfolio or investment strategy. Neither We nor our affiliates shall be liable for any errors, inaccuracies or delays in the publication or any other content, or for any actions taken by you in reliance thereon. You expressly agree that your use of the publication and its content is at your sole risk.
To the fullest extent permitted by applicable law, LSE Group, expressly disclaims any representation or warranties, express or implied, including, without limitation, any representations or warranties of performance, merchantability, fitness for a particular purpose, accuracy, completeness, reliability and non-infringement. LSE Group, its subsidiaries, its affiliates and their respective shareholders, directors, officers employees, agents, advertisers, content providers and licensors (collectively referred to as the “LSE Group Parties”) disclaim all responsibility for any loss, liability or damage of any kind resulting from or related to access, use or the unavailability of the publication (or any part of it); and none of the LSE Group Parties will be liable (jointly or severally) to you for any direct, indirect, consequential, special, incidental, punitive or exemplary damages, howsoever arising, even if any member of the LSE Group Parties are advised in advance of the possibility of such damages or could have foreseen any such damages arising or resulting from the use of, or inability to use, the information contained in the publication. For the avoidance of doubt, the LSE Group Parties shall have no liability for any losses, claims, demands, actions, proceedings, damages, costs or expenses arising out of, or in any way connected with, the information contained in this document.
LSE Group is the owner of various intellectual property rights ("IPR”), including but not limited to, numerous trademarks that are used to identify, advertise, and promote LSE Group products, services and activities. Nothing contained herein should be construed as granting any licence or right to use any of the trademarks or any other LSE Group IPR for any purpose whatsoever without the written permission or applicable licence terms.