Contract Intelligence: A Scaled Benchmark for Measuring Contract Understanding – Harvey
Harvey is publishing its evaluation of how well LLMs understand and interpret contracts, including the performance of both Review tables in Vault and human lawyers.
At Harvey, our research isn’t just about pushing the capabilities of AI. It’s also about distilling those capabilities into measurable results for real attorneys.
One area where we have found LLMs uniquely capable is contract extraction: converting contract terms into actionable insights at scale. Last year, we showed that LLMs had become capable enough to power specialized contract-extraction agents that achieve near-perfect performance on high-value deal points. That work also showed that out-of-the-box LLMs struggled on this task, identifying only around 65-70% of valid deal points.
Today, we’re publishing initial findings from our Contract Intelligence benchmark, which encompasses more than 4,000 data points aimed at measuring extraction accuracy on varying contract types and terms and comparing that accuracy to human experts. This publication highlights key outcomes and does not purport to include every result.



