CaseSumm: A Large-Scale Dataset for Long-Context Summarization from U.S. Supreme Court Opinions

Document Type

Dataset

Publication Date

2024

Abstract

The CaseSumm dataset consists of U.S. Supreme Court cases and their official summaries, called syllabuses, from the period 1815 to 2019. Each syllabus is written by an attorney employed by the Court and approved by the Justices. The syllabus is therefore a gold-standard summary of the majority opinion and a natural reference for evaluating other summaries of that opinion. We obtain the opinions from Public.Resource.Org's archive and extract the syllabuses from the official opinions published in the United States Reports and hosted by the Library of Congress.

Associated Scholarship

https://arxiv.org/abs/2501.00097

