CaseSumm: A Large-Scale Dataset for Long-Context Summarization from U.S. Supreme Court Opinions
Document Type
Dataset
Publication Date
2024
Abstract
The CaseSumm dataset consists of U.S. Supreme Court cases and their official summaries, called syllabuses, from 1815 to 2019. Syllabuses are written by an attorney employed by the Court and approved by the Justices. The syllabus is therefore the gold standard for summarizing a majority opinion and an ideal reference for evaluating other summaries of that opinion. We obtain the opinions from Public Resource Org's archive and extract the syllabuses from the official opinions published in the U.S. Reporter and hosted by the Library of Congress.
Associated Scholarship
Recommended Citation
Mourad Heddaya, Kyle MacMillan, Anup Malani, Hongyuan Mei and Chenhao Tan, “CaseSumm: A Large-Scale Dataset for Long-Context Summarization from U.S. Supreme Court Opinions” (2024) arXiv:2501.00097
