Return to search

Improving Deposition Summarization using Enhanced Generation and Extraction of Entities and Keywords

In the legal domain, depositions help lawyers and paralegals to record details and recall relevant information relating to a case. Depositions are conversations between a lawyer and a deponent and are generally in Question-Answer (QA) format. These documents can be lengthy, which raises the need for applying summarization methods to the documents. Though many automatic summarization methods are available, not all of them give good results, especially in the legal domain. This creates a need to process the QA pairs and develop methods to help summarize the deposition. For further downstream tasks like summarization and insight generation, converting QA pairs to canonical or declarative form can be helpful. Since the transformed canonical sentences are not perfectly readable, we explore methods based on heuristics, language modeling, and deep learning, to improve the quality of sentences in terms of grammaticality, sentence correctness, and relevance.
Further, extracting important entities and keywords from a deposition will help rank the candidate summary sentences and assist with extractive summarization. This work investigates techniques for enhanced generation of canonical sentences and extracting relevant entities and keywords to improve deposition summarization. / Master of Science / In the legal domain, depositions help lawyers and paralegals to record details and recall relevant information relating to a case. Depositions are conversations between a lawyer and a deponent and are generally in Question-Answer format. These documents can be lengthy, which raises the need for applying summarization methods to the documents. Typical automatic summarization techniques perform poorly on depositions since the data format is very different from standard text documents such as news articles, blogs. To standardize the process of summary generation, we convert the Question-Answer pairs from the deposition document to their canonical or declarative form. We apply techniques to improve the readability of these transformed sentences. Further, we extract entities such as person names, locations, organization and keywords from the deposition to retrieve important sentences and help in summarization. This work describes the techniques used to correct transformed sentences and extract important entities and keywords to improve the summarization of depositions.

Identiferoai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/111488
Date01 June 2021
CreatorsSumant, Aarohi Milind
ContributorsComputer Science, Fox, Edward A., Meng, Na, Eldardiry, Hoda
PublisherVirginia Tech
Source SetsVirginia Tech Theses and Dissertation
Detected LanguageEnglish
TypeThesis
FormatETD, application/pdf
RightsIn Copyright, http://rightsstatements.org/vocab/InC/1.0/

Page generated in 0.0018 seconds