
Generating Canonical Sentences from Question-Answer Pairs of Deposition Transcripts

In the legal domain, documents of various types are created in connection with a particular case, such as testimony, transcripts, depositions, memos, and emails. Deposition transcripts are one such type of legal document; they consist of conversations between the different parties in the legal proceedings, recorded by a court reporter. Court reporting has been traced back to 63 B.C. It has transformed from the initial scripts of "Cuneiform", "Running Script", and "Grass Script" to Certified Access Real-time Translation (CART). Since the boom of digitization, there has been a shift to storing these transcripts in the PDF/A format. Deposition transcripts take the form of question-answer (QA) pairs and can be too lengthy for lay readers to read comfortably. This motivates an automatic text-summarization method for them. Present-day summarization systems do not support this form of text, so the transcripts must first be processed: the documents need to be parsed to extract the QA pairs as well as any relevant supporting information. These QA pairs can then be converted into complete canonical sentences, i.e., sentences in declarative form, from which insights can be extracted and which can be used for downstream tasks. This work investigates these steps, including the use of deep-learning techniques for such transformations.
/ Master of Science / In the legal domain, documents of various types are created in connection with a particular case, such as testimony, transcripts, memos, and emails. Deposition transcripts are one such type of legal document; they consist of conversations between a lawyer and one of the parties in the legal proceedings, captured by a court reporter. Since the boom of digitization, there has been a shift to storing these transcripts in the PDF/A format. Deposition transcripts take the form of question-answer (QA) pairs and can be quite lengthy. Though automatic summarization could help, present-day systems do not work well with such texts. This creates a need to parse these documents and extract the QA pairs as well as any relevant supporting information. The QA pairs can then be converted into canonical sentences, i.e., sentences in declarative form, from which insights can be extracted to support downstream tasks. This work describes these conversions, including the use of deep-learning techniques for such transformations.

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/108405
Date: 15 September 2020
Creators: Mehrotra, Maanav
Contributors: Computer Science, Fox, Edward A., Hsiao, Michael S., Eldardiry, Hoda
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Detected Language: English
Type: Thesis
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
