Interpretable and generative AI for actionable insights from textual data

Applications of artificial intelligence (AI) and natural language processing (NLP) methods have enabled managers and researchers to process, interpret, and extract value from text efficiently. These systems offer a lens through which to understand consumer behavior and to monitor the dynamics of markets and brands. Recent advancements, including interpretable AI, variational autoencoders, and, most recently, transformer-based large language models (LLMs), have significantly altered the research landscape. These models, with their robust capabilities for language representation and generation, introduce both new opportunities and challenges in converting raw textual data into actionable business insights. In line with this research trend, this dissertation compiles and adapts three of my works on interpretable and generative models for extracting insights from textual data.
The first essay develops an interpretable machine learning algorithm for the automatic extraction of corpus-level concepts from textual data without the need for human-defined guidance or labeled concepts. Through a case study involving online purchase journey data, we demonstrate the model's ability to identify and quantify the importance of customer review concepts correlated with purchase conversion, providing external validation of its efficacy as an exploratory tool.
The second essay presents a variational autoencoder (VAE) that transforms unstructured patent text into a structured, spatial representation of firms' innovation activities. By learning a disentangled vector space of patents, the model offers interpretable insights into firms' AI-based intellectual property portfolios. Through applications across three decades of patents, this chapter showcases the model's utility in visualizing technology landscapes, engineering intuitive features from text, and augmenting patent applications to reduce rejection risks, thereby illustrating the transformative potential of generative AI in analyzing unstructured corporate data.
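For readers unfamiliar with VAEs, the core mechanism behind such a latent vector space can be sketched in a few lines. The snippet below shows only the two standard VAE ingredients — the reparameterization trick for sampling a latent vector, and the KL-divergence term that regularizes the latent space toward a standard-normal prior; it is a generic illustration, not the dissertation's actual patent model (the names `mu` and `log_var` are conventional VAE notation, and the encoder/decoder networks are omitted).

```python
# Minimal sketch of two standard VAE components: the reparameterization
# trick and the KL regularizer. Illustrative only; the dissertation's
# actual architecture and disentanglement objective are not shown here.
import math
import random

def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_divergence(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

# A posterior equal to the standard-normal prior has zero KL penalty.
print(kl_divergence([0.0, 0.0], [0.0, 0.0]))  # 0.0
```

Disentangled variants (e.g., the beta-VAE family) reweight this KL term to encourage each latent dimension to capture an independent factor of variation, which is what makes the learned patent space interpretable.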
The third essay explores the capabilities and limitations of LLMs in simulating human preferences and reasoning in the context of misinformation. Drawing on dual process theory, this study investigates the extent to which LLMs can mimic human discernment in the accuracy estimation and sharing of political news headlines. We argue that while current LLMs struggle to replicate complex cognitive reasoning behaviors, the integration of psychological theories offers a pathway to align language agents more closely with human reasoning processes.
Broadly, this dissertation addresses how to align NLP systems with human intentions for the tasks of language understanding, language representation, and agent development in information systems research, and highlights potential avenues for future research in these areas.

Identifer: oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/48748
Date: 13 May 2024
Creators: Cheng, Zhaoqi
Contributors: Lee, Dokyun
Source Sets: Boston University
Language: en_US
Detected Language: English
Type: Thesis/Dissertation