Global ETD Search

Syntactic and Semantic Analysis and Visualization of Unstructured English Texts

People have complex thoughts, and they often express them in complex sentences using natural language. This complexity can make communication efficient among an audience that shares the same knowledge base. For a different or new audience, however, such compositions become cumbersome to understand and analyze. Analyzing them with syntactic or semantic measures is challenging and constitutes a foundational step in natural language processing.
In this dissertation I explore and propose a number of new techniques to analyze and visualize the syntactic and semantic patterns of unstructured English texts.
The syntactic analysis is performed through a proposed visualization technique that categorizes and compares English compositions based on their reading complexity metrics. For the semantic analysis, I use Latent Semantic Analysis (LSA) to uncover hidden patterns in complex compositions. I have applied this technique to comments from a social visualization web site to detect irrelevant ones (e.g., spam). Patterns of collaboration are also studied through statistical analysis.
Word sense disambiguation is used to determine the correct sense of a word in a sentence or composition. Applying a textual similarity measure, built on several word similarity measures and word sense disambiguation, to collaborative text snippets from a social collaborative environment reveals a direction for untangling the complex hidden patterns of collaboration.

Readability

Complexity depth of field

Grammatical structure

Visualization

Web mining

Web information retrieval

Recommendation

Semantic similarity

Word sense disambiguation

Natural Language Processing

Computer Sciences

Identifier	oai:union.ndltd.org:GEORGIA/oai:digitalarchive.gsu.edu:cs_diss-1062
Date	14 December 2011
Creators	Karmakar, Saurav
Publisher	Digital Archive @ GSU
Source Sets	Georgia State University
Detected Language	English
Type	text
Format	application/pdf
Source	Computer Science Dissertations
