
Mining Security Risks from Massive Datasets

Cyber security risk has been a problem ever since the appearance of telecommunication and electronic computers. Over the past 30 years, researchers have developed various tools to protect the confidentiality, integrity, and availability of data and programs.

However, new challenges are emerging as the amount of data grows rapidly in the big data era. On the one hand, attacks are becoming stealthier by concealing their behaviors in massive datasets. On the other hand, it is becoming increasingly difficult for existing tools to handle massive datasets with various data types.

This thesis presents attempts to address these challenges and solve different security problems by mining security risks from massive datasets. The attempts cover three aspects: detecting security risks in the enterprise environment, prioritizing security risks of mobile apps, and measuring the impact of security risks between websites and mobile apps. First, the thesis presents a framework to detect data leakage in very large content. The framework can be deployed on the cloud for enterprises while preserving the privacy of sensitive data. Second, the thesis prioritizes inter-app communication risks in large-scale Android apps by designing a new distributed inter-app communication linking algorithm and performing nearest-neighbor risk analysis. Third, the thesis measures the impact of deep link hijacking risk, which is one type of inter-app communication risk, on 1 million websites and 160 thousand mobile apps. The measurement reveals the failure of Google's attempts to improve the security of deep links.

Ph. D.

Cyber security risk has been a problem ever since the appearance of telecommunication and electronic computers. Over the past 30 years, researchers have developed various tools to prevent sensitive data from being accessed by unauthorized users, to protect programs and data from being changed by attackers, and to make sure programs and data are available whenever needed.

However, new challenges are emerging as the amount of data grows rapidly in the big data era. On the one hand, attacks are becoming stealthier by concealing their attack behaviors in massive datasets. On the other hand, it is becoming increasingly difficult for existing tools to handle massive datasets with various data types.

This thesis presents attempts to address these challenges and solve different security problems by mining security risks from massive datasets. The attempts cover three aspects: detecting security risks in the enterprise environment where massive datasets are involved, prioritizing security risks of mobile apps so that high-risk apps are analyzed first, and measuring the impact of security risks in the communication between websites and mobile apps. First, the thesis presents a framework to detect sensitive data leakage in the enterprise environment from very large content. The framework can be deployed on the cloud for enterprises while keeping the sensitive data from being accessed by the semi-honest cloud. Second, the thesis prioritizes inter-app communication risks in large-scale Android apps by designing a new distributed inter-app communication linking algorithm and performing nearest-neighbor risk analysis. The algorithm runs on a cluster to speed up the computation. The analysis leverages each app’s communication context with all the other apps to prioritize the inter-app communication risks. Third, the thesis measures the impact of mobile deep link hijacking risk on 1 million websites and 160 thousand mobile apps. Mobile deep link hijacking happens when a user clicks a link that is supposed to be opened by one app but is instead hijacked by another, malicious app. Mobile deep link hijacking is one type of inter-app communication risk between mobile browsers and apps. The measurement reveals the failure of Google’s attempts to improve the security of mobile deep links.

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/78684
Date: 09 August 2017
Creators: Liu, Fang
Contributors: Computer Science, Yao, Danfeng (Daphne), Xu, Dongyan, Prakash, B. Aditya, Butt, Ali R., Lou, Wenjing
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Detected Language: English
Type: Dissertation
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
