21. Automatic surface targets detection in forward scatter radar. Wei, Wei (January 2018)
The purpose of this thesis is to apply automatic detection techniques to forward scatter radar (FSR) for ground target detection against a vegetation clutter background and thermal noise. The thesis presents an FSR automatic detection performance analysis of three signal processing algorithms: coherent, non-coherent and cross-correlation. The concept of CFAR (constant false alarm rate) detection in forward scatter radar is presented, covering both pre-fixed threshold detection and adaptive threshold detection. The development of a set of simulation methods for target detection and performance analysis is described in detail. The results compare the probability of detection for both human and vehicle targets against a variety of clutter backgrounds: white Gaussian noise (WGN), stationary narrowband clutter, non-stationary narrowband clutter, and real recorded vegetation clutter at low (VHF and UHF) frequency bands. Finally, the advantages and limitations of the detection performance of each signal processing algorithm are described.
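To make the adaptive-threshold idea concrete, the following minimal sketch implements a standard cell-averaging CFAR detector over a one-dimensional power signal. The window sizes, the scaling factor alpha and the synthetic input are illustrative assumptions, not parameters taken from the thesis.

import numpy as np

def ca_cfar(power, num_train=16, num_guard=4, alpha=5.0):
    """Return a boolean detection mask over a 1-D power signal."""
    n = len(power)
    detections = np.zeros(n, dtype=bool)
    half = num_train // 2
    for i in range(half + num_guard, n - half - num_guard):
        # Training cells on both sides of the cell under test, excluding guard cells.
        lead = power[i - num_guard - half : i - num_guard]
        lag = power[i + num_guard + 1 : i + num_guard + 1 + half]
        noise_level = np.mean(np.concatenate([lead, lag]))
        detections[i] = power[i] > alpha * noise_level  # adaptive threshold
    return detections

# Toy usage: exponential (thermal-noise-like) background with one strong target crossing.
rng = np.random.default_rng(0)
signal = rng.exponential(scale=1.0, size=512)
signal[256] += 30.0
print(np.nonzero(ca_cfar(signal))[0])  # indices flagged as detections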
22. Privacy preserving search in large encrypted databases. Tahir, Shahzaib (January 2018)
The Cloud is an environment designed for the provision of on-demand resource sharing and data access to remotely located clients and devices. Once data is outsourced to the Cloud, clients tend to lose control of their data, thus becoming susceptible to data theft. To mitigate the risk of data theft, Cloud service providers employ methods such as encrypting data prior to outsourcing it to the Cloud. Although this increases security, it also gives rise to the challenge of searching and sifting through the large number of encrypted documents present in the Cloud. This thesis proposes a comprehensive framework that provides Searchable Encryption-as-a-Service (SEaaS) by enabling clients to search for keyword(s) over the encrypted data stored in the Cloud. Searchable Encryption (SE) is a methodology based on recognized cryptographic primitives that enables a client to search over encrypted Cloud data. This research makes five major contributions to the field of Searchable Encryption. The first contribution is a set of novel index-based SE schemes that increase query effectiveness while remaining lightweight. To increase query effectiveness, this thesis presents schemes that facilitate single-keyword, parallelized disjunctive-keyword (multi-keyword) and fuzzy-keyword searches. The second contribution is the incorporation of probabilistic trapdoors in all the proposed schemes. Probabilistic trapdoors enable the client to hide the search pattern even when the same keyword is searched repeatedly. Hence, this property allows the client to resist distinguishability attacks and prevents attackers from inferring the search pattern. The third contribution is the formulation of a "privacy-preserving" SE scheme through new security definitions for SE, namely keyword-trapdoor indistinguishability and trapdoor-index indistinguishability. The existing security definitions proposed for SE do not take into account the incorporation of probabilistic trapdoors and hence were not readily applicable to our proposed schemes; new definitions have therefore been studied. The fourth contribution is the validation that the proposed index-based SE schemes are efficient and can be deployed on a real-world Cloud offering. The proposed schemes have been implemented and proof-of-concept prototypes have been deployed onto the British Telecommunications Cloud Server (BTCS). Once deployed onto the BTCS, the proof-of-concept prototypes were tested over a large real-world speech corpus. The fifth contribution of the thesis is the study of a novel homomorphic SE scheme based on probabilistic trapdoors for the provision of a higher level of security and privacy. The proposed scheme is constructed on a Partially Homomorphic Encryption scheme that is lightweight compared to existing Fully Homomorphic-based SE schemes. The scheme also provides non-repudiation of the transmitted trapdoor while eliminating the need for a centralized data structure, thereby facilitating scalability across Cross-Cloud platforms.
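As a hedged illustration of how an index-based scheme with a probabilistic trapdoor can work in principle, the toy sketch below builds an encrypted-index lookup in which each query carries a fresh nonce, so repeated searches for the same keyword produce different trapdoors on the wire. The HMAC/hash construction, the linear scan over index tags and the index layout are simplifying assumptions for illustration only; they do not reproduce the schemes proposed in the thesis.

import hmac, hashlib, os

KEY = os.urandom(32)                                   # client-side secret key

def keyword_tag(keyword: str) -> bytes:
    """Deterministic per-keyword tag stored in the encrypted index."""
    return hmac.new(KEY, keyword.encode(), hashlib.sha256).digest()

def build_index(docs: dict) -> dict:
    index = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index.setdefault(keyword_tag(word), []).append(doc_id)
    return index

def make_trapdoor(keyword: str) -> tuple:
    """Probabilistic trapdoor: a fresh nonce makes repeated queries for the
    same keyword look different, hiding the client's search pattern."""
    nonce = os.urandom(16)
    proof = hashlib.sha256(nonce + keyword_tag(keyword)).digest()
    return nonce, proof

def server_search(index: dict, trapdoor: tuple) -> list:
    nonce, proof = trapdoor
    for tag, doc_ids in index.items():
        if hashlib.sha256(nonce + tag).digest() == proof:
            return doc_ids
    return []

index = build_index({"d1": "cloud storage security", "d2": "encrypted cloud search"})
print(server_search(index, make_trapdoor("cloud")))    # ['d1', 'd2']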
23. Real-time event detection using Twitter. McMinn, Andrew James (January 2018)
Twitter has become the social network of news and journalism. Monitoring what is said on Twitter is a frequent task for anyone who requires timely access to information: journalists, traders, and the emergency services have all invested heavily in monitoring Twitter in recent years. Given this, there is a need to develop systems that can automatically monitor Twitter to detect real-world events as they happen, and alert users to novel events. However, this is not an easy task due to the noise and volume of data that is produced by social media streams such as Twitter. Although a range of approaches have been developed, many are unevaluated, cannot scale past low-volume streams, or can only detect specific types of event. In this thesis, we develop novel approaches to event detection, and enable the evaluation and comparison of event detection approaches by creating a large-scale test collection called Events 2012, containing 120 million tweets with relevance judgements for over 500 events. We use existing event detection approaches and Wikipedia to generate candidate events, then use crowdsourcing to gather annotations. We propose a novel entity-based, real-time event detection approach that we evaluate using the Events 2012 collection, and show that it outperforms existing state-of-the-art approaches to event detection whilst also being scalable. We examine and compare automated and crowdsourced evaluation methodologies for the evaluation of event detection. Finally, we propose a Newsworthiness score that is learned in real time from heuristically labelled data. The score is able to accurately classify individual tweets as newsworthy or noise in real time. We adapt the score for use as a feature for event detection, and find that it can easily be used to filter out noisy clusters and improve existing event detection techniques. We conclude with a summary of our research findings and answers to our research questions. We discuss some of the difficulties that remain to be solved in event detection on Twitter and propose some possible future directions for research into real-time event detection on Twitter.
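The sketch below gives a minimal, hedged illustration of the entity-centric idea: tweets are grouped by the named entity they mention, and an event candidate is flagged when an entity's volume in the current time window greatly exceeds its volume in the previous window. The entity extraction, window handling and burst threshold are illustrative assumptions rather than the thesis's actual clustering approach.

from collections import Counter

class EntityBurstDetector:
    """Flag an entity as an event candidate when its tweet volume in the current
    time window greatly exceeds its volume in the previous window."""

    def __init__(self, burst_factor=3.0, min_count=5):
        self.burst_factor = burst_factor
        self.min_count = min_count
        self.previous = Counter()
        self.current = Counter()

    def add(self, entity: str) -> bool:
        self.current[entity] += 1
        baseline = max(self.previous[entity], 1)
        return (self.current[entity] >= self.min_count
                and self.current[entity] > self.burst_factor * baseline)

    def roll_window(self):
        """Call at the end of each time window (e.g. every few minutes)."""
        self.previous, self.current = self.current, Counter()

detector = EntityBurstDetector()
for entity in ["glasgow"] * 2 + ["election"] * 3:
    detector.add(entity)
detector.roll_window()
for entity in ["glasgow"] * 2 + ["earthquake"] * 12:   # sudden burst of a new entity
    if detector.add(entity):
        print("event candidate:", entity)               # fires for 'earthquake'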
24. Analysing political events on Twitter : topic modelling and user community classification. Fang, Anjie (January 2019)
Recently, political events, such as elections or referenda, have generated a great deal of discussion on social media networks, in particular Twitter. This brings new opportunities for social scientists to address social science tasks, such as understanding what communities said, identifying whether one community has an influence on another, or analysing how these communities respond to political events online. However, identifying these communities and extracting what they said from social media data are challenging and non-trivial tasks. In this thesis, we aim to make progress towards understanding 'who' (i.e. communities) said 'what' (i.e. discussed topics) and 'when' (i.e. time) during political events on Twitter. While identifying the 'who' can benefit from Twitter user community classification approaches, 'what' they said and 'when' can be effectively addressed by extracting their discussed topics using topic modelling approaches that also account for the importance of time on Twitter. To evaluate the quality of these topics, it is necessary to investigate how coherent these topics are to humans. Accordingly, we propose a series of approaches in this thesis. First, we investigate how to effectively evaluate the coherence of the topics generated using a topic modelling approach. A topic coherence metric evaluates topical coherence by examining the semantic similarity among the words in a topic. We argue that the semantic similarity of words in tweets can be effectively captured by using word embeddings trained on a Twitter background dataset. Through a user study, we demonstrate that our proposed word embedding-based topic coherence metric assesses the coherence of topics in line with human judgement. In addition, inspired by the precision-at-k information retrieval metric, we propose to evaluate the coherence of a topic model (containing many topics) by averaging the coherence of the top-ranked topics within the topic model. Our proposed metrics can not only evaluate the coherence of topics and topic models, but can also help users choose the most coherent topics. Second, we aim to extract topics with a high coherence from Twitter data. Such topics can be easily interpreted by humans and they can help examine 'what' has been discussed on Twitter and 'when'. Indeed, we argue that topics can be discussed in different time periods and therefore can be effectively identified and distinguished by considering their time periods. Hence, we propose an effective time-sensitive topic modelling approach that integrates the time dimension of tweets (i.e. 'when'). We show that the time dimension helps to generate topics with a high coherence. Hence, we argue that 'what' has been discussed and 'when' can be effectively addressed by our proposed time-sensitive topic modelling approach. Next, to identify 'who' participated in the topic discussions, we propose approaches to identify the community affiliations of Twitter users, including automatic ground-truth generation approaches and a user community classification approach. To generate ground-truth data for training a user community classifier, we show that the hashtags and entities mentioned in users' tweets can indicate which community a Twitter user belongs to. Hence, we argue that they can be used to generate the ground-truth data for classifying users into communities. On the other hand, we argue that different communities favour different topic discussions and that their community affiliations can be identified by leveraging the discussed topics.
Accordingly, we propose a Topic-Based Naive Bayes (TBNB) classification approach to classify Twitter users based on their words and discussed topics. We demonstrate that our TBNB classifier, together with the ground-truth generation approaches, can effectively identify the community affiliations of Twitter users. Finally, to show the generalisability of our approaches, we apply them to analyse 3.6 million tweets related to the 2016 US Election. We show that our TBNB approach can effectively identify the 'who', i.e. classify Twitter users into communities by using hashtags and the discussed topics. To investigate 'what' these communities have discussed, we apply our time-sensitive topic modelling approach to extract coherent topics. We finally analyse the community-related topics evaluated and selected using our proposed topic coherence metrics. Overall, we contribute effective approaches to assist social scientists in analysing political events on Twitter. These approaches include topic coherence metrics, a time-sensitive topic modelling approach and approaches for classifying the community affiliations of Twitter users. Together, they make progress towards studying and understanding the connections and dynamics among communities on Twitter.
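A hedged sketch of the embedding-based coherence idea: a topic's coherence is taken as the average pairwise cosine similarity of its top words under word embeddings (assumed to be trained on a Twitter background corpus), and a topic model's coherence is the precision-at-k style average over its top-ranked topics. The dictionary-based embedding lookup is an illustrative assumption, and the exact formulation in the thesis may differ.

import itertools
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def topic_coherence(top_words, embeddings):
    """Average pairwise cosine similarity of a topic's top words."""
    pairs = [(w1, w2) for w1, w2 in itertools.combinations(top_words, 2)
             if w1 in embeddings and w2 in embeddings]
    if not pairs:
        return 0.0
    return sum(cosine(embeddings[w1], embeddings[w2]) for w1, w2 in pairs) / len(pairs)

def model_coherence(topics, embeddings, top_n=10):
    """Precision-at-k style score: average coherence of the top_n best topics."""
    scores = sorted((topic_coherence(t, embeddings) for t in topics), reverse=True)
    return sum(scores[:top_n]) / min(top_n, len(scores)) if scores else 0.0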
25. A framework for technology-assisted sensitivity review : using sensitivity classification to prioritise documents for review. McDonald, Graham (January 2019)
More than a hundred countries implement freedom of information laws. In the UK, the Freedom of Information Act 2000 (FOIA) states that the government's documents must be made freely available, or opened, to the public. Moreover, all central UK government departments' documents that have a historic value, for example the minutes from significant meetings, must be transferred to The National Archives (TNA) within twenty years of the document's creation. However, government documents can contain sensitive information, such as personal information or information that would likely damage the international relations of the UK if it were opened to the public. Therefore, all government documents that are to be publicly archived must be sensitivity reviewed to identify and redact the sensitive information, or close the document until the information is no longer sensitive. Historically, government documents have been stored in a structured file-plan that can reliably inform a sensitivity reviewer about the subject-matter and the likely sensitivities in the documents. However, the lack of structure in digital document collections and the volume of digital documents that are to be sensitivity reviewed mean that the traditional manual sensitivity review process is not practical for digital sensitivity review. In this thesis, we argue that the automatic classification of documents that contain sensitive information, referred to as sensitivity classification, can be deployed to assist government departments and human reviewers to sensitivity review born-digital government documents. However, classifying sensitive information is a complex task, since sensitivity is context-dependent. For example, identifying whether information is sensitive or not can require a human to judge the likely effect of releasing the information into the public domain. Moreover, sensitivity is not necessarily topic-oriented, i.e., it is usually dependent on a combination of what is being said and about whom. Furthermore, the vocabulary and entities that are associated with particular types of sensitive information, e.g., confidential information, can vary greatly between different collections. We propose to address sensitivity classification as a text classification task. Moreover, through a thorough empirical evaluation, we show that text classification is effective for sensitivity classification and can be improved by identifying the vocabulary, syntactic and semantic document features that are reliable indicators of sensitive or non-sensitive text. Furthermore, we propose to reduce the number of documents that have to be reviewed to learn an effective sensitivity classifier through an active learning strategy, in which a sensitivity reviewer redacts any sensitive text in a document as they review it, to construct a representation of the sensitivities in a collection. With this in mind, we propose a novel framework for technology-assisted sensitivity review that can prioritise the most appropriate documents to be reviewed at specific stages of the review process. Furthermore, our framework can provide the reviewers with useful information to assist them in making their reviewing decisions. Our framework consists of four components, namely the Document Representation, Document Prioritisation, Feedback Integration and Learned Predictions components, that can be instantiated to learn from the reviewers' feedback about the sensitivities in a collection or provide assistance to reviewers at different stages of the review.
In particular, firstly, the Document Representation component encodes the document features that can be reliable indicators of the sensitivities in a collection. Secondly, the Document Prioritisation component identifies the documents that should be prioritised for review at a particular stage of the reviewing process, for example to provide the sensitivity classifier with information about the sensitivities in the collection or to focus the available reviewing resources on the documents that are the most likely to be released to the public. Thirdly, the Feedback Integration component integrates explicit feedback from a reviewer to construct a representation of the sensitivities in a collection and identify the features of a reviewer's interactions with the framework that indicate the amount of time that is required to sensitivity review a specific document. Finally, the Learned Predictions component combines the information that has been generated by the other three components and, as the final step in each iteration of the sensitivity review process, the Learned Predictions component is responsible for making accurate sensitivity classification and expected reviewing time predictions for the documents that have not yet been sensitivity reviewed. In this thesis, we identify two realistic digital sensitivity review scenarios as user models and conduct two user studies to evaluate the effectiveness of our proposed framework for assisting digital sensitivity review. Firstly, in the limited review user model, which addresses a scenario in which there are insufficient reviewing resources available to sensitivity review all of the documents in a collection, we show that our proposed framework can increase the number of documents that can be reviewed and released to the public with the available reviewing resources. Secondly, in the exhaustive review user model, which addresses a scenario in which all of the documents in a collection will be manually sensitivity reviewed, we show that providing the reviewers with useful information about the documents in the collection that contain sensitive information can increase the reviewers' accuracy, reviewing speed and agreement. This is the first thesis to investigate automatically classifying FOIA sensitive information to assist digital sensitivity review. The central contributions of this thesis are our proposed framework for technology-assisted sensitivity review and our sensitivity classification approaches. Our contributions are validated using a collection of government documents that are sensitivity reviewed by expert sensitivity reviewers to identify two FOIA sensitivities, namely international relations and personal information. The thesis draws insights from a thorough evaluation and analysis of our proposed framework and sensitivity classifier. Our results demonstrate that our proposed framework is a viable technology for assisting digital sensitivity review.
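The sketch below illustrates, under simplifying assumptions, how an active-learning loop for sensitivity classification might prioritise documents for review: a classifier is retrained after each batch of reviewer feedback, and the least-certain unreviewed documents are queued next. The TF-IDF features, logistic regression classifier, uncertainty-sampling criterion and batch size are illustrative assumptions, not the components of the proposed framework.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def review_loop(documents, ask_reviewer, seed_labels, rounds=5, batch=10):
    """`seed_labels` maps document index -> label and must contain both classes;
    `ask_reviewer(doc)` returns 1 (sensitive) or 0 (non-sensitive)."""
    vectoriser = TfidfVectorizer(min_df=2)
    X = vectoriser.fit_transform(documents)
    labelled = dict(seed_labels)
    clf = None
    for _ in range(rounds):
        idx = list(labelled)
        clf = LogisticRegression(max_iter=1000).fit(X[idx], [labelled[i] for i in idx])
        probs = clf.predict_proba(X)[:, 1]
        # Prioritise the unreviewed documents the classifier is least certain about.
        unlabelled = [i for i in range(len(documents)) if i not in labelled]
        unlabelled.sort(key=lambda i: abs(probs[i] - 0.5))
        for i in unlabelled[:batch]:
            labelled[i] = ask_reviewer(documents[i])    # explicit reviewer feedback
    return clf, labelled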
26. Video popularity metrics and bubble cache eviction algorithm analysis. Weisenborn, Hildebrand J. (January 2018)
Video data is the largest type of traffic on the Internet, currently responsible for over 72% of the total traffic, with over 883 PB of data per month in 2016. Large-scale content delivery network (CDN) solutions are available that offer a variety of distributed hosting platforms for the purpose of transmitting video over IP. However, the IP protocol, unlike information-centric networking (ICN) protocol implementations, does not provide an anycast architecture from which a CDN would greatly benefit. In this thesis we introduce a novel cache eviction strategy called "Bubble", as well as two variants of Bubble, that can be applied to anycast protocols to aid in optimising video delivery. Bubble, Bubble-LRU and Bubble-Insert were found to greatly reduce the quantity of video-associated traffic observed in cache-enabled networks. Additionally, two video popularity distributions provided by British Telecom (BT) were analysed using the Kullback-Leibler divergence and Pearson chi-squared testing methods. This was done to assess which model, Zipf or Zipf-Mandelbrot, is better suited to replicating video popularity distributions; the results of these tests conclude that Zipf-Mandelbrot is the more appropriate model. The work concludes that the novel cache eviction algorithms introduced in this thesis provide an efficient caching mechanism for future content delivery networks and that the modelled Zipf-Mandelbrot distribution is a better method for simulating the performance of caching algorithms.
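As a hedged illustration of the model-comparison step, the sketch below scores a Zipf and a Zipf-Mandelbrot rank-probability model against an empirical popularity distribution using Kullback-Leibler divergence (a lower value indicates a closer fit). The exponent, offset and the synthetic view counts are illustrative assumptions standing in for the BT-provided data; a chi-squared test could be applied analogously.

import numpy as np

def zipf_mandelbrot(n_items, s, q):
    """Rank-probability model p(k) proportional to 1 / (k + q)^s; q = 0 reduces to plain Zipf."""
    ranks = np.arange(1, n_items + 1)
    p = 1.0 / (ranks + q) ** s
    return p / p.sum()

def kl_divergence(p_emp, p_model):
    mask = p_emp > 0
    return float(np.sum(p_emp[mask] * np.log(p_emp[mask] / p_model[mask])))

# Synthetic view counts standing in for an observed video popularity distribution.
rng = np.random.default_rng(1)
views = rng.zipf(1.5, size=20000)                 # each sample = one view of video k
_, counts = np.unique(views, return_counts=True)
counts = np.sort(counts)[::-1].astype(float)
p_emp = counts / counts.sum()
n = len(p_emp)
print("Zipf            KL:", kl_divergence(p_emp, zipf_mandelbrot(n, s=1.5, q=0.0)))
print("Zipf-Mandelbrot KL:", kl_divergence(p_emp, zipf_mandelbrot(n, s=1.5, q=2.0)))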
27. An ICMetric based framework for secure end-to-end communication. Tahir, Ruhma (January 2018)
Conventional cryptosystems rely on highly sophisticated and well-established algorithms to ensure security, provided the cryptographic keys are kept secret. However, adversaries can attack the keys of a cryptosystem without targeting the algorithm. This dissertation aims to address this gap in the domain of cryptography, that is, the problem associated with cryptographic key compromise. The thesis accomplishes this by presenting a novel security framework based on the ICMetric technology. The proposed framework provides schemes for a secure end-to-end communication environment based on the ICMetric technology, which is a novel root of trust and can eliminate issues associated with stored keys. The ICMetric technology processes unique system features to establish an identity which is then used as a basis for cryptographic services. Hence the thesis presents a study of the concept of the ICMetric technology and of the features suitable for generating the ICMetric of a system. The first contribution of this thesis is the creation of ICMetric keys of sufficient length and entropy that can be used in cryptographic applications. The proposed strong ICMetric key generation scheme follows a two-tier structure, so that the ICMetric keys are resilient to precomputation attacks. The second contribution of this thesis is a symmetric key scheme, based on the ICMetric of the system, that can be used for symmetric key applications. The symmetric keys are generated using zero-knowledge protocols, and the cryptographic services are provided without transmitting the key over the channel. The fourth major contribution of this thesis is the investigation into the feasibility of employing the ICMetric technology for identifying Docker containers employed by cloud service providers for hosting their cloud services.
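Purely as a hedged, generic illustration of turning measured device features into key material (and not the thesis's actual two-tier ICMetric construction), the sketch below quantises noisy feature readings so that repeated measurements map to the same identity value, then stretches that value into a key with an HKDF-style extract-and-expand step. The feature values, quantisation step and salt are hypothetical.

import hashlib, hmac, struct

def quantise(feature_values, step=0.05):
    """Round noisy feature measurements so repeated readings map to the same bits."""
    return [round(v / step) for v in feature_values]

def derive_key(feature_values, salt=b"icmetric-demo", length=32):
    ident = b"".join(struct.pack(">q", q) for q in quantise(feature_values))
    prk = hmac.new(salt, ident, hashlib.sha256).digest()           # HKDF-style extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                                        # HKDF-style expand
        block = hmac.new(prk, block + b"key" + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Hypothetical feature measurements from a device (e.g. sensor bias readings).
print(derive_key([0.731, 1.204, 0.998, 2.511]).hex())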
28. High speed 802.11ad wireless video streaming. Abe, Adewale (January 2018)
The aim of this thesis is to investigate, both theoretically and experimentally, the capability of an IEEE 802.11ad (Wireless Gigabit Alliance, or WiGig) device operating in the 60 GHz band to handle the rise in data traffic associated with high-speed data transmission, such as bulk data transfer and wireless video streaming. According to Cisco and others, it is estimated that by 2020 internet video traffic will account for 82% of all consumer internet traffic. This research evaluated the feasibility of the 60 GHz band providing a minimum data rate of about 970 Mbps from an Ethernet link limited, or clamped, to 1 Gbps. This translates to 97% efficiency with respect to the IEEE 802.11ad system performance. For the first time, the author proposed enhancing millimetre-wave propagation through the use of specular reflection in non-line-of-sight environments, providing at least 94% bandwidth utilization. Additional investigation of the IEEE 802.11ad device in live streaming of 4K ultra-high-definition (UHD) video shows the feasibility of aggressive frequency reuse in the absence of co-channel interference. Moreover, using a heuristic approach, this work compared material absorption and signal reception at 60 GHz, and the results give better performance in contrast to the theoretical values. Finally, this thesis proposes a framework for 802.11ad wireless H.264 video streaming over the 60 GHz band. The work describes the potential and efficiency of the WiGig device in streaming high-definition (HD) video with a high temporal index (TI) and 4K UHD video with no retransmission. A caching point established at the re-transmitter increases coverage and caches multimedia data. The results in this thesis show the growing potential of millimetre-wave technology, WiGig, for very high speed bulk data transfer and live video streaming.
29. Efficient magnetic resonance wireless power transfer systems. Thabet, Thabat (January 2018)
This thesis aims to improve the performance of magnetic resonance wireless power transfer systems. Several factors affect the performance and the efficiency of maximum power transfer in such systems: the resonance frequency; the quality factor of the resonators; the value and shape of the coils; the mutual inductance, including the distance between the coils; and the load. These systems have four potential types of connection between the transmitter and receiver: Serial to Serial (SS), Serial to Parallel (SP), Parallel to Serial (PS) and Parallel to Parallel (PP). Each type suits different applications because its performance differs from the others. Magnetic resonance wireless power systems in some applications consist of one transmitter and one receiver, while in other applications there is a demand to transfer power to more than one receiver simultaneously; hence the importance of studying multiple-receiver systems. The serial to serial connection was studied along with the effects of all the other factors on the efficiency, including the existence of multiple receivers. The symmetric capacitance tuning method was presented as a solution to the frequency splitting problem that usually appears in SS wireless power transfer systems with a small gap between the two resonators. Compared to other existing methods, this method provides the advantage of high efficiency while keeping the frequency within the chosen Industrial, Scientific and Medical (ISM) band. The impact of the connection type on the efficiency of wireless power transfer systems and the effect of the load impedance on each type were also studied. Finally, an algorithm for intelligent management and control of received wireless power was proposed to run a load that requires more than the received power.
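The interplay of these factors for the serial to serial (SS) connection can be sketched with the textbook reflected-impedance model shown below: the tank components set the resonant frequency, and the mutual inductance, coil resistances and load determine the link efficiency at resonance. The component values are illustrative assumptions, not measurements from the thesis.

import math

def resonant_frequency(L, C):
    """f0 = 1 / (2 * pi * sqrt(L * C)) for a resonant tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(L * C))

def ss_link_efficiency(f, M, R1, R2, R_load):
    """Efficiency of a two-coil SS link at resonance (reflected-impedance model)."""
    w = 2.0 * math.pi * f
    Z_reflected = (w * M) ** 2 / (R2 + R_load)       # secondary reflected into the primary
    eta_primary = Z_reflected / (R1 + Z_reflected)   # fraction of power leaving the source coil
    eta_secondary = R_load / (R2 + R_load)           # fraction delivered to the load
    return eta_primary * eta_secondary

L, C = 24e-6, 5.74e-12                               # roughly a 13.56 MHz ISM tank (illustrative)
f0 = resonant_frequency(L, C)
eta = ss_link_efficiency(f0, M=1.2e-6, R1=0.5, R2=0.5, R_load=10.0)
print(f"f0 = {f0 / 1e6:.2f} MHz, efficiency = {eta:.1%}")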
30. A study of the kinematics of probabilities in information retrieval. Crestani, Fabio A. (January 1998)
In Information Retrieval (IR), probabilistic modelling is related to the use of a model that ranks documents in decreasing order of their estimated probability of relevance to a user's information need expressed by a query. In an IR system based on a probabilistic model, the user is guided to examine first the documents that are the most likely to be relevant to their need. If the system performs well, these documents should be at the top of the retrieved list. In mathematical terms the problem consists of estimating the probability P(R | q,d), that is, the probability of relevance given a query q and a document d. This estimate should be performed for every document in the collection, and documents should then be ranked according to this measure. For this evaluation the system should make use of all the information available in the indexing term space. This thesis contains a study of the kinematics of probabilities in probabilistic IR. The aim is to gain better insight into the behaviour of the probabilistic models of IR currently in use and to propose new and more effective models by exploiting different kinematics of probabilities. The study is performed from both a theoretical and an experimental point of view. Theoretically, the thesis explores the use of the probability of a conditional, namely P(d → q), to estimate the conditional probability P(R | q,d). This is achieved by interpreting the term space in the context of the "possible worlds semantics". Previous approaches in this direction took as their basic assumption the view that "a document is a possible world". In this thesis a different approach is adopted, based on the assumption that "a term is a possible world". This approach enables the exploitation of term-term semantic relationships in the term space, estimated using an information-theoretic measure. This form of information is rarely used in IR at retrieval time. Two new models of IR are proposed, based on two different ways of estimating P(d → q) using a logical technique called Imaging. The first model is called Retrieval by Logical Imaging; the second is called Retrieval by General Logical Imaging, being a generalisation of the first model. The probability kinematics of these two models is compared with that of two other proposed models: the Retrieval by Joint Probability model and the Retrieval by Conditional Probability model. These last two models mimic the probability kinematics of the Vector Space model and of the Probabilistic Retrieval model. Experimentally, the retrieval effectiveness of the above four models is analysed and compared using five test collections of different sizes and characteristics. The results of this experimentation depend heavily on the choice of term weight and term similarity measures adopted. The most important conclusion of this thesis is that, theoretically, a probability transfer that takes into account the semantic similarity between the probability-donor and the probability-recipient is more effective than a probability transfer that does not take that into account. In the context of IR this is equivalent to saying that models that exploit the semantic similarity between terms in the term space at retrieval time are more effective than models that do not.
Unfortunately, while the experimental investigation carried out using small test collections provides evidence supporting this conclusion, experiments performed using larger test collections do not provide as much supporting evidence (although they do not provide contrasting evidence either). The peculiar characteristics of the term space of different collections play an important role in shaping the effects that different probability kinematics have on the effectiveness of the retrieval process. The above result suggests the necessity and the usefulness of further investigations into more complex and optimised models of probabilistic IR, where probability kinematics follows non-classical approaches. The models proposed in this thesis are just two such approaches; others can be developed using recent results achieved in other fields, such as non-classical logics and belief revision theory.
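A hedged sketch of the imaging step described above, with terms treated as possible worlds: the prior probability of every term outside the document is transferred to its most similar term inside the document, and P(d → q) is the total imaged probability carried by the query terms. The prior and the similarity function are left abstract here; the thesis estimates term-term similarity with an information-theoretic measure, and Retrieval by General Logical Imaging generalises this transfer in a way the sketch does not attempt to reproduce.

def retrieval_by_logical_imaging(doc_terms, query_terms, prior, similarity):
    """Score P(d -> q) for one document.

    doc_terms, query_terms : sets of index terms
    prior                  : {term: probability}, summing to 1 over the term space
    similarity(t1, t2)     : term-term similarity (any measure can be plugged in)
    """
    imaged = {t: 0.0 for t in doc_terms}
    for term, p in prior.items():
        if term in doc_terms:
            imaged[term] += p                       # worlds in d keep their probability
        else:
            # a world outside d transfers its probability to the closest world inside d
            nearest = max(doc_terms, key=lambda u: similarity(term, u))
            imaged[nearest] += p
    return sum(p for t, p in imaged.items() if t in query_terms)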