311

Biometrics writer recognition for Arabic language : analysis and classification techniques using subwords features

Maliki, Makki Jasim Radhi January 2015 (has links)
Handwritten text in any language is believed to convey a great deal of information about a writer's personality and identity. Indeed, the handwritten signature has long been accepted as authentication of the writer's physical stamp on financial and legal deals as well as official/personal documents and works of art. Handwritten documents are frequently used as evidence in forensic investigations. Handwriting skills are learnt and developed from the early schooling stages. Research interest in behavioural biometrics was the main driving force behind the growth in research into Writer Identification (WI) from handwritten text, but the recent rise in terrorism associated with extreme religious ideologies spreading primarily, but not exclusively, from the Middle East has led to a surge of interest in WI from handwritten text in Arabic and similar languages. This thesis is the main outcome of extensive research investigations conducted with the aim of developing an automatic identification of a person from handwritten Arabic text samples. My motivations and interests, as an Iraqi researcher, emanate from my multi-faceted desire to provide scientific support for my people in their fight against terrorism by providing forensic evidence, and to contribute to the ongoing digitisation of the Iraqi National Archive as well as the wealth of religious and historical archives in Iraq and the Middle East. Good knowledge of the underlying language is invaluable in this project. Despite the rising interest in this recognition modality worldwide, Arabic writer identification has not been addressed as extensively as Latin writer identification. However, in recent years some new Arabic writer identification approaches have been proposed, some of which are reviewed in this thesis. Arabic is a cursive language when handwritten, and each writer develops some unique features that demonstrate the writer's habits and style. These habits and styles are considered as unique WI features and determining factors. Existing dominant approaches to WI are based on the premise that handwriting habits/styles are embedded in certain parts/components of the written text. Although the appearance of these components within long text contains rich information and clues to writer identity, the most common approaches to WI in Arabic in the literature are based on features extracted from paragraph(s), line(s), word(s), character(s), and/or a part of a character. Generally, Arabic words are made up of one or more subwords; at the end of each subword there is a connected stroke with a certain style, which seems to be most representative of a writer's habits. Another feature of Arabic writing is the diacritics that are added to written words/subwords to convey meaning and pronunciation. Subwords are more frequent in written Arabic text and appear as parts of several different words or as full individual words. Thus, we propose a new, innovative approach based on the plausible hypothesis that subword-based WI yields a significant increase in accuracy over existing approaches. The thesis's most significant contributions can be summarised as follows: - Developed a high-performing segmentation of scanned text images that combines threshold-based binarisation, morphological operations and an active shape model. - Defined digital measures and formed 15-dimensional feature vector representations of subwords that implicitly cover their diacritics and strokes.
- A pilot study that incrementally added features according to writer-discriminating power reduced the subword feature vector dimension to 8, two of which were modelled as time series. - For the text-dependent 8-dimensional WI scheme, we identified the best-performing set of subwords (the best 22 subwords out of 49, then the best 11 out of these 22). - We established the validity of our hypothesis for different versions of subword-based WI schemes by providing empirical evidence when testing on a number of existing text-dependent and text-independent databases plus a simulated text-independent DB. The text-dependent scenario results exhibited the possible presence of the Doddington Zoo phenomenon. - The final optimal subword-based WI scheme not only removes the need to include diacritics as part of the subword but also demonstrates that including diacritics within subwords impairs the WI discriminating power of subwords. This should not be taken to discredit research based on diacritic-based WI. Moreover, this subword-body (without diacritics) based WI scheme eliminated the presence of the Doddington Zoo effect. - Finally, a significant but unintended consequence of using subwords for WI is that there is no difference between a text-independent scenario and a text-dependent one. In fact, we demonstrate that the text-dependent database of the 27 words can be used to simulate the testing of the scheme for a text-independent database without the need to record such a DB. Finally, we discussed ways of optimising the performance of our last scheme by considering possible ways of complementing it with various image texture analysis features extracted from subwords, lines, paragraphs or the entire scanned image. These included Local Binary Patterns (LBP) and Gabor filters. We also suggested the possible addition of a few more features.
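As an illustration of the first contribution, the following Python sketch shows how the threshold-based binarisation and morphological stages of such a segmentation pipeline might look using OpenCV; the active shape model stage is omitted, and the kernel size and area threshold are illustrative assumptions rather than the thesis's settings.

```python
import cv2

def extract_subword_candidates(image_path, min_area=30):
    """Binarise a scanned text image and return candidate subword regions.

    A rough stand-in for the first two stages of the pipeline:
    Otsu thresholding followed by a morphological closing that merges
    the strokes of a subword while leaving gaps between subwords.
    """
    grey = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Otsu's method picks the binarisation threshold automatically.
    _, binary = cv2.threshold(grey, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # Close small gaps within a subword; the kernel size is illustrative.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
    # Connected components approximate subword (plus diacritic) blobs.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(closed)
    return [tuple(stats[i][:4]) for i in range(1, n)          # (x, y, w, h)
            if stats[i][cv2.CC_STAT_AREA] >= min_area]
```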
312

Multi evidence fusion scheme for content-based image retrieval by clustering localised colour and texture features

Al-Jubouri, Hanan January 2015 (has links)
Content-Based Image Retrieval (CBIR) is an automatic process of retrieving images according to their visual content. Research in this field mainly follows two directions. The first is concerned with effectiveness in describing the visual content of images (i.e. features) by a technique that makes it possible to discern similar and dissimilar images, and ultimately to retrieve the images most relevant to the query image. The second direction focuses on retrieval efficiency by deploying efficient structures to organise images by their features in the database, narrowing down the search space. The emphasis of this research is mainly on effectiveness rather than efficiency. There are two types of visual content features. The global feature represents the entire image by a single vector; retrieval using the global feature is therefore more efficient but often less accurate. On the other hand, the local feature represents the image by a set of vectors capturing localised visual variations in different parts of an image, promising better results, particularly for images with complicated scenes. The first main purpose of this thesis is to study different types of local features. We explore a range of local features from both the frequency and spatial domains. Because of the large number of local features generated from an image, clustering methods are used to quantise and summarise the feature vectors into segments from which a representation of the visual content of the entire image is derived. Since each clustering method has a different way of working and requires settings of different input parameters (e.g. number of clusters), preparations of input data (i.e. normalised or not) and choices of similarity measure, varied performance outcomes from different clustering methods in segmenting the local features are anticipated. We therefore also study and analyse one commonly used clustering algorithm from each of the four main categories of clustering methods, i.e. K-means (partition-based), EM/GMM (model-based), Normalized Laplacian Spectral (graph-based), and Mean Shift (density-based). These algorithms were investigated in two scenarios, where the number of clusters is either fixed or adaptively determined. Performances of the clustering algorithms in terms of image classification and retrieval are evaluated using three publicly available image databases. The evaluations revealed that a local DCT colour-texture feature was overall the best due to its robust integration of colour and texture information. In addition, our investigation into the behaviour of different clustering algorithms showed that each algorithm has its own strengths and limitations in segmenting local features, which affect the performance of image retrieval due to variations in visual colour and texture of the images. No single algorithm outperforms the others using either an adaptively determined or a large fixed number of clusters. The second focus of this research is to investigate how to combine the positive effects of various local features obtained from different clustering algorithms in a fusion scheme, aiming to bring about improved retrieval results over those achieved by a single clustering algorithm. The proposed fusion scheme integrates the information from different sources effectively, increasing the overall accuracy of retrieval.
The proposed multi-evidence fusion scheme treats as evidence the image retrieval scores obtained by normalising the distances produced by applying different clustering algorithms to different types of local features, and was presented in three forms: 1) evidence fusion using fixed weights (MEFS), where the weights were determined empirically and fixed a priori; 2) evidence fusion based on adaptive weights (AMEFS), where the fusion weights were adaptively determined using linear regression; 3) evidence fusion using a linear combination (CombSUM) without weighting the evidence. Overall, all three versions of the multi-evidence fusion scheme proved able to enhance the accuracy of image retrieval by increasing the number of relevant images in the ranked list. However, the improvement varied across the different feature-clustering combinations (i.e. image representations) and the image databases used for the evaluation. This thesis presents an automatic method of image retrieval that can deal with natural world scenes by applying different clustering algorithms to different local features. The method achieves good accuracies of 85% at Top 5 and 80% at Top 10 over the WANG database, which compare favourably with a number of other well-known solutions in the literature. At the same time, the knowledge gained from this research, such as the effects of different types of local features and clustering methods on retrieval results, enriches the understanding of the field and can be beneficial to the CBIR community.
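The fixed-weight variant (MEFS) amounts to a weighted sum of normalised scores, and CombSUM to the unweighted case. A minimal Python sketch of this fusion step follows; the min-max normalisation and the example weights are assumptions for illustration, not the values tuned in the thesis.

```python
import numpy as np

def distances_to_scores(dist):
    """Map distances to similarity scores in [0, 1] via min-max scaling."""
    dist = np.asarray(dist, dtype=float)
    rng = dist.max() - dist.min()
    return 1.0 - (dist - dist.min()) / rng if rng > 0 else np.ones_like(dist)

def fuse_scores(evidence, weights=None):
    """Fuse per-evidence score arrays over the same candidate images.

    evidence : list of 1-D arrays, one per feature-clustering combination.
    weights  : fixed fusion weights (MEFS-style); None gives CombSUM,
               i.e. an unweighted linear combination.
    """
    evidence = [np.asarray(e, dtype=float) for e in evidence]
    if weights is None:
        weights = np.ones(len(evidence))
    return sum(w * e for w, e in zip(weights, evidence))

# Example: distances from two feature-clustering combinations for 4 images.
d_kmeans_dct = [0.2, 0.9, 0.4, 0.7]
d_gmm_gabor = [0.5, 0.8, 0.1, 0.6]
scores = fuse_scores([distances_to_scores(d_kmeans_dct),
                      distances_to_scores(d_gmm_gabor)],
                     weights=[0.6, 0.4])   # illustrative weights only
print(np.argsort(-scores))  # image ids, best fused match first
```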
313

A knowledge-based approach to scientific workflow composition

McIver, Russell P. January 2015 (has links)
Scientific Workflow Systems have been developed as a means to enable scientists to carry out complex analysis operations on local and remote data sources in order to achieve their research goals. Systems typically provide a large number of components and facilities to enable such analysis to be performed and have matured to a point where they offer many complex capabilities. This complexity makes it difficult for scientists working with these systems to readily achieve their goals. In this thesis we describe the increasing burden of knowledge required of these scientists in order for them to specify the outcomes they wish to achieve within the workflow systems. We consider ways in which the challenges presented by these systems can be reduced, focusing on the following questions: How can metadata describing the resources available assist users in composing workflows? Can automated assistance be provided to guide users through the composition process? Can such an approach be implemented so as to work with the resources provided by existing Scientific Workflow Systems? We have developed a new approach to workflow composition which makes use of a number of features: an ontology for recording metadata relating to workflow components, a set of algorithms for analyzing the state of a workflow composition and providing suggestions for how to progress based on this metadata, an API to enable both the algorithms and metadata to utilise the resources provided by existing Scientific Workflow Systems, and a prototype user interface to demonstrate how our proposed approach to workflow composition can work in practice. We evaluate the system to show the approach is valid and capable of reducing some of the difficulties presented by existing systems, but that limitations exist regarding the complexity of workflows which can be composed, and also regarding the challenge of initially populating the metadata ontology.
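The suggestion idea can be pictured with a toy example: if components are described by typed input/output metadata, candidates can be ranked by how well their inputs match what the partial workflow already produces. The sketch below is a simplification under that assumption; the component names are invented, and the thesis's ontology-based metadata is far richer than flat type strings.

```python
from dataclasses import dataclass

@dataclass
class Component:
    name: str
    inputs: frozenset   # types the component consumes
    outputs: frozenset  # types it produces

def suggest_next(available_outputs, catalogue):
    """Rank components by the fraction of their inputs the partial
    workflow can already satisfy (1.0 = fully connectable)."""
    ranked = []
    for c in catalogue:
        if not c.inputs:
            continue
        coverage = len(c.inputs & available_outputs) / len(c.inputs)
        if coverage > 0:
            ranked.append((coverage, c.name))
    return sorted(ranked, reverse=True)

catalogue = [
    Component("BlastSearch", frozenset({"DNASequence"}), frozenset({"Alignment"})),
    Component("TreeBuilder", frozenset({"Alignment"}), frozenset({"PhyloTree"})),
    Component("Renderer", frozenset({"PhyloTree", "Style"}), frozenset({"Image"})),
]
print(suggest_next(frozenset({"Alignment"}), catalogue))
# [(1.0, 'TreeBuilder')] -- the only component fully connectable here
```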
314

4D (3D Dynamic) statistical models of conversational expressions and the synthesis of highly-realistic 4D facial expression sequences

Vandeventer, Jason January 2015 (has links)
In this thesis, a novel approach for modelling 4D (3D Dynamic) conversational interactions and synthesising highly-realistic expression sequences is described. To achieve these goals, a fully-automatic, fast, and robust pre-processing pipeline was developed, along with an approach for tracking and inter-subject registration of 3D sequences (4D data). A method for modelling and representing sequences as single entities is also introduced. These sequences can be manipulated and used for synthesising new expression sequences. Classification experiments and perceptual studies were performed to validate the methods and models developed in this work. To achieve the goals described above, a 4D database of natural, synced, dyadic conversations was captured. This database is the first of its kind in the world. Another contribution of this thesis is the development of a novel method for modelling conversational interactions. Our approach takes into account the time-sequential nature of the interactions, and encompasses the characteristics of each expression in an interaction, as well as information about the interaction itself. Classification experiments were performed to evaluate the quality of our tracking, inter-subject registration, and modelling methods. To evaluate our ability to model, manipulate, and synthesise new expression sequences, we conducted perceptual experiments. For these perceptual studies, we manipulated modelled sequences by modifying their amplitudes, and had human observers evaluate the level of expression realism and image quality. To evaluate our coupled modelling approach for conversational facial expression interactions, we performed a classification experiment that differentiated predicted frontchannel and backchannel sequences, using the original sequences in the training set. We also used the predicted backchannel sequences in a perceptual study in which human observers rated the level of similarity of the predicted and original sequences. The results of these experiments support our methods and our claim that we are able to produce 4D, highly-realistic expression sequences that compete with state-of-the-art methods.
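The amplitude manipulation used in the perceptual studies can be pictured as scaling a sequence's deviation from a neutral reference in the model space. A minimal numpy sketch, assuming sequences have already been tracked and registered into a common parameterisation; the linear scaling is an illustrative simplification of the actual model.

```python
import numpy as np

def scale_expression(sequence, neutral, alpha):
    """Exaggerate (alpha > 1) or attenuate (alpha < 1) an expression
    sequence by scaling its deviation from a neutral reference frame.

    sequence : array of shape (frames, coords) -- a registered 4D sequence
    neutral  : array of shape (coords,)        -- the neutral face
    """
    return neutral + alpha * (sequence - neutral)

# Toy example: 3 frames of a 2-coordinate "face".
seq = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]])
neutral = np.zeros(2)
print(scale_expression(seq, neutral, 1.5))  # amplified version
```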
315

Transformation of the university examination timetabling problem space through data pre-processing

Abdul Rahim, Siti Khatijah Nor January 2015 (has links)
This research investigates Examination Timetabling or Scheduling, with the aim of producing good quality, feasible timetables that satisfy hard constraints and various soft constraints. A novel approach to scheduling, that of transformation of the problem space, has been developed and evaluated for its effectiveness. The examination scheduling problem involves many constraints due to the many relationships between students and exams, making it complex and expensive in terms of time and resources. Despite the extensive research in this area, it has been observed that most of the published methods do not produce good quality timetables consistently, due to their reliance on random search. In this research we have avoided random search and instead have proposed a systematic, deterministic approach to solving the examination scheduling problem. We pre-process data and constraints to generate more meaningful aggregated data constructs with better expressive power that minimise the need for cross-referencing original student and exam data at a later stage. Using such aggregated data and custom-designed mechanisms, the timetable construction is done systematically, while ensuring feasibility. Later, the timetable is optimised to improve the quality, focusing on maximising the gap between consecutive exams. Our solution is always reproducible and displays a deterministic optimisation pattern on all benchmark datasets. Transformation of the problem space into new aggregated data constructs through pre-processing represents the key novel contribution of this research.
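The quality objective, maximising the gap between consecutive exams, can be made concrete with a small scoring function. The sketch below uses an exponentially decaying proximity penalty of the kind common in the examination timetabling benchmark literature; the weights and window are assumptions, not necessarily those used in the thesis.

```python
def proximity_penalty(timetable, student_exams, max_gap=5):
    """Sum a penalty 2**(max_gap - gap) over every pair of consecutive
    exams a student sits within max_gap timeslots of each other.

    timetable     : dict exam -> timeslot (int)
    student_exams : dict student -> list of exams taken
    Lower is better; zero means every student's consecutive exams are
    more than max_gap slots apart.
    """
    penalty = 0
    for exams in student_exams.values():
        slots = sorted(timetable[e] for e in exams)
        for a, b in zip(slots, slots[1:]):
            gap = b - a
            if gap <= max_gap:
                penalty += 2 ** (max_gap - gap)
    return penalty

timetable = {"maths": 0, "physics": 2, "art": 9}
students = {"s1": ["maths", "physics"], "s2": ["physics", "art"]}
print(proximity_penalty(timetable, students))  # 2**(5-2) = 8, from s1 only
```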
316

A game theoretic approach to coordinating unmanned aerial vehicles with communications payloads

Charlesworth, Philip January 2015 (has links)
This thesis considers the placement of two or more Unmanned Aerial Vehicles (UAVs) to provide communications to a community of ground mobiles. The locations for the UAVs are decided by the outcome of a non-cooperative game in which the UAVs compete to maximize their coverage of the mobiles. The game allows navigation decisions to be made onboard the UAVs with the effect of increasing coverage, reducing the need for a central planning function, and increasing the autonomy of the UAVs. A non-cooperative game that includes the key system elements is defined and simulated. The thesis compares methods for solving the game to evaluate their performance. A conflict between the quality of the solution and the time required to obtain that solution is identified and explored. It considers how the payload calculations could be used to modify the behaviour of the UAVs, and the sensitivity of the game to resource limitations such as RF power and radio spectrum. It finishes by addressing how the game could be scaled from two UAVs to many UAVs, and the constraints imposed by current methods for solving games.
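The flavour of such a game can be conveyed by a toy best-response iteration on a grid: each UAV in turn moves to the position that maximises its own coverage given the other's position, stopping when neither wants to move. The grid, coverage radius, and payoff below are illustrative assumptions, not the thesis's payload model.

```python
import itertools
import math

MOBILES = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 7), (5, 5)]  # ground mobiles
GRID = list(itertools.product(range(10), repeat=2))         # candidate positions
RADIUS = 3.0                                                # coverage radius

def payoff(me, other):
    """Mobiles this UAV covers; mobiles covered by both count half."""
    score = 0.0
    for m in MOBILES:
        if math.dist(me, m) <= RADIUS:
            score += 0.5 if math.dist(other, m) <= RADIUS else 1.0
    return score

def best_response(other):
    """The position maximising this UAV's payoff, holding the other fixed."""
    return max(GRID, key=lambda pos: payoff(pos, other))

a, b = (0, 0), (9, 9)
for _ in range(20):                    # alternate best responses
    new_a = best_response(b)
    new_b = best_response(new_a)
    if (new_a, new_b) == (a, b):       # neither UAV wants to move
        break
    a, b = new_a, new_b
print("UAV positions:", a, b)
```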
317

Clausal reasoning for branching-time logics

Zhang, Lan January 2010 (has links)
Computation Tree Logic (CTL) is a branching-time temporal logic whose underlying model of time is a choice of possibilities branching into the future. It has been used in a wide variety of areas in Computer Science and Artificial Intelligence, such as temporal databases, hardware verification, program reasoning, multi-agent systems, and concurrent and distributed systems. In this thesis, firstly we present a refined clausal resolution calculus R≻,S_CTL for CTL. The calculus requires a polynomial time computable transformation of an arbitrary CTL formula to an equisatisfiable clausal normal form formulated in an extension of CTL with indexed existential path quantifiers. The calculus itself consists of eight step resolution rules, two eventuality resolution rules and two rewrite rules, which can be used as the basis for an EXPTIME decision procedure for the satisfiability problem of CTL. We give a formal semantics for the clausal normal form, establish that the clausal normal form transformation preserves satisfiability, provide proofs for the soundness and completeness of the calculus R≻,S_CTL, and discuss the complexity of the decision procedure based on R≻,S_CTL. As R≻,S_CTL is based on the ideas underlying Bolotov's clausal resolution calculus for CTL, we provide a comparison between our calculus R≻,S_CTL and Bolotov's calculus for CTL in order to show that R≻,S_CTL improves on Bolotov's calculus in many areas. In particular, our calculus is designed to allow first-order resolution techniques to emulate the resolution rules of R≻,S_CTL, so that R≻,S_CTL can be implemented by reusing any first-order resolution theorem prover. Secondly, we introduce CTL-RP, our implementation of the calculus R≻,S_CTL. CTL-RP is the first implemented resolution-based theorem prover for CTL. The prover takes an arbitrary CTL formula as input and transforms it into a set of CTL formulae in clausal normal form. Furthermore, in order to use first-order techniques, formulae in clausal normal form are transformed into first-order formulae, except for those formulae related to eventualities, i.e. formulae containing the eventuality operator ◊. To implement the step resolution and rewrite rules of the calculus R≻,S_CTL, we present an approach that uses first-order ordered resolution with selection to emulate the step resolution rules and related proofs. This approach enables us to make use of a first-order theorem prover which implements first-order ordered resolution with selection in order to realise our calculus. Following this approach, CTL-RP utilises the first-order theorem prover SPASS to conduct resolution inferences for CTL and is implemented as a modification of SPASS. In particular, to implement the eventuality resolution rules, CTL-RP augments SPASS with an algorithm, called the loop search algorithm, for tackling eventualities in CTL. To study the performance of CTL-RP, we have compared CTL-RP with a tableau-based theorem prover for CTL. The experiments show good performance of CTL-RP. Thirdly, we apply the approach we used to develop R≻,S_CTL to the development of a clausal resolution calculus for a fragment of Alternating-time Temporal Logic (ATL). ATL is a generalisation and extension of branching-time temporal logic, in which the temporal operators are parameterised by sets of agents. Informally speaking, CTL formulae can be treated as ATL formulae with a single agent.
Selective quantification over paths enables ATL to explicitly express coalition abilities, which naturally makes ATL a formalism for the specification and verification of open systems and game-like multi-agent systems. In this thesis, we focus on the Next-time fragment of ATL (XATL), which is closely related to Coalition Logic. The satisfiability problem of XATL has lower complexity than that of ATL, but there are still many applications in various strategic games and multi-agent systems that can be represented and reasoned about in XATL. In this thesis, we present a resolution calculus RXATL for XATL to tackle its satisfiability problem. The calculus requires a polynomial time computable transformation of an arbitrary XATL formula to an equisatisfiable clausal normal form. The calculus itself consists of a set of resolution rules and rewrite rules. We prove the soundness of the calculus and outline a completeness proof for the calculus RXATL. Also, we intend to extend our calculus RXATL to full ATL in the future.
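To give a flavour of step resolution, the LaTeX fragment below shows the general shape such rules can take: two step clauses whose next-state parts clash on a literal resolve into a new step clause. This is an illustrative schema only, not one of the actual rules of R≻,S_CTL, whose clauses also carry indexed path quantifiers.

```latex
% Illustrative schema of a step resolution rule (not the calculus's
% exact rule): clashing next-state literals are resolved away.
\[
\frac{P \;\Rightarrow\; \mathbf{A}\bigcirc\,(C \lor l)
      \qquad
      Q \;\Rightarrow\; \mathbf{A}\bigcirc\,(D \lor \lnot l)}
     {P \land Q \;\Rightarrow\; \mathbf{A}\bigcirc\,(C \lor D)}
\]
```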
318

An investigation into the use of negation in Inductive Rule Learning for text classification

Chua, Stephanie Hui Li January 2012 (has links)
This thesis seeks to establish whether the use of negation in Inductive Rule Learning (IRL) for text classification is effective. Text classification is a widely researched topic in the domain of data mining. There have been many techniques directed at text classification; one of them is IRL, widely chosen because of its simplicity, comprehensibility and interpretability by humans. IRL is a process whereby rules of the form antecedent → conclusion are learnt to build a classifier. Thus, the learnt classifier comprises a set of rules, which are used to perform classification. To learn a rule, words from pre-labelled documents, known as features, are selected to be used as conjuncts in the rule antecedent. These rules typically do not include any negated features in their antecedent, although in some cases, as demonstrated in this thesis, the inclusion of negation is required and beneficial for the text classification task. With respect to the use of negation in IRL, two issues need to be addressed: (i) the identification of the features to be negated and (ii) the design of rule refinement strategies that generate rules both with and without negation. To address the first issue, feature space division is proposed, whereby the feature space containing features to be used for rule refinement is divided into three sub-spaces to facilitate the identification of the features which can be advantageously negated. To address the second issue, eight rule refinement strategies are proposed, which are able to generate rules both with and without negation. Typically, single keywords deemed significant for differentiating between classes are selected for the text representation in the text classification task. Phrases have also been proposed because they are considered to be semantically richer than single keywords. Therefore, with respect to the work conducted in this thesis, three different types of phrases (n-gram phrases, keyphrases and fuzzy phrases) are extracted for the text representation in addition to single keywords. To establish the effectiveness of the use of negation in IRL, the eight proposed rule refinement strategies are compared with one another, using keywords and the three different types of phrases as the text representation, to determine whether the best strategy is one which generates rules with negation or without negation. Two types of classification tasks are conducted: binary classification and multi-class classification. The best strategy in the proposed IRL mechanism is compared to five existing text classification techniques with respect to binary classification: (i) the Sequential Minimal Optimization (SMO) algorithm, (ii) Naive Bayes (NB), (iii) JRip, (iv) OlexGreedy and (v) OlexGA from the Waikato Environment for Knowledge Analysis (WEKA) machine learning workbench. In the multi-class classification task, the proposed IRL mechanism is compared to the Total From Partial Classification (TFPC) algorithm. The datasets used in the experiments include three text datasets: the 20 Newsgroups, Reuters-21578 and Small Animal Veterinary Surveillance Network (SAVSNET) datasets, and five UCI Machine Learning Repository tabular datasets. The results obtained from the experiments showed that the strategies which generated rules with negation were more effective when the keyword representation was used and less prominent when the phrase representations were used.
Strategies which generated rules with negation also performed better with respect to binary classification than multi-class classification. In comparison with the other machine learning techniques selected, the proposed IRL mechanism was shown to generally outperform the compared techniques and to be competitive with SMO.
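The feature space division can be illustrated with a toy split: features occurring only in documents of the target class are candidates for plain conditions, features occurring only outside it are candidates for negation, and the overlap is left for ordinary refinement. The following sketch is a heavily simplified reading of that idea, not a reproduction of the thesis's eight strategies.

```python
def divide_feature_space(docs, labels, target):
    """Split features into three sub-spaces relative to a target class."""
    in_target, outside = set(), set()
    for words, label in zip(docs, labels):
        (in_target if label == target else outside).update(words)
    return (in_target - outside,   # candidates for plain conditions
            outside - in_target,   # candidates for negated conditions
            in_target & outside)   # shared features, ordinary refinement

docs = [{"goal", "match", "team"}, {"stock", "market"},
        {"match", "market"}]
labels = ["sport", "finance", "finance"]
pos, neg, shared = divide_feature_space(docs, labels, "sport")
print(pos)     # {'goal', 'team'}    -> use as-is in antecedents
print(neg)     # {'stock', 'market'} -> use negated, e.g. NOT stock
print(shared)  # {'match'}           -> needs ordinary refinement
```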
319

Predictive trend mining for social network analysis

Nohuddin, Puteri January 2012 (has links)
This thesis describes research work within the theme of trend mining as applied to social network data. Trend mining is a type of temporal data mining that provides insight into how information changes over time; in the context of this thesis the focus is on how information contained in social networks changes with time. The work proposes a number of data-mining-based techniques directed at mechanisms to not only detect change, but also support the analysis of change, with respect to social network data. To this end a trend mining framework is proposed to act as a vehicle for evaluating the ideas presented in this thesis. The framework is called the Predictive Trend Mining Framework (PTMF). It is designed to support "end-to-end" social network trend mining and analysis. The work described in this thesis is divided into two elements: Frequent Pattern Trend Analysis (FPTA) and Prediction Modeling (PM). For evaluation purposes three social network datasets have been considered: Great Britain Cattle Movement, Deeside Insurance and Malaysian Armed Forces Logistic Cargo. The evaluation indicates that a sound mechanism for identifying and analysing trends, and for using this trend knowledge for prediction purposes, has been established.
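The core of Frequent Pattern Trend Analysis can be pictured as tracking a pattern's support across successive time windows to form a trend curve. A minimal sketch under that reading; the framework's actual trend representation and its prediction modelling are beyond this toy example.

```python
def pattern_trend(snapshots, pattern):
    """Support of an itemset pattern in each temporal snapshot.

    snapshots : list of datasets, one per time window; each dataset is
                a list of transactions (sets of items).
    Returns the trend: one support value per window.
    """
    pattern = set(pattern)
    trend = []
    for transactions in snapshots:
        hits = sum(1 for t in transactions if pattern <= t)
        trend.append(hits / len(transactions) if transactions else 0.0)
    return trend

# Toy cattle-movement-style data: three monthly windows.
jan = [{"farm_a", "farm_b"}, {"farm_a", "farm_c"}]
feb = [{"farm_a", "farm_b"}, {"farm_b", "farm_c"}]
mar = [{"farm_b", "farm_c"}, {"farm_b", "farm_c"}]
print(pattern_trend([jan, feb, mar], {"farm_a", "farm_b"}))
# [0.5, 0.5, 0.0] -- a pattern whose support is falling away
```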
320

Online network intrusion detection system using temporal logic and stream data processing

Ahmed, Abdulbasit January 2013 (has links)
These days, the world is becoming more interconnected, and the Internet has come to dominate the ways we communicate and do business. Network security measures must be taken to protect the organization's environment. Among these security measures are intrusion detection systems, which aim to detect actions that attempt to compromise the confidentiality, availability, or integrity of a resource by monitoring the events occurring in computer systems and/or networks. The increasing amounts of data transmitted over higher- and higher-speed networks have created a challenging problem for current intrusion detection systems: once the traffic exceeds the operational boundaries of these systems, packets are dropped, which means that some attacks will not be detected. In this thesis, we propose developing an online network-based intrusion detection system through the combined use of temporal logic and stream data processing. Temporal logic formalisms allow us to represent attack patterns or normal behaviour. Stream data processing is a recent database technology applied to flows of data; it is designed with high-performance features for data-intensive application processing. In this work we develop a system where temporal logic specifications are automatically translated into stream queries that run on the stream database server and are continuously evaluated against the traffic to detect intrusions. The experimental results show that this combination was efficient in using the resources of the running machines and was able to detect all the attacks in the test data. Additionally, the proposed solution provides a concise and unambiguous way to formally represent attack signatures, and it is extensible, allowing new attack signatures to be added. It is also scalable, as the system can benefit from more CPUs and additional memory on the same machine, or from distributed servers.
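As an illustration of the general idea, rather than the thesis's actual translation, a bounded temporal pattern such as "raise an alert if one source sends more than N SYN packets within T seconds" can be compiled into a continuously evaluated sliding-window computation over the event stream. A Python sketch:

```python
from collections import defaultdict, deque

def syn_flood_monitor(events, threshold=100, window=5.0):
    """Continuously evaluate a sliding-window rule over a packet stream.

    events : iterable of (timestamp, src_ip, flags) tuples.
    Yields (timestamp, src_ip) whenever a source exceeds `threshold`
    SYN packets within a `window`-second interval -- a stream-query
    rendering of a time-bounded temporal pattern.
    """
    recent = defaultdict(deque)  # src_ip -> timestamps of recent SYNs
    for ts, src, flags in events:
        if "SYN" not in flags:
            continue
        q = recent[src]
        q.append(ts)
        while q and ts - q[0] > window:   # expire events outside the window
            q.popleft()
        if len(q) > threshold:
            yield ts, src

# Toy stream: one chatty source sending a SYN every 10 ms.
stream = [(i * 0.01, "10.0.0.5", {"SYN"}) for i in range(200)]
alerts = list(syn_flood_monitor(stream, threshold=100, window=5.0))
print(alerts[0])  # first moment the rule fires for 10.0.0.5
```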
