161

Effects of age on smartphone and tablet usability, based on eye-movement tracking and touch-gesture interactions

Al-Showarah, Suleyman January 2015 (has links)
The aim of this thesis is to provide insight into the effects of user age on interactions with smartphone and tablet applications. The study considered two interaction methods to investigate the effects of user age on the usability of smartphones and tablets of different sizes: 1) eye-movement/browsing and 2) touch-gesture interactions. In the eye-movement studies, an eye tracker was used to trace and record users' eye movements, which were later analysed to understand the effects of age and screen size on browsing effectiveness. In the gesture-interaction studies, an application developed for smartphones traced and recorded users' touch-gesture data, which were later analysed to investigate the effects of age and screen size on touch-gesture performance. The motivation for our studies is summarised as follows: 1) the increasing number of elderly people in our society, 2) the widespread use of smartphone technology across the world, 3) understanding the difficulties the elderly face when interacting with smartphone technology, and 4) providing the existing body of literature with new understanding of the effects of ageing on smartphone usability. The work of this thesis comprises five research projects conducted in two stages. Stage One included two studies that used eye-movement analysis to investigate the effects of user age and the influence of screen size on browsing smartphone interfaces. The first study examined the scan-path dissimilarity when browsing smartphone applications for elderly users (60+) and younger users (20-39). The results revealed that scan-path dissimilarity in browsing smartphone applications was higher for elderly users (i.e., age-driven) than for younger users. The results also revealed that browsing smartphone applications was stimulus-driven rather than screen-size-driven. The second study was conducted to understand the difficulties of information processing when browsing smartphone applications for elderly (60+), middle-aged (40-59) and younger (20-39) users. The evaluation was performed using three different screen sizes of smartphone and tablet devices. The results revealed that processing both local and global information on smartphone/tablet interfaces was more difficult for elderly users than for the other age groups. Across all age groups, browsing on the smaller smartphone proved more difficult than on the larger screen sizes. Stage Two included three studies to investigate the difficulties in interacting with gesture-based applications for elderly compared to younger users, and to evaluate the possibility of classifying a user's age-group based on on-screen gestures. The first study investigated the effects of user age and screen size on performing intuitive swipe gestures in four directions: down, left, right, and up. The results revealed that swiping performance was influenced by user age and screen size, as well as by swiping orientation. The purpose of the second study was to investigate the effects of user age, screen size, and gesture complexity on performing accurate gestures on smartphones and tablets using gesture-based features. The results revealed that the elderly were less accurate, less efficient, slower, and exerted more pressure on the touch-screen when performing gestures than younger users. On a small smartphone, all users, especially the elderly, were less accurate in gesture performance compared to mini-sized tablets.
Also, users, especially the elderly, were less efficient and less accurate when performing complex gestures on the small smartphone compared to the mini-tablet. The third study investigated the possibility of classifying a user's age-group using touch-gesture-based features (i.e., gesture speed, gesture accuracy, movement time, and finger pressure) on smartphones. In this third study, we provide evidence for the possibility of classifying a user's age-group using gesture-based applications on smartphones in both user-dependent and user-independent scenarios. The accuracy of age-group classification on smaller screens was higher than on devices with larger screens, because larger screens are much easier to use for all users across both age groups. In addition, it was found that age-group classification accuracy was higher for younger users than for elderly users, because some elderly users performed the gestures in the same way as younger users do, possibly owing to longer experience of using smartphones than the typical elderly user. Overall, our results provide evidence that elderly users encounter difficulties when interacting with smartphones and tablet devices compared to younger users, and that it is possible to classify a user's age-group based on their ability to perform touch-gestures on smartphones and tablets. Designers of smartphone interfaces should remove barriers that make browsing and processing local and global information in smartphone applications difficult. Furthermore, larger screen sizes should be considered for elderly users, and smartphones could include automatically customisable user interfaces suited to elderly users' abilities, accommodating their needs so that they can be as efficient as younger users. The outcomes of this research could enhance the design of smartphones and tablets, as well as the applications that run on such devices, especially those aimed at elderly users. Such devices and applications could play an effective role in enhancing elderly people's activities of daily living.
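The third Stage Two study lends itself to a short illustration. Below is a minimal, hypothetical sketch of age-group classification from the four gesture features the abstract names (gesture speed, gesture accuracy, movement time, finger pressure); the synthetic data, the feature directions, and the SVM classifier are illustrative assumptions, not the thesis' actual pipeline.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    # Synthetic stand-in gestures; feature order:
    # [gesture speed, gesture accuracy, movement time, finger pressure].
    # Directions follow the abstract's findings: elderly users were slower,
    # less accurate, and exerted more pressure.
    young = rng.normal([1.0, 0.9, 0.5, 0.3], 0.1, size=(100, 4))
    elderly = rng.normal([0.6, 0.7, 0.9, 0.5], 0.1, size=(100, 4))
    X = np.vstack([young, elderly])
    y = np.array([0] * 100 + [1] * 100)          # 0 = younger, 1 = elderly

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = make_pipeline(StandardScaler(), SVC()).fit(X_tr, y_tr)
    print("age-group classification accuracy:", clf.score(X_te, y_te))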
162

Automatic Speech Emotion Recognition : feature space dimensionality and classification challenges

Al-Talabani, Abdulbasit January 2015 (has links)
In the last decade, research in Speech Emotion Recognition (SER) has become a major endeavour in Human Computer Interaction (HCI) and speech processing. Accurate SER is essential for many applications, like assessing customer satisfaction with quality of services, and detecting/assessing the emotional state of children in care. The large number of studies published on SER reflects the demand for its use. The main concern of this thesis is the investigation of SER from pattern recognition and machine learning points of view. In particular, we aim to identify appropriate mathematical models of SER and examine the process of designing automatic emotion recognition schemes. There are major challenges to automatic SER, including ambiguity about the list/definition of emotions, the lack of agreement on a manageable set of uncorrelated speech-based emotion-relevant features, and the difficulty of collecting emotion-related datasets under natural circumstances. We initiate our work by dealing with the identification of appropriate sets of emotion-related features/attributes extractable from speech signals, as considered from psychological and computational points of view. We investigate the use of pattern-recognition approaches to remove redundancies and achieve a compact digital representation of the extracted data with minimal loss of information. The thesis includes the design of new SER schemes and complements to existing ones, together with large sets of experiments to empirically test their performance on different databases and identify the advantages and shortcomings of using speech alone for emotion recognition. Existing SER studies seem to deal with the ambiguity/disagreement on a "limited" number of emotion-related features by expanding the list from the same speech signal source/sites and applying various feature selection procedures as a means of reducing redundancies. Attempts are made to discover features more relevant to emotion in speech. One of our investigations focuses on proposing a new set of features for SER, extracted from Linear Predictive (LP)-residual speech. We demonstrate the usefulness of the proposed relatively small set of features by testing the performance of an SER scheme that fuses our set of features with the existing set of thousands of features, using the common machine learning schemes of Support Vector Machine (SVM) and Artificial Neural Network (ANN). The challenge of the growing dimensionality of the SER feature space, and its impact on increased model complexity, is another major focus of our research project. By studying the pros and cons of the commonly used feature selection approaches, we argue in favour of meta-feature selection and develop various methods in this direction, not only to reduce dimension, but also to adapt and de-correlate emotional feature spaces for improved SER model recognition accuracy. We used Principal Component Analysis (PCA) and proposed Data Independent PCA (DIPCA), trained on independent emotional and non-emotional datasets. The DIPCA projections, especially when extracted from speech data coloured with different emotions or from neutral speech data, had capability comparable to PCA in terms of SER performance. Another approach adopted in this thesis for dimension reduction is Random Projection (RP) matrices, which are independent of the training data.
We have shown that some versions of RP with an SVM classifier can offer an adaptation space for Speaker Independent SER that avoids over-fitting and hence improves recognition accuracy. Using PCA trained on one set of data, while testing on emotional data features, has significant implications for machine learning in general. The thesis' other major contribution focuses on the classification aspects of SER. We investigate the drawbacks of the well-known SVM classifier when applied to data preprocessed by PCA and RP, and demonstrate the advantages of using the Linear Discriminant Classifier (LDC) instead, especially for PCA de-correlated meta-features. We initiated a variety of LDC-based ensemble classification schemes, testing performance using a new form of bagging over different subsets of meta-features extracted by PCA, with encouraging results. The experiments conducted were applied to two benchmark datasets (Emo-Berlin and FAU-Aibo) and an in-house dataset in the Kurdish language. The recognition accuracies achieved are significantly higher than state-of-the-art results on all datasets. The results, however, revealed a difficult challenge in the form of a persisting wide gap in accuracy over different datasets, which cannot be explained entirely by the differences between the natures of the datasets. We conducted various pilot studies, based on visualizations of the confusion matrices for the "difficult" databases, to build multi-level SER schemes. These studies provide initial evidence of the presence of more than one "emotion" in the same portion of speech. A possible solution may be to present recognition accuracy in a score-based measurement like the spider chart. Such an approach may also reveal the presence of the Doddington zoo phenomenon in SER.
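The DIPCA idea described above can be sketched briefly: fit the projection on data that is independent of the emotional training set, then use it to de-correlate the SER features before classification. The random stand-in data, the dimensions, and the SVM back-end below are assumptions for illustration only, not the thesis' actual configuration.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.svm import SVC

    rng = np.random.default_rng(1)
    neutral_feats = rng.normal(size=(500, 100))    # independent (non-emotional) data
    emotional_feats = rng.normal(size=(300, 100))  # stand-in SER feature vectors
    labels = rng.integers(0, 4, size=300)          # e.g. four emotion classes

    # The projection is fitted WITHOUT reference to the emotional training set,
    # which is the data-independence that DIPCA is named for.
    dipca = PCA(n_components=20).fit(neutral_feats)
    meta_features = dipca.transform(emotional_feats)  # de-correlated meta-features
    clf = SVC().fit(meta_features, labels)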
163

Biometrics writer recognition for Arabic language : analysis and classification techniques using subwords features

Maliki, Makki Jasim Radhi January 2015 (has links)
Handwritten text in any language is believed to convey a great deal of information about the writer's personality and identity. Indeed, the handwritten signature has long been accepted as an authentication of the writer's physical stamp on financial and legal deals, as well as official/personal documents and works of art, and handwritten documents are frequently used as evidence in forensic tasks. Handwriting skills are learnt and developed from the early schooling stages. Research interest in behavioural biometrics was the main driving force behind the growth of research into Writer Identification (WI) from handwritten text, but the recent rise in terrorism associated with extreme religious ideologies spreading primarily, but not exclusively, from the Middle East has led to a surge of interest in WI from handwritten text in Arabic and similar languages. This thesis is the main outcome of extensive research investigations conducted with the aim of developing automatic identification of a person from handwritten Arabic text samples. My motivations and interests, as an Iraqi researcher, emanate from my multi-faceted desire to provide scientific support for my people in their fight against terrorism by providing forensic evidence, and to contribute to the ongoing digitization of the Iraqi National Archive as well as the wealth of religious and historical archives in Iraq and the Middle East. Good knowledge of the underlying language is invaluable in this project. Despite the rising interest in this recognition modality worldwide, Arabic writer identification has not been addressed as extensively as Latin writer identification. However, in recent years some new Arabic writer identification approaches have been proposed, some of which are reviewed in this thesis. Arabic is a cursive language when handwritten, meaning that each writer of the language develops some unique features that demonstrate the writer's habits and style. These habits and styles are considered unique WI features and determining factors. The existing dominant approaches to WI are based on recognizing handwriting habits/styles embedded in certain parts/components of the written text. Although the appearance of these components within long text contains rich information and clues to writer identity, the most common approaches to WI in Arabic in the literature are based on features extracted from paragraph(s), line(s), word(s), character(s), and/or parts of a character. Generally, Arabic words are made up of one or more subwords, at the end of each of which there is a connected stroke with a certain style, which seems to be most representative of the writer's habits. Another feature of Arabic writing is the diacritics that are added to written words/subwords to add meaning and pronunciation. Subwords are more frequent in written Arabic text and appear as parts of several different words or as full individual words. Thus, we propose a new, innovative approach based on the seemingly plausible hypothesis that subword-based WI yields a significant increase in accuracy over existing approaches. The thesis' most significant contributions can be summarized as follows:
- We developed a high-performing segmentation of scanned text images that combines threshold-based binarisation, morphological operations and an active shape model.
- We defined digital measures and formed 15-dimensional feature vector representations of subwords that implicitly cover their diacritics and strokes.
- A pilot study incrementally added features according to writer-discriminating power. This reduced the subword feature vector dimension to 8, two of which were modelled as time series.
- For the text-dependent 8-dimensional WI scheme, we identified the best-performing set of subwords (the best 22 subwords out of 49, followed by the best 11 out of these 22).
- We established the validity of our hypothesis for different versions of subword-based WI schemes by providing empirical evidence when testing on a number of existing text-dependent and text-independent databases, plus a simulated text-independent DB. The text-dependent scenario results exhibited the possible presence of the Doddington Zoo phenomenon.
- The final optimal subword-based WI scheme not only removes the need to include diacritics as part of the subword, but also demonstrates that including diacritics within subwords impairs the WI discriminating power of subwords. This should not be taken to discredit research based on diacritics-based WI. This subword-body (without diacritics) based WI scheme also eliminated the presence of the Doddington Zoo effect.
- Finally, a significant but unintended consequence of using subwords for WI is that there is no difference between a text-independent scenario and a text-dependent one. In fact, we demonstrate that the text-dependent database of 27 words can be used to simulate the testing of the scheme for a text-independent database without the need to record such a DB.
Finally, we discussed ways of optimising the performance of our last scheme by considering possible ways of complementing it with various image texture analysis features extracted from subwords, lines, paragraphs or the entire file of the scanned image. These included LBP and Gabor filters. We also suggested the possible addition of a few more features.
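As a rough illustration of the subword-based WI idea, the sketch below enrols writers as collections of 8-dimensional subword feature vectors and attributes a query subword to the writer with the nearest mean profile. The stand-in data, the mean-profile representation, and the nearest-neighbour rule are assumptions for illustration, not the thesis' actual scheme.

    import numpy as np

    rng = np.random.default_rng(2)
    n_writers, n_samples, dim = 10, 20, 8
    # Enrolment: each writer contributes 8-dimensional subword feature vectors
    # (random stand-ins here, separated by writer-dependent means).
    gallery = {w: rng.normal(loc=w, scale=0.5, size=(n_samples, dim))
               for w in range(n_writers)}
    profiles = {w: feats.mean(axis=0) for w, feats in gallery.items()}

    def identify(query_vec):
        """Attribute a query subword to the writer with the nearest profile."""
        return min(profiles, key=lambda w: np.linalg.norm(query_vec - profiles[w]))

    probe = rng.normal(loc=3, scale=0.5, size=dim)  # a subword from writer 3
    print("predicted writer:", identify(probe))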
164

Multi evidence fusion scheme for content-based image retrieval by clustering localised colour and texture features

Al-Jubouri, Hanan January 2015 (has links)
Content-Based Image Retrieval (CBIR) is an automatic process of retrieving images according to their visual content. Research in this field mainly follows two directions. The first is concerned with effectiveness in describing the visual content of images (i.e. features) by a technique that leads to discerning similar and dissimilar images, and ultimately to the retrieval of the images most relevant to the query image. The second direction focuses on retrieval efficiency by deploying efficient structures to organise images by their features in the database, narrowing down the search space. The emphasis of this research is mainly on effectiveness rather than efficiency. There are two types of visual content features. A global feature represents the entire image by a single vector, so retrieval using global features is more efficient but often less accurate. A local feature, on the other hand, represents the image by a set of vectors, capturing localised visual variations in different parts of an image and promising better results, particularly for images with complicated scenes. The first main purpose of this thesis is to study different types of local features. We explore a range of local features from both the frequency and spatial domains. Because of the large number of local features generated from an image, clustering methods are used for quantizing and summarising the feature vectors into segments, from which a representation of the visual content of the entire image is derived. Since each clustering method works differently and requires different input parameters (e.g. the number of clusters), preparations of input data (i.e. normalised or not) and choices of similarity measure, varied performance by different clustering methods in segmenting the local features is anticipated. We therefore also study and analyse one commonly used clustering algorithm from each of the four main categories of clustering methods: K-means (partition-based), EM/GMM (model-based), Normalized Laplacian Spectral (graph-based), and Mean Shift (density-based). These algorithms were investigated in two scenarios, where the number of clusters is either fixed or adaptively determined. The performance of the clustering algorithms in terms of image classification and retrieval is evaluated using three publicly available image databases. The evaluations revealed that a local DCT colour-texture feature was overall the best, due to its robust integration of colour and texture information. In addition, our investigation into the behaviour of different clustering algorithms showed that each algorithm had its own strengths and limitations in segmenting local features, which affect the performance of image retrieval due to variations in the visual colour and texture of the images. No single algorithm outperforms the others using either an adaptively determined or a large fixed number of clusters. The second focus of this research is to investigate how to combine the positive effects of the various local features obtained from different clustering algorithms in a fusion scheme, aiming to bring about improved retrieval results over those achieved by a single clustering algorithm. The proposed fusion scheme effectively integrates information from different sources, increasing the overall accuracy of retrieval.
The proposed multi-evidence fusion scheme regards as evidence the image retrieval scores obtained from normalising the distances produced by applying different clustering algorithms to different types of local features, and was presented in three forms: 1) evidence fusion using fixed weights (MEFS), where the weights were determined empirically and fixed a priori; 2) evidence fusion based on adaptive weights (AMEFS), where the fusion weights were adaptively determined using linear regression; and 3) evidence fusion using a linear combination (CombSUM) without weighting the evidence. Overall, all three versions of the multi-evidence fusion scheme proved able to enhance the accuracy of image retrieval by increasing the number of relevant images in the ranked list. However, the improvement varied across the different feature-clustering combinations (i.e. image representations) and the image databases used for the evaluation. This thesis presents an automatic method of image retrieval that can deal with natural-world scenes by applying different clustering algorithms to different local features. The method achieves good accuracies of 85% at Top 5 and 80% at Top 10 over the WANG database, which are better than a number of other well-known solutions in the literature. At the same time, the knowledge gained from this research, such as the effects of different types of local features and clustering methods on the retrieval results, enriches the understanding of the field and can be beneficial for the CBIR community.
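The three fusion forms can be sketched compactly. The toy scores and the fixed weights below are illustrative assumptions; only the fusion arithmetic (a weighted sum for MEFS, an unweighted sum for CombSUM) follows the description above.

    import numpy as np

    # scores[e][i] = evidence e's normalised retrieval score for database image i.
    scores = np.array([
        [0.9, 0.2, 0.5, 0.7],   # evidence 1: one feature-clustering combination
        [0.8, 0.3, 0.6, 0.4],   # evidence 2: another combination
        [0.7, 0.1, 0.9, 0.5],   # evidence 3: a third combination
    ])

    def mefs(scores, weights):
        """Fixed-weight fusion: a weighted average of per-evidence scores."""
        return np.average(scores, axis=0, weights=weights)

    comb_sum = scores.sum(axis=0)                  # CombSUM: unweighted sum
    fused = mefs(scores, weights=[0.5, 0.3, 0.2])  # MEFS with assumed weights
    print("fused ranking (best first):", np.argsort(-fused))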
165

A knowledge-based approach to scientific workflow composition

McIver, Russell P. January 2015 (has links)
Scientific Workflow Systems have been developed as a means to enable scientists to carry out complex analysis operations on local and remote data sources in order to achieve their research goals. Systems typically provide a large number of components and facilities to enable such analysis to be performed, and have matured to a point where they offer many complex capabilities. This complexity makes it difficult for scientists working with these systems to readily achieve their goals. In this thesis we describe the increasing burden of knowledge required of these scientists in order for them to specify the outcomes they wish to achieve within the workflow systems. We consider ways in which the challenges presented by these systems can be reduced, focusing on the following questions: How can metadata describing the available resources assist users in composing workflows? Can automated assistance be provided to guide users through the composition process? Can such an approach be implemented so as to work with the resources provided by existing Scientific Workflow Systems? We have developed a new approach to workflow composition which makes use of a number of features: an ontology for recording metadata relating to workflow components, a set of algorithms for analysing the state of a workflow composition and providing suggestions for how to progress based on this metadata, an API to enable both the algorithms and metadata to utilise the resources provided by existing Scientific Workflow Systems, and a prototype user interface to demonstrate how our proposed approach to workflow composition can work in practice. We evaluate the system to show that the approach is valid and capable of reducing some of the difficulties presented by existing systems, but also that limitations exist regarding the complexity of the workflows which can be composed, and the challenge of initially populating the metadata ontology.
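A minimal sketch of the kind of metadata-driven suggestion described above follows: components are annotated with input/output types (a crude stand-in for the ontology), and the suggester proposes components whose inputs the partial workflow can already satisfy. The component names and types are hypothetical.

    # Each component is annotated with input/output types, standing in for
    # the metadata ontology; names and types are illustrative only.
    components = {
        "FetchSequences": {"inputs": [],            "outputs": ["FASTA"]},
        "AlignSequences": {"inputs": ["FASTA"],     "outputs": ["Alignment"]},
        "BuildTree":      {"inputs": ["Alignment"], "outputs": ["Tree"]},
        "RenderTree":     {"inputs": ["Tree"],      "outputs": ["Image"]},
    }

    def suggest_next(produced_types):
        """Suggest components whose inputs the workflow already satisfies."""
        return [name for name, meta in components.items()
                if meta["inputs"]
                and all(t in produced_types for t in meta["inputs"])]

    # A partial workflow that has produced a FASTA file so far:
    print(suggest_next({"FASTA"}))  # -> ['AlignSequences']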
166

4D (3D Dynamic) statistical models of conversational expressions and the synthesis of highly-realistic 4D facial expression sequences

Vandeventer, Jason January 2015 (has links)
In this thesis, a novel approach for modelling 4D (3D Dynamic) conversational interactions and synthesising highly-realistic expression sequences is described. To achieve these goals, a fully-automatic, fast, and robust pre-processing pipeline was developed, along with an approach for tracking and inter-subject registration of 3D sequences (4D data). A method for modelling and representing sequences as single entities is also introduced. These sequences can be manipulated and used for synthesising new expression sequences. Classification experiments and perceptual studies were performed to validate the methods and models developed in this work. To achieve the goals described above, a 4D database of natural, synced, dyadic conversations was captured, the first of its kind in the world. Another contribution of this thesis is the development of a novel method for modelling conversational interactions. Our approach takes into account the time-sequential nature of the interactions and encompasses the characteristics of each expression in an interaction, as well as information about the interaction itself. Classification experiments were performed to evaluate the quality of our tracking, inter-subject registration, and modelling methods. To evaluate our ability to model, manipulate, and synthesise new expression sequences, we conducted perceptual experiments, in which we manipulated modelled sequences by modifying their amplitudes and had human observers evaluate the level of expression realism and image quality. To evaluate our coupled modelling approach for conversational facial expression interactions, we performed a classification experiment that differentiated predicted frontchannel and backchannel sequences, using the original sequences in the training set. We also used the predicted backchannel sequences in a perceptual study in which human observers rated the similarity of the predicted and original sequences. The results of these experiments support our methods and our claim that we can produce 4D, highly-realistic expression sequences that compete with state-of-the-art methods.
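The amplitude manipulation used in the perceptual studies can be illustrated with a small sketch, assuming (purely for illustration) that a modelled sequence is a vector of PCA weights that can be scaled about the model mean and then reconstructed; the thesis' actual sequence model may differ.

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(3)
    sequences = rng.normal(size=(50, 300))  # 50 flattened stand-in sequences

    model = PCA(n_components=10).fit(sequences)

    def scale_amplitude(sequence, factor):
        """Scale a sequence's deviation from the model mean in weight space."""
        weights = model.transform(sequence.reshape(1, -1))
        return model.inverse_transform(weights * factor).ravel()

    exaggerated = scale_amplitude(sequences[0], 1.5)  # 150% amplitude
    attenuated = scale_amplitude(sequences[0], 0.5)   # 50% amplitude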
167

A game theoretic approach to coordinating unmanned aerial vehicles with communications payloads

Charlesworth, Philip January 2015 (has links)
This thesis considers the placement of two or more Unmanned Aerial Vehicles (UAVs) to provide communications to a community of ground mobiles. The locations of the UAVs are decided by the outcome of a non-cooperative game in which the UAVs compete to maximize their coverage of the mobiles. The game allows navigation decisions to be made onboard the UAVs, with the effect of increasing coverage, reducing the need for a central planning function, and increasing the autonomy of the UAVs. A non-cooperative game that includes the key system elements is defined and simulated. The thesis compares methods for solving the game to evaluate their performance, and identifies and explores a conflict between the quality of a solution and the time required to obtain it. It then considers how the payload calculations could be used to modify the behaviour of the UAVs, and the sensitivity of the game to resource limitations such as RF power and radio spectrum. It finishes by addressing how the game could be scaled from two UAVs to many, and the constraints imposed by current methods for solving games.
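The placement game can be sketched in miniature: each UAV picks from a few candidate positions, its payoff is the coverage it obtains given the other's choice, and iterated best response searches for mutually best replies. The positions, coverage radius, and the half-credit for shared coverage below are illustrative assumptions, not the thesis' actual game.

    import numpy as np

    rng = np.random.default_rng(4)
    mobiles = rng.uniform(0, 10, size=(30, 2))             # ground mobile positions
    candidates = [(2, 2), (2, 8), (8, 2), (8, 8), (5, 5)]  # candidate UAV positions
    RANGE = 3.5                                            # assumed coverage radius

    def coverage(pos):
        """Indices of mobiles within coverage range of a position."""
        return {i for i, m in enumerate(mobiles)
                if np.hypot(*(m - np.array(pos))) <= RANGE}

    def payoff(mine_pos, other_pos):
        """Exclusively covered mobiles count fully; shared ones count half."""
        mine, other = coverage(mine_pos), coverage(other_pos)
        return len(mine - other) + 0.5 * len(mine & other)

    choice = [0, 1]                              # initial strategy indices
    for _ in range(20):                          # iterated best response
        changed = False
        for u in (0, 1):
            other = candidates[choice[1 - u]]
            best = max(range(len(candidates)),
                       key=lambda c: payoff(candidates[c], other))
            if best != choice[u]:
                choice[u], changed = best, True
        if not changed:
            break                                # mutual best responses reached
    print("UAV positions:", [candidates[c] for c in choice])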
168

Clausal reasoning for branching-time logics

Zhang, Lan January 2010 (has links)
Computation Tree Logic (CTL) is a branching-time temporal logic whose underlying model of time is a choice of possibilities branching into the future. It has been used in a wide variety of areas in Computer Science and Artificial Intelligence, such as temporal databases, hardware verification, program reasoning, multi-agent systems, and concurrent and distributed systems. In this thesis, we first present a refined clausal resolution calculus R^{≻,S}_CTL for CTL. The calculus requires a polynomial-time computable transformation of an arbitrary CTL formula to an equisatisfiable clausal normal form formulated in an extension of CTL with indexed existential path quantifiers. The calculus itself consists of eight step resolution rules, two eventuality resolution rules and two rewrite rules, which can be used as the basis for an EXPTIME decision procedure for the satisfiability problem of CTL. We give a formal semantics for the clausal normal form, establish that the clausal normal form transformation preserves satisfiability, provide proofs for the soundness and completeness of the calculus R^{≻,S}_CTL, and discuss the complexity of the decision procedure based on it. As R^{≻,S}_CTL is based on the ideas underlying Bolotov's clausal resolution calculus for CTL, we provide a comparison between the two calculi to show that R^{≻,S}_CTL improves on Bolotov's calculus in many areas. In particular, our calculus is designed to allow first-order resolution techniques to emulate the resolution rules of R^{≻,S}_CTL, so that R^{≻,S}_CTL can be implemented by reusing any first-order resolution theorem prover. Secondly, we introduce CTL-RP, our implementation of the calculus R^{≻,S}_CTL. CTL-RP is the first implemented resolution-based theorem prover for CTL. The prover takes an arbitrary CTL formula as input and transforms it into a set of CTL formulae in clausal normal form. Furthermore, in order to use first-order techniques, formulae in clausal normal form are transformed into first-order formulae, except for those formulae related to eventualities, i.e. formulae containing the eventuality operator ◊. To implement the step resolution and rewrite rules of the calculus R^{≻,S}_CTL, we present an approach that uses first-order ordered resolution with selection to emulate the step resolution rules and related proofs. This approach enables us to make use of a first-order theorem prover, which implements first-order ordered resolution with selection, in order to realise our calculus. Following this approach, CTL-RP utilises the first-order theorem prover SPASS to conduct resolution inferences for CTL, and is implemented as a modification of SPASS. In particular, to implement the eventuality resolution rules, CTL-RP augments SPASS with an algorithm, called the loop search algorithm, for tackling eventualities in CTL. To study the performance of CTL-RP, we have compared it with a tableau-based theorem prover for CTL. The experiments show good performance of CTL-RP. Thirdly, we apply the approach we used to develop R^{≻,S}_CTL to the development of a clausal resolution calculus for a fragment of Alternating-time Temporal Logic (ATL). ATL is a generalisation and extension of branching-time temporal logic in which the temporal operators are parameterised by sets of agents. Informally speaking, CTL formulae can be treated as ATL formulae with a single agent.
Selective quantification over paths enables ATL to explicitly express coalition abilities, which naturally makes ATL a formalism for the specification and verification of open systems and game-like multi-agent systems. In this thesis, we focus on the Next-time fragment of ATL (XATL), which is closely related to Coalition Logic. The satisfiability problem of XATL has lower complexity than that of ATL, but there are still many applications in various strategic games and multi-agent systems that can be represented in, and reasoned about in, XATL. In this thesis, we present a resolution calculus R_XATL for XATL to tackle its satisfiability problem. The calculus requires a polynomial-time computable transformation of an arbitrary XATL formula to an equisatisfiable clausal normal form. The calculus itself consists of a set of resolution rules and rewrite rules. We prove the soundness of the calculus and outline a completeness proof for R_XATL. We also intend to extend our calculus R_XATL to full ATL in the future.
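The flavour of step resolution can be conveyed with its propositional core: two clauses containing complementary literals resolve on that literal. The clause encoding below is an illustrative stand-in; the actual calculus operates on CTL clauses with indexed path quantifiers.

    def resolve(c1, c2):
        """All resolvents of two clauses; a literal is an (atom, polarity) pair."""
        resolvents = []
        for atom, polarity in c1:
            if (atom, not polarity) in c2:
                resolvents.append(
                    (c1 - {(atom, polarity)}) | (c2 - {(atom, not polarity)}))
        return resolvents

    # Resolving {p, q} with {not-p, r} on p yields {q, r}:
    c1 = frozenset({("p", True), ("q", True)})
    c2 = frozenset({("p", False), ("r", True)})
    print(resolve(c1, c2))  # -> [frozenset({('q', True), ('r', True)})]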
169

An investigation into the use of negation in Inductive Rule Learning for text classification

Chua, Stephanie Hui Li January 2012 (has links)
This thesis seeks to establish whether the use of negation in Inductive Rule Learning (IRL) for text classification is effective. Text classification is a widely researched topic in the domain of data mining. There have been many techniques directed at text classification; one of them is IRL, widely chosen because of its simplicity, comprehensibility and interpretability by humans. IRL is a process whereby rules of the form antecedent → conclusion are learnt to build a classifier. Thus, the learnt classifier comprises a set of rules, which are used to perform classification. To learn a rule, words from pre-labelled documents, known as features, are selected to be used as conjunctions in the rule antecedent. These rules typically do not include any negated features in their antecedent, although in some cases, as demonstrated in this thesis, the inclusion of negation is required and beneficial for the text classification task. With respect to the use of negation in IRL, two issues need to be addressed: (i) the identification of the features to be negated, and (ii) the development of rule refinement strategies that generate rules both with and without negation. To address the first issue, feature space division is proposed, whereby the feature space containing features to be used for rule refinement is divided into three sub-spaces to facilitate the identification of the features which can be advantageously negated. To address the second issue, eight rule refinement strategies are proposed, which are able to generate rules both with and without negation. Typically, single keywords deemed significant for differentiating between classes are selected to be used in the text representation for the text classification task. Phrases have also been proposed, because they are considered to be semantically richer than single keywords. Therefore, in the work conducted in this thesis, three different types of phrases (n-gram phrases, keyphrases and fuzzy phrases) are extracted to be used as the text representation, in addition to single keywords. To establish the effectiveness of the use of negation in IRL, the eight proposed rule refinement strategies are compared with one another, using keywords and the three different types of phrases as the text representation, to determine whether the best strategy is one that generates rules with negation or without negation. Two types of classification task are conducted: binary classification and multi-class classification. The best strategy in the proposed IRL mechanism is compared to five existing text classification techniques with respect to binary classification: (i) the Sequential Minimal Optimization (SMO) algorithm, (ii) Naive Bayes (NB), (iii) JRip, (iv) OlexGreedy and (v) OlexGA, from the Waikato Environment for Knowledge Analysis (WEKA) machine learning workbench. In the multi-class classification task, the proposed IRL mechanism is compared to the Total From Partial Classification (TFPC) algorithm. The datasets used in the experiments include three text datasets, namely the 20 Newsgroups, Reuters-21578 and Small Animal Veterinary Surveillance Network (SAVSNET) datasets, and five UCI Machine Learning Repository tabular datasets. The results obtained from the experiments showed that the strategies which generated rules with negation were more effective when the keyword representation was used, and less prominent when the phrase representations were used.
Strategies which generated rules with negation also performed better in binary classification than in multi-class classification. In comparison with the other machine learning techniques selected, the proposed IRL mechanism was shown to generally outperform all the compared techniques, and was competitive with SMO.
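A toy sketch of a rule set whose antecedents may include negated features illustrates why negation can help; the rules, features, and documents below are hypothetical, not drawn from the thesis' experiments.

    # A rule is (required words, words that must be ABSENT, class label).
    rules = [
        ({"ball", "team"}, {"election"}, "sport"),
        ({"vote"},         set(),        "politics"),
    ]

    def classify(doc_words, default="unknown"):
        """Fire the first rule whose positive and negated conditions both hold."""
        for positive, negated, label in rules:
            if positive <= doc_words and not (negated & doc_words):
                return label
        return default

    print(classify({"ball", "team", "win"}))       # -> 'sport'
    print(classify({"ball", "team", "election"}))  # negation blocks the rule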
170

Predictive trend mining for social network analysis

Nohuddin, Puteri January 2012 (has links)
This thesis describes research within the theme of trend mining as applied to social network data. Trend mining is a type of temporal data mining that provides insight into how information changes over time. In the context of the work described in this thesis, the focus is on how information contained in social networks changes with time. The work proposes a number of data-mining-based techniques directed at mechanisms not only to detect change, but also to support the analysis of change, with respect to social network data. To this end, a trend mining framework is proposed to act as a vehicle for evaluating the ideas presented in this thesis. The framework is called the Predictive Trend Mining Framework (PTMF). It is designed to support "end-to-end" social network trend mining and analysis. The work described in this thesis is divided into two elements: Frequent Pattern Trend Analysis (FPTA) and Prediction Modelling (PM). For evaluation purposes, three social network datasets have been considered: Great Britain Cattle Movement, Deeside Insurance and Malaysian Armed Forces Logistic Cargo. The evaluation indicates that a sound mechanism for identifying and analysing trends, and for using this trend knowledge for prediction purposes, has been established.
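The Frequent Pattern Trend Analysis element can be illustrated with a minimal sketch: compute a pattern's support in each time epoch, and the resulting sequence of supports is the trend to be analysed or fed into prediction modelling. The transactions and pattern below are illustrative stand-ins, not data from the evaluated networks.

    epochs = [
        [{"a", "b"}, {"a", "c"}, {"b", "c"}],       # epoch 1 transactions
        [{"a", "b"}, {"a", "b", "c"}, {"c"}],       # epoch 2
        [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}],  # epoch 3
    ]
    pattern = {"a", "b"}

    def support(pattern, transactions):
        """Fraction of an epoch's transactions containing the pattern."""
        return sum(pattern <= t for t in transactions) / len(transactions)

    trend = [support(pattern, epoch) for epoch in epochs]
    print("support trend for {a, b}:", trend)  # a growing trend here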
