111 |
Anti-Spam Study: an Alliance-based Approach
Chiu, Yu-fen 12 September 2006
The growing problem of spam has generated a need for reliable anti-spam filters. Many filtering techniques, including machine learning and data mining, are used to reduce the amount of spam. Such algorithms can achieve very high accuracy, but at the cost of some false positives, and false positives are generally prohibitively expensive in the real world. Much work has been done to improve specific algorithms for the task of detecting spam, but less work has been reported on leveraging multiple algorithms in email analysis. This study presents an alliance-based approach to classify, discover, and exchange interesting information on spam. The spam filter in this study is built on a mixture of rough set theory (RST), a genetic algorithm (GA), and the XCS classifier system.
RST can process imprecise and incomplete data such as spam. GA can speed up the search for the optimal solution (i.e., the rules used to block spam). The reinforcement learning of XCS is a good mechanism for suggesting the appropriate classification for an email. The results of spam filtering by the alliance-based approach are evaluated by several statistical methods, and the filter performs well. Two main conclusions can be drawn from this study: (1) the rules exchanged from other mail servers do help the filter block more spam than before; (2) a combination of algorithms both improves accuracy and reduces false positives for the problem of spam detection.
|
112 |
Multiple Classifier Systems For A Generic Missile Warner
Basibuyuk, Kubilay 01 June 2009
A generic missile warner decision algorithm for airborne platforms with an emphasis on multiple classifier systems is proposed within the scope of this thesis. For developing the algorithm, simulation data are utilized. The simulation data are created in order to cover a wide range of real-life scenarios and for this purpose a scenario creation methodology is proposed. The scenarios are simulated by a generic missile warner simulator and tracked object data for each scenario are produced.
Various feature extraction techniques are applied to the output data of the scenarios and feature sets are generated. Feature sets are examined by using various statistical methods. The performance of selected multiple classifier systems is evaluated for all feature sets and experimental results are presented.
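As a rough illustration of what a multiple classifier system does, the sketch below combines three base classifiers by majority vote. The feature names (`intensity`, `growth_rate`, `duration_s`), the thresholds, and the labels are hypothetical and are not taken from the thesis; the abstract does not say which combination rule was used.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label chosen by most classifiers; ties go to the label seen first."""
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

# Three hypothetical base classifiers, each a simple threshold on one feature.
def by_intensity(track):
    return "threat" if track["intensity"] > 0.6 else "clutter"

def by_growth(track):
    return "threat" if track["growth_rate"] > 0.3 else "clutter"

def by_duration(track):
    return "threat" if track["duration_s"] > 1.5 else "clutter"

def classify(track):
    """Ask every base classifier for a label, then combine the votes."""
    votes = [clf(track) for clf in (by_intensity, by_growth, by_duration)]
    return majority_vote(votes)

# Made-up tracked object: two of the three rules vote "threat".
track = {"intensity": 0.8, "growth_rate": 0.1, "duration_s": 2.0}
print(classify(track))
```

Majority voting is only one of several possible combination rules; weighted votes or a trained combiner would fit the same structure.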
|
113 |
Inspection of LCD Light-guide Plate Using Moment-invariants
Chang-chien, Hsin-yu 10 September 2007
Inspection of LCD light-guide plates using digital image processing is proposed. Binary dot-pattern images from SEM observation are obtained by image segmentation. Pattern recognition for the images is then performed using moment invariants, a Bayes classifier, and a neural network. A rotation-independent classification that uses only one descriptive shape factor is also proposed to reduce storage space. The method is found to work successfully for inspecting different defects on the plate at any rotation angle and image scale.
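To make the idea of moment invariants concrete, here is a minimal sketch that computes Hu's first two invariants for a binary image and checks that they do not change when the image is rotated by 90 degrees. The test pattern is made up, and the thesis may use a different set of invariants or a different shape factor; the normalization by the zeroth moment is what gives approximate scale independence on a pixel grid.

```python
import numpy as np

def hu_first_two(image):
    """Compute Hu's first two moment invariants of a 2-D binary image."""
    img = np.asarray(image, dtype=float)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    xbar = (xs * img).sum() / m00  # centroid, gives translation invariance
    ybar = (ys * img).sum() / m00

    def eta(p, q):
        # Normalized central moment of order (p, q).
        mu = ((xs - xbar) ** p * (ys - ybar) ** q * img).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2

# Made-up L-shaped dot pattern standing in for a segmented SEM image.
dot = np.zeros((8, 8))
dot[1:6, 2:4] = 1
dot[4:6, 4:7] = 1

original = hu_first_two(dot)
rotated = hu_first_two(np.rot90(dot))
print(np.allclose(original, rotated))  # the two invariants agree after rotation
```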
|
114 |
Nominal Arguments and Language Variation
Jiang, Li January 2012
This dissertation investigates nominal arguments in classifier languages (ClLs). There are two main goals. The first is to explore what is constant and what varies in the way ClLs form nominal arguments. The second goal is to understand the relationship between argument formation in classifier languages and argument formation more generally. Three classifier languages are the center of the discussion: Mandarin, a ClL without overt evidence of determiners, Yi, a head-final ClL which will be shown to have overt determiners, and Bengali, a ClL that has already been argued to have overt evidence of determiners. In addition to paying particular attention to these three ClLs, the discussion of nominal arguments also covers a wider range of ClLs and number marking languages (NMLs) from Romance, Germanic, and Slavic, as well as Hindi. In this dissertation we will argue for the following three points. First, numeral constructions (NCs) have identical syntax and semantics in ClLs and NMLs (possibly universally); specifically, we argue that NCs have a predicative interpretation and an argumental interpretation that arises via a choice function in the lexical entry of numerals. Secondly, we argue that language variation in the nominal domain is due primarily to two interrelated factors: what nouns denote (kinds or properties) and what low functional heads (i.e. number morphology (#) and classifiers) denote; we show how this variation in the nominal domain can be related to a more general macroparameter. Thirdly, we argue that determiners in ClLs are in fact expected, contrary to the standard view, but while they can combine with numeral-classifier phrases (ClPs) and numeral-less ClPs, they can never combine with bare nouns. The proposal is that bare nouns in ClLs are always argumental regardless of whether or not there are determiners. 
In the last chapter of this dissertation, we show that the developed analysis of nominal arguments and language variation yields an updated language typology of argument formation. With this proposed analysis of nominal arguments, we may be a few steps closer to a general theory of argument formation of wide cross-linguistic applicability. / Linguistics
|
115 |
Detection, Localization, and Recognition of Faults in Transmission Networks Using Transient Currents
Perera, Nuwan 18 September 2012
The fast clearing of faults is essential for preventing equipment damage and preserving the stability of power transmission systems that operate with smaller margins. This thesis examined the application of fault-generated transients for fast detection and isolation of faults in a transmission system. The basis of the transient-based protection scheme developed and implemented in this thesis is the fault current directions identified by a set of relays located at different nodes of the system. The direction of the fault currents relative to a relay location is determined by comparing the signs of the wavelet coefficients of the currents measured in all branches connected to the node. The faulted segment can be identified by combining the fault directions identified at different locations in the system. In order to facilitate this, each relay is linked with the relays located at the adjacent nodes through a telecommunication network.
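The sign comparison described above can be sketched as follows. This is a toy under stated assumptions, not the relay algorithm of the thesis: it uses a level-1 Haar wavelet, takes the sign of the largest-magnitude detail coefficient as the sign of the transient, and assumes that the one branch whose sign differs from all the others points toward the fault. The branch names and current samples are made up.

```python
import math

def haar_detail(signal):
    """Level-1 Haar detail coefficients of an even-length sequence."""
    return [(signal[i] - signal[i + 1]) / math.sqrt(2)
            for i in range(0, len(signal) - 1, 2)]

def transient_sign(signal):
    """Sign (+1 or -1) of the largest-magnitude detail coefficient, 0 if flat."""
    peak = max(haar_detail(signal), key=abs, default=0.0)
    return (peak > 0) - (peak < 0)

def odd_branch_out(branch_currents):
    """Return the single branch whose transient sign differs from all others.

    Needs at least three branches to be unambiguous; returns None otherwise.
    """
    signs = {name: transient_sign(cur) for name, cur in branch_currents.items()}
    for candidate in (1, -1):
        hits = [name for name, s in signs.items() if s == candidate]
        rest = [s for s in signs.values() if s != candidate]
        if len(hits) == 1 and len(rest) >= 2 and all(s == -candidate for s in rest):
            return hits[0]
    return None

# Made-up current samples for three branches connected to one node.
currents = {
    "line_a": [0, 0, 0, -2.0, -1.8, -1.6, -1.4, -1.2],
    "line_b": [0, 0, 0, 1.0, 0.9, 0.8, 0.7, 0.6],
    "line_c": [0, 0, 0, 0.8, 0.7, 0.6, 0.5, 0.4],
}
print(odd_branch_out(currents))
```

Combining such per-node decisions across adjacent relays is what would localize the faulted segment; that step is not shown here.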
In order to prevent possible malfunctioning of relays due to transients originating from non-fault related events, a transient recognition system to supervise the relays is proposed. The applicability of different classification methods to develop a reliable transient recognition system was examined. A Hidden Markov Model classifier that utilizes the energies associated with the wavelet coefficients of the measured currents as input features was selected as the most suitable solution.
Performance of the protection scheme was evaluated using a high voltage transmission system simulated in PSCAD/EMTDC simulation software. The custom models required to simulate the complete protection scheme were implemented in PSCAD/EMTDC. The effects of various factors such as fault impedance, signal noise, fault inception angle and current transformer saturation were investigated. The performance of the protection scheme was also tested with the field recorded signals.
Hardware prototypes of the fault direction identification scheme and the transient classification system were implemented and tested under different practical scenarios using input signals generated with a real-time waveform playback instrument. The test results presented in this thesis successfully demonstrate the potential of using transient signals embedded in currents for detection, localization and recognition of faults in transmission networks in a fast and reliable manner.
|
116 |
INVESTIGATIONS INTO THE COGNITIVE ABILITIES OF ALTERNATE LEARNING CLASSIFIER SYSTEM ARCHITECTURES
Gaines, David Alexander 01 January 2006
The Learning Classifier System (LCS) and its descendant, XCS, are promising paradigms for machine learning design and implementation. Whereas LCS allows classifier payoff predictions to guide system performance, XCS focuses on payoff-prediction accuracy instead, allowing it to evolve "optimal" classifier sets in particular applications requiring rational thought. This research examines LCS and XCS performance in artificial situations with broad social/commercial parallels, created using the non-Markov Iterated Prisoner's Dilemma (IPD) game-playing scenario, where the setting is sometimes asymmetric and where irrationality sometimes pays. This research systematically perturbs a "conventional" IPD-playing LCS-based agent until it results in a full-fledged XCS-based agent, contrasting the simulated behavior of each LCS variant in terms of a number of performance measures. The intent is to examine the XCS paradigm to understand how it better copes with a given situation (if it does) than the LCS perturbations studied.
Experimental results indicate that the majority of the architectural differences do have a significant effect on the agents' performance with respect to the performance measures used in this research. The results of these competitions indicate that while each architectural difference significantly affected its agent's performance, no single architectural difference could be credited as causing XCS's demonstrated superiority in evolving optimal populations. Instead, the data suggest that XCS's ability to evolve optimal populations in the multiplexer and IPD problem domains results from the combined and synergistic effects of multiple architectural differences.
In addition, it is demonstrated that XCS is able to reliably evolve the Optimal Population [O] against the Tit-for-Tat (TFT) opponent. This result supports Kovacs' Optimality Hypothesis in the IPD environment and is significant because it is the first demonstrated occurrence of this ability in an environment other than the multiplexer and Woods problem domains.
It is therefore apparent that while XCS performs better than its LCS-based counterparts, its demonstrated superiority may not be attributed to a single architectural characteristic. Instead, XCS's ability to evolve optimal classifier populations in the multiplexer problem domain and in the IPD problem domain studied in this research results from the combined and synergistic effects of multiple architectural differences.
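For readers unfamiliar with the IPD setting, the following sketch plays fixed strategies against a Tit-for-Tat opponent. It is not an LCS or XCS agent. The payoff values (5, 3, 1, 0) are the ones commonly used in the IPD literature and are an assumption here, as the abstract does not state the payoffs used in the thesis.

```python
# Assumed payoff table: (my move, their move) -> (my payoff, their payoff).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return their_history[-1] if their_history else "C"

def always_defect(my_history, their_history):
    return "D"

def always_cooperate(my_history, their_history):
    return "C"

def play(strategy_a, strategy_b, rounds):
    """Play the iterated game and return the total scores of both players."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(always_cooperate, tit_for_tat, 10))  # (30, 30)
print(play(always_defect, tit_for_tat, 10))     # (14, 9)
```

Under these assumed payoffs, defecting wins the first round but earns less over ten rounds than steady cooperation, which illustrates why the locally rational move is not always the one that pays.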
|
117 |
Identification of tool breakage in a drilling process
2015 February 1900
In an effort to increase machining efficiency and minimize costs, research into tool condition monitoring (TCM) systems has focused on developing methods to allow for unmanned machining. For drilling processes, such systems typically monitor the tool condition indirectly by measuring spindle torque and feed force as well as vibrations, including acoustic emission (AE: mechanical vibrations at frequencies above 100 kHz). This project aimed to advance the state of the art in TCM by developing a method to detect sudden tool failures in large-diameter (> 25 mm) indexable insert drills. This project was a continuation of the research conducted by Mr. R. Griffin (a former MSc student), who developed a model capable of predicting long-term wear trends in indexable insert drills [1]. Notably, his model was unable to react to sudden tool breakage due to tool chipping, which is the problem addressed by this project and presented in this thesis.
In order to develop and train models able to detect sudden tool failure, an experimental setup was developed and installed at the site of the project's industry partner. The setup's main feature was a pair of AE sensors added to the existing torque and force sensors. On this setup, experiments were conducted by drilling 2251 holes in workpieces using indexable insert drills, with or without the insert breaking. Holes drilled without the insert breaking were labeled good holes, and holes drilled with the insert breaking were labeled bad holes. During the drilling process, data was collected from current sensors attached to the spindle motor and feed motor as well as from one AE sensor on the spindle and one on the workpiece.
Algorithms were developed that use the signals from the spindle motor current and feed motor current sensors to divide the signal recorded for each hole into the sections of the drilling cycle (i.e., entrance, steady-state, exit, etc.). Steady-state time-domain features were extracted from the sensor signals measured for all holes drilled in the experiments, and the extracted features were used to train and test the classifier models. These models were cross-validated to determine which type of model was the best fit for the drilling data collected. The results show that most of the classifiers tested can identify sudden tool breakage based on the data recorded in the present study, with varying degrees of success. The naïve Bayes classifier detected the most failures but suffered from a large number of falsely detected failures. Both the classification tree and linear discriminant analysis classifiers had lower failure detection rates than the naïve Bayes classifier, but did not suffer from the same number of false positives; as such, these two classifiers had higher overall classification rates than the naïve Bayes classifier.
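The trade-off between failure detection rate and false positives can be made concrete with a small sketch. The labels below are made up for illustration and are not data from the thesis; they only show how a classifier that catches every failure can still have a lower overall classification rate than a more cautious one.

```python
def breakage_metrics(true_labels, predicted_labels):
    """Summarize a breakage detector; True means the insert broke.

    Assumes both classes occur at least once in true_labels.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)          # breakage caught
    fn = sum(1 for t, p in pairs if t and not p)      # breakage missed
    fp = sum(1 for t, p in pairs if not t and p)      # false alarm
    tn = sum(1 for t, p in pairs if not t and not p)  # good hole passed
    return {
        "detection_rate": tp / (tp + fn),
        "false_alarm_rate": fp / (fp + tn),
        "overall_rate": (tp + tn) / len(pairs),
    }

# Made-up example: 10 holes, 2 of them with a broken insert.
truth = [True, True] + [False] * 8
eager = [True, True, True, True, True] + [False] * 5  # flags many holes
cautious = [True, False] + [False] * 8                # flags few holes

print(breakage_metrics(truth, eager))
print(breakage_metrics(truth, cautious))
```

In this toy data the eager detector reaches a detection rate of 1.0 but an overall rate of 0.7, while the cautious one detects only half the failures yet reaches an overall rate of 0.9, because good holes far outnumber bad ones.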
These results suggest that classification tree and linear discriminant analysis methods are better suited for the drilling application and that the time-domain features should be complemented by others, such as the features extracted from the frequency domain, to accurately diagnose the tool condition. Future research should focus on extracting frequency and time-frequency domain features as these features might contain more information on tool condition. In addition, methods of examining features at the entrance and exit of the holes should be investigated as these two points in the drilling cycle are the most prone to sudden tool failure.
|
118 |
界定日語中的分類詞 / Identifying classifiers in Japanese
Wang, Wei (王維) Unknown Date
Japanese, like Chinese, is a language that uses numeral classifiers, which can combine with both numerals and nouns. Scholars have long discussed Japanese measure words, yet there is no consensus on which items count as classifiers or on how to distinguish counters, measure words, and classifiers. One reason is that the definition of numeral classifiers is itself controversial in linguistics; another is that traditional Japanese grammar books tend to use the single category “counter” (助數詞) for every morpheme that follows a numeral. As a result, there is no consistent analysis for identifying Japanese classifiers.
The goal of this thesis is to identify Japanese classifiers based on one consistent and clear definition. Eight previous works were reviewed, namely the studies of four linguists and the classifier lists in four grammar books, and 673 possible classifiers were collected. Raw data from two corpora, JpWac and Google Search, were then used to run syntactic and semantic tests on each of the 673 candidates. As a result, only 115 true Japanese classifiers were found.
After the true classifiers were identified, a bottom-up classification of nouns was built from these 115 classifiers to understand how native Japanese speakers categorize nouns. Finally, a questionnaire asked native speakers of Japanese to rate usage frequency, giving a preliminary picture of classifier use in modern Japanese. Based on the survey, only 27 of the 115 classifiers are estimated to be in frequent use. It is hoped that these true classifiers can serve as a reference for Japanese teaching in Taiwan.
|
119 |
Automated recognition of handwritten mathematics
MacLean, Scott January 2014
Most software programs that deal with mathematical objects require input expressions to be linearized using somewhat awkward and unfamiliar string-based syntax. It is natural to desire a method for inputting mathematics using the same two-dimensional syntax employed with pen and paper, and the increasing prevalence of pen- and touch-based interfaces causes this topic to be of practical as well as theoretical interest. Accurately recognizing two-dimensional mathematical notation is a difficult problem that requires not only theoretical advancement over the traditional theories of string-based languages, but also careful consideration of runtime efficiency, data organization, and other practical concerns that arise during system construction.
This thesis describes the math recognizer used in the MathBrush pen-math system. At a high level, the two-dimensional syntax of mathematical writing is formalized using a relational grammar. Rather than reporting a single recognition result, all recognizable interpretations of the input are simultaneously represented in a data structure called a parse forest. Individual interpretations may be extracted from the forest and reported one by one as the user requests them. These parsing techniques necessitate robust tree scoring functions, which themselves rely on several lower-level recognition processes for stroke grouping, symbol recognition, and spatial relation classification.
The thesis covers the recognition, parsing, and scoring aspects of the MathBrush recognizer, as well as the algorithms and assumptions necessary to combine those systems and formalisms together into a useful and efficient software system. The effectiveness of the resulting system is measured through two accuracy evaluations. One evaluation uses a novel metric based on user effort, while the other replicates the evaluation process of an international accuracy competition. The evaluations show that not only is the performance of the MathBrush recognizer improving over time, but it is also significantly more accurate than other academic recognition systems.
|
120 |
Discourse-givenness of noun phrases: theoretical and computational models
Ritz, Julia January 2013
This thesis gives formal definitions of discourse-givenness, coreference and reference, and reports on experiments with computational models of discourse-givenness of noun phrases for English and German.
Definitions are based on Bach's (1987) work on reference, Kibble and van Deemter's (2000) work on coreference, and Kamp and Reyle's Discourse Representation Theory (1993).
For the experiments, the following corpora with coreference annotation were used: MUC-7, OntoNotes and ARRAU for English, and TüBa-D/Z for German. The classification algorithms covered are J48 decision trees, the rule-based learner Ripper, and linear support vector machines. New features are suggested that represent the noun phrase's specificity as well as its context, and they lead to a significant improvement in classification quality.
|