331

Analysis of Transactional Data with Long Short-Term Memory Recurrent Neural Networks

Nawaz, Sabeen January 2020 (has links)
An issue authorities and banks face is fraud related to payments and transactions, where huge monetary losses occur to a party or where money laundering schemes are carried out. Previous work in the field of machine learning for fraud detection has addressed the issue as a supervised learning problem. In this thesis, we propose a model which can be used in a fraud detection system with transactions and payments that are unlabeled. The proposed model is a Long Short-Term Memory in an auto-encoder-decoder network (LSTM-AED), which is trained and tested on transformed data. The data is transformed by reducing it to principal components and clustering it with K-means. The model is trained to reconstruct the sequence with high accuracy. Our results indicate that the LSTM-AED performs better than a random sequence generating process in learning and reconstructing a sequence of payments. We also found that a huge loss of information occurs in the pre-processing stages. / (Translated from Swedish:) Unauthorized transactions and payment fraud can lead to large financial losses for banks and authorities. In machine learning, this problem has previously been handled with classifiers via supervised learning. In this thesis we propose a model that can be used in a system for detecting fraud. The model is applied to unlabeled data with many different variables. The model used is a Long Short-Term Memory in an auto-encoder-decoder network. The data is transformed with PCA and clustered with K-means. The model is trained to reconstruct a sequence of payments with high accuracy. Our results show that the LSTM-AED performs better than a model that only guesses the next point in the sequence. The results also show that much of the information in the data is lost during pre-processing and transformation.
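The following Python sketch illustrates the general shape of such a pipeline: transactions reduced to principal components, clustered with K-means, and the resulting label sequences reconstructed by an LSTM auto-encoder-decoder whose reconstruction error can flag unusual sequences. It is not the thesis's code; the synthetic data, feature count, number of clusters, sequence length and layer sizes are all illustrative assumptions.

```python
# Illustrative sketch only -- synthetic data, assumed dimensions and hyperparameters.
import numpy as np
import tensorflow as tf
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
raw = rng.normal(size=(5000, 30))                 # stand-in for transaction features

# Pre-processing: reduce to principal components, then cluster with K-means.
components = PCA(n_components=5).fit_transform(raw)
labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(components)

# Arrange one-hot cluster labels into fixed-length payment sequences.
seq_len, n_clusters = 10, 8
onehot = np.eye(n_clusters)[labels]
n_seq = len(onehot) // seq_len
sequences = onehot[: n_seq * seq_len].reshape(n_seq, seq_len, n_clusters).astype("float32")

# LSTM auto-encoder-decoder: encode the sequence, repeat the code vector, decode it back.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_len, n_clusters)),
    tf.keras.layers.LSTM(32),                                # encoder
    tf.keras.layers.RepeatVector(seq_len),
    tf.keras.layers.LSTM(32, return_sequences=True),         # decoder
    tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(n_clusters, activation="softmax")),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit(sequences, sequences, epochs=5, batch_size=64, verbose=0)

# Sequences that reconstruct poorly are candidates for closer inspection.
recon = model.predict(sequences, verbose=0)
errors = tf.keras.losses.categorical_crossentropy(sequences, recon).numpy().mean(axis=1)
```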
332

Deriving the Time-Course of the Dominant Frequency of Atrial Fibrillation from a Long Term in vivo Sheep Model using QRST Removal Techniques

Price, Nicholas F. January 2011 (has links)
No description available.
333

Nonlinear Wavelet Compression Methods for Ion Analyses and Dynamic Modeling of Complex Systems

Cao, Libo January 2004 (has links)
No description available.
334

Facial Expression Recognition by Using Class Mean Gabor Responses with Kernel Principal Component Analysis

Chung, Koon Yin C. 16 April 2010 (has links)
No description available.
335

Characterizing the Quaternary Hydrostratigraphy of Buried Valleys using Multi-Parameter Borehole Geophysics, Georgetown, Ontario

Brennan, Andrew N. 10 1900 (has links)
In 2009, the Regional Municipality of Halton and McMaster University initiated a 2-year collaborative study (Georgetown Aquifer Characterization Study, GACS) of the groundwater resource potential of Quaternary sediments near Georgetown, Ontario. As part of that study, this thesis investigated the Quaternary infill stratigraphy of the Middle Sixteen Mile Creek (MSMC) and Cedarvale (CV) buried valley systems using newly acquired core and borehole geophysical data. Multi-parameter geophysical log suites (natural gamma, EM conductivity, resistivity, magnetic susceptibility, full-waveform sonic, caliper) were acquired in 16 new boreholes (16 m to 55 m deep) and pre-existing monitoring wells, and compiled from archival data. Characteristic log responses (electrofacies) were identified and correlated with core to produce a detailed subsurface model of a 20 km² area to the southwest of Georgetown. Nine distinctive lithostratigraphic units were identified and their geometry mapped across the study area as structure contour and isochore thickness maps. The subsurface model shows that the CV valley truncates the Late Wisconsin MSMC stratigraphy along a channelized erosional unconformity and is a younger (post-glacial?) sediment-hosted valley system. Model results demonstrate the high level of stratigraphic heterogeneity and complexity that is inherent in bedrock valley systems and provide a geological framework for understanding groundwater resource availability.

Principal component analysis (PCA) was applied to selected log suites to evaluate the potential for objective lithologic classification using log data. Gamma, resistivity and conductivity logs were most useful for lithologic typing, while p-wave velocity and resistivity logs were more diagnostic of compact diamict units. Cross plots of the first and second principal components of log parameters discriminated silts and clays/shales from sand/gravel and diamict lithofacies. The results show that PCA is a viable method for predicting subsurface lithology in un-cored boreholes and can assist in the identification of hydrostratigraphic units. / Master of Science (MSc)
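As a hedged illustration of the PCA-based lithologic typing described above (not the thesis's actual workflow or data), the sketch below standardises a small multi-parameter log suite, computes the first two principal components and cross-plots them; the synthetic log values, units and sample count are assumptions.

```python
# Illustrative sketch only -- synthetic log values, assumed units and sample count.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
n = 300
logs = np.column_stack([
    rng.normal(60, 15, n),       # natural gamma (API)
    rng.lognormal(3, 0.5, n),    # resistivity (ohm-m)
    rng.normal(25, 8, n),        # EM conductivity (mS/m)
    rng.normal(2000, 300, n),    # p-wave velocity (m/s)
])

# Standardise the logs, then project onto the first two principal components.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(logs))

plt.scatter(scores[:, 0], scores[:, 1], s=8)
plt.xlabel("PC1")
plt.ylabel("PC2")
plt.title("Cross-plot of log-derived principal components")
plt.show()
```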
336

Analysis of Zero-Heavy Data Using a Mixture Model Approach

Wang, Shin Cheng 30 March 1998 (has links)
The problem of a high proportion of zeroes has long been of interest in data analysis and modeling; however, there is no unique solution to it. The appropriate solution depends on the particular situation and the design of the experiment. For example, different biological, chemical, or physical processes may follow different distributions and behave differently, and different mechanisms may generate the zeroes and require different modeling approaches, so a single general solution is neither feasible nor flexible enough. In this dissertation, I focus on cases where zeroes are produced by mechanisms that create distinct sub-populations of zeroes. The dissertation is motivated by problems in chronic toxicity testing, where data sets contain a high proportion of zeroes. The analysis of chronic test data is complicated because zeroes arise from two different sources: mortality and non-reproduction. Researchers therefore have to separate zeroes due to mortality from zeroes due to low fecundity. A mixture model approach, which combines the two mechanisms, is appropriate here because it can incorporate the extra zeroes due to mortality. A zero-inflated Poisson (ZIP) model is used for modeling fecundity in the Ceriodaphnia dubia toxicity test. A generalized estimating equation (GEE) based ZIP model is developed to handle longitudinal data with zeroes due to mortality. A joint estimate of the inhibition concentration (ICx) is also developed as a potency estimate based on the mixture model approach. It is found that the ZIP model performs better than the regular Poisson model when mortality is high. This kind of toxicity testing also involves longitudinal data, where the same subject is measured over a period of seven days. The GEE model allows the flexibility to incorporate the extra zeroes and a correlation structure among the repeated measures. The problem of zero-heavy data also exists in environmental studies in which the growth or reproduction rates of multiple species are measured, giving rise to multivariate data. Since the inter-relationships between different species are embedded in the correlation structure, the study of the information in the correlation of the variables, which is often accessed through principal component analysis, is one of the major interests in multivariate data. In cases where mortality influences the variables of interest but is not itself the subject of interest, the mixture approach can be applied to recover the information in the correlation structure. To investigate the effect of zeroes on multivariate data, simulation studies of principal component analysis are performed, and a method that recovers the information in the correlation structure is presented. / Ph. D.
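The zero-inflated Poisson idea in the abstract can be illustrated with a minimal maximum-likelihood fit: a mixture of a point mass at zero (the mortality-type zeroes) and a Poisson count process (fecundity). The sketch below is illustrative only; the simulated data, parameter values and optimiser settings are assumptions, and it omits the covariates, GEE extension and ICx estimation developed in the dissertation.

```python
# Minimal ZIP sketch -- simulated counts and assumed starting values.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

rng = np.random.default_rng(2)
pi_true, lam_true = 0.3, 4.0                        # mixing proportion, Poisson mean
y = np.where(rng.random(500) < pi_true, 0, rng.poisson(lam_true, 500))

def neg_loglik(params):
    # Parameters live on unconstrained scales: logit(pi) and log(lambda).
    pi = 1.0 / (1.0 + np.exp(-params[0]))
    lam = np.exp(params[1])
    p_zero = pi + (1.0 - pi) * np.exp(-lam)          # P(Y = 0): structural or Poisson zero
    p_pos = (1.0 - pi) * poisson.pmf(y[y > 0], lam)  # P(Y = y) for y > 0
    return -(np.sum(y == 0) * np.log(p_zero) + np.log(p_pos).sum())

fit = minimize(neg_loglik, x0=[0.0, 1.0], method="Nelder-Mead")
pi_hat = 1.0 / (1.0 + np.exp(-fit.x[0]))
lam_hat = np.exp(fit.x[1])
print(f"estimated mixing proportion {pi_hat:.2f}, Poisson mean {lam_hat:.2f}")
```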
337

An integrated approach to the taxonomic identification of prehistoric shell ornaments

Demarchi, B., O'Connor, Sonia A., de Lima Ponzoni, A., de Almeida Rocha Ponzoni, R., Sheridan, A., Penkman, K.E.H., Hancock, Y., Wilson, J. 17 May 2014 (has links)
Shell beads appear to have been one of the earliest examples of personal adornment. Marine shells identified far from the shore evidence long-distance transport and imply networks of exchange and negotiation. However, worked beads lose taxonomic clues to identification, and this may be compounded by taphonomic alteration. Consequently, the significance of this key early artefact may be underestimated. We report the use of bulk amino acid composition of the stable intra-crystalline proteins preserved in shell biominerals and the application of pattern recognition methods to a large dataset (777 samples) to demonstrate that taxonomic identification can be achieved at genus level. Amino acid analyses are fast (<2 hours per sample) and micro-destructive (sample size <2 mg). Their integration with non-destructive techniques provides a valuable and affordable tool, which can be used by archaeologists and museum curators to gain insight into early exploitation of natural resources by humans. Here we combine amino acid analyses, macro- and microstructural observations (by light microscopy and scanning electron microscopy) and Raman spectroscopy to try to identify the raw material used for beads discovered at the Early Bronze Age site of Great Cornard (UK). Our results show that at least two shell taxa were used and we hypothesise that these were sourced locally.
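As a loose illustration of pattern recognition on bulk amino acid compositions (not the authors' actual analysis pipeline or data), the sketch below classifies two synthetic "genera" from their relative compositions using scaling, PCA and a k-nearest-neighbour classifier; the compositions, class structure and model choices are invented for demonstration.

```python
# Illustrative sketch only -- invented compositions and class structure.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_per_class, n_amino_acids = 100, 8
genus_a = rng.dirichlet(np.full(n_amino_acids, 5.0), n_per_class)            # hypothetical genus A
genus_b = rng.dirichlet(np.linspace(2.0, 9.0, n_amino_acids), n_per_class)   # hypothetical genus B
X = np.vstack([genus_a, genus_b])                  # relative amino acid compositions
y = np.array([0] * n_per_class + [1] * n_per_class)

# Scale, reduce with PCA, classify with k-nearest neighbours.
clf = make_pipeline(StandardScaler(), PCA(n_components=3), KNeighborsClassifier(5))
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```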
338

Development and Application of Novel Computer Vision and Machine Learning Techniques

Depoian, Arthur Charles, II 08 1900 (has links)
The following thesis proposes solutions to problems in two main areas of focus: computer vision and machine learning. Chapter 2 utilizes traditional computer vision methods implemented in a novel manner to successfully identify overlays contained in broadcast footage. The remaining chapters explore machine learning algorithms and apply them in various manners to big data, multi-channel image data, and ECG data. L1 and L2 principal component analysis (PCA) algorithms are implemented and tested against each other in Python, providing a metric for future implementations. Selected algorithms from this set are then applied in conjunction with other methods to solve three distinct problems. The first problem is that of big data error detection, where PCA is effectively paired with statistical signal processing methods to create a weighted controlled algorithm. The second problem is an implementation of image fusion built to detect and remove noise from multispectral satellite imagery, which performs at a high level. The final problem examines ECG medical data classification: PCA is integrated into a neural network solution that incurs only a small performance degradation while requiring less than 20% of the full data size.
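One of the ideas above, feeding PCA-compressed signals into a small neural network so that the classifier sees well under 20% of the original data size, can be sketched as follows. This is an illustrative stand-in, not the thesis's implementation: the synthetic "ECG" windows, component count and network size are assumptions.

```python
# Illustrative sketch only -- synthetic "ECG" windows and assumed model sizes.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
n, length = 1000, 200                              # 200-sample windows
t = np.linspace(0, 1, length)
y = rng.integers(0, 2, n)                          # two hypothetical beat classes
X = np.sin(2 * np.pi * (3 + 2 * y)[:, None] * t) + 0.3 * rng.normal(size=(n, length))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 30 retained components out of 200 samples per window, i.e. 15% of the original size.
model = make_pipeline(
    PCA(n_components=30),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
)
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))
```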
339

Multiscale process monitoring with singular spectrum analysis

Krishnannair, Syamala 12 1900 (has links)
Thesis (MScEng (Process Engineering))--University of Stellenbosch, 2010. / Thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Engineering (Extractive Metallurgy) in the Department of Process Engineering at the University of Stellenbosch. / ENGLISH ABSTRACT: Multivariate statistical process control (MSPC) approaches are now widely used for performance monitoring, fault detection and diagnosis in chemical processes. Conventional MSPC approaches are based on latent variable projection methods such as principal component analysis and partial least squares. These methods are suitable for handling linearly correlated data sets with minimal autocorrelation in the variables. Industrial plant data invariably violate these conditions, and several extensions to conventional MSPC methodologies have been proposed to account for these limitations. In practical situations process data usually contain contributions at multiple scales because of different events occurring at different localizations in time and frequency. To account for such multiscale nature, monitoring techniques that decompose observed data at different scales are necessary; the use of standard MSPC methodologies may otherwise lead to unreliable results due to false alarms and significant loss of information. In this thesis a multiscale methodology based on the use of singular spectrum analysis is proposed. Singular spectrum analysis (SSA) is a linear method that extracts information from short and noisy time series by decomposing the data into deterministic and stochastic components without prior knowledge of the dynamics affecting the time series. These components can be classified as independent additive time series of slowly varying trend, periodic series and aperiodic noise. SSA performs this decomposition by projecting the original time series onto a data-adaptive vector basis obtained from the series itself, based on principal component analysis (PCA). The proposed method treats each process variable as a time series, and the autocorrelation between the variables is explicitly accounted for. The data-adaptive nature of SSA makes the proposed method more flexible than other spectral techniques using fixed basis functions. Application of the proposed technique is demonstrated using simulated and industrial data and the Tennessee Eastman Challenge process, and a comparative analysis is given using the simulated data and the Tennessee Eastman process. It is found that in most cases the proposed method is superior in accurately detecting process changes and faults of different magnitudes compared to classical statistical process control (SPC) based on latent variable methods as well as to wavelet-based multiscale SPC. / AFRIKAANSE OPSOMMING (translated from Afrikaans): Multivariate statistical process control (MSPC) approaches are now widely used for performance monitoring, fault detection and diagnosis in chemical processes. Conventional MSPC approaches are based on latent variable projection methods such as principal component analysis and partial least squares. These methods are suitable for handling linearly correlated data sets with minimal autocorrelation. Industrial plant data invariably violate these conditions, and several MSPC extensions have been proposed to account for these limitations. Process data from practical conditions usually contain contributions at multiple scales, owing to different events occurring at different localizations in time and frequency. Monitoring methods that decompose the observed data at different scales are needed to account for such multiscale behaviour; the use of standard MSPC can therefore lead to unreliable results owing to false alarms and a significant loss of information. In this thesis a multiscale methodology based on the use of singular spectrum analysis (SSA) is proposed. SSA is a linear method that extracts information from short and noisy time series by decomposing the data into deterministic and stochastic components, without any prior knowledge of the dynamics affecting the time series. These components can be classified as independent additive time series: slowly varying trends, periodic series and aperiodic noise. SSA achieves this decomposition by projecting the original time series onto a data-adaptive vector basis obtained from the series itself, based on principal component analysis. The proposed method treats each process variable as a time series, and the autocorrelation between variables is explicitly taken into account. Because the SSA method adapts to the data, the proposed method is more flexible than other spectral methods that use fixed basis functions. Application of the proposed technique is demonstrated with simulated process data and the Tennessee Eastman process, and a comparative analysis is also carried out using the simulated process data and the Tennessee Eastman process. In most cases the proposed method is found to be better at accurately detecting process changes and faults of different magnitudes than classical statistical process control (SPC) based on latent variable methods, as well as wavelet-based multiscale SPC.
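A compact sketch of the basic SSA decomposition described above (embedding the series in a trajectory matrix, singular value decomposition, and reconstruction of grouped components by diagonal averaging) is given below; the window length, grouping of eigentriples and the synthetic test series are illustrative assumptions rather than the thesis's settings.

```python
# Basic SSA sketch -- assumed window length, grouping and synthetic test series.
import numpy as np

def ssa(series, window, groups):
    n = len(series)
    k = n - window + 1
    # Trajectory (Hankel) matrix: lagged copies of the series as columns.
    X = np.column_stack([series[i:i + window] for i in range(k)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    out = []
    for idx in groups:
        # Reconstruct the chosen group of eigentriples ...
        Xg = (U[:, idx] * s[idx]) @ Vt[idx, :]
        # ... and diagonally average (Hankelise) back to a series of length n.
        rec = np.array([Xg[::-1, :].diagonal(j - window + 1).mean() for j in range(n)])
        out.append(rec)
    return out

t = np.arange(500)
rng = np.random.default_rng(5)
y = 0.01 * t + np.sin(2 * np.pi * t / 50) + 0.3 * rng.normal(size=t.size)

# Group the leading eigentriple as "trend" and the next pair as "periodic".
trend, periodic = ssa(y, window=100, groups=[[0], [1, 2]])
```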
340

市場風險因子情境產生方法之研究 / Methodology for Risk Factors Scenario Generation

陳育偉, Chen, Yu-Wei Unknown Date (has links)
(Translated from Chinese:) As financial incidents continue to occur, risk management has become an important issue for banks, securities firms, insurers and other parts of the financial industry. The Value-at-Risk (VaR) model is the model most commonly used by the banking and securities industries to measure market risk. In the Monte Carlo simulation approach to VaR, the positions held in a portfolio are expressed in terms of appropriate market risk factors; various scenarios for these risk factors are then generated and combined with pricing formulas to obtain the lowest value of the portfolio over a given holding period at a given confidence level. Subtracting this lowest value from the original value gives the maximum potential loss (Jorion, 2007). / To generate market risk factor scenarios with Monte Carlo simulation, the covariance matrix of the market risk factors must first be estimated and then used to simulate thousands of risk factor scenarios. This study incorporates a time-varying covariance matrix into the Monte Carlo simulation and reduces the number of market risk factors. Monte Carlo simulation is combined with the Constant, UWMA, EWMA, Orthogonal EWMA, Orthogonal GARCH, PCA EWMA and PCA GARCH models to generate future scenarios of the market risk factors, and the methods are compared for measuring both long-horizon and short-horizon risk. The results show that the PCA EWMA model performs best, so financial institutions are advised to adopt the PCA EWMA model to manage the short- and long-horizon market risk of their portfolios.
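The general recipe in the abstract, estimating a time-varying covariance of the risk factors, reducing dimension with principal components and simulating scenarios to read off VaR, can be sketched roughly as follows. This is not the thesis's model; the simulated returns, portfolio weights, decay factor, number of retained components and confidence level are all assumptions, and the full study compares several covariance models (Constant, UWMA, EWMA, Orthogonal/PCA EWMA and GARCH variants).

```python
# Illustrative sketch only -- simulated returns and assumed parameters throughout.
import numpy as np

rng = np.random.default_rng(6)
returns = rng.normal(0.0, 0.01, size=(500, 10))    # daily returns of 10 risk factors
weights = np.full(10, 0.1)                         # linear portfolio exposures
lam = 0.94                                         # EWMA decay factor (RiskMetrics-style)

# EWMA covariance: recent observations carry exponentially larger weight.
w = lam ** np.arange(len(returns))[::-1]
w /= w.sum()
demeaned = returns - returns.mean(axis=0)
cov = (demeaned * w[:, None]).T @ demeaned

# PCA reduction: keep the leading eigenvectors of the covariance matrix.
eigval, eigvec = np.linalg.eigh(cov)
order = np.argsort(eigval)[::-1][:3]               # keep 3 principal components
B, var_pc = eigvec[:, order], eigval[order]

# Monte Carlo scenarios generated in the reduced space and mapped back to the factors.
n_sim = 100_000
pc_shocks = rng.normal(size=(n_sim, 3)) * np.sqrt(var_pc)
factor_scenarios = pc_shocks @ B.T
pnl = factor_scenarios @ weights

var_99 = -np.quantile(pnl, 0.01)                   # one-day 99% VaR
print(f"simulated one-day 99% VaR: {var_99:.4%}")
```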
