151

Correlation Between Computed Equilibrium Secondary Structure Free Energy and siRNA Efficiency

Bhattacharjee, Puranjoy 13 October 2009 (has links)
We have explored correlations between the measured efficiency of the RNAi process and several computed signatures that characterize the equilibrium secondary structure of the participating mRNA, siRNA, and their complexes. A previously published data set of 609 experimental points was used for the analysis. While virtually no correlation with the computed structural signatures is observed for individual data points, several clear trends emerge when the data are averaged over 10 bins of N ~ 60 data points per bin. The strongest trend is a positive linear (r² = 0.87) correlation between ln(remaining mRNA) and ΔG_ms, the combined free energy cost of unraveling the siRNA and creating the break in the mRNA secondary structure at the complementary target strand region. At the same time, the free energy change ΔG_total of the entire process mRNA + siRNA → (mRNA–siRNA)_complex is not correlated with RNAi efficiency, even after averaging. These general findings appear to be robust to details of the computational protocols. The correlation between the computed ΔG_ms and the experimentally observed RNAi efficiency can be used to enhance the ability of a machine learning algorithm based on a support vector machine (SVM) to predict effective siRNA sequences for a given target mRNA. Specifically, we observe a modest (3 to 7%) but consistent improvement in the positive predictive value (PPV) when the SVM training set is pre- or post-filtered according to a ΔG_ms threshold. / Master of Science
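To make the bin-averaging procedure concrete, here is a minimal Python sketch with synthetic stand-in data; the injected trend, variable names, and bin handling are illustrative assumptions, not values from the study:

    import numpy as np
    from scipy import stats

    # Hypothetical inputs, one entry per experimental data point:
    # dg_ms  = computed free energy cost; remaining_mrna = fraction of mRNA left.
    rng = np.random.default_rng(0)
    dg_ms = rng.uniform(5.0, 25.0, 609)
    remaining_mrna = np.clip(0.02 * dg_ms + rng.normal(0.2, 0.1, 609), 0.01, 1.0)

    # Sort by the structural signature and average over 10 bins (~60 points each).
    order = np.argsort(dg_ms)
    bins_x = [b.mean() for b in np.array_split(dg_ms[order], 10)]
    bins_y = [np.log(b).mean() for b in np.array_split(remaining_mrna[order], 10)]

    # Linear fit on the binned averages; r**2 quantifies the strength of the trend.
    slope, intercept, r, _, _ = stats.linregress(bins_x, bins_y)
    print(f"binned fit: slope={slope:.3f}, r^2={r**2:.2f}")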
152

Automated Vision-Based Tracking and Action Recognition of Earthmoving Construction Operations

Heydarian, Arsalan 06 June 2012 (has links)
Construction productivity and emission monitoring is currently performed either through manual stopwatch studies, which are significantly labor intensive and subject to human error, or with RFID and GPS tracking devices, which may be costly and impractical. To address these limitations, a novel computer vision-based method for automated 2D tracking, 3D localization, and action recognition of construction equipment from different camera viewpoints is presented. In the proposed method, a new algorithm based on Histograms of Oriented Gradients and hue-saturation Colors (HOG+C) is used for 2D tracking of the earthmoving equipment. Once the equipment is detected, its position is localized in 3D using a Direct Linear Transformation followed by a non-linear optimization. In order to automatically analyze the performance of these operations, a new algorithm to recognize actions of the equipment is developed. First, a video is represented as a collection of spatio-temporal features by extracting space-time interest points and describing each with a Histogram of Oriented Gradients (HOG). The algorithm automatically learns the distributions of these features by clustering their HOG descriptors. Equipment action categories are then learned using a multi-class Support Vector Machine (SVM) classifier built from binary classifiers. Given a novel video sequence, the proposed method recognizes and localizes equipment actions. The proposed method has been exhaustively tested on 859 videos from earthmoving operations. Experimental results, with average accuracies of 86.33% and 98.33% for excavator and truck action recognition respectively, reflect the promise of the proposed method for automated performance monitoring. / Master of Science
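A rough sketch of the action-recognition stage described above (a bag-of-words over HOG descriptors followed by a multi-class SVM) might look as follows; the patch sizes, vocabulary size, and action classes are hypothetical, and interest-point detection is replaced by random stand-in patches:

    import numpy as np
    from skimage.feature import hog
    from sklearn.cluster import KMeans
    from sklearn.svm import SVC

    # Hypothetical stand-in for space-time interest point patches: each video
    # yields a set of grayscale patches around its detected interest points.
    rng = np.random.default_rng(1)
    videos = [[rng.random((32, 32)) for _ in range(20)] for _ in range(30)]
    labels = rng.integers(0, 3, 30)   # hypothetical classes, e.g. dig/swing/dump

    # Describe every patch with a HOG descriptor.
    descriptors = [np.array([hog(p, pixels_per_cell=(8, 8)) for p in v])
                   for v in videos]

    # Learn a visual vocabulary by clustering all HOG descriptors.
    codebook = KMeans(n_clusters=16, n_init=10).fit(np.vstack(descriptors))

    # Represent each video as a bag-of-words histogram over the vocabulary.
    X = np.array([np.bincount(codebook.predict(d), minlength=16)
                  for d in descriptors])

    # Multi-class SVM (one-vs-one under the hood) over the video histograms.
    clf = SVC(kernel="rbf").fit(X, labels)
    print(clf.predict(X[:3]))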
153

Understanding Fixed Object Crashes with SHRP2 Naturalistic Driving Study Data

Hao, Haiyan 30 August 2018 (has links)
Fixed-object crashes have long been considered a major roadway safety concern. While previous studies tended to address such crashes in the context of roadway departures and relied heavily on police-reported accident data, this study integrated SHRP2 NDS and RID data for its analyses, which fully depict the scenarios prior to, during, and after a crash. A total of 1,639 crash and near-crash events and 1,050 baseline events were acquired. Three analysis methods, logistic regression, support vector machine (SVM), and artificial neural network (ANN), were employed for two responses: crash occurrence and severity level. Logistic regression analyses identified 16 and 10 significant variables at the 0.1 significance level, relevant to driver, roadway, environment, etc., for the two responses respectively, and led to a series of findings regarding the effects of explanatory variables on fixed-object event occurrence and the associated severity level. SVM classifiers and ANN models were also constructed to predict these two responses, and sensitivity analyses were performed on the SVM classifiers to infer the contributing effects of input variables. All three methods obtained satisfactory prediction performance, around 88% for fixed-object event occurrence and 75% for event severity level, which indicates the effectiveness of NDS event data for depicting crash scenarios and for roadway safety analyses. / Master of Science / Fixed-object crashes happen when a single vehicle strikes a roadway feature such as a curb or a median, or runs off the road and hits a roadside feature such as a tree or utility pole. They have long been considered a major highway safety concern due to their high frequency, fatality rate, and associated property cost. Previous studies of fixed-object crashes tended to address them in the context of roadway departures and relied heavily on police-reported accident data. However, many fixed-object crashes involve objects in the roadway, such as traffic control devices and roadway debris. Police-reported accident data are weak at depicting the scenarios prior to and during crashes, and many minor crashes go unreported. The Second Strategic Highway Research Program (SHRP2) Naturalistic Driving Study (NDS) is the largest NDS project launched across the country to date, aimed at studying driver behavior and performance-related safety problems under real-world conditions. The data acquisition systems (DASs) installed on participating vehicles continuously collect vehicle kinematics, roadway, traffic, environment, and driver behavior data, which enables researchers to examine crash scenarios closely. This study integrated SHRP2 NDS and Roadway Information Database (RID) data to conduct a comprehensive analysis of fixed-object crashes. A total of 1,639 crash and near-crash events involving fixed objects and animals, and 1,050 baseline events, were used. Three analysis methods, logistic regression, support vector machine (SVM), and artificial neural network (ANN), were employed for two responses: crash occurrence and severity level. The logistic regression analyses identified 16 and 10 variables significant at the 0.1 level for the fixed-object event occurrence and severity level models respectively, and the influence of the explanatory variables is discussed in detail. SVM classifiers and ANN models were also constructed to predict fixed-object crash occurrence and severity level, and sensitivity analyses were performed on the SVM classifiers to infer the contributing effects of input variables. All three methods achieved satisfactory prediction accuracies of around 88% for crash occurrence and 75% for crash severity level, which suggests the effectiveness of NDS event data for depicting crash scenarios and for roadway safety analyses.
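The variable-screening step of the logistic regression analyses can be sketched as follows; the predictor names, data, and effect sizes are hypothetical, with only the 0.1 significance threshold taken from the study:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Hypothetical event table: 1 = fixed-object crash/near-crash, 0 = baseline.
    rng = np.random.default_rng(2)
    df = pd.DataFrame({
        "speeding": rng.integers(0, 2, 500),
        "night":    rng.integers(0, 2, 500),
        "curve":    rng.integers(0, 2, 500),
    })
    logit_p = -1.0 + 1.2 * df["speeding"] + 0.8 * df["night"]
    df["event"] = rng.random(500) < 1 / (1 + np.exp(-logit_p))

    # Fit a binary logistic regression for event occurrence.
    X = sm.add_constant(df[["speeding", "night", "curve"]].astype(float))
    model = sm.Logit(df["event"].astype(float), X).fit(disp=0)

    # Keep predictors significant at the 0.1 level, mirroring the screening step.
    pvals = model.pvalues.drop("const")
    print(pvals[pvals < 0.1])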
154

Computational Analysis of LC-MS/MS Data for Metabolite Identification

Zhou, Bin 13 January 2012 (has links)
Metabolomics aims at the detection and quantitation of metabolites within a biological system. As the most direct representation of phenotypic changes, metabolomics is an important component of systems biology research. Recent developments in high-resolution, high-accuracy mass spectrometers enable the simultaneous study of hundreds or even thousands of metabolites in one experiment. Liquid chromatography-mass spectrometry (LC-MS) is a commonly used instrument for metabolomic studies due to its high sensitivity and broad coverage of the metabolome. However, the identification of metabolites remains a bottleneck for current metabolomic studies. This thesis focuses on utilizing computational approaches to improve the accuracy and efficiency of metabolite identification in LC-MS/MS-based metabolomic studies. First, an outlier screening approach is developed to identify LC-MS runs with low analytical quality, so that they do not adversely affect the identification of metabolites. The approach is computationally simple but effective, and does not depend on any particular preprocessing method. Second, an integrated computational framework is proposed and implemented to improve the accuracy of metabolite identification and to prioritize the multiple putative identifications of one peak in LC-MS data. Under the framework, peaks are more likely to receive m/z values that lead to appropriate putative identifications, and important guidance for metabolite verification is provided by prioritizing the putative identifications. Third, an MS/MS spectral matching algorithm is proposed based on support vector machine classification. The approach provides improved retrieval performance in spectral matching, especially in the presence of data heterogeneity due to different instruments or experimental settings used during MS/MS spectra acquisition. / Master of Science
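A minimal sketch of spectral matching recast as SVM classification, in the spirit of the third contribution: each (query, library) spectrum pair is reduced to similarity features and an SVM decides match versus non-match. The binning scheme, features, and synthetic spectra are illustrative assumptions, not the thesis's algorithm:

    import numpy as np
    from sklearn.svm import SVC

    def bin_spectrum(peaks, n_bins=100, max_mz=500.0):
        """Turn (m/z, intensity) peaks into a fixed-length, L2-normalized vector."""
        vec = np.zeros(n_bins)
        for mz, inten in peaks:
            vec[min(int(mz / max_mz * n_bins), n_bins - 1)] += inten
        norm = np.linalg.norm(vec)
        return vec / norm if norm else vec

    def pair_features(query, library):
        """Similarity features for a spectrum pair: cosine score, shared-bin count."""
        q, l = bin_spectrum(query), bin_spectrum(library)
        return [float(q @ l), float(np.sum((q > 0) & (l > 0)))]

    # Hypothetical spectra; a "match" pairs a spectrum with a noisy replicate,
    # imitating re-acquisition on a different instrument or setting.
    rng = np.random.default_rng(3)
    def random_spectrum():
        return [(rng.uniform(50, 500), rng.random()) for _ in range(25)]
    def noisy(s):
        return [(mz + rng.normal(0, 0.2), i * rng.uniform(0.8, 1.2)) for mz, i in s]

    specs = [random_spectrum() for _ in range(100)]
    pairs = [(s, noisy(s)) for s in specs] + \
            [(specs[i], specs[i - 1]) for i in range(100)]
    y = np.array([1] * 100 + [0] * 100)

    X = np.array([pair_features(q, l) for q, l in pairs])
    clf = SVC(kernel="rbf").fit(X, y)
    print("training accuracy:", clf.score(X, y))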
155

A Comparison of SVM Classifiers with Embedded Feature Selection

Johansson, Adam, Mattsson, Anton January 2024 (has links)
Since their introduction in 1995, Support Vector Machines (SVM) have become a widely employed machine learning model for binary classification, owing to their explainable architecture, efficient forward inference, and good ability to generalize. A common desire, not only for SVMs but for machine learning classifiers in general, is to have the model perform feature selection, using only a limited subset of the available attributes in its predictions. Various alterations to the SVM problem formulation exist that address this, and in this report we compare a range of such SVM models. We compare how accuracy and feature selection differ between the models for different datasets, both real and synthetic, and we also investigate the impact of dataset size on these quantities. Our conclusion is that models trained to classify samples based on a smaller subset of features tend to perform at a level comparable to dense models, with a particular advantage when the dataset is small. Furthermore, as the training dataset grows in size, the number of selected features also increases, yielding a more complex classifier when a larger data supply is available.
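One family of such formulations replaces the usual l2 penalty with an l1 penalty, which drives the coefficients of uninformative features to zero. A minimal comparison on synthetic data (the specific model variants studied in the report are not reproduced here) could look like:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import LinearSVC

    # Synthetic data: only 5 of 50 features carry signal.
    X, y = make_classification(n_samples=300, n_features=50, n_informative=5,
                               n_redundant=0, random_state=0)
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

    # Dense l2-penalized SVM versus sparse l1-penalized SVM with
    # embedded feature selection.
    dense = LinearSVC(penalty="l2", C=1.0, max_iter=5000).fit(Xtr, ytr)
    sparse = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000).fit(Xtr, ytr)

    n_used = int(np.sum(np.abs(sparse.coef_) > 1e-6))
    print(f"dense acc:  {dense.score(Xte, yte):.2f} (50 features)")
    print(f"sparse acc: {sparse.score(Xte, yte):.2f} ({n_used} features)")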
156

Improving fMRI Classification Through Network Deconvolution

Martinek, Jacob 01 January 2015 (has links) (PDF)
The structure of regional correlation graphs built from fMRI-derived data is frequently used in algorithms to automatically classify brain data. Transformations are performed on the data during pre-processing to remove irrelevant or inaccurate information, ensuring that an accurate representation of the subject's resting-state connectivity is attained. Our research suggests and confirms that such pre-processed data still exhibits inherent transitivity, which is expected to obscure the true relationships between regions. This obfuscation prevents known solutions from developing an accurate understanding of a subject's functional connectivity. By removing correlative transitivity, connectivity between regions is made more specific, and automated classification is expected to improve. The task of utilizing fMRI to automatically diagnose Attention Deficit/Hyperactivity Disorder was posed by the ADHD-200 Consortium in a competition to draw in researchers and new ideas from outside the neuroimaging discipline. Researchers have since worked with the competition dataset to produce ever-increasing detection rates. Our approach was empirically tested with a known solution to this problem to compare processing of treated and untreated data, and the detection rates were shown to improve in all cases, with a weighted average increase of 5.88%.
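The transitivity-removal step can be illustrated with network deconvolution in the closed form of Feizi et al.: if the observed graph aggregates direct and indirect paths, G_obs = G_dir + G_dir² + ..., then G_dir is recovered by mapping each eigenvalue λ of G_obs to λ/(1+λ). The sketch below omits the eigenvalue scaling and thresholding details a production implementation would need:

    import numpy as np

    def network_deconvolution(g_obs):
        """Remove transitive edges: if G_obs = G_dir + G_dir^2 + ..., then
        G_dir = G_obs @ inv(I + G_obs), i.e. eigenvalues map to lam/(1+lam)."""
        g = (g_obs + g_obs.T) / 2          # enforce symmetry
        np.fill_diagonal(g, 0.0)
        lam, v = np.linalg.eigh(g)
        lam_dir = lam / (1.0 + lam)
        return v @ np.diag(lam_dir) @ v.T

    # Toy example: regions A-B and B-C directly connected; the observed A-C
    # correlation is inflated by the indirect path through B.
    g_obs = np.array([[0.0, 0.8, 0.5],
                      [0.8, 0.0, 0.7],
                      [0.5, 0.7, 0.0]])
    print(network_deconvolution(g_obs).round(2))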
157

Sélection de modèle par chemin de régularisation pour les machines à vecteurs support à coût quadratique / Model selection using regularization path for quadratic cost support vector machines

Bonidal, Rémi 19 June 2013 (has links)
La sélection de modèle est un thème majeur de l'apprentissage statistique. Dans ce manuscrit, nous introduisons des méthodes de sélection de modèle dédiées à des SVM bi-classes et multi-classes. Ces machines ont pour point commun d'être à coût quadratique, c'est-à-dire que le terme empirique de la fonction objectif de leur problème d'apprentissage est une forme quadratique. Pour les SVM, la sélection de modèle consiste à déterminer la valeur optimale du coefficient de régularisation et à choisir un noyau approprié (ou les valeurs de ses paramètres). Les méthodes que nous proposons combinent des techniques de parcours du chemin de régularisation avec de nouveaux critères de sélection. La thèse s'articule autour de trois contributions principales. La première est une méthode de sélection de modèle par parcours du chemin de régularisation dédiée à la l2-SVM. Nous introduisons à cette occasion de nouvelles approximations de l'erreur en généralisation. Notre deuxième contribution principale est une extension de la première au cas multi-classe, plus précisément à la M-SVM². Cette étude nous a conduits à introduire une nouvelle M-SVM, la M-SVM des moindres carrés. Nous présentons également de nouveaux critères de sélection de modèle pour la M-SVM de Lee, Lin et Wahba à marge dure (et donc la M-SVM²) : un majorant de l'erreur de validation croisée leave-one-out et des approximations de cette erreur. La troisième contribution principale porte sur l'optimisation des valeurs des paramètres du noyau. Notre méthode se fonde sur le principe de maximisation de l'alignement noyau/cible, dans sa version centrée. Elle l'étend à travers l'introduction d'un terme de régularisation. Les évaluations expérimentales de l'ensemble des méthodes développées s'appuient sur des benchmarks fréquemment utilisés dans la littérature, des jeux de données jouet et des jeux de données associés à des problèmes du monde réel / Model selection is of major interest in statistical learning. In this document, we introduce model selection methods for bi-class and multi-class support vector machines. We focus on quadratic loss machines, i.e., machines for which the empirical term of the objective function of the learning problem is a quadratic form. For SVMs, model selection consists in finding the optimal value of the regularization coefficient and choosing an appropriate kernel (or the values of its parameters). The proposed methods use path-following techniques in combination with new model selection criteria. This document is structured around three main contributions. The first one is a method performing model selection through the use of the regularization path for the l2-SVM. In this framework, we introduce new approximations of the generalization error. The second main contribution is the extension of the first one to the multi-category setting, more precisely the M-SVM². This study led us to derive a new M-SVM, the least squares M-SVM. Additionally, we present new model selection criteria for the hard margin M-SVM introduced by Lee, Lin and Wahba (and thus the M-SVM²): an upper bound on the leave-one-out cross-validation error and approximations of this error. The third main contribution deals with the optimization of the values of the kernel parameters. Our method makes use of the principle of kernel-target alignment with centered kernels. It extends it through the introduction of a regularization term. Experimental validation of these methods was performed on benchmark data frequently used in the literature, toy data, and real-world data.
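For orientation, the brute-force baseline that regularization-path methods improve upon is a grid sweep over the regularization coefficient with repeated refitting; the sketch below shows that baseline only (it is not the path-following algorithm developed in the thesis, which computes the solution piecewise in C instead of refitting from scratch):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, n_features=10, random_state=0)

    # Sweep the regularization coefficient C = 1/lambda on a log grid and score
    # each refit with cross-validation; path following avoids the refits.
    for C in np.logspace(-2, 2, 9):
        score = cross_val_score(SVC(kernel="rbf", C=C), X, y, cv=5).mean()
        print(f"C={C:8.2f}  cv accuracy={score:.3f}")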
158

Método para detecção de anomalias em tráfego de redes Real Time Ethernet aplicado em PROFINET e em SERCOS III / Method for detecting traffic anomalies of Real Time Ethernet networks applied to PROFINET and SERCOS III

Sestito, Guilherme Serpa 24 October 2018 (has links)
Esta tese propõe uma metodologia de detecção de anomalias por meio da otimização da extração, seleção e classificação de características relacionadas ao tráfego de redes Real Time Ethernet (RTE). Em resumo, dois classificadores são treinados usando características que são extraídas do tráfego por meio da técnica de janela deslizante e posteriormente selecionadas de acordo com sua correlação com o evento a ser classificado. O número de características relevantes pode variar de acordo com os indicadores de desempenho de cada classificador. Reduzindo a dimensionalidade do evento a ser classificado com o menor número de características possíveis que o represente, são garantidos a redução do esforço computacional, ganho de tempo, dentre outros benefícios. Posteriormente, os classificadores são comparados em função dos indicadores de desempenho: acurácia, taxa de falsos positivos, taxa de falsos negativos, tempo de processamento e erro relativo. A metodologia proposta foi utilizada para identificar quatro diferentes eventos (três anomalias e o estado normal de operação) em redes PROFINET reais e com configurações distintas entre si; também foi aplicada em três eventos (duas anomalias e o estado normal de operação) em redes SERCOS III. O desempenho de cada classificador é analisado em suas particularidades e comparados com pesquisas correlatas. Por fim, é explorada a possibilidade de aplicação da metodologia proposta para outros protocolos baseados em RTE. / This thesis proposes an anomaly detection methodology based on optimizing the extraction, selection, and classification of characteristics related to Real Time Ethernet (RTE) network traffic. In summary, two classifiers are trained using features which are extracted from network traffic through the sliding window technique and selected according to their correlation with the event being classified. The number of relevant characteristics can vary according to the performance indicators of each classifier. Reducing the dimensionality of the event to be classified to the smallest number of characteristics that represent it guarantees a reduction in computational effort and processing time, among other benefits. The classifiers are compared according to performance indicators: accuracy, false positive rate, false negative rate, processing time, and relative error. The proposed methodology was used to identify four different events (three anomalies and normal operation) in real PROFINET networks with different configurations. It was also applied to three events (two anomalies and normal operation) in SERCOS III networks. The results obtained are analyzed in their particularities and compared with related research. Finally, the possibility of applying the proposed methodology to other protocols based on RTE is explored.
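A minimal sketch of the sliding-window extraction and correlation-based feature selection follows; the window statistics, thresholds, and synthetic traffic are illustrative assumptions rather than the thesis's actual feature set:

    import numpy as np
    from sklearn.svm import SVC

    def window_features(pkt_sizes, pkt_times, window=50, step=25):
        """Per-window traffic statistics: mean/std packet size, mean inter-arrival."""
        feats = []
        for i in range(0, len(pkt_sizes) - window + 1, step):
            s = pkt_sizes[i:i + window]
            t = np.diff(pkt_times[i:i + window])
            feats.append([np.mean(s), np.std(s), np.mean(t)])
        return np.array(feats)

    # Hypothetical capture: steady cyclic traffic, then an anomalous segment.
    rng = np.random.default_rng(4)
    sizes = np.concatenate([rng.normal(100, 5, 2000), rng.normal(100, 40, 2000)])
    times = np.cumsum(np.concatenate([np.full(2000, 1e-3), np.full(2000, 2e-4)]))

    X = window_features(sizes, times)
    y = (np.arange(len(X)) >= len(X) // 2).astype(int)  # 0 = normal, 1 = anomaly

    # Keep only features well correlated with the event label before training.
    corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    X_sel = X[:, corr > 0.3]
    clf = SVC().fit(X_sel, y)
    print(f"kept {X_sel.shape[1]} of {X.shape[1]} features, "
          f"acc={clf.score(X_sel, y):.2f}")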
159

System för att upptäcka Phishing : Klassificering av mejl

Karlsson, Nicklas January 2008 (has links)
Denna rapport tar en titt på phishing-problemet, något som många har råkat ut för med bland annat de falska Nordea- eller eBay-mejl som på senaste tiden har dykt upp i våra inkorgar, och ett eventuellt sätt att minska phishingens effekt. Fokus i rapporten ligger på klassificering av mejl och den huvudsakliga frågeställningen är: "Är det, med hög träffsäkerhet, möjligt att med hjälp av ett klassificeringsverktyg sortera ut mejl som har med phishing att göra från övrig skräppost?" Det visade sig svårare än väntat att hitta phishing-mejl att använda i klassificeringen. I de klassificeringar som genomfördes visade det sig att både metoden Naive Bayes och Support Vector Machine kan hitta upp till 100 % av phishing-mejlen. Rapporten presenterar arbetsgången, teori om phishing och resultaten efter genomförda klassificeringstest. / This report takes a look at the phishing problem, something that many have come across with, for example, the fake Nordea or eBay e-mails that have lately shown up in our e-mail inboxes, and a possible way to reduce the effect of phishing. The focus of the report lies on classification of e-mails and the main question is: "Is it possible, with high accuracy, to sort phishing e-mails from other spam e-mails using a classification tool?" It was more difficult than expected to find phishing e-mails to use in the classification. The classifications that were made showed that it was possible to find up to 100 % of the phishing e-mails with both Naive Bayes and Support Vector Machine. The report presents the work done, facts about phishing, and the results of the classification tests made.
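A minimal sketch of this kind of comparison, using scikit-learn on a hypothetical four-message corpus (the report's actual tooling and data are not specified here):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.svm import LinearSVC

    # Hypothetical mini-corpus: 1 = phishing, 0 = other spam.
    emails = [
        "verify your account at nordea click here immediately",
        "your ebay password expires confirm your details now",
        "cheap watches best prices buy today",
        "win a free vacation claim your prize offer",
    ]
    labels = [1, 1, 0, 0]

    # Bag-of-words features shared by both classifiers.
    X = TfidfVectorizer().fit_transform(emails)

    for name, clf in [("Naive Bayes", MultinomialNB()), ("SVM", LinearSVC())]:
        clf.fit(X, labels)
        print(name, clf.predict(X))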
160

Forecasting Mid-Term Electricity Market Clearing Price Using Support Vector Machines

2014 May 1900 (has links)
In a deregulated electricity market, offering the appropriate amount of electricity at the right time with the right bidding price is of paramount importance. Forecasting the electricity market clearing price (MCP) is the prediction of future electricity prices based on given forecasts of electricity demand, temperature, sunshine, fuel cost, precipitation, and other related factors. Currently, many techniques are available for short-term electricity MCP forecasting, but very little has been done in the area of mid-term electricity MCP forecasting, which addresses a time frame of one to six months. Developing mid-term electricity MCP forecasting is essential for mid-term planning and decision making, such as generation plant expansion and maintenance scheduling, reallocation of resources, bilateral contracts, and hedging strategies. Six mid-term electricity MCP forecasting models are proposed and compared in this thesis: 1) a single support vector machine (SVM) forecasting model, 2) a single least squares support vector machine (LSSVM) forecasting model, 3) a hybrid SVM and auto-regressive moving average with exogenous input (ARMAX) forecasting model, 4) a hybrid LSSVM and ARMAX forecasting model, 5) a multiple SVM forecasting model, and 6) a multiple LSSVM forecasting model. PJM interconnection data are used to test the proposed models. A cross-validation technique was used to optimize the control parameters and the selection of training data for the six proposed mid-term electricity MCP forecasting models. Three evaluation measures, mean absolute error (MAE), mean absolute percentage error (MAPE), and mean square root error (MSRE), are used to analyze forecasting accuracy. According to the experimental results, the multiple SVM forecasting model performed best among the six proposed forecasting models. The proposed multiple-SVM-based mid-term electricity MCP forecasting model contains a data classification module and a price forecasting module. The data classification module first pre-processes the input data into corresponding price zones, and the forecasting module then forecasts the electricity price with four SVMs designed in parallel. Compared with the other five forecasting models proposed in this thesis, this model best improves forecasting accuracy for both peak prices and the overall system.
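A minimal sketch of the two-module design (zone classification followed by per-zone forecasting) on synthetic data; the features, zone boundaries, and model settings are illustrative assumptions, not the thesis's PJM setup:

    import numpy as np
    from sklearn.svm import SVC, SVR

    # Hypothetical monthly records: [demand, temperature, fuel_cost] -> MCP.
    rng = np.random.default_rng(5)
    X = rng.random((400, 3))
    price = 20 + 60 * X[:, 0] + 10 * X[:, 2] + rng.normal(0, 2, 400)

    # Module 1: classify each record into one of four price zones.
    zones = np.digitize(price, np.quantile(price, [0.25, 0.5, 0.75]))
    router = SVC(kernel="rbf").fit(X, zones)

    # Module 2: one SVR per zone, trained only on that zone's records.
    forecasters = {z: SVR().fit(X[zones == z], price[zones == z])
                   for z in range(4)}

    # Forecast: route a new record to a zone, then apply that zone's SVR.
    x_new = rng.random((1, 3))
    z = router.predict(x_new)[0]
    print(f"zone {z}, forecast MCP = {forecasters[z].predict(x_new)[0]:.1f}")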
