About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
591

Design, Analysis and Experimental Evaluation of a Virtual Synchronous Machine Based Control Scheme for STATCOM Applications

Li, Chi 23 September 2015 (has links)
Because renewable energy sources are environment-friendly and inexhaustible, more and more renewable energy power plants have been integrated into power grids worldwide. To compensate for their inherent variability, STATCOMs are typically installed at the point of common coupling (PCC) to support their operation by regulating the PCC voltage. However, under different contingencies, PCC voltage fluctuations in magnitude and frequency may impede the STATCOM from tracking the grid frequency correctly, worsening its overall compensation performance and putting the operation of the power plant at risk. The virtual synchronous machine (VSM) concept has recently been introduced to control grid-connected inverters by emulating the behavior of rotating synchronous machines, in an effort to eliminate the shortcomings of conventional d-q frame phase-locked loops (PLL). In this dissertation, the VSM concept is extended by developing a STATCOM controller that behaves like a fully adjustable synchronous condenser, including adjustment of its "virtual" inertia and impedance. An average model in two D-Q frames is proposed to analyze the inherent dynamics of the VSM-based STATCOM controller, with insight into the impact of the virtual parameters, and a design guideline is then formulated. The proposed controller is compared against existing d-q frame STATCOM control strategies, showing in both simulation and experiment how the VSM-based approach guarantees improved voltage regulation performance at the PCC by adjusting the phase of its compensating current during frequency fluctuations. Secondly, the large-signal dynamics of the VSM-based STATCOM controller are studied, especially its capability to ride through faults. Analysis is first carried out with phasors to obtain a fundamental understanding, followed by state-space equations to predict the transients analytically; the predictions are validated against both simulation and experiment. The effects of two outer loops are also reviewed, and possible solutions are suggested and evaluated. Moreover, the relationship between the virtual inertia and the actual inertia is established, and dc capacitor sizing is discussed with a view to a more economical design. The start-up process of a VSM-based STATCOM is also presented for the implementation of a practical prototype. / Master of Science
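To illustrate the virtual synchronous machine principle mentioned in this abstract, the following is a minimal sketch (not taken from the dissertation) of the swing equation a VSM controller emulates; the parameter values and the zero-active-power assumption are placeholders for illustration only.

```python
# Minimal illustrative sketch (not from the dissertation): emulating the swing
# equation that underlies a virtual synchronous machine (VSM) controller.
# All parameter values are arbitrary placeholders chosen for illustration.
import math

def simulate_vsm(J=0.2, D=5.0, P_set=0.0, f_grid=60.0, t_end=2.0, dt=1e-3):
    """Integrate the virtual swing equation
         J * d(omega)/dt = P_set - P_e - D * (omega - omega_grid)
    with a simple Euler scheme and return the virtual frequency trace."""
    omega_grid = 2 * math.pi * f_grid
    omega = 2 * math.pi * (f_grid - 0.5)   # start 0.5 Hz off the grid frequency
    theta = 0.0
    trace = []
    for _ in range(int(t_end / dt)):
        P_e = 0.0  # reactive-only device assumed: no active power exchange
        domega = (P_set - P_e - D * (omega - omega_grid)) / J
        omega += domega * dt
        theta = (theta + omega * dt) % (2 * math.pi)
        trace.append(omega / (2 * math.pi))
    return trace

if __name__ == "__main__":
    freqs = simulate_vsm()
    print(f"virtual frequency after 2 s: {freqs[-1]:.3f} Hz")  # settles toward 60 Hz
```

In this toy model, a larger virtual inertia J slows the frequency response, while a larger damping coefficient D pulls the virtual rotor back toward the grid frequency more quickly.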
592

VIP: Finding Important People in Images

Mathialagan, Clint Solomon 25 June 2015 (has links)
People preserve memories of events such as birthdays, weddings, or vacations by capturing photos, often depicting groups of people. Invariably, some individuals in the image are more important than others given the context of the event. This work analyzes the concept of the importance of individuals in group photographs. We address two specific questions: given an image, who are the most important individuals in it? Given multiple images of a person, which image depicts the person in the most important role? We introduce a measure of the importance of people in images and investigate the correlation between importance and visual saliency. We find that not only can we automatically predict the importance of people from purely visual cues, but incorporating this predicted importance also yields significant improvements in applications such as im2text (generating sentences that describe images of groups of people). / Master of Science
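As a purely illustrative sketch (not the method of the thesis), importance prediction from visual cues can be framed as supervised learning on per-person features; the feature names and toy data below are assumptions.

```python
# Illustrative sketch only: predicting person importance from simple visual
# cues with a logistic-regression baseline. Feature names and toy data are
# assumptions, not taken from the thesis.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [normalized face area, distance of face from image center, sharpness]
X = np.array([
    [0.08, 0.10, 0.9],   # large, central, in-focus face
    [0.01, 0.45, 0.4],   # small, peripheral, blurry face
    [0.06, 0.15, 0.8],
    [0.02, 0.40, 0.5],
])
y = np.array([1, 0, 1, 0])   # 1 = important person, 0 = not important

clf = LogisticRegression().fit(X, y)
new_person = np.array([[0.07, 0.12, 0.85]])
print("importance score:", clf.predict_proba(new_person)[0, 1])
```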
593

Detecting Bots using Stream-based System with Data Synthesis

Hu, Tianrui 28 May 2020 (has links)
Machine learning has shown great success in building security applications, including bot detection. However, many machine learning models are difficult to deploy since model training requires a continuous supply of representative labeled data, which is expensive and time-consuming to obtain in practice. In this thesis, we build a bot detection system with a data synthesis method to address this problem by exploring bot detection with limited data. We collected network traffic from three online services over three different months within a year (23 million network requests). We develop a novel stream-based feature encoding scheme that allows our model to perform real-time bot detection on anonymized network data. We propose a data synthesis method that synthesizes unseen (or future) bot behavior distributions to enable our system to detect bots with extremely limited labeled data. The synthesis method is distribution-aware, using two different generators in a Generative Adversarial Network to synthesize data for the clustered regions and the outlier regions in the feature space. We evaluate this idea and show that our method can train a model that outperforms existing methods with only 1% of the labeled data. We show that data synthesis also improves the model's sustainability over time and speeds up retraining. Finally, we compare data synthesis with adversarial retraining and show that the two can work in a complementary fashion to improve model generalizability. / Master of Science / An internet bot is computer-controlled software that performs simple, automated tasks over the internet. Although some bots are legitimate, many are operated to perform malicious behaviors, causing severe security and privacy issues. To address this problem, machine learning (ML) models, which have shown great success in building security applications, are widely used in detecting bots, since they can identify hidden patterns by learning from data. However, many ML-based approaches are difficult to deploy since model training requires labeled data, which is expensive and time-consuming to obtain in practice, especially for security tasks. Meanwhile, the dynamically changing nature of malicious bots means bot detection models need a continuous supply of representative labeled data to stay up to date, which makes bot detection more challenging. In this thesis, we build an ML-based bot detection system to detect advanced malicious bots in real time by processing network traffic data. We explore using a data synthesis method to detect bots with limited training data, addressing the problem of limited and unrepresentative labeled data. Our proposed data synthesis method synthesizes unseen (or future) bot behavior distributions to enable our system to detect bots with extremely limited labeled data. We evaluate our approach using real-world datasets we collected and show that our model outperforms existing methods using only 1% of the labeled data. We show that data synthesis also improves the model's sustainability over time and makes it easier to keep the model up to date. Finally, we show that our method can work in combination with adversarial retraining to improve model generalizability.
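The following is an illustrative sketch, not the thesis's actual encoder, of what a stream-based feature encoding might look like: running frequency statistics over anonymized categorical fields are updated per request and turned into numeric features. The field names and example requests are invented.

```python
# Illustrative sketch only: a stream-based encoder that turns anonymized
# categorical request fields into numeric features using running frequency
# statistics, so a model can score requests online as they arrive.
from collections import defaultdict

class StreamEncoder:
    def __init__(self, fields):
        self.fields = fields
        self.counts = {f: defaultdict(int) for f in fields}
        self.total = 0

    def encode(self, request):
        """Update running counts, then return relative-frequency features."""
        self.total += 1
        features = []
        for f in self.fields:
            value = request.get(f, "<missing>")
            self.counts[f][value] += 1
            features.append(self.counts[f][value] / self.total)
        return features

encoder = StreamEncoder(["user_agent_hash", "url_path_hash", "referer_hash"])
stream = [
    {"user_agent_hash": "a1", "url_path_hash": "p7", "referer_hash": "r2"},
    {"user_agent_hash": "a1", "url_path_hash": "p7", "referer_hash": "r9"},
    {"user_agent_hash": "b4", "url_path_hash": "p3", "referer_hash": "r2"},
]
for req in stream:
    print(encoder.encode(req))
```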
594

Methodology Development for Improving the Performance of Critical Classification Applications

Afrose, Sharmin 17 January 2023 (has links)
People interact with different critical applications in day-to-day life. Some examples of critical applications include computer programs, autonomous vehicles, digital healthcare, smart homes, etc. There are inherent risks in these critical applications if they fail to perform properly. In this dissertation, we mainly focus on developing methodologies for performance improvement in software security and healthcare prognosis. Cryptographic vulnerability tools are used to detect misuses of Java cryptographic APIs and thus classify secure and insecure parts of code. These detection tools are critical applications, as misuse of cryptographic libraries and APIs has devastating security and privacy implications. We develop two benchmarks that help developers to identify secure and insecure code usage as well as improve their tools. We also perform a comparative analysis of four static analysis tools. The developed benchmarks enable the first scientific comparison of the accuracy and scalability of cryptographic API misuse detection. Many published detection tools (CryptoGuard, CrySL, Oracle Parfait) have used our benchmarks to improve their detection of insecure cases. We also examine the need for performance improvement in healthcare applications. Numerous prediction applications have been developed to predict patients' health conditions. These are critical applications where misdiagnosis can cause serious harm to patients, even death. Given the imbalanced nature of many clinical datasets, our work provides empirical evidence of various prediction deficiencies in a typical machine learning model. We observe that missed death cases are 3.14 times more frequent than missed survival cases in mortality prediction. Also, existing sampling methods and other techniques are not well-equipped to achieve good performance. We design a double prioritized (DP) technique to mitigate representational bias or disparities across race and age groups. We show that DP consistently boosts minority-class recall for underrepresented groups, by up to 38.0%. Our DP method also outperforms existing methods, reducing the relative disparity in minority-class recall by up to 88%. Incorrect classification in these critical applications can have significant ramifications. Therefore, it is imperative to improve the performance of critical applications to alleviate risk and harm to people. / Doctor of Philosophy / We interact with many software applications on our devices in everyday life. Examples include calling transport using Lyft or Uber, shopping online using eBay, using social media via Twitter, and checking payment status on credit card or bank accounts. Much of this software uses cryptography to secure our personal and financial information. However, inappropriate or improper use of cryptography can let a malicious party gain sensitive information. To capture inappropriate usage of cryptographic functions, several detection tools have been developed. However, suitable benchmarks are needed to compare the coverage and the depth of detection of these tools. To bridge this gap, we build two cryptographic benchmarks that are currently used by many tool developers to improve their tools and compare them with existing ones. In another area, people see physicians and, if needed, are admitted to hospitals.
Physicians also use different software that assists them in caring for patients. Much of this software is built using machine learning algorithms to predict patients' conditions. Historical medical information, in the form of clinical datasets, is taken as input to the prediction models. Clinical datasets contain information about patients of different races and ages. The number of samples in some groups of patients may be larger than in other groups. For example, many clinical datasets contain more white patients (i.e., the majority group) than Black patients (i.e., the minority group). Prediction models built on these imbalanced clinical data may provide inaccurate predictions for minority patients. Our work aims to improve prediction accuracy for minority patients in important medical applications, such as estimating the likelihood of a patient dying during an emergency room visit or surviving cancer. We design a new technique that builds customized prediction models for different demographic groups. Our results reveal that subpopulation-specific models show better performance for minority groups. Our work contributes to improving the medical care of minority patients in the age of digital health. Overall, our aim is to improve the performance of critical applications to help people by decreasing risk. Our developed methods can also be applied to other critical application domains.
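As a rough illustration of the idea behind prioritized replication (a simplified stand-in, not the dissertation's full double prioritized method), the sketch below duplicates minority-class samples from one underrepresented demographic group before training; the column names and replication factor are assumptions.

```python
# Simplified sketch of prioritized replication: duplicate minority-class
# samples belonging to an underrepresented demographic group before training,
# so that the group's recall improves. Column names and the replication
# factor k are assumptions for illustration only.
import numpy as np

def prioritized_replicate(X, y, group, target_group, minority_label, k=3):
    """Return a training set in which minority-label rows from `target_group`
    appear k additional times."""
    mask = (group == target_group) & (y == minority_label)
    X_rep = np.concatenate([X] + [X[mask]] * k)
    y_rep = np.concatenate([y] + [y[mask]] * k)
    return X_rep, y_rep

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (rng.random(100) < 0.1).astype(int)          # rare positive class (e.g., in-hospital death)
group = rng.choice(["group_A", "group_B"], 100)  # demographic attribute (e.g., race or age group)
X_bal, y_bal = prioritized_replicate(X, y, group, "group_B", minority_label=1, k=3)
print(X.shape, "->", X_bal.shape)
```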
595

Using Artificial Life to Design Machine Learning Algorithms for Decoding Gene Expression Patterns from Images

Zaghlool, Shaza Basyouni 26 May 2008 (has links)
Understanding the relationship between gene expression and phenotype is important in many areas of biology and medicine. Current methods for measuring gene expression, such as microarrays, however, are invasive, require biopsy, and are expensive. These factors limit experiments to low-rate temporal sampling of gene expression and prevent longitudinal studies within a single subject, reducing their statistical power. Thus, methods for non-invasive measurement of gene expression are an important and current topic of research. An interesting approach to indirect measurement of gene expression has recently been reported (Segal et al., Nature Biotechnology 25 (6), 2007) that uses existing imaging techniques and machine learning to estimate a function mapping image features to gene expression patterns, providing an image-derived surrogate for gene expression. However, the design of machine learning methods for this purpose is hampered by the cost of training and validation. My thesis shows that populations of artificial organisms simulating genetic variation can be used for designing machine learning approaches to decoding gene expression patterns from images. If analysis of these images proves successful, the approach can be applied to real biomedical images, reducing the limitations of invasive imaging. The results showed that the box-counting dimension was a suitable feature extraction method, yielding a classification rate of at least 90% for mutation rates up to 40%. The box-counting dimension was also robust in dealing with distorted images. The performance of the classifiers using the fractal dimension as a feature appeared more sensitive to the mutation rate than to the applied distortion level. / Master of Science
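The box-counting dimension used as the feature extractor can be estimated as sketched below (an illustrative implementation, not the thesis code): occupied boxes are counted at several grid sizes and the dimension is the negative slope of the log-log fit.

```python
# Illustrative sketch: estimating the box-counting (fractal) dimension of a
# binary image by counting occupied boxes at several scales and fitting the
# slope of log(count) versus log(box size).
import numpy as np

def box_counting_dimension(image, box_sizes=(2, 4, 8, 16)):
    counts = []
    for s in box_sizes:
        h, w = image.shape
        occupied = 0
        for i in range(0, h, s):
            for j in range(0, w, s):
                if image[i:i + s, j:j + s].any():
                    occupied += 1
        counts.append(occupied)
    # dimension = -slope of log(count) vs log(box size)
    slope, _ = np.polyfit(np.log(box_sizes), np.log(counts), 1)
    return -slope

rng = np.random.default_rng(1)
binary_image = (rng.random((64, 64)) > 0.7).astype(np.uint8)  # synthetic stand-in image
print(f"estimated box-counting dimension: {box_counting_dimension(binary_image):.2f}")
```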
596

A Machine Learning Approach for the Objective Sonographic Assessment of Patellar Tendinopathy in Collegiate Basketball Athletes

Cheung, Carrie Alyse 07 June 2021 (has links)
Patellar tendinopathy (PT) is a knee injury resulting in pain localized to the patellar tendon. One main factor that causes PT is repetitive overloading of the tendon. Because of this mechanism, PT is commonly seen in "jumping sports" like basketball. This injury can severely impact a player's performance, and early diagnosis and treatment are important for a timely return to preinjury activity levels. The standard for the diagnosis of PT is a clinical examination, including a patient history and a physical assessment. Because PT has symptoms similar to injuries of other knee structures like the bursae, fat pad, and patellofemoral joint, imaging is regularly performed to aid in determining the correct diagnosis. One common imaging modality for the patellar tendon is gray-scale ultrasonography (GS-US). However, the accurate detection of PT in GS-US images is grader-dependent and requires a high level of expertise. Machine learning (ML) models, which can accurately and objectively perform image classification tasks, could be used as a reliable automated tool to aid clinicians in assessing PT in GS-US images. ML models, like support vector machines (SVMs) and convolutional neural networks (CNNs), use features learned from labelled images to predict the class of an unlabelled image. SVMs work by creating an optimal hyperplane between classes of labelled data points and then classifying an unlabelled data point depending on which side of the hyperplane it falls on. CNNs work by learning the set of features and recognizing which pattern of features describes each class. The objective of this study was to develop an SVM model and a CNN model to classify GS-US images of the patellar tendon as either normal or diseased (PT present), with an accuracy around 83%, the accuracy that experienced clinicians achieved when diagnosing PT in GS-US images that had already been clinically diagnosed as either diseased or normal. We also compare different test designs for each model to determine which achieved the highest accuracy. GS-US images of the patellar tendon were obtained from male and female Virginia Tech collegiate basketball athletes. Each image was labelled by an experienced clinician as either diseased or normal. These images were split into training and testing sets. The SVM and CNN models were created using Python. For the SVM model, features were extracted from the training set using Speeded-Up Robust Features (SURF). These features were then used to train the SVM model by calculating the optimal weights for the hyperplane. For the CNN model, the features, as well as the optimal weights for classification, were learned by layers within the CNN. Both models were then used to predict the class of the images within the testing set, and the accuracy, sensitivity and precision of the models were calculated. For each model we looked at different test designs. The balanced designs had the same number of diseased and normal images. The designs with Long images had only images taken in the longitudinal orientation, unlike Long+Trans, which had both longitudinal and transverse images. The designs with Full images contained the patellar tendon and surrounding tissue, whereas the ROI images removed the surrounding tissue. The best designs for the SVM model were the Unbalanced Long designs for both the Full and ROI images. Both designs had an accuracy of 77.5%. The best design for the CNN model was the Balanced Long+Trans Full design, with an accuracy of 80.3%.
Both of the models had more difficulty classifying normal images than diseased images. This may be because the diseased images had a well-defined feature pattern, while the normal images did not. Overall, the CNN features and classifier achieved a higher accuracy than the SURF features and SVM classifier. The CNN model is only slightly below 83%, the accuracy of an experienced clinician. These are promising results, and as the dataset size increases and the models are fine-tuned, the accuracy of the models should continue to increase. / Master of Science / Patellar tendinopathy (PT) is a common knee injury. This injury is frequently seen in sports like basketball, where athletes are regularly jumping and landing and thereby applying large forces to the patellar tendon. This injury can severely impact a player's performance, and early diagnosis and treatment are important for a timely return to preinjury activity levels. Currently, diagnosis of PT involves a patient history and a physical assessment, and is commonly supplemented by ultrasound imaging. However, clinicians need a high level of expertise in order to accurately assess these images for PT. To aid in this assessment, a tool such as a machine learning (ML) model could be used. ML is becoming more and more prevalent in our everyday lives. These models are everywhere, from the facial recognition tool on your phone to the list of recommended items on your Amazon account. ML models can use features learned from labelled images to predict the class of an unlabelled image. The objective of this study was to develop ML models to classify ultrasound images of the patellar tendon as either normal or diseased (PT present).
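As a minimal, hedged sketch of the SVM side of such a pipeline (not the thesis implementation, with synthetic feature vectors standing in for SURF descriptors of ultrasound images):

```python
# Illustrative sketch only: training an SVM on pre-extracted image feature
# vectors and reporting test accuracy. Synthetic features stand in for real
# SURF descriptors and ultrasound images.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
# 200 "images", each summarized by a 64-dimensional feature vector
X = rng.normal(size=(200, 64))
y = rng.integers(0, 2, size=200)          # 0 = normal tendon, 1 = diseased (PT)
X[y == 1] += 0.5                          # give the diseased class a shifted pattern

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf").fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```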
597

Machine Learning Classification of Gas Chromatography Data

Clark, Evan Peter 28 August 2023 (has links)
Gas Chromatography (GC) is a technique for separating volatile compounds by exploiting differences in how the chemical components of a mixture adhere to the column. As conditions within the GC are changed, components of the mixture elute at different times. Sensors measure the elution and produce the data that become chromatograms. By analyzing the chromatogram, the presence and quantity of the mixture's constituent components can be determined. Machine Learning (ML) is a field consisting of techniques by which machines can independently analyze data to derive their own procedures for processing it. Additionally, there are techniques for enhancing the performance of ML algorithms. Feature Selection is a technique for improving performance by using a specific subset of the data. Feature Engineering is a technique that transforms the data to make processing more effective. Data Fusion is a technique that combines multiple sources of data to produce more useful data. This thesis applies machine learning algorithms to chromatograms. Five common machine learning algorithms are analyzed and compared: K-Nearest Neighbour (KNN), Support Vector Machines (SVM), Convolutional Neural Networks (CNN), Decision Trees, and Random Forests (RF). Feature Selection is tested by applying window sweeps with the KNN algorithm. Feature Engineering is applied via the Principal Component Analysis (PCA) algorithm. Data Fusion is also tested. It was found that KNN and RF performed best overall. Feature Selection was very beneficial. PCA was helpful for some algorithms, but less so for others. Data Fusion was moderately beneficial. / Master of Science / Gas Chromatography is a method for separating a mixture into its constituent components. A chromatogram is a time series showing the detection of gas in the gas chromatography machine over time. With a properly set up gas chromatograph, different mixtures will produce different chromatograms. These differences allow researchers to determine the components of a mixture or differentiate compounds from each other. Machine Learning (ML) is a field encompassing a set of methods by which machines can independently analyze data to derive the exact algorithms for processing it. There are many different machine learning algorithms which can accomplish this. There are also techniques which can process the data to make it more effective for use with machine learning. Feature Engineering is one such technique, which transforms the data. Feature Selection is another technique, which reduces the data to a subset. Data Fusion is a technique which combines different sources of data. Each of these processing techniques has many different implementations. This thesis applies machine learning to gas chromatography. ML systems are developed to classify mixtures based on their chromatograms. Five common machine learning algorithms are developed and compared. Some common Feature Engineering, Feature Selection, and Data Fusion techniques are also evaluated. Two of the algorithms were found to be more effective overall than the others. Feature Selection was found to be very beneficial. Feature Engineering was beneficial for some algorithms but less so for others. Data Fusion was moderately beneficial.
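A minimal sketch of how PCA-based feature engineering and a KNN classifier can be combined for chromatogram classification is given below; the synthetic single-peak "chromatograms" are an assumption standing in for real GC data and this is not the thesis code.

```python
# Illustrative sketch only: a PCA + K-Nearest Neighbour pipeline for
# classifying chromatogram-like time series. Synthetic single-peak signals
# stand in for real gas chromatography data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 500)

def chromatogram(peak_time):
    """A single Gaussian elution peak plus baseline noise."""
    return np.exp(-((t - peak_time) ** 2) / 0.1) + rng.normal(0, 0.02, t.size)

# Two classes of mixtures, distinguished by where their main component elutes
X = np.vstack([chromatogram(3.0) for _ in range(30)] +
              [chromatogram(6.0) for _ in range(30)])
y = np.array([0] * 30 + [1] * 30)

model = make_pipeline(PCA(n_components=5), KNeighborsClassifier(n_neighbors=3))
print("cross-validated accuracy:", cross_val_score(model, X, y, cv=5).mean())
```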
598

The effects of spatial ability on performance with ecological interfaces: mental models and knowledge-based behaviors

Bowen, Shane A.M. 01 January 2004 (has links)
No description available.
599

A Comparative Analysis of Web-based Machine Translation Quality: English to French and French to English

Barnhart, Zachary 12 1900 (has links)
This study offers a partial replication of a 2006 study by Williams, which focused primarily on the analysis of the quality of translations produced by online software, namely Yahoo!® Babelfish, Freetranslation.com, and Google Translate. Since the data for the study by Williams were collected in 2004 and the data for the present study in 2012, there is a lapse of eight years, allowing a diachronic analysis of the differences in quality of the translations provided by these online services. At the time of the 2006 study by Williams, all three services used a rule-based translation system; in October 2007, however, Google Translate switched to a system that is entirely statistical in nature. Thus, the present study is also able to examine the differences in quality between contemporary statistical and rule-based approaches to machine translation.
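The abstract does not state which quality metric was used; as one hedged illustration, an automatic score such as BLEU is a common way to compare machine translation output against a human reference (the example sentences below are invented and this is not the study's evaluation method).

```python
# Illustrative sketch only: scoring two hypothetical machine translations
# against a reference sentence with BLEU. Example sentences are invented.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["the", "cat", "is", "on", "the", "mat"]
rule_based_output = ["the", "cat", "is", "over", "the", "mat"]
statistical_output = ["the", "cat", "sits", "on", "the", "mat"]

smooth = SmoothingFunction().method1  # avoids zero scores on short sentences
for name, hyp in [("rule-based", rule_based_output), ("statistical", statistical_output)]:
    score = sentence_bleu([reference], hyp, smoothing_function=smooth)
    print(f"{name} BLEU: {score:.3f}")
```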
600

Optimal choice of machine tool for a machining job in a CAE environment

Kumar, Eshwar January 2010 (has links)
Developments in cutting tools, coolants, drives, controls, tool changers, pallet changers and the philosophy of machine tool design have made ground-breaking changes in machine tools and machining processes. Modern machining centres have been developed to perform several operations on several faces of a workpiece in a single setup. On the other hand, industry requires high-value-added components with many quality-critical features to be manufactured in an outsourcing environment, as opposed to traditional in-house manufacture. The success of this manufacture depends critically on matching the advanced features of the machine tools to the complexity of the component. This project has developed a methodology to represent the features of a machine tool in the form of an alphanumeric string and the features of the component in another string. The strings are then matched to choose the most suitable and economical machine tool for the component's manufacture. The literature identified the block structure as the way to answer the question of how to systematically describe the layout of such a machining centre. Incomplete attempts to describe a block structure as alphanumeric strings were also found in the literature. Sales literature from several machine tool suppliers was surveyed to systematically identify the features needed by the user for the choice of a machine tool. Combining these, a new alphanumeric string was developed to represent machine tools. A method was developed for using these strings as one of the keys for sorting a database of machine tools, and a supporting database of machine tools was also developed. A survey of machining, on the other hand, identified that machining features can be used as a basis for planning the machining of a component. It analysed the various features and feature sets proposed in the literature and their recognition in CAD models. Though a vast number of features were described, only two sets were complete. The project was started with one of them (the other carried too many unwanted details for the task of this project): the machining features supported by the 'Expert Machinist' software. When this became unavailable, a feature set along those lines was defined and used to generate an alphanumeric string representing the work. Comparing the two strings led to the choice of suitable machines from the database. The methodology is implemented as bolt-on software within Pro/Engineer, where one can model any given component using cut features (mimicking machining operations) and produce a list of machine tools having the features required for machining that component. This will enable outsourcing companies to identify those Precision Engineers who have machine tools with the matching capabilities. Supporting software and databases were developed using Access Database, Visual Basic and C with Pro/TOOLKIT functions. The resulting software suite was tested on several case studies and found to be effective.
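A very simplified sketch of the matching idea, not the thesis's actual string scheme, is shown below: a component's required-feature codes are compared against each machine tool's capability codes, and machines covering all requirements are shortlisted. All codes and records are invented for illustration.

```python
# Illustrative sketch only: shortlisting machine tools whose capability codes
# cover a component's required-feature codes. Codes and records are invented.
machines = {
    "HMC-4AX-01": {"AX4", "TC40", "PAL2", "SPN12K"},   # 4-axis, 40-tool changer, 2 pallets
    "VMC-3AX-02": {"AX3", "TC24", "SPN8K"},
    "HMC-5AX-03": {"AX5", "TC60", "PAL6", "SPN15K"},
}

def suitable_machines(component_features, machine_db):
    """Return machines whose capability set covers every required feature."""
    return [name for name, caps in machine_db.items()
            if component_features <= caps]

# Component needing 4-axis machining, at least a 40-tool changer and a pallet changer
component = {"AX4", "TC40", "PAL2"}
print(suitable_machines(component, machines))
```

A fuller implementation would rank the covering machines, for example by cost or by how little capability they have in excess of the requirement, rather than returning a flat list.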
