About: The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations (NDLTD). Our metadata is collected from universities around the world. If you manage a university, consortium, or country archive and want to be added, details can be found on the NDLTD website.
21

Data-Dependent Analysis of Learning Algorithms

Philips, Petra Camilla, petra.philips@gmail.com January 2005 (has links)
This thesis studies the generalization ability of machine learning algorithms in a statistical setting. It focuses on the data-dependent analysis of the generalization performance of learning algorithms in order to make full use of the potential of the actual training sample from which these algorithms learn.

First, we propose an extension of the standard framework for the derivation of generalization bounds for algorithms taking their hypotheses from random classes of functions. This approach is motivated by the fact that the function produced by a learning algorithm based on a random sample of data depends on this sample and is therefore a random function. Such an approach avoids the detour through the worst-case uniform bounds taken in the standard approach. We show that the mechanism which allows one to obtain generalization bounds for random classes in our framework is based on a “small complexity” of certain random coordinate projections. We demonstrate how this notion of complexity relates to learnability and how one can exploit geometric properties of these projections in order to derive estimates of rates of convergence and good confidence interval estimates for the expected risk. We then demonstrate the generality of our new approach by presenting a range of examples, among them the algorithm-dependent compression schemes and the data-dependent luckiness frameworks, which fall into our random subclass framework.

Second, we study in more detail generalization bounds for a specific algorithm which is of central importance in learning theory, namely the Empirical Risk Minimization algorithm (ERM). Recent results show that one can significantly improve the high-probability estimates of the convergence rates for empirical minimizers by a direct analysis of the ERM algorithm. These results are based on a new localized notion of complexity of subsets of hypothesis functions with identical expected errors and are therefore dependent on the underlying unknown distribution. We investigate the extent to which one can estimate these high-probability convergence rates in a data-dependent manner. We provide an algorithm which computes a data-dependent upper bound for the expected error of empirical minimizers in terms of the “complexity” of data-dependent local subsets. These subsets are sets of functions with empirical errors within a given range and can be determined based solely on empirical data. We then show that recent direct estimates, which are essentially sharp estimates on the high-probability convergence rate for the ERM algorithm, cannot be recovered universally from empirical data.
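For orientation only, and not as the thesis's exact statements: a worst-case uniform bound and a localized, data-dependent variant typically take forms like the following, in standard textbook notation (hypothesis class \(\mathcal{F}\), empirical risk \(\hat{R}_n\), Rademacher complexity \(\mathfrak{R}_n\)); the sample-dependent subclass \(\hat{\mathcal{F}}(D_n)\) is a generic placeholder, not the author's notation.

```latex
% Worst-case uniform bound: with probability at least 1-\delta, for all f \in \mathcal{F},
R(f) \;\le\; \hat{R}_n(f) \;+\; 2\,\mathfrak{R}_n(\mathcal{F}) \;+\; \sqrt{\frac{\ln(1/\delta)}{2n}}.

% Localized, data-dependent flavour: complexity is measured only on the random
% subclass of functions the algorithm can actually return on the sample D_n,
R(\hat{f}_n) \;\le\; \hat{R}_n(\hat{f}_n) \;+\; 2\,\hat{\mathfrak{R}}_n\bigl(\hat{\mathcal{F}}(D_n)\bigr) \;+\; 3\sqrt{\frac{\ln(2/\delta)}{2n}},
% where \hat{\mathcal{F}}(D_n) is a sample-dependent subset (e.g. functions with
% small empirical error) and \hat{\mathfrak{R}}_n is its empirical Rademacher complexity.
```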
22

Software defect prediction using maximal information coefficient and fast correlation-based filter feature selection

Mpofu, Bongeka 12 1900 (has links)
Software quality assurance aims to ensure that the applications developed are failure free. Some modern systems are intricate due to the complexity of their information processes. Software fault prediction is an important quality assurance activity, since correctly predicting the defect proneness of modules and classifying them accordingly saves resources, time and developers’ effort. In this study, a model that selects relevant features for use in defect prediction was proposed. The literature review revealed that process metrics, which are based on historic source code over time, are better predictors of defects in versioning systems. These metrics are extracted from the source-code module and include, for example, the number of additions and deletions in the source code, the number of distinct committers and the number of modified lines. In this research, defect prediction was conducted using open source software (OSS) of software product lines (SPL), hence process metrics were chosen. Datasets used in defect prediction may contain non-significant and redundant attributes that can affect the accuracy of machine-learning algorithms. In order to improve the prediction accuracy of classification models, only features that are significant for the defect prediction process are utilised. In machine learning, feature selection techniques are applied to identify the relevant data. Feature selection is a pre-processing step that helps to reduce the dimensionality of data, and feature selection techniques include information-theoretic methods based on the entropy concept. This study experimented with the efficiency of such feature selection techniques and found that software defect prediction using significant attributes improves prediction accuracy. A novel MICFastCR model was developed, which uses the Maximal Information Coefficient (MIC) to select significant attributes and the Fast Correlation-Based Filter (FCBF) to eliminate redundant attributes. Machine learning algorithms were then run to predict software defects. The MICFastCR model achieved the highest prediction accuracy as reported by various performance measures. / School of Computing / Ph. D. (Computer Science)
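As an illustration only, not the thesis's MICFastCR implementation, the sketch below ranks features by a relevance score and then drops redundant ones before training a classifier. It substitutes scikit-learn's mutual information for MIC and a simple pairwise-correlation filter for FCBF; the file name, label column and threshold are hypothetical.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score

# Hypothetical process-metrics dataset: one row per module, binary "defective" label.
df = pd.read_csv("process_metrics.csv")
X, y = df.drop(columns=["defective"]), df["defective"]

# Step 1: relevance ranking (mutual information as a stand-in for MIC).
relevance = pd.Series(mutual_info_classif(X, y, random_state=0), index=X.columns)
ranked = relevance.sort_values(ascending=False)

# Step 2: redundancy removal (pairwise-correlation filter as a stand-in for FCBF).
selected = []
for feat in ranked.index:
    if relevance[feat] <= 0:                  # drop irrelevant features
        continue
    if all(abs(X[feat].corr(X[s])) < 0.9 for s in selected):
        selected.append(feat)                 # keep only non-redundant features

# Step 3: train and evaluate a classifier on the selected subset.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("selected features:", selected)
print("CV accuracy:", cross_val_score(clf, X[selected], y, cv=5).mean())
```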
23

Investigation and application of artificial intelligence algorithms for complexity metrics based classification of semantic web ontologies

Koech, Gideon Kiprotich 11 1900 (has links)
M. Tech. (Department of Information Technology, Faculty of Applied and Computer Sciences), Vaal University of Technology. / The increasing demand for knowledge representation and exchange on the semantic web has resulted in an increase in both the number and size of ontologies. These additional features have made ontologies more complex and, in turn, more difficult to select, reuse and maintain. Several ontology evaluation and ranking tools have been proposed recently. Such evaluation tools provide a metrics suite that evaluates the content of an ontology by analysing its schema and instances. The availability of ontology metric suites enables classification techniques to place ontologies into various categories or classes. Machine learning algorithms, which are mostly based on statistical methods for classifying data, are therefore well suited to performing classification of ontologies. In this study, popular machine learning algorithms including K-Nearest Neighbors, Support Vector Machines, Decision Trees, Random Forest, Naïve Bayes, Linear Regression and Logistic Regression were used in the classification of ontologies based on their complexity metrics. A total of 200 biomedical ontologies were downloaded from the BioPortal repository. Ontology metrics were then generated using the OntoMetrics tool, an online ontology evaluation platform, and these metrics constituted the dataset used in the implementation of the machine learning algorithms. The results obtained were evaluated with the performance measures precision, recall, F-measure and Receiver Operating Characteristic (ROC) curves. The overall accuracy scores for the K-Nearest Neighbors, Support Vector Machines, Decision Trees, Random Forest, Naïve Bayes, Logistic Regression and Linear Regression algorithms were 66.67%, 65%, 98%, 99.29%, 74%, 64.67%, and 57%, respectively. From these scores, the Decision Tree and Random Forest algorithms performed best, which can be attributed to their ability to handle multiclass classification.
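As an illustration of the kind of experiment described, not the study's actual pipeline, the sketch below trains and evaluates several scikit-learn classifiers on a metrics table such as one exported from OntoMetrics; the file name, label column and parameter values are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# Hypothetical OntoMetrics export: complexity metrics plus a class label per ontology.
df = pd.read_csv("ontology_metrics.csv")
X, y = df.drop(columns=["complexity_class"]), df["complexity_class"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=42)

models = {
    "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name)
    print(classification_report(y_te, model.predict(X_te)))  # precision, recall, F1 per class
```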
24

Maskininlärning som verktyg för att extrahera information om attribut kring bostadsannonser i syfte att maximera försäljningspris / Using machine learning to extract information from real estate listings in order to maximize selling price

Ekeberg, Lukas, Fahnehjelm, Alexander January 2018 (has links)
The Swedish real estate market has been digitalized over the past decade, and current practice is to post real estate advertisements online. A question that has arisen is how a seller can optimize their public listing to maximize the selling premium. This paper analyzes the use of three machine learning methods to solve this problem: Linear Regression, Decision Tree Regressor and Random Forest Regressor. The aim is to retrieve information regarding how certain attributes contribute to the premium value. The dataset used contains apartments sold within the years 2014-2018 in the Östermalm / Djurgården district in Stockholm, Sweden. The resulting models returned an R² value of approximately 0.26 and a Mean Absolute Error of approximately 0.06. While the models were not accurate at predicting the premium, information could still be extracted from them. In conclusion, a high number of views and a publication made in April provide the best conditions for an advertisement to reach a high selling premium. The seller should try to keep the number of days since publication lower than 15.5 days and avoid publishing on a Tuesday.
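A minimal sketch of the comparison described above, assuming a hypothetical listings file and feature set rather than the authors' actual data pipeline:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_absolute_error

# Hypothetical listing data: attributes of each advertisement plus the selling premium.
df = pd.read_csv("listings.csv")
X = df[["views", "days_since_publication", "publication_month", "publication_weekday"]]
y = df["premium"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=200, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    print(f"{name}: R2={r2_score(y_te, pred):.2f}, MAE={mean_absolute_error(y_te, pred):.3f}")
```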
25

Comparison of Machine Learning Algorithms on Identifying Autism Spectrum Disorder

Aravapalli, Naga Sai Gayathri, Palegar, Manoj Kumar January 2023 (has links)
Background: Autism Spectrum Disorder (ASD) is a complex neurodevelopmental disorder that affects social communication, behavior, and cognitive development. Patients with autism face a variety of difficulties, such as sensory impairments, attention issues, learning disabilities, mental health issues like anxiety and depression, as well as motor and learning issues. The World Health Organization (WHO) estimates that one in 100 children has ASD. Although ASD cannot be completely treated, early identification of its symptoms might lessen its impact, and early identification can significantly improve the outcome of interventions and therapies. It is therefore important to identify the disorder early. Machine learning algorithms can help in predicting ASD. In this thesis, Support Vector Machine (SVM) and Random Forest (RF) are the algorithms used to predict ASD. Objectives: The main objective of this thesis is to build and train models using machine learning (ML) algorithms, both with default parameters and with hyperparameter tuning, and to determine the most accurate model for predicting whether a person is suffering from ASD, based on a comparison of the two experiments. Methods: Experimentation is the method chosen to answer the research questions, as it helped in finding the most accurate model for predicting ASD. Experimentation was preceded by data preparation, including splitting of the data and applying feature selection to the dataset. In two experiments, the models were first trained with default parameters and then with hyperparameter tuning, and performance metrics were recorded in each case. Based on the comparison, the most accurate model was applied to predict ASD. Results: In this thesis, we chose two algorithms, SVM and RF, to train the models. Upon experimentation and training of the models with hyperparameter tuning, SVM obtained the highest scores, with an accuracy of 96% and an F1 score of 97% on the test data, outperforming the RF model in predicting ASD. Conclusions: The models were trained using the two ML algorithms, SVM and RF, in two experiments: in experiment 1 the models were trained using default parameters, and in experiment 2 they were trained using hyperparameter tuning; accuracy and F1 scores were obtained for the test data in both. By comparing the performance metrics, we concluded that SVM is the most accurate algorithm for predicting ASD.
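As a hedged illustration of the two-experiment workflow (default parameters versus hyperparameter tuning), not the thesis's exact code, a scikit-learn sketch might look as follows; the dataset path, binary 0/1 label encoding and parameter grids are assumptions, and the feature-selection step mentioned in the abstract is omitted for brevity.

```python
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical ASD screening dataset with a binary 0/1 "ASD" label.
df = pd.read_csv("asd_screening.csv")
X, y = df.drop(columns=["ASD"]), df["ASD"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=1)

candidates = {
    "SVM": (Pipeline([("scale", StandardScaler()), ("clf", SVC())]),
            {"clf__C": [0.1, 1, 10], "clf__kernel": ["rbf", "linear"]}),
    "RF": (RandomForestClassifier(random_state=1),
           {"n_estimators": [100, 300], "max_depth": [None, 10]}),
}

def report(label, model):
    pred = model.predict(X_te)
    print(f"{label}: accuracy={accuracy_score(y_te, pred):.2f}, f1={f1_score(y_te, pred):.2f}")

for name, (model, grid) in candidates.items():
    # Experiment 1: default parameters.
    report(f"{name} (default)", model.fit(X_tr, y_tr))
    # Experiment 2: hyperparameter tuning via cross-validated grid search.
    search = GridSearchCV(model, grid, cv=5, scoring="f1").fit(X_tr, y_tr)
    report(f"{name} (tuned)", search)
    print("  best params:", search.best_params_)
```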
26

Konzeptentwicklung für das Qualitätsmanagement und der vorausschauenden Instandhaltung im Bereich der Innenhochdruck-Umformung (IHU): SFU 2023

Reuter, Thomas, Massalsky, Kristin, Burkhardt, Thomas 06 March 2024 (has links)
Series manufacturers in the field of hydroforming (Innenhochdruck-Umformung, IHU) are under strong competitive pressure from alternative conventional manufacturing processes and their cost criteria. Changing production requirements in a globalized market environment demand flexible action at the highest quality and low cost. Cost savings can be achieved by reducing warehouse and work-in-progress stocks. Malfunction-related downtimes of IHU systems must be kept to a minimum in order to meet the agreed delivery dates on time and avoid contractual penalties. The required productivity and the targeted quality level can only be maintained through adapted maintenance strategies, which is why a concept for predictive maintenance with integrated quality management was developed specifically for the IHU domain. Dynamic process and maintenance adaptations are a central component of this development work.
27

Concept development for quality management and predictive maintenance in the area of hydroforming (IHU): SFU 2023

Reuter, Thomas, Massalsky, Kristin, Burkhardt, Thomas 06 March 2024 (has links)
Series manufacturers in the field of hydroforming face intense competition from alternative conventional manufacturing methods and their cost criteria. Changing production requirements in the globalized market environment require flexible action at the highest quality and low cost. Cost savings can be achieved through reductions in warehouse and work-in-progress stocks. Malfunction-related downtimes in hydroforming systems must be reduced to a minimum in order to meet the agreed delivery dates on time and avoid contractual penalties. The required productivity and the desired quality level can only be maintained through adapted maintenance strategies, leading to the development of a concept for predictive maintenance integrated with quality management specifically for the IHU domain. Dynamic process and maintenance adaptations are a central component of this development effort.
28

Using hydrological models and digital soil mapping for the assessment and management of catchments: A case study of the Nyangores and Ruiru catchments in Kenya (East Africa)

Kamamia, Ann Wahu 18 July 2023 (has links)
Human activities on land have a direct and cumulative impact on water and other natural resources within a catchment. This land-use change can have hydrological consequences on the local and regional scales. Sound catchment assessment is not only critical to understanding processes and functions but also important in identifying priority management areas. The overarching goal of this doctoral thesis was to design a methodological framework for catchment assessment (dependent upon data availability) and propose practical catchment management strategies for sustainable water resources management. The Nyangores and Ruiru reservoir catchments located in Kenya, East Africa were used as case studies. A properly calibrated Soil and Water Assessment Tool (SWAT) hydrologic model coupled with a generic land-use optimization tool (Constrained Multi-Objective Optimization of Land-use Allocation-CoMOLA) was applied to identify and quantify functional trade-offs between environmental sustainability and food production in the ‘data-available’ Nyangores catchment. This was determined using a four-dimension objective function defined as (i) minimizing sediment load, (ii) maximizing stream low flow and (iii and iv) maximizing the crop yields of maize and soybeans, respectively. Additionally, three different optimization scenarios, represented as i.) agroforestry (Scenario 1), ii.) agroforestry + conservation agriculture (Scenario 2) and iii.) conservation agriculture (Scenario 3), were compared. For the data-scarce Ruiru reservoir catchment, alternative methods using digital soil mapping of soil erosion proxies (aggregate stability using Mean Weight Diameter) and spatial-temporal soil loss analysis using empirical models (the Revised Universal Soil Loss Equation-RUSLE) were used. The lack of adequate data necessitated a data-collection phase which implemented the conditional Latin Hypercube Sampling. This sampling technique reduced the need for intensive soil sampling while still capturing spatial variability. The results revealed that for the Nyangores catchment, adoption of both agroforestry and conservation agriculture (Scenario 2) led to the smallest trade-off amongst the different objectives i.e. a 3.6% change in forests combined with 35% change in conservation agriculture resulted in the largest reduction in sediment loads (78%), increased low flow (+14%) and only slightly decreased crop yields (3.8% for both maize and soybeans). Therefore, the advanced use of hydrologic models with optimization tools allows for the simultaneous assessment of different outputs/objectives and is ideal for areas with adequate data to properly calibrate the model. For the Ruiru reservoir catchment, digital soil mapping (DSM) of aggregate stability revealed that susceptibility to erosion exists for cropland (food crops), tea and roadsides, which are mainly located in the eastern part of the catchment, as well as deforested areas on the western side. This validated that with limited soil samples and the use of computing power, machine learning and freely available covariates, DSM can effectively be applied in data-scarce areas. Moreover, uncertainty in the predictions can be incorporated using prediction intervals. The spatial-temporal analysis exhibited that bare land (which has the lowest areal proportion) was the largest contributor to erosion. Two peak soil loss periods corresponding to the two rainy periods of March–May and October–December were identified. 
Thus, yearly soil erosion risk maps misrepresent the true dimensions of soil loss, with averages disguising areas of low and high potential. Also, a small portion of the catchment can be responsible for a large proportion of the total erosion. For both catchments, agroforestry (combining the use of trees and conservation farming) is the most feasible catchment management strategy (CMS) for solving the major water quantity and quality problems. Finally, the key to thriving catchments aiming at both sustainability and resilience is urgent collaborative action by all stakeholders. The necessary stakeholders in both the Nyangores and Ruiru reservoir catchments must be involved in catchment assessment in order to identify the catchment problems, mitigation strategies, roles and responsibilities, while keeping in mind that some risks need to be shared and negotiated, but so will the benefits.
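For readers unfamiliar with the empirical model named above, RUSLE estimates average annual soil loss per cell as the product A = R · K · LS · C · P. A minimal, illustrative sketch with made-up factor grids (not the thesis's actual data or workflow) follows:

```python
import numpy as np

# RUSLE: A = R * K * LS * C * P
#   A  soil loss (t ha^-1 yr^-1), R rainfall erosivity, K soil erodibility,
#   LS slope length/steepness factor, C cover management, P support practice.
shape = (4, 4)                                   # toy 4x4 raster for illustration
rng = np.random.default_rng(0)
R  = rng.uniform(200, 600, shape)                # all factor grids are hypothetical
K  = rng.uniform(0.05, 0.4, shape)
LS = rng.uniform(0.5, 8.0, shape)
C  = rng.uniform(0.01, 0.5, shape)
P  = rng.uniform(0.5, 1.0, shape)

A = R * K * LS * C * P                           # cell-wise soil loss estimate
print("mean soil loss:", round(A.mean(), 2))
worst_two = np.sort(A, axis=None)[-2:]           # the two highest-loss cells
print("share of total loss from the two worst cells:",
      round(worst_two.sum() / A.sum(), 2))       # a few hotspots can dominate
```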
29

Learning Algorithms Using Chance-Constrained Programs

Jagarlapudi, Saketha Nath 07 1900 (has links)
This thesis explores Chance-Constrained Programming (CCP) in the context of learning. It is shown that chance-constrained approaches lead to improved algorithms for three important learning problems: classification with specified error rates, large dataset classification and Ordinal Regression (OR). Using moments of the training data, the CCPs are posed as Second Order Cone Programs (SOCPs). Novel iterative algorithms for solving the resulting SOCPs are also derived. Borrowing ideas from robust optimization theory, the proposed formulations are made robust to moment estimation errors. A maximum margin classifier with specified false positive and false negative rates is derived. The key idea is to employ a chance constraint for each class which implies that the actual misclassification rates do not exceed the specified rates. The formulation is applied to the case of biased classification. The problems of large dataset classification and ordinal regression are addressed by deriving formulations which employ chance constraints for clusters in the training data rather than constraints for each data point. Since the number of clusters can be substantially smaller than the number of data points, the size of the resulting formulation and the number of inequalities are very small. Hence the formulations scale well to large datasets. The scalable classification and OR formulations are extended to feature spaces, and the kernelized duals turn out to be instances of SOCPs with a single cone constraint. Exploiting this special structure, fast iterative solvers which outperform generic SOCP solvers are proposed. Compared to state-of-the-art learners, the proposed algorithms achieve a speed-up as high as 10,000 times when the specialized SOCP solvers are employed. The proposed formulations involve second order moments of the data and hence are susceptible to moment estimation errors. A generic way of making the formulations robust to such estimation errors is illustrated. Two novel confidence sets for moments are derived, and it is shown that when either of the confidence sets is employed, the robust formulations also yield SOCPs.
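To make the connection between chance constraints and SOCPs concrete, here is the standard moment-based reformulation via the multivariate Chebyshev bound (textbook material, not necessarily the thesis's exact formulation): a constraint that a random point x with mean μ and covariance Σ be classified correctly with probability at least η becomes a second-order cone constraint.

```latex
% Chance constraint on the classifier w^T x + b with confidence level \eta:
\Pr\bigl( y\,(w^{\top} x + b) \ge 1 \bigr) \;\ge\; \eta,
\qquad x \sim (\mu, \Sigma).

% Distribution-free (Chebyshev) reformulation as a second-order cone constraint:
y\,(w^{\top}\mu + b) \;\ge\; 1 + \kappa \,\bigl\lVert \Sigma^{1/2} w \bigr\rVert_2,
\qquad \kappa = \sqrt{\tfrac{\eta}{1-\eta}}.
```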
30

運用記憶體內運算於智慧型健保院所異常查核之研究 / A Research into In-Memory Computing Techniques for Intelligent Check of Health-Insurance Fraud

湯家哲, Tang, Jia Jhe Unknown Date (has links)
The financial condition of the National Health Insurance (NHI) has been poor in recent years. The income statement for 2009 indicated that the National Health Insurance Administration (NHIA) was in deficit by NTD 58.2 billion. According to NHIA data, contracted medical institutions in Taiwan have violated the NHI laws 13,722 times to date, and among all serious violations, fraud is the most common. In order to find illegal medical institutions, the NHIA draws random samples by computer; once the data is collected, NHIA investigators carry out the review manually. However, this way of obtaining samples cannot effectively capture the institutions that violate the rules, so the review has limited effect. Benford's Law, also called the First-Digit Law, states that smaller leading digits appear more frequently, while larger digits occur less frequently. Benford's Law has been applied to accounting, finance, auditing and economics. Yang (2012) applied Benford's Law indicators to Taiwan's NHI data and combined them with machine learning algorithms for fraud detection. Zaharia et al. (2012) proposed a fault-tolerant in-memory cluster computing framework, Apache Spark; with the same computing nodes and resources, its processing can be more than 20 times faster than Hadoop MapReduce. In order to solve the problem of ineffective medical claims review, this study applied Benford's Law: NHI data published by the National Health Research Institutes was used to compute Benford's Law variables and practical variables, and support vector machines and logistic regression were then used to construct the anomaly check model. Because the NHI data volume is large, Apache Spark was used as the computing environment to reduce processing time, with Hadoop MapReduce adopted as a benchmark for comparing computational efficiency. The results show that the Spark program written for this study runs about twice as fast as the MapReduce version. For the classification models, both the support vector machine and logistic regression achieved sensitivity above 80% on inpatient data; on outpatient data the accuracy of both models was lower than on inpatient data, but logistic regression still retained reasonable performance, with 75% sensitivity and 73% overall accuracy. This study used Apache Spark to reduce the computation time needed to process large volumes of NHI data. Furthermore, the intelligent anomaly check model constructed here can indeed identify medical institutions that violate their contracts, and the institutions it flags as potentially committing fraud or abuse of the NHI can be passed on for manual investigation, ultimately improving the effectiveness of NHI claim reviews.
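To illustrate the Benford's Law features mentioned above (a generic sketch, not the thesis's Spark implementation; the claim amounts are made up), one can compare the observed first-digit frequencies of an institution's claim amounts against the Benford distribution and use the deviation as a screening feature:

```python
import numpy as np

def benford_features(amounts):
    """Return observed first-digit frequencies and their deviation from Benford's Law."""
    amounts = np.asarray([a for a in amounts if a > 0], dtype=float)
    first_digits = np.array([int(str(a).lstrip("0.")[0]) for a in amounts])
    observed = np.array([(first_digits == d).mean() for d in range(1, 10)])
    expected = np.log10(1 + 1 / np.arange(1, 10))        # Benford first-digit frequencies
    mad = np.abs(observed - expected).mean()             # mean absolute deviation
    return observed, expected, mad

# Hypothetical claim amounts for one medical institution.
claims = [1200, 1350, 980, 1900, 2100, 1150, 3020, 1011, 1750, 1420]
obs, exp, mad = benford_features(claims)
print("observed:", obs.round(3))
print("expected:", exp.round(3))
print("mean absolute deviation (a possible screening feature):", round(mad, 3))
```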
