281 |
Obtaining Accurate and Comprehensible Data Mining Models: An Evolutionary Approach. Johansson, Ulf, January 2007 (has links)
When performing predictive data mining, the use of ensembles is claimed to virtually guarantee increased accuracy compared to the use of single models. Unfortunately, the problem of how to maximize ensemble accuracy is far from solved. In particular, the relationship between ensemble diversity and accuracy is not completely understood, making it hard to efficiently utilize diversity for ensemble creation. Furthermore, most high-accuracy predictive models are opaque, i.e. it is not possible for a human to follow and understand the logic behind a prediction. For some domains, this is unacceptable, since models need to be comprehensible. To obtain comprehensibility, accuracy is often sacrificed by using simpler but transparent models; a trade-off termed the accuracy vs. comprehensibility trade-off. With this trade-off in mind, several researchers have suggested rule extraction algorithms, where opaque models are transformed into comprehensible models, keeping an acceptable accuracy. In this thesis, two novel algorithms based on Genetic Programming are suggested. The first algorithm (GEMS) is used for ensemble creation, and the second (G-REX) is used for rule extraction from opaque models. The main property of GEMS is the ability to combine smaller ensembles and individual models in an almost arbitrary way. Moreover, GEMS can use base models of any kind and the optimization function is very flexible, easily permitting inclusion of, for instance, diversity measures. In the experimentation, GEMS obtained accuracies higher than both straightforward design choices and published results for Random Forests and AdaBoost. The key quality of G-REX is the inherent ability to explicitly control the accuracy vs. comprehensibility trade-off. Compared to the standard tree inducers C5.0 and CART, and some well-known rule extraction algorithms, rules extracted by G-REX are significantly more accurate and compact. Most importantly, G-REX is thoroughly evaluated and found to meet all relevant evaluation criteria for rule extraction algorithms, thus establishing G-REX as the algorithm to benchmark against.
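As a rough illustration of evolutionary ensemble construction (a minimal sketch, not GEMS itself), the snippet below evolves a binary inclusion mask over pre-computed base-model predictions with a fitness that mixes validation accuracy and a pairwise-disagreement diversity term; the toy data, the (1+1) search loop, and the lambda_div weight are illustrative assumptions.

```python
# Minimal sketch: evolving an ensemble from pre-computed base-model predictions,
# with a fitness that mixes validation accuracy and a disagreement-based
# diversity term. Names and settings are illustrative, not GEMS's.
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: predictions of 10 hypothetical base models on 200 validation cases.
n_models, n_cases = 10, 200
y_true = rng.integers(0, 2, n_cases)
# Each base model is ~70% accurate on its own.
base_preds = np.array([np.where(rng.random(n_cases) < 0.7, y_true, 1 - y_true)
                       for _ in range(n_models)])

def ensemble_predict(mask):
    """Majority vote over the base models selected by a binary mask."""
    votes = base_preds[mask.astype(bool)]
    return (votes.mean(axis=0) >= 0.5).astype(int)

def fitness(mask, lambda_div=0.1):
    if mask.sum() == 0:
        return -np.inf
    acc = (ensemble_predict(mask) == y_true).mean()
    members = base_preds[mask.astype(bool)]
    # Diversity proxy: mean pairwise disagreement among selected members.
    if len(members) > 1:
        dis = np.mean([(members[i] != members[j]).mean()
                       for i in range(len(members))
                       for j in range(i + 1, len(members))])
    else:
        dis = 0.0
    return acc + lambda_div * dis

# Simple (1+1) evolutionary loop over inclusion masks.
best = rng.integers(0, 2, n_models)
for _ in range(500):
    child = best.copy()
    child[rng.integers(0, n_models)] ^= 1        # flip one model in or out
    if fitness(child) >= fitness(best):
        best = child

print("selected models:", np.flatnonzero(best), "fitness:", round(fitness(best), 3))
```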
|
282 |
Understanding, Modeling and Predicting Hidden Solder Joint Shape Using Active Thermography. Giron Palomares, Jose, 2012 May 1900 (has links)
Characterizing hidden solder joint shapes is essential for electronics reliability. Active thermography is a methodology to identify hidden defects inside an object by means of the surface's abnormal thermal response after applying a heat flux. This research focused on understanding, modeling, and predicting hidden solder joint shapes. An experimental model based on active thermography was used to understand how solder joint shapes affect the surface thermal response (grand average cooling rate, or GACR) of electronic multi-cover PCB assemblies. Next, a numerical model simulated the active thermography technique, investigated the technique's limitations, and extended its applicability to characterizing hidden solder joint shapes. Finally, a prediction model determined the optimum active thermography conditions to achieve an adequate hidden solder joint shape characterization.
The experimental model determined that solder joint shape plays a larger role in the GACR for visible than for hidden solder joints; however, a MANOVA analysis proved that hidden solder joint shapes are significantly different when described by the GACR. An artificial neural network classifier showed that the distances between the GACRs of experimental solder joint shapes must be larger than 0.12 to achieve 85% classification accuracy. The numerical model achieved minimum agreements of 95.27% and 86.64% with the experimental temperatures and GACRs, respectively, at the center of the PCB assembly top cover. The parametric analysis proved that solder joint shape discriminability is directly proportional to heat flux, but inversely proportional to the number of covers and the heating time. In addition, the parametric analysis determined that active thermography is limited to five covers when discriminating among hidden solder joint shapes. A prediction model was developed based on the parametric numerical data to determine the appropriate amount of energy to discriminate among solder joint shapes for up to five covers. The agreement between the prediction model and the experimental model was within 90.6% for one and two covers. The prediction model is limited to only three solder joints, but the principles of this research can be applied to generate more realistic prediction models for large-scale electronic assemblies, such as ball grid array assemblies with as many as 600 solder joints.
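A minimal sketch of the GACR-distance idea mentioned above, under the assumption that the GACR is the mean cooling rate of a surface temperature trace (the thesis's exact definition may differ); the synthetic cooling curves and shape labels stand in for measured responses of hypothetical joint shapes.

```python
# Minimal sketch: compute a grand average cooling rate (GACR) for synthetic
# surface temperature traces and compare pairwise distances against the ~0.12
# separability threshold reported above. Traces and the GACR definition used
# here are illustrative assumptions.
import numpy as np

def gacr(temps, dt=0.1):
    """Mean cooling rate (degrees per second) over a temperature trace."""
    rates = -np.diff(temps) / dt          # positive while the surface cools
    return rates.mean()

t = np.arange(0, 2, 0.1)
# Synthetic cooling curves standing in for three hypothetical solder joint shapes.
traces = {
    "shape_A": 25 + 10 * np.exp(-0.80 * t),
    "shape_B": 25 + 10 * np.exp(-0.83 * t),
    "shape_C": 25 + 10 * np.exp(-1.10 * t),
}
g = {name: gacr(tr) for name, tr in traces.items()}

names = list(g)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        d = abs(g[names[i]] - g[names[j]])
        verdict = "separable" if d > 0.12 else "hard to classify"
        print(f"{names[i]} vs {names[j]}: |dGACR| = {d:.3f} -> {verdict}")
```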
|
283 |
Virtual virtuosity. Bresin, Roberto, January 2000 (has links)
This dissertation presents research in the field of automatic music performance with a special focus on piano. A system is proposed for automatic music performance, based on artificial neural networks (ANNs). A complex, ecological-predictive ANN was designed that listens to the last played note, predicts the performance of the next note, looks three notes ahead in the score, and plays the current tone. This system was able to learn a professional pianist's performance style at the structural micro-level. In a listening test, performances by the ANN were judged clearly better than deadpan performances and slightly better than performances obtained with generative rules. The behavior of an ANN was compared with that of a symbolic rule system with respect to musical punctuation at the micro-level. The rule system mostly gave better results, but some segmentation principles of an expert musician were only generalized by the ANN. Measurements of professional pianists' performances revealed interesting properties in the articulation of notes marked staccato and legato in the score. Performances were recorded on a grand piano connected to a computer. Staccato was realized by a micropause of about 60% of the inter-onset interval (IOI), while legato was realized by keeping two keys depressed simultaneously; the relative key overlap time was dependent on IOI: the larger the IOI, the shorter the relative overlap. The magnitudes of these effects changed with the pianists' coloring of their performances and with the pitch contour. These regularities were modeled in a set of rules for articulation in automatic piano music performance. Emotional coloring of performances was realized by means of macro-rules implemented in the Director Musices performance system. These macro-rules are groups of rules that were combined such that they reflected previous observations on musical expression of specific emotions. Six emotions were simulated. A listening test revealed that listeners were able to recognize the intended emotional colorings. In addition, some possible future applications are discussed in the fields of automatic music performance, music education, automatic music analysis, virtual reality and sound synthesis.
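A minimal sketch of the articulation regularities described above: staccato realized as a micropause of roughly 60% of the inter-onset interval (IOI), and legato as a key overlap whose relative size shrinks as the IOI grows; the specific overlap formula and constants are illustrative assumptions, not the dissertation's rule set.

```python
# Minimal sketch of the staccato/legato articulation regularities described above.
def articulate(ioi_ms, marking):
    """Return (key-down duration, overlap with next note) in milliseconds."""
    if marking == "staccato":
        duration = 0.40 * ioi_ms           # ~60% of the IOI left as a micropause
        overlap = -0.60 * ioi_ms
    elif marking == "legato":
        # Assumed relation: relative key overlap decreases as the IOI grows.
        rel_overlap = max(0.02, 0.12 - 0.0001 * ioi_ms)
        overlap = rel_overlap * ioi_ms
        duration = ioi_ms + overlap
    else:                                   # nominal (deadpan) articulation
        duration, overlap = ioi_ms, 0.0
    return duration, overlap

for ioi in (250, 500, 1000):
    for mark in ("staccato", "legato"):
        dur, ov = articulate(ioi, mark)
        print(f"IOI {ioi:4d} ms, {mark:8s}: duration {dur:7.1f} ms, overlap {ov:6.1f} ms")
```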
|
284 |
Design and implementation of controller for robotic manipulators using Artificial Neural Networks. Chamanirad, Mohsen, January 2009 (has links)
In this thesis a novel method for controlling a manipulator with an arbitrary number of degrees of freedom is proposed. The method combines the main advantages of two common controllers: the simplicity of a PID controller and the robustness and accuracy of an adaptive controller. The controller architecture is based on an Artificial Neural Network (ANN) and a PID controller. The controller solves the inverse dynamics and inverse kinematics of the robot with two separate Artificial Neural Networks. Since the ANN learns the system parameters by itself, the structure of the controller can easily be changed to improve the performance of the robot. The proposed controller can be implemented on an FPGA board to control the robot in real time, or the response of the ANN can be calculated offline and reconstructed by the controller using a lookup table. The error between the desired trajectory and the path of the robot converges to zero rapidly, and as the robot performs its tasks the controller learns the robot parameters and generates better control signals. The performance of the controller is tested in simulation and on a real manipulator with satisfactory results.
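A minimal sketch of the controller structure (PID feedback plus an ANN that learns a feedforward torque online), simulated on an assumed 1-DOF joint; the plant parameters, gains, network size, and training scheme are illustrative assumptions rather than the thesis's implementation.

```python
# Minimal sketch: PID feedback plus a small ANN that learns a feedforward torque
# online, on an assumed 1-DOF joint. All parameters are illustrative.
import numpy as np

rng = np.random.default_rng(1)
dt, T = 0.001, 5.0
I_j, b, mgl = 0.05, 0.1, 0.5             # inertia, damping, gravity term

# Tiny 1-hidden-layer network: inputs (q_d, dq_d, ddq_d) -> feedforward torque.
W1, b1 = 0.1 * rng.standard_normal((8, 3)), np.zeros(8)
W2, b2 = 0.1 * rng.standard_normal(8), 0.0
lr = 1e-3

def ann(x):
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2, h

Kp, Kd, Ki = 40.0, 2.0, 5.0
q, dq, e_int = 0.0, 0.0, 0.0
errs = []
for k in range(int(T / dt)):
    t = k * dt
    q_d, dq_d, ddq_d = np.sin(t), np.cos(t), -np.sin(t)    # desired trajectory
    e, de = q_d - q, dq_d - dq
    e_int += e * dt
    x = np.array([q_d, dq_d, ddq_d])
    tau_ff, h = ann(x)
    tau = Kp * e + Kd * de + Ki * e_int + tau_ff           # PID + learned feedforward
    # Plant: I*ddq = tau - b*dq - m*g*l*sin(q)
    ddq = (tau - b * dq - mgl * np.sin(q)) / I_j
    dq += ddq * dt
    q += dq * dt
    # Online training target: the torque that was actually applied, so the
    # network gradually takes over from the feedback term.
    err = tau - tau_ff
    W2 += lr * err * h; b2 += lr * err
    g1 = lr * err * W2 * (1 - h ** 2)
    W1 += np.outer(g1, x); b1 += g1
    errs.append(abs(e))

print(f"mean |tracking error|, first second: {np.mean(errs[:1000]):.4f}")
print(f"mean |tracking error|, last second:  {np.mean(errs[-1000:]):.4f}")
```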
|
285 |
Machine Learning Methods for Annual Influenza Vaccine Update. Tang, Lin, 26 April 2013 (has links)
Influenza is a public health problem that causes serious illness and deaths all over the world. Vaccination has been shown to be the most effective means of preventing infection. The primary components of an influenza vaccine are weakened strains. Vaccination triggers the immune system to develop antibodies against those strains whose viral surface glycoprotein hemagglutinin (HA) is similar to that of the vaccine strains. However, the influenza vaccine must be updated annually since the antigenic structure of HA is constantly mutating.
Hemagglutination inhibition (HI) assay is a laboratory procedure frequently applied to evaluate the antigenic relationships of influenza viruses. It enables the World Health Organization (WHO) to recommend appropriate updates on strains that will most likely be protective against the circulating influenza strains. However, the HI assay is labour intensive and time-consuming, since it requires several controls for standardization. We use two machine-learning methods, an Artificial Neural Network (ANN) and Logistic Regression, and a Mixed-Integer Optimization Model to predict antigenic variants. The ANN generalizes the input data to patterns inherent in the data, and then uses these patterns to make predictions. The logistic regression model identifies and selects the amino acid positions that contribute most significantly to antigenic difference; its output is used to predict the antigenic variants based on the predicted probability. The Mixed-Integer Optimization Model is formulated to find hyperplanes that enable binary classification. The performance of our models is evaluated by cross-validation.
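A minimal sketch of the logistic-regression component described above, assuming binary features that flag amino-acid differences at HA positions for each strain pair; the synthetic data, the L1 penalty, and the "informative" positions are illustrative assumptions, not the thesis's dataset or model.

```python
# Minimal sketch: logistic regression mapping amino-acid differences at HA
# positions between a strain pair to the probability of being antigenic
# variants. Synthetic data; positions and threshold are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_pairs, n_positions = 400, 50
# X[i, j] = 1 if strain pair i differs at HA position j.
X = rng.integers(0, 2, size=(n_pairs, n_positions))
# Assume only a handful of positions actually drive antigenic change.
informative = [5, 12, 33]
logit = -1.5 + 1.2 * X[:, informative].sum(axis=1)
y = (rng.random(n_pairs) < 1 / (1 + np.exp(-logit))).astype(int)

model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
model.fit(X, y)

# Positions with non-zero coefficients are the ones the model deems relevant.
selected = np.flatnonzero(np.abs(model.coef_[0]) > 1e-6)
print("selected HA positions:", selected)
print("predicted P(variant), first 5 pairs:",
      np.round(model.predict_proba(X[:5])[:, 1], 2))
```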
|
286 |
Modeling and analysis of actual evapotranspiration using data driven and wavelet techniques. Izadifar, Zohreh, 22 July 2010
Large-scale mining practices have disturbed many natural watersheds in northern Alberta, Canada. To restore disturbed landscapes and ecosystem functions, reconstruction strategies have been adopted with the aim of establishing sustainable reclaimed lands. The success of the reconstruction process depends on the design of reconstruction strategies, which can be optimized by improving the understanding of the controlling hydrological processes in the reconstructed watersheds. Evapotranspiration is one of the important components of the hydrological cycle; its estimation and analysis are crucial for better assessment of the reconstructed landscape hydrology, and for more efficient design. The complexity of the evapotranspiration process and its variability in time and space have imposed some limitations on previously developed evapotranspiration estimation models. The vast majority of the available models estimate the rate of potential evapotranspiration, which occurs under unlimited water supply conditions. However, the rate of actual evapotranspiration (AET) depends on the available soil moisture, which makes its physical modeling more complicated than that of potential evapotranspiration. The main objective of this study is to estimate and analyze the AET process in a reconstructed landscape.
Data driven techniques can model the process without a complete understanding of its physics. In this study, three data driven models, genetic programming (GP), artificial neural networks (ANNs), and multilinear regression (MLR), were developed and compared for estimating the hourly eddy covariance (EC)-measured AET using meteorological variables. The AET was modeled as a function of five meteorological variables: net radiation (Rn), ground temperature (Tg), air temperature (Ta), relative humidity (RH), and wind speed (Ws) in a reconstructed landscape located in northern Alberta, Canada. Several ANN models were evaluated using two training algorithms, Levenberg-Marquardt and Bayesian regularization. The GP technique was employed to generate mathematical equations correlating AET to the five meteorological variables. Furthermore, the available data were statistically analyzed to obtain MLR models and to identify the meteorological variables that have a significant effect on the evapotranspiration process. The utility of the investigated data driven models was also compared with that of HYDRUS-1D, a physically based model that uses the conventional Penman-Monteith (PM) method for the prediction of AET. The HYDRUS-1D model was examined for estimating AET using meteorological variables, leaf area index, and soil moisture information. Furthermore, wavelet analysis (WA), a multiresolution signal processing tool, was examined to improve the understanding of the temporal variations in the available time series by identifying the significant cyclic features, and to explore possible correlations between AET and the meteorological signals. WA was used for a priori determination of the inputs of the AET models.
The results of this study indicated that all three proposed data driven models were able to approximate the AET reasonably well; however, the GP and MLR models had better generalization ability than the ANN model. GP models demonstrated that the complex process of hourly AET can be efficiently modeled as simple semi-linear functions of a few meteorological variables. The results of the HYDRUS-1D model showed that a physically based model, such as HYDRUS-1D, may perform on par with, or even worse than, the data driven models in terms of overall prediction accuracy. The developed equation-based models, GP and MLR, revealed the larger contribution of net radiation and ground temperature, compared to other variables, to the estimation of AET. It was also found that the interaction effects of meteorological variables are important for AET modeling. The results of wavelet analysis demonstrated the presence of both small-scale (2 to 8 hours) and larger-scale (e.g. diurnal) cyclic features in most of the investigated time series. Larger-scale cyclic features were found to be the dominant source of temporal variations in the AET and most of the meteorological variables. The results of cross wavelet analysis indicated that the cause and effect relationship between AET and the meteorological variables may vary with the time-scale of variation under consideration. At small time-scales, significant linear correlations were observed between AET and the Rn, RH, and Ws time series, while at larger time-scales significant linear correlations were observed between AET and the Rn, RH, Tg, and Ta time series.
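A minimal sketch of an MLR model of hourly AET from the five meteorological variables named above, with one interaction term included since interaction effects were found to matter; the synthetic data and coefficients are illustrative assumptions, not the study's fitted model.

```python
# Minimal sketch: multilinear regression of hourly AET on Rn, Tg, Ta, RH, Ws
# plus one Rn x Tg interaction term. Synthetic data for illustration only.
import numpy as np

rng = np.random.default_rng(2)
n = 500
Rn = rng.uniform(0, 600, n)          # net radiation, W/m^2
Tg = rng.uniform(5, 30, n)           # ground temperature, deg C
Ta = rng.uniform(5, 30, n)           # air temperature, deg C
RH = rng.uniform(20, 100, n)         # relative humidity, %
Ws = rng.uniform(0, 8, n)            # wind speed, m/s

# Assumed "true" relation, used only to make the example self-contained.
aet = (0.02 + 2e-4 * Rn + 3e-3 * Tg - 5e-4 * RH + 1e-6 * Rn * Tg
       + rng.normal(0, 0.01, n))     # mm/hour

# Design matrix with an intercept and one interaction term.
X = np.column_stack([np.ones(n), Rn, Tg, Ta, RH, Ws, Rn * Tg])
coef, *_ = np.linalg.lstsq(X, aet, rcond=None)

pred = X @ coef
rmse = np.sqrt(np.mean((pred - aet) ** 2))
for name, c in zip(["const", "Rn", "Tg", "Ta", "RH", "Ws", "Rn*Tg"], coef):
    print(f"{name:6s} {c: .3e}")
print(f"RMSE: {rmse:.4f} mm/h")
```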
|
287 |
Construction of an Electroencephalogram-Based Brain-Computer Interface Using an Artificial Neural Network. KOBAYASHI, Takeshi, HONDA, Hiroyuki, OGAWA, Tetsuo, SHIRATAKI, Tatsuaki, IMANISHI, Toshiaki, HANAI, Taizo, HIBINO, Shin, LIU, Xicheng, 01 September 2003 (links)
No description available.
|
288 |
A Hybrid Neural Network-Mathematical Programming Approach to Design an Air Quality Monitoring Network for an Industrial Complex. Al-Adwani, Suad, January 2007 (links)
Air pollution sampling site selection is one of the most important and yet most vexing of the problems faced by those responsible for regional and urban air quality management and for the attainment and maintenance of national ambient air quality standards. Since one cannot hope to monitor air quality at all locations at all times, selecting sites that give a reliable and realistic picture of air quality becomes a major issue and, at the same time, a difficult task. The location (configuration) and the number of stations may be based on many factors, some of which may depend on limited resources, federal and state regulations, and local conditions. The combination of these factors has made air quality surveys more complex, requiring comprehensive planning to ensure that the prescribed objectives can be attained in the shortest possible time and at the least cost. Furthermore, the choice and siting of the measuring network represents a factor of significant economic relevance for policymakers. Given that equipment, maintenance, and operating personnel costs are increasing dramatically, the possibility of optimizing the monitoring design is most attractive to the directors of air quality management programs.
In this work a methodology for designing an optimal air quality monitoring network (AQMN) is described. The objective of the optimization is to provide maximum information about the presence and level of atmospheric contaminants in a given area with a limited budget. A criterion for assessing the allocation of monitoring stations is developed by applying a utility function that describes the spatial coverage of the network and its ability to detect violations of standards for multiple pollutants. A mathematical model based on the Multiple Cell Approach (MCA) was used to create monthly spatial distributions of the concentrations of the pollutants emitted from different emission sources. These data were used to train artificial neural networks (ANNs), which were shown to predict the pattern and violation scores at different potential locations very well. These neural networks were embedded within a mathematical programming model whose objective is to determine the best monitoring locations for a given budget, resulting in a nonlinear program (NLP).
The proposed model is applied to a network of existing refinery stacks, and the locations of the monitoring stations and their area coverage percentages are obtained.
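A minimal sketch of the station-selection step, where exhaustive enumeration stands in for the nonlinear program and a synthetic score vector stands in for the ANN-predicted violation scores; the utility weights, coverage radius, and budget are illustrative assumptions.

```python
# Minimal sketch: choose monitoring locations that maximize a utility combining
# spatial coverage and violation-detection ability, subject to a budget on the
# number of stations. Enumeration stands in for the NLP; all values are synthetic.
import itertools
import numpy as np

rng = np.random.default_rng(3)
n_sites, budget, cover_radius = 12, 3, 3.0
coords = rng.uniform(0, 10, size=(n_sites, 2))        # candidate locations
violation_score = rng.random(n_sites)                  # stand-in for ANN output

def utility(subset, w_cov=0.6, w_viol=0.4):
    sel = coords[list(subset)]
    # Coverage: fraction of candidate sites within cover_radius of a station.
    d = np.linalg.norm(coords[:, None, :] - sel[None, :, :], axis=2)
    coverage = (d.min(axis=1) <= cover_radius).mean()
    # Detection: mean predicted violation score at the chosen sites.
    detection = violation_score[list(subset)].mean()
    return w_cov * coverage + w_viol * detection

best = max(itertools.combinations(range(n_sites), budget), key=utility)
print("chosen stations:", best, "utility:", round(utility(best), 3))
```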
|
289 |
Forecasting Pavement Surface Temperature Using Time Series and Artificial Neural Networks. Hashemloo, Behzad, 09 June 2008 (links)
Transportation networks play a significant role in the economy of Canadians during winter seasons; thus, maintaining a safe and economic flow of traffic on Canadian roads is crucial. Winter contaminants such as freezing rain, snow, and ice reduce the friction between vehicle tires and pavement, increasing accident risk and decreasing road capacity. The formation of ice and frost caused by snowfall and wind chill makes driving a very difficult task. Pavement surface temperature is an important indicator for road authorities when deciding the optimal time to apply anti-icer/de-icer chemicals and when estimating their effect and the optimal amounts to apply. By forecasting pavement temperature, maintenance crews can anticipate road surface conditions and start their operations in a timely manner, thereby reducing salt use and increasing the safety and security of road users by reducing accidents caused by slipperiness.
This research investigates the feasibility of applying simple statistical models for forecasting road surface temperatures at locations where RWIS data are available. Two commonly used modeling techniques were considered: time-series analysis and artificial neural networks (ANN). A data set from an RWIS station is used for model calibration and validation. The analysis indicates that a multi-variable SARIMA model is the most competitive technique and produces the lowest forecasting errors.
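A minimal sketch of a multi-variable SARIMA model (here a SARIMAX with an exogenous air-temperature input) fitted to synthetic hourly data; the model orders, the single exogenous variable, and the data are illustrative assumptions, not the study's calibrated model.

```python
# Minimal sketch: seasonal ARIMA with an exogenous air-temperature regressor,
# fitted to synthetic hourly pavement-temperature data. Orders and data are
# illustrative assumptions.
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(4)
n = 240                                    # ten days of hourly observations
air_temp = -5 + 4 * np.sin(2 * np.pi * np.arange(n) / 24) + rng.normal(0, 0.5, n)
pavement = 0.8 * air_temp - 1.0 + rng.normal(0, 0.3, n)

y_train, y_test = pavement[:216], pavement[216:]
x_train, x_test = air_temp[:216].reshape(-1, 1), air_temp[216:].reshape(-1, 1)

model = SARIMAX(y_train, exog=x_train,
                order=(1, 0, 1), seasonal_order=(1, 0, 1, 24))
fit = model.fit(disp=False)
forecast = fit.forecast(steps=24, exog=x_test)

rmse = np.sqrt(np.mean((forecast - y_test) ** 2))
print(f"24-hour-ahead forecast RMSE: {rmse:.2f} deg C")
```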
|