Global ETD Search

1	Protein Backbone Reconstruction with Tool Preference Classification for Standard and Nonstandard Proteins Wu, Hsin-Fang 11 September 2012 Given a protein sequence and the Cα coordinates on its backbone, the all-atom protein backbone reconstruction problem (PBRP) is to reconstruct the backbone by determining the 3D coordinates of its N, C and O atoms. In the past few decades, many methods have been proposed for solving PBRP, such as ab initio, homology modeling, SABBAC, Wang's method, Chang's method, BBQ (Backbone Building from Quadrilaterals) and Chen's method. Chen found that choosing the correct prediction tool to build the 3D coordinates of the desired atoms may improve the RMSD. In this thesis, we propose a method for solving PBRP based on Chen's method. We use tool preference classification on each atom of the residue, where the classification model is generated by an SVM (Support Vector Machine). We rebuild the backbone by combining the prediction results of all atoms in all residues. The data sets used in our experiments are CASP7, CASP8 and CASP9, which have 65, 52 and 63 proteins, respectively. These data sets contain nonstandard amino acids as well as standard ones. We improve the average RMSDs of Chen's results in some cases. The average RMSDs of our method are 0.3496 in CASP7, 0.3084 in CASP8 and 0.3286 in CASP9. backbone bioinformatics standard protein support vector machine feature set
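For readers unfamiliar with the metric quoted in the entry above, RMSD is the root-mean-square distance between predicted and reference atom positions. The following is a minimal sketch, assuming both structures are already in the same coordinate frame and list their atoms in the same order; the function name `rmsd` and the sample coordinates are illustrative and do not come from the thesis.

```python
import math

def rmsd(predicted, reference):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumption: no superposition step is performed; both structures are
    taken to be in the same coordinate frame already.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared Euclidean distances over all atom pairs.
    total = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(predicted, reference)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(total / len(predicted))

# Hypothetical coordinates, for illustration only.
example = rmsd([(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
               [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
print(example)
```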
2	Prediction for the Essential Protein with the Support Vector Machine Yang, Zih-Jie 06 September 2011 Essential proteins deeply affect cellular life, but they are hard to identify. Protein-protein interaction is one way to reveal whether a protein is essential. Many researchers use a feature set composed of topology properties from protein-protein interaction to predict essential proteins. However, the functionality of a protein is also a clue to its essentiality. In this thesis, to build SVM models for predicting essential proteins, our feature set contains sequence properties that can influence protein function, topology properties and protein properties. In our experiments, we downloaded Scere20070107, which contains 4873 proteins and 17166 interactions, from the DIP database. The ratio of essential proteins to nonessential proteins is nearly 1:4, so the dataset is imbalanced. On the imbalanced dataset, the best values of F-measure, MCC, AIC and BIC of our models are 0.5197, 0.4671, 0.2428 and 0.2543, respectively. We build another, balanced dataset with ratio 1:1. On the balanced dataset, the best values of F-measure, MCC, AIC and BIC of our models are 0.7742, 0.5484, 0.3603 and 0.3828, respectively. Our results are superior to all previous results under various measurements. bioinformatics essential protein protein-protein interaction support vector machine feature set
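The F-measure and MCC reported in the entry above are both computed from a binary confusion matrix. The sketch below shows the standard definitions, assuming the F-measure is the balanced F1 score (the abstract does not state the weighting); the function names and the counts in the usage line are hypothetical.

```python
import math

def f_measure(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall (assumes beta = 1)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical counts: essential proteins are the positive class.
print(f_measure(tp=40, fp=10, fn=10), mcc(tp=40, tn=40, fp=10, fn=10))
```

Unlike the F-measure, MCC also uses the true negatives, which is one reason both are commonly reported together on imbalanced data such as the 1:4 dataset.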
3	Taiwan Stock Forecasting with the Genetic Programming Jhou, Siao-ming 07 September 2011 In this thesis, we propose a model that applies genetic programming (GP) to train a profitable and stable trading strategy in the training period; the strategy is then applied to trade stocks in the testing period. The variables for GP in our models include 6 items of basic information and 25 technical indicators. We run our models on the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) from 2000/9/14 to 2010/5/21, approximately ten years. We conduct five experiments. In these experiments, we find that the trading strategies generated by GP with two arithmetic trees have more stable returns. In addition, if we obtain the trading strategies from the three historical periods most similar to the current training period, we earn higher returns in the testing periods. In each experiment, 24 cases are considered, combining training periods of 90, 180, 270, 365, 455, 545, 635 and 730 days with testing periods of 90, 180 and 365 days. The testing period is updated on a rolling basis until the end of the experiment period. The best cumulative return, 165.30%, occurs when the 730-day training period is paired with the 365-day testing period; this is much higher than the 1.19% return of the buy-and-hold strategy. Stock genetic programming annualized return feature set
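The 24 cases in the entry above are the 8 training lengths crossed with the 3 testing lengths. The sketch below enumerates them and illustrates one possible rolling scheme; the abstract does not say how far the window advances each step, so stepping forward by the testing length is an assumption, and `rolling_windows` is a hypothetical helper.

```python
from itertools import product

TRAINING_DAYS = (90, 180, 270, 365, 455, 545, 635, 730)
TESTING_DAYS = (90, 180, 365)

# Every (training length, testing length) pairing: 8 x 3 = 24 cases.
cases = list(product(TRAINING_DAYS, TESTING_DAYS))

def rolling_windows(total_days, train_days, test_days):
    """Yield (train_start, test_start, test_end) day offsets.

    Assumption: the window advances by one testing period per step.
    """
    start = 0
    while start + train_days + test_days <= total_days:
        yield (start, start + train_days, start + train_days + test_days)
        start += test_days

print(len(cases))
print(list(rolling_windows(total_days=10, train_days=4, test_days=2)))
```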
4	Trading Strategy Mining with Gene Expression Programming Huang, Chang-Hao 12 September 2012 In this thesis, we apply gene expression programming (GEP) to train profitable trading strategies. We propose a model that uses several historical periods highly related to the current template period; the best trading strategies of those historical periods generate the trading signals. To keep the model stable, we propose a trading decision mechanism based on a simple majority vote. The Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) is selected as our investment target, and the trading period runs from 2000/9/14 to 2012/1/17, approximately twelve years. In our experiments, the lengths of the training period are 60, 90, 120, 180 and 270 trading days. We observe that models with a higher voting threshold can usually make profitable trading decisions. The best cumulative return, 236.25%, and the best annualized cumulative return, 10.63%, occur when the 180-day training model is paired with an available threshold of 0.21 and a voting threshold of 0.88; these are higher than the cumulative return of 0.96% and annualized cumulative return of 0.08% of the buy-and-hold strategy. simple majority vote feature set strategy pool gene expression programming
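The cumulative and annualized figures in the entry above can be related by geometric compounding. The sketch below assumes a twelve-year horizon and compound annualization; both are inferences from the reported numbers rather than a formula stated in the abstract, and `annualized_return` is an illustrative name.

```python
def annualized_return(cumulative_return, years):
    """Convert a cumulative return (0.5 means 50%) to a compound annual rate."""
    if years <= 0:
        raise ValueError("years must be positive")
    return (1.0 + cumulative_return) ** (1.0 / years) - 1.0

# Reported cumulative returns, annualized over an assumed 12 years.
gep_rate = annualized_return(2.3625, 12)       # GEP strategy, 236.25%
buy_hold_rate = annualized_return(0.0096, 12)  # buy-and-hold, 0.96%
print(f"{gep_rate:.2%} {buy_hold_rate:.2%}")
```

Under these assumptions the results round to the 10.63% and 0.08% quoted in the abstract.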
5	Using Genetic Algorithms for Feature Set Selection in Text Mining Rogers, Benjamin Charles 17 January 2014 No description available. Artificial Intelligence Computer Engineering Computer Science Information Science genetic algorithms feature set selection text mining design rationale GATE WEKA pipeline
6	Development of a Machine Learning Algorithm to Identify Error Causes of Automated Failed Test Results Pallathadka Shivarama, Anupama 15 March 2024 The automotive industry is continuously innovating and adopting new technologies. At the same time, companies work to maintain the quality of their hardware products and to meet customer demands. Before a product is delivered to the customer, it must be tested and approved for safe use. The same applies to software. Adopting modern technologies can further improve the efficiency of software testing. This thesis aims to build a machine learning algorithm for use during software testing. Evaluating the test report generated after testing is generally time-consuming. The algorithm should reduce the time and manual effort spent on this evaluation. The machine learning algorithms analyze and learn from the data available in old test reports. Based on the learned data patterns, they suggest possible root causes for failed test cases in the future. The thesis includes a literature survey that helped in understanding how machine learning has been applied to similar problems in different industries. The tasks involved in building the model are data loading, data pre-processing, selecting the best conditions for each algorithm and comparing the algorithms' performance. The thesis also suggests possible future work towards improving the performance of the models. The entire work is implemented in Jupyter Notebook using the pandas and scikit-learn libraries. info:eu-repo/classification/ddc/004 ddc:004 Softwaretest Maschinelles Lernen
