311 |
Elaboração de itens para avaliações em larga escala / Elaboration of items for large scale evaluations. Edson Ferreira Costa. 17 May 2018
This work aims to help teachers and other Basic Education professionals write items for large-scale assessments. It begins with a brief history of the state of Basic Education in Brazil in the mid-1980s. It then describes some of the measures planned by educational agencies to improve the Brazilian educational landscape, such as the restructuring of the large-scale assessments that existed in the 1990s and the creation of new examinations. The following chapter presents the documents consulted when constructing these assessments (curricular and reference matrices), with emphasis on the Exame Nacional do Ensino Médio (ENEM), which has been the large-scale assessment with the widest coverage at the federal level since 2009. The remaining chapters discuss the importance of the item in large-scale assessments and present some sample items written on the basis of the ENEM Reference Matrix.
|
312 |
Interacting dark sectors in cosmology. Buen Abad Najar, Manuel Alejandro. 27 November 2018
We present two different interacting dark sector models: one in which the dark matter particle is charged under a non-abelian dark gauge group, whose gauge bosons constitute a dark radiation component; and one in which a fraction of the dark matter has efficient number-changing self-interactions that keep it warm. We find that in general the structure formation is slowed down in these models, which addresses a discrepancy in the measurement of the σ8 parameter of large-scale structure. We also perform fits to cosmological data for a generalization of the non-abelian model (in which only a fraction of the dark matter interacts with the dark gauge bosons) and show that it can ease the current experimental tension in the measurement of the Hubble expansion rate H0.
|
313 |
Voltage island-driven floorplanning. January 2008
Ma, Qiang. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (leaves 78-80). / Abstracts in English and Chinese. / Abstract / Acknowledgement / Chapter 1 --- Introduction / Chapter 1.1 --- Background / Chapter 1.2 --- Floorplanning / Chapter 1.3 --- Motivations / Chapter 1.4 --- Design Implementation of Voltage Islands / Chapter 1.5 --- Problem Formulation / Chapter 1.6 --- Progress on the Problem / Chapter 1.7 --- Contributions / Chapter 1.8 --- Thesis Organization / Chapter 2 --- Literature Review on MSV / Chapter 2.1 --- Introduction / Chapter 2.2 --- MSV at Post-floorplan/Post Placement Stage / Chapter 2.2.1 --- "Post-Placement Voltage Island Generation under Performance Requirement" / Chapter 2.2.2 --- "Post-Placement Voltage Island Generation" / Chapter 2.2.3 --- "Timing-Constrained and Voltage-Island-Aware Voltage Assignment" / Chapter 2.2.4 --- "Voltage Island Generation under Performance Requirement for SoC Designs" / Chapter 2.2.5 --- "An ILP Algorithm for Post-Floorplanning Voltage-Island Generation Considering Power-Network Planning" / Chapter 2.3 --- MSV at Floorplan/Placement Stage / Chapter 2.3.1 --- "Architecting Voltage Islands in Core-based System-on-a-Chip Designs" / Chapter 2.3.2 --- "Voltage Island Aware Floorplanning for Power and Timing Optimization" / Chapter 2.4 --- Summary / Chapter 3 --- MSV Driven Floorplanning / Chapter 3.1 --- Introduction / Chapter 3.2 --- Problem Formulation / Chapter 3.3 --- Algorithm Overview / Chapter 3.4 --- Optimal Island Partitioning and Voltage Assignment / Chapter 3.4.1 --- Voltage Islands in Non-subtrees / Chapter 3.4.2 --- Proof of Optimality / Chapter 3.4.3 --- Handling Island with Power Down Mode / Chapter 3.4.4 --- Speedup in Implementation and Complexity / Chapter 3.4.5 --- Varying Background Chip-level Voltage / Chapter 3.5 --- Simulated Annealing / Chapter 3.5.1 --- Moves / Chapter 3.5.2 --- Cost Function / Chapter 3.6 --- Experimental Results / Chapter 3.6.1 --- Extension to Minimize Level Shifters / Chapter 3.6.2 --- Extension to Consider Power Network Routing / Chapter 3.7 --- Summary / Chapter 4 --- MSV Driven Floorplanning with Timing / Chapter 4.1 --- Introduction / Chapter 4.2 --- Problem Formulation / Chapter 4.3 --- Algorithm Overview / Chapter 4.4 --- Voltage Assignment Problem / Chapter 4.4.1 --- Lagrangian Relaxation / Chapter 4.4.2 --- Transformation into the Primal Minimum Cost Flow Problem / Chapter 4.4.3 --- Cost-Scaling Algorithm / Chapter 4.4.4 --- Solution Transformation / Chapter 4.5 --- Simulated Annealing / Chapter 4.5.1 --- Moves / Chapter 4.5.2 --- Speeding up heuristic / Chapter 4.5.3 --- Cost Function / Chapter 4.5.4 --- Annealing Schedule / Chapter 4.6 --- Experimental Results / Chapter 4.7 --- Summary / Chapter 5 --- Conclusion / Bibliography
|
314 |
Predictive floorplanning with fixed outline constraint. January 2008
Leung, Chi Kwan. / Thesis submitted in: December 2007. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (leaves 66-68). / Abstracts in English and Chinese. / Abstract / Acknowledgement / Chapter 1 --- Introduction / Chapter 2 --- Literature Review on Fixed-outline Floorplanning / Chapter 2.1 --- General Floorplanning / Chapter 2.1.1 --- Simulated Annealing / Example - Normalized Polish Expression / Example - Sequence Pair Representation / Example - Corner Block List / Chapter 2.1.2 --- Genetic Algorithm / Chapter 2.1.3 --- Mixed Integer Linear Programming / Chapter 2.1.4 --- Geometric Programming / Chapter 2.1.5 --- Discussion / Advantages of using Simulated Annealing / Disadvantages of using Simulated Annealing / Chapter 2.2 --- Fixed-outline Floorplanning / Chapter 2.2.1 --- Motivation / Chapter 2.2.2 --- Dimension Based Cost Function / Chapter 2.2.3 --- Aspect Ratio Based Cost Function / Chapter 2.2.4 --- Evolutionary Search / Chapter 2.2.5 --- Instance Augmentation / Chapter 3 --- Predictive Rating with Fixed Outline Constraints / Chapter 3.1 --- Introduction / Chapter 3.2 --- Motivation / Chapter 3.3 --- Predictive Rating Scheme / Chapter 3.3.1 --- Area / Chapter 3.3.2 --- Dimensions / Chapter 3.3.3 --- Aspect Ratio / Chapter 3.3.4 --- Overall Equation for Predictive Rating / Chapter 3.4 --- Integration into the Floorplanner / Chapter 3.5 --- Experimental Results / Chapter 3.5.1 --- Accuracy of Predictive Rating / Chapter 3.5.2 --- Test One / Chapter 3.5.3 --- Test Two / Chapter 3.6 --- Conclusion / Chapter 4 --- Conclusion / Bibliography
|
315 |
Fixed-outline bus-driven floorplanning. January 2011
Jiang, Yan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2011. / Includes bibliographical references (p. 87-92). / Abstracts in English and Chinese. / Abstract / Acknowledgement / Chapter 1 --- Introduction / Chapter 1.1 --- Physical Design / Chapter 1.2 --- Floorplanning / Chapter 1.2.1 --- Floorplanning Objectives / Chapter 1.2.2 --- Common Approaches / Chapter 1.3 --- Motivations and Contributions / Chapter 1.4 --- Organization of the Thesis / Chapter 2 --- Literature Review on BDF / Chapter 2.1 --- Zero-Bend BDF / Chapter 2.1.1 --- BDF Using the Sequence-Pair Representation / Chapter 2.1.2 --- Using B*-Tree and Fast SA / Chapter 2.2 --- Two-Bend BDF / Chapter 2.3 --- TCG-Based Multi-Bend BDF / Chapter 2.3.1 --- Placement Constraints for Bus / Chapter 2.3.2 --- Bus Ordering / Chapter 2.4 --- Bus-Pin-Aware BDF / Chapter 2.5 --- Summary / Chapter 3 --- Fixed-Outline BDF / Chapter 3.1 --- Introduction / Chapter 3.2 --- Problem Formulation / Chapter 3.3 --- The Overview of Our Approach / Chapter 3.4 --- Partitioning / Chapter 3.4.1 --- The Overview of Partitioning / Chapter 3.4.2 --- Building a Hypergraph G / Chapter 3.5 --- Floorplanning with Bus Routing / Chapter 3.5.1 --- Find Bus Routes / Chapter 3.5.2 --- Realization of Bus Routes / Chapter 3.5.3 --- Details of the Annealing Process / Chapter 3.6 --- Handle Fixed-Outline Constraints / Chapter 3.7 --- Bus Layout / Chapter 3.8 --- Experimental Results / Chapter 3.9 --- Summary / Chapter 4 --- Fixed-Outline BDF with L-shape bus / Chapter 4.1 --- Introduction / Chapter 4.2 --- Problem Formulation / Chapter 4.3 --- Our Approach / Chapter 4.3.1 --- Bus Routability Checking / Chapter 4.3.2 --- Details of the Annealing Process / Chapter 4.4 --- Experimental Results / Chapter 4.5 --- Summary / Chapter 5 --- Conclusion / Bibliography
|
316 |
Clock routing for high performance microprocessor designs. January 2011
Tian, Haitong. / Chinese abstract is on unnumbered page. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2011. / Includes bibliographical references (p. 65-74). / Abstracts in English and Chinese. / Abstract / Acknowledgement / Chapter 1 --- Introduction / Chapter 1.1 --- Motivations / Chapter 1.2 --- Our Contributions / Chapter 1.3 --- Organization of the Thesis / Chapter 2 --- Background Study / Chapter 2.1 --- Traditional Clock Routing Problem / Chapter 2.2 --- Tree-Based Clock Routing Algorithms / Chapter 2.2.1 --- Clock Routing Using H-tree / Chapter 2.2.2 --- Method of Means and Medians (MMM) / Chapter 2.2.3 --- Geometric Matching Algorithm (GMA) / Chapter 2.2.4 --- Exact Zero-Skew Algorithm / Chapter 2.2.5 --- Deferred Merge Embedding (DME) / Chapter 2.2.6 --- Boundary Merging and Embedding (BME) Algorithm / Chapter 2.2.7 --- Planar Clock Routing Algorithm / Chapter 2.2.8 --- Useful-skew Tree Algorithm / Chapter 2.3 --- Non-Tree Clock Distribution Networks / Chapter 2.3.1 --- Grid (Mesh) Structure / Chapter 2.3.2 --- Spine Structure / Chapter 2.3.3 --- Hybrid Structure / Chapter 2.4 --- Post-grid Clock Routing Problem / Chapter 2.5 --- Limitations of the Previous Work / Chapter 3 --- Post-Grid Clock Routing Problem / Chapter 3.1 --- Introduction / Chapter 3.2 --- Problem Definition / Chapter 3.3 --- Our Approach / Chapter 3.3.1 --- Delay-driven Path Expansion Algorithm / Chapter 3.3.2 --- Pre-processing to Connect Critical ports / Chapter 3.3.3 --- Post-processing to Reduce Capacitance / Chapter 3.4 --- Experimental Results / Chapter 3.4.1 --- Experiment Setup / Chapter 3.4.2 --- Validations of the Delay and Slew Estimation / Chapter 3.4.3 --- Comparisons with the Tree Grow (TG) Approach / Chapter 3.4.4 --- Lowest Achievable Delays / Chapter 3.4.5 --- Simulation Results / Chapter 4 --- Non-tree Based Post-Grid Clock Routing Problem / Chapter 4.1 --- Introduction / Chapter 4.2 --- Handling Ports with Large Load Capacitances / Chapter 4.2.1 --- Problem Ports Identification / Chapter 4.2.2 --- Non-Tree Construction / Chapter 4.2.3 --- Wire Link Selection / Chapter 4.3 --- Path Expansion in Non-tree Algorithm / Chapter 4.4 --- Limitations of the Non-tree Algorithm / Chapter 4.5 --- Experimental Results / Chapter 4.5.1 --- Experiment Setup / Chapter 4.5.2 --- Validations of the Delay and Slew Estimation / Chapter 4.5.3 --- Lowest Achievable Delays / Chapter 4.5.4 --- Results on New Benchmarks / Chapter 4.5.5 --- Simulation Results / Chapter 5 --- Efficient Partitioning-based Extension / Chapter 5.1 --- Introduction / Chapter 5.2 --- Partition-based Extension / Chapter 5.3 --- Experimental Results / Chapter 5.3.1 --- Experiment Setup / Chapter 5.3.2 --- Running Time Improvement with Partitioning Technique / Chapter 6 --- Conclusion / Bibliography
|
317 |
Cosmologie et gravité des régions sphériques compensées / Cosmology and gravity of spherically compensated cosmic regions. Fromont, Paul de. 23 June 2017
This thesis is devoted to the study of the imprint of dark energy on the formation of large-scale structure in the Universe. I define spherically compensated cosmic regions as the large-scale environment around local extrema of the density field. For a central minimum, such a region can be identified with a standard cosmic void. Using numerical simulations, I show that these regions have characteristic shapes that depend on cosmology and that, once properly identified, they can be used to distinguish efficiently between competing cosmological models. I show that the average shape of their density profiles and their statistical properties can be computed analytically in the primordial Universe. Using an appropriate dynamical formalism, I show that the nonlinear evolution of these structures can be followed accurately up to the present day, which makes it possible to reconstruct the matter profiles observed today from first principles. I exhibit a fundamental property of these regions: one particular scale, the compensation radius, remains constant. Around this radius, the nonlinear evolution of the matter field can be derived analytically. By studying gravitational collapse in gravity models beyond General Relativity, I show that certain properties specific to these regions can efficiently constrain both cosmology and the nature of gravity. Besides giving a physically motivated model for the shape and statistical properties of matter profiles on very large scales, this work defines new cosmological probes that could be used to test the nature of our Universe.
|
318 |
Big Data : le nouvel enjeu de l'apprentissage à partir des données massives / Big Data: the new challenge of learning from massive data. Adjout Rehab, Moufida. 01 April 2016
In recent years we have witnessed tremendous growth in the volume of available data, driven in part by globalization and the continuous development of information technologies. The capacity to produce, store and process data has crossed such a threshold that a new term has been coined: Big Data. Handling these amounts of data requires new processing tools. Classical learning tools are poorly suited to this change of scale, both in computational complexity and in processing time: processing is usually centralized and sequential, so learning methods depend on the capacity of the single machine used, which motivates the need for scalable solutions. This thesis focuses on the problems that supervised learning faces on large volumes of data. Our objective is to design scalable versions of classical methods by distributing both processing and data, increasing the capacity of these approaches without harming their accuracy. In most existing algorithms and data structures there is a trade-off between efficiency, complexity and scalability; we explore recent techniques for distributed learning to overcome the limitations of current learning algorithms. Our contribution consists of two new supervised learning approaches for large-scale data. The first, named MLR-MR, scales Multiple Linear Regression by distributing the computation over a cluster of machines using parallel computing and a divide-and-conquer approach. It optimizes the processing and the computational load it induces without changing the underlying computation (QR factorization), so it yields the same coefficients as the classical method. The second, called "Bagging MR_PR_D" (Bagging based Map Reduce with Distributed PRuning), is a scalable approach to Bagging in which both the learning and the pruning of models are carried out in a distributed environment, with the aim of an algorithm that performs well and scales across all processing phases and therefore suits a wide range of applications. Both approaches were evaluated on a variety of regression datasets ranging from a few thousand to several million examples. The experimental results show that the proposed approaches, which distribute processing in a Cloud Computing environment, are competitive in predictive performance while significantly reducing training and prediction time.
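The claim that a divide-and-conquer fit can return the same coefficients as a centralized fit can be illustrated with a small sketch. The thesis itself is described as relying on QR factorization; the example below swaps in a simpler technique, summing per-partition normal-equation statistics (X^T X and X^T y), purely for illustration. All function names and the toy data are hypothetical and not taken from the thesis.

```python
from functools import reduce


def partial_stats(rows):
    """Map step: accumulate X^T X and X^T y for one partition.

    Each row is (features, target); a constant 1.0 is prepended for the intercept.
    """
    p = len(rows[0][0]) + 1
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, target in rows:
        x = [1.0] + list(features)
        for i in range(p):
            xty[i] += x[i] * target
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def merge_stats(a, b):
    """Reduce step: element-wise sum of two partitions' statistics."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(matrix, rhs):
    """Solve matrix @ beta = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (aug[i][n] - s) / aug[i][i]
    return beta


def fit(partitions):
    """Fit least-squares coefficients from any number of data partitions."""
    xtx, xty = reduce(merge_stats, map(partial_stats, partitions))
    return solve(xtx, xty)


# Hypothetical noiseless toy data: y = 2 + 3*x1 - x2
data = [((float(a), float(b)), 2.0 + 3.0 * a - b) for a in range(6) for b in range(5)]
partitions = [data[0:10], data[10:20], data[20:30]]

distributed = fit(partitions)  # statistics computed per partition, then merged
centralized = fit([data])      # the same statistics computed in one pass
```

Because the statistics are plain sums over rows, splitting the rows across workers changes only where the additions happen, not the result. Forming X^T X squares the condition number of the problem, which is one reason a QR-based formulation is often preferred in practice.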
|
319 |
Correlates of Mathematics Achievement in Developed and Developing Countries: An HLM Analysis of TIMSS 2003 Eighth-Grade Mathematics Scores. Phan, Ha T. 10 October 2008
Using eighth-grade mathematics scores from TIMSS 2003, a large-scale international achievement assessment database, this study investigated correlates of math achievement in two developed countries, Canada and the United States, and two developing countries, Egypt and South Africa. Variation in math achievement within and between schools in each country was accounted for by a series of two-level HLM models. Specifically, there were five sets of HLM models representing factors related to student background, home resources, instructional practices, teacher background, and school background. In addition, a final model was built by including all the statistically significant predictors from the earlier models. The findings suggest that the instructional practices model worked best for the United States and the teacher background model was the most efficient and parsimonious model for predicting math achievement in Egypt, whereas the final model was the best model for predicting math achievement in Canada and South Africa. These findings provide empirical evidence that different models are needed to account for factors related to achievement in different countries. This study therefore highlights that policy makers and educators in developing countries should not base their educational decisions and reform projects solely on research findings from developed countries. Rather, they need to use country-specific findings to support their educational decisions. This study also provides a methodological framework for applied researchers to evaluate the effects of background and contextual factors on students' math achievement.
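For readers unfamiliar with the notation, a generic two-level HLM with students nested in schools can be written as below. This is the standard textbook form, not the author's exact specification; the predictor symbols X (student level) and W (school level) are placeholders, and the block assumes the amsmath package.

```latex
\begin{align}
\text{Level 1 (student $i$ in school $j$):}\quad
  & Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij},
  & r_{ij} &\sim N(0, \sigma^2) \\
\text{Level 2 (school $j$):}\quad
  & \beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j},
  & u_{0j} &\sim N(0, \tau_{00}) \\
  & \beta_{1j} = \gamma_{10} + u_{1j} \\
\text{Intraclass correlation (no predictors):}\quad
  & \rho = \frac{\tau_{00}}{\tau_{00} + \sigma^2}
\end{align}
```

In the model without predictors, the intraclass correlation gives the share of total variance in achievement that lies between schools rather than within them.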
|
320 |
Prediction of DNA-Binding Proteins and their Binding Sites. Pokhrel, Pujan. 01 May 2018
DNA-binding proteins play an important role in essential biological processes such as DNA replication, recombination, repair, gene transcription, and expression. Identifying DNA-binding proteins and the residues involved in the contacts is important for understanding the DNA-binding mechanism in proteins. Moreover, it has been reported in the literature that mutations of some DNA-binding residues are associated with disease. Identifying these proteins and their binding mechanisms generally requires experimental techniques, which makes large-scale study extremely difficult. Predicting DNA-binding proteins and their binding sites from sequence alone is therefore one of the most challenging problems in genome annotation. Since the start of the Human Genome Project, many attempts have been made to solve the problem with different approaches, but the accuracy of these methods is still not sufficient for large-scale annotation of proteins. Rather than relying solely on existing machine learning techniques, I combined them using a novel stacking technique and used problem-specific architectures to achieve better accuracy than existing methods. This thesis presents a possible solution to the DNA-binding protein prediction problem that performs better than state-of-the-art approaches.
|