621 |
Modelagem da produtividade da cultura da cana de açúcar por meio do uso de técnicas de mineração de dados / Modeling sugarcane yield through Data Mining techniques. Hammer, Ralph Guenther, 27 July 2016
Understanding the hierarchy of importance of the factors which influence sugarcane yield can support its modeling, thus contributing to the optimization of agricultural planning and crop yield estimates. The objectives of this study were to rank the variables which condition sugarcane yield according to their relative importance, and to develop mathematical models for predicting sugarcane yield. For this, three Data Mining techniques were applied in the analyses of databases from several sugar mills in the State of São Paulo, Brazil. Meteorological and crop management variables were analyzed with the Data Mining techniques Random Forest, Boosting and Support Vector Machines, and the resulting models were tested by comparison with an independent data set, using the coefficient of correlation (r), the Willmott index (d), the Camargo confidence index (c), the mean absolute error (MAE), and the root mean square error (RMSE). Finally, the predictive performance of these models was compared with that of an agrometeorological model applied to the same data set. The results showed that, among all the variables analyzed, the number of cuts was the most important factor in all Data Mining models. The comparison between the observed yields and those estimated by the Data Mining techniques resulted in an RMSE ranging from 19.70 to 20.03 t ha-1 in the general approach, which considered all regions of the database. The predictive performance of the Data Mining algorithms was thus superior to that of the agrometeorological model, whose RMSE was ≈ 70% higher (≈ 34 t ha-1).
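To make the comparison criteria concrete, the sketch below computes the five statistics named above for a pair of observed and predicted yield series. It is a minimal illustration, not the study's own code; Willmott's d and the Camargo confidence index (c = r * d) follow their standard definitions.

```python
import numpy as np

def agreement_stats(obs, pred):
    """Statistics used above to test the yield models (illustrative sketch)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    r = np.corrcoef(obs, pred)[0, 1]                       # coefficient of correlation
    d = 1.0 - np.sum((pred - obs) ** 2) / np.sum(
        (np.abs(pred - obs.mean()) + np.abs(obs - obs.mean())) ** 2)  # Willmott index
    c = r * d                                              # Camargo confidence index
    mae = np.mean(np.abs(pred - obs))                      # mean absolute error (t/ha)
    rmse = np.sqrt(np.mean((pred - obs) ** 2))             # root mean square error (t/ha)
    return {"r": r, "d": d, "c": c, "MAE": mae, "RMSE": rmse}
```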
|
622 |
Uma metodologia de projetos para circuitos com reconfiguração dinâmica de hardware aplicada a support vector machines. / A design methodology for circuits with dynamic reconfiguration of hardware applied to support vector machines. Gonzalez, José Artur Quilici, 07 November 2006
Systems based on general-purpose processors are characterized by flexibility to design changes, but with computational performance below that of optimized dedicated circuits. The implementation of algorithms in reconfigurable devices, known as Field Programmable Gate Arrays (FPGAs), offers a trade-off between the processor's flexibility and the dedicated circuit's performance: FPGAs allow their hardware resources to be configured by software, with a smaller granularity than that of the general-purpose processor and greater flexibility than that of dedicated circuits. Current versions of FPGAs present a reconfiguration time small enough to make dynamic reconfiguration feasible, i.e., even while the device is executing an algorithm, the way its resources are arranged can be modified, offering the possibility of temporally partitioning an algorithm. New lines of FPGAs are already manufactured with the option of partial dynamic reconfiguration, i.e., it is possible to reconfigure selected areas of an FPGA at any time while the remaining area continues in operation. However, for this new technology to become widely adopted, a proper design methodology is needed, one that offers efficient solutions to the new challenges of digital design. In particular, one of the main difficulties presented by this approach concerns how to partition the algorithm so as to minimize the time necessary to complete its task. This manuscript offers a design methodology for dynamically reconfigurable devices, with an emphasis on the problem of the temporal partitioning of circuits, having as a target application a family of algorithms used mainly in Bioinformatics, represented by the binary classifier known as the Support Vector Machine. Some techniques of functional partitioning for Dynamically Reconfigurable FPGAs, specifically applicable to the partitioning of FSMs, were developed to guarantee that a control-flow-dominated design can be mapped onto a single FPGA without modifying its functionality.
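As a rough illustration of temporal partitioning, the sketch below greedily groups the states of an FSM into reconfiguration contexts that each fit a resource budget; each context would then be loaded by one partial reconfiguration. The cost model, the state ordering, and the budget are assumptions for illustration only, not the partitioning techniques developed in the thesis.

```python
def partition_fsm(states, cost, budget):
    """Greedily pack FSM states into contexts that fit an FPGA budget.

    states: state ids in a traversal order (assumed given, e.g. by control flow);
    cost:   dict mapping state id -> estimated resource units (an assumption);
    budget: resource units available per reconfiguration context.
    """
    contexts, current, used = [], [], 0
    for s in states:
        if used + cost[s] > budget and current:
            contexts.append(current)        # close the full context
            current, used = [], 0
        current.append(s)
        used += cost[s]
    if current:
        contexts.append(current)
    return contexts  # one partial reconfiguration per context
```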
|
624 |
Aedes aegypti à la Martinique : écologie et transmission des arbovirus / Aedes aegypti in Martinique : ecology and transmission of arboviruses. Farraudière, Laurence, 28 November 2016
The Aedes aegypti mosquito is a major public health problem because it is the main vector of the Yellow Fever, Dengue, Chikungunya and Zika viruses worldwide. In the Americas, the mosquito was introduced from Africa during the seventeenth century. In Martinique, between 2013 and 2016, the species actively transmitted the Dengue, Chikungunya and Zika viruses, placing the island in an epidemic situation. On the island, Dengue has been endemic for nearly 20 years, with 7 major epidemics (1995, 1997, 2001, 2005, 2007, 2010, 2013); the 2010 epidemic affected almost 40,000 people and caused 18 deaths. The Chikungunya virus was introduced in December 2013, and the epidemic that lasted until January 2015 affected 72,520 people (including 83 deaths). The Zika virus was introduced in December 2015 and the epidemic lasted throughout 2016 (36,000 cases estimated as of September 30th). It is in this context of active arbovirus circulation, and with the aim of improving current knowledge of the bioecology of the vector Ae. aegypti, that this work on "Aedes aegypti in Martinique : ecology and transmission of arboviruses" was initiated. Detection and characterization of the Dengue and Chikungunya viruses in natural populations of the mosquito first confirmed the role of Ae. aegypti as the main vector of these arboviruses on the island. These results make it possible to consider establishing entomo-virological monitoring as part of the surveillance of viruses circulating on the island; such monitoring can have operational applications, such as the control of emerging outbreaks. Studies of the mosquito's larval ecology were then initiated. Physicochemical characterization of breeding-site waters, measurement of deltamethrin deposits after spatial spraying, and assessment of their impacts on larval development and mosquito life-history traits confirmed that insecticide resistance in local populations of Ae. aegypti is an obstacle to the vector control strategy, insofar as the mosquito's development was not affected. Locally, these combined studies are intended to supplement data and knowledge about the mosquito, with a view to more efficient management of the sanitary and epidemiological risks associated with it.
|
625 |
SVM-Based Negative Data Mining to Binary Classification. Jiang, Fuhua, 03 August 2006
The properties of a training data set, such as its size, distribution, and number of attributes, significantly contribute to the generalization error of a learning machine. A poorly distributed data set is prone to lead to a partially overfitted model. Two approaches proposed in this dissertation for binary classification enhance useful data information by mining negative data. First, an error-driven compensating hypothesis approach is based on Support Vector Machines (SVMs) with (1+k)-iteration learning, where the base learning hypothesis is iteratively compensated k times. This approach produces a new hypothesis on a new data set in which each label is a transformation of the label from the negative data set, further producing positive and negative child data subsets in subsequent iterations. This procedure refines the base hypothesis by the k child hypotheses created in k iterations. A prediction method is also proposed to trace the relationship between the negative subsets and the testing data set by a vector similarity technique. Second, a statistical negative-example learning approach based on theoretical analysis improves the performance of the base learning algorithm learner by creating one or two additional hypotheses, audit and booster, to mine the negative examples output by learner. The learner employs a regular Support Vector Machine to classify the main examples and recognize which examples are negative. The audit works on the negative training data created by learner to predict whether an instance is negative. The boosting learner booster is applied when audit is not accurate enough to judge learner correctly: booster works on the training data subsets on which learner and audit do not agree. The classifier used for testing is the combination of learner, audit and booster: for a specific instance, it returns learner's result if audit acknowledges learner's result or learner agrees with audit's judgment, and otherwise returns booster's result. The error of the classifier is decreased to O(e^2), compared to the error O(e) of the base learning algorithm.
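The combination rule for testing, as described above, can be sketched as follows. The three classifiers and their training are placeholders; how the negative and disagreement subsets are actually built is specific to the dissertation and is not reproduced here.

```python
from sklearn.svm import SVC

# Placeholder models; the dissertation's training of audit on learner's
# negative examples, and of booster on the learner/audit disagreement
# subset, is assumed to have been done already.
learner, audit, booster = SVC(), SVC(), SVC()

def classify(x):
    """Return learner's label when audit confirms it, else booster's label.

    x: a 1-D numpy feature vector for one instance.
    """
    x = x.reshape(1, -1)
    y_learner = learner.predict(x)[0]
    y_audit = audit.predict(x)[0]
    if y_learner == y_audit:          # audit acknowledges learner's result
        return y_learner
    return booster.predict(x)[0]      # otherwise defer to booster
```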
|
626 |
Employing Multiple Kernel Support Vector Machines for Counterfeit Banknote Recognition. Su, Wen-pin, 29 July 2008
Finding an efficient method to detect counterfeit banknotes is imperative. In this study, we propose a multiple kernel weighted support vector machine for counterfeit banknote recognition. A variation of SVM that optimizes the false alarm rate, called FARSVM, is proposed to minimize the false negative rate and false positive rate. Each banknote is divided into m × n partitions, and each partition comes with its own kernels. The optimal weight of each kernel matrix in the combination is obtained through the semidefinite programming (SDP) learning method. The amount of time and space required by the original SDP is very demanding. We focus on this framework and adopt two strategies to reduce the time and space requirements: the first is to assume the non-negativity of the kernel weights, and the second is to set the sum of the weights equal to 1. Experimental results show that regions with zero kernel weights are easy to imitate with today's digital imaging technology, while regions with nonzero kernel weights are difficult to imitate. In addition, these results show that the proposed approach outperforms the single kernel SVM and the standard SVM with SDP on Taiwanese banknotes.
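A sketch of the weighted kernel combination is given below: one kernel per banknote partition, combined as K = sum_i w_i * K_i with w_i >= 0 and sum(w_i) = 1, then fed to an SVM with a precomputed kernel. The RBF choice and the fixed weights are illustrative assumptions; the study learns the weights by SDP, which is not shown.

```python
import numpy as np
from sklearn.svm import SVC

def rbf_kernel(X, Y, gamma=0.5):
    """Plain RBF kernel matrix between two feature sets."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def combined_kernel(parts_X, parts_Y, w):
    """K = sum_i w_i * K_i over per-partition kernels, with the two
    strategies above: w_i >= 0 and sum(w_i) = 1."""
    w = np.asarray(w, float)
    assert np.all(w >= 0) and np.isclose(w.sum(), 1.0)
    return sum(wi * rbf_kernel(Xi, Yi)
               for wi, Xi, Yi in zip(w, parts_X, parts_Y))

# Usage sketch: parts_train is a list of (n_samples, n_features) arrays,
# one per banknote partition, and w would come from the SDP step:
# clf = SVC(kernel="precomputed").fit(combined_kernel(parts_train, parts_train, w), y)
```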
|
627 |
The Interrelationships among Stock Returns and Institutional Investors' Buy-sell Difference in Taiwan's Stock Market: An Empirical Analysis. Hsueh, Lung-chin, 28 August 2009
This study investigates the long-term and short-term dynamic relationships among stock returns and institutional investors' buy-sell differences in Taiwan's stock market for the sample period from January 2000 through May 2009. Several econometric methodologies are used, including the unit root test, the vector autoregressive (VAR) model, the cointegration test, the vector error correction model (VECM), and impulse response functions.
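Before turning to the results, a minimal sketch of this workflow with statsmodels is shown below, assuming a DataFrame whose columns are stock returns (Ry) and the three institutions' buy-sell differences (QFII, FUND, DLR); the file name and column names are illustrative, not the study's own data.

```python
import pandas as pd
from statsmodels.tsa.vector_ar.vecm import VECM, coint_johansen

# Illustrative data layout: stock returns and the three institutions'
# buy-sell differences, Jan. 2000 - May 2009.
df = pd.read_csv("taiwan_market.csv")[["Ry", "QFII", "FUND", "DLR"]]

johansen = coint_johansen(df, det_order=0, k_ar_diff=1)
print(johansen.lr1)   # trace statistics, compared against johansen.cvt

vecm = VECM(df, k_ar_diff=1, coint_rank=1).fit()  # one cointegrating relation
print(vecm.beta)      # long-run equilibrium coefficients (cf. the Ry equation below)
print(vecm.alpha)     # short-run error-correction adjustment terms
```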
The major empirical results are shown as follows:
1. Cointegration test
For the sample period, one long-term equilibrium relationship is found from Johansen's cointegration test, significant at the 5% level, between stock returns and the buy-sell differences of the foreign investment institutions, the domestic investment institutions, and the dealers. The long-term equilibrium relationship is Ry = 1.65*QFII + 4.28*FUND + 35.22*DLR - 1142.6.
2. VECM estimation
(1) Applying the VECM to the sample period, the findings indicate that, among the short-term dynamic relationships, changes in stock returns are not influenced by changes in institutional investors' buy-sell differences, but only by their own one-period lag.
(2) Among the short-term dynamic relationships, changes in the foreign investment institutions' buy-sell difference are positively affected by the one-period lag of the dealers, and inversely affected by their own one-period lag and by that of the domestic investment institutions. They are also positively affected by the one-period lag of the long-term equilibrium, which indicates that the foreign investment institutions follow positive feedback trading strategies.
(3) Changes in the domestic investment institutions' buy-sell difference are only affected by their own one-period lag among the short-term dynamic relationships.
(4) Changes in the dealers' buy-sell difference are positively affected, among the short-term dynamic relationships, by the one-period lag of the foreign investment institutions. As for the long-term relationship, they are affected by the one-period lag of the long-term equilibrium, which also indicates that the dealers follow positive feedback trading strategies.
(5) The foreign investment institutions and the dealers have a mutual feedback relationship.
|
628 |
Support vector classification analysis of resting state functional connectivity fMRI. Craddock, Richard Cameron, 17 November 2009
Since its discovery in 1995, resting state functional connectivity derived from functional MRI data has become a popular neuroimaging method for studying psychiatric disorders. Current methods for analyzing resting state functional connectivity in disease involve thousands of univariate tests and the specification of regions of interest to employ in the analysis. There are several drawbacks to these methods. First, the mass univariate tests employed are insensitive to the information present in distributed networks of functional connectivity. Second, the null hypothesis testing employed to select functional connectivity differences between groups does not evaluate the predictive power of the identified functional connectivities. Third, the specification of regions of interest is confounded by experimenter bias in terms of which regions should be modeled, and by experimental error in terms of the size and location of these regions. The objective of this dissertation is to improve the methods for functional connectivity analysis using multivariate predictive modeling, feature selection, and whole brain parcellation.
A method of applying support vector classification (SVC) to resting state functional connectivity data was developed in the context of a neuroimaging study of depression. The interpretability of the obtained classifier was optimized using feature selection techniques that incorporate reliability information. The problem of selecting regions of interest for whole brain functional connectivity analysis was addressed by clustering whole brain functional connectivity data to parcellate the brain into contiguous, functionally homogeneous regions. This newly developed framework was applied to derive a classifier capable of correctly separating the functional connectivity patterns of patients with depression from those of healthy controls 90% of the time. The features most relevant to the obtained classifier match those identified in previous studies, but also include several regions not previously implicated in the functional networks underlying depression.
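A minimal sketch of the core classification step is given below, assuming `subject_ts` is a list of (time × region) arrays of parcellated resting-state time series and `labels` marks patients versus controls; both names are hypothetical, and the reliability-based feature selection and the parcellation step itself are not shown.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def connectivity_features(ts):
    """Vectorize the upper triangle of the region-by-region correlation matrix."""
    corr = np.corrcoef(ts.T)               # functional connectivity matrix
    iu = np.triu_indices_from(corr, k=1)   # each unique region pair once
    return corr[iu]

# subject_ts: list of (time x region) arrays; labels: 1 = patient, 0 = control
X = np.array([connectivity_features(ts) for ts in subject_ts])
acc = cross_val_score(SVC(kernel="linear"), X, labels, cv=10).mean()
print(f"cross-validated accuracy: {acc:.2%}")
```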
|
629 |
Vad styr företagens investeringar? En studie om hur förändringar i reporänta, makroekonomiska faktorer samt finansiella indikatorer påverkar investeringar hos svenska företag / What determines investments of firms? A study on how changes in the repo rate, macroeconomic factors and financial indicators affect investments of Swedish firms. Jansson, Emelie; Kapple, Linda, January 2015
Background: In November 2014, the central bank of Sweden (the Riksbank) decided to set the repo rate close to zero, and in February 2015 it lowered the repo rate further, to -0.10 percent. For the first time, Sweden had a negative repo rate. According to macroeconomic theory, a decrease in the repo rate is performed to stimulate an economy's investments and consumption. Whether or not a decrease in interest rates gives firms greater incentives to invest is a topical subject and an important field of research. In addition, the existing research on the Swedish market is insufficient in this field, which gives us further motive to conduct this study.
Aim: The purpose of this study is to examine and analyse how changes in the repo rate, macroeconomic factors and financial indicators affect investments of Swedish firms.
Completion: The study is conducted with a quantitative approach. A Vector Autoregressive (VAR) model is created in order to examine the impact of changes in the repo rate, the macroeconomic factors and the financial indicators on firms' investments. Impulse response functions are estimated to allow further analysis of these effects; hence it is possible to examine how one isolated unit increase in a specific variable affects firms' investments through several time periods. Furthermore, we estimate three models: one which includes both macroeconomic variables and financial indicators, another which excludes the financial indicators, and a last model which reflects the repo rate's impact on investments in two separate time periods.
Result: Investments of firms are affected by numerous factors. A unit increase of the lending rate, the exchange rate and firms' inflation expectations exhibits a negative relation to investments, while a unit increase in GDP growth tends to increase investments. The repo rate has no impact on investments in the first two models. In spite of this, evidence from the third model indicates that the repo rate has a negative impact on investments during the first period.
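A sketch of the VAR and impulse-response workflow with statsmodels follows; the file and series names are placeholders for the study's actual variables, not its data.

```python
import pandas as pd
from statsmodels.tsa.api import VAR

# Placeholder column names standing in for the study's series.
data = pd.read_csv("sweden_macro.csv",
                   usecols=["investment", "repo_rate", "lending_rate",
                            "exchange_rate", "gdp_growth", "inflation_exp"])

results = VAR(data).fit(maxlags=4, ic="aic")   # lag length by information criterion
irf = results.irf(periods=10)                  # impulse response functions over 10 periods
irf.plot(impulse="repo_rate", response="investment")  # investment's response to a repo-rate shock
```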
|
630 |
Αναγνώριση γονιδιακών εκφράσεων νεοπλασιών σε microarrays / Identification of tumor gene expression from microarrays. Τσακανίκας, Παναγιώτης, 16 May 2007
The substantial growth of molecular pathology in recent years is intertwined with the development of microarray technology. This technology provides a new, high-throughput means of access such that: i. large-scale analysis of the abundance of messenger RNA (mRNA), as an indicator of gene expression, becomes possible (cDNA arrays); ii. polymorphisms or mutations within a population of genes can be detected using single nucleotide polymorphisms (SNP arrays); iii. the "loss" or "gain" of, or changes in the copy number of, a specific disease-related gene can be examined (CGH arrays). Microarray technology is likely to become a cornerstone of molecular research in the coming years, because DNA microarrays are used for the quantitative determination of tens of thousands of DNA or RNA sequences in a single analysis (experiment). From a series of such experiments, it is possible to determine the mechanisms that control the activation of genes in an organism. Moreover, the use of microarrays for surveying gene expression is a rapidly developing technology which has moved from specialized to mainstream biology laboratories. In this thesis we mainly address the two general stages of the analysis of the microarray images produced by such experiments, with the goal of extracting information from them: i. image processing and extraction of information from the images; ii. analysis of the resulting information and identification of gene expressions. For the first stage, we present the basic methods currently used by commercial and academic software packages, which give the best results so far. For the second stage, after reviewing the most important methods in use, we implement our own method and compare it with the existing ones.
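As an illustration of the quantification step that follows image processing, the sketch below turns background-corrected red/green spot intensities into the standard log-ratio (M) and mean log-intensity (A) values for a two-channel cDNA array; the variable names are illustrative, and this is not the method implemented in the thesis.

```python
import numpy as np

def expression_values(r_fg, r_bg, g_fg, g_bg):
    """Per-spot expression measures from a two-channel cDNA array scan."""
    R = np.maximum(r_fg - r_bg, 1.0)   # background-corrected Cy5 (red) intensity
    G = np.maximum(g_fg - g_bg, 1.0)   # background-corrected Cy3 (green) intensity
    M = np.log2(R / G)                 # log-ratio: differential expression per spot
    A = 0.5 * np.log2(R * G)           # average log intensity per spot
    return M, A
```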
|