21 |
Identifying Regulatory Patterns at the 3'end Regions of Over-expressed and Under-expressed Genes
Othoum, Ghofran K. 05 1900
Promoters, together with neighboring regulatory regions and those extending further upstream of the 5’end of genes, are considered one of the main components affecting the expression status of genes in a specific phenotype. More recently, research by Chen et al. (2006, 2012) and Mapendano et al. (2010) demonstrated that the 3’end regulatory regions of genes also influence gene expression. However, the association between the regulatory regions surrounding the 3’end of genes and their over- or under-expression status in a particular phenotype has not been systematically studied. The aim of this study is to ascertain whether regulatory regions surrounding the 3’end of genes contain sufficient regulatory information to correlate genes with their expression status in a particular phenotype. Over- and under-expressed ovarian cancer (OC) genes were used as a model. Exploratory analysis of the 3’end regions was performed by transforming the annotated regions using principal component analysis (PCA) and then clustering the transformed data, which achieved a clear separation of genes with different expression status. Additionally, several classification algorithms, such as Naïve Bayes, Random Forest and Support Vector Machine (SVM), were tested with different parameter settings to analyze the discriminatory capacity of the 3’end regions of genes with respect to their expression status. The best performance was achieved using the SVM classification model with 10-fold cross-validation, which yielded an accuracy of 98.4%, sensitivity of 99.5% and specificity of 92.5%. To predict gene expression status for newly available instances, based on information derived from the 3’end regions, an SVM predictive model was developed with 10-fold cross-validation that yielded an accuracy of 67.0%, sensitivity of 73.2% and specificity of 61.0%. Moreover, fitting an SVM with a polynomial kernel to PCA-transformed data yielded an accuracy of 83.1%, sensitivity of 92.5% and specificity of 74.8%, using 10-fold cross-validation for evaluation. These clustering and classification analyses strongly suggest that the regions surrounding the 3’end of genes contain sufficiently rich regulatory information to discriminate between over- and under-expressed genes, at least in the case of genes implicated in OC.
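The accuracy, sensitivity and specificity figures quoted above follow from the standard confusion-matrix definitions. The sketch below is only an illustration of those definitions and of a plain 10-fold split; the function names, the contiguous fold layout and the choice of `1` as the positive (over-expressed) label are assumptions, not details taken from the thesis.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds over n items."""
    # The first n % k folds get one extra item so that all n items are covered.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Accuracy, sensitivity (true-positive rate) and specificity (true-negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against classes that are absent from the evaluated fold.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }
```

In a cross-validated evaluation, a classifier would be fitted on each `train` split and `binary_metrics` applied to its predictions on the matching `test` split, with the per-fold values then pooled or averaged.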
|
22 |
Toward Better Website Usage: Leveraging Data Mining Techniques and Rough Set Learning to Construct Better-to-use Websites
Khasawneh, Natheer Yousef 23 September 2005
No description available.
|
23 |
Bayesian Infinite Mixture Models for Gene Clustering and Simultaneous Context Selection Using High-Throughput Gene Expression Data
Freudenberg, Johannes M. January 2009
No description available.
|
24 |
Approaches to Find the Functionally Related Experiments Based on Enrichment Scores: Infinite Mixture Model Based Cluster Analysis for Gene Expression Data
Li, Qian 18 October 2013
No description available.
|
25 |
ANALYSIS OF LIQUID POOLING DURING LATE-STAGE SOLIDIFICATION
Ashraf, Rameez 10 1900
Grain structure and secondary phases play a critical role in determining the mechanical properties of industrial alloys. The spatial variation of such phases is very closely correlated with the liquid pooling established during late-stage solidification and grain boundary coalescence. Obtaining a theory that correlates the evolution of length scales during grain boundary coalescence is a critical step toward the optimization of commercial alloys. This thesis highlights various phenomena that enter such a theory. They include coarsening and coalescence of dendrites, nucleation mechanisms and changes in the composition of the inter-dendritic liquid, where second phases tend to form initially. Quantitative phase field models of solidification, used to simulate casting conditions and microstructure evolution, are combined with characterization techniques to illustrate the connection between the number, size, and distribution of liquid pools. Characterization techniques include spectral analysis and clustering analysis by way of the Hoshen-Kopelman algorithm. By characterizing late-stage liquid pools, this thesis aims to be a first step towards developing a statistical scaling theory of the length scale of liquid pooling. / Master of Applied Science (MASc)
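As a rough illustration of the clustering step mentioned above, the following is a minimal sketch of Hoshen-Kopelman labeling on a 2D binary grid, where `1` is assumed to mark a liquid cell and `0` a solid cell. The grid representation, the 4-neighbor connectivity and the function name are assumptions made for the example; the thesis itself may use a different lattice or connectivity.

```python
from typing import Dict, List, Tuple


def hoshen_kopelman(grid: List[List[int]]) -> Tuple[List[List[int]], Dict[int, int]]:
    """Label 4-connected clusters of nonzero cells; return (labels, cluster sizes)."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # union-find forest over provisional labels; index 0 is unused

    def find(x: int) -> int:
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # First pass: raster scan, looking only at the upper and left neighbors.
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if not up and not left:
                parent.append(len(parent))       # start a new provisional label
                labels[r][c] = len(parent) - 1
            elif up and left:
                root_up, root_left = find(up), find(left)
                root = min(root_up, root_left)
                parent[max(root_up, root_left)] = root  # merge the two clusters
                labels[r][c] = root
            else:
                labels[r][c] = find(up or left)

    # Second pass: replace provisional labels by compact final labels and count sizes.
    remap: Dict[int, int] = {}
    sizes: Dict[int, int] = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                final = remap.setdefault(find(labels[r][c]), len(remap) + 1)
                labels[r][c] = final
                sizes[final] = sizes.get(final, 0) + 1
    return labels, sizes
```

The returned size dictionary gives the number of cells per liquid pool, from which a count and a size distribution of pools can be tabulated.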
|
26 |
Enhancing PV Hosting Capacity of Distribution Feeders using Voltage Profile Design
Jain, Akshay Kumar 06 March 2018
Distribution feeders form the last leg of the bulk power system and have the responsibility of providing reliable power to the customers. These feeders experience voltage drops due to a combination of feeder length, load distribution, and other factors. Traditionally, voltage drop was a major concern. Now, due to an ever-increasing PV penetration, overvoltage has also become a major concern. This limits the amount of solar PV that may be integrated.
A few solutions exist to improve the voltage profile, the most common being the use of voltage control devices such as shunt capacitors and voltage regulators. Because a large number of design parameters must be considered, determining the numbers and locations of these devices is a challenging problem. Significant research has been done to address this problem, utilizing a wide array of optimization techniques. However, many utilities still determine these locations and numbers manually, because most algorithms have not been adequately validated. This thesis presents the validation of a voltage profile design (VPD) algorithm. The validation was carried out on a set of statistically relevant feeders, chosen based on the results of a feeder taxonomy study using clustering analysis. The algorithm was found to be effective in enhancing the amount of solar PV a feeder may host while still maintaining all voltages within the ANSI standard limits. Furthermore, the methodology adopted here may also be used for the validation of other algorithms. / Master of Science / Utilities have the responsibility of providing a reliable power supply to their customers. Traditionally, bulk power was generated and transmitted over long distances, incurring losses and voltage drops along the way. Now, with the integration of distributed energy resources, particularly solar photovoltaic (PV) generators at customer locations, overvoltage has also become a problem. This requires the adoption of measures that can help maintain voltages within standard limits.
Several options exist to compensate for these voltage issues, the most commonly used being voltage control devices such as capacitor banks and voltage regulators. However, determining the required numbers of these devices and their appropriate locations is a challenging problem. Even though a number of algorithms have been proposed to give automated solutions to this problem, most utilities still use a manual approach. This is because these algorithms have not been validated on a statistically relevant set of feeders. To address this issue, the validation of a voltage profile design (VPD) algorithm is presented in this thesis. The ability of this algorithm to enhance the amount of PV that may be connected to a distribution network has been validated on a set of feeders. The feeders were chosen based on the results obtained from clustering analysis, a machine learning technique. The cost effectiveness of this algorithm has also been investigated, and significant savings were observed. Furthermore, the methodology adopted here can easily be extended to the validation of other algorithms.
|
27 |
Locating Potential Aspect Interference Using Clustering Analysis
Bennett, Brian Todd 01 May 2015
Software design continues to evolve from the structured programming paradigm of the 1970s and 1980s and the object-oriented programming (OOP) paradigm of the 1980s and 1990s. The functional decomposition design methodology used in these paradigms reduced the prominence of non-functional requirements, which resulted in scattered and tangled code to address non-functional elements. Aspect-oriented programming (AOP) allowed the removal of crosscutting concerns scattered throughout class code into single modules known as aspects. Aspectization resulted in increased modularity in class code, but introduced new types of problems that did not exist in OOP. One such problem was aspect interference, in which aspects meddled with the data flow or control flow of a program. Research has developed various solutions for detecting and addressing aspect interference using formal design and specification methods, and by programming techniques that specify aspect precedence. Such explicit specifications required practitioners to have a complete understanding of possible aspect interference in an AOP system under development. However, as system size increased, understanding of possible aspect interference could decrease. Therefore, practitioners needed a way to increase their understanding of possible aspect interference within a program. This study used clustering analysis to locate potential aspect interference within an aspect-oriented program under development, using k-means partitional clustering. Vector space models, using two newly defined metrics, interference potential (IP) and interference causality potential (ICP), and an existing metric, coupling on advice execution (CAE), provided input to the clustering algorithms. Resulting clusters were analyzed via an internal strategy using the R-Squared, Dunn, Davies-Bouldin, and SD indexes. The process was evaluated on both a smaller scale AOP system (AspectTetris), and a larger scale AOP system (AJHotDraw). 
By seeding potential interference problems into these programs and comparing results using visualizations, this study found that clustering analysis provided a viable way for detecting interference problems in aspect-oriented software. The ICP model was best at detecting interference problems, while the IP model produced results that were more sporadic. The CAE clustering models were not effective in pinpointing potential aspect interference problems. This was the first known study to use clustering analysis techniques specifically for locating aspect interference.
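Two of the internal validation indexes named above, Dunn and Davies-Bouldin, can be sketched for a clustering that has already been computed. The example assumes each cluster is a list of numeric vectors (for instance, hypothetical per-aspect IP, ICP or CAE values), uses Euclidean distance, and takes one common variant of the Dunn index (minimum point-to-point separation over maximum cluster diameter); the study may have used other variants.

```python
import math
from itertools import combinations
from typing import List, Sequence

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Component-wise mean of a non-empty list of points."""
    return [sum(axis) / len(points) for axis in zip(*points)]


def dunn_index(clusters: List[List[Point]]) -> float:
    """Smallest between-cluster distance divided by largest cluster diameter (higher is better)."""
    separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    diameter = max(
        (math.dist(p, q) for cl in clusters for p, q in combinations(cl, 2)),
        default=0.0,
    )
    return separation / diameter if diameter else float("inf")


def davies_bouldin_index(clusters: List[List[Point]]) -> float:
    """Mean, over clusters, of the worst scatter-to-separation ratio (lower is better)."""
    centers = [centroid(cl) for cl in clusters]
    # Scatter of a cluster: mean distance of its points to its centroid.
    scatter = [
        sum(math.dist(p, c) for p in cl) / len(cl)
        for cl, c in zip(clusters, centers)
    ]
    worst = []
    for i in range(len(clusters)):
        worst.append(max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(len(clusters)) if j != i
        ))
    return sum(worst) / len(worst)
```

Both functions expect at least two clusters with distinct centroids; compact, well-separated clusters give a high Dunn value and a low Davies-Bouldin value.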
|
28 |
Estabilidade em análise de agrupamento via modelo AMMI com reamostragem "bootstrap" / Stability in clustering analysis through the AMMI methodology with bootstrap
Godoi, Débora Robert de 11 October 2013
The objective of this work is to propose a new methodology for interpreting the stability of clustering methods for vegetation data, using the AMMI methodology and bootstrap resampling, in order to gain reliability in the clusters formed. The data used come from the Department of Genetics of the Luiz de Queiroz College of Agriculture (Escola Superior de Agricultura "Luiz de Queiroz") and concern soybean yield. First, the AMMI methodology is applied; then the Euclidean distance matrix is estimated, based on both the original data and the data obtained by bootstrap resampling, for the application of the clustering methods (nearest neighbor, furthest neighbor, average linkage, centroid, median and Ward). To assess the validity of the clusters formed, the cophenetic correlation coefficient is used, and the Mantel test is used to present the empirical distribution of the cophenetic correlation coefficients. The clusters obtained by the different methods are, in most cases, quite similar, indicating that, in principle, any of these methods would be suitable for the representation. The method whose results differ from the others in the dendrogram representation (for both the original data and the bootstrap data) is Ward's method. This study is promising for the analysis of the validity of clusters formed in vegetation data.
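The cophenetic correlation coefficient used for validation can be illustrated with a small, unoptimized sketch. It assumes a precomputed symmetric distance matrix and only the nearest-neighbor (single linkage) method; the AMMI step, the bootstrap resampling and the Mantel test are not shown, and the function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List

Matrix = List[List[float]]


def single_linkage_cophenetic(dist: Matrix) -> Matrix:
    """Cophenetic distances: the linkage height at which each pair of items first merges."""
    n = len(dist)
    clusters = [[i] for i in range(n)]
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        # Find the two clusters whose closest members are nearest to each other.
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        height, a, b = best
        # Every cross pair of the merged clusters joins at this height.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return coph


def cophenetic_correlation(dist: Matrix, coph: Matrix) -> float:
    """Pearson correlation between original and cophenetic distances over all pairs."""
    pairs = list(combinations(range(len(dist)), 2))
    x = [dist[i][j] for i, j in pairs]
    y = [coph[i][j] for i, j in pairs]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

A coefficient close to 1 indicates that the dendrogram preserves the original pairwise distances well; repeating the calculation on bootstrap resamples would give an empirical distribution of the coefficient.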
|
29 |
Retrofit urbano: uma abordagem para apoio de tomada de decisão. / Urban retrofitting: an approach to support decision-making.
Iara Negreiros 07 December 2018
Accommodating growing populations in cities will have major implications not only for employment, housing and the construction industry, but also for urban infrastructure, including transportation, energy, water and open or green space. Infrastructure constraints currently include an ageing, underutilized and inadequate existing built environment, as well as a lack of integration in planning, design and management strategies for future infrastructure development in long-term scenarios. Like building retrofit, in which interventions take place in isolated buildings and their constituent systems, urban retrofitting can be understood as a set of interventions designed to upgrade and sustain an urban area by providing a long-term practical response to its current problems and pressures. Such interventions must take into account the future population's needs by ensuring that the present urban infrastructure provides a firm basis for launching and achieving a city's ambitions for the future. One of the main requirements for urban retrofitting is a clearly defined set of goals and metrics for monitoring purposes. This thesis presents a method for urban retrofit implementation at city scale, using a visual tool to support decision-making and urban planning processes. Using Sustainable Development Goals (SDGs) targets, the 100 ISO 37120:2014 'indicators for city services and quality of life', Simple Moving Average (SMA) trend analysis, clustering and city benchmarking, this method proposes an adaptable and flexible dashboard that can aggregate and filter data by, for example, ISO 37120 sections, indicator classification, and time and spatial levels. The resulting dashboard is interactive and user-friendly, and can be fully accessed at https://bit.ly/2EDnZ4J. We use Sorocaba, a medium sized, well-located city in São Paulo State in Brazil, as a case study, focusing on the challenges and opportunities arising from exceptional urban population growth, and ranking key retrofit interventions in Sorocaba as possible forerunners of future urban development scenarios.
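The Simple Moving Average trend analysis mentioned above can be sketched as follows. The function name, the trailing-window convention and the sample series are assumptions for illustration; they are not taken from the thesis or from its dashboard.

```python
from typing import List, Sequence


def simple_moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing SMA: one average per position that has a full window of observations."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical yearly readings of one city indicator, smoothed over three years.
indicator = [10.0, 12.0, 11.0, 15.0, 14.0]
trend = simple_moving_average(indicator, window=3)
```

The smoothed series has `len(values) - window + 1` points, so a longer window gives a steadier trend at the cost of fewer points.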
|
30 |
Information Structures in Notated Music: Statistical Explorations of Composers' Performance Marks in Solo Piano Scores
Buchanan, J. Paul 05 1900
Written notation has a long history in many musical traditions and has been particularly important in the composition and performance of Western art music. This study adopted the conceptual view that a musical score consists of two coordinated but separate communication channels: the musical text and a collection of composer-selected performance marks that serve as an interpretive gloss on that text. Structurally, these channels are defined by largely disjoint vocabularies of symbols and words. While the sound structures represented by musical texts are well studied in music theory and analysis, the stylistic patterns of performance marks and how they acquire contextual meaning in performance is an area with fewer theoretical foundations.
This quantitative research explored the possibility that composers exhibit recurring patterns in their use of performance marks. Seventeen solo piano sonatas written between 1798 and 1913 by five major composers were analyzed from modern editions by tokenizing and tabulating the types and usage frequencies of their individual performance marks without regard to the associated musical texts. Using analytic methods common in information science, the results demonstrated persistent statistical similarities among the works of each composer and differences among the work groups of different composers. Although based on a small sample, the results still offered statistical support for the existence of recurring stylistic patterns in composers' use of performance marks across their works.
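The tokenize-and-tabulate step described above might look like the sketch below. It assumes each score has already been reduced to a list of performance-mark tokens and compares two works by the cosine similarity of their relative-frequency profiles; the similarity measure, the function names and the sample marks are illustrative assumptions, since the abstract does not name the specific methods used.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def mark_profile(marks: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each performance mark in one work."""
    counts = Counter(mark.strip() for mark in marks if mark.strip())
    total = sum(counts.values())
    return {mark: n / total for mark, n in counts.items()}


def cosine_similarity(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Cosine of the angle between two frequency profiles (1.0 means identical proportions)."""
    dot = sum(value * q.get(mark, 0.0) for mark, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


# Hypothetical token lists for two works; real data would come from the scores.
work_a = mark_profile(["p", "f", "p", "cresc."])
work_b = mark_profile(["p", "p", "cresc.", "sf"])
similarity = cosine_similarity(work_a, work_b)
```

Pairwise similarities of this kind could then be compared within and between composers, which is the sort of contrast the study reports.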
|