101

Análisis de mercado universitario para Ingeniería Comercial: Estudio de percepción y medición de variables estratégicas / University market analysis for the Ingeniería Comercial program: a study of perception and measurement of strategic variables

Castro Asencio, Rodrigo, Díaz Solís, David January 2002 (has links)
El estudio propuesto presenta diversas justificaciones. En términos generales, este trabajo de investigación sirve para otorgar la importancia necesaria a los fenómenos perceptuales del ser humano. En términos estadísticos, se buscará conocer e interpretar dichos fenómenos en nuestro público objetivo. La Universidad de Chile cumple un rol importante al estar comprometida con el país a entregar los mejores profesionales a nivel nacional e internacional que aporten progreso, por lo que este trabajo de investigación ayudará a entregar la información y los medios necesarios para saber si realmente nuestra Facultad está satisfaciendo este requerimiento. Para ello, no sólo nos valdremos de estudios de carácter subjetivo, sino que también generaremos una poderosa herramienta que nos permita medir objetivamente los atributos de una serie de universidades a definir. Luego de esto, creemos que nos encontraremos en una posición privilegiada para efectuar recomendaciones o guías para la elaboración de un plan estratégico por parte de las autoridades de nuestra facultad. / The proposed study has several justifications. In general terms, this research serves to give due weight to human perceptual phenomena. In statistical terms, it seeks to understand and interpret these phenomena within our target audience. The Universidad de Chile plays an important role through its commitment to providing the country with the best professionals, at both the national and international level, who contribute to progress; this research will therefore help supply the information and the means needed to determine whether our Faculty is actually meeting that requirement. To that end, we will rely not only on studies of a subjective nature, but will also build a powerful tool that allows us to measure objectively the attributes of a set of universities to be defined. After this, we believe we will be in a privileged position to make recommendations or provide guidance for the preparation of a strategic plan by the authorities of our faculty.
102

Analyses bioinformatiques et classements consensus pour les données biologiques à haut débit / Bioinformatics analysis and consensus ranking for biological high throughput data

Yang, Bo 30 September 2014 (has links)
Cette thèse aborde deux problèmes relatifs à l’analyse et au traitement des données biologiques à haut débit: le premier touche l’analyse bioinformatique des génomes à grande échelle, le deuxième est consacré au développement d’algorithmes pour le problème de la recherche d’un classement consensus de plusieurs classements. L’épissage des ARN est un processus cellulaire qui modifie un ARN pré-messager en en supprimant les introns et en raboutant les exons. L’hétérodimère U2AF a été très étudié pour son rôle dans le processus d’épissage lorsqu’il se fixe sur des sites d’épissage fonctionnels. Cependant beaucoup de problèmes critiques restent en suspens, notamment l’impact fonctionnel des mutations de ces sites associées à des cancers. Par une analyse des interactions U2AF-ARN à l’échelle génomique, nous avons déterminé qu’U2AF a la capacité de reconnaître environ 88% des sites d’épissage fonctionnels dans le génome humain. Cependant on trouve de très nombreux autres sites de fixation d’U2AF dans le génome. Nos analyses suggèrent que certains de ces sites sont impliqués dans un processus de régulation de l’épissage alternatif. En utilisant une approche d’apprentissage automatique, nous avons développé une méthode de prédiction des sites de fixation d’U2AF, dont les résultats sont en accord avec notre modèle de régulation. Ces résultats permettent de mieux comprendre la fonction d’U2AF et les mécanismes de régulation dans lesquels elle intervient. Le classement des données biologiques est une nécessité cruciale. Nous nous sommes intéressés au problème du calcul d’un classement consensus de plusieurs classements de données, dans lesquels des égalités (ex-aequo) peuvent être présentes. Plus précisément, il s’agit de trouver un classement dont la somme des distances aux classements donnés en entrée est minimale. La mesure de distance utilisée le plus fréquemment pour ce problème est la distance de Kendall-tau généralisée. Or, il a été montré que, pour cette distance, le problème du consensus est NP-difficile dès lors qu’il y a plus de quatre classements en entrée. Nous proposons pour le résoudre une heuristique qui est une nouvelle variante d’algorithme à pivot. Cette heuristique, appelée Consistent-pivot, s’avère à la fois plus précise et plus rapide que les algorithmes à pivot qui avaient été proposés auparavant. / In the post-genomic era, it is increasingly important to address biological questions using bioinformatics approaches. This thesis focuses on two problems related to high-throughput data: large-scale bioinformatics analysis, and the development of consensus-ranking algorithms. In molecular biology and genetics, RNA splicing is a modification of the nascent pre-messenger RNA (pre-mRNA) transcript in which introns are removed and exons are joined. The U2AF heterodimer has been well studied for its role in defining functional 3’ splice sites in pre-mRNA splicing, but multiple critical problems are still outstanding, including the functional impact of their cancer-associated mutations. Through genome-wide analysis of U2AF-RNA interactions, we report that U2AF has the capacity to define ~88% of functional 3’ splice sites in the human genome.
Numerous U2AF binding events also occur in other genomic locations, and metagene and minigene analyses suggest that upstream intronic binding events interfere with the immediately downstream 3’ splice site, associated either with the alternative exon (causing exon skipping) or with a competing constitutive exon (inducing inclusion of the alternative exon). We further build a U2AF65 scoring scheme for predicting its target sites from high-throughput sequencing data using a Maximum Entropy machine learning method, and the scores on the up- and down-regulated cases are consistent with our regulation model. These findings reveal the genomic function and regulatory mechanism of U2AF and help in understanding the associated diseases. Ranking biological data is a crucial need. Instead of developing new ranking methods, Cohen-Boulakia and her colleagues proposed generating a consensus ranking that highlights the common points of a set of rankings while minimizing their disagreements, in order to combat the noise and errors in biological data. However, this problem is NP-hard under the Kendall-tau distance even for only four rankings. In this thesis, we propose a new variant of pivot algorithms named Consistent-Pivot. It uses a new strategy for pivot selection and element assignment, and performs better than previous pivot algorithms in both computation time and accuracy.
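To make the consensus-ranking problem above concrete, here is a small sketch (not the thesis' Consistent-Pivot heuristic): it computes a generalized Kendall-tau distance between two rankings with ties and the total distance of a candidate consensus to a set of input rankings, i.e. the quantity a consensus ranking is meant to minimize. The tie penalty, the toy rankings, and the element names are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau_gen(r1, r2, tie_penalty=1.0):
    """Generalized Kendall-tau distance between two rankings with ties.

    Each ranking maps an element to a bucket index (0 = best); elements
    tied in a ranking share a bucket.  A pair of elements costs 1 if the
    two rankings order it in opposite directions, and `tie_penalty` if it
    is tied in exactly one of them.
    """
    dist = 0.0
    for a, b in combinations(sorted(r1), 2):
        d1, d2 = r1[a] - r1[b], r2[a] - r2[b]
        if d1 * d2 < 0:                   # strictly opposite orders
            dist += 1.0
        elif (d1 == 0) != (d2 == 0):      # tied in one ranking only
            dist += tie_penalty
    return dist

def consensus_cost(candidate, rankings):
    """Total distance that a consensus ranking is meant to minimize."""
    return sum(kendall_tau_gen(candidate, r) for r in rankings)

# Toy example: three rankings of four genes, with ties.
rankings = [
    {"g1": 0, "g2": 1, "g3": 1, "g4": 2},
    {"g1": 0, "g2": 0, "g3": 1, "g4": 2},
    {"g2": 0, "g1": 1, "g3": 2, "g4": 2},
]
candidate = {"g1": 0, "g2": 0, "g3": 1, "g4": 2}
print(consensus_cost(candidate, rankings))
```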
103

Desenvolvimento de um sistema para classificar recursos energéticos de oferta e demanda com base no cômputo e na valoração do potencial completo dos recursos energéticos dentro do planejamento integrado de recursos. / Development of a system to classify energy resources from supply and demand sides-based in computation and valuation full potential of energy resources into the integrated resource planning.

Rigolin, Pascoal Henrique da Costa 17 June 2013 (has links)
O objetivo deste trabalho é analisar e desenvolver um modelo para seleção completa de recursos energéticos com base na teoria de tomada de decisão para classificação de Recursos do lado da Oferta e do lado da Demanda. Sendo que a seleção completa, circunscrita em elementos de auxílio tipo atributos e sub-atributos, implica na consideração das dimensões do desenvolvimento, relativos a metodologia do Planejamento Integrado de Recursos Energéticos (PIR). Estes elementos são descritos através de algoritmos demonstrados a partir de uma escala matemática qualitativa e quantitativa distribuídos em quatro grandes dimensões: Técnico-econômica, Ambiental, Social e Política (dimensões do desenvolvimento energético e humano). Os algoritmos dos atributos e sub-atributos que caracterizam cada um dos Recursos Energéticos (REs) podem ser dados através de um valor numérico (mais comuns nos algoritmos das dimensões Técnico-econômica e Ambiental, onde o valor pode ser medido ou estimado) ou através de um valor na escala não numérica (escala utilizada com mais frequência nas dimensões Social e Política). Posteriormente a caracterização de cada um dos REs, estes passam por uma etapa denominada padronização, que consiste em converter diferentes unidades de caracterização para uma mesma base, dando a possibilidade de comparação entre os diferentes elementos de caracterização dos REs. A ferramenta matemática de comparação entre diferentes elementos utilizada no auxílio à tomada de decisão foi desenvolvida por Thomas L. Saaty e chama-se Processo Analítico Hierárquico (AHP do significado em inglês). A partir deste ponto é possível calcular os Ranqueamentos dos REs. São três Ranqueamentos: Padrão, dos En-In e o Final. O cálculo dos dois primeiros Ranqueamentos é equivalente, e é feito através da somatória da multiplicação dos dados convertidos pelos pesos dos atributos e subatributos correspondentes. O cálculo do Ranqueamento Final é feito através da média dos resultados dos dois Ranqueamentos citados. O valor final obtido deste processo, para cada um dos REs, é chamado de Custo Completo. Portanto, para este caso, quanto maior o Custo Completo, mais bem ranqueado estará o RE. Um estudo de caso foi feito para consolidação dos resultados deste trabalho. A região do estudo foi a Região Administrativa de Araçatuba (RAA) que engloba 43 municípios do Estado de São Paulo. Foram caracterizados e ranqueados 182 REs, e obteve-se como resultado do Ranqueamento Final, as 58 primeiras melhores colocações sendo ocupadas em sua integridade por Recursos Energéticos de Demanda e as 22 últimas posições ocupadas por algum Recurso Energético que tem como característica a termo-geração (nuclear ou pela queima direta de combustíveis). / The objective of this study is to analyze and develop a model, based on decision-making theory, for the full selection and classification of Supply Side and Demand Side Energy Resources. The full selection, framed by supporting elements in the form of attributes and sub-attributes, implies considering the dimensions of development adopted by the Integrated Energy Resource Planning methodology (PIR, the Portuguese acronym for Planejamento Integrado de Recursos). These elements are described through algorithms expressed on a qualitative and quantitative mathematical scale and distributed across four main dimensions: Technical-Economic, Environmental, Social, and Political (the dimensions of energy and human development). The attributes and sub-attributes that characterize each Energy Resource (RE) can be given either by a numerical value (most common in the Technical-Economic and Environmental dimensions, where the value can be measured or estimated) or by a value on a non-numeric scale (used more often in the Social and Political dimensions). After characterization, each RE goes through a step called "standardization", which converts the different characterization units to a common base, making it possible to compare the different characterization elements across REs. The mathematical tool used to compare the different elements and support decision making is the Analytic Hierarchy Process (AHP), developed by Thomas L. Saaty. From this point it is possible to compute the RE rankings. There are three rankings: Padrão (standard), dos En-In, and Final (names in Portuguese). The first two rankings are computed in the same way, as the sum of the standardized values multiplied by the weights of the corresponding attributes and sub-attributes; the Final ranking is the average of the two. The final value obtained from this process for each RE is called its Full Cost; in this formulation, the higher the Full Cost, the better ranked the RE. A case study was carried out to consolidate the results of this work. The study region was the Araçatuba Administrative Region (RAA), which encompasses 43 municipalities in the state of São Paulo. A total of 182 REs were characterized and ranked; in the Final ranking, the top 58 positions were occupied entirely by Demand Side Energy Resources, and the last 22 positions were occupied by Energy Resources characterized by thermal generation (nuclear or direct fuel burning).
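The weighted-sum ranking step described above can be sketched as follows, assuming made-up resource names, attribute scores already standardized to a common [0, 1] base, and illustrative AHP-derived weights rather than the thesis' actual attributes and data:

```python
def full_cost(resource, weights):
    """Weighted sum of standardized attribute scores (higher = better ranked)."""
    return sum(weights[attr] * resource[attr] for attr in weights)

# Hypothetical AHP-derived weights for the four dimensions (sum to 1).
weights = {"technical_economic": 0.40, "environmental": 0.25,
           "social": 0.20, "political": 0.15}

# Hypothetical energy resources with scores standardized to [0, 1].
resources = {
    "efficient_lighting": {"technical_economic": 0.9, "environmental": 0.8,
                           "social": 0.7, "political": 0.6},
    "small_hydro":        {"technical_economic": 0.7, "environmental": 0.6,
                           "social": 0.5, "political": 0.7},
    "diesel_thermal":     {"technical_economic": 0.6, "environmental": 0.2,
                           "social": 0.4, "political": 0.3},
}

# Rank resources by descending Full Cost.
ranking = sorted(resources, key=lambda r: full_cost(resources[r], weights),
                 reverse=True)
for name in ranking:
    print(f"{name}: {full_cost(resources[name], weights):.3f}")
```

In a full application the weights would come from AHP pairwise-comparison matrices and the attribute set would include the sub-attributes of each dimension.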
104

Dissolução de Foraminíferos Quaternários do Atlântico Sul: da perda de CaCO3 ao ganho de informação paleoceanográfica

Petró, Sandro Monticelli January 2018 (has links)
Estudos paleoceanográficos são baseados em indicadores indiretos, ou seja, informações sedimentológicas, paleontológicas e geoquímicas que refletem as condições ambientais dos oceanos no passado geológico. Processos tafonômicos como a dissolução podem enviesar a informação contida nestes indicadores. Porém, quando corretamente identificada, a dissolução pode se tornar uma ferramenta para caracterizar mudanças oceanográficas, como variações na distribuição das massas d’água e a acidificação dos oceanos. O objetivo deste estudo é entender como ocorre este enviesamento, identificar indicadores que determinam a presença ou ausência da dissolução e identificar alterações oceanográficas no Quaternário tardio da Bacia de Pelotas em função deste processo. Esta tese compreende várias etapas, incluindo a realização de experimentos de dissolução em várias espécies de foraminíferos provenientes do oeste do Atlântico Sul, a comparação entre dados de fauna de foraminíferos planctônicos de sedimento com as condições ambientais do oceano Atlântico Sul e, finalmente, as análises de testemunhos do Quaternário tardio da Bacia de Pelotas (foraminíferos, cocolitoforídeos, teor de carbonato, granulometria, δ18O, AMS 14C). Os experimentos geraram um ranking de susceptibilidade à dissolução para foraminíferos, identificando Orbulina universa e Hoeglundina sp. como bons indicadores de pouca dissolução, além de identificar a razão entre foraminíferos planctônicos e bentônicos como imprópria para indicar o grau de dissolução. A análise da fauna de foraminíferos planctônicos do Atlântico Sul identificou maior enviesamento por condições ambientais de fundo em áreas com massas d’água mais corrosivas provenientes do sul, bem como o viés observado nas espécies frágeis pode ser relacionado aos erros em estimativas de paleotemperaturas baseadas em censos de fauna. Baseado nos indicadores aqui elaborados, combinado com outros indicadores comuns, as alterações oceanográficas na Bacia de Pelotas indicam um aumento da dissolução em períodos glaciais em função do avanço da Água Antártica de Fundo e da Água Circumpolar Superior. Finalmente, na parte conceitual desta tese, é proposta uma definição de zona tafonomicamente ativa para sistemas pelágicos, que deve considerar a coluna d’água como seu limite superior. / Paleoceanographic studies are based on proxy data, i.e. sedimentological, paleontological and geochemical information that reflects the environmental conditions of the oceans in the geological past. Taphonomic processes such as dissolution may bias the information contained in these proxies. However, when correctly identified, dissolution can become a tool to characterize oceanographic changes, such as variations in the distribution of water masses and ocean acidification. The purpose of this study is to understand how this bias occurs, to identify proxies that determine the presence or absence of dissolution, and to identify late Quaternary oceanographic changes in the Pelotas Basin driven by this process. This thesis comprises several stages, including dissolution experiments on several foraminifera species from the western South Atlantic, a comparison of planktonic foraminifera assemblage data from sediments with the environmental conditions of the South Atlantic Ocean, and, finally, the analysis of late Quaternary cores from the Pelotas Basin (foraminifera, coccoliths, carbonate content, grain size, δ18O, and AMS 14C). The experiments generated a ranking of susceptibility to dissolution for foraminifera, identifying Orbulina universa and Hoeglundina sp. as good proxies of low dissolution, and showing that the planktonic-to-benthic foraminifera ratio is unsuitable for indicating the degree of dissolution. The analysis of the planktonic foraminifera fauna of the South Atlantic identified greater bias due to bottom environmental conditions in areas with more corrosive water masses from the south, and showed that the bias observed in fragile species may be related to errors in paleotemperature estimates based on fauna census counts. Based on the proxies developed here, combined with other commonly used indicators, the late Quaternary oceanographic changes in the Pelotas Basin indicate increased dissolution during glacial periods due to the advance of Antarctic Bottom Water and Upper Circumpolar Deep Water. Finally, in the conceptual part of this thesis, a definition of the taphonomically active zone for pelagic systems is proposed, which should consider the water column as its upper limit.
105

VersionsRank : escores de reputação de páginas web baseados na detecção de versões / VersionsRank: web page reputation scores based on version detection

Silva, Glauber Rodrigues da January 2009 (has links)
Os motores de busca utilizam o WebGraph formado pelas páginas e seus links para atribuir reputação às páginas Web. Essa reputação é utilizada para montar o ranking de resultados retornados ao usuário. No entanto, novas versões de páginas com uma boa reputação acabam por distribuir os votos de reputação entre todas as versões, trazendo prejuízo à página original e também as suas versões. O objetivo deste trabalho é especificar novos escores que considerem todas as versões de uma página Web para atribuir reputação para as mesmas. Para atingir esse objetivo, foram propostos quatro escores que utilizam a detecção de versões para atribuir uma reputação mais homogênea às páginas que são versões de um mesmo documento. Os quatro escores propostos podem ser classificados em duas categorias: os que realizam mudanças estruturais no WebGraph (VersionRank e VersionPageRank) e os que realizam operações aritméticas sobre os escores obtidos pelo algoritmo de PageRank (VersionSumRank e VersionAverageRank). Os experimentos demonstram que o VersionRank tem desempenho 26,55% superior ao PageRank para consultas navegacionais sobre a WBR03 em termos de MRR, e em termos de P@10, o VersionRank tem um ganho de 9,84% para consultas informacionais da WBR99. Já o escore VersionAverageRank, apresentou melhores resultados na métrica P@10 para consultas informacionais na WBR99 e WBR03. Na WBR99, os ganhos foram de 6,74% sobre o PageRank. Na WBR03, para consultas informacionais aleatórias o escore VersionAverageRank obteve um ganho de 35,29% em relação ao PageRank. / Search engines use the WebGraph formed by pages and their links to assign reputation to Web pages. This reputation is used to build the ranking of results returned to the user. However, new versions of a page with a good reputation end up splitting its reputation votes among all versions, hurting both the original page and its versions. The objective of this work is to specify new scores that consider all versions of a Web page when assigning reputation to them. To achieve this goal, four scores were proposed that use version detection to assign a more homogeneous reputation to pages that are versions of the same document. The four proposed scores fall into two categories: those that make structural changes to the WebGraph (VersionRank and VersionPageRank) and those that perform arithmetic operations on the scores produced by the PageRank algorithm (VersionSumRank and VersionAverageRank). The experiments show that VersionRank outperforms PageRank by 26.55% in terms of MRR for navigational queries on WBR03 and, in terms of P@10, gains 9.84% for WBR99 informational queries. VersionAverageRank showed better P@10 results for WBR99 and WBR03 informational queries: in WBR99 it gained 6.74% over PageRank, and in WBR03, for random informational queries, it gained 35.29% over PageRank.
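A minimal sketch of the idea behind these scores, assuming a toy WebGraph and an already-detected group of page versions (the averaging step is closest in spirit to VersionAverageRank; this is not the thesis' implementation nor the WBR99/WBR03 collections):

```python
def pagerank(links, damping=0.85, iters=50):
    """Plain PageRank by power iteration. `links` maps page -> list of out-links."""
    pages = set(links) | {p for outs in links.values() for p in outs}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / len(pages) for p in pages}
        for p, outs in links.items():
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:  # dangling page: spread its rank uniformly
                for q in pages:
                    new[q] += damping * rank[p] / len(pages)
        rank = new
    return rank

def version_average_rank(rank, version_groups):
    """Give every detected version of a document the group's average PageRank."""
    adjusted = dict(rank)
    for group in version_groups:
        avg = sum(rank[p] for p in group) / len(group)
        for p in group:
            adjusted[p] = avg
    return adjusted

# Toy WebGraph: pages B and C are versions of the same document.
links = {"A": ["B"], "D": ["C"], "B": ["E"], "C": ["E"], "E": ["A"]}
pr = pagerank(links)
print(version_average_rank(pr, version_groups=[{"B", "C"}]))
```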
107

Mining Data with Feature Interactions

January 2018 (has links)
abstract: Models using feature interactions have been applied successfully in many areas such as biomedical analysis and recommender systems. The popularity of feature interactions lies mainly in the facts that (1) they capture nonlinearity in the data that linear effects cannot, and (2) they offer great interpretability. In this thesis, I propose a series of formulations using feature interactions for real-world problems and develop efficient algorithms for solving them. Specifically, I first propose to directly solve the non-convex formulation of the weak hierarchical Lasso, which imposes weak hierarchy on individual features and interactions but in existing studies can only be solved approximately through a convex relaxation. I further propose to use the non-convex weak hierarchical Lasso formulation for hypothesis testing on the interaction features under hierarchical assumptions. Secondly, I propose a type of bi-linear model that takes advantage of feature interactions for drug discovery problems where specific drug-drug pairs or drug-disease pairs are of interest. These models are learned by maximizing the number of positive data pairs that rank above the average score of unlabeled data pairs. I then generalize the method to the case of using the top-ranked unlabeled data pairs for representative construction and derive an efficient algorithm for the extended formulation. Last but not least, motivated by a special form of bi-linear models, I propose a framework that simultaneously subgroups data points and builds specific models on the subgroups for learning on massive and heterogeneous datasets. Experiments on synthetic and real datasets are conducted to demonstrate the effectiveness or efficiency of the proposed methods. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2018
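As a simplified illustration of a model with feature interactions (a plain Lasso over explicitly constructed pairwise interaction features on synthetic data; this is not the thesis' non-convex weak hierarchical Lasso or its bi-linear formulations, and scikit-learn availability is assumed):

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
# Synthetic response with one main effect and one interaction (x0 * x1).
y = 2.0 * X[:, 0] + 3.0 * X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=n)

# Augment the design matrix with all pairwise interaction features x_i * x_j.
pairs = list(combinations(range(d), 2))
X_int = np.column_stack([X] + [X[:, i] * X[:, j] for i, j in pairs])

model = Lasso(alpha=0.05).fit(X_int, y)
names = [f"x{i}" for i in range(d)] + [f"x{i}*x{j}" for i, j in pairs]
for name, coef in zip(names, model.coef_):
    if abs(coef) > 1e-3:           # show only the selected effects
        print(f"{name}: {coef:+.3f}")
```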
108

Sequential Procedures for the "Selection" Problems in Discrete Simulation Optimization

Wenyu Wang (7491243) 17 October 2019 (has links)
Simulation optimization problems are nonlinear optimization problems whose objective function can be evaluated through stochastic simulations. We study two significant discrete simulation optimization problems in this thesis: Ranking and Selection (R&S) and Factor Screening (FS). Both R&S and FS are "selection" problems defined over a finite set of candidate systems or factors. They differ mainly in their objectives: R&S aims to find the "best" system(s) among all alternatives, whereas FS selects the important factors that are critical to the stochastic systems.

In this thesis, we develop efficient sequential procedures for these two problems. For the R&S problem, we propose fully sequential procedures for selecting the "best" systems with a guaranteed probability of correct selection (PCS). The main features of the stated methods are: (1) a Bonferroni-free model: these procedures overcome the conservativeness of the Bonferroni correction and deliver the exact probabilistic guarantee without overshooting; (2) asymptotic optimality: these procedures achieve the lower bound of average sample size asymptotically; (3) an indifference-zone-flexible formulation: these procedures bridge the gap between the indifference-zone formulation and the indifference-zone-free formulation, so that the indifference-zone parameter is not indispensable but can be helpful if provided. We establish the validity and asymptotic efficiency of the proposed procedures and conduct numerical studies to investigate their performance under multiple configurations.

We also consider the multi-objective R&S (MOR&S) problem. To the best of our knowledge, the proposed procedures are the first frequentist approach for MOR&S. These procedures identify the Pareto front with a guaranteed probability of correct selection (PCS). In particular, they are fully sequential and use test statistics built upon the Generalized Sequential Probability Ratio Test (GSPRT). The main features are: (1) an objective-dimension-free model: the performance of these procedures does not deteriorate as the number of objectives increases, and they achieve the same efficiency as the KN family of procedures for the single-objective ranking and selection problem; (2) an indifference-zone-flexible formulation: the new methods eliminate the necessity of an indifference-zone parameter while making use of the indifference-zone information if provided. A numerical evaluation demonstrates the validity and efficiency of the new procedure.

For the FS problem, our objective is to identify important factors for simulation experiments with a controlled family-wise error rate. We assume a multi-objective first-order linear model where the responses follow a multivariate normal distribution. We offer three fully sequential procedures: the Sum Intersection Procedure (SUMIP), the Sort Intersection Procedure (SORTIP), and the Mixed Intersection Procedure (MIP). SUMIP uses the Bonferroni correction to adjust for multiple comparisons; SORTIP uses Holm's procedure to overcome the conservativeness of the Bonferroni method; and MIP combines SUMIP and SORTIP to work efficiently in a parallel computing environment. Numerical studies are provided to demonstrate validity and efficiency, and a case study is presented.
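The two multiple-comparison adjustments mentioned for factor screening, the Bonferroni correction and Holm's step-down procedure, can be sketched on hypothetical per-factor p-values as follows (only the underlying corrections, not the sequential SUMIP/SORTIP/MIP procedures themselves):

```python
def bonferroni(pvalues, alpha=0.05):
    """Declare factor i important if p_i <= alpha / m (family-wise error <= alpha)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]

def holm(pvalues, alpha=0.05):
    """Holm's step-down procedure: less conservative, same family-wise guarantee."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    important = [False] * m
    for step, i in enumerate(order):
        if pvalues[i] <= alpha / (m - step):
            important[i] = True
        else:
            break          # stop at the first non-rejection
    return important

# Hypothetical p-values for six screened factors.
pvals = [0.001, 0.008, 0.012, 0.030, 0.200, 0.700]
print("Bonferroni:", bonferroni(pvals))
print("Holm:      ", holm(pvals))
```

On these p-values Holm flags one more factor than Bonferroni while giving the same family-wise error guarantee, which is why the Bonferroni correction is described as the more conservative of the two.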
109

Learning to rank by maximizing the AUC with linear programming for problems with binary output

Ataman, Kaan 01 January 2007 (has links)
Ranking is a popular machine learning problem that has been studied extensively for more than a decade. Typical machine learning algorithms are generally built to optimize predictive performance (usually measured as accuracy) by minimizing classification error. However, there are many real-world problems where correct ordering of instances is of equal or greater importance than correct classification. Learning algorithms built to minimize classification error are often not effective at ordering instances within or among classes. This gap in research created a need to alter the objective of such algorithms to focus on correct ranking rather than classification. The Area Under the ROC Curve (AUC), which is equivalent to the Wilcoxon-Mann-Whitney (WMW) statistic, is a widely accepted performance measure for evaluating ranking performance in binary classification problems. In this work we present a linear programming approach (LPR), similar to 1-norm Support Vector Machines (SVM), for ranking instances with binary outputs by maximizing an approximation to the WMW statistic. Our formulation handles non-linear problems by making use of kernel functions. The results on several well-known benchmark datasets show that our approach ranks better than the 2-norm SVM and runs faster than the support vector ranker (SVR). The number of constraints in the linear programming formulation increases quadratically with the number of data points used to train the algorithm. We tackle this problem by implementing a number of exact and approximate speed-up approaches inspired by well-known methods such as chunking, clustering and subgradient methods. The subgradient method is the most promising because of its solution quality and its fast convergence to the optimal solution. We adapted the LPR formulation to survival analysis. With this approach it is possible to order subjects by their risk of experiencing an event. Such an ordering enables the determination of high-risk and low-risk groups among the subjects, which can be helpful not only in medical studies but also in engineering, business and the social sciences. Our results show that our algorithm is superior in time-to-event prediction to the most popular survival analysis tool, Cox's proportional hazards regression.
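The AUC/WMW equivalence this approach builds on can be sketched directly: over all positive-negative pairs, count how often the positive instance receives the higher score, counting ties as one half. The scores and labels below are illustrative.

```python
def auc_wmw(scores, labels):
    """AUC via the Wilcoxon-Mann-Whitney statistic for binary labels (1/0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy ranking: a higher score should mean "more likely positive".
scores = [0.9, 0.8, 0.7, 0.55, 0.4, 0.2]
labels = [1,   1,   0,   1,    0,   0]
print(auc_wmw(scores, labels))   # 8 of 9 positive-negative pairs ordered correctly
```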
110

Practical use of Multiple Geostatistical Realizations in Petroleum Engineering

Fenik, Dawib 06 1900 (has links)
Ranking of multiple realizations is an important step when the processing time for a realization is large. This is the case in reservoir flow simulation and in other areas such as geological, environmental and even medical applications. Significant uncertainty exists in all reservoirs, especially at unsampled locations, where geological heterogeneity and connectivity between wells are impossible to predict exactly. Geostatistical techniques are used to construct models of static properties such as lithofacies, porosity, permeability and residual fluid saturations, and provide multiple equally probable realizations of these properties. The number of realizations required for modeling the uncertainty may be large; usually 100 realizations are considered enough to quantify uncertainty. However, this number of realizations is still too high for processing by a flow simulator. This thesis aims at developing a robust and reliable ranking methodology to rank the realizations using a static ranking measure. The outcome is the identification of the high, low, and intermediate ranking realizations for further detailed simulations. The methodology was developed for the steam assisted gravity drainage (SAGD) reservoir application. This thesis considers the cumulative oil produced (COPrate) and the cumulative steam-oil ratio (CSOR) as the ranking parameters in the flow simulations, hereafter called performance parameters. Connected hydrocarbon volume (CHV) was used in the ranking methodology as the static ranking measure. High calibration between the performance parameters and the CHV would indicate the success of the proposed ranking methodology. The ranking methodology was validated against the results of the flow simulations. The results indicate a mediocre correlation between the SAGD performance parameters and CHV. The ranking methodology was then modified by incorporating the average reservoir permeability, which significantly improved the correlation between the static ranking measure and the SAGD performance parameters. / Petroleum Engineering
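A rough sketch of the ranking idea described above, using made-up numbers rather than the thesis' SAGD results: rank the realizations by a cheap static measure (a stand-in for connected hydrocarbon volume), pick low, intermediate and high realizations for detailed flow simulation, and measure calibration against a simulated performance parameter with a rank correlation.

```python
def ranks(values):
    """Rank positions (1 = smallest); ties are not handled, for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for pos, i in enumerate(order, start=1):
        r[i] = pos
    return r

def spearman(x, y):
    """Spearman rank correlation via Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical static measure (connected volume) per realization and the
# corresponding flow-simulation output (e.g. cumulative oil produced).
chv     = [1.8, 2.4, 1.1, 3.0, 2.0, 1.5]   # cheap to compute for every realization
cop_sim = [410, 520, 300, 610, 395, 380]   # expensive simulator output

order = sorted(range(len(chv)), key=lambda i: chv[i])
low, mid, high = order[0], order[len(order) // 2], order[-1]
print("realizations picked for detailed simulation:", low, mid, high)
print("rank correlation CHV vs simulated performance:",
      round(spearman(chv, cop_sim), 3))
```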
