Global ETD Search

1	IT žinių portalo statistikos modulis pagrįstas grupavimu / Portal Statistics Module Based on Clustering
Ruzgys, Martynas. 16 August 2007.
This thesis presents the use of data mining and clustering in widely used systems, together with a prototype statistics module for an IT knowledge portal that stores, analyses and displays data. The proposed module performs data transformations in a data warehouse at periodic intervals. Statistical data available in the portal can be clustered. Presented graphically, the clustered information can be interpreted and the scale of activity monitored. One of the best-known data clustering methods, the parallel k-means method, is adapted to separate groups of similar objects.
Subject: Informatics
Keywords: Data mining; Data clustering; Parallel k-means method; Statistics; Data discretization
2	Elaboration d'un score de vieillissement : propositions théoriques / Development of a score of ageing: proposal for a mathematical theory
Sarazin, Marianne. 21 May 2013.
Ageing is a major public health problem. It remains difficult to describe because its conceptualization interweaves individual and collective aspects and carries a strong subjective dimension. Health professionals are increasingly required to take ageing into account and to propose adapted care protocols. Ageing is an unavoidable change in the body, usually quantified by age measured in elapsed time, the so-called "chronological age". This criterion measures the actual wear of the body imperfectly, because that wear depends on many individual modifying factors. Earlier work has therefore proposed replacing chronological age with a composite criterion called "biological age", leading to an indicator, or ageing score, intended to reflect individual ageing more closely. Building on that work, this thesis proposes a new methodology adapted to general practice.
The first phase was a qualitative and quantitative survey of general practitioners in metropolitan France on how they perceive and use clinical scores in routine practice, and on their reasons for not using them. The survey showed a gap between the declared use of scores and the way practitioners conceive of them. Scores are a useful aid for targeting an often complex systemic approach, provided they are simple to use (few items, suited to practice) and their scientific validity is understood by the physician. The patient's age was cited as a major factor influencing the practitioner's choice of score.
These results served as the basis for a model of biological age, with attention both to the choice of mathematical model and to its component variables. Variables marking ageing were selected from a literature review, taking into account whether they could be integrated into care in general practice, and the selection was consolidated by a forward (ascending) selection procedure on a regression model. A control population, whose ageing is considered normal, was then assembled as the reference for computing biological age. It was chosen first from the literature and then refined by classification with the k-means method (méthode des nuées dynamiques). A simple linear regression model was then built on data normalized with the Gaussian copula method: the statistical dependence between the measured variables was modelled by a Gaussian copula, which accounts only for pairwise linear correlations, and a standardized biological age was defined explicitly from these correlation coefficients. The tails of the marginal distributions were estimated by the method of excess to improve the discriminating power of the model.
The results suggest promising directions for computing a biological age, and the score derived from it, in general practice. Validation by a morbidity and mortality study will be the final stage of this work.
Keywords: Biological ageing; General practice; Gaussian copula; Method of excess; K-means method
3	工具カタログからのデータマイニングに支援されたものづくりシステムに関する研究 [A Study on a Manufacturing System Supported by Data Mining from Tool Catalogs]
Kodama, Hiroyuki (児玉紘幸). 22 March 2014.
Degree: Doctor of Philosophy in Engineering, Doshisha University
Keywords: Data mining; Tool catalog; End-milling conditions; K-means method; Life cycle assessment
4	Shluková analýza rozsáhlých souborů dat: nové postupy založené na metodě k-průměrů / Cluster analysis of large data sets: new procedures based on the method k-means
Žambochová, Marta. January 2005.
Cluster analysis has become one of the main tools for extracting knowledge from data, a task known as data mining. This area often involves data sets that are large both in the number of objects and in the number of variables that characterize them. Many methods for data clustering have been developed. One of the most widely used is the k-means method, which is suitable for data sets containing a large number of objects. Starting from an initial assignment of objects to clusters, it seeks the best clustering by redistributing objects among the clusters step by step, guided by an optimization function. The aim of this Ph.D. thesis was to compare selected variants of existing k-means methods, characterize their strengths and weaknesses in detail, propose new alternatives, and compare them experimentally with existing approaches. These objectives were met.
In my work I focused on modifications of the k-means method for clustering large numbers of objects, specifically the BIRCH k-means, filtering, k-means++ and two-phase algorithms. I examined the time complexity of the algorithms, the effect of the initial partition and of outliers, and the validity of the resulting clusters. Two real data files and several generated data sets were used. The common and distinguishing features of the methods under investigation are summarized at the end of the work. The main aim and contribution of the work is to devise my own modifications that address the bottlenecks of the basic procedure and of the existing variants, and to implement and verify them. Some modifications speed up processing. Applying the main ideas of the k-means++ algorithm to other variants of the k-means method improved their clustering results. The most significant of the proposed changes is a modification of the filtering algorithm that gives it an entirely new capability: the detection of outliers.
An accompanying CD is enclosed. It contains the source code of the programs, written in the MATLAB development environment. The programs were created specifically for this work and are intended for experimental use. The CD also contains the data files used in the experiments.
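The k-means procedure and the k-means++ seeding idea described in record 4 can be sketched roughly as follows. This is a minimal illustration in Python, not the thesis's MATLAB code, which is not reproduced in this listing. The function names (`sq_dist`, `kmeans_pp_init`, `kmeans`), the `seed` and `max_iter` parameters, and the toy data are assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: draw each new centre with probability proportional
    to its squared distance from the nearest centre chosen so far."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        weights = [min(sq_dist(p, c) for c in centres) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen centre; fall back to uniform.
            centres.append(rng.choice(points))
        else:
            centres.append(rng.choices(points, weights=weights, k=1)[0])
    return centres


def kmeans(points, k, max_iter=100, seed=0):
    """Plain (Lloyd-style) k-means on a list of tuples.

    Returns (centres, labels). Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    centres = kmeans_pp_init(points, k, rng)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centre.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points
        ]
        if new_labels == labels:
            break  # no object changed cluster, so the partition is stable
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centre where it is
                centres[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centres, labels


if __name__ == "__main__":
    data = [(0.0,), (1.0,), (10.0,), (11.0,)]  # toy one-dimensional data
    print(kmeans(data, k=2))
```

The seeding step only changes how the initial centres are picked. The assignment and update loop is the basic procedure, which is why the same idea can be carried over to other k-means variants.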
5	[en] METHODOLOGY FOR EVALUATING THE CONTINUITY OF THE DISTRIBUTION SERVICE IN LOCATIONS WITH ACCESS RESTRICTIONS DUE TO RECORDS OF VIOLENCE / [pt] METODOLOGIA PARA AVALIAÇÃO DA CONTINUIDADE DO SERVIÇO DE DISTRIBUIÇÃO EM LOCAIS COM RESTRIÇÃO DE ACESSO POR REGISTROS DE VIOLÊNCIA
THAIS ROUPE BORGES. 30 October 2023.
Generation, transmission and distribution make up the production chain of the electricity sector, with the consumer, or load, as the last link that distributors must serve. Perceived quality, and consequently customer satisfaction, is intrinsically tied, among other factors, to the continuity of supply ensured by the concessionaires. In Brazil, the National Electric Energy Agency (ANEEL) regulates the distribution sector and sets reference indicators to assess the concessionaires' efficiency in terms of reliability and quality of service. Several factors can affect the continuity of energy distribution. Some are well known and manageable by the companies, such as objects falling on the network or equipment overload. Others, such as restricted access to certain areas because of violence and territorial control by criminal groups, pose complex challenges that the distributors have no means of managing. These limitations hinder prompt restoration of service in emergencies, resulting in longer outage durations and negatively affecting both the continuity indicators monitored by ANEEL and consumer satisfaction.
In this context, this dissertation proposes a methodology for identifying the distributor's assets located in areas with evidence of violence, which implies limited access for field crews. The distributor's geographic database (BDGD) is used to identify transformer units in areas with evidence of violence, as delineated by public data platforms. Clustering techniques and statistical tests are then used to assess whether the continuity indices in these areas are significantly different from, and higher than, those in places with no records of violence. Distribution systems in the states of Rio de Janeiro and Pernambuco are used to test the effectiveness of the proposed methodology. Several tests are carried out and the results are discussed in full.
Keywords: Risk areas; Geographic database; Service continuity index; K-means method; Distribution system reliability
