311 |
Techniques for indexing large and complex datasets with missing attribute values. / Técnicas de indexação de grandes conjuntos de dados complexos com valores de atributos faltantes.
Safia Brinis 18 July 2016
Due to the increasing amount and complexity of data processed in real-world applications, similarity search has become a vital task for storing and retrieving such data. However, missing attribute values are very frequent, and metric access methods (MAMs), which are designed to support similarity search, do not operate on datasets in which attribute values are missing. Currently, the usual approach to applying existing indexing techniques to datasets with missing attribute values is to mark the missing values with an indicator and employ a traditional indexing technique. Although this approach can be applied to multidimensional indexing techniques, it is impractical for metric access methods. This dissertation presents the results of research conducted to identify and deal with the issues related to indexing and querying datasets with missing values in metric spaces. An empirical analysis of metric access methods applied to incomplete datasets identifies two main issues: distortion of the internal structure of the index when data are missing at random, and skew of the index structure when data are not missing at random. Based on those findings, a new variant of the Slim-tree access method, called Hollow-tree, is presented. It employs new techniques that are capable of handling missing-data issues when missingness is ignorable. The first technique comprises a set of indexing policies that allow objects with missing attribute values to be indexed and prevent distortions in the internal structure of the indexes. The second technique targets similarity queries to improve query performance over incomplete datasets. It uses the fractal dimension of the dataset and the local density around the query object to estimate an ideal radius that yields an accurate query answer, treating data with missing values as potential answers.
Results from experiments with a variety of real and synthetic datasets show that Hollow-tree achieves nearly 100% precision and recall for range queries and more than 90% for k-nearest-neighbor queries, while the Slim-tree access method deteriorates as the amount of missing values increases. The results confirm that the indexing technique helps to establish consistency in the index structure and that the searching technique achieves a remarkable performance. When combined, the new techniques make it possible to properly explore all the available data, even with high amounts of missing attribute values. As they are independent of the underlying access method, they can be adopted by a broad range of metric access methods, extending the class of MAMs. / O crescimento em quantidade e complexidade dos dados processados e armazenados torna a busca por similaridade uma tarefa fundamental para tratar esses dados. No entanto, atributos faltantes ocorrem freqüentemente, inviabilizando os métodos de acesso métricos (MAMs) projetados para apoiar a busca por similaridade. Assim, técnicas de tratamento de dados faltantes precisam ser desenvolvidas. A abordagem mais comum para executar as técnicas de indexação existentes sobre conjuntos de dados com valores faltantes é usar um indicador de valores faltantes e usar as técnicas de indexação tradicionais. Embora, esta técnica seja útil para os métodos de indexação multidimensionais, é impraticável para os métodos de acesso métricos. Esta dissertação apresenta os resultados da pesquisa realizada para identificar e lidar com os problemas de indexação e recuperação de dados em espaços métricos com valores faltantes. Uma análise experimental dos MAMs aplicados a conjuntos de dados incompletos identificou dois problemas principais: distorção na estrutura interna do índice quando a falta é aleatória e busca tendenciosa na estrutura do índice quando o processo de falta não é aleatório. 
Uma variante do MAM Slim-tree, chamada Hollow-tree foi proposta com base nestes resultados. A Hollow-tree usa novas técnicas de indexação e de recuperação de dados com valores faltantes quando o processo de falta é aleatório. A técnica de indexação inclui um conjunto de políticas de indexação que visam a evitar distorções na estrutura interna dos índices. A técnica de recuperação de dados melhora o desempenho das consultas por similaridade sobre bases de dados incompletas. Essas técnicas utilizam o conceito de dimensão fractal do conjunto de dados e a densidade local da região de busca para estimar um raio de busca ideal para obter uma resposta mais correta, considerando os dados com valores faltantes como uma resposta potencial. As técnicas propostas foram avaliadas sobre diversos conjuntos de dados reais e sintéticos. Os resultados mostram que a Hollow-tree atinge quase 100% de precisão e revocação para consultas por abrangência e mais de 90% para k vizinhos mais próximos, enquanto a Slim-tree rapidamente deteriora com o aumento da quantidade de valores faltantes. Tais resultados indicam que a técnica de indexação proposta ajuda a estabelecer a consistência na estrutura do índice e a técnica de busca pode ser realizada com um desempenho notável. As técnicas propostas são independentes do MAM básico usado e podem ser aplicadas em uma grande variedade deles, permitindo estender a classe dos MAMs em geral para tratar dados faltantes.
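The precision and recall figures reported in this abstract compare the answer returned by an index against the exact answer of the same query. A minimal Python sketch of that evaluation follows; it is not the Hollow-tree implementation. The function names, the linear-scan `range_query` used as ground truth, and the convention of returning 1.0 for empty sets are all assumptions made for illustration.

```python
import math


def range_query(points, query, radius):
    """Linear-scan range query: ids of all points within `radius` of `query`.

    Used here only as an exact reference answer (hypothetical helper).
    """
    return {pid for pid, p in points.items() if math.dist(p, query) <= radius}


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved id set against the exact answer."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Convention assumed here: an empty set counts as perfect (1.0).
    precision = hits / len(retrieved) if retrieved else 1.0
    recall = hits / len(relevant) if relevant else 1.0
    return precision, recall


# Toy data: the exact answer of a range query with radius 1.5 around the origin.
points = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (3.0, 4.0)}
exact = range_query(points, (0.0, 0.0), 1.5)

# Hypothetical answer returned by some index built over incomplete data.
approximate = {"a", "c"}
precision, recall = precision_recall(approximate, exact)
```

In this toy case the index returns one correct object out of two retrieved and finds one of the two relevant objects, so both measures are 0.5.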
|
312 |
Contribuições para a construção de taxonomias de tópicos em domínios restritos utilizando aprendizado estatístico / Contributions to topic taxonomy construction in a specific domain using statistical learning
Maria Fernanda Moura 26 October 2009
A mineração de textos vem de encontro à realidade atual de se compreender e utilizar grandes massas de dados textuais. Uma forma de auxiliar a compreensão dessas coleções de textos é construir taxonomias de tópicos a partir delas. As taxonomias de tópicos devem organizar esses documentos, preferencialmente em hierarquias, identificando os grupos obtidos por meio de descritores. Construir manual, automática ou semi-automaticamente taxonomias de tópicos de qualidade é uma tarefa nada trivial. Assim, o objetivo deste trabalho é construir taxonomias de tópicos em domínios de conhecimento restrito, por meio de mineração de textos, a fim de auxiliar o especialista no domínio a compreender e organizar os textos. O domínio de conhecimento é restrito para que se possa trabalhar apenas com métodos de aprendizado estatístico não supervisionado sobre representações bag of words dos textos. Essas representações independem do contexto das palavras nos textos e, conseqüentemente, nos domínios. Assim, ao se restringir o domínio espera-se diminuir erros de interpretação dos resultados. A metodologia proposta para a construção de taxonomias de tópicos é uma instanciação do processo de mineração de textos. A cada etapa do processo propôem-se soluções adaptadas às necessidades específicas de construçao de taxonomias de tópicos, dentre as quais algumas contribuições inovadoras ao estado da arte. Particularmente, este trabalho contribui em três frentes no estado da arte: seleção de atributos n-gramas em tarefas de mineração de textos, dois modelos para rotulação de agrupamento hierárquico de documentos e modelo de validação do processo de rotulação de agrupamento hierárquico de documentos. Além dessas contribuições, ocorrem outras em adaptações e metodologias de escolha de processos de seleção de atributos, forma de geração de atributos, visualização das taxonomias e redução das taxonomias obtidas. 
Finalmente, a metodologia desenvolvida foi aplicada a problemas reais, tendo obtido bons resultados. / Text mining provides powerful techniques to address the current need to understand and organize huge amounts of textual documents. One way to do this is to build topic taxonomies from these documents. Topic taxonomies can be used to organize the documents, preferably in hierarchies, and to identify groups of related documents and their descriptors. Constructing high-quality topic taxonomies, whether manually, automatically, or semi-automatically, is not a trivial task. This work aims to use text mining techniques to build topic taxonomies for well-defined knowledge domains, helping the domain expert to understand and organize document collections. Because the knowledge domains are well defined, only unsupervised statistical methods are used, with a bag-of-words representation of the textual documents. These representations are independent of the context of the words in the documents as well as in the domain. Thus, if the domain is well defined, fewer mistakes in interpreting the results are expected. The proposed methodology for topic taxonomy construction is an instantiation of the text mining process. At each step of the process, solutions are proposed and adapted to the specific needs of topic taxonomy construction. Among these solutions there are some innovative contributions to the state of the art. In particular, this work contributes to the state of the art in three ways: the selection of n-gram attributes in text mining tasks, two models for hierarchical document cluster labeling, and a validation model for hierarchical document cluster labeling. Additional contributions include adaptations and methodologies for choosing attribute selection processes, attribute generation, taxonomy visualization, and reduction of the obtained taxonomies. Finally, the proposed methodology was validated by successfully applying it to real problems.
|
313 |
Development of a GIS and model-based method for optimizing the selection of locations for drinking water extraction by means of riverbank filtration
Zhou, Yan 12 January 2021
The lack of safe drinking water worldwide has drawn the attention of decision makers to riverbank filtration (RBF) for its many advantages in purifying surface water. This study provides an overview of the hydrogeologic, fluvial, and environmental influences on the performance of RBF systems and aims to develop a model for RBF site selection. Using multi-attribute utility theory (MAUT), this study structured the RBF siting problem and assessed a multiplicative utility function for the decision maker. In a case study, geostatistical methods were used to acquire the necessary data and geographic information systems (GIS) were used to screen sites suitable for RBF implementation. Those suitable sites were then evaluated and ranked using the multi-attribute utility model. The result showed that sites can be identified as most preferred among the selected suitable sites based on their expected utility values. This study definitively answers the question regarding the capability of MAUT in RBF site selection. Further studies are needed to verify the influences of the attributes on the performance of RBF systems.

Abstract
Zusammenfassung
Acknowledgments
Table of Contents
List of Tables
List of Figures
Definition of terms
1. Abbreviations
2. Symbols
Part I Introduction
1. Introduction
2. Statement of purpose
3. Research questions
4. Overview of methodology
5. Organization of the dissertation
Part II Fundamentals and Literature Review
1. The definition of bank filtration
2. The Significance of RBF
2.1 RBF in drinking water supply
2.2 Benefits of RBF for China
3. RBF Site Selection
3.1 RBF site selection model
3.2 Definition of successful RBF sites
4. Factors Affecting RBF Site Selection
4.1 River hydrology/hydraulics
4.2 Geology
4.3 Land cover
4.4 Well field location
4.5 Water quality
4.6 Aquifer properties
4.7 Distance to river
4.8 Riverbed characteristics
5. Effect of Clogging on Yield
6. Summary
Part III Developing a Multi-attribute Utility Model for RBF Site Selection
1. Introduction
2. Objectives and Attributes
3. Assessment of the Utility Function
3.1 Investigation of the qualitative preference structure
3.2 Assessment of component utility function
3.3 Assessment of the scaling constants
4. Results
5. Discussion
6. Summary
Part IV Case Study
1. Introduction
2. Materials and Methods
2.1 GIS data collection
2.1.1 Geologic data
2.1.2 Land cover data
2.1.3 Groundwater quality data
2.1.4 Aquifer properties data
2.1.5 Surface water area data
2.1.6 Surface water quality data
2.1.7 Streambed material data
2.2 Kriging the saturated thickness
2.3 Aggregation of all constraint maps
3. Results
3.1 Kriging
3.2 Suitable sites
4. Discussion
4.1 A discussion of the kriging results
4.2 A discussion of the multi-attribute utility model results
5. Summary
Part V Conclusions and Recommendations
1. Conclusion and Recommendation
Appendix 1 Environmental quality standards for surface water (GB 3838-2002)
Appendix 2 Quality standard for groundwater (GB14848-93)
Appendix 3 Explanation to Germany’s RBF site location data
Appendix 4 Layer information of drillings
Appendix 5 Streambed materials used by Schälchli (1993)
Appendix 6 Interview and questionnaires
Appendix 7 Surface water area of Jilin City
Bibliography
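The abstract of this record mentions a multiplicative utility function and scaling constants. As a rough illustration, the standard multiplicative form from multi-attribute utility theory is 1 + K·U(x) = ∏(1 + K·k_i·u_i(x_i)), where the master constant K is the nonzero root of 1 + K = ∏(1 + K·k_i). The Python sketch below solves for K by bisection and evaluates U. The function names and the example scaling constants are hypothetical and are not taken from the dissertation.

```python
from math import prod


def solve_master_constant(k):
    """Nonzero root K of 1 + K = prod(1 + K * k_i), found by bisection.

    Assumes at least two scaling constants, each strictly between 0 and 1.
    """
    if len(k) < 2 or any(not 0.0 < ki < 1.0 for ki in k):
        raise ValueError("need at least two scaling constants in (0, 1)")
    total = sum(k)
    if abs(total - 1.0) < 1e-9:
        return 0.0  # constants sum to 1: the model reduces to the additive form

    def f(K):
        return prod(1.0 + K * ki for ki in k) - (1.0 + K)

    if total < 1.0:
        # Root is positive: f is negative just right of 0 and positive far out.
        lo, hi = 1.0, 1.0
        while f(lo) >= 0.0:
            lo /= 2.0
        while f(hi) <= 0.0:
            hi *= 2.0
    else:
        # Root lies in (-1, 0): f(-1) > 0 and f is negative just left of 0.
        lo, hi = -1.0, -0.5
        while f(hi) >= 0.0:
            hi /= 2.0

    f_lo = f(lo)
    for _ in range(200):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def multiplicative_utility(k, u, K):
    """Overall utility from component utilities u_i in [0, 1]."""
    if K == 0.0:
        return sum(ki * ui for ki, ui in zip(k, u))
    return (prod(1.0 + K * ki * ui for ki, ui in zip(k, u)) - 1.0) / K


# Hypothetical scaling constants for three site attributes.
k = [0.4, 0.3, 0.2]
K = solve_master_constant(k)
site_score = multiplicative_utility(k, [0.9, 0.5, 0.7], K)
```

Candidate sites could then be ranked by their scores. With this form, a site that is best on every attribute scores 1 and a site that is worst on every attribute scores 0.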
|
314 |
Grammar-Based Translation Framework
Vít, Radek January 2019
In this thesis, we explore existing algorithms for accepting languages defined by context-free grammars. Based on this knowledge, we propose a new model for representing LR automata and use it to define a new algorithm, LSCELR. We modify the language-acceptance algorithms to create algorithms for translation based on translation grammars. We define attribute translation grammars as extended translation grammars for defining relationships between the input and output symbols of a translation. We implement a grammar-based translation framework, ctf, which implements translation using LSCELR. We define a language for describing attribute translation grammars and implement a compiler that translates this representation into source code for the implemented framework.
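To make the notion of a translation grammar concrete, the following Python sketch translates infix arithmetic into postfix notation. Each grammar rule reads input symbols and emits output symbols, for example E → T { op T [emit op] }. It deliberately uses a hand-written recursive-descent parser rather than an LR automaton or the LSCELR algorithm, and it is not based on the ctf framework. All names are illustrative assumptions.

```python
import re


def tokenize(text):
    """Split the input into identifiers, numbers, operators, and parentheses.

    Characters outside this small alphabet are silently skipped.
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", text)


def to_postfix(text):
    """Translate an infix expression to postfix using the translation rules:

    E -> T { ('+' | '-') T [emit op] }
    T -> F { ('*' | '/') F [emit op] }
    F -> operand [emit operand] | '(' E ')'
    """
    tokens = tokenize(text)
    pos = 0
    out = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            out.append(op)  # output symbol emitted after both operands

    def term():
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            out.append(op)

    def factor():
        tok = peek()
        if tok == "(":
            advance()
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif tok is not None and re.fullmatch(r"\w+", tok):
            out.append(advance())  # operands are copied to the output
        else:
            raise SyntaxError(f"unexpected token: {tok!r}")

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token: {peek()!r}")
    return " ".join(out)


postfix = to_postfix("a + b * (c - 1)")
```

An attribute translation grammar would additionally attach values to the symbols, for instance carrying the text of each operand from the input side to the output side, which the sketch does implicitly by copying tokens.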
|
315 |
Optimalizace strukturovaných dotazů nad rozsáhlými databázemi / Optimization of Structured Queries on Large Databases
Janeček, Jiří January 2012
This master's thesis deals with the optimization of structured queries on large databases. The principles of these optimizations are applied in the development of an application that allows searching over one specific large database. The thesis also compares the efficiency of the newly designed SQL constructs with that of the non-optimized ones.
|
316 |
Kryptografie a ochrana soukromí / Cryptography and Privacy Protection
Malík, Ondrej January 2021
The main goal of this diploma thesis was to create web applications for the issuer, verifier, and revocation authority of a revocable keyed-verification anonymous credentials (RKVAC) system. The applications created in this thesis provide functions for all tasks that are performed by each entity. Using these applications, global management of the RKVAC system is possible. The authentication module created in the verifier's app is universally usable for access control to any web service. Both the issuer's and the revocation authority's apps are compatible with the whole RKVAC system and are therefore applicable as central elements of such systems.
|
317 |
Reconceptualizing mathematics teaching and learning: Teacher learning in a realistic mathematics context
Smith, Edward Charles January 2000
Philosophiae Doctor - PhD / In this study the construct of personal theories is used to represent the teacher's conceptions, which are interpreted as consciously held beliefs. The teacher's personal theories encompass beliefs, images, values and attitudes as well as understanding about teaching and learning. This study investigates the teacher's conceptions of mathematics, of the teaching and learning of mathematics, and of the influence of the context, before and after a structured learning experience. The interest in the teacher's conceptions is derived from the assumption that these serve as a primary component that influences how teachers think about their professional responsibilities and how they act in their classrooms. Furthermore, the extent of implementation of a new curriculum has been linked to the degree of congruence between the teachers' conceptions and the underpinning philosophy of the intended curriculum. The study of the teacher's conceptions is especially relevant during a time of educational reform, such as the current transition to an Outcomes Based Education curriculum in South Africa. The participants in this study are four primary school mathematics teachers with various educational backgrounds, who teach at schools situated in different physical environments. The conceptions that these teachers have of mathematics, of the teaching and learning of mathematics, and of the influence of the context are investigated using a variety of instruments. Data were collected with a questionnaire, a repertory grid, a semi-structured interview and lesson observations. The teachers participated in the Teaching Intervention and Support Programme (TISP), a structured teacher learning experience. The programme is centred on the integration of the developmental and socio-cultural perspectives on teacher learning.
With the developmental perspective the focus is on the acquisition of intellectual skills, while the socio-cultural perspective emphasizes participation in social practice. Both are directed at effecting conceptual change. With the developmental approach the process of conceptual change involves the development of new conceptions from existing conceptions. From the socio-cultural perspective the context is paramount and conceptual change is seen as new ways of being and acting within a particular context. The teachers were invited to attend a two-week intervention session, followed by a six-month support programme that was aimed at establishing a teacher learning community. The learning experiences provided during the intervention session were drawn mainly from Realistic Mathematics Education. On completion of the programme, the teachers' conceptions of mathematics, of the teaching and learning of mathematics, and of the influence of the context were again investigated.
The results of this study show that two of the participants had highly mechanistic conceptions of mathematics and of the teaching and learning of mathematics. The remaining two had a more empiristic approach, with its strong focus on environmental activities. After the programme, the teachers with the mechanistic views adopted a mixed conception, with some of the mechanistic conceptions retained but now interspersed with some empiristic and realistic conceptions. The participants with the empiristic conceptions adopted a more realistic conception, but again to varying degrees. Thompson's (1991) hierarchical structure for the development of conceptions was also used to describe the extent of conceptual change. However, it was found that a concentric, rather than a hierarchical, representation is more appropriate for describing these changes. With regard to the socio-cultural view of conceptual change, all the participants perceived the context differently. The teachers' actions were also more commensurate with the practices associated with teachers who encourage learner autonomy, mathematical investigations and a facilitative role for the teacher.
|
318 |
Databázový systém pro správu biologických dat / Database System for Biological Data Management
Drlík, Radovan January 2010
This thesis describes the problems of storing and managing biological data, particularly data on haloalkane dehalogenase enzymes. Furthermore, the thesis focuses on the HADES (HAloalkane DEhalogenase databaSe) project, initiated by the protein engineering group of Loschmidt Laboratories, Masaryk University in Brno. The main goal of this project is to store, preserve, and display a wide variety of protein data. The result of this work is a flexible database system that is easy to extend and maintain, built on Apache, PostgreSQL, and PHP using the Zend Framework.
|
319 |
Attribute Exploration on the Web
Jäschke, Robert, Rudolph, Sebastian 28 May 2013
We propose an approach for supporting attribute exploration by web information retrieval, in particular by posing appropriate queries to search engines, crowd sourcing systems, and the linked open data cloud. We discuss underlying general assumptions for this to work and the degree to which these can be taken for granted.
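In attribute exploration, an expert is repeatedly asked whether an implication between attributes holds and, if it does not, to supply a counterexample. The Python sketch below shows that single step against a small local formal context. It is only an illustration of the underlying idea. In the approach described above, the counterexample would be sought through web sources rather than a local dictionary. The data and function names here are hypothetical.

```python
def closure(context, attrs):
    """Attributes shared by every object that has all of `attrs`."""
    attrs = set(attrs)
    all_attrs = attrs.union(*context.values())
    matching = [obj_attrs for obj_attrs in context.values() if attrs <= obj_attrs]
    if not matching:
        # No object has all of `attrs`, so every attribute follows vacuously.
        return all_attrs
    return set.intersection(*matching)


def find_counterexample(context, premise, conclusion):
    """First object having every premise attribute but lacking part of the conclusion."""
    premise, conclusion = set(premise), set(conclusion)
    for obj, obj_attrs in context.items():
        if premise <= obj_attrs and not conclusion <= obj_attrs:
            return obj
    return None  # the implication holds in this context


# Toy formal context: object -> set of attributes (illustrative data only).
context = {
    "sparrow": {"bird", "flies"},
    "eagle": {"bird", "flies", "predator"},
    "penguin": {"bird", "swims"},
}

rejected = find_counterexample(context, {"bird"}, {"flies"})
accepted = find_counterexample(context, {"predator"}, {"flies"})
implied_by_flies = closure(context, {"flies"})
```

Here the implication "bird → flies" is refuted by the penguin, while "predator → flies" has no counterexample among the known objects. That second case is where an external source would be consulted before the implication is accepted.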
|
320 |
What determines who qualifies? : A quantitative study on the presence of first- and second-level agenda setting and issue ownership in the 2020 Democratic primary debates. / Vad avgör vem som går vidare? : En kvantitativ studie av förekomsten av första och andra nivån av dagordningsteorin samt issue ownership i demokraternas primärdebatter 2020.
Boström, Lovisa January 2021
The purpose of this study is to investigate the presence of first- and second-level agenda setting as well as issue ownership in the 2020 Democratic primary debates and whether there is a relationship between using strategies based on these theories and qualifying for future debates. The study seeks to answer three research questions: What is the relationship, if any, between a candidate whose statements focused primarily on the three issues considered most important by the public according to opinion polls and whether this candidate qualified for future debates? How did candidates use frames to redraw the attention of issues? What is the relationship, if any, between the extent to which a candidate’s statements discussed performance issues more than Republican-owned or Democratic-owned issues and whether this candidate qualified for future debates? The study draws mainly on the first and second level of the agenda setting theory, as well as the theory of issue ownership, and analyzes what issues candidates focus on, what attributes of these issues they emphasize, and whether they discuss performance issues like the economy or foreign policy more than issues owned by either the Republican or the Democratic Party. Through a quantitative content analysis of four candidates’ (Joe Biden, Bernie Sanders, Amy Klobuchar, & Andrew Yang) statements from three of the eleven primary debates held in the 2020 primary process, the study found no direct relationship between focusing on the public’s three most important issues and qualifying for future debates. 
Similarly, no such relationship was found between emphasizing certain attributes and qualifying for future debates, although the results suggest that candidates may have benefited from avoiding framing issues economically, which concurs with previous findings (Boydstun, Glazier, & Pietryka, 2013a; Boydstun, Glazier, & Phillips, 2013) and supports Vavreck’s (2009) theory that insurgent candidates should not emphasize the economy. Findings also demonstrated the contrasting ways three of the candidates framed the same issues, where Joe Biden and Amy Klobuchar tended to emphasize economic frames when discussing Medicare while Bernie Sanders emphasized effectiveness. Lastly, the findings support previous research on issue ownership since findings showed that most candidates discussed Democratic-owned issues more than other issues, while the eventual presidential nominee, Joe Biden, overall discussed performance issues more than issues owned by either party. This suggests that focusing on such issues may be beneficial for challenging candidates during an election cycle where the sitting president has been criticized for not being able to handle the job. Thus, no direct relationship could be found in the case of RQ1 or RQ2 but discussing performance issues the most overall may have benefited one candidate, suggesting there is a relationship in the case of RQ3. / Syftet med denna studie är att undersöka förekomsten av första och andra nivån av dagordningsteorin samt av issue ownership i Demokraternas primärdebatter 2020 och huruvida det finns någon relation mellan att använda strategier baserade på dessa teorier och att kvalificera sig för framtida debatter. Studien undersöker tre frågeställningar: Vad är relationen, om någon, mellan en kandidat vars uttalanden under debatterna fokuserade främst på de tre frågor som väljarna ansåg var viktigast enligt opinionsundersökningar och huruvida denna kandidat kvalificerade sig för framtida debatter? 
Hur använde kandidaterna ”frames” för att kontrollera diskussionen kring frågor? Vad är relationen, om någon, mellan den utsträckning en kandidats uttalanden under debatterna diskuterade så kallade ”performance issues” mer än frågor ägda av det republikanska eller demokratiska partiet och huruvida denna kandidat kvalificerade sig för framtida debatter? Studien bygger huvudsakligen på den första och andra nivån av dagordningsteorin, liksom teorin om issue ownership, och analyserar vilka frågor kandidaterna fokuserar på, vilka attribut de betonar när de talar om dessa frågor och om de diskuterar performance issues såsom ekonomi eller utrikespolitik mer än frågor som ägs av antingen republikanska eller demokratiska partiet. Genom en kvantitativ innehållsanalys av fyra kandidaters (Joe Biden, Bernie Sanders, Amy Klobuchar och Andrew Yang) uttalanden från tre av de elva primärdebatterna som hölls under primärprocessen 2020 fann studien ingen direkt relation mellan att fokusera på de tre frågor som väljarna ansåg var viktigast och att kvalificera sig för framtida debatter. Det hittades inte heller någon sådan relation mellan att betona vissa attribut och att kvalificera sig för framtida debatter, även om resultaten tyder på att kandidater kan ha haft nytta av att undvika att betona ekonomiska attribut, vilket överensstämmer med tidigare resultat (Boydstun, Glazier, & Pietryka, 2013a; Boydstun, Glazier, & Phillips, 2013) och stöttar Vavrecks (2009) teori att så kallade ”insurgent candidates” drar nytta av att inte diskutera ekonomin mer än nödvändigt. Resultaten visade också hur olika kandidaternas inramning av en specifik fråga var, då Joe Biden och Amy Klobuchar hade en tendens att betona ekonomiska attribut när de talade om frågor gällande Medicare medan Bernie Sanders fokuserade mer på effektivitetsattribut. 
Slutligen stöder studien tidigare forskning om issue ownership då resultaten visade på att de flesta kandidater diskuterade frågor ägda av det demokratiska partiet mer än andra frågor, medan den kandidat som slutligen skulle få det demokratiska partiets presidentsnominering, Joe Biden, totalt sett diskuterade performance issues mer än frågor ägda av något av partierna. Detta tyder på att ett fokus på sådana frågor kan vara till nytta för att utmanande kandidater under en valcykel där den sittande presidenten har kritiserats för sin hantering av arbetet. Således kunde ingen direkt relation hittas när det gällde RQ1 eller RQ2, men resultaten tyder på att en kandidat kan ha gynnats av att diskutera performance issues mest över lag, vilket i sig tyder på att det finns en relation gällnade RQ3.
|