21

The influence of cross-cultural interviewing on the generation of data

Tabane, Ramodungoane James 04 February 2005 (has links)
Researchers use interviews as one of the principal means of collecting information about people. Interviewing is an important data collection instrument in research: although the collection of particular data is never guaranteed, interviews provide an opportunity to obtain it. The reasons why targeted data are successfully collected, or not, are varied, and the cultural composition of the interview situation may be one of them. This study focused on selected cultural dimensions, namely race, gender and language, as possible dimensions influencing the generation of data in terms of volume, expression, range, content and content formulation. Data collected during culturally varied interviews were presented, and the influence that the three dimensions might have had on the generation of data was emphasized. A Response Process Model was used to interpret the response process that an individual might go through before answering a posed question. In addition to the demands of meeting the question’s objective, an individual might face extraneous and internal cues that can be exacerbated by the cross-cultural composition of the interview situation, imposing extra demands on the individual and ultimately affecting the response given. The study indicated that response processes were at times altered, possibly to suit the cross-cultural interview situation. / Dissertation (MEd (Educational Psychology))--University of Pretoria, 2006. / Educational Psychology / unrestricted
22

Geração automática de dados de teste para programas concorrentes com meta-heurística / Automatic test data generation for concurrent programs with metaheuristic

José Dario Pintor da Silva 22 September 2014 (has links)
Concurrent programming is increasingly used in modern systems to reduce costs and achieve higher processing efficiency, so programs written in this paradigm are expected to be of high quality and free of defects. Different testing techniques and criteria have therefore been defined to support the verification and validation of concurrent applications. In this context, automatic test data generation is important because it reduces the cost of generating and selecting relevant test data. Metaheuristic techniques have attracted wide research interest for this task because they offer approaches applicable to complex, hard-to-solve problems. Considering these aspects, this work presents an approach for automatic test data generation for the structural testing of concurrent MPI (Message Passing Interface) programs. The metaheuristic used is a genetic algorithm, in which the search is guided by structural testing criteria that take implicit characteristics of concurrent programs into account. The performance of the approach was evaluated in terms of test data coverage, effectiveness in revealing defects, and execution cost, with random generation used as a baseline for comparison. The results indicate that test data generation is promising in the context of concurrent programs, with good results regarding effectiveness and coverage of the testing requirements.
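The thesis's own implementation is not reproduced here, but the core idea of a coverage-guided genetic algorithm can be sketched as follows. This is a minimal illustration in Python under stated assumptions: `run_and_get_coverage` is a toy stand-in for a harness that would execute the MPI program under test and report which structural test requirements it exercised, and the fitness of a candidate input is simply how many not-yet-covered requirements it reaches.

```python
import random

def run_and_get_coverage(test_input):
    """Toy stand-in for the test harness: returns the set of covered test
    requirements. A real harness would run the concurrent program and trace,
    e.g., exercised send/receive pairs or covered branches."""
    x, y, z = test_input
    covered = {"start"}
    if x > 100:
        covered.add("branch_x")
    if y == z:
        covered.add("branch_yz")
    if x > 100 and y == z and z < 0:
        covered.add("deep_branch")
    return covered

def genetic_test_data_generation(pop_size=50, generations=100, genes=3,
                                 gene_range=(-1000, 1000), mutation_rate=0.1):
    # Random initial population of candidate test inputs.
    population = [[random.randint(*gene_range) for _ in range(genes)]
                  for _ in range(pop_size)]
    covered, suite = set(), []

    for _ in range(generations):
        # Fitness: how many not-yet-covered requirements an input exercises.
        scored = sorted(population,
                        key=lambda ind: len(run_and_get_coverage(ind) - covered),
                        reverse=True)
        best = scored[0]
        gain = run_and_get_coverage(best) - covered
        if gain:                      # keep inputs that add new coverage
            covered |= gain
            suite.append(best)
        # Selection, one-point crossover and mutation for the next generation.
        parents = scored[: pop_size // 2]
        population = []
        while len(population) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, genes)
            child = a[:cut] + b[cut:]
            if random.random() < mutation_rate:
                child[random.randrange(genes)] = random.randint(*gene_range)
            population.append(child)
    return suite, covered

suite, covered = genetic_test_data_generation()
print(len(suite), "test inputs covering", covered)
```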
23

Nástroj pro tvorbu obsahu databáze pro účely testování software / Test Data Generator for Relational Databases

Kotyz, Jan January 2018 (has links)
This thesis deals with the problem of test data generation for relational databases. The aim of this thesis is to design and implement a tool that meets defined constraints and allows us to generate test data. The tool uses an SMT solver for constraint solving and test data generation.
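The thesis's tool is not reproduced here, but the basic pattern of using an SMT solver to produce rows that satisfy schema constraints can be sketched with the Z3 Python bindings. The table, column names and constraints below are illustrative assumptions rather than the thesis's actual schema.

```python
# pip install z3-solver
from z3 import Int, Solver, Distinct, And, sat

def generate_rows(n_rows=5):
    """Generate test rows for a hypothetical users(id, age) table where ids
    must be unique and positive, and age must lie in [18, 120]."""
    solver = Solver()
    ids = [Int(f"id_{i}") for i in range(n_rows)]
    ages = [Int(f"age_{i}") for i in range(n_rows)]

    solver.add(Distinct(*ids))                           # primary-key uniqueness
    for i in range(n_rows):
        solver.add(ids[i] > 0)                           # surrogate keys are positive
        solver.add(And(ages[i] >= 18, ages[i] <= 120))   # CHECK-style constraint

    if solver.check() != sat:
        raise ValueError("constraints are unsatisfiable")
    model = solver.model()
    return [(model[ids[i]].as_long(), model[ages[i]].as_long())
            for i in range(n_rows)]

print(generate_rows())
```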
24

Test data generation based on binary search for class-level testing

Beydeda, Sami, Gruhn, Volker 08 November 2018 (has links)
One of the important tasks during software testing is the generation of appropriate test data. Various techniques have been proposed to automate this task. The techniques available, however, often have problems limiting their use. In the case of dynamic test data generation techniques, a frequent problem is that a large number of iterations might be necessary to obtain test data. This article proposes a novel technique for automated test data generation based on binary search. Binary search conducts searching tasks in logarithmic time, as long as its assumptions are fulfilled. This article shows that these assumptions can also be fulfilled in the case of path-oriented test data generation and presents a technique which can be used to generate test data covering certain paths in class methods.
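As an illustration of the underlying idea (not the authors' algorithm itself), binary search can locate an input that flips a branch predicate in logarithmic time once the predicate is monotone over the search interval, analogous to finding a boundary value. The `branch_taken` predicate and its threshold below are made-up examples.

```python
def branch_taken(x):
    """Made-up branch predicate of a method under test: becomes True once the
    input crosses a threshold that the tester does not know in advance."""
    return x >= 7_345

def binary_search_test_input(lo=0, hi=1_000_000):
    """Find the smallest input in [lo, hi] that makes the branch predicate
    switch from False to True, assuming it is monotone on that interval."""
    assert not branch_taken(lo) and branch_taken(hi)
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if branch_taken(mid):
            hi = mid          # boundary is at mid or below
        else:
            lo = mid          # boundary is strictly above mid
    return hi                 # first input that takes the branch

print(binary_search_test_input())   # finds 7345 after about 20 probes
```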
25

Method for Collecting Relevant Topics from Twitter supported by Big Data

Silva, Jesús, Senior Naveda, Alexa, Gamboa Suarez, Ramiro, Hernández Palma, Hugo, Niebles Núñez, William 07 January 2020 (has links)
There is a rapid increase in information and data generation in virtual environments due to microblogging sites such as Twitter, a social network that produces an average of 8,000 tweets per second and up to 550 million tweets per day. As a result, this and many other social networks are overloaded with content, making it difficult for users to identify information topics because of the large number of tweets related to different issues. Because of the uncertainty this creates for users, this study proposes a method for inferring the most representative topics that occurred in a one-day period through the selection of user profiles that are experts in sports and politics. The relevance of a topic is calculated from the number of times it was mentioned by experts in their timelines. The experiment used a dataset extracted from Twitter containing 10,750 tweets related to sports and 8,758 tweets related to politics. All tweets were obtained from the timelines of users selected by the researchers, who were considered experts in their respective subjects based on the content of their tweets. The results show that the effective selection of users, together with the relevance index implemented for the topics, can help to find important topics more easily in both sports and politics.
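A minimal sketch of the kind of mention-count relevance index described above; the exact formula and data schema used in the study are not given here, so the tweet fields and topic matching below are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

def topic_relevance(tweets, expert_ids, topics, day):
    """Count, per topic, how many tweets by expert users within the given day
    mention that topic. `tweets` is an iterable of dicts with 'user_id',
    'text' and 'created_at' (datetime) fields -- an assumed schema."""
    start, end = day, day + timedelta(days=1)
    counts = Counter()
    for t in tweets:
        if t["user_id"] not in expert_ids:
            continue
        if not (start <= t["created_at"] < end):
            continue
        text = t["text"].lower()
        for topic in topics:
            if topic.lower() in text:
                counts[topic] += 1
    return counts.most_common()      # most relevant topics first

tweets = [
    {"user_id": 1, "text": "Great match in the Champions League", "created_at": datetime(2020, 1, 7, 10)},
    {"user_id": 2, "text": "Congress debates the new budget", "created_at": datetime(2020, 1, 7, 12)},
    {"user_id": 1, "text": "Champions League final preview", "created_at": datetime(2020, 1, 7, 18)},
]
print(topic_relevance(tweets, expert_ids={1, 2},
                      topics=["Champions League", "budget"],
                      day=datetime(2020, 1, 7)))
```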
26

Semantic Segmentation with Carla Simulator

Malec, Stanislaw January 2021 (has links)
Autonomous vehicles perform semantic segmentation to orient themselves, but training neural networks for semantic segmentation requires large amounts of labeled data. A hand-labeled real-life dataset requires considerable effort to create, so we instead turn to virtual simulators, where the segmentation labels are known, to generate large datasets virtually for free. This work investigates how effective synthetic datasets are in driving scenarios by collecting a dataset from a simulator and testing it against a real-life hand-labeled dataset. We show that we can get a model up and running faster by mixing synthetic and real-life data than with traditional dataset collection methods, and achieve close to baseline performance.
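One common way to mix simulator-generated and real data for training is to concatenate the two datasets into a single loader; the PyTorch sketch below is an illustrative assumption, not the thesis's exact setup, and the placeholder dataset stands in for reading CARLA renders or hand-labeled real images from disk.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, Dataset

class SegmentationFolder(Dataset):
    """Placeholder dataset yielding (image, mask) tensors; in practice this
    would read CARLA renders or hand-labeled real images from disk."""
    def __init__(self, n_samples, n_classes=13, size=(3, 256, 512)):
        self.n, self.c, self.size = n_samples, n_classes, size
    def __len__(self):
        return self.n
    def __getitem__(self, idx):
        image = torch.rand(self.size)                      # fake RGB image
        mask = torch.randint(0, self.c, self.size[1:])     # fake class mask
        return image, mask

synthetic = SegmentationFolder(n_samples=10_000)   # cheap, simulator-labeled
real = SegmentationFolder(n_samples=500)           # scarce, hand-labeled

# Mixing both sources in one loader lets the model see abundant synthetic
# data while still being anchored by real-world examples.
loader = DataLoader(ConcatDataset([synthetic, real]), batch_size=8, shuffle=True)
images, masks = next(iter(loader))
print(images.shape, masks.shape)
```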
27

On the use of routing engines for dynamic travel time calculation within emergency vehicle transport simulation

Juninger, Marcus, Narvell, Nicholas January 2023 (has links)
Traditional methods for constructing simulation models can involve several steps that require manual pre-processing of large data sets. This process may be time-consuming and prone to human errors, while also leading to models that are inconvenient to customize for varying simulation scenarios. In this thesis, we propose an alternate data preparation methodology in emergency vehicle transport simulation, which aims to eliminate parts of the manual pre-processing. Our research is based on a previous case study using data from Sweden’s Southern Healthcare Region. The methodology we propose is instantiated through a proof-of-concept software module that replaces previously used static input sets by introducing dynamic runtime calculations of ambulance travel times. This was done in two steps where we first evaluated several routing engines according to needs extracted from the studied case. Secondly, we implemented and integrated the chosen routing engine into the previously mentioned module. Testing of the module showed feasible and consistent performance, demonstrating the potential usage of our proposed methodology in emergency vehicle transport simulation.
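The thesis's chosen routing engine is not named here; as an illustration only, a dynamic travel-time lookup against a self-hosted OSRM instance could look like the sketch below. The endpoint URL and coordinates are assumptions; the request and response fields follow OSRM's public route API.

```python
import requests

OSRM_URL = "http://localhost:5000"   # assumed self-hosted OSRM instance

def travel_time_seconds(origin, destination):
    """Ask the routing engine for the fastest driving route and return its
    duration in seconds. `origin`/`destination` are (lat, lon) tuples;
    OSRM expects coordinates as lon,lat in the URL path."""
    (olat, olon), (dlat, dlon) = origin, destination
    url = f"{OSRM_URL}/route/v1/driving/{olon},{olat};{dlon},{dlat}"
    resp = requests.get(url, params={"overview": "false"}, timeout=5)
    resp.raise_for_status()
    data = resp.json()
    if data.get("code") != "Ok":
        raise RuntimeError(f"routing failed: {data.get('code')}")
    return data["routes"][0]["duration"]

# Example: ambulance station to incident site (placeholder coordinates).
print(travel_time_seconds((55.605, 13.003), (55.870, 12.831)))
```

Calling the engine at simulation runtime, as sketched above, is what replaces the static pre-computed travel-time input sets.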
28

Data Generation in Metal Recycling Using Unconditional Diffusion Models

Andersson, Sebastian January 2023 (has links)
Combitech AB was interested in how to automate the process of annotating aluminum scrap when it is adjacent to other metals, with the ultimate goal of creating an annotated dataset that could be used for training a segmentation model. The idea was to use generative models to generate samples of general scrap metals and then, with this model, introduce a small dataset of only aluminum to try to shift the learned features into a domain suitable for aluminum. Since the contents of the samples were generated separately, the system would know where the aluminum was and could then annotate it.  This master's thesis aimed to investigate whether it was possible to construct generative models to generate such samples and to see whether they had realistic characteristics. It was also investigated whether it was possible to obtain a meaningful model based on a relatively small dataset (aluminum in this case). The data used were two datasets, one with general scrap metal (excluding aluminum) and the other containing only aluminum scrap. Unconditional diffusion models were used as the generative models. The scrap model achieved satisfactory results, making it possible to generate samples with properties similar to the real scrap dataset. For aluminum, which had a much smaller dataset than the scrap dataset, promising results were obtained using transfer learning, although the quality did not match that of the scrap model. This master's thesis has shown that it is possible to get a model to generate realistic-looking images of scrap metal. Furthermore, this scrap model served as a good base when training other generative models to generate images of metals, even when the provided datasets were small. In this way, a foundation was laid for the investigation of an automatic annotation system.
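A minimal sketch of the transfer-learning idea described above, using the Hugging Face diffusers library: pretrain an unconditional DDPM on the large scrap dataset, then fine-tune it on the small aluminum set. The checkpoint name, image size, tensor stand-ins and hyperparameters are assumptions, not the thesis's actual configuration.

```python
from pathlib import Path
import torch
import torch.nn.functional as F
from diffusers import UNet2DModel, DDPMScheduler
from torch.utils.data import DataLoader, TensorDataset

# Denoising network and noise schedule for an unconditional DDPM.
model = UNet2DModel(sample_size=64, in_channels=3, out_channels=3)
scheduler = DDPMScheduler(num_train_timesteps=1000)

# Transfer learning: start from weights trained on the large general-scrap
# dataset, then continue training on the small aluminum-only dataset.
ckpt = Path("scrap_pretrained.pt")            # hypothetical checkpoint name
if ckpt.exists():
    model.load_state_dict(torch.load(ckpt))

aluminum_images = torch.rand(256, 3, 64, 64) * 2 - 1   # stand-in data in [-1, 1]
loader = DataLoader(TensorDataset(aluminum_images), batch_size=16, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # low LR for fine-tuning

model.train()
for epoch in range(5):
    for (batch,) in loader:
        noise = torch.randn_like(batch)
        timesteps = torch.randint(0, scheduler.config.num_train_timesteps,
                                  (batch.shape[0],))
        noisy = scheduler.add_noise(batch, noise, timesteps)
        pred = model(noisy, timesteps).sample     # the model predicts the added noise
        loss = F.mse_loss(pred, noise)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```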
29

Génération de données : de l’anonymisation à la construction de populations synthétiques / Data generation: from anonymization to the construction of synthetic populations

Jutras-Dubé, Pascal 11 1900 (has links)
The high costs of data collection often mean that only a subset of the population of interest can be sampled. The data collected may also contain personal and sensitive information about the individuals in it, so that it is protected by laws or strict data security and governance practices. In both cases, access to the data is restricted. Our work considers two research angles under which the generation of synthetic data can be used to design analysis models when the real data are inaccessible. Under the first angle, synthetic data generation substitutes for census data, taking the form of a synthetic population made up of individuals described by their attributes at the individual and household levels. We propose copulas as a new approach for modeling a population of interest of which only the marginal distributions are known, given a sample from another population that shares similar interdimensional dependencies. We compare copulas to iterative proportional fitting, a widespread technique in population synthesis, and also to modern machine learning approaches such as Bayesian networks, variational autoencoders, and generative adversarial networks, on the task of generating populations of Maryland based on data from the US census. Our experiments show that copulas outperform iterative proportional fitting in modeling interdimensional relationships and that the marginal distributions of the data they generate match those of the population of interest better than those of the data generated by the machine learning methods. The second angle considers the generation of privacy-preserving data. Since de-identification of data is inversely related to its utility, we study to what extent k-anonymity and generative modeling provide useful data relative to the sensitive data they replace. We find that it is indeed possible to use these privacy definitions to publish useful data, but the question of comparing their privacy guarantees remains open.
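As a rough illustration of the copula idea (not the thesis's actual model), a Gaussian copula can combine known marginal distributions of the target population with a dependence structure estimated from a donor sample. The variables, distributions and parameters below are assumptions chosen for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Donor sample from another population: (age, household size). Its
# interdimensional dependence structure is what we want to borrow.
donor = np.column_stack([
    rng.normal(40, 12, 2000),
    rng.poisson(2, 2000) + 1,
])

# 1. Estimate the Gaussian-copula correlation from the donor's rank transform.
ranks = np.argsort(np.argsort(donor, axis=0), axis=0) + 1
u = ranks / (len(donor) + 1)                 # pseudo-observations in (0, 1)
z = stats.norm.ppf(u)
corr = np.corrcoef(z, rowvar=False)

# 2. Sample correlated normals and push them through the *target* population's
#    known marginals (assumed here: age ~ N(35, 10), household size ~ Poisson(2)+1).
n = 5000
z_new = rng.multivariate_normal(np.zeros(2), corr, size=n)
u_new = stats.norm.cdf(z_new)
age = stats.norm.ppf(u_new[:, 0], loc=35, scale=10)
household = stats.poisson.ppf(u_new[:, 1], mu=2) + 1

synthetic_population = np.column_stack([age, household])
print(synthetic_population[:5])
```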
30

Generative Adversarial Networks for Vehicle Trajectory Generation / Generativa Motståndarnätverk för Generering av Fordonsbana

Bajarunas, Kristupas January 2022 (has links)
Deep learning models rely heavily on an abundance of data, and their performance is directly affected by data availability. In mobility pattern modeling, problems such as next-location prediction or flow prediction are commonly solved using deep learning approaches. Despite advances in modeling techniques, complications arise when the acquisition of mobility data is limited by geographic factors and data protection laws. Generating high-quality synthetic data is one way to get around situations in which information is scarce. Trajectory generation is concerned with generating trajectories that reproduce the spatial and temporal characteristics of the underlying original mobility patterns. The task of this project was to evaluate Generative Adversarial Network (GAN) capabilities to generate synthetic vehicle trajectory data. We extend the methodology of previous research on trajectory generation by introducing conditional trajectory duration labels and a model pretraining mechanism. The evaluation of generated trajectories consisted of a two-fold analysis: qualitative analysis by visually inspecting generated trajectories, and quantitative analysis by calculating the statistical distance between synthetic and original data distributions. The results indicate that extending the previous GAN methodology allows the novel model to generate trajectories statistically closer to the original data distribution. Nevertheless, a statistical base model has the best generative performance and is the only model to generate visually plausible results. We attribute the superior performance of the statistical base model to the highly predictive nature of vehicle trajectories, which must follow the road network and tend to follow minimum-distance routes. This research considered only one type of GAN-based model, and further research should explore other architecture alternatives to fully understand the potential of GAN-based models.
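A sketch of the kind of quantitative check described above: comparing synthetic and real trajectories through the statistical distance between the distributions of a derived feature (here, trip length), using histograms and the Jensen-Shannon distance. The feature choice and the random-walk stand-in data are illustrative assumptions, not the project's actual evaluation pipeline.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(1)

def trip_lengths(trajectories):
    """Total length of each trajectory, where a trajectory is an (n, 2)
    array of x/y positions sampled along the route."""
    return np.array([np.linalg.norm(np.diff(t, axis=0), axis=1).sum()
                     for t in trajectories])

# Stand-ins for real and GAN-generated trajectories (random walks here).
real = [np.cumsum(rng.normal(0, 10, (50, 2)), axis=0) for _ in range(500)]
synthetic = [np.cumsum(rng.normal(0, 12, (50, 2)), axis=0) for _ in range(500)]

# Histogram both feature distributions on a common support, then compare.
lengths_real, lengths_syn = trip_lengths(real), trip_lengths(synthetic)
bins = np.histogram_bin_edges(np.concatenate([lengths_real, lengths_syn]), bins=30)
p, _ = np.histogram(lengths_real, bins=bins, density=True)
q, _ = np.histogram(lengths_syn, bins=bins, density=True)

# 0 means identical distributions; values closer to 1 mean further apart.
print("Jensen-Shannon distance:", jensenshannon(p, q, base=2))
```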
