91

Représentation d'images hiérarchique multi-critère / Hierarchical multi-feature image representation

Randrianasoa, Tianatahina Jimmy Francky 08 December 2017 (has links)
Segmentation is a crucial task in image analysis. Novel acquisition devices produce images of higher resolution, containing more heterogeneous objects, and it has also become easier to obtain many images of the same area from different sources. This phenomenon is encountered in many domains (e.g. remote sensing, medical imaging) and makes classical image segmentation methods difficult to apply. Hierarchical segmentation approaches provide solutions to such issues. In particular, the Binary Partition Tree (BPT) is a hierarchical data structure modeling image content at different scales. It is built in a mono-feature way (i.e. one image, one metric) by progressively merging similar connected regions. However, the metric has to be carefully designed by the user, and the handling of several images is generally dealt with by gathering the information provided by various spectral bands into a single metric. Our first contribution is a generalized framework for BPT construction in a multi-feature way. It relies on a strategy that establishes a consensus between several metrics, allowing us to obtain a unified hierarchical segmentation space. Surprisingly, few works have been devoted to the evaluation of such hierarchical structures. Our second contribution is a framework for evaluating the quality of BPTs, relying on both intrinsic and extrinsic quality analysis based on ground-truth examples. We also discuss the use of this evaluation framework both for assessing the quality of a given BPT and for determining which BPT should be built for a given application. Experiments on satellite images emphasize the relevance of the proposed frameworks in the context of image segmentation.
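The progressive-merging construction described in the abstract can be illustrated with a minimal sketch: repeatedly merge the most similar pair of adjacent regions until one region remains. The region model (mean feature vectors averaged at each merge), the two example metrics, and the consensus rule (taking the worst-case metric value) are illustrative assumptions, not the thesis's actual definitions.

```python
import heapq
import numpy as np

def build_bpt(region_means, adjacency, metrics):
    """Build a Binary Partition Tree by progressively merging the most
    similar pair of adjacent regions. 'metrics' is a list of functions;
    their consensus (here: the maximum, a worst-case vote) scores a merge.
    Region model and consensus rule are illustrative assumptions."""
    parent = {}                      # child region id -> merged node id
    means = dict(region_means)       # region id -> feature vector
    adj = {r: set(n) for r, n in adjacency.items()}
    next_id = max(means) + 1
    heap = [(max(m(means[a], means[b]) for m in metrics), a, b)
            for a in adj for b in adj[a] if a < b]
    heapq.heapify(heap)
    while len(means) > 1:
        cost, a, b = heapq.heappop(heap)
        if a not in means or b not in means:
            continue                 # stale entry: a region was already merged
        node = next_id; next_id += 1
        means[node] = (means[a] + means[b]) / 2.0   # naive region model
        parent[a] = parent[b] = node
        adj[node] = (adj[a] | adj[b]) - {a, b}
        for n in adj[node]:
            adj[n] -= {a, b}; adj[n].add(node)
            c = max(m(means[node], means[n]) for m in metrics)
            heapq.heappush(heap, (c, min(node, n), max(node, n)))
        del means[a], means[b], adj[a], adj[b]
    return parent                    # encodes the full merge hierarchy

# toy usage: 4 regions on a path graph, two competing metrics
means = {0: np.array([1.0]), 1: np.array([1.2]), 2: np.array([5.0]), 3: np.array([5.1])}
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
metrics = [lambda u, v: float(abs(u - v).sum()),
           lambda u, v: float(((u - v) ** 2).sum())]
print(build_bpt(means, adjacency, metrics))
```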
92

Conception des Systèmes d'Information : une approche centrée sur les Patrons de Gestion de la Qualité / A Quality Pattern Based Approach for the Analysis and Design of Information Systems

Mehmood, Kashif 03 September 2010 (has links)
Conceptual models (CM) serve as the blueprints of information systems, and their quality plays a decisive role in the success of the end system. It has been observed that the majority of IS change requests result from deficient functionalities in the information system. Therefore, a good analysis and design method should ensure that CMs are correct and complete, as they are the communication medium between the users and the development team. Our approach targets the problems related to conceptual modeling quality by proposing a comprehensive solution. We designed multiple artifacts for different aspects of CM quality. These artifacts include the following: i. Formulation of comprehensive quality criteria (quality attributes, metrics, etc.) by federating the existing quality frameworks and identifying quality criteria for gray areas. Most of the existing literature on CM quality evaluation consists of disparate and autonomous quality frameworks proposing non-converging solutions. We therefore synthesized the existing concepts proposed by researchers and added new ones to formulate a comprehensive quality approach for conceptual models, which also federates the existing quality frameworks. ii. Formulation of quality patterns to encapsulate past experience and good practices, since the selection of relevant quality criteria (including quality attributes and metrics) with respect to a particular requirement (or goal) remains tricky for a non-expert user. These quality patterns encapsulate valuable knowledge in the form of established solutions to recurring quality problems in CMs. iii. Design of a guided quality-driven process encompassing methods and techniques to evaluate and improve conceptual models with respect to a specific user requirement or goal. Our process guides the user in formulating the desired quality goal, helps him or her identify the relevant quality patterns or quality attributes with respect to that goal, and finally helps evaluate the quality of the model and propose relevant recommendations for improvement. iv. Development of a software prototype, "CM-Quality". Our prototype implements all the above-mentioned artifacts and proposes a workflow enabling its users to evaluate and improve CMs efficiently and effectively. We conducted a survey to validate the selection of quality attributes resulting from the federating activity, and a detailed three-step experiment to evaluate the efficacy and efficiency of our overall approach and the proposed artifacts.
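As a rough illustration of what a quality pattern might record, here is a minimal sketch of such a structure. The field names and the example content are hypothetical, invented for illustration; the thesis defines its own pattern template.

```python
from dataclasses import dataclass, field

@dataclass
class QualityPattern:
    """A hypothetical quality-pattern record: the fields below are
    illustrative assumptions, not the template defined in the thesis."""
    name: str
    quality_goal: str                      # the need the pattern addresses
    attributes: list[str]                  # quality attributes it targets
    metrics: dict[str, str]                # metric name -> how to compute it
    improvement_actions: list[str] = field(default_factory=list)

completeness = QualityPattern(
    name="Ensure entity completeness",
    quality_goal="All required business concepts appear in the model",
    attributes=["completeness", "correctness"],
    metrics={"coverage": "ratio of requirement concepts mapped to model entities"},
    improvement_actions=["list unmapped requirement concepts",
                         "add missing entities or justify their absence"],
)
print(completeness.name, "->", completeness.attributes)
```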
93

Structure Oriented Evaluation Model for E-Learning

Tudevdagva, Uranchimeg 21 July 2014 (has links)
Volume 14 of the publication series EINGEBETTETE, SELBSTORGANISIERENDE SYSTEME is devoted to the structure-oriented evaluation of e-learning. For the future knowledge society, adapted methods of knowledge transfer are required alongside the creation of intelligent technologies. In this context, e-learning becomes a key technology for the development of any education system. E-learning is a complex process involving many different groups with specific tasks and roles. The dynamics of an e-learning process requires adjusted quality management, and for that, corresponding evaluation methods are needed. In the present work, Dr. Tudevdagva develops a new evaluation approach for e-learning. The advantage of her method is that, in contrast to linear evaluation methods, no weight factors are needed, and the logical goal structure of an e-learning process can be included in the evaluation. Based on general measure theory, structure-oriented score calculation rules are derived. The resulting score function satisfies the same calculation rules as those known for normalised measures. In statistical generalisation, these rules allow the structure-oriented calculation of empirical evaluation scores based on checklist data. These scores describe the quality with which an e-learning offering has reached its overall goal. Moreover, a consistent evaluation of embedded partial processes of an e-learning offering becomes possible. The presented score calculation rules are part of an eight-step evaluation model, which is illustrated by pilot samples. Owing to its embedding in general measure theory, U. Tudevdagva's structure-oriented evaluation model (SURE model) is quite universally applicable; in a similar manner, an evaluation of the efficiency of administration or organisation processes becomes possible.
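The measure-like calculation rules lend themselves to a small sketch. The combination rules below, multiplying scores of sub-goals that must all be met and combining alternative sub-goals like complementary probabilities, are an assumption about how such structure-oriented scores could compose; the thesis derives its own rules from general measure theory.

```python
from math import prod

def serial_score(scores):
    """All sub-goals must be achieved: scores in [0, 1] multiply,
    as for independent normalised measures (an illustrative assumption)."""
    return prod(scores)

def parallel_score(scores):
    """Alternative sub-goals: one success suffices, so combine like
    complementary probabilities (an illustrative assumption)."""
    return 1.0 - prod(1.0 - s for s in scores)

# A toy goal structure: the course succeeds if (A and B) and (C or D).
# Empirical scores would come from checklist data in the SURE model.
a, b, c, d = 0.9, 0.8, 0.5, 0.7
total = serial_score([serial_score([a, b]), parallel_score([c, d])])
print(f"total goal achievement score: {total:.3f}")   # 0.612
```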
94

Evaluating PQPM for Usage in Combination with Continuous LOD in VR

Nyström, Oskar January 2022 (has links)
The use of Virtual Reality (VR) is growing commercially. One type of VR headset, the mobile stand-alone system, has limited computing and memory resources, so performance must be considered carefully when developing real-time applications for it. One popular optimization technique is Level-of-Detail (LOD), which represents a model at various resolutions. One variant, continuous LOD, can represent a model along a continuous spectrum of detail, but it is not frequently used because it is less intuitive and more difficult to implement than other variants. This project investigated a type of continuous LOD used with a new metric called Pixel Quality Per Meter (PQPM). PQPM relies on a minimum edge length calculated from a model's screen coverage in relation to its size. The goal was to answer whether PQPM can be used together with continuous LOD for intuitive, simple, and efficient updating and rendering in VR. The continuous LOD uses a single low-poly mesh, which is tessellated to the desired quality with the help of a Vertex Displacement Map (VDM). The approach is then evaluated using Nvidia FLIP, an image-comparison application that emulates the human visual system. The result was an intuitive and easily implementable LOD used together with PQPM to decide the optimal quality given a model's size and coverage on the screen. The use of PQPM did not yield optimal quality at all distances, because smaller segments could disappear completely at far distances. The continuous LOD combined with PQPM also did not scale well, but it worked well at lower qualities. The study lays the groundwork for how PQPM could work together with continuous LOD and provides a more intuitive and easily implementable continuous LOD than previous approaches; however, because of the scalability issues, further work is needed to optimize the approach.
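A rough sketch of how a PQPM-style edge-length budget could drive tessellation follows. The formula, deriving a minimum edge length from how many screen pixels cover each meter of the model, is an assumption reconstructed from the abstract, not the thesis's actual definition, and all names are hypothetical.

```python
import math

def min_edge_length(model_size_m, screen_coverage_px, pixel_quality_per_meter):
    """Hypothetical PQPM-style budget: the pixels covering each meter of the
    model determine how fine the tessellation needs to be. The formula is an
    assumption reconstructed from the abstract."""
    pixels_per_meter = screen_coverage_px / model_size_m
    # Demand 'pixel_quality_per_meter' pixels per edge: edges shorter than
    # this contribute detail the screen cannot resolve.
    return pixel_quality_per_meter / max(pixels_per_meter, 1e-9)

def tessellation_level(base_edge_m, model_size_m, coverage_px, pqpm):
    """Subdivision levels needed to shrink the base mesh edges down to
    the PQPM edge-length budget (each level halves the edge length)."""
    target = min_edge_length(model_size_m, coverage_px, pqpm)
    if target >= base_edge_m:
        return 0
    return math.ceil(math.log2(base_edge_m / target))

# a 2 m model covering 400 px needs finer edges than one covering 40 px
print(tessellation_level(base_edge_m=0.5, model_size_m=2.0, coverage_px=400, pqpm=4))
print(tessellation_level(base_edge_m=0.5, model_size_m=2.0, coverage_px=40, pqpm=4))
```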
95

Cycle Route Analysis : Mediating and Facilitating Participatory Cycle Planning / Cykelvägsanalys : Förmedling och underlättande av cykelplanering

Lereculey-Peran, Alix January 2022 (has links)
Cycling is recognised as a mode of transport with many health and environmental benefits, yet it remains relatively underfunded and lacks priority in many Swedish municipalities. Despite the government's will to enhance sustainable transport, cycling is not given its rightful place in urban areas, which implies the presence of bottlenecks and barriers in cycle planning. Cycle advocacy organisations try to change this paradigm and develop tools, methods, and processes to improve the planning process. The cycling advocacy organisation Cykelfrämjandet recently released a new process called Cyklisternas Cykelvägsanalys, the cyclists' cycle route analysis. It can be used by any everyday cyclist, who invites decision makers to experience the cycling environment and evaluate it together using a quality assessment method. A report is handed over to the municipality afterwards, and a follow-up is done one year later. To fully grasp what this new process entails, a thorough document analysis was conducted. Through an international review exploring similar initiatives developed by NGOs, individuals, or governmental entities, the degree of innovation of the process was assessed. This was built upon using semi-structured interviews with participants in the three trial Cykelvägsanalys runs in Marks Kommun, Piteå, and Varberg. The interview results were also used to evaluate how Cyklisternas Cykelvägsanalys can address bottlenecks and use action levers to improve the cycling environment. Results show that while other methods to assess the quality of cycling environments exist, none are tied into a process, or at least none of an official kind. The emphasis on the process, and the communication it creates between politicians, planners, and everyday cyclists, can help lift bottlenecks in cycle planning linked to a lack of political support and a weak cycle lobby. In the three municipalities, CVA seems to have more impact on municipalities that are less advanced on the cycling question. The results are very promising, and the participants on the municipal side are willing to act on recommendations issued from the workshop. A big drawback, however, is that very few politicians attended the workshops, although their participation would have the most impact. A stronger emphasis on the necessity of getting these actors to participate in the methodology would be beneficial. Further research can be done in a few years to assess the impact of the follow-up and whether everyday cyclists have taken the initiative to try out the method in their municipalities.
96

Évaluation de la qualité des documents anciens numérisés / Quality evaluation of digitized historical documents

Rabeux, Vincent 06 March 2013 (has links)
This PhD thesis deals with the quality evaluation of digitized document images. In order to measure the quality of a document image, we propose new features dedicated to characterizing the most common degradations. We also propose to use these features to create prediction models able to predict the performance of different types of document analysis algorithms. The features are defined by analyzing the impact of a specific degradation on the results of an algorithm, and statistical regressors are then trained on them to build the prediction models. The relevance of the proposed features and prediction models is analyzed in several experiments. The first aims to predict the performance of eleven binarization methods. The second aims to create an automatic procedure able to select the best binarization method for each image. Finally, the third experiment aims to create a prediction model for two commonly used OCRs as a function of the severity of bleed-through (ink from the recto diffusing onto the verso of a document). This work on performance prediction is also an opportunity to discuss the scientific problems of creating ground truth for performance evaluation.
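The predict-performance-from-degradation-features idea can be sketched briefly: compute a few degradation descriptors per image, then fit a statistical regressor mapping them to measured algorithm performance. The two features below (a contrast proxy and a bleed-through proxy) and the regressor choice are illustrative assumptions; the thesis defines its own descriptors and models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def degradation_features(gray):
    """Two toy descriptors of a grayscale document image (values in [0, 255]).
    Both are illustrative stand-ins for the thesis's actual descriptors."""
    contrast = gray.std() / 255.0                    # global contrast proxy
    mid_tone = np.mean((gray > 64) & (gray < 192))   # bleed-through proxy:
    return [contrast, mid_tone]                      # mass of ambiguous pixels

# Assume we measured, offline, the OCR accuracy of an algorithm on a set
# of training images; we then fit a statistical regressor on the features.
rng = np.random.default_rng(0)
train_images = [rng.integers(0, 256, size=(64, 64)) for _ in range(30)]
X = np.array([degradation_features(img) for img in train_images])
ocr_accuracy = 0.9 * X[:, 0] - 0.5 * X[:, 1] + 0.5   # synthetic ground truth
model = LinearRegression().fit(X, ocr_accuracy)

new_image = rng.integers(0, 256, size=(64, 64))
print("predicted OCR accuracy:", model.predict([degradation_features(new_image)])[0])
```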
97

[en] SPEECH CODING AT AVERAGE RATES BELOW 2KB/S / [es] CODIFICACIÓN DE VOZ A TASAS MEDIAS ABAJO DE 2 KB/S / [pt] CODIFICAÇÃO DE VOZ A TAXAS MÉDIAS ABAIXO DE 2 KB/S

RODRIGO CAIADO DE LAMARE 21 August 2001 (has links)
[en] This dissertation proposes algorithms for speech coding at an average bit rate of 1.2 kb/s. A novel switched-predictive vector quantiser that outperforms previously reported schemes is proposed and assessed under noise-free and noisy channels. Efficient detectors for the pitch period and for fricative and stop sounds are examined and adapted to the proposed coder. Low-bit-rate excitation methods are investigated in order to reproduce rather high-quality speech. A mixed multiband excitation approach with three sub-bands is employed to encode voiced frames. For unvoiced frames, fricative and stop modelling and synthesis techniques are used; this approach has been shown to provide high-quality synthesised speech while reducing the bit rate to only 0.4 kb/s for unvoiced frames. To reduce coding noise and improve the decoded speech, post-filtering techniques are analysed and compared on the same platform. To reduce background noise, noise suppression methods are also examined. Finally, the proposed coder is evaluated against the North American Mixed Excitation Linear Prediction (MELP) coder through A/B comparison tests. Assessment results show that the proposed system, operating at 1.2 kb/s, slightly outperformed the MELP coder operating at 2.4 kb/s. For tandem connection situations, the proposed algorithm also performed better than the MELP coder.
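Mixed multiband excitation can be illustrated with a tiny sketch: per sub-band, a voicing weight blends a periodic pulse train with noise. The three fixed band edges, the brick-wall band splitting, and the blending rule are assumptions for illustration; the actual coder's band structure and parameters differ.

```python
import numpy as np

def mixed_excitation(n, fs, pitch_hz, voicing, band_edges):
    """Blend a pulse train (voiced) and white noise (unvoiced) per sub-band.
    'voicing' holds one weight in [0, 1] per band. The band splitting via
    FFT masks and the fixed edges are illustrative assumptions."""
    t = np.arange(n)
    pulses = (t % round(fs / pitch_hz) == 0).astype(float)  # periodic pulse train
    noise = np.random.default_rng(0).standard_normal(n) * 0.1
    freqs = np.fft.rfftfreq(n, 1 / fs)
    P, N = np.fft.rfft(pulses), np.fft.rfft(noise)
    out = np.zeros(n)
    for (lo, hi), v in zip(band_edges, voicing):
        mask = (freqs >= lo) & (freqs < hi)           # crude brick-wall band
        out += np.fft.irfft(v * P * mask + (1 - v) * N * mask, n)
    return out

# three sub-bands: strongly voiced low band, mixed middle, noisy top
excitation = mixed_excitation(n=320, fs=8000, pitch_hz=100,
                              voicing=[0.9, 0.5, 0.1],
                              band_edges=[(0, 1000), (1000, 2500), (2500, 4000)])
print(excitation.shape)  # (320,) one 40 ms frame at 8 kHz
```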
98

Pokročilé metody snímání a hodnocení kvality 3D videa / Advanced Methods for 3D Video Capturing and Evaluation

Kaller, Ondřej January 2018 (has links)
This doctoral thesis deals with methods for capturing 3D images and videos and for evaluating their quality. After a short summary of the physiology of spatial perception, the thesis reviews the state of the art on the problem of adaptive parallax and on camera configurations for capturing a classical stereo pair. It also summarizes current options for depth-map estimation; both active and passive methods are mentioned, and profilometric scanning is explained in more detail. Selected technical parameters of two current 3D display technologies, polarization-separating and time-multiplexed displays, were measured, for example the crosstalk between the left and right images. The core of the thesis is a new method, designed and tested by the author, for creating a depth map when capturing a 3D scene. The novelty of this approach lies in a clever combination of current active and passive scene-depth acquisition methods that exploits the advantages of both. Finally, the results of subjective 3D video quality tests are presented; the main contribution here is the proposed metric modelling the results of these subjective tests.
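One simple way to combine active and passive depth measurements, in the spirit the abstract hints at, is a per-pixel confidence-weighted fusion. The weighting scheme below is an assumption for illustration, not the method actually proposed in the thesis.

```python
import numpy as np

def fuse_depth(active_depth, active_conf, passive_depth, passive_conf):
    """Per-pixel confidence-weighted blend of an active sensor's depth map
    (dense but noisy at edges) and a passive stereo depth map (sparse but
    sharp). The weighting rule is an illustrative assumption."""
    w = active_conf / np.maximum(active_conf + passive_conf, 1e-9)
    fused = w * active_depth + (1.0 - w) * passive_depth
    # where neither source is confident, mark the pixel as invalid (NaN)
    fused[(active_conf + passive_conf) < 0.2] = np.nan
    return fused

rng = np.random.default_rng(1)
truth = np.full((4, 4), 2.0)                        # a flat wall 2 m away
active = truth + rng.normal(0, 0.02, truth.shape)   # dense, mildly noisy
passive = truth + rng.normal(0, 0.10, truth.shape)  # noisier stereo estimate
a_conf = np.full(truth.shape, 0.8)
p_conf = np.full(truth.shape, 0.3)
print(fuse_depth(active, a_conf, passive, p_conf).round(3))
```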
99

Measuring the Utility of Synthetic Data : An Empirical Evaluation of Population Fidelity Measures as Indicators of Synthetic Data Utility in Classification Tasks / Mätning av Användbarheten hos Syntetiska Data : En Empirisk Utvärdering av Population Fidelity mätvärden som Indikatorer på Syntetiska Datas Användbarhet i Klassifikationsuppgifter

Florean, Alexander January 2024 (has links)
In the era of data-driven decision-making and innovation, synthetic data serves as a promising tool that bridges the need for vast datasets in machine learning (ML) with the imperative of data privacy. By simulating real-world data while preserving privacy, synthetic data generators have become increasingly prevalent instruments in AI and ML development. A key challenge with synthetic data lies in accurately estimating its utility. For this purpose, Population Fidelity (PF) measures, a category of metrics that evaluates how well the synthetic data mimics the general distribution of the original data, have shown to be good candidates. In this setting, we aim to answer: "How well are different population fidelity measures able to indicate the utility of synthetic data for machine learning based classification models?" We designed a reusable six-step experiment framework to examine the correlation between nine PF measures and the performance of four ML classification models across five datasets. The six-step approach includes data preparation, training, testing on original and synthetic datasets, and computation of the PF measures. The study reveals non-linear relationships between the PF measures and synthetic data utility. The general analysis, that is, the monotonic relationship between a PF measure and performance over all models, yielded at most moderate correlations, with the Cluster measure showing the strongest correlation. In the more granular model-specific analysis, Random Forest showed strong correlations with three PF measures. The findings show that no PF measure has a consistently high enough correlation across all models to be considered a universal estimator of model performance. This highlights the importance of context-aware application of PF measures and sets the stage for future research to expand the scope, including support for a wider range of data types and the integration of privacy evaluations into synthetic data assessment. Ultimately, this study contributes to the effective and reliable use of synthetic data, particularly in sensitive fields where data quality is vital.
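The core measurement, how strongly a PF measure tracks downstream classifier performance, can be sketched in a few lines. The KS-statistic-based fidelity measure, the toy data generator, and the use of Spearman correlation for the monotonic relationship are illustrative assumptions; the thesis evaluates nine specific PF measures on real datasets.

```python
import numpy as np
from scipy.stats import ks_2samp, spearmanr
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_data(n, noise):
    """Toy 'synthetic' generator: more noise = lower fidelity and utility."""
    X = rng.normal(size=(n, 3)) + rng.normal(scale=noise, size=(n, 3))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, y

X_real, y_real = make_data(500, noise=0.0)
fidelities, utilities = [], []
for noise in [0.1, 0.5, 1.0, 2.0, 4.0]:
    X_syn, y_syn = make_data(500, noise)
    # a stand-in PF measure: mean per-feature KS similarity to the real data
    pf = np.mean([1 - ks_2samp(X_real[:, j], X_syn[:, j]).statistic
                  for j in range(X_real.shape[1])])
    clf = RandomForestClassifier(random_state=0).fit(X_syn, y_syn)  # train on synthetic
    acc = accuracy_score(y_real, clf.predict(X_real))               # test on real
    fidelities.append(pf); utilities.append(acc)

rho, _ = spearmanr(fidelities, utilities)   # monotonic association
print(f"Spearman correlation between PF measure and utility: {rho:.2f}")
```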
