Global ETD Search

211	Estimating Poolability of Transport Demand Using Shipment Encoding : Designing and building a tool that estimates different poolability types of shipment groups using dimensionality reduction. / Uppskattning av Poolbarhet av Transportefterfrågan med Försändelsekodning : Designa och bygga ett verktyg som uppskattar olika typer av poolbarhetstyper av försändelsegrupper med hjälp av dimensionsreduktion och mätvärden för att mäta poolbarhetsegenskaper. Kërçini, Marvin January 2023 Dedicating fewer transport resources by grouping goods to be shipped together, which we call pooling, plays a crucial role in saving costs in transport networks. Nonetheless, it is not easy to estimate pooling among different groups of shipments or to understand why these groups are poolable. The typical solution would be to consider all shipments of both groups as one and use Vehicle Routing Problem (VRP) software to estimate the costs of the new combined group. However, this has drawbacks, such as high computational costs and no pooling explainability. In this work we build a tool that estimates the different types of pooling using demand data. The solution maps shipment data to a lower dimension, where each poolability trait corresponds to a latent dimension. We tested different dimensionality reduction techniques and found that the best performing are autoencoder models based on neural networks. Nevertheless, comparing shipments in the latent space turns out to be more challenging than expected, because distances in these latent dimensions are sometimes uncorrelated with the distances in the real shipment features. Although this limits the use cases of this approach, we still manage to build the full poolability tool, which incorporates the autoencoders and uses metrics we designed to measure each poolability trait. This tool is then compared to VRP software and proves to have similar accuracy while being much faster and more explainable.
Poolability Transport networks Autoencoder Dimensionality reduction Vehicle Routing Problem Raggruppabilità Reti di trasporto Autoencoder Riduzione della dimensionalità Vehicle Routing Problem Poolbarhet Transportnätverk Autokodare Dimensionsreduktion Fordonsdirigeringsproblem Computer Sciences Datavetenskap (datalogi) Computer Engineering Datorteknik Computer and Information Sciences Data- och informationsvetenskap
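The caveat in record 211, that distances in the latent dimensions are sometimes uncorrelated with distances between the real shipment features, can be checked numerically for any encoder. The following is a minimal Python sketch and not the thesis's tool: the function names, the toy points, and the choice of Pearson correlation over pairwise Euclidean distances are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def latent_distance_agreement(features, latent):
    """Correlation between original-space and latent-space pairwise distances.

    Hypothetical helper: `features` and `latent` hold one vector per shipment,
    in the same order. Values near 1 mean latent distances track real ones.
    """
    return pearson(pairwise_distances(features), pairwise_distances(latent))


# Toy data (assumed, not from the thesis): four shipments on a line.
features = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
faithful = [(0.0,), (2.0,), (4.0,), (6.0,)]   # a rescaled copy of the layout
scrambled = [(0.0,), (3.0,), (1.0,), (2.0,)]  # neighbours are no longer close

print(latent_distance_agreement(features, faithful))
print(latent_distance_agreement(features, scrambled))
```

A low or negative value for a trained encoder would signal the problem the abstract describes: shipments that look close in the latent space need not be close in the original features.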
212	Multi-defect detection in hardwood using AI on hyperspectral images Ytterberg, Kalle January 2024 With the evolution of GPU performance, interest in using AI for all kinds of purposes has risen. Companies today invest substantial resources in finding new ways of using AI to increase the value of their products or to automate processes. One area of the wood industry where AI is widely used and studied is defect detection. In this thesis, the combination of AI and hyperspectral images is studied and evaluated for segmenting defects in hardwood with a U-Net network structure. The performance is compared to another known method commonly used for high-dimensional data: PLS-DA. The thesis also compares the use of RGB image data in combination with AI, to further analyze the usefulness of the hyperspectral data. The results showed signs of improvement when using hyperspectral images compared to RGB images for detecting blue stain and red heartwood defects. Detection of rot and knots, however, showed no sign of improvement. Because the annotations were more accurate in the RGB data, the results from the networks fed with hyperspectral data suggest that blue stain and red heartwood merit further investigation. Computational performance is shown to vary across the different reduction methods, and the results of this thesis provide some insight that might aid in choosing an appropriate reduction method.
Computer Vision Hyperspectral Imaging AI Segmentation Dimensionality Reduction Binning PCA PLS LDA FBAE U-NET Red heartwood Blue stain Rot Knots Beech Defects Computer and Information Sciences Data- och informationsvetenskap Wood Science Trävetenskap
213	Accelerated Discovery of Multi-Principal Element Alloys and Wide Bandgap Semiconductors under Extreme Conditions Saswat Mishra (19185079) 22 July 2024 Advancements in material science are accelerating technological evolution, driven by initiatives like the Materials Genome Project, which integrates computational and experimental strategies to expedite material discovery. In this work, we focus on the reliability of advanced materials under extreme conditions, a critical area for enhancing their technological applications. Multi-principal element alloys (MPEAs) exhibit remarkable properties under extreme conditions. However, their vast compositional space makes a brute-force exploration of potential alloys prohibitive. We address this challenge by employing a Bayesian approach to explore the oxidation resistance of hundreds of alloys, applying computational techniques to accurately calculate and quantify errors in the melting temperatures of MPEAs, and investigating the compositional biases and short-range order in their nucleation behaviors. Furthermore, we scrutinize the role of wide bandgap semiconductors, which are essential in high-power applications due to their superior breakdown voltage, drift velocity, and sheet charge density. The lack of lattice-matched substrates often results in strained films, which enhances piezoelectric effects crucial for device reliability. Our research advances the prediction of piezoelectric and dielectric responses as influenced by biaxial strain and doping in gallium nitride (GaN). Additionally, we examine how various common defects affect the formation of trap states, significantly impacting the electronic properties of these materials. These studies offer significant advancements in understanding MPEAs and wide bandgap semiconductors under extreme conditions. We also provide foundational insights for developing robust and efficient materials essential for next-generation applications.
Compound semiconductors Metals and alloy materials High entropy alloys (HEAs) Multi principal element alloys (MPEAs) Computational Materials Repository Computational Materials Science Gaussian Process Bayesian Statistics Dimensionality Reduction Nucleation Melting temperature Oxidation Piezoelectric Point defects
214	Evaluating perceptual maps of asymmetries for gait symmetry quantification and pathology detection Moevus, Antoine 12 1900 Gait is an essential process of human activity and the result of coordinated effort between the neurological, articular and musculoskeletal systems. This is why gait analysis is important and increasingly used for the diagnosis (and possibly early detection) of many different types of diseases (neurological, muscular, orthopedic, etc.). This paper introduces a novel method to quickly visualize the different parts of the body related to an asymmetric movement in a patient's gait, intended for daily clinical use. The goal is to provide a cheap and easy-to-use method to measure gait asymmetry and display the results in a perceptually relevant manner. The method relies on an affordable consumer depth sensor, the Kinect. The Kinect was chosen because it is well suited for use in a small, confined area, like a living room. Also, since it is marker-less, it provides a fast, non-invasive diagnosis. The algorithm we introduce relies on the fact that a healthy walk has (temporally shift-invariant) symmetry properties in the coronal plane.
Analyse de la symétrie de la marche Kinect trouble locomoteur positionnement multidimensionnel (MDS) carte de couleur perceptuelle invariance par décalage temporel Gait asymmetry analysis Kinect loco-motor disorders multidimensional scaling (MDS) nonlinear dimensionality reduction perceptual color map temporal shift-invariance
215	Machine learning via dynamical processes on complex networks / Aprendizado de máquina via processos dinâmicos em redes complexas Cupertino, Thiago Henrique 20 December 2013 Extracting useful knowledge from data sets is a key concept in modern information systems. Consequently, the need for efficient techniques to extract the desired knowledge has been growing over time. Machine learning is a research field dedicated to the development of techniques capable of enabling a machine to "learn" from data. Many techniques have been proposed so far, but there are still open issues, especially in interdisciplinary research. In this thesis, we explore the advantages of network data representation to develop machine learning techniques based on dynamical processes on networks. The network representation unifies the structure, dynamics and functions of the system it represents, and thus is capable of capturing the spatial, topological and functional relations of the data sets under analysis. We develop network-based techniques for the three machine learning paradigms: supervised, semi-supervised and unsupervised. The random walk dynamical process is used to characterize the access of unlabeled data to data classes, yielding a new heuristic for the supervised paradigm that we call ease of access. We also propose a classification technique which combines the high-level view of the data, via network topological characterization, and the low-level relations, via similarity measures, in a general framework. Still in the supervised setting, the modularity and Katz centrality network measures are applied to classify multiple observation sets, and an evolving network construction method is applied to the dimensionality reduction problem. The semi-supervised paradigm is covered by extending the ease of access heuristic to cases in which just a few labeled data samples and many unlabeled samples are available. A semi-supervised technique based on interacting forces is also proposed, for which we provide parameter heuristics and a stability analysis via a Lyapunov function. Finally, an unsupervised network-based technique uses the concepts of pinning control and consensus time from dynamical processes to derive a similarity measure used to cluster data. The data is represented by a connected and sparse network in which nodes are dynamical elements. Simulations on benchmark data sets and comparisons to well-known machine learning techniques are provided for all proposed techniques. The advantages of network data representation and dynamical processes for machine learning are highlighted in all cases.
Aprendizado baseado em redes Aprendizado de máquina Aprendizado não supervisionado Aprendizado semissupervisionado Aprendizado supervisionado Caminhada aleatória Complex networks Consensus time Controle pontual Dimensionality reduction Dynamical processes Estado estacionário Forças de interação Interacting forces Limiting probabilities Machine learning Network-based learning Pinning control Probabilidades limite Processos dinâmicos Random walk Redes complexas Redução de dimensionalidade Semi-supervised learning Stationary states Supervised learning Tempo de consenso Unsupervised learning
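Record 215 lists random walks, limiting probabilities and stationary states among its keywords. As a generic illustration of those terms only (it is not the ease-of-access heuristic, which the abstract does not define), the Python sketch below iterates a simple random walk on a small undirected network until the node probabilities settle. The example graph, the function name and the fixed number of steps are assumptions.

```python
def limiting_probabilities(adjacency, steps=200):
    """Iterate a simple random walk and return the node probabilities.

    `adjacency` is a symmetric 0/1 matrix (list of lists). At each step the
    walker moves from node i to a uniformly chosen neighbour of i.
    """
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    probs = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(steps):
        nxt = [0.0] * n
        for i in range(n):
            for j in range(n):
                if adjacency[i][j]:
                    nxt[j] += probs[i] / degrees[i]
        probs = nxt
    return probs


# Assumed toy network: a triangle (nodes 0, 1, 2) with node 3 attached to node 2.
graph = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
limits = limiting_probabilities(graph)
degree_share = [sum(row) / 8 for row in graph]  # degree / (2 * number of edges)
print(limits)
print(degree_share)
```

For a connected, non-bipartite undirected network such as this one, the walk converges to probabilities proportional to node degree, which is why the two printed lists agree.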
216	Categorical structural optimization : methods and applications / Optimisation structurelle catégorique : méthodes et applications Gao, Huanhuan 07 February 2019 The thesis presents methodological research on categorical structural optimization by means of manifold learning. The main difficulty in handling categorical optimization problems lies in the description of the categorical variables: they are presented in a category and do not have any order. Thus the treatment of the design space is a key issue. In this thesis, the non-ordinal categorical variables are treated as multi-dimensional discrete variables, so the dimensionality of the corresponding design space becomes high. In order to reduce the dimensionality, manifold learning techniques are introduced to find the intrinsic dimensionality and map the original design space to a reduced-order space. The mechanisms of both linear and non-linear manifold learning techniques are first studied. Then numerical examples are tested to compare the performance of these manifold learning techniques. It is found that PCA and MDS can only deal with linear or globally approximately linear cases. Isomap preserves the geodesic distances for non-linear manifolds; however, it is the most time-consuming. LLE preserves the neighbour weights and can yield good results in a short time. KPCA works like a non-linear classifier, and we prove why it cannot preserve distances or angles in some cases. Based on the reduced-order representation obtained by Isomap, graph-based evolutionary crossover and mutation operators are proposed to deal with categorical structural optimization problems, including the design of dome, six-story rigid frame and dame-like structures. The results show that the proposed graph-based evolutionary approach constructed on the reduced-order space performs more efficiently than traditional methods, including the simplex approach and the evolutionary approach without a reduced-order space. In chapter 5, LLE is applied to reduce the data dimensionality, and a polynomial interpolation helps to construct the responding surface from the lower-dimensional representation to the original data. Then the continuous search method of moving asymptotes is executed and yields a competitively good but inadmissible solution within only a few iterations. Then, in the second stage, a discrete search strategy is proposed to find better solutions based on a neighbour search. The ten-bar truss and dome structural design problems are tested to show the validity of the method. Finally, this method is compared to the Simulated Annealing algorithm and the Covariance Matrix Adaptation Evolutionary Strategy, showing its better optimization efficiency. In chapter 6, in order to deal with the case in which the categorical design instances are distributed on several manifolds, we propose a k-manifolds learning method based on Weighted Principal Component Analysis, and the obtained manifolds are integrated in the lower-dimensional design space. Then the method introduced in chapter 4 is applied to solve the ten-bar truss, the dome and the dame-like structural design problems.
Optimisation structurelle Apprentissage multiple Réduction de la dimensionnalité Structure en treillis Categorical optimization Structural optimization Manifold learning Dimensionality reduction Polynomial fitting Locally linear embedding Isomap K-manifolds learning Evolutionary methods Kernel functions Truss structure Weighted principal component analysis
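Record 216 notes that Isomap preserves geodesic distances on a non-linear manifold. The Python sketch below illustrates only that first step: it links points that lie within a radius `eps` of each other and takes shortest-path lengths through the resulting graph as geodesic estimates. The radius-based neighbourhood, the Floyd-Warshall shortest-path routine, the function name and the toy points are assumptions for illustration; the embedding step that Isomap applies afterwards is omitted, and nothing here is taken from the thesis's implementation.

```python
import math


def geodesic_distances(points, eps):
    """Approximate geodesic distances via shortest paths in a neighbourhood graph.

    Points closer than `eps` are linked by an edge weighted with their
    Euclidean distance; all other pairs start at infinity.
    """
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if d <= eps:
                dist[i][j] = dist[j][i] = d
    # Floyd-Warshall: allow each node k in turn as an intermediate stop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][k] + dist[k][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


# Assumed toy manifold: five points along an L-shaped path.
corner = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
geo = geodesic_distances(corner, eps=1.1)

print(geo[0][4])                         # length measured along the path
print(math.dist(corner[0], corner[4]))   # straight-line distance, which cuts the corner
```

The gap between the two printed values is what a linear method such as PCA or MDS on Euclidean distances would miss, in line with the abstract's remark that those methods handle only linear or globally approximately linear cases.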
218	PCA based dimensionality reduction of MRI images for training support vector machine to aid diagnosis of bipolar disorder / PCA baserad dimensionalitetsreduktion av MRI bilder för träning av stödvektormaskin till att stödja diagnostisering av bipolär sjukdom Chen, Beichen, Chen, Amy Jinxin January 2019 This study investigates how dimensionality reduction of neuroimaging data prior to training support vector machines (SVMs) affects the classification accuracy for bipolar disorder. Principal component analysis (PCA) is used for dimensionality reduction. An open-source data set of 19 bipolar and 31 control structural magnetic resonance imaging (sMRI) samples was used, part of the UCLA Consortium for Neuropsychiatric Phenomics LA5c Study funded by the NIH Roadmap Initiative, which aims to foster breakthroughs in the development of novel treatments for neuropsychiatric disorders. The images underwent smoothing, feature extraction and PCA before they were used as input to train SVMs. 3-fold cross-validation was used to tune a number of hyperparameters for linear, radial, and polynomial kernels. Experiments were done to investigate the performance of SVM models trained using 1 to 29 principal components (PCs). Several PC sets reached 100% accuracy in the final evaluation, with the minimal set being the first two principal components. The cumulative variance explained by the PCs used did not correlate with the performance of the model. The choice of kernel and hyperparameters is of utmost importance, as the performance obtained can vary greatly. The results support previous findings that SVMs can be useful in aiding the diagnosis of bipolar disorder, and that PCA as a dimensionality reduction method in combination with SVMs may be appropriate for the classification of neuroimaging data for illnesses not limited to bipolar disorder. Because of the small sample size, the results call for future research using larger collaborative data sets to validate the accuracies obtained.
Bipolar disorder diagnosis computer-aided medical diagnosis SVM Support vector machine PCA Principal component analysis dimensionality reduction feature reduction neuroimaging MRI sMRI machine learning classification psychiatric disorders mental illness Bipolär sjukdom diagnostisering datorstödd medicinsk diagnostisering SVM stödvektormaskin PCA principalkomponentanalys MRI magnetisk resonanstomografi MRT dimensionalitetsreduktion maskininlärning dimensionsreduktion klassificering psykiska sjukdomar Computer and Information Sciences Data- och informationsvetenskap
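The workflow summarized in entry 218 (reduce with PCA, train a classifier on the first k components, compare values of k by 3-fold cross-validation) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the data are synthetic, PCA is computed by power iteration in pure Python, a nearest-centroid classifier stands in for the SVM, and every function name is hypothetical.

```python
import random

def fit_pca(X, k, iters=300):
    """Return (mean, components, explained_variance_ratio) for the top k PCs."""
    n, d = len(X), len(X[0])
    mean = [sum(r[j] for r in X) / n for j in range(d)]
    Xc = [[r[j] - mean[j] for j in range(d)] for r in X]
    # Sample covariance matrix (d x d).
    C = [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(d)] for i in range(d)]
    total_var = sum(C[i][i] for i in range(d))
    rng = random.Random(0)
    comps, ratios = [], []
    for _ in range(k):
        # Power iteration converges to the dominant eigenvector of C.
        v = [rng.random() + 0.1 for _ in range(d)]
        for _ in range(iters):
            w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * C[i][j] * v[j] for i in range(d) for j in range(d))
        comps.append(v)
        ratios.append(lam / total_var)
        # Deflation removes the component just found before the next pass.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return mean, comps, ratios

def project(X, mean, comps):
    """Map each row onto the principal components."""
    return [[sum((r[j] - mean[j]) * c[j] for j in range(len(mean))) for c in comps] for r in X]

def nearest_centroid_accuracy(Ztr, ytr, Zte, yte):
    """Stand-in classifier (NOT an SVM): assign each test row to the closest class mean."""
    cents = {}
    for label in set(ytr):
        rows = [z for z, y in zip(Ztr, ytr) if y == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    def predict(z):
        return min(cents, key=lambda c: sum((a - b) ** 2 for a, b in zip(z, cents[c])))
    return sum(predict(z) == y for z, y in zip(Zte, yte)) / len(yte)

def cv_accuracy(X, y, k, folds=3):
    """Mean accuracy over folds; PCA is refit on each training split only."""
    accs = []
    for f in range(folds):
        tr = [i for i in range(len(X)) if i % folds != f]
        te = [i for i in range(len(X)) if i % folds == f]
        mean, comps, _ = fit_pca([X[i] for i in tr], k)
        Ztr = project([X[i] for i in tr], mean, comps)
        Zte = project([X[i] for i in te], mean, comps)
        accs.append(nearest_centroid_accuracy(Ztr, [y[i] for i in tr], Zte, [y[i] for i in te]))
    return sum(accs) / folds

# Synthetic two-class data (assumption): classes differ only in feature 0.
rng = random.Random(42)
X, y = [], []
for i in range(60):
    label = i % 2
    X.append([rng.gauss(10.0 * label, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(3)])
    y.append(label)

_, _, ratios = fit_pca(X, 3)
cumulative = [sum(ratios[:m + 1]) for m in range(3)]
results = {k: cv_accuracy(X, y, k) for k in (1, 2, 3)}
for k in results:
    print(k, round(cumulative[k - 1], 3), round(results[k], 3))
```

Refitting PCA inside each fold keeps the held-out samples from influencing the projection, which matters for a data set as small as the 50 scans mentioned in the abstract.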
219	Multimedia Forensics Using Metadata Ziyue Xiang (17989381) 21 February 2024 The rapid development of machine learning techniques makes it possible to manipulate or synthesize video and audio information while introducing nearly undetectable artifacts. Most media forensics methods analyze the high-level data (e.g., pixels from video, temporal signals from audio) decoded from compressed media data. Since media manipulation or synthesis methods usually aim to improve the quality of such high-level data directly, acquiring forensic evidence from these data has become increasingly challenging. In this work, we focus on media forensics techniques using the metadata in media formats, which includes container metadata and coding parameters in the encoded bitstream. Since many media manipulation and synthesis methods do not attempt to hide metadata traces, it is possible to use them for forensics tasks. First, we present a video forensics technique using metadata embedded in MP4/MOV video containers. Our proposed method achieved high performance in video manipulation detection, source device attribution, social media attribution, and manipulation tool identification on publicly available datasets. Second, we present a transformer neural network based MP3 audio forensics technique using low-level codec information. Our proposed method can localize multiple compressed segments in MP3 files. The localization accuracy of our proposed method is higher than that of other methods. Third, we present an H.264-based video device matching method. This method can determine whether two video sequences were captured by the same device even if the method has never encountered the device. Our proposed method achieved good performance in a three-fold cross-validation scheme on a publicly available video forensics dataset containing 35 devices. Fourth, we present a Graph Neural Network (GNN) based approach for the analysis of MP4/MOV metadata trees. 
The proposed method is trained using Self-Supervised Learning (SSL), which increases the robustness of the proposed method and makes it capable of handling missing/unseen data. Fifth, we present an efficient approach to compute the spectrogram feature with MP3 compressed audio signals. The proposed approach decreases the complexity of speech feature computation by ~77.6% and saves ~37.87% of MP3 decoding time. The resulting spectrogram features lead to higher synthetic speech detection performance. Audio processing Computer vision Image and video coding Image processing Pattern recognition Video processing Digital forensics Deep learning Deepfake detection Digital forensics Video forensics Audio forensics Video metadata Audio metadata H.264 MP3 MP4 Video manipulation detection Video compression Audio compression Decision tree Deep learning Dimensionality reduction Spectrogram Graph neural networks Neural networks Transformer neural networks
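Entry 219 refers to metadata trees in MP4/MOV containers. As a rough illustration of what such a tree looks like, the sketch below walks the box (atom) hierarchy of an MP4-style byte string using the standard box header layout: a 4-byte big-endian size followed by a 4-byte type, with size 1 signalling a 64-bit size and size 0 meaning "to the end of the enclosing container". It is not the thesis's method: the list of container types is a small hand-picked assumption, the sample bytes are a toy with dummy payloads rather than a valid video, and the helper names are hypothetical.

```python
import struct

# Box types treated as containers of child boxes (assumed, partial list).
CONTAINER_TYPES = {b"moov", b"trak", b"mdia", b"minf", b"stbl", b"udta"}

def parse_boxes(data, start=0, end=None, depth=0):
    """Yield (depth, type, size) for every box found in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:
            # A 64-bit "largesize" field follows the type.
            if pos + 16 > end:
                break
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:
            # The box runs to the end of the enclosing container.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed box: stop rather than guess
        yield depth, box_type.decode("latin-1"), size
        if box_type in CONTAINER_TYPES:
            yield from parse_boxes(data, pos + header, pos + size, depth + 1)
        pos += size

def box(box_type, payload=b""):
    """Build a toy box: 32-bit size, 4-byte type, then the payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Toy container with dummy payloads (not a playable file).
sample = box(b"ftyp", b"isom\x00\x00\x02\x00") + box(
    b"moov", box(b"mvhd", b"\x00" * 4) + box(b"trak", box(b"tkhd"))
)
tree = list(parse_boxes(sample))
for depth, name, size in tree:
    print("  " * depth + f"{name} ({size} bytes)")
```

The order, nesting, and sizes of boxes are the kind of container-level trace that can differ between recording devices and editing tools, which is why a tree view is a natural starting point for feature extraction.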
220	Unsupervised Anomaly Detection and Root Cause Analysis in HFC Networks : A Clustering Approach Forsare Källman, Povel January 2021 Following the significant transition from the traditional production industry to an information-based economy, the telecommunications industry was faced with an explosion of innovation, resulting in a continuous change in user behaviour. The industry has made efforts to adapt to a more data-driven future, which has given rise to larger and more complex systems. Therefore, troubleshooting systems such as anomaly detection and root cause analysis are essential features for maintaining service quality and facilitating daily operations. This study aims to explore the possibilities, benefits, and drawbacks of implementing cluster analysis for anomaly detection in hybrid fiber-coaxial networks. Based on the literature review on unsupervised anomaly detection and an assumption regarding the anomalous behaviour in hybrid fiber-coaxial network data, the k-means, Self-Organizing Map, and Gaussian Mixture Model were implemented both with and without Principal Component Analysis. Analysis of the results demonstrated an increase in performance for all models when the Principal Component Analysis was applied, with k-means outperforming both Self-Organizing Map and Gaussian Mixture Model. On this basis, it is recommended to apply Principal Component Analysis for clustering-based anomaly detection. Further research is necessary to identify whether cluster analysis is the most appropriate unsupervised anomaly detection approach. / Följt av övergången från den traditionella tillverkningsindustrin till en informationsbaserad ekonomi stod telekommunikationsbranschen inför en explosion av innovation. Detta skifte resulterade i en kontinuerlig förändring av användarbeteende och branschen tvingades genomgå stora ansträngningar för att lyckas anpassa sig till den mer datadrivna framtiden. 
Större och mer komplexa system utvecklades och således blev felsökningsfunktioner såsom anomalidetektering och rotfelsanalys centrala för att upprätthålla servicekvalitet samt underlätta för den dagliga driftverksamheten. Syftet med studien är att utforska de möjligheterna, för- samt nackdelar med att använda klusteranalys för anomalidetektering inom HFC-nätverk. Baserat på litteraturstudien för oövervakad anomalidetektering samt antaganden för anomalibeteenden inom HFC-data valdes algoritmerna k-means, Self-Organizing Map och Gaussian Mixture Model att implementeras, både med och utan Principal Component Analysis. Analys av resultaten påvisade en uppenbar ökning av prestanda för samtliga modeller vid användning av PCA. Vidare överträffade k-means både Self-Organizing Maps och Gaussian Mixture Model. Utifrån resultatanalysen rekommenderas det således att PCA bör tillämpas vid klusteringsbaserad anomalidetektering. Vidare är ytterligare forskning nödvändig för att avgöra huruvida klusteranalys är den mest lämpliga metoden för oövervakad anomalidetektering. Anomaly Detection Root Cause Analysis Cluster Analysis k-means Self-Organizing Map Gaussian Mixture Model Dimensionality Reduction Principal Component Analysis Hybrid Fiber-Coaxial Network. Anomalidetektering Rotfelsanalys Klusteranalys k-means Self-Organizing Map Gaussian Mixture Model Dimensionsreducering Principal Component Analysis Hybrid Fiber Coax-nät. Computer and Information Sciences Data- och informationsvetenskap
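Clustering-based anomaly detection of the kind entry 220 describes can be illustrated with k-means: fit centroids, then score each observation by its distance to the nearest centroid. The sketch below is hedged throughout. The abstract does not state the study's anomaly assumption, so the rule used here (flag scores above the mean plus three standard deviations) is an assumption, the data are synthetic 2-D points standing in for already-reduced features, and all names are hypothetical.

```python
import random
import statistics

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm; returns (centroids, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centroids[c]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        new = [[sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[i]
               for i, cl in enumerate(clusters)]
        if new == centroids:
            break
        centroids = new
    inertia = sum(min(dist2(p, c) for c in centroids) for p in points)
    return centroids, inertia

def best_kmeans(points, k, restarts=10):
    """Keep the restart with the lowest inertia to reduce sensitivity to initialization."""
    return min((kmeans(points, k, seed=s) for s in range(restarts)), key=lambda r: r[1])[0]

# Synthetic data (assumption): two tight clusters plus two injected outliers.
rng = random.Random(1)
points = []
for i in range(80):
    centre = 0.0 if i % 2 == 0 else 5.0
    points.append([rng.gauss(centre, 0.3), rng.gauss(centre, 0.3)])
points += [[2.5, -4.0], [10.0, 0.0]]  # indices 80 and 81

centroids = best_kmeans(points, 2)
scores = [min(dist2(p, c) for c in centroids) ** 0.5 for p in points]
# Assumed flagging rule: more than three standard deviations above the mean score.
threshold = statistics.mean(scores) + 3 * statistics.pstdev(scores)
flagged = [i for i, s in enumerate(scores) if s > threshold]
print(flagged)
```

In practice the points would first be standardized and reduced, for example with PCA as the abstract recommends, and the threshold would be tuned to the network data rather than fixed at three standard deviations.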
