221 |
Robust and distributed top-n frequent-pattern mining with SAP BW accelerator
Lehner, Wolfgang, Legler, Thomas, Schaffner, Jan, Krüger, Jens, 22 April 2022 (has links)
Mining for association rules and frequent patterns is a central activity in data mining. However, most existing algorithms are only moderately suitable for real-world scenarios. Most strategies use parameters like minimum support, for which it can be very difficult to define a suitable value for unknown datasets. Since most untrained users are unable or unwilling to set such technical parameters, we address the problem of replacing the minimum-support parameter with top-n strategies. In our paper, we start by extending a top-n implementation of the ECLAT algorithm to improve its performance by using heuristic search-strategy optimizations. Also, real-world datasets are often distributed, and modern database architectures are switching from expensive SMPs to cheaper shared-nothing blade servers. Thus, most mining queries require distribution handling. Since partitioning can be forced by user-defined semantics, it is often forbidden to transform the data. Therefore, we developed an adaptive top-n frequent-pattern mining algorithm that simplifies the mining process on real distributions by relaxing some requirements on the results. We first combine the PARTITION and the TPUT algorithms to handle distributed top-n frequent-pattern mining. Then, we extend this new algorithm for distributions with real-world data characteristics. For frequent-pattern mining algorithms, equal data distributions are an important precondition, and tiny partitions can cause performance bottlenecks. Hence, we implemented an approach called MAST that defines a minimum absolute-support threshold. MAST prunes patterns with low chances of reaching the global top-n result set and high computing costs. In total, our approach simplifies the process of frequent-pattern mining for real customer scenarios and datasets. This may make frequent-pattern mining accessible to entirely new user groups. Finally, we present results of our algorithms when run on the SAP NetWeaver BW Accelerator with standard and real business datasets.
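As an illustration of the general idea described above, replacing a hard-to-choose minimum-support parameter with a top-n strategy plus an absolute pruning threshold, here is a minimal Python sketch. It is not the authors' ECLAT/PARTITION/TPUT implementation; the function name `top_n_itemsets`, the parameter `mast`, and the toy transactions are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def top_n_itemsets(transactions, n=5, mast=2, max_size=3):
    """Return the n most frequent itemsets, pruning candidates whose absolute
    support falls below `mast` (a stand-in for the minimum absolute-support
    threshold idea described in the abstract)."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    # Prune low-support patterns, then keep the n best by support.
    frequent = [(support, itemset) for itemset, support in counts.items() if support >= mast]
    return sorted(frequent, reverse=True)[:n]

if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for support, itemset in top_n_itemsets(db, n=5, mast=2):
        print(itemset, support)
```

The `mast` parameter mirrors the pruning idea from the abstract: patterns that cannot plausibly reach the global top-n are discarded early instead of being counted further.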
|
222 |
Towards Efficient Convolutional Neural Architecture Design
Richter, Mats L., 10 May 2022 (has links)
The design and adjustment of convolutional neural network architectures is an opaque and mostly trial-and-error-driven process.
The main reason for this is the lack of proper paradigms, beyond general conventions, for the development of neural network architectures, and the lack of effective insights into the models that could be propagated back into design decisions.
In order for the task-specific design of deep learning solutions to become more efficient and goal-oriented, novel design strategies need to be developed that are founded on an understanding of convolutional neural network models.
This work develops tools for the analysis of the inference process in trained neural network models.
Based on these tools, characteristics of convolutional neural network models are identified that can be linked to inefficiencies in predictive and computational performance.
Based on these insights, this work presents methods for effectively diagnosing these design faults before and during training with little computational overhead.
These findings are empirically tested and demonstrated on architectures with sequential and multi-pathway structures, covering all the common types of convolutional neural network architectures used for classification.
Furthermore, this work proposes simple optimization strategies that allow for goal-oriented and informed adjustment of the neural architecture, opening the potential for a less trial-and-error-driven design process.
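The abstract does not spell out the concrete analysis tools, so the following is only a rough, hypothetical sketch of what probing the inference process of a trained convolutional model can look like: forward hooks record per-layer activation statistics, and layers whose feature maps carry almost no variance would be flagged for closer inspection. All names and the toy model are assumptions, not the methods developed in the thesis.

```python
import torch
import torch.nn as nn

def collect_layer_stats(model, batch):
    """Record simple per-layer statistics of intermediate feature maps.
    Convolutional layers whose activations carry very little variance are
    candidates for closer inspection when diagnosing inefficient designs."""
    stats, hooks = {}, []

    def make_hook(name):
        def hook(_module, _inputs, output):
            flat = output.detach().flatten(1)   # (batch, features)
            stats[name] = {"mean": flat.mean().item(), "std": flat.std().item()}
        return hook

    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            hooks.append(module.register_forward_hook(make_hook(name)))
    with torch.no_grad():
        model(batch)
    for h in hooks:
        h.remove()
    return stats

# Example with a toy sequential CNN (purely illustrative):
toy = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 16, 3))
print(collect_layer_stats(toy, torch.randn(4, 3, 32, 32)))
```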
|
223 |
Dateneffiziente selbstlernende neuronale Regler
Hafner, Roland, 04 December 2009 (has links)
This thesis investigates the design and application of self-learning controllers as intelligent controller components in the loop of a closed-loop control system for control-engineering applications. The laborious process of analyzing the dynamic system and synthesizing the controller, which the classical design procedures of control engineering require, is replaced by a learning controller component. This component can be deployed with very little knowledge about the process to be controlled and learns a precise control toward externally specified reference values directly through interaction. The learning procedure is based on an optimization process using a powerful batch reinforcement learning method, Neural Fitted Q-Iteration (NFQ). This method is examined with respect to its use as a self-learning controller. For the shortcomings identified in these investigations with respect to the required precise, time-optimal control, improved procedures are developed and likewise evaluated for their performance. For typical control-engineering problems, the discrete actions of the NFQ method are not sufficient to produce precise control toward arbitrary reference values. By developing an extension of NFQ to continuous actions, the accuracy and performance of the self-learning controllers are increased drastically without increasing the required interaction time with the process. The performance of the developed methods is evaluated empirically on selected control problems for linear and nonlinear processes. It turns out that the self-learning controllers developed here learn a precise control strategy for arbitrary external reference values within a few minutes of interaction time with a process, without any expert knowledge about the process being available.
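To make the batch-learning idea behind Neural Fitted Q-Iteration concrete, here is a minimal sketch assuming a small discrete action set and a list of recorded transitions; it uses scikit-learn's MLPRegressor as a stand-in for the neural Q-function. It is not the thesis implementation, and the continuous-action extension discussed above is not shown.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def neural_fitted_q(transitions, actions, gamma=0.95, iterations=20):
    """Batch fitted Q-iteration on a fixed set of transitions
    (state, action, cost, next_state), with states as 1D arrays and
    actions as scalars. A sketch of the NFQ idea only."""
    q = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=500)
    X = np.array([np.append(s, a) for s, a, c, s2 in transitions])
    first_pass = True
    for _ in range(iterations):
        targets = []
        for s, a, c, s2 in transitions:
            if first_pass:
                q_next = 0.0                     # no Q-function fitted yet
            else:
                q_next = min(q.predict([np.append(s2, a2)])[0] for a2 in actions)
            targets.append(c + gamma * q_next)   # cost-minimizing formulation
        q.fit(X, np.array(targets))
        first_pass = False
    return q
```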
|
224 |
Auswirkung des Rauschens und Rauschen vermindernder Maßnahmen auf ein fernerkundliches Segmentierungsverfahren
Gerhards, Karl, 31 July 2006 (has links)
A large number of smoothing algorithms exist for reducing the noise in very high-resolution satellite images. The effect of various low-pass and edge-preserving filters on the behavior of an object-oriented segmentation method is investigated using two synthetic gray-value images and an IKONOS scene. As a noise measure, a modified version of a method originally proposed by Baltsavias et al. [2001] has proven itself, in which, for each gray value, only the standard deviations of the most homogeneous regions are taken into account. A comparison with synthetically noise-degraded images shows, however, that this approach systematically underestimates the noise in the image by almost a factor of two. Simple filters such as mean filters and methods derived from them degrade the precision of object recognition dramatically; edge-preserving filters can be advantageous for more strongly noise-affected data. The modified EPOS filter, originally introduced by Haag and Sties [1994, 1996], proves to be the best filter for cases where segment boundaries precise to the pixel level are required, and it can be controlled with only a single parameter. The general image parameters, such as standard deviation or histogram, are only insignificantly affected by this edge-preserving filter.
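As a rough sketch of the kind of noise measure described above (per gray value, taking only the standard deviations of the most homogeneous regions into account), the following Python function tiles an image into windows and keeps, per gray-value bin, only the lowest window standard deviations. Window size, binning, and the kept fraction are illustrative assumptions, not the modified Baltsavias et al. procedure itself.

```python
import numpy as np

def noise_per_gray_value(image, win=5, keep_fraction=0.1, bins=16):
    """Estimate noise per gray-value bin from the standard deviations of
    the most homogeneous windows only (sketch of the idea, not the
    procedure used in the thesis)."""
    h, w = image.shape
    means, stds = [], []
    for y in range(0, h - win, win):
        for x in range(0, w - win, win):
            patch = image[y:y + win, x:x + win].astype(float)
            means.append(patch.mean())
            stds.append(patch.std())
    means, stds = np.array(means), np.array(stds)
    edges = np.linspace(image.min(), image.max(), bins + 1)
    estimate = {}
    for i in range(bins):
        sel = stds[(means >= edges[i]) & (means < edges[i + 1])]
        if len(sel):
            k = max(1, int(len(sel) * keep_fraction))  # most homogeneous windows
            estimate[(edges[i], edges[i + 1])] = float(np.sort(sel)[:k].mean())
    return estimate
```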
|
225 |
Transparent Object Reconstruction and Registration Confidence Measures for 3D Point Clouds based on Data Inconsistency and Viewpoint Analysis
Albrecht, Sven, 28 February 2018 (has links)
A large number of current mobile robots use 3D sensors as part of their sensor setup. Common 3D sensors, i.e., laser scanners or RGB-D cameras, emit a signal (laser light or infrared light, for instance), and its reflection is recorded in order to estimate the depth to a surface. The resulting set of measurement points is commonly referred to as a 'point cloud'. The first part of this dissertation addresses an inherent problem of sensors that emit a light signal, namely that these signals can be reflected and/or refracted by transparent or highly specular surfaces, causing erroneous or missing measurements. A novel heuristic approach is introduced that shows how such objects may nevertheless be identified and their size and shape reconstructed by fusing information from several viewpoints of the scene. In contrast to other existing approaches, no prior knowledge about the objects is required, nor is the shape of the reconstructed objects restricted to a limited set of geometric primitives. The thesis proceeds to illustrate problems caused by sensor noise and registration errors and introduces mechanisms to address these problems. Finally, a quantitative comparison between equivalent directly measured objects, the reconstructions, and "ground truth" is provided. The second part of the thesis addresses the problem of automatically determining the quality of the registration for a pair of point clouds. Although a different topic, these two problems are closely related if modeled in the fashion of this thesis. After illustrating why the output parameters of a popular registration algorithm (ICP) are not suitable to deduce registration quality, several heuristic measures are developed that provide better insight. Experiments were performed on different datasets to showcase the applicability of the proposed measures in different scenarios.
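The abstract argues that ICP's own output parameters are unsuited for judging registration quality. As a simple illustration of an alternative heuristic (not one of the measures proposed in the thesis), the sketch below computes the fraction of points in one cloud that find a close neighbor in the other after registration; the distance threshold and data are invented.

```python
import numpy as np
from scipy.spatial import cKDTree

def overlap_ratio(cloud_a, cloud_b, max_dist=0.01):
    """Fraction of points in cloud_a with a neighbor in cloud_b closer
    than max_dist after registration. A simple confidence heuristic,
    illustrative rather than one of the thesis' measures."""
    tree = cKDTree(cloud_b)
    dists, _ = tree.query(cloud_a, k=1)
    return float(np.mean(dists < max_dist))

# Usage: a high ratio suggests the two registered scans agree where they overlap.
a = np.random.rand(1000, 3)
b = a + np.random.normal(scale=0.002, size=a.shape)   # slightly perturbed copy
print(overlap_ratio(a, b, max_dist=0.01))
```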
|
226 |
Classification of Glioblastoma Multiforme Patients Based on an Integrative Multi-Layer Finite Mixture Model System
Campos Valenzuela, Jaime Alberto, 26 November 2018 (has links)
Glioblastoma multiforme (GBM) is an extremely aggressive and invasive brain cancer with a median survival of less than one year. In addition, due to its anaplastic nature, the histological classification of this cancer is not simple. These characteristics make this disease an interesting and important target for new methodologies of analysis and classification.
In recent years, molecular information has been used to segregate and analyze GBM patients, but generally this methodology utilizes single-'omic' data to perform the classification, or multi-'omic' data in a sequential manner.
In this project, a novel approach for the classification and analysis of patients with GBM is presented. The main objective of this work is to find clusters of patients with distinctive profiles using multi-'omic' data with a truly integrative methodology.
Over the last years, the TCGA consortium has made thousands of multi-'omic' samples publicly available for multiple cancer types. Thanks to this, it was possible to obtain numerous GBM samples (> 300) with data for gene and microRNA expression, CpG site methylation, and copy-number variation (CNV).
To achieve our objective, a mixture of linear models was built for each gene, using its expression as output and multi-'omic' data as covariates. Each model was coupled with a lasso penalization scheme, and thanks to the mixture nature of the model, it was possible to fit multiple submodels and thus discover different linear relationships within the same model.
This complex but interpretable method was used to train over 10,000 models. For roughly 2,400 cases, two or more submodels were obtained.
Using the models and their submodels, 6 different clusters of patients were discovered. The clusters were profiled based on clinical information and gene mutations. Through this analysis, a clear separation was found between younger patients with a higher survival rate (Clusters 1, 2 and 3) and older patients with a lower survival rate (Clusters 4, 5 and 6). Mutations in the gene IDH1 were found almost exclusively in Cluster 2; additionally, Cluster 5 presented a hypermutated profile. Finally, several genes not previously related to GBM showed a significant presence in the clusters, such as C15orf2 and CHEK2.
The most significant models for each cluster were studied, with a special focus on their covariates. It was discovered that the number of shared significant models was very small and that well-known GBM-related genes, such as EGFR1 and TP53, appeared as significant covariates for plenty of models. Along with them, ubiquitin-related genes (UBC and UBD) and NRF1, which had not been linked to GBM previously, played a very significant role.
This work showed the potential of using a mixture of linear models to integrate multi-'omic' data and to group patients in order to profile them and find novel markers. The resulting clusters showed unique profiles, and their significant models and covariates comprised both well-known GBM-related genes and novel markers, which opens the possibility of new approaches to study and attack this disease. The next step of the project is to improve several elements of the methodology to achieve a more detailed analysis of the models and covariates, in particular by taking into account the regression coefficients of the submodels.
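A much-simplified sketch of the modeling building block described above: one gene's expression regressed on multi-'omic' covariates with a lasso penalty. The mixture/submodel machinery of the actual method is omitted, and the data, dimensions, and coefficients below are synthetic.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Synthetic stand-ins for one gene's expression and its multi-omic covariates
# (e.g. miRNA expression, CpG methylation, CNV); purely illustrative.
rng = np.random.default_rng(0)
n_samples, n_covariates = 300, 50
X = rng.normal(size=(n_samples, n_covariates))
true_coef = np.zeros(n_covariates)
true_coef[:3] = [1.5, -2.0, 0.8]          # only a few covariates matter
y = X @ true_coef + rng.normal(scale=0.5, size=n_samples)

# Lasso with cross-validated penalty yields a sparse set of relevant covariates.
model = LassoCV(cv=5).fit(X, y)
selected = np.flatnonzero(model.coef_)
print("selected covariates:", selected)
print("coefficients:", model.coef_[selected])
```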
|
227 |
Convolutional Neural Networks for Epileptic Seizure Prediction
Eberlein, Matthias, Hildebrand, Raphael, Tetzlaff, Ronald, Hoffmann, Nico, Kuhlmann, Levin, Brinkmann, Benjamin, Müller, Jens, 27 February 2019 (has links)
Epilepsy is the most common neurological disorder, and an accurate forecast of seizures would help to overcome the patient's uncertainty and helplessness. In this contribution, we present and discuss a novel methodology for the classification of intracranial electroencephalography (iEEG) for seizure prediction. Contrary to previous approaches, we categorically refrain from extracting hand-crafted features and instead use a convolutional neural network (CNN) topology both to determine suitable signal characteristics and to perform the binary classification of preictal and interictal segments. Three different models have been evaluated on public datasets with long-term recordings from four dogs and three patients. Overall, our findings demonstrate the general applicability of the approach. In this work we discuss the strengths and limitations of our methodology.
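As an illustration of a CNN that classifies raw multichannel iEEG segments without hand-crafted features, here is a minimal 1D-CNN sketch in PyTorch. The channel count, segment length, and layer sizes are placeholders; this is not one of the three models evaluated in the paper.

```python
import torch
import torch.nn as nn

class SeizureCNN(nn.Module):
    """Binary preictal vs. interictal classifier for iEEG segments.
    A placeholder topology, not one of the models from the paper."""
    def __init__(self, n_channels=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(64, 2)       # interictal vs. preictal

    def forward(self, x):                        # x: (batch, channels, samples)
        return self.classifier(self.features(x).squeeze(-1))

model = SeizureCNN()
logits = model(torch.randn(8, 16, 4000))         # 8 random 16-channel segments
print(logits.shape)                              # torch.Size([8, 2])
```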
|
228 |
A Formal View on Training of Weighted Tree Automata by Likelihood-Driven State Splitting and Merging
Dietze, Toni, 03 June 2019 (has links)
The use of computers and algorithms to deal with human language, in both spoken and written form, is summarized by the term natural language processing (nlp). Modeling language in a way that is suitable for computers plays an important role in nlp. One idea is to use formalisms from theoretical computer science for that purpose. For example, one can try to find an automaton to capture the valid written sentences of a language. Finding such an automaton by way of examples is called training.
In this work, we also consider the structure of sentences by making use of trees. We use weighted tree automata (wta) in order to deal with such tree structures. Those devices assign weights to trees in order to, for example, distinguish between good and bad structures. The well-known expectation-maximization algorithm can be used to train the weights of a wta while the state behavior stays fixed. As a way to adapt the state behavior of a wta, state splitting, i.e. dividing a state into several new states, and state merging, i.e. replacing several states by a single new state, can be used. State splitting, state merging, and the expectation-maximization algorithm were already combined into the state splitting and merging algorithm, which was successfully applied in practice. In our work, we formalized this approach in order to show properties of the algorithm. We also examined a new approach – the count-based state merging algorithm – which exclusively relies on state merging.
When dealing with trees, another important tool is binarization. A binarization is a strategy to code arbitrary trees by binary trees. For each of three different binarizations we showed that wta together with the binarization are as powerful as weighted unranked tree automata (wuta). We also showed that this is still true if only probabilistic wta and probabilistic wuta are considered.
How to Read This Thesis
1. Introduction
1.1. The Contributions and the Structure of This Work
2. Preliminaries
2.1. Sets, Relations, Functions, Families, and Extrema
2.2. Algebraic Structures
2.3. Formal Languages
3. Language Formalisms
3.1. Context-Free Grammars (CFGs)
3.2. Context-Free Grammars with Latent Annotations (CFG-LAs)
3.3. Weighted Tree Automata (WTAs)
3.4. Equivalences of WCFG-LAs and WTAs
4. Training of WTAs
4.1. Probability Distributions
4.2. Maximum Likelihood Estimation
4.3. Probabilities and WTAs
4.4. The EM Algorithm for WTAs
4.5. Inside and Outside Weights
4.6. Adaption of the Estimation of Corazza and Satta [CS07] to WTAs
5. State Splitting and Merging
5.1. State Splitting and Merging for Weighted Tree Automata
5.1.1. Splitting Weights and Probabilities
5.1.2. Merging Probabilities
5.2. The State Splitting and Merging Algorithm
5.2.1. Finding a Good π-Distributor
5.2.2. Notes About the Berkeley Parser
5.3. Conclusion and Further Research
6. Count-Based State Merging
6.1. Preliminaries
6.2. The Likelihood of the Maximum Likelihood Estimate and Its Behavior While Merging
6.3. The Count-Based State Merging Algorithm
6.3.1. Further Adjustments for Practical Implementations
6.4. Implementation of Count-Based State Merging
6.5. Experiments with Artificial Automata and Corpora
6.5.1. The Artificial Automata
6.5.2. Results
6.6. Experiments with the Penn Treebank
6.7. Comparison to the Approach of Carrasco, Oncina, and Calera-Rubio [COC01]
6.8. Conclusion and Further Research
7. Binarization
7.1. Preliminaries
7.2. Relating WSTAs and WUTAs via Binarizations
7.2.1. Left-Branching Binarization
7.2.2. Right-Branching Binarization
7.2.3. Mixed Binarization
7.3. The Probabilistic Case
7.3.1. Additional Preliminaries About WSAs
7.3.2. Constructing an Out-Probabilistic WSA from a Converging WSA
7.3.3. Binarization and Probabilistic Tree Automata
7.4. Connection to the Training Methods in Previous Chapters
7.5. Conclusion and Further Research
A. Proofs for Preliminaries
B. Proofs for Training of WTAs
C. Proofs for State Splitting and Merging
D. Proofs for Count-Based State Merging
Bibliography
List of Algorithms
List of Figures
List of Tables
Index
Table of Variable Names
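Relating to the binarization discussion in the abstract above, the following is a small illustrative sketch of a left-branching-style binarization of unranked trees into binary trees. The helper symbols ('@', 'nil') and the exact encoding are assumptions made for illustration, not the constructions whose equivalences are proved in the thesis.

```python
def left_branching_binarization(tree):
    """Encode an unranked tree (label, [children]) as a binary tree, in the
    spirit of a left-branching binarization: the children are folded
    left-to-right into a chain of binary '@' nodes. Encoding symbols are
    illustrative only."""
    label, children = tree
    if not children:
        return (label, [])
    acc = ("nil", [])
    for child in children:
        acc = ("@", [acc, left_branching_binarization(child)])
    return (label, [acc])

# sigma(alpha, beta, gamma) becomes a binary tree over '@'/'nil' helper nodes.
t = ("sigma", [("alpha", []), ("beta", []), ("gamma", [])])
print(left_branching_binarization(t))
```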
|
229 |
Content-Aware Image Restoration Techniques without Ground Truth and Novel Ideas to Image Reconstruction
Buchholz, Tim-Oliver, 12 August 2022 (has links)
In this thesis I will use state-of-the-art (SOTA) image denoising methods to denoise electron microscopy (EM) data.
Then, I will present Noise2Void, a deep-learning-based self-supervised image denoising approach which is trained on single noisy observations.
Finally, I approach the missing wedge problem in tomography and introduce a novel image encoding based on the Fourier transform, which I use to predict missing Fourier coefficients directly in Fourier space with the Fourier Image Transformer (FIT).
In the next paragraphs I will summarize the individual contributions briefly.
Electron microscopy is the go-to method for high-resolution images in biological research.
Modern scanning electron microscopy (SEM) setups are used to obtain neural connectivity maps, allowing us to identify individual synapses.
However, slow scanning speeds are required to obtain SEM images of sufficient quality.
In (Weigert et al. 2018) the authors show, for fluorescence microscopy, how pairs of low- and high-quality images can be obtained from biological samples and use them to train content-aware image restoration (CARE) networks.
Once such a network is trained, it can be applied to noisy data to restore high quality images.
With SEM-CARE I present how this approach can be directly applied to SEM data, allowing us to scan the samples faster, resulting in 40- to 50-fold imaging speedups for SEM imaging.
In structural biology, cryo transmission electron microscopy (cryo TEM) is used to resolve protein structures and describe molecular interactions.
However, the lack of contrast agents as well as beam-induced sample damage (Knapek and Dubochet 1980) prevent the acquisition of high-quality projection images.
Hence, reconstructed tomograms suffer from a low signal-to-noise ratio (SNR) and low contrast, which makes post-processing of such data difficult; it often has to be done manually.
To facilitate downstream analysis and manual data browsing of cryo tomograms, I present cryoCARE, a Noise2Noise (Lehtinen et al. 2018) based denoising method which is able to restore high-contrast, low-noise tomograms from sparse-view, low-dose tilt-series.
An implementation of cryoCARE is publicly available as Scipion (de la Rosa-Trevín et al. 2016) plugin.
Next, I will discuss the problem of self-supervised image denoising.
With cryoCARE I exploited the fact that modern cryo TEM cameras acquire multiple low-dose images, hence the Noise2Noise (Lehtinen et al. 2018) training paradigm can be applied.
However, acquiring multiple noisy observations is not always possible, e.g., in live imaging, with old cryo TEM cameras, or simply due to lack of access to the imaging system used.
In such cases we have to fall back to self-supervised denoising methods, and with Noise2Void I present the first self-supervised, neural-network-based image denoising approach.
Noise2Void is also available as an open-source Python package and as a one-click solution in Fiji (Schindelin et al. 2012).
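The core blind-spot idea behind Noise2Void can be sketched as follows: a few pixels of a single noisy patch are replaced by values of random neighbors, and the network is trained to predict the original values only at those masked positions. This is a conceptual sketch, not the released implementation; the mask size and neighbor scheme are assumptions.

```python
import numpy as np

def blind_spot_batch(patch, n_masked=64, rng=np.random.default_rng()):
    """Create a Noise2Void-style training pair from a single noisy patch:
    the input has a few pixels replaced by random neighbor values, the
    target is the original patch, and the mask marks where the loss is
    evaluated. Conceptual sketch only."""
    h, w = patch.shape
    inp, mask = patch.copy(), np.zeros_like(patch, dtype=bool)
    ys = rng.integers(1, h - 1, n_masked)
    xs = rng.integers(1, w - 1, n_masked)
    for y, x in zip(ys, xs):
        dy, dx = rng.integers(-1, 2, 2)   # random neighbor offset (may pick the
        inp[y, x] = patch[y + dy, x + dx]  # pixel itself; real code excludes it)
        mask[y, x] = True
    return inp, patch, mask                # loss: ((pred - target)**2 * mask).sum()

noisy = np.random.rand(64, 64)
inp, target, mask = blind_spot_batch(noisy)
print(mask.sum(), "pixels masked")
```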
In the last part of this thesis I present the Fourier Image Transformer (FIT), a novel approach to image reconstruction with Transformer networks.
I develop a novel 1D image encoding based on the Fourier transform, where each prefix encodes the whole image at reduced resolution, which I call the Fourier Domain Encoding (FDE).
I use FIT with FDEs and present proof of concept for super-resolution and tomographic reconstruction with missing wedge correction.
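A toy illustration of the claim that a prefix of Fourier coefficients encodes the whole image at reduced resolution: keeping only the lowest-frequency block of an image's 2D Fourier transform and inverting it yields a blurred preview of the full image. The actual FDE uses a specific 1D ordering and normalization, which this sketch does not reproduce.

```python
import numpy as np

def lowpass_preview(image, keep=8):
    """Reconstruct an image from only its lowest-frequency block of centered
    Fourier coefficients: a toy version of 'a prefix of coefficients encodes
    the whole image at reduced resolution'."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    mask = np.zeros_like(spectrum)
    cy, cx = h // 2, w // 2
    mask[cy - keep:cy + keep, cx - keep:cx + keep] = 1
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))

img = np.random.rand(128, 128)
print(lowpass_preview(img, keep=8).shape)        # (128, 128), blurred content
```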
The missing wedge artefacts in tomographic imaging originate in sparse-view imaging.
Sparse-view imaging is used to keep the total exposure of the imaged sample to a minimum, by only acquiring a limited number of projection images.
However, tomographic reconstructions from sparse-view acquisitions are affected by missing wedge artefacts, characterized by missing wedges in the Fourier space and visible as streaking artefacts in real image space.
I show that FITs can be applied to tomographic reconstruction and that they fill in missing Fourier coefficients.
Hence, FIT for tomographic reconstruction solves the missing wedge problem at its source.
Contents
Summary iii
Acknowledgements v
1 Introduction 1
1.1 Scanning Electron Microscopy . . . . . . . . . . . . . . . . . . . . 3
1.2 Cryo Transmission Electron Microscopy . . . . . . . . . . . . . . . 4
1.2.1 Single Particle Analysis . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Cryo Tomography . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Tomographic Reconstruction . . . . . . . . . . . . . . . . . . . . . 8
1.4 Overview and Contributions . . . . . . . . . . . . . . . . . . . . . 11
2 Denoising in Electron Microscopy 15
2.1 Image Denoising . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2 Supervised Image Restoration . . . . . . . . . . . . . . . . . . . . 19
2.2.1 Training and Validation Loss . . . . . . . . . . . . . . . . 19
2.2.2 Neural Network Architectures . . . . . . . . . . . . . . . . 21
2.3 SEM-CARE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.1 SEM-CARE Experiments . . . . . . . . . . . . . . . . . . 23
2.3.2 SEM-CARE Results . . . . . . . . . . . . . . . . . . . . . 25
2.4 Noise2Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5 cryoCARE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.5.1 Restoration of cryo TEM Projections . . . . . . . . . . . . 27
2.5.2 Restoration of cryo TEM Tomograms . . . . . . . . . . . . 29
2.5.3 Automated Downstream Analysis . . . . . . . . . . . . . . 31
2.6 Implementations and Availability . . . . . . . . . . . . . . . . . . 32
2.7 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.7.1 Tasks Facilitated through cryoCARE . . . . . . . . . . . 33
3 Noise2Void: Self-Supervised Denoising 35
3.1 Probabilistic Image Formation . . . . . . . . . . . . . . . . . . . . 37
3.2 Receptive Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Noise2Void Training . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3.1 Implementation Details . . . . . . . . . . . . . . . . . . . . 41
3.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.4.1 Natural Images . . . . . . . . . . . . . . . . . . . . . . . . 43
3.4.2 Light Microscopy Data . . . . . . . . . . . . . . . . . . . . 44
3.4.3 Electron Microscopy Data . . . . . . . . . . . . . . . . . . 47
3.4.4 Errors and Limitations . . . . . . . . . . . . . . . . . . . . 48
3.5 Conclusion and Followup Work . . . . . . . . . . . . . . . . . . . 50
4 Fourier Image Transformer 53
4.1 Transformers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.1.1 Attention Is All You Need . . . . . . . . . . . . . . . . . . 55
4.1.2 Fast-Transformers . . . . . . . . . . . . . . . . . . . . . . . 56
4.1.3 Transformers in Computer Vision . . . . . . . . . . . . . . 57
4.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.1 Fourier Domain Encodings (FDEs) . . . . . . . . . . . . . 57
4.2.2 Fourier Coefficient Loss . . . . . . . . . . . . . . . . . . . . 59
4.3 FIT for Super-Resolution . . . . . . . . . . . . . . . . . . . . . . . 60
4.3.1 Super-Resolution Data . . . . . . . . . . . . . . . . . . . . 60
4.3.2 Super-Resolution Experiments . . . . . . . . . . . . . . . . 61
4.4 FIT for Tomography . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.4.1 Computed Tomography Data . . . . . . . . . . . . . . . . 64
4.4.2 Computed Tomography Experiments . . . . . . . . . . . . 66
4.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5 Conclusions and Outlook 71
|
230 |
Data Augmentation GUI Tool for Machine Learning Models
Sharma, Sweta, 30 October 2023 (has links)
The industrial production of semiconductor assemblies is subject to high requirements. As a result, several tests are needed to ensure component quality. In the long run, manual quality assurance (QA) is often associated with higher expenditures. Using a technique based on machine learning, some of these tests may be carried out automatically. Deep neural networks (NNs) have been shown to be very effective in a diverse range of computer vision applications. Especially convolutional neural networks (CNNs), a subset of NNs, are an effective tool for image classification. Deep NNs have the disadvantage of requiring a significant quantity of training data to reach excellent performance. When the dataset is too small, a phenomenon known as overfitting can occur. Massive amounts of data cannot be supplied in certain contexts, such as the production of semiconductors. This is especially true given the relatively low number of rejected components in this field. In order to prevent overfitting, a variety of image augmentation methods may be used to artificially create additional training images. However, many of those methods cannot be applied in certain fields. For this thesis, Infineon Technologies AG provided images of a semiconductor component generated by an ultrasonic microscope. The images can be categorized into a sufficient number of good components and a minority of rejected ones, where good components are those that have been deemed to have passed quality control and rejected components are those that contain a defect and did not pass quality control.
The accomplishment of the project, the efficacy with which it is carried out, and its level of quality may depend on a number of factors; selecting the appropriate tools is one of the most important of these, because it enables significant time and resource savings while also producing the best results. We present a data augmentation graphical user interface (GUI) tool for techniques that are widely used in the domain of image processing. Using this tool, the dataset size has been increased while maintaining the accuracy-time trade-off and improving the robustness of deep learning models. The purpose of this work is to develop a user-friendly tool that incorporates traditional, advanced, and smart data augmentation, image processing, and machine learning (ML) approaches. More specifically, the techniques mainly used are zooming, rotation, flipping, cropping, GANs, fusion, histogram matching, autoencoders, image restoration, compression, etc. The work focuses on designing and implementing a MATLAB GUI for data augmentation and ML models. The thesis was carried out for Infineon Technologies AG in order to address a challenge that the entire semiconductor industry experiences. The key objective is not only to create an easy-to-use GUI, but also to ensure that its users do not need advanced technical experience to operate it. The GUI may run on its own as a standalone application, which can be deployed anywhere for the purposes of data augmentation and classification. The objective is to streamline the working process and make it easy to complete the quality assurance job even for those who are not familiar with data augmentation, machine learning, or MATLAB. In addition, the thesis investigates the benefits of data augmentation and image processing, as well as the possibility that these factors might contribute to an improvement in the accuracy of AI models.
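The tool itself is a MATLAB GUI; as a language-neutral illustration of the classical augmentations listed above (flipping, rotation, zooming/cropping), here is a short Python sketch using Pillow. Function names, parameter ranges, and the file names in the usage comment are illustrative assumptions.

```python
from PIL import Image, ImageOps
import random

def augment(image):
    """Apply a random combination of classical augmentations (flip, rotation,
    zoom implemented as crop-and-resize). A small Python sketch; the thesis
    tool itself is a MATLAB GUI."""
    out = image
    if random.random() < 0.5:
        out = ImageOps.mirror(out)                 # horizontal flip
    out = out.rotate(random.uniform(-15, 15))      # small random rotation
    # Random zoom: crop a slightly smaller region and resize back.
    w, h = out.size
    scale = random.uniform(0.8, 1.0)
    cw, ch = int(w * scale), int(h * scale)
    left, top = random.randint(0, w - cw), random.randint(0, h - ch)
    return out.crop((left, top, left + cw, top + ch)).resize((w, h))

# Usage (file names hypothetical): grow a small defect-image dataset by saving
# several augmented variants per original image.
# img = Image.open("component.png"); augment(img).save("component_aug0.png")
```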
|