31

Extracting Symptoms from Narrative Text using Artificial Intelligence

Gandhi, Priyanka 12 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Electronic health records collect an enormous amount of data about patients. However, the information about the patient’s illness is stored in progress notes that are in an unstructured format. It is difficult for humans to annotate symptoms listed in the free text. Recently, researchers have explored how advancements in deep learning can be applied to process biomedical data. The information in the text can be extracted with the help of natural language processing. The research presented in this thesis aims at automating the process of symptom extraction. The proposed methods use pre-trained word embeddings such as BioWord2Vec, BERT, and BioBERT to generate vectors of the words based on the semantics and syntactic structure of sentences. BioWord2Vec embeddings are fed into a BiLSTM neural network with a CRF layer to capture the dependencies between correlated terms in the sentence. The pre-trained BERT and BioBERT embeddings are fed into the BERT model with a CRF layer to analyze the output tags of neighboring tokens. The research shows that with the help of the CRF layer in neural network models, longer phrases of symptoms can be extracted from the text. The proposed models are compared with the UMLS MetaMap tool, which uses various sources to categorize the terms in the text into different semantic types, and with Stanford CoreNLP, a dependency parser that analyzes syntactic relations in the sentence to extract information. The performance of the models is analyzed using strict, relaxed, and n-gram evaluation schemes. The results show that BioBERT with a CRF layer can extract the majority of the human-labeled symptoms. Furthermore, the model is used to extract symptoms from COVID-19 tweets. The model was able to extract symptoms listed by the CDC as well as new symptoms.
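The abstract names strict and relaxed evaluation schemes without defining them. The sketch below shows one common reading, assumed here rather than taken from the thesis: a strict match requires identical span boundaries, while a relaxed match accepts any token overlap. The span format, function names, and sample spans are all hypothetical.

```python
from typing import Callable, List, Tuple

# A span is (start_token, end_token) with the end index exclusive.
# This representation is an assumption made for illustration.
Span = Tuple[int, int]


def strict_match(pred: Span, gold: Span) -> bool:
    """Strict scheme: boundaries must agree exactly."""
    return pred == gold


def relaxed_match(pred: Span, gold: Span) -> bool:
    """Relaxed scheme: any token overlap counts as a match."""
    return pred[0] < gold[1] and gold[0] < pred[1]


def precision_recall(
    pred_spans: List[Span],
    gold_spans: List[Span],
    match: Callable[[Span, Span], bool],
) -> Tuple[float, float]:
    """Precision and recall of predicted spans under a matching scheme."""
    matched_pred = sum(any(match(p, g) for g in gold_spans) for p in pred_spans)
    matched_gold = sum(any(match(p, g) for p in pred_spans) for g in gold_spans)
    precision = matched_pred / len(pred_spans) if pred_spans else 0.0
    recall = matched_gold / len(gold_spans) if gold_spans else 0.0
    return precision, recall


# Hypothetical annotations: two gold symptom phrases, three predictions.
gold = [(0, 3), (7, 8)]
pred = [(1, 3), (7, 8), (10, 12)]

strict_scores = precision_recall(pred, gold, strict_match)
relaxed_scores = precision_recall(pred, gold, relaxed_match)
```

Under the relaxed scheme the partially overlapping prediction (1, 3) is credited against the gold phrase (0, 3), which is why relaxed scores are never lower than strict ones on the same output.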
32

A New SCADA Dataset for Intrusion Detection System Research

Turnipseed, Ian P 14 August 2015 (has links)
Supervisory Control and Data Acquisition (SCADA) systems monitor and control industrial control systems in many industrial and economic sectors that are considered critical infrastructure. In the past, most SCADA systems were isolated from all other networks, but recently connections to corporate enterprise networks and the Internet have increased. Security concerns have arisen from this newfound connectivity. This thesis makes one primary contribution to researchers and industry: two datasets are introduced to support intrusion detection system (IDS) research for SCADA systems. The datasets include network traffic captured on a gas pipeline SCADA system in Mississippi State University’s SCADA lab. IDS researchers lack a common framework to train and test proposed algorithms. This leads to an inability to properly compare the IDSs presented in the literature and limits research progress. The datasets created for this thesis are available to aid researchers in assessing the performance of SCADA IDSs.
33

Cyberthreats, Attacks and Intrusion Detection in Supervisory Control and Data Acquisition Networks

Gao, Wei 14 December 2013 (has links)
Supervisory Control and Data Acquisition (SCADA) systems are computer-based process control systems that interconnect and monitor remote physical processes. There have been many documented real-world incidents and cyber-attacks affecting SCADA systems, which clearly illustrate critical infrastructure vulnerabilities. These reported incidents demonstrate that cyber-attacks against SCADA systems might cause a variety of financial damage and harmful events to humans and their environment. This dissertation documents four contributions towards increased security for SCADA systems. First, a set of cyber-attacks was developed. Second, each attack was executed against two fully functional SCADA systems in a laboratory environment: a gas pipeline and a water storage tank. Third, signature-based intrusion detection system rules were developed and tested which can be used to generate alerts when the aforementioned attacks are executed against a SCADA system. Fourth, a set of features was developed for a decision-tree-based anomaly-based intrusion detection system. The features were tested using the datasets developed for this work. This dissertation documents cyber-attacks on both serial-based and Ethernet-based SCADA networks. Four categories of attacks against SCADA systems are discussed: reconnaissance, malicious response injection, malicious command injection and denial of service. In order to evaluate the performance of data mining and machine learning algorithms for intrusion detection systems in SCADA systems, a network dataset to be used for benchmarking intrusion detection systems was generated. This network dataset includes different classes of attacks that simulate different attack scenarios on process control systems. This dissertation describes four SCADA network intrusion detection datasets: a full and an abbreviated dataset for both the gas pipeline and water storage tank systems. Each feature in the dataset is captured from network flow records.
This dataset groups two different categories of features that can be used as input to an intrusion detection system. First, network traffic features describe the communication patterns in a SCADA system. This research developed both signature-based IDS and anomaly-based IDS for the gas pipeline and water storage tank serial-based SCADA systems. The performance of both types of IDS was evaluated by measuring the detection rate and the prevalence of false positives.
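The two evaluation measures named above can be computed from per-record ground truth and IDS alerts. The following is a minimal sketch using the standard definitions (detection rate = TP / (TP + FN), false positive rate = FP / (FP + TN)). The function name and the sample records are hypothetical and do not come from the dissertation's datasets.

```python
from typing import Sequence, Tuple


def ids_metrics(labels: Sequence[int], alerts: Sequence[int]) -> Tuple[float, float]:
    """Return (detection_rate, false_positive_rate).

    labels[i] is 1 if record i is an attack, 0 if it is normal traffic.
    alerts[i] is 1 if the IDS raised an alert on record i, 0 otherwise.
    """
    if len(labels) != len(alerts):
        raise ValueError("labels and alerts must have the same length")

    tp = sum(1 for y, a in zip(labels, alerts) if y == 1 and a == 1)
    fn = sum(1 for y, a in zip(labels, alerts) if y == 1 and a == 0)
    fp = sum(1 for y, a in zip(labels, alerts) if y == 0 and a == 1)
    tn = sum(1 for y, a in zip(labels, alerts) if y == 0 and a == 0)

    # Guard against datasets that contain only one class.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Hypothetical records: four attacks and four normal transactions.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
alerts = [1, 0, 1, 0, 1, 0, 0, 1]
detection_rate, false_positive_rate = ids_metrics(labels, alerts)
```

Reporting both numbers together matters because an IDS that alerts on every record reaches a perfect detection rate while its false positive rate also reaches 1.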
34

Analysis and Comparison of a Detailed Land Cover Dataset versus the National Land Cover Dataset (NLCD) in Blacksburg, Virginia

White, Claire McKenzie 19 January 2012 (has links)
While many studies have completed accuracy assessments on the National Land Cover Dataset (NLCD), little research has utilized a detailed digitized land cover dataset, like that available for the Town of Blacksburg, for this comparison. This study aims to evaluate the information available from a detailed land cover dataset and compare it with the NLCD at a localized scale. More specifically, it utilizes the detailed land cover dataset for the Town of Blacksburg to analyze the land cover distribution for varying land uses, including single-family residential, multi-family residential, and non-residential. In addition, an application scenario assigns an area-weighted curve number to watersheds based on each land cover dataset. This study demonstrates the importance of obtaining detailed land cover datasets for cities and towns. Furthermore, it shows the comprehensive information and subsequent quantifications that can be derived from a detailed land cover dataset. / Master of Science
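An area-weighted curve number is conventionally the sum of each land cover patch's curve number multiplied by its area, divided by the total watershed area. The sketch below assumes that standard definition. The patch areas and curve number values are hypothetical placeholders, not figures from the thesis or from either land cover dataset.

```python
from typing import Iterable, Tuple


def area_weighted_cn(patches: Iterable[Tuple[float, float]]) -> float:
    """Composite curve number for a watershed.

    Each patch is (area, curve_number). Areas may be in any unit as long
    as the unit is the same for every patch.
    """
    patches = list(patches)
    total_area = sum(area for area, _ in patches)
    if total_area <= 0:
        raise ValueError("total watershed area must be positive")
    return sum(area * cn for area, cn in patches) / total_area


# Hypothetical watershed: impervious surface, lawn, and woodland patches.
watershed = [
    (2.0, 98.0),  # impervious cover (illustrative CN)
    (5.0, 61.0),  # lawn (illustrative CN)
    (3.0, 55.0),  # woodland (illustrative CN)
]
composite_cn = area_weighted_cn(watershed)
```

Running the same function on patches derived from two different land cover datasets for the same watershed gives a direct, single-number comparison of how the datasets differ in their hydrologic implications.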
35

Integrating Multiple Deep Learning Models for Disaster Description in Low-Altitude Videos

Wang, Haili 12 1900 (has links)
Computer vision technologies are rapidly improving and becoming more important in disaster response. The majority of disaster description techniques now focus on either identifying objects or categorizing disasters. In this study, we trained multiple deep neural networks on low-altitude imagery with highly imbalanced and noisy labels. We utilize labeled images from the LADI dataset to formulate a solution for general problems in disaster classification and object detection. Our research integrated and developed multiple deep learning models that perform the object detection task as well as the disaster scene classification task. Our solution is competitive in the TRECVID Disaster Scene Description and Indexing (DSDI) task, demonstrating that it is comparable to other suggested approaches in retrieving disaster-related video clips.
36

Analys av prediktiv precision av maskininlärningsalgoritmer / Analysis of the Predictive Accuracy of Machine Learning Algorithms

Remgård, Jonas January 2017 (has links)
In recent years, machine learning has become a popular subject. A question that many users face is how much training data is needed to get as accurate a result as possible. This study investigates the relationship between the training data, in both amount and structure, and the accuracy of the algorithm. Four different datasets (Iris, Digits, Symmetry and Double Symmetry) were studied using three different algorithms (Support Vector Classifier, K-Nearest Neighbor and Decision Tree Classifier). The study concludes that all three algorithms perform better with more training data up to a certain limit, and that this limit is different for each algorithm. The structure of the data instances also affects performance, with double symmetry giving stronger performance than single symmetry.
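The relationship between training set size and accuracy is usually examined with a learning curve. The sketch below illustrates the idea using only the standard library: a hand-written k-nearest-neighbor classifier on synthetic one-dimensional data. It is not the study's setup, which used the Iris, Digits and symmetry datasets with three classifiers. All names, the data generator, and the chosen sizes are assumptions.

```python
import random
from typing import List, Tuple

Point = Tuple[float, int]  # (feature value, class label)


def make_points(n: int, rng: random.Random) -> List[Point]:
    """Synthetic two-class data: class 0 centered at 0.0, class 1 at 2.0."""
    points = []
    for _ in range(n):
        label = rng.randint(0, 1)
        points.append((rng.gauss(2.0 * label, 1.0), label))
    return points


def knn_predict(train: List[Point], x: float, k: int = 3) -> int:
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes_for_one = sum(label for _, label in nearest)
    return 1 if votes_for_one * 2 > len(nearest) else 0


def learning_curve(sizes: List[int], seed: int = 0) -> List[Tuple[int, float]]:
    """Test accuracy of k-NN for each training set size in `sizes`."""
    rng = random.Random(seed)
    test = make_points(200, rng)
    pool = make_points(max(sizes), rng)
    curve = []
    for n in sizes:
        train = pool[:n]  # nested subsets, so larger sets extend smaller ones
        correct = sum(knn_predict(train, x) == y for x, y in test)
        curve.append((n, correct / len(test)))
    return curve


curve = learning_curve([5, 50, 500])
```

Plotting the accuracy values against the training sizes for several classifiers would show where each curve flattens, which is the per-algorithm limit the abstract refers to.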
37

Hierarchical Bayesian Dataset Selection

Zhou, Xiaona 05 1900 (has links)
Despite the profound impact of deep learning across various domains, supervised model training critically depends on access to large, high-quality datasets, which are often challenging to identify. To address this, we introduce Hierarchical Bayesian Dataset Selection (HBDS), the first dataset selection algorithm that utilizes hierarchical Bayesian modeling, designed for collaborative data-sharing ecosystems. The proposed method efficiently decomposes the contributions of dataset groups and individual datasets to local model performance using Bayesian updates with small data samples. Our experiments on two benchmark datasets demonstrate that HBDS not only offers a computationally lightweight solution but also enhances interpretability compared to existing data selection methods, by revealing deep insights into dataset interrelationships through learned posterior distributions. HBDS outperforms traditional non-hierarchical methods by correctly identifying all relevant datasets, achieving optimal accuracy with fewer computational steps, even when initial model accuracy is low. Specifically, HBDS surpasses its non-hierarchical counterpart by 1.8% on DIGIT-FIVE and 0.7% on DOMAINNET, on average. In settings with limited resources, HBDS achieves a 6.9% higher accuracy than its non-hierarchical counterpart. These results confirm HBDS's effectiveness in identifying datasets that improve the accuracy and efficiency of deep learning models when collaborative data utilization is essential. / Master of Science / Deep learning technologies have revolutionized many domains and applications, from voice recognition in smartphones to automated recommendations on streaming services. However, the success of these technologies heavily relies on having access to large and high-quality datasets. In many cases, selecting the right datasets can be a daunting challenge.
To tackle this, we have developed a new method that can quickly figure out which datasets or groups of datasets contribute most to improving the performance of a model with only a small amount of data needed. Our tests prove that this method is not only effective and light on computation but also helps us understand better how different datasets relate to each other.
38

Detekce chodců ve snímku pomocí metod strojového učení / Pedestrians Detection in Traffic Environment by Machine Learning

Tilgner, Martin January 2019 (has links)
This thesis deals with pedestrian detection using convolutional neural networks from the point of view of an autonomous vehicle. In particular, it tests these networks in order to find good practice for building datasets for machine learning models. A total of ten machine learning models were trained, using the meta-architectures Faster R-CNN with ResNet 101 as the feature extractor and SSDLite with MobileNet_v2 as the feature extractor. These models were trained on datasets of various sizes. The best results were achieved on a dataset of 5000 images. In addition to these models, a new dataset focusing on pedestrians at night was created. Furthermore, a library of Python functions for working with datasets and for automatic dataset creation was developed.
39

Všesměrová detekce objektů / Multiview Object Detection

Lohniský, Michal January 2014 (has links)
This thesis focuses on modification of feature extraction and multiview object detection learning process. We add new channels to detectors based on the "Aggregate channel features" framework. These new channels are created by filtering the picture by kernels from autoencoders followed by nonlinear function processing. Experiments show that these channels are effective in detection but they are also more computationally expensive. The thesis therefore discusses possibilities for improvements. Finally the thesis evaluates an artificial car dataset and discusses its small benefit on several detectors.
40

Klasifikace lučních porostů v Krkonoších s využitím leteckých hyperspektrálních dat a s pomocí vector machines klasifikace / Classification of meadow vegetation in the Krkonoše Mts. using aerial hyperspectral data and support vector machines classifier

Hromádková, Lucie January 2015 (has links)
Meadow vegetation in the Krkonoše Mountains National Park is classified in this master thesis using aerial hyperspectral data from the AISA sensor and the Support Vector Machines (SVM) and Neural Networks (NN) classification algorithms. The main goals of the master thesis are to determine the best settings of the SVM parameters and to propose an ideal design of a training dataset for this classification algorithm and for mapping the meadows in the Krkonoše Mountains. The test criterion will be classification accuracy (confusion matrices and kappa coefficient). An additional goal of the master thesis is to compare the performance of the two classifiers, especially regarding the number of training pixels necessary for successful classification of the mountainous meadow vegetation. Classification maps of the area of interest and Python scripts are the main outputs of the master thesis. These outputs will be handed over to the Administration of the Krkonoše Mountains National Park for further use in monitoring and protecting these valuable meadow vegetation communities. Key words: hyperspectral data, AISA, Support Vector Machines, Neural Networks, training dataset, mountainous meadow vegetation
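The kappa coefficient used as an accuracy criterion can be computed directly from a confusion matrix. The sketch below assumes the standard Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name and the sample matrix are hypothetical and not taken from the thesis.

```python
from typing import List


def cohen_kappa(confusion: List[List[int]]) -> float:
    """Cohen's kappa for a square confusion matrix.

    confusion[i][j] is the number of pixels of reference class i that the
    classifier assigned to class j.
    """
    size = len(confusion)
    if any(len(row) != size for row in confusion):
        raise ValueError("confusion matrix must be square")

    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix must contain at least one sample")

    # Observed agreement: share of samples on the diagonal.
    p_observed = sum(confusion[i][i] for i in range(size)) / total

    # Chance agreement: product of row and column marginals per class.
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(size)) for j in range(size)]
    p_expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical two-class result on 50 validation pixels.
matrix = [
    [20, 5],
    [10, 15],
]
kappa = cohen_kappa(matrix)
```

In this example the overall accuracy is 0.7 but kappa is lower, because half of the agreement would be expected by chance given the class marginals.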
