61 |
Constrained portfolio selection with Markov and non-Markov processes and insiders / Durrell, Fernando, January 2007 (has links)
Word processed copy.
Includes bibliographical references (p. 158-168).
|
62 |
A transdisciplinary study on developing knowledge based software tools for wildlife management in Namibia / Paterson, Barbara, January 2005 (has links)
Includes bibliographical references. / Two software tools for decision making in wildlife management were developed as part of the Transboundary Mammal Project, a joint initiative between the Ministry of Environment and Tourism, Namibia (MET) and the Namibia Nature Foundation (NNF). This project aimed to improve the management of selected rare and high-value species in Namibia by building a knowledge base for better informed decision making. The knowledge base was required to encapsulate the current knowledge and experience of conservation experts and specialists. To provide an electronic representation of this knowledge base, a hypermedia Information System for Rare Species Management (known as IRAS) was designed and implemented. The research therefore explores the disciplinary interstices of information technology, conservation and ethics, against the cultural background of a post-colonial society in which the deficits of the past constrain the impact and the efficacy of technological interventions.
|
63 |
Investigating automated bird detection from webcams using machine learning / Mirugwe, Alex, 22 June 2022 (has links)
One of the most challenging problems faced by ecologists and other biological researchers today is analyzing the massive amounts of data collected by advanced monitoring systems such as camera traps, wireless sensor networks, high-frequency radio trackers, global positioning systems, and satellite tracking systems. Analyzing these large datasets with manual and traditional statistical techniques has become expensive, laborious, and time-consuming. Recent developments in the field of deep learning show promising results towards automating this analysis. The primary objective of this study is to test the capabilities of state-of-the-art deep learning architectures to detect birds in webcam-captured images. A total of 10,592 images were collected for this study from Cornell Lab of Ornithology live stream feeds situated in six unique locations in the United States, Ecuador, New Zealand, and Panama. To achieve this objective, two convolutional neural network object detection meta-architectures, the single-shot detector (SSD) and Faster R-CNN, were studied and evaluated in combination with the MobileNet-V2, ResNet50, ResNet101, ResNet152, and Inception ResNet-V2 feature extractors. Through transfer learning, all models were initialized with weights pre-trained on the MS COCO (Microsoft Common Objects in Context) dataset provided by the TensorFlow 2 object detection API. The Faster R-CNN model coupled with ResNet152 outperformed all other models with a mean average precision of 92.3%. However, the SSD model with the MobileNet-V2 feature extraction network achieved the lowest inference time (110 ms) and the smallest memory footprint (30.5 MB) of all the models.
The outstanding results achieved in this study confirm that deep learning-based algorithms can detect birds of different sizes in different environments, and the best model could potentially help ecologists monitor birds and distinguish them from other species present in the environment.
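A core step behind the mean-average-precision figures this abstract reports is matching each predicted box to a ground-truth box by intersection-over-union (IoU). The sketch below is illustrative only, not the thesis code; the box format and the 0.5 threshold are common conventions assumed here.

```python
# Illustrative sketch: greedy IoU matching of score-ranked detections to
# ground truth, the step that labels each detection a true or false positive
# before precision/recall (and hence mAP) is computed.

def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def match_detections(detections, truths, thresh=0.5):
    """Return a True (TP) / False (FP) flag per detection, best score first."""
    unmatched = list(truths)
    flags = []
    for det in sorted(detections, key=lambda d: -d["score"]):
        best = max(unmatched, key=lambda t: iou(det["box"], t), default=None)
        if best is not None and iou(det["box"], best) >= thresh:
            flags.append(True)
            unmatched.remove(best)   # each truth can be matched only once
        else:
            flags.append(False)
    return flags
```

From these flags, precision-recall curves and average precision per class follow in the usual way.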
|
64 |
Exploring the application of Word2Vec to basket transaction data in the grocery retail industry / De Swardt, Gideon Jacobus, 30 May 2022 (has links)
In this thesis, we explore the application of Word2vec to basket transaction data provided by a large grocery retailer in South Africa. Word2vec is an algorithm based on representation learning. The objective of the exploration is to establish whether applying Word2vec to basket transaction data generates product embeddings that represent a useful relationship between products. Furthermore, we compare Word2vec's outputs and performance to traditional methods for studying product relationships, namely Association Rules Mining (ARM) and recommendation systems. The results of the experiments showed that the product embeddings created by Word2vec on transaction data are indeed meaningful and useful. It was clear that presenting transactions to the neural network in place of sentences produces results analogous to those of a natural language task. Word2vec clearly demonstrated its ability to cluster products that are homogeneous or fulfill similar needs. Furthermore, this sort of product relationship was not provided by any of the traditional methods, which was clear when comparing the outputs to those of ARM and recommendation systems. We also show that using Word2vec could potentially provide insight into truly complementary products that ARM perhaps fails to reveal. Word2vec also proved to be highly scalable, taking input data 20 times the size of what the traditional methods could handle on a local computer. We end with a description of a potential application of the ideas learnt during the course of this study to a real business problem, which we believe could lead to an enhanced customer shopping experience and in turn increase revenue and profits for the retailer.
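The key idea above is treating each basket as a "sentence" so that products sharing a purchase context come out similar. The thesis used Word2vec itself; as a stdlib-only stand-in, the sketch below uses raw co-occurrence vectors with cosine similarity to illustrate the same context-similarity effect. The baskets and product names are invented.

```python
# Conceptual stand-in for Word2vec on baskets (not the thesis code):
# products that appear in similar purchase contexts end up with similar
# co-occurrence vectors, e.g. butter and jam both co-occur with bread.

from collections import defaultdict
from itertools import combinations
from math import sqrt

def cooccurrence_vectors(baskets):
    """One sparse vector per product: co-purchase counts with other products."""
    vecs = defaultdict(lambda: defaultdict(int))
    for basket in baskets:
        for a, b in combinations(set(basket), 2):
            vecs[a][b] += 1
            vecs[b][a] += 1
    return vecs

def cosine(u, v):
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["beer", "chips"],
    ["beer", "chips", "bread"],
]
vecs = cooccurrence_vectors(baskets)
```

In a real Word2vec run the learned dense embeddings replace these count vectors, but similarity queries (e.g. gensim's `most_similar`) are used in the same way.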
|
65 |
Counting animals in ecological images / Pillay, Nakkita, 24 June 2022 (has links)
In the field of ecology, counting animals to estimate population size and prey abundance is important for the conservation of wildlife. This involves analyzing large volumes of image, video, or audio data and manually counting animals. Automating the counting process would be invaluable to researchers, as it would eliminate a tedious, time-consuming task. The purpose of this dissertation is to address manual counting in images by implementing an automated solution using computer vision. This research applies a blob detection algorithm based primarily on the determinant of the Hessian matrix to estimate counts of animals in aerial images of colonies, delivered in a user-friendly web application, and trains an object detection model using deep convolutional neural networks to automatically identify and count penguin prey in 2053 images extracted from animal-borne videos. The blob detection algorithm reports an average relative bias of less than 6%, while the YOLOv3 object detection model automatically detects jellyfish, schools of fish, and individual fish with a mean average precision of 82.53% and counts with an average relative bias of -17.66% over all classes. The results show that applying traditional computer vision methods to data-scarce situations and deep learning to data-rich situations can save ecologists an immense amount of time spent on tedious manual methods of analysis and counting. Additionally, these automated counting methods can contribute to improving wildlife conservation and future studies.
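The counting step of blob-based detection can be sketched in a few lines. The dissertation's detector is based on the determinant of the Hessian; the stdlib-only stand-in below assumes the image has already been thresholded to a binary grid (an assumption, and the tiny image is made up) and simply counts connected blobs, the same final operation of one blob per animal.

```python
# Simplified counting stand-in (not the dissertation's Hessian detector):
# flood fill over a binary image, counting 4-connected regions of 1s.

def count_blobs(image):
    """Count 4-connected regions of 1s in a binary 2D grid (list of lists)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not seen[r][c]:
                blobs += 1
                stack = [(r, c)]          # iterative flood fill
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < rows and 0 <= x < cols and image[y][x] and not seen[y][x]:
                        seen[y][x] = True
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return blobs

image = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
]
```

A determinant-of-Hessian detector (e.g. scikit-image's `blob_doh`) additionally handles scale and overlapping animals, which this sketch does not.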
|
66 |
Estimation of value-at-risk and expected shortfall using copulas / Sumbhoolaul, Helina, January 2008 (has links)
Includes bibliographical references (leaves 76-77).
|
67 |
Decision support for the production and distribution of electricity under load shedding / Rakotonirainy, Rosephine Georgina, January 2016 (has links)
Every day, national power system networks deliver thousands of MW of electric power from generating units to consumers. This process requires various operations and planning to ensure the security of the entire system. Part of daily or weekly system operation is the so-called unit commitment problem, which consists of scheduling the available resources in order to meet the system demand. But continuous growth in electricity demand can put pressure on the ability of the generation system to provide sufficient supply. In such a case, load shedding (a controlled, enforced reduction in electricity supply) is necessary to prevent the risk of system collapse. In South Africa at the present time, a systematic lack of supply has meant that regular load shedding has taken place, with substantial economic and social costs. In this research project we study two optimization problems related to load shedding. The first is how load shedding can be integrated into the unit commitment problem. The second is how load shedding can be fairly and efficiently allocated across areas. We develop deterministic and stochastic linear and goal programming models for these purposes. Several case studies are conducted to explore the possible solutions that the proposed models can offer.
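The trade-off at the heart of the problem described above can be sketched very simply: serve demand from available units at least cost, and shed whatever cannot be served at a high penalty. This is a toy merit-order dispatch, not the thesis's linear/goal programming models (which add unit commitment decisions, start-up costs, and multi-period coupling); the unit data and penalty value are invented.

```python
# Toy sketch of the supply-vs-shedding trade-off (not the thesis model):
# dispatch units cheapest-first; unserved demand becomes load shedding,
# charged at a penalty far above any generation cost.

def dispatch(units, demand, shed_cost=1000.0):
    """units: list of (capacity_MW, cost_per_MW). Returns (MW shed, total cost)."""
    supplied, cost = 0.0, 0.0
    for capacity, unit_cost in sorted(units, key=lambda u: u[1]):  # merit order
        take = min(capacity, demand - supplied)
        supplied += take
        cost += take * unit_cost
        if supplied >= demand:
            break
    shed = demand - supplied           # unserved energy -> load shedding
    return shed, cost + shed * shed_cost

units = [(100, 20.0), (50, 35.0), (80, 60.0)]
```

In the full unit commitment setting the on/off status of each unit is itself a decision variable, which is why the thesis turns to mathematical programming rather than a greedy rule.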
|
68 |
Multivariate analysis of the immune response upon recent acquisition of Mycobacterium tuberculosis infection / Lloyd, Tessa, 03 March 2022 (has links)
Tuberculosis (TB), caused by the pathogen Mycobacterium tuberculosis (M.tb), is the leading cause of mortality due to an infectious agent worldwide. Based on data from an adolescent cohort study carried out from May 2005 to February 2009, we studied and compared the immune responses of individuals from four cohorts defined by their longitudinal QFT results: recent QFT converters, QFT reverters, and persistent QFT positives and negatives. The analysis was based on the integration of different arms of the immune response measured on the cohorts, including adaptive and “innaptive” responses. COMPASS was used to filter the adaptive dataset and identify biologically meaningful subsets, while for the innaptive dataset we devised a novel filtering method. Once the datasets were integrated, they were standardized using variance-stabilizing (vast) standardization, and missing values were imputed using a multiple factor analysis (MFA)-based approach. We first set out to define a set of immune features that change during recent M.tb infection. This was achieved by applying the kmlShape clustering algorithm to the recent QFT converters. We identified 55 cell subsets that either increased or decreased post-infection. When we assessed how the associations between these subsets changed pre- and post-infection using correlation networks, we found no notable differences. By comparing the recent QFT converters and the persistent QFT positives, a blood-based biomarker to distinguish between recent and established infection, namely ESAT6/CFP10-specific expression of HLA-DR on total Th1 cells, was identified using elastic net (EN) models (average AUROC = 0.87). The discriminatory ability of this variable was confirmed using two tree-based models.
Lastly, to assess whether the QFT reverters are a biologically distinct group of individuals, we compared them to the persistent QFT positive and QFT negative individuals using a Projection to Latent Structures Discriminant Analysis (PLS-DA) model. The results indicated that reverters appeared more similar to the QFT negative than to the QFT positive individuals. Hence, QFT reversion may be associated with clearance of M.tb infection. Immune signatures associated with recent infection could be used to refine the end-points of clinical trials testing vaccine efficacy against acquisition of M.tb infection, while immune signatures associated with QFT reversion could be tested as correlates of protection from M.tb infection.
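The vast standardization step mentioned above can be sketched concisely. This follows the commonly cited definition of vast (variance-stabilizing) scaling, in which each autoscaled value is weighted by the feature's mean-to-standard-deviation ratio so that highly variable features are down-weighted; it is a sketch under that assumption, not the thesis code.

```python
# Sketch of vast scaling (assumed definition, not the thesis implementation):
# autoscale each value, then multiply by mean/std (the inverse coefficient
# of variation) to down-weight noisy, highly variable features.

from statistics import mean, stdev

def vast_scale(values):
    """Vast-scale one feature's measurements across samples."""
    m, s = mean(values), stdev(values)
    return [((x - m) / s) * (m / s) for x in values]
```

In the study this would be applied per immune feature before the integrated datasets are passed to the clustering and discriminant models.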
|
69 |
Fault diagnosis in multivariate statistical process monitoring / Mostert, Andre George, 04 March 2022 (has links)
The application of multivariate statistical process monitoring (MSPM) methods has gained considerable momentum over the last couple of decades, especially in the processing industry, for achieving higher throughput at sustainable rates, reducing safety-related events, and minimizing potential environmental impacts. Multivariate process deviations occur when the relationships amongst many process characteristics differ from what is expected. The fault detection ability of methods such as principal component analysis (PCA) based process monitoring has been reported in the literature and demonstrated in selected practical applications. However, the methodologies employed to diagnose the cause of identified multivariate process faults have not gained the anticipated traction in practice. One explanation might be that current diagnostic approaches attempt to rank process variables according to their individual contributions to process faults. The failure of these approaches to correctly identify the variables responsible for a process deviation is well researched and documented in the literature; specifically, they suffer from a phenomenon known as fault smearing. In this research it is argued, using several illustrations, that the objective of assigning individual importance rankings to process variables is not appropriate in a multivariate setting. A new methodology is introduced for performing fault diagnosis in multivariate process monitoring. More specifically, a multivariate diagnostic method is proposed that ranks variable pairs as opposed to individual variables. For PCA-based MSPM, a novel fault diagnosis method is developed that decomposes the fault identification statistics into a sum of parts, with each part representing the contribution of a specific variable pair. An approach is also developed to quantify the statistical significance of each pairwise contribution.
In addition, it is illustrated how the pairwise contributions can be analysed further to obtain an individual importance ranking of the process variables. Two methodologies are developed for calculating the individual ranking from the pairwise contribution analysis; however, it is advised that the individual rankings be interpreted together with the pairwise contributions. The application of this new approach to PCA-based MSPM and fault diagnosis is illustrated using a simulated data set.
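The algebra behind a pairwise decomposition of a quadratic fault statistic can be sketched directly: any statistic of the form x'Mx (like the T² or SPE statistics used in PCA monitoring) splits exactly into per-pair terms. This is a generic illustration of that identity, not the thesis's specific method (whose significance tests for each pair are omitted); M and x here are invented.

```python
# Sketch: decompose a quadratic fault statistic x' M x into contributions
# of variable pairs that sum exactly to the statistic (M symmetric).
# Diagonal terms are single-variable parts; off-diagonal terms carry the
# interaction of each pair, which individual-variable rankings smear.

def pairwise_contributions(M, x):
    """Return {(i, j): contribution}; values sum to x' M x."""
    n = len(x)
    contrib = {}
    for i in range(n):
        contrib[(i, i)] = M[i][i] * x[i] * x[i]
        for j in range(i + 1, n):
            contrib[(i, j)] = 2 * M[i][j] * x[i] * x[j]  # both (i,j) and (j,i)
    return contrib

M = [[2.0, 0.5, 0.0],
     [0.5, 1.0, 0.3],
     [0.0, 0.3, 1.5]]
x = [1.0, -2.0, 0.5]
```

Ranking the pairs by |contribution| then points at relationships, rather than single variables, as the source of the deviation.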
|
70 |
Performance analysis of text classification algorithms for PubMed articles / Savvi, Suzana, 14 March 2022 (has links)
The Medical Subject Headings (MeSH) thesaurus is a controlled vocabulary developed by the US National Library of Medicine (NLM) for indexing articles in the PubMed Central (PMC) archive. The annotation process is a complex and time-consuming task relying on the subjective manual assignment of MeSH concepts. Automating such tasks with machine learning may provide a more efficient and less ambiguous way of organizing biomedical literature. This research provides a case study comparing the performance of several machine learning algorithms (topic modelling, random forest, logistic regression, support vector classifiers, multinomial naive Bayes, a convolutional neural network, and long short-term memory (LSTM)) in reproducing manually assigned MeSH annotations. Records for this study were retrieved from PubMed using the E-utilities API to the Entrez system of databases at the NCBI (National Centre for Biotechnology Information). The MeSH vocabulary is organised in a hierarchical structure, and article abstracts labelled with a single MeSH term from the top two layers of the hierarchy were selected for training the machine learning models. Various strategies for multiclass text classification were considered. One was a chi-square test for feature selection, which identified words relevant to each MeSH label. A second approach used named entity recognition (NER) to extract entities from the unstructured text, and another relied on word embeddings able to capture latent knowledge from the literature. At the start of the study, text was vectorised using the term frequency-inverse document frequency (Tf-idf) technique and topic modelling was performed with the objective of ascertaining the correlation between assigned topics (an unsupervised learning task) and MeSH terms in PubMed. Findings revealed that the degree of coupling was low although significant. Of all the classifier models trained, logistic regression on Tf-idf-vectorised entities achieved the highest accuracy.
Performance varied across the different MeSH categories. In conclusion, automated curation of articles from their abstracts may be possible for those target classes that can be classified reliably and reproducibly.
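The Tf-idf weighting underlying the best-performing pipeline above can be sketched in stdlib Python. This uses one common idf variant, idf = log(N/df), which is an assumption (libraries differ in smoothing); the two-word "abstracts" are invented.

```python
# Sketch of Tf-idf vectorisation (one common variant, not the study's exact
# tooling): terms frequent in a document but rare across the corpus get the
# highest weights, e.g. "therapy" outweighs the ubiquitous "gene".

from collections import Counter
from math import log

def tfidf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per doc."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document freq.
    out = []
    for doc in docs:
        tf = Counter(doc)
        out.append({t: (c / len(doc)) * log(n / df[t]) for t, c in tf.items()})
    return out

docs = [["gene", "expression"], ["gene", "therapy"], ["protein", "expression"]]
vectors = tfidf(docs)
```

A classifier such as logistic regression is then trained on these sparse weight vectors, one dimension per vocabulary term.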
|