1. Optimal Sampling Designs for Functional Data Analysis. January 2020
abstract: Functional regression models are widely used in practice. To understand an underlying functional mechanism precisely, a good sampling schedule for collecting informative functional data is necessary, especially when data collection is limited. However, little research has been conducted so far on optimal sampling schedule design for functional regression models. To address this design issue, efficient approaches are proposed for generating the best sampling plan in the functional regression setting. First, three optimal experimental designs are considered under a function-on-function linear model: the schedule that maximizes the relative efficiency for recovering the predictor function, the schedule that maximizes the relative efficiency for predicting the response function, and the schedule that maximizes a mixture of the relative efficiencies of both the predictor and response functions. The obtained sampling plan allows precise recovery of the predictor function and precise prediction of the response function. The proposed approach also reduces to identifying the optimal sampling plan for a scalar-on-function linear regression model. In addition, an optimality criterion for predicting a scalar response from a functional predictor is derived when a quadratic relationship between the two variables is present, and proofs of important properties of the derived criterion are provided. To find such designs, a comparatively fast algorithm that generates nearly optimal designs is proposed. Because the optimality criterion includes quantities that must be estimated from prior knowledge (e.g., a pilot study), the effectiveness of the suggested optimal design depends heavily on the quality of the estimates. In many situations, however, the estimates are unreliable; thus, a bootstrap aggregating (bagging) approach is employed to enhance the quality of the estimates and to find sampling schedules that are stable under misspecification of the estimates. Through case studies, it is demonstrated that the proposed designs outperform other designs in terms of accurately predicting the response and recovering the predictor. It is also shown that the bagging-enhanced approach generates a more robust sampling design under misspecification of the estimated quantities. / Dissertation/Thesis / Doctoral Dissertation Statistics 2020
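The dissertation's exact relative-efficiency criteria are not reproduced in this listing, but the bagging idea it describes can be sketched: resample a pilot study, average the estimated quantities across bootstrap replicates, and feed the bagged estimate into a design search. The sketch below uses a generic log-determinant surrogate in place of the thesis's criteria; the function names, the greedy search, and the toy pilot data are all illustrative assumptions.

```python
# Hedged sketch: bag covariance estimates from pilot curves, then greedily pick
# sampling times. The log-det objective is a stand-in surrogate, NOT the
# dissertation's relative-efficiency criteria.
import numpy as np

rng = np.random.default_rng(0)

def bagged_covariance(pilot_curves, n_boot=50):
    """Average covariance estimates over bootstrap resamples of the pilot curves."""
    n = pilot_curves.shape[0]
    covs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample curves with replacement
        covs.append(np.cov(pilot_curves[idx], rowvar=False))
    return np.mean(covs, axis=0)

def greedy_design(cov, n_points):
    """Greedily add candidate time points that maximize the log-determinant of
    the selected submatrix (a D-optimal-style surrogate objective)."""
    chosen, remaining = [], list(range(cov.shape[0]))
    for _ in range(n_points):
        scores = [np.linalg.slogdet(cov[np.ix_(chosen + [j], chosen + [j])]
                                    + 1e-8 * np.eye(len(chosen) + 1))[1]
                  for j in remaining]
        best = remaining[int(np.argmax(scores))]
        chosen.append(best)
        remaining.remove(best)
    return sorted(chosen)

# Toy pilot study: 30 noisy curves observed on a grid of 40 candidate times.
grid = np.linspace(0, 1, 40)
pilot = np.sin(2 * np.pi * np.outer(rng.uniform(0.8, 1.2, 30), grid))
pilot += 0.1 * rng.standard_normal((30, 40))
design = greedy_design(bagged_covariance(pilot), n_points=6)
print("selected sampling times:", grid[design])
```

Averaging the resampled estimates before the design search is what gives the schedule its stability to a poorly estimated pilot quantity, which is the role bagging plays in the abstract above.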
2. Neural network ensembles. De Jongh, Albert, 04 1900
Thesis (MSc)--Stellenbosch University, 2004. / ENGLISH ABSTRACT: It is possible to improve on the accuracy of a single neural network by using an ensemble of diverse and accurate networks. This thesis explores diversity in ensembles and looks at the underlying theory and mechanisms employed to generate and combine ensemble members. Bagging and boosting are studied in detail and I explain their success in terms of well-known theoretical instruments. An empirical evaluation of their performance is conducted and I compare them to a single classifier and to each other in terms of accuracy and diversity. / AFRIKAANSE OPSOMMING (translated): It is possible to improve on the accuracy of a single neural network by using an ensemble of diverse and accurate networks. This thesis investigates diversity in ensembles, as well as the mechanisms by which the members of an ensemble can be created and combined. The algorithms "bagging" and "boosting" are studied in depth and their success is explained with reference to well-known theoretical instruments. The performance of these two algorithms is measured experimentally, and their accuracy and diversity are compared with those of a single network.
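As a concrete illustration of the bagging half of this comparison (the thesis's own data sets and network settings are not given here, so the task, layer sizes, and estimator count below are assumptions), a bagged ensemble of small neural networks can be contrasted with a single network in scikit-learn:

```python
# Minimal sketch: a single MLP versus a bagged ensemble of MLPs on synthetic data.
# Bagging trains each network on a bootstrap sample and aggregates the votes,
# which supplies the diversity the abstract discusses.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1500, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

single = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
single.fit(X_tr, y_tr)

bagged = BaggingClassifier(MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000),
                           n_estimators=25, random_state=0)
bagged.fit(X_tr, y_tr)

print("single network accuracy:", round(single.score(X_te, y_te), 3))
print("bagged ensemble accuracy:", round(bagged.score(X_te, y_te), 3))
```

Boosting, the other algorithm studied, reweights training examples sequentially and is usually paired with weak learners such as decision stumps (for example scikit-learn's AdaBoostClassifier) rather than with full networks.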
3. Decision Trees for Classification of Repeated Measurements. Holmberg, Julianna, January 2024
Classification of data from repeated measurements is useful in various disciplines, for example medicine. This thesis explores how classification trees (CART) can be used for classifying repeated measures data. The reader is introduced to variations of the CART algorithm that can be used for classifying the data set, and the performance of these algorithms is tested on a data set that can be modelled using bilinear regression. The performance is compared with that of a classification rule based on linear discriminant analysis. It is found that while the performance of the CART algorithm can be satisfactory, using linear discriminant analysis is more reliable for achieving good results. / Swedish abstract (translated): Classification of data from repeated measurements is useful in various disciplines, for example medicine. This thesis investigates how classification trees (CART) can be used to classify repeated measurements. The reader is introduced to variants of the CART algorithm that can be used to classify the data set, and the performance of these algorithms is tested on a data set that can be modelled using bilinear regression. The performance is compared with a classification rule based on linear discriminant analysis. It turns out that although the performance of the CART algorithm can be satisfactory, using linear discriminant analysis is more reliable for achieving good results.
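A rough sketch of this kind of comparison is shown below; the bilinear-regression data set of the thesis is not available here, so simulated repeated measurements with group-specific intercepts and slopes stand in for it.

```python
# Hedged sketch: classify subjects from repeated measurements with a
# classification tree (CART) and with linear discriminant analysis (LDA).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
n_per_class, n_times = 100, 8
t = np.arange(n_times)

# Two groups whose mean response over the repeated measurements differs in
# intercept and slope; each subject contributes one row of n_times measurements.
group0 = 1.0 + 0.2 * t + rng.standard_normal((n_per_class, n_times))
group1 = 1.5 + 0.4 * t + rng.standard_normal((n_per_class, n_times))
X = np.vstack([group0, group1])
y = np.repeat([0, 1], n_per_class)

for name, clf in [("CART", DecisionTreeClassifier(random_state=0)),
                  ("LDA", LinearDiscriminantAnalysis())]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name} cross-validated accuracy: {acc:.3f}")
```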
4. An IoT Solution for Urban Noise Identification in Smart Cities: Noise Measurement and Classification. Alsouda, Yasser, January 2019
Noise is defined as any undesired sound. Urban noise and its effect on citizens are a significant environmental problem, and the increasing level of noise has become a critical problem in some cities. Fortunately, noise pollution can be mitigated by better planning of urban areas or controlled by administrative regulations. However, the execution of such actions requires well-established systems for noise monitoring. In this thesis, we present a solution for noise measurement and classification using a low-power and inexpensive IoT unit. To measure the noise level, we implement an algorithm for calculating the sound pressure level in dB, and we achieve a measurement error of less than 1 dB. Our machine learning-based method for noise classification uses Mel-frequency cepstral coefficients for audio feature extraction and four supervised classification algorithms (support vector machine, k-nearest neighbors, bootstrap aggregating, and random forest). We evaluate our approach experimentally with a dataset of about 3000 sound samples grouped into eight sound classes (such as car horn, jackhammer, or street music). We explore the parameter space of the four algorithms to estimate the optimal parameter values for the classification of sound samples in the dataset under study. We achieve noise classification accuracy in the range of 88%–94%.
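The two building blocks described here, a sound pressure level estimate in dB and MFCC-based classification, can be sketched as follows. The calibration constant, the toy signal, and the helper names are illustrative assumptions; the real pipeline would be trained on the labelled urban-sound clips mentioned in the abstract.

```python
# Hedged sketch of the measurement and classification steps, not the thesis code.
import numpy as np
import librosa
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

P_REF = 20e-6  # reference sound pressure, 20 micropascals

def sound_pressure_level(samples, calibration=1.0):
    """SPL in dB: 20*log10(p_rms / p_ref). `calibration` maps raw samples to
    pascals and would be obtained with a reference source on the real device."""
    p_rms = np.sqrt(np.mean((calibration * samples) ** 2))
    return 20.0 * np.log10(p_rms / P_REF)

def mfcc_features(clip, sr, n_mfcc=13):
    """Mean MFCC vector of a clip: a compact feature for the classifiers below."""
    return librosa.feature.mfcc(y=clip, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

sr = 22050
tone = 0.01 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)  # 1-second toy tone
print("SPL estimate (dB):", round(sound_pressure_level(tone), 1))

# With labelled clips, build X (one MFCC row per clip) and y (class labels), then
# compare the four supervised algorithms named in the abstract:
classifiers = {"SVM": SVC(), "kNN": KNeighborsClassifier(),
               "bagging": BaggingClassifier(), "random forest": RandomForestClassifier()}
# X = np.vstack([mfcc_features(c, sr) for c in clips]); y = labels
# for name, clf in classifiers.items():
#     print(name, cross_val_score(clf, X, y, cv=5).mean())
```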
5. Exploring advanced forecasting methods with applications in aviation. Riba, Evans Mogolo, 02 1900
Abstracts in English, Afrikaans and Northern Sotho / More time series forecasting methods have been researched and made available in recent years. This is mainly due to the emergence of machine learning methods, which also found applicability in time series forecasting. The emergence of a variety of methods and their variants presents a challenge when choosing appropriate forecasting methods. This study explored the performance of four advanced forecasting methods: autoregressive integrated moving averages (ARIMA); artificial neural networks (ANN); support vector machines (SVM); and regression models with ARIMA errors. To improve their performance, bagging was also applied. The performance of the different methods was illustrated using South African air passenger data collected for planning purposes by the Airports Company South Africa (ACSA). The dissertation discussed the different forecasting methods at length. Characteristics such as strengths and weaknesses and the applicability of the methods were explored. Some of the most popular forecast accuracy measures were discussed in order to understand how they could be used in the performance evaluation of the methods. It was found that the regression model with ARIMA errors outperformed all the other methods, followed by the ARIMA model. These findings are in line with the general findings in the literature. The ANN method is prone to overfitting, and this was evident from the results of the training and test data sets. The bagged models showed mixed results, with marginal improvement on some of the methods for some performance measures. It could be concluded that the traditional statistical forecasting methods (ARIMA and the regression model with ARIMA errors) performed better than the machine learning methods (ANN and SVM) on this data set, based on the measures of accuracy used. This calls for more research regarding the applicability of the machine learning methods to time series forecasting, which will assist in understanding and improving their performance against the traditional statistical methods.

/ Afrikaans abstract (translated): In recent years, various time series forecasting methods have been investigated as a result of the development of machine learning methods with applications in time series forecasting. The new methods and their variants allow a wide choice among forecasting methods. This study examines the performance of four advanced forecasting methods: autoregressive integrated moving averages (ARIMA), artificial neural networks (ANN), support vector machines (SVM) and regression models with ARIMA errors. Bagging was used to improve the performance of the methods. The performance of the four methods was compared by applying them to South African air passenger data collected for planning purposes by the Airports Company South Africa (ACSA). The dissertation describes the different forecasting methods comprehensively. Both the strengths and weaknesses and the applicability of the methods are highlighted. Well-known performance measures were examined in order to evaluate the performance of the methods. The regression model with ARIMA errors and the ARIMA model performed best of the four methods. This finding agrees with those in the literature. That the ANN method tends to overfit was confirmed by the results of the training and test data sets. The bagged models produced mixed results and improved marginally on some performance measures for some methods. Based on the values of the performance measures used in this study, it can be concluded that the traditional statistical forecasting methods (ARIMA and regression with ARIMA errors) performed better on the chosen data set than the machine learning methods (ANN and SVM). This points to the need for further research into the applicability of machine learning methods to time series forecasting in order to improve their performance relative to the traditional methods.

/ Northern Sotho abstract (translated): Many time series forecasting methods have been researched and made available in recent years. This is because of the emergence of machine learning methods, which have also been applied to time series forecasting. The emergence of a variety of methods and their variants presents a challenge when appropriate forecasting methods are chosen. This study examined the performance of four advanced forecasting methods: autoregressive integrated moving averages (ARIMA); artificial neural networks (ANN); support vector machines (SVM); and regression models with ARIMA errors. To improve their performance, bagging was also applied. The performance of the different methods was illustrated using South African air passenger data collected for planning purposes by the Airports Company South Africa (ACSA). The dissertation discussed the different forecasting methods at length. Characteristics such as strengths and weaknesses and the applicability of the methods were considered. Some of the most popular forecast accuracy measures were discussed in order to understand how they could be used in assessing the performance of these methods. It was found that the regression model with ARIMA errors outperformed all the other methods, followed by the ARIMA model. These findings are in line with the general findings in the literature. The ANN method can overfit, and this was evident in the results of the training and test data sets. The bagged models showed mixed results, with improvement on some of the methods for some performance measures. It can be concluded that the traditional statistical forecasting methods (ARIMA and the regression model with ARIMA errors) performed better than the machine learning methods (ANN and SVM) on this data set, according to the accuracy measures used. This calls for further research on the applicability of machine learning methods to time series forecasting, which will help in understanding and improving their performance against the traditional statistical methods. / Decision Sciences / M. Sc. (Operations Research)
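The best-performing method reported above, a regression with ARIMA errors, can be sketched briefly: statsmodels fits this model when exogenous regressors are supplied to its ARIMA class. The synthetic monthly series, the trend regressor, and the (1, 0, 1) order below are assumptions standing in for the ACSA data and the orders chosen in the dissertation.

```python
# Hedged sketch: regression on a deterministic trend with ARIMA(1,0,1) errors,
# fitted to a synthetic monthly passenger-style series (not the ACSA data).
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
n = 120                                  # ten years of monthly observations
t = np.arange(n)
y = (500 + 2.5 * t                       # upward trend
     + 10 * np.sin(2 * np.pi * t / 12)   # yearly seasonality (kept simple here)
     + np.cumsum(0.5 * rng.standard_normal(n)))  # autocorrelated noise
y = pd.Series(y, index=pd.date_range("2010-01", periods=n, freq="MS"))
exog = pd.DataFrame({"trend": t}, index=y.index)

fit = ARIMA(y, exog=exog, order=(1, 0, 1)).fit()

future_index = pd.date_range(y.index[-1] + pd.offsets.MonthBegin(),
                             periods=12, freq="MS")
future_exog = pd.DataFrame({"trend": np.arange(n, n + 12)}, index=future_index)
print(fit.forecast(steps=12, exog=future_exog))

# A bagged variant, in the spirit of the bagging used in the study, would refit
# the model to bootstrapped versions of the series (e.g. block-bootstrapped
# residuals added back to the fitted values) and average the resulting forecasts.
```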