11 |
An Ontology-based Hybrid Recommendation System Using Semantic Similarity Measure And Feature Weighting. Ceylan, Ugur. 01 September 2011 (has links) (PDF)
The task of a recommendation system is to recommend items that are relevant to the preferences of users. The two main approaches in recommendation systems are collaborative filtering and content-based filtering. Collaborative filtering systems suffer from several major problems, such as sparsity, scalability, and the new-item and new-user problems. In this thesis, a hybrid recommendation system based on the content-boosted collaborative filtering approach is proposed in order to overcome the sparsity and new-item problems of collaborative filtering. The content-based part of the proposed approach exploits semantic similarities between items, based on a priori defined ontology-based metadata in the movie domain, and feature weights derived from content-based user models. Recommendations are generated using the semantic similarities between items and collaborative user models. The results of the evaluation phase show that the proposed approach improves the quality of recommendations.
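As a rough sketch of the content-boosted idea described above, the following snippet densifies a sparse ratings matrix using a semantic item-similarity matrix before collaborative filtering runs. The ratings, similarity values, and weighting scheme are illustrative assumptions, not the thesis's actual ontology or user models.

```python
import numpy as np

# Toy ratings matrix: rows = users, cols = items; 0 = unrated.
ratings = np.array([
    [5, 0, 3, 0],
    [4, 0, 0, 2],
    [0, 3, 4, 0],
], dtype=float)

# Hypothetical semantic similarity between items (e.g. derived from
# ontology-based metadata such as shared genres, directors, themes).
item_sim = np.array([
    [1.0, 0.2, 0.7, 0.1],
    [0.2, 1.0, 0.3, 0.6],
    [0.7, 0.3, 1.0, 0.2],
    [0.1, 0.6, 0.2, 1.0],
])

def content_boost(ratings, item_sim):
    """Fill unrated cells with a semantic-similarity-weighted average
    of the user's own ratings, densifying the matrix before CF."""
    boosted = ratings.copy()
    for u in range(ratings.shape[0]):
        rated = ratings[u] > 0
        for i in np.where(~rated)[0]:
            w = item_sim[i, rated]
            if w.sum() > 0:
                boosted[u, i] = np.dot(w, ratings[u, rated]) / w.sum()
    return boosted

dense = content_boost(ratings, item_sim)
# A standard user-user collaborative filter can now run on `dense`,
# mitigating the sparsity and new-item problems described above.
print(dense.round(2))
```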
|
12 |
Prediction of land cover in continental United States using machine learning techniques. Agarwalla, Yashika. 08 June 2015 (has links)
Land cover is a reliable source for studying changes in land use patterns at a large scale. With the advent of satellite imagery and remote sensing technologies, land cover classification has become easier and more reliable. In contrast to conventional land cover classification methods that use land and aerial photography, this research uses small-scale Digital Elevation Maps and their corresponding land cover images obtained from Google Earth Engine. Two machine learning techniques, Boosted Regression Trees and Image Analogy, are used to classify land cover regions in the continental United States. The topographical features selected for this study are slope, aspect, elevation, and topographic index (TI). We assess the efficiency of these machine learning techniques in land cover classification using satellite data to establish the topography-land cover relation, which is crucial for conservation planning and habitat or species management. The main contribution of the research is its demonstration of the relative influence of the topographical attributes and of the ability of the techniques to predict land cover over large regions and to reproduce land cover maps at high resolution. Compared to traditional remote sensing methods for developing land cover maps, such as aerial photography, both methods presented are less expensive and faster. The need for this research is consistent with past studies, which show that the scale of data processing, integration, and interpretation involved makes automated, accurate methods of land cover change mapping highly desirable.
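A minimal sketch of the boosted-tree classification step, using scikit-learn's GradientBoostingClassifier as a stand-in for the exact BRT implementation used in the thesis; the synthetic topographic data and the label rule are invented purely for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic stand-in for DEM-derived predictors; in the study these
# come from Digital Elevation Maps and Google Earth Engine rasters.
n = 5000
X = np.column_stack([
    rng.uniform(0, 45, n),      # slope (degrees)
    rng.uniform(0, 360, n),     # aspect (degrees)
    rng.uniform(0, 3000, n),    # elevation (m)
    rng.uniform(0, 20, n),      # topographic index (TI)
])
# Fake land-cover classes loosely tied to elevation and slope.
y = np.digitize(X[:, 2] + 20 * X[:, 0] + rng.normal(0, 200, n),
                [800, 1800, 2800])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Boosted trees: an ensemble of shallow trees fit in sequence,
# each one correcting the residual errors of its predecessors.
brt = GradientBoostingClassifier(n_estimators=300, max_depth=3,
                                 learning_rate=0.05)
brt.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, brt.predict(X_te)))
print("relative influence:", dict(zip(
    ["slope", "aspect", "elevation", "TI"],
    brt.feature_importances_.round(3))))
```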
|
13 |
Antipatharian Diversity and Habitat Suitability Mapping in the Mesophotic Zone of the Northwestern Gulf of Mexico. Nuttall, Marissa F. 03 October 2013 (has links)
Little is known about the distribution of black corals in the northwestern Gulf of Mexico. Of the thirty-nine species of black coral documented in the Western Atlantic, thirty have previously been documented in the Gulf of Mexico by various studies. This study proposes potential range extensions into the Gulf of Mexico for four black coral species: Stichopathes gracilis, Stichopathes semiglabra, Tanacetipathes paula, and Tanacetipathes spinescens. The validation of in situ identifications of black coral species is evaluated, and recommendations for species identifications and species groupings are made. Fauna associated with black corals are documented, supporting known associations and recording potentially new associations and species.
Habitat suitability models for the distribution of black coral species at selected banks in the northwestern Gulf of Mexico were generated. Presence-only models built with the MaxEnt modeling program were compared to presence-absence models built with Boosted Regression Tree techniques. The presence-absence models showed greater predictive accuracy than the presence-only models, which showed evidence of overfitting. The models were projected onto five similar salt-dome features in the region, highlighting extensive habitat for multiple black coral species in these unexplored areas. This study presents habitat suitability maps as a testable hypothesis for black coral distribution in the mesophotic zone of this region.
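The following sketch illustrates the kind of presence-absence boosted-tree model and overfitting check the study describes, again with scikit-learn as a stand-in and fully synthetic survey data; the predictors, effect sizes, and thresholds are assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

# Synthetic survey: depth and rugosity as predictors, coral presence as label.
n = 2000
depth = rng.uniform(50, 150, n)       # mesophotic depths (m)
rugosity = rng.uniform(0, 1, n)
p = 1 / (1 + np.exp(-(0.05 * (100 - abs(depth - 90) * 2) + 3 * rugosity - 2)))
present = rng.random(n) < p

X = np.column_stack([depth, rugosity])
X_tr, X_te, y_tr, y_te = train_test_split(X, present, random_state=1)

# Presence-absence BRT, the alternative compared against MaxEnt above.
brt = GradientBoostingClassifier(n_estimators=200, max_depth=2,
                                 learning_rate=0.05).fit(X_tr, y_tr)

# A large train/test AUC gap is one simple symptom of the overfitting
# the abstract reports for the presence-only models.
auc_tr = roc_auc_score(y_tr, brt.predict_proba(X_tr)[:, 1])
auc_te = roc_auc_score(y_te, brt.predict_proba(X_te)[:, 1])
print(f"train AUC {auc_tr:.3f} vs test AUC {auc_te:.3f}")
```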
|
14 |
Recherche de résonance lourde dans le spectre de masse invariante top-antitop auprès de l'expérience ATLAS du LHC / Search for new physics in the top-antitop channel with the ATLAS experiment at LHC collider. Dechenaux, Benjamin. 04 October 2013 (has links)
This thesis reports on an analysis conducted with the ATLAS detector at the LHC, searching for the resonant production of new particles decaying into a pair of top quarks. The analysis is built around the notion of hadronic jets, whose identification and reconstruction are crucial for any measurement that aims to tag top quarks produced in proton-proton collisions. After a general description of the theoretical and experimental aspects of jet reconstruction in the ATLAS detector, we present a first attempt to validate the local hadronic calibration method, which corrects the jet measurements for inaccuracies introduced by the detector. The second part presents the analysis of 14 fb⁻¹ of proton-proton collision data at √s = 8 TeV, collected during 2012, searching for the resonant production of new heavy particles in the top-antitop invariant mass spectrum. For heavy resonances, the top quarks produced in the decay carry momenta much larger than their mass, which often leads to a so-called "boosted" topology in the final state: a hadronically decaying top quark is frequently reconstructed as a single jet with a large radius parameter. This thesis therefore presents a preliminary study, based on jet substructure, to reconstruct and identify such boosted signals as precisely as possible.
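A back-of-the-envelope illustration of why energetic top quarks end up inside a single large-radius jet: the standard collinear estimate ΔR ≈ 2m/pT for the angular spread of the decay products. The thresholds printed below are textbook kinematics, not results from the thesis.

```python
import numpy as np

M_TOP = 172.5  # top quark mass in GeV

def opening_angle(pt):
    """Rough angular scale of a top quark's decay products:
    dR ~ 2 m / pT in the collinear (highly boosted) limit."""
    return 2 * M_TOP / pt

for pt in [200, 400, 600, 800, 1000]:  # transverse momentum in GeV
    dr = opening_angle(pt)
    verdict = "fits in one R=1.0 jet" if dr < 1.0 else "resolved as separate jets"
    print(f"pT = {pt:4d} GeV -> dR ~ {dr:.2f}  ({verdict})")
```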
|
15 |
Distribution and habitat use of sharks in the coastal waters of west-central Florida. Mullins, Lindsay. 25 November 2020 (links)
An elasmobranch survey conducted from 2013 to 2018 in the waters adjacent to Pinellas County, Florida, was used for a baseline assessment of the local shark population. ArcGIS and Boosted Regression Trees (BRTs) were used to identify hot spots of abundance and links between environmental predictors and distribution, as well as to create species distribution models. A diverse assemblage of sharks was identified, dominated by five species: nurse shark, bonnethead, Atlantic sharpnose shark, blacktip shark, and blacknose shark. A large proportion of captures (~42%) were immature sharks. Results indicate that areas characterized by seagrass and "No Internal Combustion Engine" zones correlate with greater diversity and abundance, particularly for immature sharks. BRT results underscored the importance of seagrass bottoms, as well as warm (>31°C) and shallow (<6 m) waters, as essential habitat. By identifying spatially explicit areas and environmental conditions suited to shark abundance, this study provides practical resources for managing and protecting Florida's sharks.
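The sketch below mimics how habitat thresholds such as ">31°C" and "<6 m" are read off a fitted BRT via partial dependence; the data, effect sizes, and use of scikit-learn are all assumptions made for illustration, not the study's actual survey or software.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(2)

# Synthetic catch records with the predictors highlighted above:
# water temperature (C), depth (m), and seagrass bottom (0/1).
n = 3000
temp = rng.uniform(20, 34, n)
depth = rng.uniform(0, 30, n)
seagrass = rng.integers(0, 2, n)
# Fabricated presence signal mimicking the reported pattern:
# warmer, shallower, seagrass-associated habitat.
logit = 0.8 * (temp - 31) - 0.5 * (depth - 6) + 1.5 * seagrass
present = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([temp, depth, seagrass])
brt = GradientBoostingClassifier(n_estimators=200, max_depth=2).fit(X, present)

# Manual partial dependence: sweep one predictor while holding the
# others at their observed values, then average the predictions.
# This is how habitat thresholds are typically read off a fitted BRT.
for t in (25, 29, 31, 33):
    r = brt.predict_proba(
        np.column_stack([np.full(n, float(t)), depth, seagrass]))[:, 1].mean()
    print(f"mean predicted occurrence at {t} C: {r:.2f}")
```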
|
16 |
Modélisation d'un injecteur laser-plasma pour l'accélération multi-étages / Modelling of a laser-plasma injector for multi-stage acceleration. Lee, Patrick. 11 July 2017 (links)
Laser Wakefield Acceleration (LWFA) relies on the interaction between an intense laser pulse and an under-dense plasma. This interaction generates a plasma wave with a strong accelerating field, three orders of magnitude higher than that of a conventional accelerator, making more compact future accelerators theoretically possible. In the design of a future accelerator, a high-quality electron bunch with high charge, low energy spread, and low emittance has to be accelerated to high energies. One solution is to accelerate the electrons in a multi-stage scheme consisting of an injector, a transport line, and accelerator stages. This thesis focuses on the modelling of the injector with the PIC code Warp and on numerical methods such as the Lorentz-boosted frame technique, used to speed up calculations, and Berenger's Perfectly Matched Layer (PML), used to ensure numerical precision. The thesis demonstrates the efficiency of the PML in high-order FDTD and pseudo-spectral solvers, and it demonstrates the convergence of simulation results obtained with the Lorentz-boosted frame technique in a strongly nonlinear regime of the injector, speeding up the calculations by a large factor (36) while preserving their accuracy. The modelling carried out in this thesis made it possible to analyse and understand experimental results, as well as to predict the results of future experiments. Several ways of optimizing the injector are also proposed to deliver an electron bunch that meets the specifications of a future accelerator.
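The following sketch shows the scale-separation heuristic behind the Lorentz-boosted frame speedup: boosting along the propagation axis contracts the plasma length while dilating the laser wavelength, shrinking the ratio of scales a PIC code must resolve. The 0.8 µm / 1 cm parameters are illustrative, not the thesis's simulation setup.

```python
import numpy as np

def boosted_scales(lambda_laser, l_plasma, gamma_b):
    """Characteristic lengths in a frame boosted along the propagation
    axis: the plasma column contracts by gamma_b, while the laser
    (propagating along the boost) is redshifted by (1 + beta) * gamma_b."""
    beta = np.sqrt(1.0 - 1.0 / gamma_b**2)
    lam_boost = lambda_laser * gamma_b * (1.0 + beta)  # dilated wavelength
    l_boost = l_plasma / gamma_b                       # contracted plasma
    return lam_boost, l_boost

lam, lp = 0.8e-6, 1.0e-2   # 0.8 um laser, 1 cm plasma stage (illustrative)
for g in [1, 2, 5, 10]:
    lb, pb = boosted_scales(lam, lp, g)
    print(f"gamma_b = {g:2d}: lengths-to-resolve ratio {pb / lb:9.2e}")
# The ratio drops roughly as gamma_b**2 * (1 + beta), which is the
# origin of large speedups such as the factor ~36 reported above.
```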
|
17 |
Carbon Intensity Estimation of Publicly Traded Companies / Uppskattning av koldioxidintensitet hos börsnoterade bolag. Ribberheim, Olle. January 2021 (links)
The purpose of this master thesis is to develop a model to estimate the carbon intensity, i.e. the carbon emissions relative to economic activity, of publicly traded companies that do not report their carbon emissions. Using statistical and machine learning models, the core of this thesis is to develop and compare different methods and models with regard to accuracy, robustness, and explanatory value when estimating carbon intensity. Both discrete variables, such as the region and sector the company operates in, and continuous variables, such as revenue and capital expenditures, are used in the estimation. Six methods were compared: two statistically derived and four machine learning methods. The thesis consists of three parts: data preparation, model implementation, and model comparison. The comparison indicates that the boosted decision tree is both the most accurate and the most robust model. Lastly, the strengths and weaknesses of the methodology are discussed, as well as the suitability and legitimacy of the boosted decision tree when estimating carbon intensity.
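A minimal sketch of a boosted-decision-tree regressor on mixed discrete/continuous company features, with one-hot encoding for region and sector. The data, feature effects, and scikit-learn pipeline are illustrative assumptions rather than the thesis's actual models or dataset.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(3)

# Synthetic company data mixing discrete (region, sector) and
# continuous (revenue, capex) predictors, as in the thesis.
n = 1000
df = pd.DataFrame({
    "region": rng.choice(["EU", "NA", "APAC"], n),
    "sector": rng.choice(["energy", "tech", "industrials"], n),
    "revenue": rng.lognormal(3, 1, n),
    "capex": rng.lognormal(1, 1, n),
})
base = df["sector"].map({"energy": 50.0, "tech": 2.0, "industrials": 15.0})
carbon_intensity = base * (1 + 0.1 * rng.standard_normal(n))

model = Pipeline([
    ("prep", ColumnTransformer(
        [("cat", OneHotEncoder(), ["region", "sector"])],
        remainder="passthrough")),       # revenue, capex pass through
    ("brt", GradientBoostingRegressor(n_estimators=300, max_depth=3)),
])
model.fit(df, carbon_intensity)

# Estimate a hypothetical non-reporting company.
new_co = pd.DataFrame([{"region": "EU", "sector": "energy",
                        "revenue": 40.0, "capex": 5.0}])
print("estimated carbon intensity:", model.predict(new_co).round(1))
```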
|
18 |
New Physics Probes at Present/Future Hadron Colliders via Vh Production. Englert, Philipp. 26 April 2023 (links)
In this thesis, we use the framework of Effective Field Theories, more specifically the Standard Model Effective Field Theory (SMEFT), to parameterise New-Physics effects in a model-independent way.
We demonstrate the relevance of precision measurements at both current and future hadron colliders by studying Vh-diboson-production processes. These processes allow us to probe a set of dimension-6 operators that generate BSM effects growing with the center-of-mass energy. More specifically, we consider the leptonic decay channels of the vector bosons and two different decay modes of the Higgs boson, the diphoton channel and the hadronic h->bb channel.
The diphoton channel is characterised by a clean signature that can be separated very well from the relevant backgrounds with relatively simple methods. However, due to the small rate of this Higgs-decay channel, these processes will only become viable probes of New-Physics effects at the FCC-hh. Thanks to the large h->bb branching ratio, the Vh(->bb) channel already provides competitive sensitivity to BSM effects at the LHC, but it suffers from large QCD-induced backgrounds that require more sophisticated analysis techniques to achieve this level of BSM sensitivity. We derive the expected bounds on the aforementioned dimension-6 operators from the Vh(->gamma gamma) channel at the FCC-hh and from the Vh(->bb) channel at the LHC Run 3, the HL-LHC, and the FCC-hh.
Our study of the Vh(->bb) channel demonstrates that extracting bounds on BSM operators at hadron colliders can be a highly non-trivial task. Machine-learning algorithms can potentially be useful for analysing such complex event structures. We derive bounds using Boosted Decision Trees for the signal-background classification and compare them with those from the previously discussed cut-and-count analysis, finding a mild improvement of O(few %) across the different operators.
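As a toy version of the comparison described in the last paragraph, the sketch below pits a rectangular cut-and-count selection against a BDT classifier on two invented kinematic observables. The event model, cut values, and significance measure are illustrative only and do not reproduce the thesis's analysis.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)

# Toy kinematics: signal events sit at higher Vh invariant mass and
# higher Higgs-candidate pT than the (QCD-like) background.
n = 20000
is_sig = rng.random(n) < 0.1
m_vh = np.where(is_sig, rng.normal(1200, 200, n), rng.exponential(400, n))
pt_h = np.where(is_sig, rng.normal(500, 120, n), rng.exponential(150, n))
X = np.column_stack([m_vh, pt_h])

X_tr, X_te, y_tr, y_te = train_test_split(X, is_sig, random_state=4)

# Cut-and-count baseline: rectangular cuts on the two observables.
cut = (X_te[:, 0] > 800) & (X_te[:, 1] > 300)
s, b = (cut & y_te).sum(), (cut & ~y_te).sum()
print(f"cut-and-count:  S/sqrt(B) = {s / np.sqrt(b):.1f}")

# BDT classification on the same observables.
bdt = GradientBoostingClassifier(n_estimators=200, max_depth=3).fit(X_tr, y_tr)
sel = bdt.predict_proba(X_te)[:, 1] > 0.5
s, b = (sel & y_te).sum(), (sel & ~y_te).sum()
print(f"BDT selection:  S/sqrt(B) = {s / np.sqrt(b):.1f}")
```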
|
19 |
Loss Given Default Estimation with Machine Learning Ensemble Methods / Estimering av förlust vid fallissemang med ensembelmetoder inom maskininlärning. Velka, Elina. January 2020 (links)
This thesis evaluates the performance of three machine learning methods in predicting the Loss Given Default (LGD). LGD can be seen as the opposite of the recovery rate, i.e. the fraction of an outstanding loan that the lender would not be able to recover if the customer defaulted. The methods investigated are decision trees, random forests, and boosted methods. All of the methods investigated performed well in predicting the cases where the loan is not recovered at all, LGD = 1 (100%), or the loan is totally recovered, LGD = 0 (0%). When the models were evaluated on a dataset where the observations with LGD = 1 had been removed, a significant decrease in performance was observed. The random forest model built on an unbalanced training dataset performed better on the test dataset that included LGD = 1 observations, while the random forest model built on a balanced training dataset performed better on the test set where the LGD = 1 observations had been removed. The boosted models evaluated in this study gave less accurate predictions than the other methods. Overall, the random forest models performed slightly better than the decision tree models, although the computational time (the cost) was considerably longer when running the random forest models. Decision tree models are therefore suggested for prediction of Loss Given Default.
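A compact sketch of the three-way comparison on a synthetic, bimodal LGD target (mass points at 0 and 1), including the fit-time cost noted above. The data generation and scikit-learn models are assumptions, not the thesis's setup.

```python
import time
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(5)

# Synthetic LGD data with the bimodal structure described above:
# mass points at 0 (fully recovered) and 1 (not recovered), plus a
# feature-dependent middle range.
n = 10000
X = rng.standard_normal((n, 5))
kind = rng.choice([0, 1, 2], size=n, p=[0.4, 0.3, 0.3])
mid = np.clip(0.5 + 0.2 * X[:, 0] + 0.1 * rng.standard_normal(n), 0, 1)
lgd = np.where(kind == 0, 0.0, np.where(kind == 1, 1.0, mid))

X_tr, X_te, y_tr, y_te = train_test_split(X, lgd, random_state=5)

for name, model in [
    ("decision tree", DecisionTreeRegressor(max_depth=6)),
    ("random forest", RandomForestRegressor(n_estimators=200)),
    ("boosted trees", GradientBoostingRegressor(n_estimators=200)),
]:
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)
    fit_time = time.perf_counter() - t0
    mae = mean_absolute_error(y_te, model.predict(X_te))
    print(f"{name}: MAE = {mae:.3f}, fit time = {fit_time:.2f} s")
```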
|
20 |
Habitat Suitability Modeling for the Eastern Hog-nosed Snake, 'Heterodon platirhinos', in Ontario. Thomasson, Victor. 26 September 2012 (links)
With human populations growing and landscapes changing, an increasing number of wildlife species are being pushed to the brink of extinction. In Canada, the eastern hog-nosed snake, 'Heterodon platirhinos', is found only in a limited portion of southern Ontario. Designated as threatened by the Committee on the Status of Endangered Wildlife in Canada (COSEWIC), this reptile has been losing its habitat at an alarming rate. Given increasing development in southern Ontario, it is crucial to document what limits the snake's habitat in order to better direct conservation efforts for the species' long-term survival. The goals of this study are: 1) to examine which environmental parameters are linked to the presence of the species at a landscape scale; 2) to predict where the snakes can be found in Ontario through GIS-based habitat suitability models (HSMs); and 3) to assess the role of biotic interactions in HSMs. Three models with high predictive power were employed: MaxEnt, Boosted Regression Trees (BRTs), and the Genetic Algorithm for Rule-set Production (GARP). Habitat suitability maps were constructed for the eastern hog-nosed snake across its entire Canadian distribution, and the models were validated with both threshold-dependent and threshold-independent metrics. MaxEnt and BRT performed better than GARP, and all models predict fewer areas of high suitability when landscape variables are used with current occurrences. Forest density and maximum temperature during the active season were the two variables that contributed most to the models predicting the current distribution of the species. Biotic variables increased model performance not by representing a limiting resource, but by reflecting unequal sampling and the areas where forest remains. Although habitat suitability models rely on many assumptions, they remain useful in conservation and landscape management. In addition to helping identify critical habitat, HSMs may be used as a tool to manage land so as to allow species at risk to survive.
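The sketch below shows the projection step common to such HSMs: fit a model on occurrence records, then predict suitability over a regular grid of environmental values. The two predictors echo the study's findings, but the data and the scikit-learn BRT are illustrative stand-ins; the thesis's GIS-based workflow is not reproduced here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(6)

# Synthetic occurrence records built from forest density and maximum
# active-season temperature, the two most influential variables above.
n = 1500
forest = rng.uniform(0, 1, n)
tmax = rng.uniform(18, 32, n)
logit = 4 * (forest - 0.4) + 0.6 * (tmax - 25)
present = rng.random(n) < 1 / (1 + np.exp(-logit))

brt = GradientBoostingClassifier(n_estimators=150, max_depth=2)
brt.fit(np.column_stack([forest, tmax]), present)

# Project the fitted model over a regular grid to produce a habitat
# suitability "map" (here a 50x50 array of predicted probabilities).
f_grid, t_grid = np.meshgrid(np.linspace(0, 1, 50),
                             np.linspace(18, 32, 50))
cells = np.column_stack([f_grid.ravel(), t_grid.ravel()])
suitability = brt.predict_proba(cells)[:, 1].reshape(50, 50)
print("fraction of grid with suitability > 0.7:",
      (suitability > 0.7).mean().round(2))
```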
|