21
Metody rychlého srovnání a identifikace sekvencí v metagenomických datech / Methods for fast sequence comparison and identification in metagenomic data. Kupková, Kristýna. January 2016.
The subject of this thesis is the design of a method for identifying organisms in metagenomic data. Until now, methods based on aligning sequences against a reference database have been adequate for this purpose, but with the rapid growth of data produced by modern sequencing techniques these methods are becoming unsuitable because of their computational cost. This master's thesis describes a new technique that classifies metagenomic data without alignment. The method converts sequenced reads into genomic signals in the form of phase representations, from which feature vectors are extracted. The features are three Hjorth descriptors, which are then passed to maximum-likelihood fitting of a Gaussian mixture model, enabling reliable sorting of fragments according to the organism they belong to.
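As a rough illustration of the pipeline described above, the sketch below (in Python, not the thesis code) computes the three Hjorth descriptors of a toy phase signal and clusters fragments with a Gaussian mixture model; the phase-mapping rule, toy reads and all parameter choices are assumptions for illustration only.

```python
# Sketch: Hjorth descriptors of a genomic phase signal, clustered with a
# Gaussian mixture model. Illustrative assumptions throughout.
import numpy as np
from sklearn.mixture import GaussianMixture

def hjorth_descriptors(signal):
    """Return Hjorth activity, mobility and complexity of a 1-D signal."""
    d1 = np.diff(signal)            # first derivative
    d2 = np.diff(d1)                # second derivative
    activity = np.var(signal)
    mobility = np.sqrt(np.var(d1) / activity)
    complexity = np.sqrt(np.var(d2) / np.var(d1)) / mobility
    return np.array([activity, mobility, complexity])

def phase_signal(read):
    """Map a DNA read to a cumulated-phase signal (one possible variant)."""
    mapping = {"A": 0.0, "C": np.pi / 2, "G": np.pi, "T": -np.pi / 2}
    return np.cumsum([mapping[base] for base in read])

reads = ["ACGTTGCA" * 20, "GGGCCCAT" * 20, "ATATATGC" * 20]   # toy fragments
features = np.array([hjorth_descriptors(phase_signal(r)) for r in reads])

gmm = GaussianMixture(n_components=2, random_state=0).fit(features)
print(gmm.predict(features))   # cluster label ~ putative organism per fragment
```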
22
MALDI-TOF MS Data Processing Using Wavelets, Splines and Clustering Techniques. Chen, Shuo. 18 December 2004.
Mass spectrometry, especially matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF), is emerging as a leading technique in the proteomics revolution. It can be used to find disease-related protein patterns in mixtures of proteins derived from easily obtained samples. In this work, a novel algorithm for MALDI-TOF MS data processing is developed. The software design combines splines for data smoothing and baseline correction, wavelets for adaptive denoising, multivariate statistical techniques such as cluster analysis, and signal processing techniques to evaluate complicated biological signals. A MATLAB implementation carries out the processing steps consecutively: step-interval unification, adaptive wavelet denoising, baseline correction, normalization, and peak detection and alignment for biomarker discovery.
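The thesis describes a MATLAB implementation; the Python sketch below mirrors the same chain (wavelet denoising, spline baseline correction, normalization, peak detection) using SciPy and PyWavelets. The wavelet choice, threshold rule and toy spectrum are illustrative assumptions, not the original settings.

```python
# Minimal sketch of the described processing chain; parameters are assumed.
import numpy as np
import pywt
from scipy.interpolate import UnivariateSpline
from scipy.signal import find_peaks

def wavelet_denoise(y, wavelet="db8", level=4):
    """Soft-threshold detail coefficients (universal-threshold estimate)."""
    coeffs = pywt.wavedec(y, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745   # noise scale, finest level
    thr = sigma * np.sqrt(2 * np.log(len(y)))
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(y)]

def spline_baseline(mz, y, smooth=1e6):
    """Heavily smoothed spline through the signal as a crude baseline."""
    return UnivariateSpline(mz, y, s=smooth)(mz)

# toy spectrum on a uniform m/z grid (step-interval unification assumed done)
mz = np.linspace(1000, 10000, 4096)
y = np.exp(-((mz - 4000) ** 2) / 2e3) + 0.1 * np.random.rand(mz.size) + 1e-4 * mz

y_dn = wavelet_denoise(y)
y_corr = y_dn - spline_baseline(mz, y_dn)                  # baseline correction
y_norm = y_corr / np.trapz(np.clip(y_corr, 0, None), mz)   # TIC-style normalization
peaks, _ = find_peaks(y_norm, prominence=1e-4)             # candidate peaks
print(mz[peaks])
```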
23
ANAEROBIC DIGESTION OF DAIRY INDUSTRY WASTES: PROCESS PERFORMANCE AND MICROBIAL INSIGHTS. FONTANA, ALESSANDRA. 27 March 2018.
Biogas production is a hot topic that has gained global interest in recent years, mainly owing to the depletion of fossil fuels and environmental concerns about organic waste disposal. Anaerobic Digestion (AD) is a biological process that addresses both problems, producing energy (in the form of biogas) by converting polluting organic matter into methane and carbon dioxide. The overall process relies on a syntrophic chain in which different microbial consortia produce the substrates needed for the final methanogenic step.
Cheese whey, a highly polluting waste from the cheese-making process, has been investigated extensively as an AD substrate. However, a less well-known waste originates from the portioning and shaving phases of long-ripened hard cheese.
This study investigates the microbiome of anaerobic digesters processing dairy industry wastes such as cattle manure, cheese whey and hard-cheese powder waste. In particular, it analyzes the effects of process parameters, reactor configurations and waste type on the microbial populations. The goal was achieved by means of culture-independent methods and high-throughput sequencing, which allowed the main species present to be quantified and identified, together with their differential gene expression in response to hydrogen injection for biogas upgrading.
24
Metody pro komparativní analýzu metagenomických dat / Methods for Comparative Analysis of Metagenomic Data. Sedlář, Karel. January 2018.
Modern research in environmental microbiology uses genomic data, above all DNA sequencing, to describe microbial communities. The field that studies all the genetic material present in an environmental sample is called metagenomics. This doctoral thesis deals with metagenomics from the perspective of bioinformatics, which is indispensable for the computational processing of the data. The theoretical part of the thesis describes the two basic approaches of metagenomics, including their underlying principles and weaknesses. The first approach, based on targeted sequencing, is a well-developed area with a wide range of bioinformatic techniques; nevertheless, methods for comparing samples from several environments can still be substantially improved. The approach introduced in this thesis uses a unique transformation of the data into a bipartite graph, in which one part is formed by taxa and the other by samples or environments. Such a graph fully reflects the qualitative and quantitative composition of the analyzed microbial network. It allows a massive reduction of the data for simple visualization without any negative effect on automatic community detection, which can reveal clusters of similar samples and their typical microbes. The second approach uses whole-metagenome sequencing. This strategy is newer, and the corresponding bioinformatic tools are less mature. The main challenge remains the fast classification of sequences, referred to in metagenomics as "binning". The method introduced in this thesis is based on genomic signal processing. This unique methodology was designed after a detailed analysis of the redundancy of the genetic information stored in genomic signals. It transforms character sequences into several variants of phase signals. Moreover, it allows direct processing of nanopore sequencing data in the form of native current signals.
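A minimal sketch of the bipartite taxa-samples representation described above, using NetworkX; the toy abundance table, the edge-weight threshold and the use of greedy modularity for community detection are assumptions for illustration.

```python
# Sketch: taxa in one part, samples in the other, abundances as edge weights.
import networkx as nx
from networkx.algorithms import bipartite, community

# rows: taxa, columns: samples; entries: relative abundances (toy values)
abundance = {
    "Bacteroides":  {"gut_1": 0.40, "gut_2": 0.35, "soil_1": 0.01},
    "Prevotella":   {"gut_1": 0.25, "gut_2": 0.30, "soil_1": 0.02},
    "Streptomyces": {"gut_1": 0.01, "gut_2": 0.02, "soil_1": 0.55},
}

G = nx.Graph()
G.add_nodes_from(abundance, bipartite="taxa")
G.add_nodes_from({s for row in abundance.values() for s in row}, bipartite="samples")
for taxon, row in abundance.items():
    for sample, w in row.items():
        if w > 0.05:               # drop rare links: massive data reduction
            G.add_edge(taxon, sample, weight=w)

assert bipartite.is_bipartite(G)
# modularity-based community detection groups samples with their typical taxa
for c in community.greedy_modularity_communities(G, weight="weight"):
    print(sorted(c))
```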
25
Prediction of Credit Risk using Machine Learning Models. Isaac, Philip. January 2022.
This thesis investigates different machine learning (ML) models and their performance in order to find the best-performing model for predicting credit risk at a specific company. Since granting credit to corporate customers is part of this company's core business, managing credit risk is of high importance. The company currently has only one credit risk measure, obtained from an external company, and the goal is to find a model that outperforms it. The study covers two ML models, Logistic Regression (LR) and eXtreme Gradient Boosting (XGBoost). The thesis shows that both methods perform better than the external risk measure, with LR achieving the best overall performance. One of the most important analyses in this thesis was preparing the dataset and finding the best-suited combination of features for the ML models.
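A minimal sketch of such a model comparison, assuming a synthetic imbalanced dataset in place of the company's confidential data; the hyper-parameters, the AUC metric and the xgboost package are illustrative choices, not necessarily those of the thesis.

```python
# Sketch: LR vs. gradient boosting on synthetic, imbalanced "default" labels.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier   # third-party package, assumed installed

X, y = make_classification(n_samples=5000, n_features=20, weights=[0.9],
                           random_state=0)           # imbalanced: few defaults
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "LR":  LogisticRegression(max_iter=1000, class_weight="balanced"),
    "XGB": XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05,
                         eval_metric="logloss"),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")   # benchmark against external score
```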
26
Exploration of microbial diversity and evolution through cultivation independent phylogenomics. Martijn, Joran. January 2017.
Our understanding of microbial evolution depends largely on available genomic data from diverse organisms. Yet genome-sequencing efforts have mostly ignored the diverse uncultivable majority in favor of cultivable and societally relevant organisms. In this thesis, I have applied and developed cultivation-independent methods to explore microbial diversity and obtain genomic data in an unbiased manner. The obtained genomes were then used to study the evolution of mitochondria, Rickettsiales and Haloarchaea. Metagenomic binning of oceanic samples recovered draft genomes for thirteen novel Alphaproteobacteria-related lineages. Phylogenomic analyses utilizing the improved taxon sampling suggested that mitochondria are not related to Rickettsiales but rather evolved from a proteobacterial lineage closely related to all sampled alphaproteobacteria. Single-cell genomics and metagenomics of lake and oceanic samples, respectively, identified previously unobserved Rickettsiales-related lineages. They branched early relative to characterized Rickettsiales and encoded flagellar genes, a feature once thought absent in this order. Flagella are most likely an ancestral feature that was independently lost during Rickettsiales diversification. In addition, preliminary analyses suggest that the ATP/ADP translocase, the marker for energy parasitism, was acquired after the type IV secretion systems during the emergence of the Rickettsiales. Further exploration of the oceanic samples yielded the first draft genomes of Marine Group IV (MG-IV) archaea, the closest known relatives of the Haloarchaea. The halophilic and generally aerobic Haloarchaea are thought to have evolved from an anaerobic methanogenic ancestor, and the MG-IV genomes allowed us to study this enigmatic evolutionary transition: preliminary ancestral reconstruction analyses suggest a gradual loss of methanogenesis and a gradual adaptation to an aerobic lifestyle. The thesis further presents a new amplicon sequencing method that captures near full-length 16S and 23S rRNA genes of environmental prokaryotes. The method exploits PacBio's long-read technology and the frequent proximity of these genes in prokaryotic genomes. Compared to traditional partial 16S amplicon sequencing, our method classifies environmental lineages that are distantly related to reference taxa more confidently. In conclusion, this thesis provides new insights into the origins of mitochondria, Rickettsiales and Haloarchaea, and illustrates the power of cultivation-independent methods for the study of microbial evolution.
27
Probing the Structure of Ionised ISM in Lyman-Continuum-Leaking Green Pea Galaxies with MUSE. Nagar, Chinmaya. January 2023.
Lyman continuum (LyC) photons are responsible for reionising the universe after the end of the Dark Ages, during a period called the Epoch of Reionisation (EoR). While these high-energy photons are thought to originate predominantly from young, hot, massive stars within the earliest galaxies, with contributions from high-energy sources such as quasars and AGN, their origin is not yet well constrained and remains highly debated. Detecting LyC photons from early galaxies near the EoR is not possible, as they are completely absorbed by the intergalactic medium (IGM) on their way to us; this has prompted the development of indirect diagnostics that estimate the LyC contribution of such galaxies by studying their analogues at low redshift. In this study, we probe the ionised interstellar medium (ISM) of seven Green Pea galaxies through spatially resolved [O III] λ5007/[O II] λ3727 (O32) and [O III] λ5007/Hα λ6562 (O3Hα) emission-line ratio maps, using data from the Multi Unit Spectroscopic Explorer (MUSE) on the Very Large Telescope (VLT). Of the two ratios, the former has proven to be a successful diagnostic for predicting Lyman continuum emitters (LCEs). Along with the line-ratio maps, the surface brightness profiles of the galaxies are studied to examine the spatial distribution of the emission lines and the regions from which they originate. The resulting maps indicate whether the ISM of each galaxy is ionisation-bounded or density-bounded. Our analysis reveals that a subset of the galaxies with ionisation-bounded ISM exhibits pronounced ionisation channels in the outer regions; these channels are potential pathways through which Lyman continuum photons may escape. For density-bounded ISM, the ionised ISM extends well beyond the stellar regions into the halos of the galaxies, highlighting their potential contribution to the ionising photon budget during the EoR. The findings emphasise the importance of spatially resolved ISM studies in understanding the mechanisms facilitating the escape of LyC photons.
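A minimal sketch of how such an O32 map can be computed pixel by pixel from two emission-line flux maps; the array names, toy data and S/N cut are assumptions, not the study's actual MUSE reduction.

```python
# Sketch: spatially resolved line-ratio map with a simple S/N mask.
import numpy as np

def line_ratio_map(flux_num, flux_den, err_num, err_den, snr_min=3.0):
    """Pixel-wise ratio, masking pixels below the S/N cut in either line."""
    good = (flux_num / err_num > snr_min) & (flux_den / err_den > snr_min)
    ratio = np.full(flux_num.shape, np.nan)
    ratio[good] = flux_num[good] / flux_den[good]
    return ratio

# toy 2-D flux maps standing in for [O III]5007 and [O II]3727 (+ errors)
rng = np.random.default_rng(0)
oiii, oii = rng.uniform(1, 10, (50, 50)), rng.uniform(1, 5, (50, 50))
oiii_err, oii_err = np.full_like(oiii, 0.5), np.full_like(oii, 0.5)

o32 = line_ratio_map(oiii, oii, oiii_err, oii_err)
# high O32 in outer pixels would hint at density-bounded, leaking regions
print(np.nanmedian(o32))
```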
28
Multi-defect detection in hardwood using AI on hyperspectral images. Ytterberg, Kalle. January 2024.
With the evolution of GPU performance, interest in using AI for all kinds of purposes has risen. Companies today put a great amount of resources into finding new ways of using AI to increase the value of their products or to automate processes. One area of the wood industry where AI is widely used and studied is defect detection. In this thesis, the combination of AI and hyperspectral images is studied and evaluated for segmenting defects in hardwood with a U-Net network structure. The performance is compared to another method commonly used for high-dimensional data: PLS-DA. The thesis also compares the use of RGB image data in combination with AI, to further analyze the usefulness of the hyperspectral data. The results showed signs of improvement when using hyperspectral images compared to RGB images for detecting blue stain and red heartwood defects. Detection of rot and knots, however, showed no sign of improvement. Since the annotations are more accurate in the RGB data, the results from the networks fed with hyperspectral data suggest that blue stain and red heartwood could be of interest for further investigation. Computational performance varies across the different reduction methods, and the results of this thesis provide some insight that may aid in choosing an appropriate reduction method.
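For the PLS-DA baseline mentioned above, a common recipe is PLS regression on one-hot class labels with prediction by argmax over the predicted class scores. The sketch below follows that recipe on synthetic pixel spectra; the shapes, number of components and injected class signature are illustrative assumptions.

```python
# Sketch: PLS-DA = PLS regression on one-hot labels, argmax prediction.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n_pixels, n_bands, n_classes = 2000, 200, 3     # hyperspectral pixels, defect classes
X = rng.normal(size=(n_pixels, n_bands))
y = rng.integers(0, n_classes, n_pixels)
X[y == 1, 50:60] += 1.5                         # give class 1 a spectral signature

Y = np.eye(n_classes)[y]                        # one-hot encode class labels
pls = PLSRegression(n_components=10).fit(X, Y)  # latent space also reduces dimension
y_pred = pls.predict(X).argmax(axis=1)          # class with the highest score wins
print("training accuracy:", (y_pred == y).mean())
```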
29
Proton computed tomography / Tomographie proton informatisée. Quiñones, Catherine Thérèse. 28 September 2016.
The use of protons in cancer treatment is widely recognized thanks to the finite, precise stopping range of protons in matter. In proton therapy treatment planning, the uncertainty in determining the range stems mainly from the inaccuracy of converting the Hounsfield units obtained from x-ray computed tomography to proton stopping power. Proton CT (pCT) is an attractive solution, as this modality directly reconstructs the relative stopping power (RSP) map of the object. The conventional pCT technique reconstructs the RSP map from measurements of the energy loss of protons. In addition to losing energy, protons also undergo multiple Coulomb scattering and nuclear interactions, which could reveal other interesting properties of the materials not visible in RSP maps.
This PhD work investigates proton interactions through Monte Carlo simulations in GATE and uses this information to reconstruct a map of the object through filtered back-projection along the most likely proton paths. Aside from the conventional energy-loss pCT, two pCT modalities have been investigated and implemented. The first, called attenuation pCT, uses the attenuation of protons to reconstruct the linear inelastic nuclear cross-section map of the object. The second, called scattering pCT, exploits proton scattering, measuring the angular variance to reconstruct the relative scattering power map, which is related to the radiation length of the material. The accuracy, precision and spatial resolution of the images reconstructed from the two pCT modalities were evaluated qualitatively and quantitatively and compared with conventional energy-loss pCT. While energy-loss pCT already provides the information needed to calculate the proton range for treatment planning, attenuation pCT and scattering pCT give complementary information about the object. First, scattering pCT and attenuation pCT images provide additional information intrinsic to the materials of the object. Second, in some of the studied cases, attenuation pCT images demonstrate better spatial resolution and show features that would supplement energy-loss pCT reconstructions.
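A minimal sketch of filtered back-projection on a toy RSP phantom, using scikit-image; it assumes straight-line paths and ideal line integrals, whereas the thesis reconstructs along most likely proton paths estimated from tracker measurements.

```python
# Sketch: forward-project a toy RSP phantom, then reconstruct it with FBP.
import numpy as np
from skimage.transform import radon, iradon

# toy relative-stopping-power phantom: water cylinder with a bone-like insert
size = 128
yy, xx = np.mgrid[-1:1:size * 1j, -1:1:size * 1j]
rsp = np.where(xx**2 + yy**2 < 0.8**2, 1.0, 0.0)    # water, RSP ~ 1.0
rsp[(xx - 0.3)**2 + yy**2 < 0.15**2] = 1.6          # bone-like insert, RSP ~ 1.6

angles = np.linspace(0.0, 180.0, 180, endpoint=False)
sinogram = radon(rsp, theta=angles)                 # line integrals ~ WEPL data
reco = iradon(sinogram, theta=angles, filter_name="ramp")

print("max abs reconstruction error:", np.abs(reco - rsp).max())
```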
30
Modélisation et simulation numérique de la dynamique des aérosols atmosphériques / Modeling and numerical simulation of atmospheric aerosol dynamics. Debry, Edouard. 12 1900.
Chemistry-transport models allow realistic tracking of gas-phase pollutants in the atmosphere. However, atmospheric pollution also takes the form of fine suspended particles, the aerosols, which interact with the gas phase and with solar radiation, and which have their own dynamics. This thesis deals with the modeling and numerical simulation of the General Dynamic Equation (GDE) for aerosols. Part I addresses some theoretical points of aerosol modeling. Part II is devoted to the development of the size-resolved aerosol module SIREAM. In Part III, the model is reduced for use in a dispersion model such as POLAIR3D. Several modeling questions remain largely open: the organic fraction of aerosols, external mixing, coupling with turbulence, and nano-particles.
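For reference, a standard continuous form of the aerosol GDE with coagulation and condensation terms is written below in LaTeX; the notation is a textbook convention and is not quoted from the thesis.

```latex
% n(v,t): number density of particles of volume v at time t;
% K: coagulation kernel; I(v): condensation growth rate dv/dt.
\frac{\partial n(v,t)}{\partial t}
  = \underbrace{\frac{1}{2}\int_0^{v} K(v-u,\,u)\,n(v-u,t)\,n(u,t)\,\mathrm{d}u
    - n(v,t)\int_0^{\infty} K(v,u)\,n(u,t)\,\mathrm{d}u}_{\text{coagulation}}
  \;-\; \underbrace{\frac{\partial}{\partial v}\bigl[I(v)\,n(v,t)\bigr]}_{\text{condensation/evaporation}}
```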