721 |
Hodnocení morfologie patra u BCLP pacientů s palatoláliemi / Evaluation of palate morphology in bilateral cleft lip and palate patients with palatolalia. Hamtilová, Martina, January 2011
This diploma thesis is based on the evaluation of dental casts of patients with bilateral cleft lip and palate (BCLP) with a mean age of 10. The patients form two groups: patients without a speech defect and patients with a speech impairment (palatolalia). In the literature, palatolalia is primarily associated with velopharyngeal insufficiency. The study tested the working hypothesis that a different, in some way specific, palatal shape is involved in the speech failure. The dental casts were scanned with a laser scanner and analysed by 3-D geometric morphometry and multivariate statistics: principal component analysis (PCA), linear regression analysis, and finite element scaling analysis (FESA). Linear regression showed that palatal shape is affected by age in younger individuals, so 5 patients had to be excluded from further analysis. Patients with palatolalia show lower variability in palatal shape than patients without it; their palates are thus similar to each other and have a specific shape. Their palates are wider and lower than in individuals without the speech disorder, and they show a characteristic deepening behind the anterior part of the palate. We assume that these features of palate morphology, primarily the lower arch and the substantial deepening, are most likely to affect the...
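A minimal sketch of the morphometric workflow described above, assuming landmark-based palatal data: generalized Procrustes alignment followed by PCA of the aligned coordinates. The array shapes, iteration count, and use of NumPy/scikit-learn are illustrative assumptions, not the thesis's actual pipeline.

```python
# Sketch: generalized Procrustes alignment of 3-D landmark configurations
# followed by PCA, as commonly done in geometric morphometrics.
# Landmark data are simulated; all shapes and names are illustrative.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_casts, n_landmarks = 30, 20           # hypothetical sample size
shapes = rng.normal(size=(n_casts, n_landmarks, 3))

def align(shape, ref):
    """Rotate `shape` onto `ref` (both centered and unit-scaled)."""
    u, _, vt = np.linalg.svd(shape.T @ ref)
    return shape @ u @ vt

# Center and scale each configuration, then iteratively rotate to the mean.
shapes = shapes - shapes.mean(axis=1, keepdims=True)
shapes = shapes / np.linalg.norm(shapes, axis=(1, 2), keepdims=True)
mean = shapes[0]
for _ in range(5):                      # a few GPA iterations suffice here
    shapes = np.array([align(s, mean) for s in shapes])
    mean = shapes.mean(axis=0)
    mean /= np.linalg.norm(mean)

# PCA on the flattened aligned coordinates captures the main shape variation.
pca = PCA(n_components=5)
scores = pca.fit_transform(shapes.reshape(n_casts, -1))
print(pca.explained_variance_ratio_)
```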
722 |
E-noses equipped with Artificial Intelligence technology for diagnosis of dairy cattle diseases in veterinary medicine / E-nose utrustad med Artificiell intelligens teknik avsedd för diagnos av mjölkboskap sjukdom i veterinär. Haselzadeh, Farbod, January 2021
The main goal of this project, carried out at Neurofy AB, was to develop an AI recognition algorithm (also known as a gas sensing algorithm or simply recognition algorithm) that could detect or predict dairy cattle diseases from odor signal data gathered, measured, and provided by the company's Gas Sensor Array (GSA), also known as an electronic nose or simply e-nose. The project faced two major challenges: first, overcoming the noise and errors in the odor signal data, since the e-nose is intended for use under conditions different from those of a laboratory, for instance in a bail (a stall for milking cows) with varying humidity and temperature; and second, finding a feature extraction method appropriate for the GSA. Normalization and principal component analysis (PCA) are two classic methods intended not only for re-scaling and reducing the features of a data set during the pre-processing phase of developing an odor identification algorithm, but also thought to reduce the effect of noise in odor signal data. Applying classic approaches such as PCA for feature extraction and dimensionality reduction, however, led to a loss of valuable information that made odor classification difficult. A new method, consisting of signal segmentation and an autoencoder with an encoder-decoder structure, was therefore developed to handle the noise in the odor signal data and to reduce dimensionality without losing valuable information. It overcame the noise issues in the data sets and proved to be the more appropriate feature extraction method, yielding better prediction accuracy for the AI gas recognition algorithm than PCA. The autoencoder was evaluated by monitoring its learning rate. For classifying and predicting odors, several classifiers were investigated, among others logistic regression (LR), support vector machine (SVM), linear discriminant analysis (LDA), random forest classifier (RFC), and multilayer perceptron (MLP). The best predictions were obtained with the MLP classifier. To validate the predictions of the new AI recognition algorithm, several validation methods were applied, including cross-validation, accuracy score, balanced accuracy score, precision score, recall score, and learning curves. The new AI recognition algorithm can diagnose 3 different dairy cattle diseases with an accuracy of 96% despite the scarcity of samples.
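A rough sketch of the approach described above, under assumed data shapes: segment the sensor signals, compress them with an autoencoder, and classify the latent features with an MLP. Layer sizes, segment lengths, and class labels are placeholders, not the company's implementation.

```python
# Sketch: autoencoder features from segmented gas-sensor responses,
# classified by an MLP. All shapes and hyperparameters are assumptions.
import numpy as np
import torch
import torch.nn as nn
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
n_samples, seg_len, n_segs = 200, 64, 4       # hypothetical segmentation
X = rng.normal(size=(n_samples, n_segs * seg_len)).astype("float32")
y = rng.integers(0, 3, n_samples)             # 3 hypothetical disease classes

class AE(nn.Module):
    def __init__(self, d_in, d_lat=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(),
                                 nn.Linear(64, d_lat))
        self.dec = nn.Sequential(nn.Linear(d_lat, 64), nn.ReLU(),
                                 nn.Linear(64, d_in))
    def forward(self, x):
        return self.dec(self.enc(x))

ae, xt = AE(X.shape[1]), torch.from_numpy(X)
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
for epoch in range(50):                       # reconstruction training
    opt.zero_grad()
    loss = nn.functional.mse_loss(ae(xt), xt)
    loss.backward()
    opt.step()

# The latent codes replace PCA scores as denoised, low-dimensional features.
Z = ae.enc(xt).detach().numpy()
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(Z, y)
print("training accuracy:", clf.score(Z, y))
```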
723 |
The impact of regional integration on socio-economic development in Southern African Customs Union countries. Tafirenyika, Blessing, 03 1900
Regional integration has gained popularity and is prioritised globally, especially in developing economies, including those on the African continent. This is based on its potential to accelerate trade, stimulate economic growth, increase access to basic necessities, and induce a sustainable increase in economic output and improved standards of living. In the context of developing economies, however, the case for regional integration remains largely implicit. Modern literature treats it as a policy option for dealing with a wide variety of issues related to politics, economic factors, and societal welfare. The Southern African Customs Union (SACU), in existence since 1910, has concluded several trade agreements globally. The union aims at reducing inequalities, ensuring continuous improvement in the general welfare of the population, and sustaining economic growth. Research, though, indicates that the region persistently reflects poor socio-economic conditions, accompanied by limited infrastructure development and a workforce low in skills and experience. Primary-sector activities such as mining and agriculture dominate these economies, alongside high levels of inequality and poverty. Regional integration has been implemented differently across countries globally, and in Africa in particular. This research noted that literature on regional integration and its implications for socio-economic development is lacking, especially in the context of SACU. A further deficiency is that the measurement of regional integration is not standardised: some studies employed single variables as proxies, whilst others compiled composite indices suited to their diverse setups and environments. Such development measurements therefore cannot be applied universally, owing to context-specific concerns prevalent in particular regions or countries. This study developed the SACU Regional Integration Index (SRII) because the existing indices of regional integration are limited in applicability: most indices established in the literature were developed for specific countries and regions whose characteristics differ from those of the SACU region. In addition to a detailed literature review and the closing of methodological divergences, this study evaluated the effects of regional integration on socio-economic development in the SACU countries. The objectives of the
integration on socio-economic development in the SACU countries. The objectives of the
study were first, to produce the SACU Regional Integration Index. Second, the study
aimed at evaluating the effect of regional integration on various socio-economic development factors: economic growth, investments, the Human Development Index (HDI), inequalities, and poverty. Third, the study provided policy recommendations for the socio-economic problems encountered by the SACU countries; and lastly, it applied the proposed SRII as a way of providing policymakers with the actual impacts. The study employed principal component analysis (PCA) to construct the SRII. Ordinary least squares with the least squares dummy variable (LSDV) approach, fixed effects, and random effects models were employed to ascertain the effect of regional integration on socio-economic development
in the SACU countries. The constructed SACU index comprises four dimensions. These
are trade integration; productive integration; infrastructure integration; and financial and
macroeconomic policies integration. The index revealed that SACU countries are
dominated by trade and productive integration. Further analysis of the results indicated
that collaboration on the financial and macroeconomic policies is lacking and the
infrastructure dimension is lagging in the SACU region. Based on the second objective,
the results indicate that regional integration is critical in improving trade openness and
HDI, especially in Lesotho, Botswana, and Namibia. The effect of regional integration on
real Gross Domestic Product (GDP) growth, inequalities, and poverty reduction was
realised in the long run through the interaction of all variables under study. This supported
the dynamic effects posited by the dynamic theory of regional integration. It was
established, though, that growth in the infrastructure dimension is insignificant compared to other dimensions of regional integration. This explains why regional integration has not stimulated investment in the economies forming the SACU region. The third objective was to proffer policy recommendations. Several practical policy recommendations emerged from this study, based on the findings and the literature review. These recommendations include implementing inclusive development programmes, promoting private sector participation in economic activities, and adopting policies to boost
production capacity in the countries in this region. Based on the fourth objective, this study
further recommends that SACU, as a region, integrate into the global economy. This can be achieved by participating in global production networks for manufacturing and taking
advantage of emerging economies. This would diversify their export markets and their
sources of development finance. SACU countries should make regional integration and
trade a part of their national and sectoral development plans, ensuring coherent trade
and industrial policies. They should also improve their labour, education, social protection,
and safety nets. With data availability, this research can be extended to incorporate
quarterly data or more years of study. Time-series methods can be applied, such as the
Autoregressive Distributed Lag (ARDL) method. This will increase the sample size and
the number of observations, which can improve the outcome from the statistical and
econometric analysis. Future studies may also evaluate the applicability of the index
constructed in this study. / Economics / D. Phil. (Economics)
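A hedged sketch of how a PCA-based composite index such as the SRII can be assembled, with hypothetical dimension indicators and a variance-share weighting scheme; the thesis's exact construction may differ.

```python
# Sketch of a PCA-based composite index in the spirit of the SRII:
# standardise the dimension indicators, run PCA, and weight components
# by their share of explained variance. Indicator names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
data = pd.DataFrame(
    rng.normal(size=(5 * 20, 4)),            # 5 countries x 20 years
    columns=["trade", "productive", "infrastructure", "fin_macro"],
)

Z = StandardScaler().fit_transform(data)
pca = PCA().fit(Z)
w = pca.explained_variance_ratio_            # variance-share weights

# Index = variance-weighted sum of component scores, rescaled to [0, 1].
scores = pca.transform(Z) @ w
data["SRII"] = (scores - scores.min()) / (scores.max() - scores.min())
print(data.head())
```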
724 |
Assessment of blind source separation techniques for video-based cardiac pulse extraction. Wedekind, Daniel; Trumpp, Alexander; Gaetjen, Frederik; Rasche, Stefan; Matschke, Klaus; Malberg, Hagen; Zaunseder, Sebastian. 09 September 2019
Blind source separation (BSS) aims at separating useful signal content from distortions. In the contactless acquisition of vital signs by means of the camera-based photoplethysmogram (cbPPG), BSS has evolved into the most widely used approach to extract the cardiac pulse. Despite its frequent application, there is no consensus about the optimal usage of BSS and its general benefit. This contribution investigates the performance of BSS in enhancing the cardiac pulse from cbPPGs in dependence on varying input data characteristics. The BSS input conditions are controlled by an automated spatial preselection routine over regions of interest. Input data of different characteristics (wavelength, dominant frequency, and signal quality) from 18 postoperative cardiovascular patients are processed with standard BSS techniques, namely principal component analysis (PCA) and independent component analysis (ICA). The effect of BSS is assessed by the spectral signal-to-noise ratio (SNR) of the cardiac pulse. The preselection of cbPPGs appears beneficial, providing higher SNR compared to standard cbPPGs. Both PCA and ICA yielded better outcomes using monochrome inputs (green wavelength) instead of inputs of different wavelengths. PCA outperforms ICA for more homogeneous input signals. Moreover, for high input SNR, the application of ICA using standard contrast is likely to decrease the SNR.
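The comparison can be illustrated with a toy version of the workflow: apply PCA and ICA to multichannel cbPPG traces and score each component by spectral SNR in the cardiac band. The signal model and SNR definition below are simplified assumptions, not the paper's exact procedure.

```python
# Sketch of the assessed BSS workflow: PCA vs. ICA on simulated cbPPG
# channels, keeping the component with the best cardiac-band spectral SNR.
import numpy as np
from scipy.signal import periodogram
from sklearn.decomposition import PCA, FastICA

fs = 30.0
t = np.arange(0, 30, 1 / fs)                 # 30 s of 30 Hz video signal
rng = np.random.default_rng(3)
pulse = np.sin(2 * np.pi * 1.2 * t)          # 72 bpm cardiac pulse
mix = rng.normal(size=(3, 3))                # unknown channel mixing
X = (mix @ np.vstack([pulse,
                      np.sin(2 * np.pi * 0.3 * t),   # breathing/motion
                      rng.normal(size=t.size)])).T

def spectral_snr(sig, f_lo=0.7, f_hi=3.0):
    """Power near the dominant cardiac-band peak vs. the rest of the spectrum."""
    f, p = periodogram(sig, fs)
    band = (f >= f_lo) & (f <= f_hi)
    peak = f[band][np.argmax(p[band])]
    sigband = np.abs(f - peak) <= 0.1
    return 10 * np.log10(p[sigband].sum() / p[~sigband & (f > 0)].sum())

for name, comps in [("PCA", PCA(3).fit_transform(X).T),
                    ("ICA", FastICA(3, random_state=0).fit_transform(X).T)]:
    best = max(spectral_snr(c) for c in comps)
    print(f"{name}: best component SNR = {best:.1f} dB")
```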
725 |
Using Laser-Induced Breakdown Spectroscopy (LIBS) for Material Analysis. Pořízka, Pavel, January 2014
This doctoral thesis focuses on the development of an algorithm for processing data measured with a laser-induced breakdown spectroscopy (LIBS) instrument. A LIBS instrument equipped with this algorithm should subsequently be able to classify samples and quantify an analyte in situ and in real time. The entire experimental part of this work was carried out at the Federal Institute for Materials Research and Testing (BAM) in Berlin, Germany, where an elementary LIBS system was assembled. In parallel with the experimental work, a literature review was compiled to give a comprehensive overview of the chemometric methods used in the analysis of LIBS measurements. The use of chemometric methods for LIBS data analysis is generally recommended, especially when samples with complex matrices are analysed. The algorithm development focused on the quantitative analysis and classification of igneous rocks from LIBS measurements. The sample set measured by LIBS consisted of certified reference materials and rock samples collected directly at copper deposits in Iran. The Iranian samples were sorted on site by an experienced geologist, and their copper content was measured at Clausthal University of Technology, Germany. The resulting calibration curves were strongly non-linear, even when constructed from measurements of the reference samples. Each calibration curve could be decomposed into several partial curves, such that the dependence of the copper line intensity on the copper content followed a different trend for each rock type. This splitting of the calibration curve is usually attributed to the so-called matrix effect, which strongly influences LIBS measurements. In other words, when the analyte content is determined in samples with different matrices, a calibration curve built from only one variable (the intensity of a selected analyte spectral line) is inaccurate. Moreover, normalising such calibration curves to the intensity of a matrix-element spectral line did not markedly improve their linearity. It is generally impossible to select a spectral line of a single matrix element when samples with complex matrix compositions are analysed. Chemometric methods, namely principal component regression (PCR) and partial least squares regression (PLSR), were used for multivariate quantitative analysis, i.e. using several variables/spectral lines of the analyte and matrix elements. It should be noted that PCR and PLSR can compensate for the matrix effect only to a certain extent. Furthermore, the samples were successfully classified by principal component analysis (PCA) and Kohonen maps on the basis of their matrix-element composition (the 'spectral fingerprint' of the English-language literature). Based on the theory and the experimental measurements, an algorithm for reliable classification and quantification of unknown samples was proposed. This study should contribute to the processing of data measured in situ by the remote LIBS instrument currently under development at the Brno University of Technology. That instrument will be invaluable for sample quantification and classification only if used together with chemometric methods and data libraries. For this purpose, part of the data libraries has already been measured and tested with a view to applying LIBS in the mining industry.
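A brief sketch of the multivariate calibration idea, with simulated spectra: a PLS regression over many spectral channels, evaluated by cross-validation. The component count, channel count, and matrix-effect model are assumptions.

```python
# Sketch of multivariate LIBS calibration: instead of a single Cu line
# intensity, PLS regression uses many spectral variables, which can partly
# compensate for matrix effects. Spectra are simulated, not real LIBS data.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n_samples, n_channels = 60, 500
cu = rng.uniform(0, 5, n_samples)                  # wt% Cu, hypothetical
matrix_effect = rng.normal(1.0, 0.3, n_samples)    # per-sample matrix factor
spectra = (np.outer(cu * matrix_effect, rng.uniform(size=n_channels))
           + rng.normal(scale=0.1, size=(n_samples, n_channels)))

pls = PLSRegression(n_components=5)
r2 = cross_val_score(pls, spectra, cu, cv=5, scoring="r2")
print("cross-validated R^2:", r2.mean().round(3))
```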
726 |
PCA based dimensionality reduction of MRI images for training support vector machine to aid diagnosis of bipolar disorder / PCA baserad dimensionalitetsreduktion av MRI bilder för träning av stödvektormaskin till att stödja diagnostisering av bipolär sjukdom. Chen, Beichen; Chen, Amy Jinxin. January 2019
This study aims to investigate how dimensionality reduction of neuroimaging data prior to training support vector machines (SVMs) affects the classification accuracy of bipolar disorder. The study uses principal component analysis (PCA) for dimensionality reduction. An open source data set of 19 bipolar and 31 control structural magnetic resonance imaging (sMRI) samples was used, part of the UCLA Consortium for Neuropsychiatric Phenomics LA5c Study funded by the NIH Roadmap Initiative, which aims to foster breakthroughs in the development of novel treatments for neuropsychiatric disorders. The images underwent smoothing, feature extraction, and PCA before being used as input to train SVMs. 3-fold cross-validation was used to tune a number of hyperparameters for linear, radial, and polynomial kernels. Experiments investigated the performance of SVM models trained using 1 to 29 principal components (PCs). Several PC sets reached 100% accuracy in the final evaluation, the minimal set being the first two principal components. The accumulated variance explained by the PCs used showed no correlation with model performance. The choice of kernel and hyperparameters is of utmost importance, as the performance obtained can vary greatly. The results support previous findings that SVM can be useful in aiding the diagnosis of bipolar disorder, and that PCA as a dimensionality reduction method in combination with SVM may be appropriate for the classification of neuroimaging data for illnesses not limited to bipolar disorder. Owing to the limitation of a small sample size, the results call for future research using larger collaborative data sets to validate the accuracies obtained.
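The described pipeline maps naturally onto a scikit-learn sketch: PCA feeding an SVM, with 3-fold cross-validation over kernels and the number of components. Random features stand in for the preprocessed sMRI data.

```python
# Sketch of the study's pipeline: PCA for dimensionality reduction feeding
# an SVM, tuned by 3-fold cross-validated grid search. Data are random
# placeholders with the study's sample sizes.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = rng.normal(size=(50, 1000))          # 19 bipolar + 31 controls
y = np.r_[np.ones(19), np.zeros(31)]

pipe = Pipeline([("pca", PCA()), ("svm", SVC())])
grid = GridSearchCV(
    pipe,
    {"pca__n_components": list(range(1, 30)),
     "svm__kernel": ["linear", "rbf", "poly"],
     "svm__C": [0.1, 1, 10]},
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```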
727 |
Detecting and Measuring Corruption and Inefficiency in Infrastructure Projects Using Machine Learning and Data Analytics. Seyedali Ghahari (11182092), 19 February 2022
Corruption is a social evil that resonates far and deep in societies,
eroding trust in governance, weakening the rule of law, impairing economic
development, and exacerbating poverty, social tension, and inequality. It is
a multidimensional and complex societal malady that occurs in various forms and
contexts. As such, any effort to combat corruption must be accompanied by a
thorough examination of the attributes that might play a key role in
exacerbating or mitigating corrupt environments. This dissertation identifies a number of attributes that
influence corruption, using machine learning techniques, neural network
analysis, and time series causal relationship analysis and aggregated data from
113 countries from 2007 to 2017. The results suggest that improvements in
technological readiness, human development index, and e-governance index have
the most profound impacts on corruption reduction. This dissertation discusses
corruption at each phase of infrastructure systems development and engineering
ethics that serve as a foundation for corruption mitigation. The dissertation then applies novel analytical
efficiency measurement methods to measure infrastructure inefficiencies, and to rank
infrastructure administrative jurisdictions at the state level. An efficiency frontier is
developed using optimization and the highest performing jurisdictions are
identified. The dissertation’s framework could serve as a
starting point for governmental and non-governmental oversight agencies to
study forms and contexts of corruption and inefficiency, and to propose
effective methods for reducing their incidence. Moreover, the framework can help
oversight agencies to promote the overall accountability of infrastructure
agencies by establishing a clearer connection between infrastructure investment
and performance, and by carrying out comparative assessments of infrastructure
performance across the jurisdictions under their oversight or supervision.
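One standard way to build such an optimization-based efficiency frontier is input-oriented data envelopment analysis (DEA); the sketch below uses a CCR linear program per jurisdiction as a plausible stand-in, not a confirmed reproduction of the dissertation's method, and the inputs and outputs are invented.

```python
# Hedged sketch of frontier-based efficiency scoring: an input-oriented
# CCR DEA linear program solved per jurisdiction (DMU).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(6)
n = 10                                   # jurisdictions (DMUs)
X = rng.uniform(1, 10, size=(n, 2))      # inputs: e.g. spending, staff
Y = rng.uniform(1, 10, size=(n, 1))      # output: e.g. asset condition

def ccr_efficiency(j):
    """min theta s.t. X'lam <= theta*x_j, Y'lam >= y_j, lam >= 0."""
    c = np.r_[1.0, np.zeros(n)]                    # variables: [theta, lam]
    A_in = np.c_[-X[j], X.T]                       # X'lam - theta*x_j <= 0
    A_out = np.c_[np.zeros(Y.shape[1]), -Y.T]      # -Y'lam <= -y_j
    res = linprog(c,
                  A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.r_[np.zeros(X.shape[1]), -Y[j]],
                  bounds=[(0, None)] * (n + 1))
    return res.fun                                 # efficiency score theta

scores = np.array([ccr_efficiency(j) for j in range(n)])
print("efficient jurisdictions:", np.where(scores > 0.999)[0])
```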
728 |
Facial and keystroke biometric recognition for computer based assessments. Adetunji, Temitope Oluwafunmilayo, 12 1900
M. Tech. (Department of Information Technology, Faculty of Applied and Computer Sciences), Vaal University of Technology. / Computer based assessments have become one of the fastest growing sectors in both non-academic and academic establishments. Successful computer based assessments require security against impersonation and fraud, and many researchers have proposed the use of biometric technologies to overcome this issue. Biometric technologies are defined as computerised methods of authenticating an individual based on behavioural and physiological characteristics. Basic biometric computer based assessment systems are prone to security threats in the form of fraud and impersonation. In a bid to combat these security problems, the keystroke dynamics technique and facial biometric recognition were introduced into the computer based assessment biometric system so as to enhance its authentication ability. The keystroke dynamics technique was measured using latency and pressure, while the facial biometrics was measured using principal component analysis (PCA). Experimental performance was evaluated quantitatively using MATLAB for simulation and the Excel application package for data analysis. System performance was measured using the following evaluation schemes: False Acceptance Rate (FAR), False Rejection Rate (FRR), Equal Error Rate (EER), and Accuracy (AC), comparing the biometric computer based assessment system with and without keystroke and face recognition, alongside other biometric computer based assessment techniques proposed in the literature. Successful implementation of the proposed technique would improve the reliability, efficiency, and effectiveness of computer based assessment and, if deployed into society, would improve authentication and security whilst reducing fraud and impersonation.
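The reported FAR/FRR/EER metrics follow from sweeping a decision threshold over genuine and impostor match scores, as in this sketch with simulated score distributions.

```python
# Sketch: deriving FAR, FRR, and EER from genuine and impostor match
# scores by sweeping a decision threshold. Scores are simulated.
import numpy as np

rng = np.random.default_rng(7)
genuine = rng.normal(0.75, 0.10, 500)    # scores of true users
impostor = rng.normal(0.45, 0.10, 500)   # scores of impostors

thresholds = np.linspace(0, 1, 1001)
far = np.array([(impostor >= t).mean() for t in thresholds])  # false accepts
frr = np.array([(genuine < t).mean() for t in thresholds])    # false rejects

i = np.argmin(np.abs(far - frr))         # threshold where FAR ~= FRR
print(f"EER ~ {(far[i] + frr[i]) / 2:.3%} at threshold {thresholds[i]:.2f}")
```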
729 |
THEORY OF AUTOMATICITY IN CONSTRUCTION. Ikechukwu Sylvester Onuchukwu (17469117), 30 November 2023
Automaticity, an essential attribute of skill, develops when a task is executed repeatedly with minimal attention, and it can have both good (e.g., productivity, skill acquisition) and bad (e.g., accident involvement) implications for workers' performance. However, the implications of automaticity in construction are unknown despite their significance. To address this knowledge gap, this research examined methods indicative of the development of automaticity on construction sites and its implications for construction safety and productivity. The objectives of the dissertation are: 1) examining the development of automaticity during the repetitive execution of a primary roofing-construction task and a concurrent secondary task (a computer-generated audio-spatial processing task) used to measure attentional resources; 2) using eye-tracking metrics to distinguish between automatic and nonautomatic subjects and to determine the significant factors contributing to the odds of automatic behavior; 3) determining which personal characteristics (such as personality traits and mindfulness dimensions) better explain the variability in workers' attention while developing automaticity. To achieve these objectives, 28 subjects were recruited for a longitudinal laboratory study involving 22 repetitive sessions of a simulated roofing task over one month. The task involved installing 17 pieces of 25 ft² shingles on a low-sloped roof model 8 ft wide, 8 ft long, and 4 ft high. The collected data were analyzed using multiple statistical and data mining techniques, such as repeated measures analysis of variance (RM-ANOVA), pairwise comparisons, principal component analysis (PCA), support vector machine (SVM), binary logistic regression (BLR), relative weight analysis (RWA), and advanced bootstrapping techniques. First, the findings showed that as the experiment progressed, the mean automatic performance measures, such as mean primary task duration, mean primary task accuracy, and mean secondary task score, improved significantly over the repeated measurements (p-value < 0.05). These findings demonstrate that automaticity develops during repetitive construction activities, because these performance measures provide an index for assessing feature-based changes synonymous with automaticity development. Second, the study successfully used supervised machine learning methods, including SVM, to classify subjects (with an accuracy of 76.8%) into automatic and nonautomatic states based on their eye-tracking data. BLR was also used to estimate the probability of exhibiting automaticity from eye-tracking metrics and to ascertain the variables contributing significantly to it. Eye-tracking variables collected toward the safety harness and anchor, hammer, and work area areas of interest (AOIs) were found to be significant predictors (p < 0.05) of the probability of exhibiting automatic behavior. Third, the results revealed that higher levels of agreeableness significantly increase the change in attention to productivity-related cues during automatic behavior, while higher levels of nonreactivity to inner experience significantly reduce the changes in attention to safety-related AOIs while developing automaticity. The findings can be used by practitioners to better understand the positive and negative consequences of developing automaticity, measure workers' performance more accurately, assess training effectiveness, and personalize learning for workers. In the long term, the findings will also aid in improving human-AI teaming, since the AI will better understand the cognitive state of its human counterpart and can adapt to him or her more precisely.
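The BLR step can be sketched as follows, with hypothetical eye-tracking features (fixation counts on assumed AOIs) standing in for the dissertation's variables.

```python
# Sketch of the dissertation's BLR step: estimate the probability of
# automatic behaviour from eye-tracking metrics. Feature names and data
# are hypothetical placeholders, not the study's actual variables.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 200
df = pd.DataFrame({
    "fix_harness_anchor": rng.poisson(5, n),   # fixations on safety AOI
    "fix_hammer": rng.poisson(8, n),
    "fix_work_area": rng.poisson(12, n),
})
# Simulated label: fewer safety fixations -> more likely automatic.
logit_p = 1.0 - 0.3 * df["fix_harness_anchor"] + 0.1 * df["fix_work_area"]
y = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

model = sm.Logit(y, sm.add_constant(df)).fit(disp=0)
print(model.summary2().tables[1])        # coefficients and p-values
```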
730 |
VISUAL ANALYTICS OF BIG DATA FROM MOLECULAR DYNAMICS SIMULATION. Catherine Jenifer Rajam Rajendran (5931113), 03 February 2023
Protein malfunction can cause human diseases, which makes proteins targets in the process of drug discovery. In-depth knowledge of how a protein functions can contribute widely to understanding the mechanisms of these diseases. Protein functions are determined by protein structures and their dynamic properties. Protein dynamics refers to the constant physical movement of atoms in a protein, which may result in transitions between different conformational states of the protein. These conformational transitions are critically important for proteins to function. Understanding protein dynamics can help us understand and interfere with the conformational states and transitions, and thus with the function of the protein. If we can understand the mechanism of a protein's conformational transition, we can design molecules to regulate this process and thereby regulate protein function for new drug discovery. Protein dynamics can be simulated by molecular dynamics (MD) simulations.

The MD simulation data generated are spatio-temporal and therefore very high dimensional. To analyze the data, distinguishing the various atomic interactions within a protein by interpreting their 3D coordinate values plays a significant role. Since the data are enormous, the essential step is to find ways to interpret them, by designing more efficient algorithms to reduce dimensionality and by developing user-friendly visualization tools to find patterns and trends that are not usually attainable by traditional methods of data processing. Given the typically allosteric, long-range nature of the interactions that lead to large conformational transitions, pinpointing the underlying forces and pathways responsible for a global conformational transition at the atomic level is very challenging. To address these problems, various analytical techniques were performed on the simulation data to better understand the mechanism of protein dynamics at the atomic level, through a new program called Probing Long-distance Interactions by Tapping into Paired-Distances (PLITIP), which contains a set of new tools based on the analysis of paired distances. Paired distances remove the interference of the translation and rotation of the protein itself and can therefore capture the absolute changes within the protein.

Firstly, we developed a tool called Decomposition of Paired Distances (DPD). This tool generates a distance matrix of all paired residues from our simulation data. This paired-distance matrix is not subject to the interference of the translation or rotation of the protein and can capture the absolute changes within it. The matrix is then decomposed by DPD using principal component analysis (PCA) to reduce dimensionality and to capture the largest structural variation. To showcase how DPD works, two protein systems were analyzed, HIV-1 protease and 14-3-3σ, both of which display tremendous structural changes and conformational transitions in their MD simulation trajectories. In both cases, the largest structural variation and conformational transition were captured by the first principal component. In addition, structural clustering and ranking of representative frames by their PC1 values revealed the long-distance nature of the conformational transition and locked onto the key candidate regions that might be responsible for the large conformational transitions.

Secondly, to facilitate identification of the long-distance path, a tool called the Pearson Coefficient Spiral (PCP) was developed, which generates and visualizes Pearson coefficients measuring the linear correlation between any two residue pairs. PCP allows users to fix one residue pair and examine the correlation of its change with other residue pairs.

Thirdly, a set of visualization tools was developed that generates paired atomic distances for the shortlisted candidate residues and captures significant interactions among them. The first tool is the Residue Interaction Network Graph for Paired Atomic Distances (NG-PAD), which not only generates paired atomic distances for the shortlisted candidate residues but also displays significant interactions in a network graph for convenient visualization. Second, the Chord Diagram for Interaction Mapping (CD-IP) was developed to map the interactions to protein secondary structural elements and to further narrow down important interactions. Third, the Distance Plotting for Direct Comparison (DP-DC) tool plots any two paired distances of the user's choice, at either residue or atomic level, to facilitate identification of similar or opposite patterns of distance change over the simulation time. All the above PLITIP tools enabled us to identify critical residues contributing to the large conformational transitions in both the HIV-1 protease and 14-3-3σ proteins.

Besides the above major project, a side project on developing tools to study protein pseudo-symmetry is also reported. It has been proposed that symmetry provides protein stability, opportunities for allosteric regulation, and even functionality. This tool helps us answer the questions of why there is a deviation from perfect symmetry in proteins and how to quantify it.
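The core DPD idea translates into a short sketch: compute all paired residue distances per frame, which removes global rotation and translation, then run PCA on the distance vectors. Coordinates here are random placeholders for a real MD trajectory.

```python
# Sketch of the DPD idea: per-frame paired residue distances (invariant to
# global rotation/translation), then PCA to capture the largest structural
# variation. Random coordinates stand in for a real MD trajectory.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(9)
n_frames, n_res = 100, 50
traj = rng.normal(size=(n_frames, n_res, 3))       # hypothetical C-alpha coords

iu = np.triu_indices(n_res, k=1)                   # all residue pairs
def paired_distances(frame):
    d = np.linalg.norm(frame[:, None, :] - frame[None, :, :], axis=-1)
    return d[iu]                                   # flatten upper triangle

D = np.array([paired_distances(f) for f in traj])  # frames x pairs matrix
pca = PCA(n_components=2).fit(D)
pc1 = pca.transform(D)[:, 0]                       # ranks frames along the transition

# Pairs loading most strongly on PC1 point to residues driving the transition.
top_pairs = np.argsort(np.abs(pca.components_[0]))[::-1][:5]
print([(iu[0][k], iu[1][k]) for k in top_pairs])
```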