141

IRIG 106 Chapter 10 vs. iNET Packetization: Data Storage and Retrieval

Jones, Charles H. 10 1900 (has links)
ITC/USA 2012 Conference Proceedings / The Forty-Eighth Annual International Telemetering Conference and Technical Exhibition / October 22-25, 2012 / Town and Country Resort & Convention Center, San Diego, California / The approach to recording data during Test & Evaluation has evolved dramatically over the decades. A simple, traditional approach is to pull all data into a PCM format and record that. A common current approach is to record data in an IRIG 106 Chapter 10-compliant format, which records different forms of data (bus, discrete, video, etc.) in different channels of the recorder or exported data file. With network telemetry on the horizon, in the form of the integrated Network Enhanced Telemetry (iNET) standards, much of the data will be transported in iNET messages via Ethernet frames. These messages can potentially carry any type of data from any source. How do we record this data? Ultimately, no matter how the data is stored, it must be translated into a form that can be used for data analysis. Data storage forms that are conducive to this analysis are not necessarily the same as those conducive to real-time recording. This paper discusses the options and tradeoffs of different approaches to incorporating iNET data structures into the existing T&E architecture.
142

Light Curves of Type Ia Supernovae and Preliminary Cosmological Constraints from the ESSENCE Survey

Narayan, Gautham Siddharth 30 September 2013 (has links)
The ESSENCE survey discovered 213 type Ia supernovae at redshifts 0.10 < z < 0.81 between 2002 and 2008. We present their R- and I-band light-curve measurements, obtained using the MOSAIC II imager at the CTIO 4 m, along with rapid-response spectroscopy for each object from a range of large-aperture, ground-based telescopes. We detail our program to obtain quantitative classifications and precise redshifts from our spectroscopic follow-up of each object. We describe our efforts to improve the precision of the calibration of the CTIO 4 m natural photometric system. We use several empirical metrics to measure our internal photometric consistency and our absolute calibration of the survey. We assess the effect of various sources of systematic error on our measured fluxes, and estimate that the total systematic error budget from the photometric calibration is ~1%. We combine 108 ESSENCE SNIa that pass stringent quality cuts with a compilation of 441 SNIa from the three-year results of the Supernova Legacy Survey, together with Baryon Acoustic Oscillation measurements from the Sloan Digital Sky Survey, to produce preliminary cosmological constraints from the SNIa. This constitutes the largest sample of well-calibrated, spectroscopically confirmed SNIa to date. Assuming a flat Universe, we obtain a joint constraint of \(\Omega_M = 0.266^{+0.026}_{-0.016}\,(\mathrm{stat},\,1\sigma)\) and \(w = -1.112^{+0.069}_{-0.072}\,(\mathrm{stat},\,1\sigma)\). These measurements are consistent with a cosmological constant. / Physics
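As a rough illustration of the kind of flat-Universe constraint quoted above, the sketch below evaluates distance moduli for a flat wCDM model with constant w and forms a simple chi-square against a handful of hypothetical supernova measurements (not ESSENCE data); the survey's actual analysis additionally involves light-curve fitting and a full systematic-error treatment.

```python
import numpy as np
from scipy.integrate import quad

C_KM_S = 299792.458   # speed of light [km/s]
H0 = 70.0             # Hubble constant [km/s/Mpc], illustrative value

def E(z, omega_m, w):
    """Dimensionless expansion rate for flat wCDM with constant w."""
    return np.sqrt(omega_m * (1 + z)**3 + (1 - omega_m) * (1 + z)**(3 * (1 + w)))

def distance_modulus(z, omega_m, w):
    """mu = 5 log10(d_L / 10 pc) for a flat Universe."""
    integral, _ = quad(lambda zp: 1.0 / E(zp, omega_m, w), 0.0, z)
    d_l_mpc = (1 + z) * (C_KM_S / H0) * integral   # luminosity distance [Mpc]
    return 5.0 * np.log10(d_l_mpc) + 25.0

def chi2(omega_m, w, z_obs, mu_obs, mu_err):
    """Chi-square of observed distance moduli against the model."""
    mu_model = np.array([distance_modulus(z, omega_m, w) for z in z_obs])
    return np.sum(((mu_obs - mu_model) / mu_err) ** 2)

# Hypothetical SNIa measurements spanning the survey's redshift range.
z_obs  = np.array([0.10, 0.35, 0.62, 0.81])
mu_obs = np.array([38.3, 41.1, 42.7, 43.4])
mu_err = np.array([0.15, 0.18, 0.20, 0.22])

print(chi2(0.266, -1.112, z_obs, mu_obs, mu_err))
```

Minimizing this chi-square over a grid in \(\Omega_M\) and w would reproduce the general shape, though not the values, of the constraints quoted above.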
143

Modelling multivariate interval-censored and left-truncated survival data using proportional hazards model

Cheung, Tak-lun, Alan, 張德麟 January 2003 (has links)
published_or_final_version / abstract / toc / Statistics and Actuarial Science / Master / Master of Philosophy
144

Analysis of financial data using data mining techniques

Ζαβουδάκης, Γεώργιος 19 May 2015 (has links)
With the great surge in technological development, the volume of data and information today is enormous and will only grow in the years ahead. We clearly live in the information society, in which the transformation of data into information must in turn lead to the transformation of information into knowledge. This has created the need to process these data and turn them into useful information that supports decision making. Mining techniques are an important tool for drawing knowledge from large volumes of data, and when combined with statistical methods they make information retrieval straightforward. The convergence of disparate scientific fields such as statistics, machine learning, information theory, and computation has created a new discipline with powerful tools. This discipline is called Data Mining (DM) and is part of the process of Knowledge Discovery in Databases (KDD). The tools of DM are its algorithms, which attempt to find useful and understandable patterns in data. The main objective of this thesis is to collect the basic algorithms and methods that select and clean data, recognize patterns, optimize a management system, and cluster data, with emphasis on algorithms suitable for financial time series. In addition to reviewing the methods and applications of data mining and KDD, we apply clustering techniques to a data set comprising financial data from three different categories: prices of large-capitalization stocks in the Nasdaq index, the Euro/US dollar exchange rate over time, and the evolution of the price of oil per barrel on international markets. The thesis is divided into five chapters: introduction, theoretical background, methodology, implementation of a practical application, and conclusions. Chapter 1 gives a first acquaintance with knowledge discovery from data; Chapter 2 reviews the literature and presents in detail the theoretical background of the methods used; Chapter 3 presents the methodologies (mining methods for clustering, classification, and prediction) used in the study; the following chapter presents a practical application of the above and the results of these methodologies; and Chapter 5 draws conclusions from the practical application. The thesis aims to highlight the relationship between economics and artificial intelligence, focusing mainly on the extent to which the latter can provide solutions to key issues, problems, and challenges in the modern economic environment. The means to this end are data mining techniques. Sources for this work include many scientific books on economics, finance, artificial intelligence and data mining methods, multicriteria classification techniques, and statistics. The result of combining the above is presented in the pages that follow.
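As a minimal sketch of the clustering step described above (not the thesis's exact preprocessing or algorithm choices), the snippet below converts hypothetical price series, standing in for the Nasdaq stocks, the EUR/USD rate, and the oil price, into daily log-returns and groups the trading days with k-means.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Hypothetical price series (rows = days, columns = instruments); placeholders
# for the Nasdaq large-cap stocks, EUR/USD rate, and oil price per barrel.
rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(0, 1, size=(250, 3)), axis=0) + 100.0

# Convert prices to daily log-returns, so each day is described by its returns.
returns = np.diff(np.log(prices), axis=0)

# Standardize the return vectors and cluster the trading days into k groups.
X = StandardScaler().fit_transform(returns)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

print(np.bincount(labels))  # size of each cluster of trading days
```

In practice the number of clusters and the feature representation (raw returns, volatility windows, technical indicators) would be chosen by the analyst; the point here is only the shape of the pipeline.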
145

Predictive Gaussian Classification of Functional MRI Data

Yourganov, Grigori 14 January 2014 (has links)
This thesis presents an evaluation of algorithms for classification of functional MRI data. We evaluated the performance of probabilistic classifiers that use a Gaussian model against a popular non-probabilistic classifier (support vector machine, SVM). A pool of classifiers consisting of linear and quadratic discriminants, linear and non-linear Gaussian Naive Bayes (GNB) classifiers, and linear SVM was evaluated on several sets of real and simulated fMRI data. Performance was measured using two complementary metrics: accuracy of classification of fMRI volumes within a subject, and reproducibility of within-subject spatial maps; both metrics were computed using split-half resampling. Regularization parameters of multivariate methods were tuned to optimize the out-of-sample classification and/or within-subject map reproducibility. SVM showed no advantage in classification accuracy over Gaussian classifiers. Performance of SVM was matched by the linear discriminant, and at times outperformed by the quadratic discriminant or nonlinear GNB. Among all tested methods, linear and quadratic discriminants regularized with principal components analysis (PCA) produced the spatial maps with the highest within-subject reproducibility. We also demonstrated that the number of principal components that optimizes the performance of linear/quadratic discriminants is sensitive to the mean magnitude, variability and connectivity of the simulated active signal. In real fMRI data, this number is correlated with behavioural measures of post-stroke recovery and, in a separate study, with behavioural measures of self-control. Using the data from a study of cognitive aspects of aging, we accurately predicted the age group of the subject from within-subject spatial maps created by our pool of classifiers. We examined the cortical areas that showed differences in recruitment in young versus older subjects; this difference was demonstrated to be driven primarily by more prominent recruitment of the task-positive network in older subjects. We conclude that linear and quadratic discriminants with PCA regularization are well-suited for fMRI data classification, particularly for within-subject analysis.
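The split-half procedure described above can be sketched as follows; this is an illustrative reading of the two metrics rather than the authors' pipeline, and the data are simulated placeholders. Each half of the data trains a PCA-regularized linear discriminant, accuracy is scored on the opposite half, and the correlation between the two back-projected weight maps serves as the reproducibility measure.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

def split_half_metrics(X, y, n_components=10, seed=0):
    """One split-half resample: mean accuracy and spatial-map reproducibility.

    X: (n_volumes, n_voxels) array of fMRI volumes; y: condition labels.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    halves = (idx[: len(y) // 2], idx[len(y) // 2:])

    maps, accs = [], []
    for train, test in (halves, halves[::-1]):
        clf = make_pipeline(PCA(n_components=n_components),
                            LinearDiscriminantAnalysis())
        clf.fit(X[train], y[train])
        accs.append(clf.score(X[test], y[test]))
        # Back-project discriminant weights to voxel space as a "spatial map".
        pca = clf.named_steps["pca"]
        lda = clf.named_steps["lineardiscriminantanalysis"]
        maps.append((lda.coef_ @ pca.components_).ravel())

    reproducibility = np.corrcoef(maps[0], maps[1])[0, 1]
    return np.mean(accs), reproducibility

# Simulated placeholder data: 80 volumes x 500 voxels, two conditions,
# with a weak "activation" added to the first 20 voxels of one condition.
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 500))
y = np.repeat([0, 1], 40)
X[y == 1, :20] += 0.5
print(split_half_metrics(X, y))
```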
146

Visual exploratory analysis of large data sets : evaluation and application

Lam, Heidi Lap Mun 11 1900 (has links)
Large data sets are difficult to analyze. Visualization has been proposed to assist exploratory data analysis (EDA), as our visual systems can process signals in parallel to quickly detect patterns. Nonetheless, designing an effective visual analytic tool remains a challenge. This challenge is partly due to our incomplete understanding of how common visualization techniques are used by human operators during analyses, either in laboratory settings or in the workplace. This thesis aims to further understand how visualizations can be used to support EDA. More specifically, we studied techniques that display multiple levels of visual information resolutions (VIRs) for analyses, using a range of methods. The first study is a summary synthesis conducted to obtain a snapshot of knowledge in multiple-VIR use and to identify research questions for the thesis: (1) low-VIR use and creation; (2) spatial arrangements of VIRs. The next two studies are laboratory studies investigating the visual memory cost of image transformations frequently used to create low-VIR displays, and overview use with single-level data displayed in multiple-VIR interfaces. For a more well-rounded evaluation, we needed to study these techniques in ecologically valid settings. We therefore selected the application domain of web session log analysis and applied the knowledge from our first three evaluations to build a tool called Session Viewer. Taking the multiple coordinated view and overview + detail approaches, Session Viewer displays multiple levels of web session log data and multiple views of session populations to facilitate analysis from high-level statistical summaries down to low-level detailed session inspection. Our fourth and last study for this thesis is a field evaluation conducted at Google Inc. with seven session analysts using Session Viewer to analyze their own data with their own tasks. Study observations suggested that displaying web session logs at multiple levels using the overview + detail technique helped bridge high-level statistical and low-level detailed session analyses, and that the simultaneous display of multiple session populations at all data levels using multiple views allowed quick comparisons between session populations. We also identified design and deployment considerations to meet the needs of diverse data sources and analysis styles.
147

Sports Supplements and Risk: Perceptions of Young Male Supplement Users

Bowman, Carolyn 26 August 2011 (has links)
The purpose of this study was to describe the experience of using sports supplements from a risk theory perspective. Thematic analysis was used to conduct a secondary analysis of 18 interviews conducted with young men who were interested in supplements. Participants were recruited from Guelph-area commercial gyms and campus athletic centres. Participants used supplements because they worked out and wanted to gain muscle. Supplements, and especially protein, were part of the common knowledge shared among people who worked out. Participants judged whether supplements were ‘worth it’ by weighing their cost, efficacy, and safety. Participants altered their behaviour in response to their perception of the riskiness of supplements, in order to feel safe. Many participants valued information from health professionals but found it lacking. Most information was available from sources that participants did not feel were credible.
148

A New Reclassification Method for Highly Uncertain Microarray Data in Allergy Gene Prediction

Paul, Jasmin 11 April 2012 (has links)
The analysis of microarray data is a challenging task because of the large dimensionality and small sample size involved. Although a few methods are available to address the problem of small sample size, they are not sufficiently successful in dealing with microarray data from extremely small (fewer than about 20) sample sizes. We propose a method that incorporates information from diverse sources into the analysis of microarray data so as to improve the predictability of significant genes. A transformed data set, including statistical parameters, literature-mining and gene-ontology data, is evaluated. We performed classification experiments to identify potential allergy-related genes. Feature selection was used to identify the effect of features on classifier behaviour. An exploratory and domain-knowledge analysis was performed on noisy real-life allergy data, and a subset of genes was selected as the positive and negative classes. A new set of transformed variables, depending on the mean and standard deviation of the data distribution and on other data sources, was identified. Significant allergy- and immune-related genes were selected from the microarray data. Experiments showed that the classification predictability of significant genes can be improved. Important features from the transformed variable set were also identified.
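A minimal sketch of the transformed-variable idea might look like the following; the expression matrix, the external evidence score (standing in for literature-mining and gene-ontology features), and the labels are synthetic placeholders, and the classifier choice is illustrative rather than the one used in the thesis.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic placeholders: 200 genes measured across 12 samples (samples << genes),
# plus a made-up per-gene "external evidence" score and allergy-gene labels.
rng = np.random.default_rng(0)
expr = rng.normal(size=(200, 12))
evidence = rng.uniform(size=200)
labels = (rng.uniform(size=200) < 0.3).astype(int)   # 1 = putative allergy gene

# Transformed variables per gene: distribution statistics plus external evidence.
features = np.column_stack([expr.mean(axis=1), expr.std(axis=1), evidence])

# Cross-validated classification of genes into the positive/negative classes.
clf = LogisticRegression(max_iter=1000)
print(cross_val_score(clf, features, labels, cv=5).mean())
```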
150

Functional Chemometrics: Automated Spectral Smoothing with Spatially Adaptive Splines

Fernandes, Philip Manuel 02 October 2012 (has links)
Functional data analysis (FDA) is a demonstrably effective, practical, and powerful method of data analysis, yet it remains virtually unheard of outside academic circles and has almost no exposure in industry. FDA adds to the milieu of statistical methods by treating functions of one or more independent variables as data objects, analogous to the way in which discrete points are the data objects of conventional statistics. The first step in functional analysis is to “functionalize” the data, that is, convert discrete points into a system represented, most often, by continuous functions. Choosing the type of functions to use is data-dependent and often straightforward – for example, Fourier series lend themselves well to periodic systems, while splines offer great flexibility in approximating more irregular trends, such as chemical spectra. This work explores how B-splines can be rapidly and reliably used to denoise infrared chemical spectra, a difficult problem not only because of the many parameters involved in generating a spline fit, but also because of the disparate nature of spectra in terms of shape and noise intensity. Automated selection of spline parameters is required to support high-throughput analysis, and the heteroscedastic nature of such spectra presents challenges for existing techniques. The heuristic knot placement algorithm of Li et al. (2005) for 1D object contours is extended to spectral fitting by optimizing the denoising step for a range of spectral types and signal/noise ratios, using the following criteria: robustness to types of spectra and noise conditions, parsimony of knots, low computational demand, and ease of implementation in high-throughput settings. Pareto-optimal filter configurations are determined using simulated data from factorial experimental designs. The improved heuristic algorithm uses wavelet transforms and provides improved performance in robustness, parsimony of knots, and the quality of functional regression models used to correlate real spectral data with chemical composition. In practical applications, functional principal component regression models yielded similar or significantly improved results when compared with their discrete partial least squares counterparts. / Thesis (Master, Chemical Engineering) -- Queen's University, 2012-10-01 20:18:31.119
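For orientation, a fixed-smoothing cubic B-spline fit to a synthetic heteroscedastic "spectrum" can be written in a few lines with SciPy; the hand-chosen smoothing factor here stands in for the thesis's automated, spatially adaptive knot-placement heuristic, and the peak positions and noise model are arbitrary.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Synthetic "spectrum": two Gaussian bands on a flat baseline, with noise whose
# level changes across the axis (heteroscedastic, as the abstract notes).
rng = np.random.default_rng(0)
x = np.linspace(1000, 1800, 800)                    # wavenumber axis [1/cm]
clean = (np.exp(-0.5 * ((x - 1250) / 15) ** 2)
         + 0.6 * np.exp(-0.5 * ((x - 1650) / 40) ** 2))
noise_level = 0.02 + 0.03 * (x > 1500)
y = clean + rng.normal(0.0, noise_level)

# Fixed-smoothing cubic B-spline fit (s bounds the residual sum of squares).
spline = UnivariateSpline(x, y, k=3, s=np.sum(noise_level ** 2))
denoised = spline(x)

print(np.sqrt(np.mean((denoised - clean) ** 2)))    # RMS error vs. the truth
```

Setting s near the expected sum of squared noise is a reasonable default for a global fit; a spatially adaptive approach instead concentrates knots where the spectrum changes quickly, keeping detail in narrow peaks while smoothing flat regions more aggressively.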
