31

Improving the Performance of a Hybrid Classification Method Using a Parallel Algorithm and a Novel Data Reduction Technique

Phillips, Rhonda D. 21 August 2007 (has links)
This thesis presents both a shared memory parallel version of the hybrid classification algorithm IGSCR (iterative guided spectral class rejection) and a novel data reduction technique that can be used in conjunction with pIGSCR (parallel IGSCR). The parallel algorithm is motivated by a demonstrated need for more computing power, driven by the increasing size of remote sensing datasets due to higher resolution sensors, larger study regions, and the like. Even with a fast algorithm such as pIGSCR, reducing the dimension of a dataset is desirable in order to decrease processing time further and possibly improve overall classification accuracy. pIGSCR was developed to produce fast and portable code using Fortran 95, OpenMP, and the Hierarchical Data Format version 5 (HDF5) and its accompanying data access library. The applicability of the faster pIGSCR algorithm is demonstrated by classifying Landsat data covering most of Virginia, USA, into forest and non-forest classes with approximately 90 percent accuracy. Parallel results are given for the SGI Altix 3300 shared memory computer and the SGI Altix 3700, with as many as 64 processors reaching speedups of almost 77. This fast algorithm allows an analyst to perform and assess multiple classifications to refine parameters. As an example, pIGSCR was used for a factorial analysis consisting of 42 classifications of a 1.2 gigabyte image to select the number of initial classes (70) and class purity (70%) used for the remaining two images. A feature selection or reduction method may be appropriate for a specific classification method depending on the properties and training required for that method, or an alternative band selection method may be derived from the classification method itself. This thesis introduces a feature reduction method based on the singular value decomposition (SVD). This feature reduction technique was applied to training data from two multitemporal datasets of Landsat TM/ETM+ imagery acquired over forested areas in Virginia, USA, and Rondonia, Brazil. Subsequent parallel iterative guided spectral class rejection (pIGSCR) forest/non-forest classifications were performed to determine the quality of the feature reduction. The classifications of the Virginia data were five times faster using SVD-based feature reduction, without affecting classification accuracy. Feature reduction using the SVD was also compared to feature reduction using principal components analysis (PCA). The highest average accuracies for the Virginia dataset (88.34%) and for the Amazon dataset (93.31%) were achieved using the SVD. The results presented here indicate that SVD-based feature reduction can produce statistically significantly better classifications than PCA. / Master of Science
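For readers wanting a concrete picture of SVD-based feature reduction of the kind described above, the sketch below projects training and image pixels onto the leading right singular vectors of the training matrix. It is a minimal Python/NumPy illustration under stated assumptions (random stand-in data, the hypothetical function name svd_feature_reduction, no mean-centring), not the thesis's own code.

```python
import numpy as np

def svd_feature_reduction(X_train, X_full, k):
    """Project spectral feature vectors onto the top-k right singular
    vectors of the training matrix (a sketch of SVD-based feature
    reduction; unlike PCA, the data are not mean-centred here).

    X_train : (n_train, n_bands) training pixels
    X_full  : (n_pixels, n_bands) image pixels to classify
    k       : reduced dimensionality
    """
    # Thin SVD of the training data: X_train = U @ diag(s) @ Vt
    _, _, Vt = np.linalg.svd(X_train, full_matrices=False)
    basis = Vt[:k].T                     # (n_bands, k) projection basis
    return X_train @ basis, X_full @ basis

# Hypothetical usage on random stand-in data (real inputs would be
# multitemporal Landsat TM/ETM+ band values).
rng = np.random.default_rng(0)
Xtr = rng.normal(size=(500, 12))         # 500 training pixels, 12 bands
Xim = rng.normal(size=(10000, 12))       # image pixels to classify
Xtr_red, Xim_red = svd_feature_reduction(Xtr, Xim, k=4)
print(Xtr_red.shape, Xim_red.shape)      # (500, 4) (10000, 4)
```

The reduced vectors would then feed the pIGSCR classifier in place of the original bands; the choice of k and the exact projection used in the thesis are not reproduced here.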
32

Design, implementation, and evaluation of node placement and data reduction algorithms for large scale wireless networks

Mehta, Hardik 01 December 2003 (has links)
No description available.
33

Dimensionality Reduction for Commercial Vehicle Fleet Monitoring

Baldiwala, Aliakbar 25 October 2018 (has links)
A variety of new features have been added to present-day vehicles, such as pre-crash warning, vehicle-to-vehicle communication, semi-autonomous driving systems, telematics, and drive-by-wire, and they demand very high bandwidth from in-vehicle networks. The various electronic control units inside a vehicle exchange useful information via automotive multiplexing, which allows information to be shared among the intelligent modules of an automotive electronic system. Optimum functionality is achieved by transmitting this data in real time. The high-bandwidth, high-speed requirement can be met either by using multiple buses or by implementing a higher-bandwidth bus, but doing so increases the cost of the network and the complexity of the wiring in the vehicle. Another option is to implement a higher-layer protocol that reduces the amount of data transferred by using data reduction (DR) techniques, thus reducing bandwidth usage; the implementation cost is minimal because changes are required only in software, not in hardware. In this work, we present a new data reduction algorithm termed the Comprehensive Data Reduction (CDR) algorithm. The proposed algorithm is used to minimize the bus utilization of the CAN bus for a future vehicle. The reduction in bus load is achieved by compressing the parameters, so that more messages, including lower-priority messages, can be sent efficiently on the CAN bus. The work also presents a performance analysis of the proposed algorithm against the boundary-of-fifteen compression and compression area selection algorithms (existing data reduction algorithms). The results of the analysis show that the proposed CDR algorithm provides better data reduction than the earlier algorithms, with promising results in terms of reduction in bus utilization, compression efficiency, and percent peak load of the CAN bus. This reduction in bus utilization permits a larger number of network nodes (ECUs) to be used in the existing system without increasing its overall cost. The proposed algorithm was developed for the automotive environment, but it can also be used in any application where extensive information is exchanged among control units over a multiplexing bus.
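As a hedged illustration of the general idea of signal-level data reduction on a CAN-style bus (not the CDR algorithm itself, whose encoding the abstract does not specify), the sketch below transmits only the signals that changed since the last frame, as (index, delta) pairs; the signal layout and values are hypothetical.

```python
# Minimal sketch of compression-based bus-load reduction: instead of
# retransmitting full 8-byte frames, only signals whose values changed
# since the last transmitted frame are sent, each as a (signal index,
# delta) pair. This is a generic illustration, not the thesis's CDR
# algorithm.

def reduce_frame(prev, curr):
    """Return a compact update: list of (index, delta) for changed signals."""
    return [(i, c - p) for i, (p, c) in enumerate(zip(prev, curr)) if c != p]

def apply_update(prev, update):
    """Reconstruct the full signal vector on the receiving node."""
    frame = list(prev)
    for i, delta in update:
        frame[i] += delta
    return frame

# Hypothetical usage: 8 one-byte signals in a CAN data field.
previous = [120, 64, 0, 255, 17, 17, 3, 90]
current  = [121, 64, 0, 255, 18, 17, 3, 90]
update = apply = reduce_frame(previous, current)   # [(0, 1), (4, 1)]
assert apply_update(previous, update) == current
print(update)
```

When most signals vary slowly, most frames shrink to a few bytes, which is the effect the compression approaches discussed above exploit to lower CAN bus utilization.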
34

High performance latent Dirichlet allocation for text mining

Liu, Zelong January 2013 (has links)
Latent Dirichlet Allocation (LDA), a generative probabilistic model, is a three-level Bayesian model. LDA computes the latent topic structure of the data and captures the significant information in documents. However, traditional LDA has several limitations in practical applications. LDA cannot be used directly for classification because it is an unsupervised learning model; it needs to be embedded into appropriate classification algorithms. As a generative model, LDA may also generate latent topics in categories to which the target documents do not belong, introducing deviations in computation and reducing classification accuracy. The number of topics in LDA greatly influences the learning of the model parameters, and noise samples in the training data affect the final text classification result. Moreover, the quality of LDA-based classifiers depends to a great extent on the quality of the training samples. Although parallel LDA algorithms have been proposed to deal with huge amounts of data, balancing computing loads in a computer cluster poses another challenge. This thesis presents a text classification method that combines the LDA model and the Support Vector Machine (SVM) classification algorithm for improved classification accuracy while reducing the dimension of the datasets. Based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN), the algorithm automatically optimizes the number of topics to select, which reduces the number of iterations in computation. Furthermore, this thesis presents a noise data reduction scheme to process noisy data; even when the noise ratio in the training data set is large, the scheme consistently produces a high level of classification accuracy. Finally, the thesis parallelizes LDA using the MapReduce model, the de facto computing standard for supporting data-intensive applications. A genetic-algorithm-based load balancing algorithm is designed to balance the workloads among computers in a heterogeneous MapReduce cluster, where the computers have a variety of computing resources in terms of CPU speed, memory space, and hard disk space.
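To make the LDA-plus-SVM idea concrete, the sketch below reduces documents to topic vectors with LDA and trains an SVM on those vectors, using scikit-learn and a toy corpus. It is a minimal sketch under assumptions (toy documents, two topics, default hyperparameters); the thesis's DBSCAN-based topic-number selection, noise-reduction scheme, and MapReduce parallelization are not shown.

```python
# LDA as dimensionality reduction feeding an SVM classifier (a sketch,
# not the thesis's implementation). The corpus, labels, and topic count
# below are placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

docs = [
    "the spacecraft entered orbit around the planet",
    "the rocket launch was delayed by weather",
    "the team scored a late goal to win the match",
    "the striker was injured during the football game",
]
labels = [0, 0, 1, 1]  # 0 = space, 1 = sport (toy labels)

clf = make_pipeline(
    CountVectorizer(),                                           # term counts
    LatentDirichletAllocation(n_components=2, random_state=0),   # topic vectors
    LinearSVC(),                                                 # SVM on reduced features
)
clf.fit(docs, labels)
print(clf.predict(["a goal in the final minute of the game"]))   # classify a new document
```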
35

IMPLEMENTING SOFTWARE PROCESS IMPROVEMENTS IN THE T&E COMMUNITY

Posey, Chlotia 10 1900 (has links)
International Telemetering Conference Proceedings / October 23-26, 2000 / Town & Country Hotel and Conference Center, San Diego, California / The Capability Maturity Model (CMM) developed by the Software Engineering Institute is widely promoted as a method to help decrease the volume of error-riddled and late software projects. Because of the projected benefits, the 96th Communications Group/SC (SC) at Eglin Air Force Base began an intensive software process improvement effort in late 1997. This effort was rewarded in September 1999 when the group achieved a CMM Level 2 software rating on its first attempt. As of December 1999, 68% of assessed organizations remained at Level 1 on their first or second assessment. The SC not only succeeded on its first attempt, but also did so 11 months ahead of the industry norm. The Level 2 rating was accomplished in the volatile environment needed to support the test and evaluation mission, an environment that includes frequent requirement changes, short-notice modifications, and externally driven schedules. One reason this milestone was possible is close and direct involvement by management. This paper presents additional factors in implementing a successful software process improvement effort.
36

A PC Database and GUI for Telemetry Data Reduction

Reinsmith, Lee, Surber, Steven 10 1900 (has links)
International Telemetering Conference Proceedings / October 25-28, 1999 / Riviera Hotel and Convention Center, Las Vegas, Nevada / The Telemetry Definition and Processing (TDAP II) application is a PC-based software tool that meets the varied needs - both now and into the 21st century - of instrumentation engineers, data analysts, test engineers, and project personnel in the Test and Evaluation (T&E) community. TDAP II uses state-of-the-art commercial software technology, including a Microsoft Access 97 database and a Microsoft Visual Basic Graphical User Interface (GUI) for users to view and navigate the database. Developed by the Test and Analysis Division of the 96th Communications Group for the tenants of the Air Armament Center (AAC), Eglin AFB, Florida, TDAP II provides a centralized repository for both aircraft and weapons instrumentation descriptions and telemetry EU conversion calibrations. Operating in a client/server environment, TDAP II can be used effectively on a small or large network, as well as on a classified or unclassified intranet or the Internet. This paper describes the components and design of this application, along with the operational flexibility and varied uses that result from the chosen commercial software technology.
37

Ground-based near-infrared remote sounding of ice giant clouds and methane

Tice, Dane Steven January 2014 (has links)
The ice giants, Uranus and Neptune, are the two outermost planets in our solar system. With only one satellite flyby each, in the late 1980s, the ice giants are arguably the least understood of the planets orbiting the Sun. A better understanding of these planets' atmospheres will not only help satisfy our natural scientific curiosity about these distant spheres of gas, but may also provide insight into the dynamics and meteorology of our own planet's atmosphere. Two new ground-based, near-infrared datasets of the ice giants are studied. Both datasets cover a portion of the electromagnetic spectrum that provides good constraints on the size of small scattering particles in the atmospheres' clouds and haze layers, and the broad spectral coverage of both telescopes allows these small particles to be characterised over a wide range of wavelengths. Both datasets also cover the 825 nm collision-induced hydrogen-absorption feature, allowing us to disentangle the latitudinal variation of CH4 abundance from the height and vertical extent of clouds in the upper troposphere. A two-cloud model is successfully fitted to IRTF SpeX Uranus data, parameterising both clouds with base altitude, fractional scale height, and total opacity. An optically thick, vertically thin cloud with a base pressure of 1.6 bar, tallest in the midlatitudes, shows a strong preference for scattering particles of 1.35 μm radius. Above this cloud lies an optically thin, vertically extended haze extending upward from 1.0 bar and consistent with particles of 0.10 μm radius. An equatorial enrichment of methane abundance and a lower cloud of constant vertical thickness were shown to exist using two independent methods of analysis. The second dataset comprises Palomar SWIFT observations of three different latitude regions.
38

An Empirical Approach to Evaluating Sufficient Similarity: Utilization of Euclidean Distance As A Similarity Measure

Marshall, Scott 27 May 2010 (has links)
Individuals are exposed to chemical mixtures while carrying out everyday tasks, with unknown risks associated with the exposure. Given the number of resulting mixtures, it is not economically feasible to identify or characterize all possible mixtures. When complete dose-response data are not available on a (candidate) mixture of concern, EPA guidelines define a similar mixture based on chemical composition, component proportions, and expert biological judgment (EPA, 1986, 2000). Current work in this literature is by Feder et al. (2009), who evaluate sufficient similarity in exposure to disinfection by-products of water purification using multivariate statistical techniques and traditional hypothesis testing. The work of Stork et al. (2008) introduced the idea of sufficient similarity in dose-response, making a connection between exposure and effect. They developed methods to evaluate sufficient similarity between a fully characterized reference mixture, with dose-response data available, and a candidate mixture with only mixing proportions available. A limitation of the approach is that the two mixtures must contain the same components. It is of interest to determine whether a fully characterized reference mixture (representative of the random process) is sufficiently similar in dose-response to a candidate mixture resulting from a random process. Four similarity measures based on Euclidean distance are developed to aid in the evaluation of sufficient similarity in dose-response, allowing mixtures to be subsets of each other. If a reference and a candidate mixture are concluded to be sufficiently similar in dose-response, inference about the candidate mixture can be based on the reference mixture. An example is presented demonstrating that the benchmark dose (BMD) of the reference mixture can be used as a surrogate measure of the BMD for the candidate mixture when the two mixtures are determined to be sufficiently similar in dose-response. Guidelines are developed that enable the researcher to evaluate the performance of the proposed similarity measures.
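As a hypothetical illustration of using Euclidean distance as a similarity measure between a reference and a candidate mixture, the sketch below compares mean responses on a shared dose grid against a user-chosen cut-off. The thesis develops four specific distance-based measures (allowing the mixtures' component sets to differ); none of them is reproduced exactly here, and the responses and threshold are placeholders.

```python
import numpy as np

def euclidean_similarity(ref_responses, cand_responses):
    """Euclidean distance between mean dose-response vectors on a shared dose grid."""
    ref = np.asarray(ref_responses, dtype=float)
    cand = np.asarray(cand_responses, dtype=float)
    return float(np.linalg.norm(ref - cand))

# Placeholder mean responses at five common doses.
reference = [0.02, 0.10, 0.31, 0.55, 0.78]
candidate = [0.03, 0.12, 0.28, 0.57, 0.74]

distance = euclidean_similarity(reference, candidate)
threshold = 0.10   # assumed cut-off for "sufficiently similar"
print(f"distance = {distance:.3f}")
print("sufficiently similar" if distance < threshold else "not sufficiently similar")
```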
39

Improving Prediction Accuracy for WSN Data Reduction by Applying Multivariate Spatio-Temporal Correlation

Carlos Giovanni Nunes de Carvalho 23 March 2012 (has links)
Fundação de Amparo à Pesquisa do Estado do Piauí / Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, sensor devices must run simple mechanisms because of their constrained resources, and these mechanisms may introduce unwanted errors and may not be very accurate. This work proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSNs). Simulations were carried out with simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher degree of correlation among the variables gathered in the field than between each of them and time, the independent variable widely used for prediction. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate. In addition, the proposed solution outperforms some current solutions by about 50% in humidity prediction and about 21% in light prediction.
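The sketch below illustrates the comparison described above: predicting one sensed variable (here, "humidity") from time alone (simple linear regression) versus from time plus other co-located readings (multiple linear regression). The synthetic data, variable names, and coefficients are placeholders, not measurements or results from the thesis.

```python
import numpy as np

# Synthetic stand-in sensor readings.
rng = np.random.default_rng(1)
t = np.arange(100, dtype=float)                                  # sampling instants
temperature = 25 + 0.02 * t + rng.normal(0, 0.3, 100)
light = 300 + 5 * np.sin(t / 10) + rng.normal(0, 10, 100)
humidity = 80 - 0.5 * temperature - 0.01 * light + rng.normal(0, 0.2, 100)

def fit_predict(X, y):
    """Ordinary least squares with an intercept column; returns fitted values."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

pred_simple = fit_predict(t.reshape(-1, 1), humidity)                      # time only
pred_multi = fit_predict(np.column_stack([t, temperature, light]), humidity)  # multivariate

rmse = lambda p: float(np.sqrt(np.mean((humidity - p) ** 2)))
print(f"simple regression RMSE:   {rmse(pred_simple):.3f}")
print(f"multiple regression RMSE: {rmse(pred_multi):.3f}")
```

On this toy data the multivariate fit yields the lower error, mirroring the qualitative finding reported in the abstract.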
40

A Contribution To Modern Data Reduction Techniques And Their Applications By Applied Mathematics And Statistical Learning

Sakarya, Hatice 01 January 2010 (has links) (PDF)
High-dimensional data arise in areas ranging from digital image processing, gene expression microarrays, and neuronal population activities to financial time series. Dimensionality reduction - extracting low-dimensional structure from high-dimensional data - is a key problem in areas such as information processing, machine learning, data mining, information retrieval, and pattern recognition, where various data reduction techniques are found. This thesis surveys modern data reduction techniques, representing the state of the art in theory, methods, and applications, expressed in the language of mathematics. This requires special care concerning questions such as how to understand discrete structures as manifolds, how to identify their structure in preparation for dimension reduction, and how to handle the complexity of the algorithmic methods. Special emphasis is placed on the Principal Component Analysis, Locally Linear Embedding, and Isomap algorithms. These algorithms have been studied by a research group from Vilnius, Lithuania, by Zeev Volkovich of the Software Engineering Department, ORT Braude College of Engineering, Karmiel, and by others. The main purpose of this study is to compare the results of the three algorithms, focusing on both the results and the running time.
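As a minimal illustration of running the three emphasised algorithms side by side, the sketch below embeds a toy swiss-roll dataset into two dimensions with PCA, LLE, and Isomap and times each method. The scikit-learn implementations and the synthetic dataset are assumptions standing in for the implementations and data actually compared in the thesis.

```python
import time
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import LocallyLinearEmbedding, Isomap

X, _ = make_swiss_roll(n_samples=1000, random_state=0)   # 3-D toy manifold

methods = {
    "PCA": PCA(n_components=2),
    "LLE": LocallyLinearEmbedding(n_components=2, n_neighbors=12),
    "Isomap": Isomap(n_components=2, n_neighbors=12),
}

for name, model in methods.items():
    start = time.perf_counter()
    embedding = model.fit_transform(X)                    # reduce 3-D points to 2-D
    elapsed = time.perf_counter() - start
    print(f"{name:6s} embedding shape {embedding.shape}, {elapsed:.2f} s")
```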
