131. Feature Selection and Classification Methods for Decision Making: A Comparative Analysis. Villacampa, Osiris, 01 January 2015.
The use of data mining methods in corporate decision making has been increasing in the past decades. Their popularity can be attributed to better data mining algorithms, increased computing performance, and results that can be measured and applied to decision making. The effective use of data mining methods to analyze various types of data has shown great advantages in many application domains. Some data sets need little preparation to be mined, whereas others, in particular high-dimensional data sets, must be preprocessed first because mining high-dimensional data directly is complex and inefficient. Feature selection, or attribute selection, is one of the techniques used for dimensionality reduction. Previous research has shown that data mining results can be improved in accuracy and efficacy by selecting the most significant attributes. This study analyzes vehicle service and sales data from multiple car dealerships. Its purpose is to find a model that better classifies existing customers as new-car buyers based on their vehicle service histories. Six feature selection methods, among them Information Gain, Correlation-Based Feature Selection, Relief-F, Wrapper, and Hybrid methods, were used to reduce the number of attributes in the data sets and are compared. The reduced data sets were run through three popular classification algorithms, Decision Trees, k-Nearest Neighbor, and Support Vector Machines, and the results were compared and analyzed. The study concludes with a comparative analysis of feature selection methods and their effects on the different classification algorithms within the domain. As a basis of comparison, the same procedures were run on a standard data set from the financial institution domain.
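A minimal sketch of this kind of pipeline, assuming scikit-learn and synthetic data in place of the study's dealership records: mutual information stands in for the Information Gain filter, and the three classifiers named above are compared with and without feature selection. All parameters here are illustrative, not the study's settings.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for a high-dimensional data set
X, y = make_classification(n_samples=500, n_features=40, n_informative=8,
                           random_state=0)

# Filter-style feature selection: keep the 10 attributes with the
# highest estimated mutual information with the class label
X_sel = SelectKBest(mutual_info_classif, k=10).fit_transform(X, y)

classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
}
for name, clf in classifiers.items():
    full = cross_val_score(clf, X, y, cv=5).mean()
    reduced = cross_val_score(clf, X_sel, y, cv=5).mean()
    print(f"{name}: all features {full:.3f}, selected features {reduced:.3f}")
```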
132. Optimizing Main Memory Usage in Modern Computing Systems to Improve Overall System Performance. Campello, Daniel Jose, 20 June 2016.
Operating systems use fast, CPU-addressable main memory to maintain an application's temporary data as anonymous data and to cache copies of persistent data stored in slower block-based storage devices. However, the use of this faster memory comes at a high cost, and several techniques have therefore been proposed in the literature to use main memory more efficiently. In this dissertation we introduce three distinct approaches to improve overall system performance by optimizing main memory usage.
First, DRAM and host-side caching of file system data are used to speed up virtual machine performance in today's virtualized data centers. Clustering VM images that share identical pages, coupled with data deduplication, has the potential to optimize main memory usage, since it provides more opportunity for sharing resources across processes and across different VMs. In our first approach, we study the use of content and semantic similarity metrics, together with a new clustering algorithm, to place VM images on hosts where deduplication improves main memory usage.
Second, while careful VM placement can improve memory usage by eliminating duplicate data, caches in current systems employ complex machinery to manage the cached data. Writing to a page not present in the file system page cache causes the operating system to synchronously fetch the page into memory, blocking the writing process. In this dissertation, we address this limitation with a new approach to managing page writes: the written data is buffered elsewhere in memory and the writing process is unblocked immediately. This buffering allows the system to service file writes faster and with fewer memory resources.
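A minimal sketch of the unblock-on-write idea, with invented names and a background thread standing in for the kernel machinery: a write to a non-resident page is buffered and the writer returns at once, while a worker later fetches the page and merges the buffered data.

```python
import threading, queue, time

PAGE_SIZE = 4096
page_cache = {}                 # page_no -> bytearray (resident pages)
pending = {}                    # page_no -> list of (offset, data) buffers
fetch_queue = queue.Queue()
lock = threading.Lock()

def write(page_no, offset, data):
    with lock:
        if page_no in page_cache:               # fast path: page resident
            page_cache[page_no][offset:offset + len(data)] = data
        else:                                   # slow path made non-blocking
            pending.setdefault(page_no, []).append((offset, data))
            fetch_queue.put(page_no)            # schedule background fetch
    # returns immediately either way: the writer never blocks on I/O

def fetch_worker():
    while True:
        page_no = fetch_queue.get()
        page = bytearray(PAGE_SIZE)             # stand-in for a disk read
        with lock:
            for offset, data in pending.pop(page_no, []):
                page[offset:offset + len(data)] = data  # merge buffered writes
            page_cache[page_no] = page

threading.Thread(target=fetch_worker, daemon=True).start()
write(7, 128, b"hello")          # returns immediately
time.sleep(0.1)                  # give the worker time to fetch and merge
print(7 in page_cache)           # True: page resident with the write applied
```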
In our last approach, we investigate the use of emerging byte-addressable persistent memory technology to extend main memory, as a less costly alternative to using expensive DRAM exclusively. We motivate and build a tiered memory system in which persistent memory and DRAM co-exist, with the goal of placing the right data in the right memory tier at the right time; the system provides improved application performance at lower cost and power consumption. The proposed approach seamlessly migrates pages across memory tiers as access patterns change and/or to relieve memory pressure in a tier.
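A toy sketch of such a tiering policy, assuming per-page access counts are available: the hottest pages are promoted into a fixed-size DRAM tier and the coldest are demoted to persistent memory. Capacities and counts are invented; the actual system performs the migration inside the operating system.

```python
from collections import Counter

DRAM_CAPACITY = 4                      # pages that fit in the fast tier

def rebalance(access_counts, dram, pmem):
    """Move the most-accessed pages into DRAM, demoting the coldest."""
    ranked = sorted(dram | pmem, key=lambda p: access_counts[p], reverse=True)
    new_dram = set(ranked[:DRAM_CAPACITY])
    promoted = new_dram - dram          # cold->hot: migrate into DRAM
    demoted = dram - new_dram           # hot->cold: migrate to persistent memory
    return new_dram, (dram | pmem) - new_dram, promoted, demoted

counts = Counter({"a": 90, "b": 5, "c": 40, "d": 2, "e": 70, "f": 60})
dram, pmem = {"a", "b", "c", "d"}, {"e", "f"}
dram, pmem, promoted, demoted = rebalance(counts, dram, pmem)
print("promote:", promoted, "demote:", demoted)
```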
133. Large Scale Data Mining for IT Service Management. Zeng, Chunqiu, 08 November 2016.
More than ever, businesses rely heavily on IT service delivery to meet their current and frequently changing business requirements. Optimizing the quality of service delivery improves customer satisfaction and continues to be a critical driver for business growth. The routine maintenance procedure plays a key role in IT service management, which typically involves problem detection, determination, and resolution for the service infrastructure.
Many IT service providers adopt partial automation for incident diagnosis and resolution, in which the work of the system administrators and the automated operations are intertwined. Often the system administrators' role is limited to triaging tickets to the processing teams for resolution. The processing teams are responsible for performing a complex root cause analysis given the system statistics, event, and ticket data. The large volume of such data aggravates the burden of problem diagnosis on both the system administrators and the processing teams during routine maintenance.
Reducing the human effort involved in IT service management calls for intelligent and efficient solutions that maximize the automation of routine maintenance procedures. Three research directions are identified as helpful for optimizing IT service management: (1) automatically determine problem categories from the symptom description in a ticket; (2) intelligently discover interesting temporal patterns from system events; (3) instantly identify temporal dependencies among system performance statistics. Provided with ticket, event, and system performance statistics data, all three directions can be effectively addressed with data-driven solutions, improving the quality of IT service delivery in an efficient and effective way.
The dissertation addresses the research topics outlined above. Concretely, we design and develop data-driven solutions that help system administrators better manage the system and reduce the human effort involved in IT service management, including (1) a knowledge-guided hierarchical multi-label classification method for determining the problem category of a ticket based on both its symptom description and domain knowledge from the system administrators; (2) an efficient expectation-maximization approach for temporal event pattern discovery based on a parametric model; and (3) an online inference method for discovering time-varying temporal dependencies from large-scale time series data.
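As a flavor of direction (1), a minimal sketch of knowledge-guided hierarchical classification: a hand-written label hierarchy (standing in for the administrators' domain knowledge) constrains prediction top-down, so a child label is considered only under a matching parent. The keyword rules stand in for trained per-node classifiers; all labels and keywords are invented.

```python
HIERARCHY = {
    "infrastructure": ["network", "storage"],
    "application": ["database", "middleware"],
}
KEYWORDS = {
    "infrastructure": {"server", "link", "disk"},
    "application": {"query", "jvm", "deadlock"},
    "network": {"link", "packet"},
    "storage": {"disk", "raid"},
    "database": {"query", "deadlock"},
    "middleware": {"jvm", "queue"},
}

def classify(ticket_text):
    words = set(ticket_text.lower().split())
    labels = []
    for parent, children in HIERARCHY.items():
        if words & KEYWORDS[parent]:          # node fires at the top level
            labels.append(parent)
            # descend only under a matching parent: the hierarchy prunes
            # inconsistent label combinations
            labels += [c for c in children if words & KEYWORDS[c]]
    return labels

print(classify("database query deadlock detected on server"))
```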
134. Sustainable Resource Management for Cloud Data Centers. Mahmud, A. S. M. Hasan, 15 June 2016.
In recent years, the demand for data center computing has increased significantly due to the growing popularity of cloud applications and Internet-based services. Today's large data centers host hundreds of thousands of servers, and the peak power rating of a single data center may exceed 100 MW. The combined electricity consumption of global data centers accounts for about 3% of worldwide electricity production, raising serious concerns about their carbon footprint. Utility providers and governments consistently pressure data center operators to reduce their carbon footprint and energy consumption. While these operators (e.g., Apple, Facebook, and Google) have taken steps to reduce their carbon footprints (e.g., by installing on-site or off-site renewable energy facilities), they are aggressively looking for new approaches that do not require expensive hardware installation or modification.
This dissertation focuses on developing algorithms and systems that improve sustainability in data centers without incurring significant additional operational or setup costs. In the first part, we propose a provably-efficient resource management solution for a self-managed data center that caps and reduces carbon emissions while maintaining satisfactory service performance; it reduces the data center's emissions to a net-zero level and achieves carbon neutrality. In the second part, we consider minimizing carbon emissions in a hybrid data center infrastructure that includes geographically distributed self-managed and colocation data centers. This part identifies and addresses the challenges of resource management in such an infrastructure and proposes an efficient distributed solution that jointly optimizes workload and resource allocation across both self-managed and colocation data centers. In the final part, we explore sustainable resource management from the cloud service user's point of view. A cloud service user purchases computing resources (e.g., virtual machines) from the service provider and has no direct control over the carbon emissions of the provider's data center. Our proposed solution encourages a user to take part in sustainable (both economical and environmental) computing by limiting its spending on cloud resource purchases while satisfying its application performance requirements.
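A simplified sketch of the geographic-allocation idea from the second part, under invented numbers: workload is assigned greedily to the data centers with the lowest carbon intensity, subject to capacity. The dissertation's actual formulation jointly optimizes cost, performance, and emissions; this only illustrates the basic trade-off.

```python
def allocate(total_load, centers):
    """centers: list of (name, capacity, grams CO2 per unit of work)."""
    plan, remaining = {}, total_load
    # fill the cleanest data centers first
    for name, capacity, _ in sorted(centers, key=lambda c: c[2]):
        share = min(capacity, remaining)
        plan[name] = share
        remaining -= share
        if remaining == 0:
            break
    emissions = sum(plan[n] * i for n, _, i in centers if n in plan)
    return plan, emissions

centers = [("dc-hydro", 60, 20), ("dc-solar", 40, 35), ("dc-coal", 100, 90)]
print(allocate(120, centers))    # coal absorbs only the overflow load
```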
135. Sorting by Block Moves. Huang, Jici, 01 January 2015.
The research in this thesis is focused on the problem of Block Sorting, which has applications in computational biology and in optical character recognition (OCR). A block in a permutation is a maximal sequence of consecutive elements that are also consecutive in the identity permutation. Block Sorting is the process of transforming an arbitrary permutation into the identity permutation through a sequence of block moves. Given an arbitrary permutation π and an integer m, the Block Sorting Problem, that is, the problem of deciding whether the transformation can be accomplished in at most m block moves, has been shown to be NP-hard. After being known to be only 3-approximable for over a decade, block sorting has been researched extensively, and there are now several 2-approximation algorithms for it. This work introduces new structures on a permutation, called runs and ordered pairs, and uses them to develop two new approximation algorithms. The work includes an analysis showing that both new algorithms are 2-approximation algorithms, matching the best approximation ratio currently known.
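A small sketch of the objects involved, assuming nothing beyond the definitions above: `blocks` splits a permutation into its maximal runs of consecutive values, and `block_move` cuts one segment and reinserts it elsewhere. The example permutation is sorted by a single block move; this is only an illustration, not one of the thesis's algorithms.

```python
def blocks(perm):
    """Split perm into maximal runs of consecutive increasing values."""
    out, run = [], [perm[0]]
    for x in perm[1:]:
        if x == run[-1] + 1:
            run.append(x)
        else:
            out.append(run)
            run = [x]
    out.append(run)
    return out

def block_move(perm, i, j, k):
    """Cut the segment perm[i:j] and reinsert it at index k of the rest."""
    seg, rest = perm[i:j], perm[:i] + perm[j:]
    return rest[:k] + seg + rest[k:]

pi = [3, 4, 1, 2, 5]
print(blocks(pi))                 # [[3, 4], [1, 2], [5]] -> three blocks
print(block_move(pi, 0, 2, 2))    # one block move yields [1, 2, 3, 4, 5]
```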
136. A Comparison of Cloud Computing Database Security Algorithms. Hoeppner, Joseph A, 01 January 2015.
The cloud database is a relatively new type of distributed database that allows companies and individuals to purchase computing time and memory from a vendor, so a user pays only for the resources they use, saving both time and money. While the cloud in general can solve problems that were previously too costly or time-intensive, its distributed nature also opens the door to new security problems. Several approaches have been proposed to increase the security of cloud databases, though each seems to fall short in one area or another.
This thesis presents the Hoeppner Security Algorithm (HSA) as a solution to these security problems. The HSA safeguards users' data and metadata by adding fake records alongside the real ones, breaking up the database by column or groups of columns, and storing each group in a different cloud. The efficiency and security of this algorithm were compared to those of the Alzain algorithm (one of the proposed security solutions that inspired the HSA), and the HSA was found to outperform the Alzain algorithm in nearly every way.
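A conceptual sketch of the two ideas named above, with invented record formats and bookkeeping: decoy records are interleaved with real ones (the client alone remembers which positions are fake), and the table is vertically partitioned so each cloud stores only one column group. The actual HSA's decoy generation and metadata handling are more sophisticated.

```python
import random

def protect(rows, column_groups, n_fake=2):
    """rows: list of dicts; column_groups: partition of the column names."""
    columns = list(rows[0])
    records = list(rows)
    fake_positions = set()
    for _ in range(n_fake):
        fake = {c: random.randint(0, 999) for c in columns}   # decoy record
        pos = random.randrange(len(records) + 1)
        records.insert(pos, fake)
        # shift previously recorded fake positions displaced by the insert
        fake_positions = {p + (p >= pos) for p in fake_positions} | {pos}
    # vertical partitioning: each cloud stores only its own column group;
    # the client keeps fake_positions to filter out decoys on read
    shares = [[{c: r[c] for c in g} for r in records] for g in column_groups]
    return shares, fake_positions

rows = [{"id": 1, "salary": 50}, {"id": 2, "salary": 60}]
(share_a, share_b), fakes = protect(rows, [["id"], ["salary"]])
print(share_a, share_b, fakes)
```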
137. Rigid and Non-rigid Point-based Medical Image Registration. Parra, Nestor Andres, 13 November 2009.
The primary goal of this dissertation is to develop point-based rigid and non-rigid image registration methods that are more accurate than existing methods. We first present PoIRe, which provides a framework for point-based global rigid registration. It allows a choice of search strategies, including (a) branch-and-bound, (b) probabilistic hill-climbing, and (c) a novel hybrid method that combines the best characteristics of the other two. We use a robust similarity measure that is insensitive to noise, which is often introduced during feature extraction. We demonstrate the robustness of PoIRe by using it to register images obtained with an electronic portal imaging device (EPID), which have large amounts of scatter and low contrast. To evaluate PoIRe we used (a) simulated images and (b) images with fiducial markers; PoIRe was extensively tested with 2D EPID images and with images generated by 3D Computed Tomography (CT) and Magnetic Resonance (MR) imaging. PoIRe was also evaluated on benchmark data sets from the Retrospective Image Registration Evaluation (RIRE) project, and we show that it outperforms existing methods such as Iterative Closest Point (ICP) and methods based on mutual information. We also present a novel point-based local non-rigid shape registration algorithm. We extend the robust similarity measure used in PoIRe to non-rigid registration, adapting it to a free-form deformation (FFD) model and making it robust to local minima, a drawback common to existing non-rigid point-based methods. For non-rigid registration we show that it performs better than existing methods and is less sensitive to starting conditions; we test it using available benchmark data sets for shape registration. Finally, we explore the extraction of features invariant to changes in perspective and illumination, and how they can help improve the accuracy of multi-modal registration. For multi-modal registration of EPID-DRR images we present a method based on a local descriptor defined by a vector of complex responses to a circular Gabor filter.
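A toy 2-D analogue of the rigid search, with invented points and step sizes: probabilistic hill-climbing over a rotation and translation, scored by a trimmed nearest-neighbor distance standing in for the dissertation's robust similarity measure.

```python
import math, random

def transform(points, theta, tx, ty):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def cost(moving, fixed, trim=0.8):
    # robust score: average over only the best-matching fraction of points
    dists = sorted(min(math.dist(p, q) for q in fixed) for p in moving)
    keep = max(1, int(trim * len(dists)))
    return sum(dists[:keep]) / keep

def register(moving, fixed, iters=2000):
    best = (0.0, 0.0, 0.0)                      # (theta, tx, ty)
    best_cost = cost(transform(moving, *best), fixed)
    for _ in range(iters):
        cand = tuple(b + random.gauss(0, s)
                     for b, s in zip(best, (0.05, 0.5, 0.5)))
        c = cost(transform(moving, *cand), fixed)
        if c < best_cost:                       # accept only improvements
            best, best_cost = cand, c
    return best, best_cost

fixed = [(0, 0), (2, 0), (2, 1), (0, 1), (1, 3)]
moving = transform(fixed, 0.4, 1.5, -0.7)       # misaligned copy to recover
print(register(moving, fixed))
```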
138. Wildfire Assessment Using FARSITE Fire Modeling: A Case Study in the Chihuahua Desert of Mexico. Brakeall, John, 02 July 2013.
The Chihuahua Desert is one of the most biologically diverse ecosystems in the world, but it suffers serious degradation because of changes in fire regimes that result in large catastrophic fires. My study was conducted in the Sierra La Mojonera (SLM) natural protected area in Mexico. Its purpose was to apply FARSITE fire modeling as a fire management tool in developing an integrated fire management plan for SLM.
Firebreaks proved to halt 100% of the simulated wildfire outbreaks. The rosetophilous scrub experienced the fastest rate of fire spread, and lowland creosote bush scrub the slowest. March showed the fastest rate of fire spread, and September the slowest. The results of my study provide a tool for wildfire management through the use of geospatial technologies and, in particular, FARSITE fire modeling in SLM and Mexico.
139. Real-Time Scheduling of Embedded Applications on Multi-Core Platforms. Fan, Ming, 21 March 2014.
For the past several decades, we have experienced tremendous growth, in both scale and scope, of real-time embedded systems, thanks largely to advances in IC technology. However, the traditional approach of boosting performance by increasing CPU frequency is a thing of the past, and researchers in both industry and academia are turning to multi-core architectures for continued improvement of computing performance. In our research, we seek to develop efficient scheduling algorithms and analysis methods for the design of real-time embedded systems on multi-core platforms. Real-time systems are those in which response time is as critical as the logical correctness of the computational results. In addition, a variety of stringent constraints, such as power/energy consumption, peak temperature, and reliability, are imposed on these systems. Real-time scheduling therefore plays a critical role in the system-level design of such computing systems.
We started our research by addressing timing constraints for real-time applications on multi-core platforms, developing both partitioned and semi-partitioned algorithms to schedule fixed-priority, periodic, hard real-time tasks. We then extended our research to temperature constraints: we developed a closed-form solution that captures the temperature dynamics of a given periodic voltage schedule on a multi-core platform, along with three methods for checking the feasibility of a periodic real-time schedule under a peak temperature constraint. We further incorporated the power/energy constraint, with thermal awareness, into our research problem: we investigated energy estimation on multi-core platforms and developed a computationally efficient method to calculate the energy consumption of a given voltage schedule. In this dissertation, we present our research in detail and demonstrate the effectiveness and efficiency of our approaches with extensive experimental results.
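A compact sketch of the first step of a partitioned approach, under standard textbook assumptions rather than the dissertation's exact tests: tasks are assigned to cores first-fit in decreasing utilization order, and a core accepts a task only if its total utilization stays within the Liu-Layland rate-monotonic bound n(2^(1/n) - 1). The task set is invented.

```python
def rm_bound(n):
    """Liu-Layland schedulability bound for n rate-monotonic tasks."""
    return n * (2 ** (1.0 / n) - 1)

def partition(tasks, n_cores):
    """tasks: list of (wcet, period); returns per-core task lists or None."""
    cores = [[] for _ in range(n_cores)]
    # first-fit decreasing by utilization wcet/period
    for task in sorted(tasks, key=lambda t: t[0] / t[1], reverse=True):
        for core in cores:
            trial = core + [task]
            util = sum(c / p for c, p in trial)
            if util <= rm_bound(len(trial)):    # passes the RM utilization test
                core.append(task)
                break
        else:
            return None                          # the task fit on no core
    return cores

tasks = [(1, 4), (2, 6), (1, 8), (3, 12), (2, 10)]
print(partition(tasks, 2))
```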
140. Evolved Design of a Nonlinear Proportional Integral Derivative (NPID) Controller. Chopra, Shubham, 01 January 2012.
This research presents a solution to the problem of tuning a PID controller for a nonlinear system. Many industrial applications use a PID controller to control a plant or process. Conventional PID controllers work well for linear systems but are less effective when the plant or process is nonlinear, because they cannot adapt their gain parameters as needed. In this research we design a Nonlinear PID (NPID) controller using a fuzzy logic system, based on a Mamdani-type Fuzzy Inference System, to control three different DC motor systems. The fuzzy system adapts the gain parameters of a conventional PID controller, and its rule base was heuristically evolved using an evolutionary algorithm (Differential Evolution). Our results show that the NPID controller can drive a moderately or heavily under-damped DC motor system to the desired, slightly under-damped behavior.
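A skeletal NPID loop as a sketch of the idea: a conventional PID whose gains are modulated by a simple error-dependent schedule standing in for the evolved Mamdani fuzzy inference system. The first-order plant and all constants are invented for illustration.

```python
def adapt_gains(error, base=(2.0, 0.5, 0.1)):
    # crude stand-in for the fuzzy rules: push harder on large errors,
    # integrate less aggressively to limit windup
    kp, ki, kd = base
    scale = 1.0 + min(abs(error), 1.0)
    return kp * scale, ki / scale, kd

def simulate(setpoint=1.0, dt=0.01, steps=500):
    y, integral, prev_error = 0.0, 0.0, setpoint
    for _ in range(steps):
        error = setpoint - y
        kp, ki, kd = adapt_gains(error)          # gains change with the error
        integral += error * dt
        u = kp * error + ki * integral + kd * (error - prev_error) / dt
        prev_error = error
        y += dt * (-y + u)                       # toy first-order plant: y' = -y + u
    return y

print(simulate())                                # settles near the setpoint
```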