
M-AdaBoost-A Based Ensemble System for Network Intrusion Detection

Zhou, Ying 01 January 2021 (has links)
Network intrusion detection remains a challenging research area as it involves learning from large-scale imbalanced multiclass datasets. While machine learning algorithms have been widely used for network intrusion detection, most standard techniques cannot achieve consistently good performance across multiple classes. In this dissertation, a novel ensemble system was proposed based on the Modified Adaptive Boosting with Area under the curve (M-AdaBoost-A) algorithm to detect network intrusions more effectively. Multiple M-AdaBoost-A-based classifiers were combined into an ensemble by employing various strategies, including particle swarm optimization. To the best of our knowledge, this study is the first to utilize the M-AdaBoost-A algorithm for addressing class imbalance in network intrusion detection. Compared with existing standard techniques, the proposed ensemble system achieved superior performance across multiple classes in both 802.11 wireless intrusion detection and traditional enterprise intrusion detection.
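The abstract does not spell out the M-AdaBoost-A update itself. As a rough sketch only, one can picture a standard AdaBoost round in which the classifier weight alpha is derived from AUC rather than weighted training error, so that rare-class mistakes still shift sample mass; the function name and the exact alpha formula below are assumptions, not the dissertation's algorithm:

```python
import math

def auc_boost_round(weights, y_true, y_pred, auc):
    """One hypothetical boosting round: the classifier weight alpha is
    derived from AUC instead of weighted error, then sample weights are
    rescaled so that misclassified samples gain mass."""
    # Assumed AUC-based analogue of the usual 0.5 * ln((1 - err) / err)
    alpha = 0.5 * math.log(auc / (1.0 - auc))
    new_w = [w * math.exp(-alpha * yt * yp)   # yt, yp are labels in {-1, +1}
             for w, yt, yp in zip(weights, y_true, y_pred)]
    total = sum(new_w)
    return alpha, [w / total for w in new_w]
```

With a classifier of AUC 0.75 that misclassifies one of four uniformly weighted samples, the misclassified sample's normalized weight doubles relative to the others.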

Flow control in non-continuous chemical plants

Srinivasan, Venkatesh 01 January 1992 (has links)
We consider general-purpose and flexible non-continuous chemical plants under deterministic feedback control. The aim of this research has been to develop hierarchical, distributed, feedback-based control policies to: (1) organize and schedule flow between the different unit operations, (2) ensure the satisfaction of both safety and product constraints, and (3) achieve desirable performance. We begin by developing a lower bound on the total storage required to meet demand rates. We also develop an upper bound on the total storage required by an optimal policy. Next we propose a distributed feedback control scheme to organize flow in flexible non-continuous chemical plants. We show that this control scheme is stable and can be implemented with fixed and finite storage. The performance of this policy in a limited number of simulations was close to optimal. When there are severe constraints on the size of the intermediate storage, a global supervisor can be implemented to prevent deadlock. We show that when the plant satisfies a sparsity property, the global supervisory control problem is tractable. The overall control system approach that we have developed for non-continuous chemical plants is based on a two-tiered structure: (1) a global supervisor ensures that process and safety constraints are satisfied and keeps the plant state trajectory steered away from deadlock; and (2) local controllers are designed to maximize a distributed performance measure.
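A minimal sketch of the kind of local rule that yields fixed and finite storage (the batch-release condition below is an illustrative assumption, not the policy developed in the dissertation): a unit releases a batch downstream only when the downstream buffer has room, so blocking on a full buffer bounds every intermediate inventory.

```python
def local_release(upstream, downstream, cap_downstream, batch):
    """Release a batch downstream only when it fits; otherwise hold.
    Blocking on a full buffer is what keeps storage bounded."""
    if upstream >= batch and downstream + batch <= cap_downstream:
        return upstream - batch, downstream + batch
    return upstream, downstream
```

Under this rule no buffer can ever exceed its fixed capacity, regardless of demand timing; a global supervisor would still be needed to rule out circular blocking (deadlock).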

Optimal control and analysis of bulk service queueing systems

Han, Youngnam 01 January 1992 (has links)
Queueing theory has been successfully and extensively applied to the scheduling, control, and analysis of complex stochastic systems. In this dissertation, problems of optimal scheduling, control, and analysis of bulk service queueing systems are studied. A dynamic programming formulation is provided for the optimal service strategy of a two-server bulk queue. An extension of the general bulk service rule is shown to be optimal in the sense of minimizing either the finite discounted or the average waiting cost. It is shown that the optimal dispatching rule is of a multi-stage threshold type, in which servers are dispatched only when the number of waiting customers exceeds certain threshold values depending both on the number of waiting customers and on the number of servers available at decision epochs. It is conjectured that the result extends to the case of more than two servers. An exact analysis of the equilibrium state probabilities is carried out under the optimal policy obtained for a queue with two bulk servers. The optimal threshold policy is evaluated by comparing single-stage and two-stage threshold two-server systems. By calculating the mean number of customers waiting in the queue for both systems, it is shown that a two-stage threshold policy outperforms the general bulk service rule under any operating condition. Examples for different parameter sets are provided. A network of two bulk service queues served by a common transport carrier with finite capacity is analyzed, where the general bulk service rule is applied at only one queue. Decomposition is employed to provide an exact analysis of the steady-state probability distribution, the mean waiting time distribution, and the mean number of customers waiting at both queues in equilibrium. Networks of more than two bulk service queues can be analyzed by direct extension of the methodology. An optimization procedure for the optimal threshold value to minimize total mean waiting cost is also discussed.
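The multi-stage threshold rule described above can be sketched as follows; the greedy stage loop and the threshold values in the example are illustrative assumptions, not the dissertation's derived optimal parameters:

```python
def dispatch(n_waiting, servers_free, thresholds, capacity):
    """Multi-stage threshold dispatch (sketch): the k-th free server is
    dispatched only if the remaining queue still meets thresholds[k];
    each dispatched server takes up to `capacity` customers."""
    batches, stage = [], 0
    while (servers_free > 0 and stage < len(thresholds)
           and n_waiting >= thresholds[stage]):
        batch = min(n_waiting, capacity)
        batches.append(batch)
        n_waiting -= batch
        servers_free -= 1
        stage += 1
    return batches, n_waiting
```

With thresholds [1, 5] and server capacity 4, a queue of 7 dispatches only the first server (the remainder of 3 is below the second-stage threshold), while a queue of 10 dispatches both.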

Analysis and enhancement of a branch and bound approach to facilities layout in continuous flow manufacturing systems

Punyagupta, Pirasan 01 January 1992 (has links)
Today's Continuous Flow Manufacturing (CFM) systems integrate several manufacturing components found in traditional manufacturing facilities. They contain groups of machines (facilities, work centers or cells) linked together by transport systems (material handlers, conveyors, etc.). Series of parts are transported from one facility to the next depending on operation sequences. A major problem encountered in the optimal design of a CFM system is the assignment of these manufacturing components to appropriate locations in the layout so as to obtain efficient CFM configurations with favorable product flow and resource utilization. In this research, the optimal allocation of operations to groups of machines in a facility is termed the Resource Assignment Subproblem (RAS). The task of locating facilities at predefined locations in the layout, taking into consideration operation sequences, is termed the Location Assignment Subproblem (LAS). Both the RAS and the LAS generally involve complicated discrete mathematical models; thus, many past researchers have chosen to investigate them separately. Recent research by Ketcham (1992a) has led to a mathematical representation that integrates the LAS and RAS into a single model, called the Configuration Problem (CP). The solution method, a heuristic called the Resource Assignment algorithm, is also found to provide acceptably good solutions for complex models with reasonable computational effort. The current implementation of the Resource Assignment algorithm, however, becomes inadequate for today's large-scale CFM facilities, where many products are produced at different sites. Thus, enhanced methodologies are developed in this research so that systems of varying complexity and characteristics can be optimized more efficiently. The study of algorithm performance and solution characteristics has led to several enhancement techniques. The collection of these techniques, called the Meta-algorithm, is a set of decision rules that suggests an optimization strategy for the Resource Assignment algorithm based on the characteristics of a given CFM system. The robustness of the Meta-algorithm is tested against a wide range of trial cases representing large-scale CFM systems found in industrial practice. Overall improved performance has been achieved by the Meta-algorithm, which is found most effective for segmenting and solving large-scale problems infeasible for the existing Resource Assignment algorithm.
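The core of the location assignment subproblem is a flow-times-distance objective over facility-to-location permutations. As a toy stand-in for the branch-and-bound / Resource Assignment heuristic (which exists precisely because this brute force does not scale), the objective and an exhaustive search can be sketched as:

```python
from itertools import permutations

def layout_cost(flow, dist, assign):
    """Total material-handling cost: flow between facilities i and j,
    weighted by the distance between their assigned locations."""
    n = len(flow)
    return sum(flow[i][j] * dist[assign[i]][assign[j]]
               for i in range(n) for j in range(n))

def best_layout(flow, dist):
    """Exhaustive search over all facility-to-location permutations;
    a brute-force stand-in, feasible only for very small instances."""
    n = len(flow)
    return min(permutations(range(n)),
               key=lambda a: layout_cost(flow, dist, a))
```

For three facilities on a line, the search places the pair with the heaviest mutual flow on adjacent locations, as expected.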

Textual information retrieval : An approach based on language modeling and neural networks

Georgakis, Apostolos A. January 2004 (has links)
This thesis covers topics relevant to information organization and retrieval. The main objective of the work is to provide algorithms that can elevate the recall-precision performance of retrieval tasks in a wide range of applications, ranging from document organization and retrieval to web-document pre-fetching and, finally, clustering of documents based on novel encoding techniques.

The first part of the thesis deals with document organization and retrieval using unsupervised neural networks, namely the self-organizing map, and statistical encoding methods for representing the available documents as numerical vectors. The objective of this part is to introduce a set of novel variants of the self-organizing map algorithm that address certain shortcomings of the original algorithm.

In the second part of the thesis, the latencies perceived by users surfing the Internet are shortened by a novel transparent and speculative pre-fetching algorithm. The proposed algorithm relies on a model of the user's browsing behaviour and predicts the user's future actions. In modeling this behaviour, the algorithm relies on the contextual statistics of the web pages visited by the user.

Finally, the last chapter of the thesis provides preliminary theoretical results along with a general framework for current and future scientific work. The chapter describes the use of the Zipf distribution for document organization and the use of the AdaBoost algorithm for elevating the performance of pre-fetching algorithms.
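A single training step of the baseline self-organizing map over document vectors might look like the sketch below (1-D node grid, Gaussian neighbourhood); the thesis's novel variants modify details not reproduced here:

```python
import math

def som_step(nodes, x, lr=0.5, radius=1.0):
    """One SOM update: find the best-matching unit (BMU) for input x,
    then move every node toward x, scaled by a Gaussian of its grid
    distance to the BMU."""
    bmu = min(range(len(nodes)),
              key=lambda i: sum((a - b) ** 2 for a, b in zip(nodes[i], x)))
    for i in range(len(nodes)):
        d = abs(i - bmu)                         # 1-D grid distance
        h = math.exp(-d * d / (2 * radius * radius))
        nodes[i] = [a + lr * h * (b - a) for a, b in zip(nodes[i], x)]
    return bmu
```

Repeated over a corpus of encoded documents, with the learning rate and radius decaying over time, this pulls nearby nodes toward similar documents, so topically related documents end up mapped to neighbouring grid cells.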

Discernibility and Rough Sets in Medicine: Tools and Applications

Øhrn, Aleksander January 2000 (has links)
This thesis examines how discernibility-based methods can be equipped to possess several qualities that are needed for analyzing tabular medical data, and how these models can be evaluated according to current standard measures used in the health sciences. To this end, tools have been developed that make this possible, and some novel medical applications have been devised in which the tools are put to use.

Rough set theory provides a framework in which discernibility-based methods can be formulated and interpreted, and also forms an appealing foundation for data mining and knowledge discovery. When the medical domain is targeted, several factors become important. This thesis examines some of these factors and holds them up to the current state of the art in discernibility-based empirical modelling. Bringing together pertinent techniques, suitable adaptations of relevant theory for model construction and assessment are presented. Rough set classifiers are combined with ROC analysis, and it is outlined how attribute costs and semantics can enter the modelling process.

ROSETTA, a comprehensive software system for conducting data analyses within the framework of rough set theory, has been developed. Under the hypothesis that the accessibility of such tools lowers the threshold for abstract ideas to migrate into concrete realization, this helps reduce the gap between theoreticians and practitioners and enables existing problems to be attacked more easily. The ROSETTA system offers a set of flexible and powerful algorithms in a user-friendly environment designed to support all phases of the discernibility-based modelling methodology. Researchers worldwide have already put the system to use in a wide variety of domains.

By and large, discernibility-based data analysis can be varied along two main axes: which objects in the universe of discourse we deem it necessary to discern between, and how we define the discernibility that is allowed to take place among these objects. Using ROSETTA, this thesis explores various facets of this in three novel and distinctly different medical applications:

* A method is proposed for identifying population subgroups for which expensive tests may be avoided; experiments with a real-world database on a cardiological prognostic problem suggest that significant savings are possible.

* A method is proposed for anonymizing medical databases with sensitive contents via cell suppression, thus helping to preserve patient confidentiality.

* Very simple rule-based classifiers are employed to diagnose acute appendicitis, and their performance is compared to that of a team of experienced surgeons. The added value of certain biochemical tests is also demonstrated.
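The central object of such discernibility-based analysis is the discernibility matrix of rough set theory: for each pair of objects with different decisions, the set of condition attributes on which they differ. A minimal sketch of the standard construction (the toy data in the example are invented):

```python
def discernibility_matrix(objects, decisions):
    """For each pair of objects with different decision values, record
    which condition attributes discern them. The resulting attribute
    sets feed reduct and rule computation; an empty entry would signal
    an inconsistent table."""
    n_attr = len(objects[0])
    m = {}
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            if decisions[i] != decisions[j]:
                m[(i, j)] = {k for k in range(n_attr)
                             if objects[i][k] != objects[j][k]}
    return m
```

Pairs sharing a decision value need no entry: there is nothing to discern for classification purposes, which is one concrete setting of the "which objects must we discern between" axis mentioned above.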

Hur förankras en policy? : En studie av Stockholms stads informationssäkerhet [How is a policy anchored? A study of the City of Stockholm's information security]

Granström, Carl, Mårtensson, Markus January 2005 (has links)
No description available.

Statistical Considerations in the Analysis of Matched Case-Control Studies. With Applications in Nutritional Epidemiology

Hansson, Lisbeth January 2001 (has links)
The case-control study is one of the most frequently used study designs in analytical epidemiology. This thesis focuses on some methodological aspects of analysing the results from this kind of study.

A population-based case-control study was conducted in northern Norway and central Sweden in order to study the associations of several potential risk factors with thyroid cancer. Cases and controls were individually matched, and information on the factors under study was provided by means of a self-completed questionnaire. The analysis was conducted with logistic regression. No association was found with pregnancies, oral contraceptives or hormone replacement after menopause. Early pregnancy and artificial menopause were associated with an increased risk, and cigarette smoking with a decreased risk, of thyroid cancer (paper I). The relation with diet was also examined. High consumption of a fat- and starch-rich diet was associated with an increased risk (paper II).

Conditional and unconditional maximum likelihood estimation of the parameters in a logistic regression were compared through a simulation study. Conditional estimation had higher root mean square error but better model fit than unconditional, especially for 1:1 matching, with relatively little effect of the proportion of missing values (paper III). Two common approaches to handling partial non-response in a questionnaire when calculating nutrient intake from diet variables were compared. In many situations it is reasonable to interpret omitted self-reports of food consumption as indicating "zero consumption" (paper IV).

The reproducibility of dietary reports was presented, and problems in its measurement and analysis were discussed. The most advisable approach to measuring repeatability is to compare different correlation methods. Among the factors affecting reproducibility, frequency and homogeneity of consumption are presumably the most important (paper V). Nutrient variables can often have a mixed distribution form, so transformation to normality will be troublesome. When analysing nutrients we therefore recommend comparing the result of a parametric test with that of an analogous distribution-free test. Different methods of transforming nutrient variables to achieve normality were discussed (paper VI).

New Directions in Symbolic Model Checking

d'Orso, Julien January 2003 (has links)
In today's computer engineering, requirements for high reliability have pushed the notion of testing to its limits. Many disciplines are moving, or have already moved, to more formal methods to ensure correctness. This is done by comparing the behavior of the system as implemented against a set of requirements. The ultimate goal is to create methods and tools that can perform this kind of verification automatically: this is called Model Checking.

Although the notion of model checking has existed for two decades, adoption by industry has been hampered by its poor applicability to complex systems. During the 90's, researchers introduced an approach to cope with large (even infinite) state spaces: Symbolic Model Checking. The key idea is to represent large (possibly infinite) sets of states by a small formula, as opposed to enumerating all members. In this thesis, we investigate applying symbolic methods to different types of systems:

Parameterized systems. We work within the framework of Regular Model Checking. In regular model checking, we represent a global state as a word over a finite alphabet. A transition relation is represented by a regular length-preserving transducer. An important operation is the so-called transitive closure, which characterizes composing a transition relation with itself an arbitrary number of times. Since completeness cannot be achieved, we propose methods of computing closures that work as often as possible.

Games on infinite structures. Infinite-state systems for which the transition relation is monotonic with respect to a well quasi-ordering on states can be analyzed. We lift the framework of well quasi-ordered domains toward games. We show that monotonic games are in general undecidable. We identify a subclass of monotonic games: downward-closed games. We propose an algorithm to analyze such games with a winning condition expressed as a safety property.

Probabilistic systems. We present a framework for the quantitative analysis of probabilistic systems with an infinite state space: given an initial state s_init, a set F of final states, and a rational Θ > 0, compute a rational ρ such that the probability of reaching F from s_init is between ρ and ρ + Θ. We present a generic algorithm and sufficient conditions for termination.
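The common core of these symbolic methods is a fixed-point computation over sets of states. With explicit Python sets standing in for the symbolic representation (a formula, automaton, or regular language) and a successor function standing in for the image computation, the reachability loop can be sketched as:

```python
def reachable(init, post):
    """Least fixed point: saturate the state set under the successor
    function `post`. In symbolic model checking the set would be a
    formula or automaton and `post` a symbolic image computation;
    for infinite-state systems termination is not guaranteed, which is
    why acceleration techniques like transitive closure are needed."""
    states = set(init)
    frontier = set(init)
    while frontier:
        image = set()
        for s in frontier:
            image |= post(s)
        frontier = image - states
        states |= frontier
    return states
```

On a toy counter that increments up to 5, the loop saturates after six iterations; a symbolic engine performs the same saturation but on whole state sets per step rather than individual states.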
