Global ETD Search

1	An Efficient Bit-Pattern-Based Algorithm for Mining Sequential Patterns in Protein Databases Jeng, Yin-han 26 June 2009 Proteins are the structural components of living cells and tissues, and thus an important building block in all living organisms. Patterns in protein sequences are subsequences that appear frequently. Such patterns often denote important functional regions in proteins and can be used to characterize a protein family or discover the function of proteins. Moreover, they provide valuable information about the evolution of species. Patterns may contain gaps of arbitrary size. To address the no-gap-limit sequential pattern problem in a protein database, we may use a sequential pattern mining algorithm. However, in a protein database, the order in which segments appear in a protein sequence is important, and a segment may appear many times in one protein sequence. Therefore, we cannot directly use traditional sequential pattern mining algorithms. Many algorithms have been proposed to mine sequential patterns in protein databases, for example, the SP-index algorithm. They enumerate patterns of limited size (segments) in the solution space and find all patterns. The SP-index algorithm is based on traditional sequential pattern mining algorithms and considers the problem of multiple appearances of a segment in a protein sequence. Although the SP-index algorithm takes the characteristics of bioinformatics data into account, it still contains a time-consuming step that constructs the SP-tree to find the frequent patterns. In this step, it has to trace many nodes to get the result. Therefore, in this thesis, we propose a Bit-Pattern-based (BP) algorithm to overcome the disadvantages of the SP-index algorithm. First, we transform the protein sequences into bit sequences. Second, we construct the frequent segments by using the AND operator.
Because we use bit operators, the frequent segments can be obtained efficiently. Then, we prune unnecessary frequent segments, so that fewer frequent segments have to be tested in the following step. Third, we use the OR operator to get the longest pattern. In this step, we test whether two segments can be linked together to construct a longer segment, and we get the result with a single test. Because we focus on the positions at which a segment appears, we can apply the OR operator and then inspect the resulting bit sequences to get the result. Thus, we can avoid many testing processes. Our performance study based on biological data shows that our algorithm is more efficient than the SP-index algorithm. Moreover, our simulation results show that the proposed algorithm can improve the processing time by up to 50% compared to the SP-index algorithm, since the SP-index algorithm has to trace many nodes to construct the longest pattern. Sequential Patterns Protein Databases Bit-Pattern-Based
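The bit-sequence idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact encoding, so everything here is an assumption made for illustration: each residue gets one integer per sequence whose bit i is set when the residue occurs at position i, and a contiguous segment is located by shifting and AND-ing those masks. The function names (`letter_masks`, `segment_mask`, `support`) are hypothetical, and the pruning and OR-based linking steps of the thesis are not reproduced.

```python
def letter_masks(seq):
    """Map each residue to an int whose bit i is set when seq[i] is that residue.

    Assumed encoding for illustration; the thesis may encode bit sequences differently.
    """
    masks = {}
    for i, ch in enumerate(seq):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    return masks


def segment_mask(seq_masks, segment):
    """Return an int whose bit i is set when the non-empty `segment` starts at position i."""
    result = seq_masks.get(segment[0], 0)
    for offset, ch in enumerate(segment[1:], start=1):
        # Align the mask of the residue at `offset` with the segment start, then AND.
        result &= seq_masks.get(ch, 0) >> offset
        if result == 0:
            break  # the segment cannot occur in this sequence
    return result


def support(database, segment):
    """Count the sequences in which `segment` occurs at least once."""
    return sum(1 for seq in database if segment_mask(letter_masks(seq), segment))
```

For example, with the toy database `["ACAC", "CACG", "GGAC"]`, `support(database, "AC")` counts all three sequences, and `segment_mask(letter_masks("ACAC"), "AC")` has bits 0 and 2 set because the segment starts at both positions. Keeping every start position in one integer is what lets repeated appearances of a segment be handled with a single bitwise operation.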
2	Discovering and Using Patterns for Countering Security Challenges January 2014 Most existing security decisions for both defending and attacking are based on deterministic approaches that only give binary answers. Even though these approaches can achieve a low false positive rate, they have high false negative rates because they do not accommodate new attack methods and defense techniques. In this dissertation, I study how to discover and use patterns with uncertainty and randomness to counter security challenges. By extracting and modeling patterns in security events, I am able to handle previously unknown security events with quantified confidence, rather than simply making binary decisions. In particular, I address the following four real-world security challenges with pattern-based modeling and analysis: 1) How to detect and attribute previously unknown shellcode? I propose instruction sequence abstraction, which extracts coarse-grained patterns from an instruction sequence, and use a Markov chain-based model and support vector machines to detect and attribute shellcode; 2) How to safely mitigate routing attacks in mobile ad hoc networks? I identify routing table change patterns caused by attacks, propose an extended Dempster-Shafer theory to measure the risk of such changes, and use a risk-aware response mechanism to mitigate routing attacks; 3) How to model, understand, and guess human-chosen picture passwords? I analyze collected human-chosen picture passwords, propose a selection function that models patterns in password selection, and design two algorithms to optimize password guessing paths; and 4) How to identify influential figures and events in underground social networks? I analyze collected underground social network data, identify user interaction patterns, and propose a suite of measures for systematically discovering and mining adversarial evidence.
By solving these four problems, I demonstrate that discovering and using patterns can help deal with challenges in computer security, network security, human-computer interaction security, and social network security. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2014 Computer science Pattern-based Approach Security
3	From metaphors to intelligent patterns : milestones on the road to code re-use / Robert Lemke Lemke, Robert William January 2007 Computer applications can be described as largely rigid structures within which an information seeker must navigate in search of information, with each screen and each transaction having its own underlying code. The larger the application, the higher the number of lines of code and the larger the size of the application executable. This study suggests an alternative pattern-based approach, driven by the information seeker. This alternative approach makes use of value embedded in intelligent patterns to assemble rules and logic constituents, with numerous patterns aggregating to form a "virtual screen" based on the need of the information seeker. Once the information need is satisfied, the atomic rules and logic constituents dissipate and return to a base state. These same constituents remain available and are reassembled to form the succeeding "virtual screen" that satisfies the following request. Metaphors are used to introduce current information solutions, where events are initiated and driven by physical constructs built using monolithic instruction sets. The metaphor approach is then expanded, illustrating how metaphors can be used to communicate an understanding between two like-minded intellects: spatial artifacts are used to carry intellectual value across the intellectual divide, from the one (intellectual source) to the other (intellectual target). At this point, the pattern-based concept is introduced. This is where value, an intellectual appreciation hidden within spatiality, can be exploited towards the delivery of information. The pattern-based approach makes use of multiple pattern "instances" to deliver functionality, and each pattern instance has a specific embedded value.
Numbers of these patterns aggregate to drive the formation of a "virtual screen" built using patterns, each pattern referencing and associating (physical) atomic logic and spatial constituents. This is analogous to painting a picture using removable dots. The dots can be used to describe a fish, and then, once appreciation has been completed, the image is destroyed and the dots are returned to the palette. These same dots can later be reapplied to present the picture of a dog, if that is requested by the information seeker. In both pictures the same "dots" are applied and reused. The forms of the fish and the dog are retained as value embedded within the patterns; the dots are building blocks aligned using instructions within the patterns. This study classifies existing application solutions as belonging to the Artifact-Pattern-Artifact (APA) group, and the pattern-based approach as belonging to the Pattern-Artifact-Pattern (PAP) group. An overview and the characteristics of each are presented. The document concludes by presenting the results obtained when using a prototype developed using the PAP approach. / Thesis (M.Sc. (Information Technology))--North-West University, Vaal Triangle Campus, 2008. Metaphors Intelligent patterns Milestones Code re-use Information solutions Computer applications Alternative pattern based approach
5	Pattern Based System Engineering (PBSE) - Product Lifecycle Management (PLM) Integration and Validation Gupta, Rajat 17 November 2017 Indiana University-Purdue University Indianapolis (IUPUI) / Mass customization, small lot sizes, reduced cost, high variability of product types, and a changing product portfolio are characteristics of modern manufacturing systems over their life cycle. A direct consequence of these characteristics is a more complex system and supply chain. Product lifecycle management (PLM) and model based system engineering (MBSE) are tools that have been proposed and implemented to address different aspects of this complexity and the resulting challenges. Our previous work successfully implemented an MBSE model in a PLM platform. More specifically, pattern based system engineering (S* pattern) models of systems are integrated with TEAMCENTER to link and interface the system level with the component level, and to streamline the lifecycle across disciplines. The benefit of the implementation is twofold. On one side, it helps system engineers who use system engineering models shift from learning how to model to implementing the model, which leads to more effective systems definition, design, integration and testing. On the other side, the PLM platform provides a reliable database to store legacy data for future use and to track changes during the entire process, and it includes one of the most important tools a systems engineer needs: an automatic report generation tool. In the current work, we have configured a PLM platform (TEAMCENTER) to support automatic generation of reports and requirements tables using a generic Oil Filter system lifecycle. Three tables have been configured for automatic generation: the Feature Definitions table, the Detail Requirements table, and the Stakeholder Feature Attributes table.
These tables were specifically chosen because they describe all the requirements of the system and cover all physical behaviours the oil filter system shall exhibit during its physical interactions with external systems. The requirement tables represent core content for a typical systems engineering report. With the help of the automatic report generation tool, it is possible to prepare the entire report within one single system, the PLM system, ensuring a single reliable data source for an organization. Automatic generation of these contents can save systems engineers time, avoid duplicated work and human errors in report preparation, and help train the future workforce in the lifecycle, all while encouraging standardized documents in an organization. MBSE (Model Based System Engineering) PBSE (Pattern Based System Engineering) System Engineering PLM (Product Lifecycle Management) Teamcenter Integration and Validation
6	Adding external factors in Time Series Forecasting : Case study: Ethereum price forecasting Vera Barberán, José María January 2020 The main thrust of time-series forecasting models in recent years has been toward pattern-based learning, in which the input to the model is a vector of past observations of the variable to be predicted. The most widely used models based on this traditional pattern-based approach are the autoregressive integrated moving average model (ARIMA) and long short-term memory neural networks (LSTM). The main drawback of these approaches is their inability to react when the underlying relationships in the data change, which degrades the predictive performance of the models. To solve this problem, various studies seek to incorporate external factors into the models by treating the system as a black box and using a machine learning approach, which generates complex models that require a large amount of training data and have little interpretability. In this thesis, three different algorithms are proposed to incorporate additional external factors into these pattern-based models, obtaining a good balance between forecast accuracy and model interpretability. After applying these algorithms in a case study of Ethereum price time-series forecasting, it is shown that the prediction error can be efficiently reduced by taking these influential external factors into account, compared to traditional approaches, while maintaining full interpretability of the model. Time-series Forecasting Pattern-based models ARIMA LSTM Computer and Information Sciences
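To make the idea of adding an external factor to a pattern-based forecast concrete, here is a minimal sketch. It is not one of the three algorithms proposed in the thesis, which the abstract does not describe. It assumes a naive last-value forecast as the pattern-based baseline and fits the baseline's error as a linear function of a single external factor by ordinary least squares. The function names and the assumption that the next value of the factor is known at forecast time are hypothetical.

```python
def fit_exogenous_correction(y, x):
    """Fit r_t = alpha + beta * x_t by ordinary least squares.

    r_t = y[t] - y[t-1] is the error of a naive last-value forecast, and
    x[t] is the external factor aligned with y[t] (illustrative setup only).
    """
    residuals = [y[t] - y[t - 1] for t in range(1, len(y))]
    factors = x[1:]
    n = len(residuals)
    mean_x = sum(factors) / n
    mean_r = sum(residuals) / n
    var_x = sum((v - mean_x) ** 2 for v in factors)
    if var_x == 0:
        # A constant factor carries no information; fall back to the mean error.
        return mean_r, 0.0
    cov = sum((v - mean_x) * (r - mean_r) for v, r in zip(factors, residuals))
    beta = cov / var_x
    alpha = mean_r - beta * mean_x
    return alpha, beta


def forecast_next(last_y, next_x, alpha, beta):
    """Naive forecast corrected by the fitted effect of the external factor."""
    return last_y + alpha + beta * next_x
```

The two fitted numbers can be read directly: `beta` is the estimated change in the series per unit of the external factor, and `alpha` is the drift that remains. That direct readability is the kind of interpretability the abstract contrasts with black-box models.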
7	以型態辨識為主的中文資訊擷取技術研究 (A Study of Pattern-Recognition-Based Chinese Information Extraction Techniques) 翁嘉緯, Chia-Wei Weng Unknown Date With the explosion of the World Wide Web, information extraction has become a major technical area. The goal of information extraction is to transform unstructured text into structured data on a specific topic. It involves analyzing, filtering, and extracting relevant parts of the text and their corresponding meaning. Most information extraction research focuses on English text, whereas research on Chinese information extraction has not received as much attention. Considering that one-fifth of the world's population speaks Chinese, Chinese information extraction technology will become increasingly important. Chinese differs from English in many respects. In English, words are separated by spaces, so computers can easily distinguish each word in the input string. In Chinese, there are no spaces between characters to segment them into meaningful words. A general solution is to match the characters of the input string against the words in a dictionary to find proper word boundaries. Yet much flexibility and ambiguity exist in the combination of characters into words, and many errors may occur in word segmentation. In this thesis, we propose an approach to Chinese information extraction based on pattern matching and finite state automata, without relying on word segmentation or part-of-speech tagging.
The approach was evaluated with “government personnel directives in official gazettes” as test data, and it achieved 98% precision and 97% recall. Moreover, the approach was extended to other data domains. The results show initial progress toward a multiple-domain Chinese information extraction system. Information Extraction Pattern based Finite State Automata
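As a rough illustration of extraction with a finite state automaton that needs no word segmentation, the sketch below scans raw characters and moves between three states. The trigger pattern "任命 X 為 Y" ("appoint X as Y"), the state names, and the record fields are hypothetical examples chosen for illustration. They are not taken from the thesis, whose actual patterns and automata are not given in the abstract.

```python
TRIGGER = "任命"   # "appoint": hypothetical trigger that starts a record
LINK = "為"        # "as": separates the person from the post
STOPS = "，。；"    # punctuation that ends a record


def extract_appointments(text):
    """Scan characters with a three-state automaton: SEEK -> NAME -> POST."""
    records = []
    state = "SEEK"
    name, post = [], []
    i = 0
    while i < len(text):
        ch = text[i]
        if state == "SEEK":
            if text.startswith(TRIGGER, i):
                state, name, post = "NAME", [], []
                i += len(TRIGGER)
                continue
        elif state == "NAME":
            if ch == LINK:
                state = "POST"
            elif ch in STOPS:
                state = "SEEK"  # malformed record, abandon it
            else:
                name.append(ch)
        elif state == "POST":
            if ch in STOPS:
                if name and post:
                    records.append({"person": "".join(name), "post": "".join(post)})
                state = "SEEK"
            else:
                post.append(ch)
        i += 1
    # Emit a record that runs to the end of the text without closing punctuation.
    if state == "POST" and name and post:
        records.append({"person": "".join(name), "post": "".join(post)})
    return records
```

Because the automaton keys on fixed trigger and linking characters instead of dictionary words, it never has to decide where one word ends and the next begins. The trade-off is that a name containing the linking character would be split incorrectly, which a production system would need additional patterns to handle.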
8	Engineering secure software architectures : patterns, models and analysis / Ingénierie des systèmes sécurisés : patrons, modèles et analyses Motii, Anas 10 July 2017 Nowadays most organizations depend on Information and Communication Technologies (ICT) to perform their daily tasks (sometimes highly critical). However, in most cases, organizations, and particularly small ones, place limited value on information and its security. At the same time, achieving security in such systems is a difficult task because of the increasing complexity and connectivity in ICT development. In addition, security has impacts on many attributes such as openness, safety and usability. Thus, security becomes a very important aspect that should be considered in the early phases of development.
In this work, we propose an approach to secure ICT software architectures during their development by considering the aforementioned issues. The contributions of this work are threefold: (1) an integrated design framework for the specification and analysis of secure software architectures, (2) a novel model- and pattern-based methodology and (3) a set of supporting tools. The approach combines a modeling environment, based on a set of modeling languages for specifying and analyzing architecture models, with a model repository of reusable modeling artifacts (security pattern, threat and security property models) that allows reuse of capitalized security-related know-how. The approach consists of the following steps: (a) model-based risk assessment performed on the architecture to identify threats, (b) selection of security pattern models and their instantiation in the modeling environment to stop or mitigate the identified threats, (c) integration of security pattern models into the architecture model, (d) analysis of the produced architecture model with regard to other non-functional requirements and residual threats. In this context, we focus on preserving the satisfaction of real-time constraints after security patterns are applied. Residual threats are enumerated by applying checking techniques to the architecture against formalized threat scenarios from the STRIDE model, based on existing threat references (e.g., CAPEC). As part of the assistance for the development of secure architectures, we have implemented a tool chain based on SEMCO and Eclipse Papyrus to support the different activities, based on a set of modeling languages compliant with OMG standards (UML and its profiles). The assessment of our work is presented via a SCADA system (Supervisory Control And Data Acquisition) case study.
ICT Risk assessment Pattern-based system engineering (PBSE) Security patterns MDE UML
9	Pattern-based Specification and Formal Analysis of Embedded Systems Requirements and Behavioral Models Filipovikj, Predrag January 2017 Since the first lines of code were introduced in the automotive domain, vehicles have transitioned from being predominantly mechanical systems to software-intensive systems. With the ever-increasing computational power and memory of vehicular embedded systems, new, more powerful and more complex software functions are installed in vehicles to realize core functionalities. This trend impacts all phases of system development, including requirements specification, design and architecture of the system, as well as the integration and testing phases. In such settings, creating and managing different artifacts during the system development process by using traditional, human-intensive techniques becomes increasingly difficult. One problem stems from the high number and intricacy of system requirements that combine functional and possibly timing or other types of constraints. Another problem is that industrial development relies on models, e.g. developed in Simulink, from which code may be generated, so the correctness of such models needs to be ensured. A potential way to address these problems is to apply computer-aided specification, analysis and verification techniques already at the requirements stage, and also at later development stages. Despite the high degree of automation, exhaustiveness and rigor of formal specification and analysis techniques, their integration with industrial practice remains a challenge. To address this challenge, in this thesis, we develop the foundation of a framework, tailored for industrial adoption, for formal specification and analysis of system requirements specifications and behavioral system models.
First, we study the expressiveness of existing pattern-based techniques for creating formal requirements specifications on a relevant industrial case study. Next, to enable practitioners to create formal system specifications by using pattern-based techniques, we propose a tool called SeSAMM Specifier. Further, we provide an automated Satisfiability Modulo Theories (SMT)-based consistency analysis approach for the formally encoded system requirements specifications. The proposed SMT-based approach is suitable for debugging the specifications in early phases of development. For the formal analysis of behavioral models, we provide an approach for statistical model checking of Simulink models by using the UPPAAL SMC tool. To facilitate the adoption of the approach, we provide the SIMPPAAL tool, which automates the procedure of generating a network of stochastic timed automata for a given Simulink model. For validation, we apply our approach to a complex industrial model, namely the Brake-by-Wire function from Volvo GTT. / VeriSpec formal requirements consistency analysis formal analysis of Simulink models Engineering and Technology Computer and Information Sciences
10	Simulating land use change for assessing future dynamics of land and water resources Anputhas, Markandu 02 1900 Models of land use change fall into two broad categories: pattern based and process based. This thesis focuses on pattern based land use change models, expanding our understanding of these models in three important ways. First, it is demonstrated that some driving variables do not have a smooth impact on the land use transition process. Our example variable is access to water. Land managers with access to piped water do not have any need for surface or groundwater. For variables like this, a model needs to change the way that driving variables are represented. The second important result is that including a variable which captures spatial correlation between land use types significantly increases the explanatory power of the prediction model. A major weakness of pattern based land use models is their inability to model interactions between neighbouring land parcels; the method proposed in this study can be an alternative way to account for spatial neighbourhood association. These innovations are applied using the CLUE-S (Conversion of Land Use and its Effects at Small regional extent) system to the Deep Creek watershed in the Southern Interior of British Columbia. The results highlight the challenge of balancing the protection of agricultural land with the conservation of forest and natural areas when population and economic growth are inevitable. The results also demonstrate the implications of land use change for existing land use policies. The calibrated model was validated using remote sensing data. A series of discriminant functions were estimated for each land use type in the recent period and these functions were then used to classify. The calibrated model was run in reverse, back to the generated land use classification, and the results were compared. Fit was reasonable, with error rates falling below ten percent when radii beyond 2.5 km were considered.
Another important contribution is demonstrating the importance of modelling dynamic variables. Some important drivers are changing continuously and others depend on land use change itself. Failure to update these variables will bias model forecasts. Spatial neighbourhood association, an endogenous variable governed by land use change itself, is again used as the example dynamic variable. The study demonstrates the importance of updating all associated information. / Graduate Studies, College of (Okanagan) / Graduate CLUE-S discriminant function endogenous variables food security pattern based land use models remote sensing images simulation spatial association water district
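A spatial neighbourhood association variable of the kind discussed above can be sketched as the share of nearby cells that carry a given land use. The abstract does not define the measure used in the thesis, so this is an illustrative assumption: a square window on a raster grid with the centre cell excluded. The function name, the grid layout, and the class codes are hypothetical.

```python
def neighbourhood_share(grid, row, col, land_use, radius=1):
    """Share of cells within `radius` of (row, col) whose class equals `land_use`.

    Uses a square window clipped at the grid edges and excludes the centre cell.
    Illustrative definition only, not the measure used in the thesis.
    """
    same = 0
    total = 0
    for r in range(max(0, row - radius), min(len(grid), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(grid[r]), col + radius + 1)):
            if (r, c) == (row, col):
                continue
            total += 1
            if grid[r][c] == land_use:
                same += 1
    return same / total if total else 0.0
```

Because the value depends on the current land use map, it has to be recomputed after every simulated time step. Reusing the values from the initial map would be an instance of the failure to update dynamic variables that the abstract warns about.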
