1

Data Modeling for Static Analysis of Web Applications

Baštecký, Pavel January 2015 (has links)
PHP is a very popular language used to write the server-side part of web applications. The language is very simple to use, and both small and more complex sites across the internet are built with it. However, its widespread use also attracts people who want to harm and compromise the security of web applications. The Weverca analyzer is the first tool able to perform complex security analysis of a full application written in a modern version of PHP and to report possible security risks. Weverca's performance, however, is limited by time and memory complexity caused by an inefficient internal representation of the PHP memory state. The goal of this thesis is to find and solve the main problems of the original memory representation. The output of this thesis is an implementation of a new memory representation that minimizes the complexity of the original solution.
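The following Python sketch is purely illustrative and is not Weverca's implementation: it shows, over a toy three-address program representation, the kind of taint tracking a static analyzer performs when it models how untrusted values flow from sources (e.g., request parameters) to output sinks. All operation and variable names below are invented for the illustration.

```python
# Minimal sketch, NOT Weverca: a toy taint analysis over a tiny three-address
# program representation. All names below are invented for the illustration.

TAINT_SOURCES = {"read_get_param"}   # stand-in for reading $_GET values
SANITIZERS = {"escape_html"}         # stand-in for an escaping/sanitizing call
SINKS = {"echo_output"}              # stand-in for echoing into the response

def analyze(program):
    """program: list of (target, op, args) tuples; returns sink warnings."""
    tainted = set()
    warnings = []
    for target, op, args in program:
        if op in TAINT_SOURCES:
            tainted.add(target)                      # value comes from the user
        elif op in SANITIZERS:
            tainted.discard(target)                  # sanitized result treated as safe
        elif op in SINKS:
            if any(a in tainted for a in args):
                warnings.append(f"possible injection: tainted {args} reach {op}")
        elif op == "assign":
            if any(a in tainted for a in args):      # taint propagates via assignment
                tainted.add(target)
            else:
                tainted.discard(target)
    return warnings

toy_program = [
    ("name", "read_get_param", ["name"]),
    ("msg", "assign", ["name"]),
    (None, "echo_output", ["msg"]),
]
print(analyze(toy_program))   # one warning: the tainted value reaches the sink
```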
2

Formal Specification and Verification of Data-Centric Web Services

Moustafa, Iman Saleh 20 April 2012 (has links)
In this thesis, we develop and evaluate a formal model and contracting framework for data-centric Web services. The central component of our framework is a formal specification of a common Create-Read-Update-Delete (CRUD) data store. We show how this model can be used in the formal specification and verification of both basic and transactional Web service compositions. We demonstrate through both formal proofs and empirical evaluations that our proposed framework significantly decreases ambiguity about a service, enhances its reuse, and facilitates detection of errors in service-based implementations. Web Services are reusable software components that make use of standardized interfaces to enable loosely-coupled business-to-business and customer-to-business interactions over the Web. In such environments, service consumers depend heavily on the service interface specification to discover, invoke, and synthesize services over the Web. Data-centric Web services are services whose behavior is determined by their interactions with a repository of stored data. A major challenge in this domain is interpreting the data that must be marshaled between consumer and producer systems. While the Web Services Description Language (WSDL) is currently the de facto standard for Web services, it only specifies a service operation in terms of its syntactical inputs and outputs; it does not provide a means for specifying the underlying data model, nor does it specify how a service invocation affects the data. The lack of data specification potentially leads to erroneous use of the service by a consumer. In this work, we propose a formal contract for data-centric Web services. The goal is to formally and unambiguously specify the service behavior in terms of its underlying data model and data interactions. We address the specification of a single service, a flow of services interacting with a single data store, and also the specification of distributed transactions involving multiple Web services interacting with different autonomous data stores. We use the proposed formal contract to decrease ambiguity about a service behavior, to fully verify a composition of services, and to guarantee correctness and data integrity properties within a transactional composition of services. / Ph. D.
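As a rough illustration of the kind of contract the thesis formalizes (the actual work uses a formal specification, not Python), the sketch below models a CRUD data store whose operations carry explicit pre- and postconditions, i.e., the data-behavior information a WSDL interface alone cannot express. The class and exception names are assumptions made for the example.

```python
# Illustrative sketch only: a CRUD data store whose operations carry explicit
# pre-/postconditions, echoing the data contract the thesis specifies formally.

class ContractViolation(Exception):
    pass

class CrudStore:
    def __init__(self):
        self._rows = {}                                # key -> record

    def create(self, key, record):
        if key in self._rows:                          # precondition: key is fresh
            raise ContractViolation("create: key already exists")
        self._rows[key] = record
        assert self._rows[key] == record               # postcondition: record stored

    def read(self, key):
        if key not in self._rows:                      # precondition: key exists
            raise ContractViolation("read: unknown key")
        return self._rows[key]                         # postcondition: store unchanged

    def update(self, key, record):
        if key not in self._rows:                      # precondition: key exists
            raise ContractViolation("update: unknown key")
        self._rows[key] = record                       # postcondition: record replaced

    def delete(self, key):
        if key not in self._rows:                      # precondition: key exists
            raise ContractViolation("delete: unknown key")
        del self._rows[key]
        assert key not in self._rows                   # postcondition: key removed
```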
3

Cataloging Theory in Search of Graph Theory and Other Ivory Towers. Object: Cultural Heritage Resource Description Networks

Murray, Ronald J., Tillett, Barbara B. 18 July 2011 (has links)
Working paper summarizing research into cataloging theory, history of science, mathematics, and information science. / The report summarizes a research program that has been investigating how catalogers, other Cultural Heritage information workers, World Wide Web/Semantic Web technologists, and the general public understand, explain, and manage resource description tasks by creating, counting, measuring, classifying, and otherwise arranging descriptions of Cultural Heritage resources within the Bibliographic Universe and beyond it.
4

New Probabilistic Interest Measures for Association Rules

Hahsler, Michael, Hornik, Kurt January 2006 (has links) (PDF)
Mining association rules is an important technique for discovering meaningful patterns in transaction databases. Many different measures of interestingness have been proposed for association rules; however, these measures fail to take the probabilistic properties of the mined data into account. In this paper, we begin by presenting a simple probabilistic framework for transaction data which can be used to simulate transaction data when no associations are present. We use such data and a real-world database from a grocery outlet to explore the behavior of confidence and lift, two popular interest measures used for rule mining. The results show that confidence is systematically influenced by the frequency of the items in the left-hand side of rules and that lift performs poorly at filtering random noise in transaction data. Based on the probabilistic framework, we develop two new interest measures, hyper-lift and hyper-confidence, which can be used to filter or order mined association rules. The new measures show significantly better performance than lift for applications where spurious rules are problematic. / Series: Research Report Series / Department of Statistics and Mathematics
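To make the measures concrete, here is a small Python sketch computing confidence and lift for a rule X -> Y from item counts. The hyper-confidence line is one plausible reading based on a hypergeometric independence model and is an assumption, not the authors' exact definition.

```python
# Sketch of the classical measures; the hyper-confidence line is an assumption
# (hypergeometric model under independence), not the paper's verbatim definition.
from scipy.stats import hypergeom

def rule_measures(c_x, c_y, c_xy, m):
    """c_x, c_y: itemset counts; c_xy: joint count; m: number of transactions."""
    supp_x, supp_y, supp_xy = c_x / m, c_y / m, c_xy / m
    confidence = supp_xy / supp_x                 # estimate of P(Y | X)
    lift = confidence / supp_y                    # observed vs. expected under independence
    # probability of seeing fewer than c_xy joint occurrences if X and Y were
    # independent (hypergeometric sampling model)
    hyper_conf = hypergeom.cdf(c_xy - 1, m, c_x, c_y)
    return confidence, lift, hyper_conf

# toy example: 10,000 transactions, X in 500, Y in 400, X and Y together in 60
print(rule_measures(500, 400, 60, 10_000))
```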
5

Studies on applications of neural networks in modeling sparse datasets and in the analysis of dynamics of CA3 in hippocampus

Keshavarz Hedayati, Babak 23 April 2019 (has links)
Neural networks are an important tool in data science as well as in the study of the structures that inspired them, i.e., the human nervous system. In this dissertation, we studied the application of neural networks to data modeling as well as their role in studying the properties of structures in the nervous system. The dissertation has two foci: one relates to developing methods that improve generalization in data models, and the other is to study the possible effects of structure on function. As the first focus, we proposed a set of heuristics that improve the generalization capability of neural network models in regression and classification problems. To do so, we explored applying a priori information in the form of regularization of the models' behavior, using smoothness and self-consistency as the two attributes enforced on the behavior of the neural networks. We used the proposed heuristics to improve the performance of neural network ensembles in regression problems, more specifically in quantitative structure-activity relationship (QSAR) modeling, and demonstrated that they result in significant performance improvements. In addition, we developed an anomaly detection method to identify and exclude outliers among the unknown cases presented to the model, ensuring that the data model only makes predictions for cases within its domain of applicability; this filtering further improved performance in our experiments. With some modifications, we extended the proposed heuristics to classification problems and showed, over several datasets, that the regularizations employed in our heuristics also had a positive effect on performance across various classification problems. In the second part of the dissertation, we focused on the relationship between structure and function in the nervous system, more specifically whether structure implies function. To study these possible effects, we chose CA3b in the hippocampus and used the current literature to derive a physiologically plausible model of it. To make the model as close as possible to its counterpart in the nervous system, we ran large-scale neural simulations, in excess of 45,000 neurons. We used the collective firings of all neurons in the structure to produce a time-series signal, which approximates the overall output an EEG probe monitoring the structure would record, and treated it as the structure's output. In our simulations, the structure produced and maintained a low-frequency rhythm, which we believe is similar to the theta rhythm that occurs naturally in CA3b. We used the fundamental frequency of this rhythm to quantify the effects of modifications to the structure: we modified various structural properties (the length of the neurons' axons, the density of connections around the neurons, etc.) of the simulated CA3b and measured the changes in the fundamental frequency of the signal. Our results show that the structure was very resilient to such modifications. Finally, we studied the effects of lesions on this resilient structure. We introduced two types of lesions, many lesions of small radius and a few lesions of large radius, and increased their severity by increasing the number of lesions in the former case and the lesion radius in the latter. Our results showed that many small lesions have a more pronounced effect on the fundamental frequency than a few large ones. / Graduate
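As a concrete, if simplified, illustration of regularizing a network's behavior (this is an assumption about what a "smoothness" heuristic might look like, not the author's exact method), the sketch below adds a penalty on output changes under small input perturbations to the usual regression loss for a tiny one-hidden-layer model.

```python
# Minimal sketch, assuming one plausible instantiation of a smoothness
# regularizer: penalize output changes under small input perturbations.
import numpy as np

rng = np.random.default_rng(0)

def forward(params, x):
    w1, b1, w2, b2 = params
    h = np.tanh(x @ w1 + b1)          # one hidden layer
    return h @ w2 + b2

def loss(params, x, y, lam=0.1, eps=0.05):
    pred = forward(params, x)
    fit = np.mean((pred - y) ** 2)                           # data-fit term
    x_pert = x + eps * rng.standard_normal(x.shape)          # nearby inputs
    smooth = np.mean((forward(params, x_pert) - pred) ** 2)  # smoothness penalty
    return fit + lam * smooth

# toy data and random initial parameters, just to evaluate the loss once
x = rng.standard_normal((64, 3)); y = np.sin(x[:, :1])
params = (rng.standard_normal((3, 16)) * 0.5, np.zeros(16),
          rng.standard_normal((16, 1)) * 0.5, np.zeros(1))
print(loss(params, x, y))
```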
6

A new formal and analytical process to product modeling (PPM) method and its application to the precast concrete industry

Lee, Ghang 08 November 2004 (has links)
The current standard product (data) modeling process relies on the experience and subjectivity of data modelers, who use their experience to eliminate redundancies and identify omissions. As a result, product modeling becomes a social activity that involves iterative review processes by committees. This study aims to develop a new, formal method for deriving product models from data collected in process models of companies within an industry sector. The theoretical goals of this study are to provide a scientific foundation that bridges the requirements-collection phase and the logical-modeling phase of product modeling and to formalize the derivation and normalization of a product model from the processes it supports. To achieve these goals, a new and formal method, Georgia Tech Process to Product Modeling (GTPPM), has been proposed. GTPPM consists of two modules. The first, the Requirements Collection and Modeling (RCM) module, provides semantics and a mechanism to define a process model, the information items used by each activity, and the information flow between activities. Logic to dynamically check the consistency of information flow within a process has also been developed. The second, the Logical Product Modeling (LPM) module, integrates, decomposes, and normalizes information constructs collected from a process model into a preliminary product model. Nine design patterns are defined to resolve conflicts between information constructs (ICs) and to normalize the resulting model. These two modules have been implemented as a Microsoft Visio™ add-on; the tool has been registered and is also called GTPPM™. The method has been tested and evaluated in the precast concrete sector of the construction industry through several GTPPM modeling efforts. By using GTPPM, a complete set of information items required for product modeling for a medium or large industry can be collected without generalizing each company's unique process into one unified high-level model. However, the use of GTPPM is not limited to product modeling. It can be deployed in several other areas, including workflow management system or MIS (Management Information System) development, software specification development, and business process re-engineering.
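The sketch below is a toy illustration, not GTPPM's actual algorithm, of an information-flow consistency check of the kind the RCM module performs: every item an activity consumes must either be produced by an upstream activity or be declared as an external input. The activity and item names are invented to echo the precast concrete setting.

```python
# Toy sketch (not GTPPM's implementation) of an information-flow consistency
# check over an ordered process model.

def check_flow(activities, external_inputs):
    """activities: ordered list of dicts with 'name', 'consumes', 'produces'."""
    available = set(external_inputs)
    problems = []
    for act in activities:
        missing = set(act["consumes"]) - available       # consumed but never produced
        if missing:
            problems.append((act["name"], sorted(missing)))
        available |= set(act["produces"])                 # outputs become available downstream
    return problems

process = [
    {"name": "design piece",  "consumes": ["architect drawing"],               "produces": ["piece geometry"]},
    {"name": "detail rebar",  "consumes": ["piece geometry", "load table"],    "produces": ["rebar layout"]},
    {"name": "produce piece", "consumes": ["piece geometry", "rebar layout"],  "produces": ["piece"]},
]
print(check_flow(process, external_inputs=["architect drawing"]))
# -> [('detail rebar', ['load table'])]: 'load table' is consumed but never produced
```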
7

Modelagem para identificação de componentes de frações de petróleo. / Modeling for petroleum fractions components identification.

Lorena de Lima Farah 23 April 2018 (has links)
The purpose of this study is to evaluate the characteristics and classification of organic compounds in petroleum fractions as saturates, aromatics, resins, and asphaltenes (SARA) and to identify the possible homologous class (compounds of sulfur, nitrogen, oxygen, sulfur-oxygen, nitrogen-oxygen, or only hydrogen and carbon), building a model based on molecular weight and using a database in Excel. In the model, organic compounds are organized in an atomic matrix of homologous series. These are series of compounds with similar chemical properties that differ by a constant molecular weight (CH2). The compounds are then classified through heuristic hydrogen/carbon ratio (H/C) and double bond equivalent (DBE) relations. DBE is the number of rings or π bonds involving carbon, because each ring or π bond results in the loss of hydrogen atoms. The database was developed in Excel using VBA (Visual Basic for Applications) programming. The experimental data were obtained by the analytical technique of MALDI-TOF mass spectrometry. The results show that the VBA algorithm is able to identify the compounds of a given sample within an error margin defined according to the mass spectrometer calibration.
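For concreteness, the sketch below computes the standard DBE and H/C heuristics for an elemental composition. This is a small Python example of my own, not the thesis's VBA code, and the molecule is chosen only for illustration.

```python
# Sketch (not the thesis's VBA code): standard DBE and H/C heuristics for a
# composition C_c H_h N_n O_o S_s; divalent O and S do not change the DBE.

def dbe(c, h, n=0, o=0, s=0):
    """Double bond equivalent: rings plus pi bonds."""
    return c - h / 2 + n / 2 + 1

def h_to_c(c, h):
    """Hydrogen-to-carbon ratio."""
    return h / c

# toy example: dibenzothiophene, C12H8S -> DBE = 9, H/C ~ 0.67
print(dbe(12, 8, s=1), round(h_to_c(12, 8), 2))
```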
8

Modelagem para dados de parasitismo / Modelling parasitism data

João Mauricio Araújo Mota 23 September 2005 (has links)
Experiments with distinct aims have been conducted in order to study the parasitism mechanism; bioassays to find optimum conditions for parasite production and to define strategies for inundative releases in the field are very common. The number of parasitized eggs, for example, depends on factors such as species, type and density of the host, adult longevity, parasite density, type of food, temperature, and humidity. Hence, the objective of a specific assay may be to study the behavior of the response variable as a function of the number of parasitoids, the number of hosts, or the type of food. In general, the observable variables are counts, random sums of random variables, or proportions with fixed or random denominators. The Poisson is the standard distribution used to model counts, while the binomial is used for proportions. In general, these standard distributions do not fit data generated by the parasitism process because their assumptions are not satisfied, and alternative models that take into account the mechanism of avoiding superparasitism have appeared in the literature. Some of them assume that the refusal probability (avoidance of superparasitism) is a function of the number of eggs already present in the host (Bakker, 1967, 1972; Rogers, 1975; Griffiths, 1977), while others do not consider such a process (Daley and Maindonald, 1989; Griffiths, 1977). Others include the selective behavior of the parasite in the choice of the host and the ability of the host to attract the parasite (Hemerik et al., 2002). Some of these models appeared independently, others as generalizations; a study of their common points is therefore of interest. In this work, 19 probability models found in the literature are presented to explain the distribution of the number of eggs laid by a parasite on a specific host. As an initial result, the equivalence between some of these models is shown. It is also shown that a model developed by Faddy (1997) for estimating animal population size can be considered a generalization of 18 of the models. The properties of this model are presented and discussed. The equivalence between the Janardan and Faddy models is an original and interesting result. The use of Faddy's model as a general probability model for the distribution of the number of eggs in a parasite-host system is the main theoretical result of this thesis.
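The short simulation below is an assumption-laden illustration of the avoidance mechanism described above (it is not Faddy's model nor any of the 19 models reviewed): the probability of laying a further egg drops with the number of eggs already in the host, which typically yields counts that are under-dispersed relative to a Poisson.

```python
# Toy simulation of avoidance of superparasitism: the chance of laying one more
# egg falls with the number of eggs already present in the host.
import numpy as np

rng = np.random.default_rng(1)

def simulate_clutch(encounters=10, p0=0.6, decay=0.8):
    eggs = 0
    for _ in range(encounters):
        p_lay = p0 * decay ** eggs      # avoidance: probability falls with eggs present
        if rng.random() < p_lay:
            eggs += 1
    return eggs

counts = np.array([simulate_clutch() for _ in range(5000)])
print("mean:", counts.mean().round(2), "variance:", counts.var().round(2))
# compare mean and variance: avoidance mechanisms typically yield
# under-dispersion, which a plain Poisson (mean = variance) cannot capture
```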
9

Dynamic Behavior Visualizer: A Dynamic Visual Analytics Framework for Understanding Complex Networked Models

Maloo, Akshay 04 February 2014 (has links)
Dynamic Behavior Visualizer (DBV) is a visual analytics environment for visualizing the spatial and temporal movements and behavioral changes of an individual or a group, e.g., a family, within a realistic urban environment. DBV is specifically designed to visualize adaptive behavioral changes, as they pertain to interactions with multiple interdependent infrastructures, in the aftermath of a large crisis, e.g., a hurricane or the detonation of an improvised nuclear device. DBV is web-enabled and thus easily accessible to any user with a web browser. A novel aspect of the system is its scale and fidelity. The goal of DBV is to synthesize information and derive insight from it, detect the expected and discover the unexpected, and provide timely, easily understandable assessments along with the ability to piece all of this information together. / Master of Science
10

Remote-Sensed LIDAR Using Random Sampling and Sparse Reconstruction

Martinez, Juan Enrique Castorera 10 1900 (has links)
ITC/USA 2011 Conference Proceedings / The Forty-Seventh Annual International Telemetering Conference and Technical Exhibition / October 24-27, 2011 / Bally's Las Vegas, Las Vegas, Nevada / In this paper, we propose a new, low-complexity approach for the design of laser radar (LIDAR) systems for use in applications in which the system is wirelessly transmitting its data from a remote location back to a command center for reconstruction and viewing. Specifically, the proposed system collects random samples in different portions of the scene, and the density of sampling is controlled by the local scene complexity. The range samples are transmitted as they are acquired through a wireless communications link to a command center, and a constrained absolute-error optimization procedure of the type commonly used for compressive sensing/sampling is applied. The key difficulty in the proposed approach is estimating the local scene complexity without densely sampling the scene and thus increasing the complexity of the LIDAR front end. We show here using simulated data that the complexity of the scene can be accurately estimated from the return pulse shape using a finite moments approach. Furthermore, we find that such complexity estimates correspond strongly to the surface reconstruction error that is achieved using the constrained optimization algorithm with a given number of samples.
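The following sketch, written under stated assumptions rather than as the authors' algorithm, shows a generic version of the reconstruction step: random samples of a piecewise-constant range profile are fit by minimizing a total-variation (L1) objective subject to an absolute-error constraint, posed as a linear program with scipy. The paper's moment-based scene-complexity estimator is not reproduced here.

```python
# Generic sparse-reconstruction sketch (not the authors' exact algorithm):
# minimize ||D z||_1 subject to |z[idx] - y| <= eps, solved as a linear program.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)

# Piecewise-constant "depth profile" standing in for one LIDAR scan line.
n = 120
z_true = np.concatenate([np.full(40, 5.0), np.full(50, 8.0), np.full(30, 3.0)])

# Random sparse sampling with small measurement noise.
idx = np.sort(rng.choice(n, size=30, replace=False))
eps = 0.05
y = z_true[idx] + rng.uniform(-eps, eps, size=idx.size)

# Variables are [z (n values), t (n-1 slacks bounding |D z|)].
D = np.diff(np.eye(n), axis=0)                 # (n-1) x n finite-difference matrix
m = n - 1
c = np.concatenate([np.zeros(n), np.ones(m)])  # minimize sum of slacks = ||D z||_1

A_ub = np.vstack([
    np.hstack([ D, -np.eye(m)]),                              #  D z - t <= 0
    np.hstack([-D, -np.eye(m)]),                              # -D z - t <= 0
    np.hstack([ np.eye(n)[idx], np.zeros((idx.size, m))]),    #  z[idx] <= y + eps
    np.hstack([-np.eye(n)[idx], np.zeros((idx.size, m))]),    # -z[idx] <= eps - y
])
b_ub = np.concatenate([np.zeros(2 * m), y + eps, eps - y])
bounds = [(None, None)] * n + [(0, None)] * m

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
z_hat = res.x[:n]
print("mean |error|:", np.abs(z_hat - z_true).mean().round(3))
# residual error concentrates near the two jumps, whose exact location is
# ambiguous between neighboring samples
```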
