31 |
Fast Low Memory T-Transform: string complexity in linear time and space with applications to Android app store security. Rebenich, Niko, 27 April 2012
This thesis presents flott, the Fast Low Memory T-Transform, currently the fastest and most memory-efficient linear time and space algorithm available for computing the string complexity measure T-complexity. The flott algorithm uses 64.3% less memory and, in our experiments, runs asymptotically 20% faster than its predecessor. A full C implementation is provided and published under the Apache License 2.0. From the flott algorithm, two deterministic information measures are derived and applied to Android app store security: the normalized T-complexity distance and the instantaneous T-complexity rate, which are used to detect, locate, and visualize unusual information changes in Android applications. These information measures present a novel, scalable approach to assist with the detection of malware in app stores.
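The normalized T-complexity distance belongs to the same family as other compression-based distances: compare the complexity of the concatenation against the individual complexities. A minimal sketch of that family, using zlib as a stand-in compressor for the T-transform (the generic normalized-compression-distance formula and all names below are illustrative assumptions, not the thesis's exact definitions):

```python
import zlib

def complexity(data: bytes) -> int:
    # Stand-in for T-complexity: size of the zlib-compressed data.
    return len(zlib.compress(data, 9))

def normalized_distance(x: bytes, y: bytes) -> float:
    # Generic normalized compression distance: small when x and y
    # share structure, near 1 when they are unrelated.
    cx, cy, cxy = complexity(x), complexity(y), complexity(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

if __name__ == "__main__":
    a = b"GET /index.html HTTP/1.1\r\n" * 50
    b_ = b"GET /index.html HTTP/1.1\r\n" * 49 + b"GET /admin HTTP/1.1\r\n"
    c = bytes(range(256)) * 5
    print(normalized_distance(a, b_))  # near-identical payloads: small
    print(normalized_distance(a, c))   # unrelated data: larger
```

Applied across successive versions or regions of an app package, a sudden jump in such a distance flags an "unusual information change" of the kind the thesis visualizes.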
|
32 |
Modelos de compressão de dados para classificação e segmentação de texturas (Data compression models for texture classification and segmentation). Honório, Tatiane Cruz de Souza, 31 August 2010
This work analyzes methods for texture image classification and segmentation using
models from lossless data compression algorithms. Two compression algorithms are
evaluated: Prediction by Partial Matching (PPM) and Lempel-Ziv-Welch (LZW), the
latter of which had been applied to texture classification in previous work. The textures are pre-processed
using histogram equalization. The classification method has two stages. In the
learning (training) stage, the compression algorithm builds statistical models for the
horizontal and vertical structures of each class. In the classification stage, texture
samples to be classified are compressed using the models built during learning, sweeping each
sample horizontally and vertically. A sample is assigned to the class that yields the highest
average compression. The classifiers were tested on the Brodatz texture album, for
various context sizes (in the PPM case), numbers of samples, and training sets. For some
combinations of these parameters, the classifiers achieved 100% correct classification.
Texture segmentation was performed with PPM only. Initially, horizontal models are
created using eight texture samples of 32 x 32 pixels per class, with a PPM context of
maximum size 1. The images to be segmented are compressed with the class models,
initially in blocks of 64 x 64 pixels. If none of the models achieves a compression ratio
within a predetermined interval, the block is divided into four 32 x 32 blocks. The
process repeats until some model reaches a compression ratio in the range defined for the
current block size. If a block reaches size 4 x 4, it is assigned to the class of the model that achieved the highest
compression ratio.
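The classification idea above admits a compact sketch. Plain LZW string tables stand in for the thesis's PPM/LZW models, a single scan direction replaces the horizontal/vertical pair, and the byte-string "textures" are illustrative; the core step is the one described: assign a sample to the class whose frozen model compresses it best.

```python
def lzw_train(data: bytes) -> dict:
    # Build an LZW string table from training data for one texture class.
    table = {bytes([i]): i for i in range(256)}
    w = b""
    for b in data:
        wc = w + bytes([b])
        if wc in table:
            w = wc
        else:
            table[wc] = len(table)
            w = bytes([b])
    return table

def lzw_code_count(data: bytes, table: dict) -> int:
    # Parse data against a FROZEN table; fewer emitted codes = better fit.
    count, w = 0, b""
    for b in data:
        wc = w + bytes([b])
        if wc in table:
            w = wc
        else:
            count += 1
            w = bytes([b])
    return count + (1 if w else 0)

def classify(sample: bytes, models: dict) -> str:
    # Assign the sample to the class whose model compresses it best.
    return min(models, key=lambda cls: lzw_code_count(sample, models[cls]))

if __name__ == "__main__":
    models = {
        "stripes": lzw_train(b"\x00\xff" * 512),        # alternating texture
        "checker": lzw_train(b"\x00\x00\xff\xff" * 256),  # doubled texture
    }
    # A striped sample parses into few codes under the stripes model.
    print(classify(b"\x00\xff" * 64, models))
```

The block-splitting segmentation loop then simply calls such a classifier on 64 x 64, 32 x 32, ... blocks until the compression ratio falls in the interval set for that block size.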
|
33 |
Modern Error Control Codes and Applications to Distributed Source Coding. Sartipi, Mina, 15 August 2006
This dissertation first studies two-dimensional wavelet codes (TDWCs). TDWCs
are introduced as a solution to the problem of designing a 2-D code that has low decoding
complexity and the maximum erasure-correcting property for rectangular burst erasures.
The half-rate TDWCs of dimensions N<sub>1</sub> X N<sub>2</sub> satisfy the Reiger bound with equality for
burst erasures of dimensions N<sub>1</sub> X N<sub>2</sub>/2 and N<sub>1</sub>/2 X N<sub>2</sub>, where GCD(N<sub>1</sub>, N<sub>2</sub>) = 2. Examples
of TDWCs are provided that recover any rectangular burst erasure of area N<sub>1</sub>N<sub>2</sub>/2. These
lattice-cyclic codes can recover burst erasures with simple and efficient ML decoding.
This work then studies the problem of distributed source coding for two and three correlated signals using channel codes. We propose to model the distributed source coding
problem with a set of parallel channels, which reduces distributed source coding to the
design of non-uniform channel codes. This design criterion improves the performance of the
source coding considerably. LDPC codes are used for lossless and lossy distributed source
coding, whether the correlation parameter is known or unknown at the time of code design.
We show that distributed source coding at the corner point using LDPC codes reduces
to designing non-uniform LDPC codes and semi-random punctured LDPC codes for systems of two
and three correlated sources, respectively. We also investigate distributed source coding at
an arbitrary rate on the Slepian-Wolf rate region; this problem reduces to designing
a rate-compatible LDPC code with the unequal error protection property. This dissertation
finally studies the distributed source coding problem for applications whose wireless channel is an erasure channel with unknown erasure probability. For these applications, rateless
codes are better candidates than LDPC codes, and non-uniform rateless codes and an improved
decoding algorithm are proposed for this purpose. We introduce a reliable, rate-optimal,
and energy-efficient multicast algorithm that uses distributed source coding and rateless
coding. The proposed multicast algorithm performs very close to network coding while
having lower complexity and higher adaptability.
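The dissertation's corner-point construction uses LDPC codes; the underlying syndrome/binning idea can be shown at toy scale with a Hamming(7,4) code instead (an illustrative substitution, not the dissertation's construction). The encoder transmits only the 3-bit syndrome of each 7-bit block of X; the decoder recovers X from the syndrome plus the side information Y, assuming X and Y differ in at most one bit per block.

```python
# Parity-check matrix of the Hamming(7,4) code: column j is the
# binary representation of j (LSB in the first row).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome(x):
    # 3-bit syndrome of a 7-bit word under H.
    return [sum(h * b for h, b in zip(row, x)) % 2 for row in H]

def encode(x):
    # Slepian-Wolf style binning: send only the syndrome (3 bits, not 7).
    return syndrome(x)

def decode(s, y):
    # Recover x from its syndrome s and side information y, assuming
    # x and y differ in at most one position per 7-bit block.
    diff = [a ^ b for a, b in zip(s, syndrome(y))]
    pos = diff[0] + 2 * diff[1] + 4 * diff[2]  # syndrome = error position
    x_hat = list(y)
    if pos:
        x_hat[pos - 1] ^= 1
    return x_hat
```

Here the rate is 3/7 bit per source bit; LDPC codes play the same role at block lengths where this binning becomes efficient for realistic correlation levels.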
|
34 |
Robust Techniques of Language Modeling for Spoken Language Identification. Basavaraja, S V, January 2007
Language Identification (LID) is the task of automatically identifying the language of a speech signal uttered by an unknown speaker. An N-language LID task is to classify an input speech utterance, spoken by an unknown speaker and of unknown text, as belonging to one of the N languages L1, L2, ..., LN.
We present a new approach to spoken language modeling for language identification using the Lempel-Ziv-Welch (LZW) algorithm, with which we try to overcome the limitations of n-gram stochastic models by automatically identifying the valid set of variable-length patterns from the training data. However, since several patterns in one language's pattern table are also shared by other languages' pattern tables, confusability arises in the LID task. To overcome this, three pruning techniques are proposed to make the pattern tables more language specific. For LID with limited training data, we present another language modeling technique, which compensates for language-specific patterns missing from the language-specific LZW pattern table. We develop two new discriminative measures for LID based on the LZW algorithm: (i) the Compression Ratio Score (LZW-CRS) and (ii) the Weighted Discriminant Score (LZW-WDS). It is shown that for a 6-language LID task on the OGI-TS database, the new model (LZW-WDS) significantly outperforms the conventional bigram approach.
With regard to the front end of the LID system, we develop a modified technique for modeling Acoustic Sub-Word Units (ASWUs) and explore its effectiveness. The segmentation of the speech signal is done using an acoustic criterion (ML segmentation). However, we believe that consistency and discriminability among speech units is the key issue for the success of ASWU-based speech processing. We develop a new procedure for clustering and modeling the segments using sub-word GMMs. Because of the flexibility in choosing the labels for the sub-word units, we perform iterative re-clustering and modeling of the segments. Using a consistency measure of labeling the acoustic segments, the convergence of the iterations is demonstrated. We show that the new ASWU-based front end combined with the new LZW-based back end outperforms the earlier reported PSWR-based LID.
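The pattern-table machinery above can be sketched compactly. Character tokens stand in for speech-derived symbols, a single naive "drop shared patterns" rule stands in for the three pruning techniques, and the score is a simple compression-ratio score; all names and the toy training data are illustrative assumptions.

```python
from collections import Counter

def build_table(tokens):
    # LZW-style variable-length pattern table over a token sequence.
    table = {(t,) for t in tokens}  # seed with observed unigrams
    w = ()
    for t in tokens:
        wc = w + (t,)
        if wc in table:
            w = wc
        else:
            table.add(wc)
            w = (t,)
    return table

def prune_shared(tables):
    # Naive pruning: drop multi-token patterns that occur in more than
    # one language table, keeping each table language specific.
    counts = Counter(p for tab in tables.values() for p in tab if len(p) > 1)
    shared = {p for p, n in counts.items() if n > 1}
    return {lang: tab - shared for lang, tab in tables.items()}

def crs(tokens, table):
    # Compression-ratio score: input tokens per emitted pattern code.
    codes, w = 0, ()
    for t in tokens:
        wc = w + (t,)
        if wc in table:
            w = wc
        else:
            codes += 1
            w = (t,) if (t,) in table else ()
    return len(tokens) / max(codes + (1 if w else 0), 1)

def identify(tokens, tables):
    # Pick the language whose pruned pattern table compresses best.
    return max(tables, key=lambda lang: crs(tokens, tables[lang]))
```

With tables built from, say, English and Portuguese training text and then pruned, an English test utterance scores highest under the English table because its language-specific patterns survive pruning while shared ones are discarded.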
|
35 |
Optimized information processing in resource-constrained vision systems: from low-complexity coding to smart sensor networks. Morbee, Marleen, 14 October 2011
Vision systems have become ubiquitous. They are used for traffic monitoring, elderly care, video conferencing, virtual reality, surveillance, smart rooms, home automation, sports game analysis, industrial safety, medical care, etc. In most vision systems, the data coming from the visual sensor(s) is processed before transmission in order to save communication bandwidth or achieve higher frame rates. The type of data processing needs to be chosen carefully depending on the targeted application, taking into account the available memory, computational power, energy resources, and bandwidth constraints.
In this dissertation, we investigate how a vision system should be built under practical constraints. First, the system should be intelligent, so that the right data is extracted from the video source. Second, when processing video data, this intelligent vision system should know its own practical limitations and should try to achieve the best possible output within its capabilities. We study and improve a wide range of vision systems for a variety of applications, each with different types of constraints.
First, we present a modulo-PCM-based coding algorithm for applications that demand very low-complexity coding and need to preserve some of the advantageous properties of PCM coding (direct processing, random access, rate scalability). Our modulo-PCM coding scheme combines three well-known, simple source coding strategies: PCM, binning, and interpolative coding. The encoder first analyzes the signal statistics in a very simple way; then, based on these statistics, it simply discards a number of bits from each image sample. The modulo-PCM decoder recovers the discarded bits of each sample using the received bits and side information generated by interpolating previously decoded signals. Our algorithm is especially appropriate for image coding. / Morbee, M. (2011). Optimized information processing in resource-constrained vision systems. From low-complexity coding to smart sensor networks [unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/12126
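The modulo idea admits a minimal sketch. Assumptions made here for illustration: the first sample is transmitted at full precision, and the side-information predictor is simply the previous decoded sample (zero-order interpolation) rather than the interpolative scheme of the thesis; function names are hypothetical.

```python
def mod_pcm_encode(samples, k):
    # Send the first sample in full, then only the k LSBs (value mod 2^k)
    # of each subsequent sample.
    m = 1 << k
    return [samples[0]] + [s % m for s in samples[1:]]

def mod_pcm_decode(received, k):
    # Reconstruct each sample as the value congruent to the received
    # bits mod 2^k that lies closest to the side-information prediction.
    m = 1 << k
    decoded = [received[0]]  # first sample arrives at full precision
    for r in received[1:]:
        p = decoded[-1]  # prediction: previous decoded sample
        base = (p // m) * m + r
        decoded.append(min((base - m, base, base + m),
                           key=lambda c: abs(c - p)))
    return decoded
```

Decoding is exact whenever consecutive samples differ by less than 2^(k-1), which is why the encoder's simple analysis of the signal statistics decides how many bits can safely be discarded.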
|