81
Achieving Perfect Location Privacy in Wireless Devices Using Anonymization
Montazeri, Zarrin (24 March 2017)
The popularity of mobile devices and location-based services (LBS) has created great concerns regarding the location privacy of the users of such devices and services. Anonymization is a common technique used to protect the location privacy of LBS users: it assigns a random pseudonym to each user, and these pseudonyms can change over time. Here, we provide a general information-theoretic definition of perfect location privacy and prove that perfect location privacy is achievable for mobile devices when the anonymization technique is used appropriately. First, we assume that the user's current location is independent of her past locations. Using this i.i.d. model, we show that if the pseudonym of the user is changed before O(n^(2/(r−1))) anonymized observations are made by the adversary for that user, then she has perfect location privacy, where n is the number of users in the network and r is the number of all possible locations that the user might occupy. Then, we model each user's movement by a Markov chain, so that a user's current location depends on her previous locations, which is a more realistic model when approximating real-world data. We show that perfect location privacy is achievable in this model if the pseudonym of the user is changed before O(n^(2/(|E|−r))) anonymized observations are collected by the adversary for that user, where |E| is the number of edges in the user's Markov model.
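The two thresholds lend themselves to a quick numerical illustration. Below is a minimal Python sketch (the helper name is hypothetical, and the constants hidden by the O(·) notation are ignored) that evaluates both scaling laws for a given network size:

```python
def pseudonym_change_threshold(n, r, num_edges=None):
    """Illustrative thresholds from the two results above (hypothetical helper).

    i.i.d. model:  change the pseudonym before ~ n^(2/(r-1)) observations.
    Markov model:  before ~ n^(2/(|E|-r)) observations, |E| = number of edges.
    Constants hidden by the O(.) notation are ignored.
    """
    iid = n ** (2.0 / (r - 1))
    markov = n ** (2.0 / (num_edges - r)) if num_edges is not None else None
    return iid, markov

# Example: 100,000 users, r = 5 locations, |E| = 8 edges in the mobility graph.
iid_t, markov_t = pseudonym_change_threshold(100_000, 5, 8)
print(f"i.i.d. model: ~{iid_t:,.0f} observations; Markov model: ~{markov_t:,.0f}")
```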
82
Implementation and verification of the Information Bottleneck interpretation of deep neural networks
Liu, Feiyang (January 2018)
Although deep neural networks (DNNs) have made remarkable achievements in various fields, there is still no matching practical theory able to explain DNNs' performance. Tishby (2015) proposed a new way to analyze DNNs via the Information Bottleneck (IB) method. By visualizing how much relevant information each layer contains about the input and output, he claimed that DNN training is composed of a fitting phase and a compression phase. The fitting phase is when DNNs learn information about both input and output, and the prediction accuracy rises during this process. Afterwards comes the compression phase, when information about the output is preserved while unrelated information about the input is discarded in the hidden layers. This is a tradeoff between network complexity (complicated DNNs lose less input information) and prediction accuracy, which is the same goal as the IB method. In this thesis, we verify this IB interpretation first by reimplementing Tishby's work, where the hidden-layer distribution is approximated by a histogram (binning). Additionally, we introduce various mutual information estimation methods such as kernel density estimators. Based on simulation results, we conclude that there exists an optimal bound on the mutual information between the hidden layers and the input and output, but the compression mainly occurs when the activation function is "double saturated", like the hyperbolic tangent function. Furthermore, we extend the work to a simulated wireless model where the data set is generated by a wireless system simulator. The results reveal that the IB interpretation holds, but binning is not a correct tool to approximate hidden-layer distributions. The findings of this thesis reflect the information variations in each layer during training, which might contribute to selecting transmission parameter configurations in each frame in wireless communication systems.
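The histogram (binning) estimator that the thesis re-examines is easy to state. Below is a minimal Python sketch for 1-D activations (variable names are illustrative); as the thesis concludes, the result depends strongly on the bin count:

```python
import numpy as np

def mutual_information_binned(x, y, bins=30):
    """Plug-in histogram estimate of I(X;Y) in bits for two 1-D samples."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of X
    py = pxy.sum(axis=0, keepdims=True)   # marginal of Y
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
hidden = np.tanh(rng.normal(size=10_000))      # stand-in for a "double saturated" activation
nxt = hidden + 0.3 * rng.normal(size=10_000)   # stand-in for the next layer
print(f"I(hidden; next) ~ {mutual_information_binned(hidden, nxt):.2f} bits")
```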
83
Medical Image Registration and Application to Atlas-Based Segmentation
Guo, Yujun (1 May 2007)
No description available.
84
MIMO block-fading channels with mismatched CSI
Asyhari, A. Taufiq; Guillén i Fàbregas, A. (23 August 2014)
We study transmission over multiple-input multiple-output (MIMO) block-fading channels with imperfect channel state information (CSI) at both the transmitter and receiver. Specifically, based on mismatched decoding theory for a fixed channel realization, we investigate the largest achievable rates with independent and identically distributed inputs and a nearest neighbor decoder. We then study the corresponding information outage probability in the high signal-to-noise ratio (SNR) regime and analyze the interplay between estimation error variances at the transmitter and at the receiver to determine the optimal outage exponent, defined as the high-SNR slope of the outage probability plotted on a logarithmic-logarithmic scale against the SNR. We demonstrate that despite operating with imperfect CSI, power adaptation can offer substantial gains in terms of outage exponent. / A. T. Asyhari was supported in part by the Yousef Jameel Scholarship, University of Cambridge, Cambridge, U.K., and the National Science Council of Taiwan under grant NSC 102-2218-E-009-001. A. Guillén i Fàbregas was supported in part by the European Research Council under ERC grant agreement 259663 and the Spanish Ministry of Economy and Competitiveness under grant TEC2012-38800-C03-03.
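For intuition on the outage exponent, the outage probability can be estimated by Monte Carlo simulation and its high-SNR slope read off a log-log plot. The sketch below is a simplification that assumes perfect CSI and the Gaussian-input mutual information log2 det(I + (SNR/nt) H H^H), not the mismatched-decoding rates derived in the paper (all names hypothetical):

```python
import numpy as np

def outage_probability(snr_db, rate_bits, nt=2, nr=2, trials=200_000, seed=0):
    """Monte Carlo outage probability of an i.i.d. Rayleigh MIMO block-fading channel."""
    rng = np.random.default_rng(seed)
    snr = 10 ** (snr_db / 10)
    h = (rng.normal(size=(trials, nr, nt)) + 1j * rng.normal(size=(trials, nr, nt))) / np.sqrt(2)
    gram = np.eye(nr) + (snr / nt) * h @ h.conj().transpose(0, 2, 1)
    mi = np.log2(np.linalg.det(gram).real)   # mutual information per channel realization
    return float(np.mean(mi < rate_bits))

# The outage exponent is (minus) the slope of log10(P_out) against log10(SNR).
for snr_db in (10, 15, 20):
    print(snr_db, "dB:", outage_probability(snr_db, rate_bits=4.0))
```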
85
Termodinâmica e informação em redes quânticas lineares / Thermodynamics and information in linear quantum lattices
Malouf, William Tiago Batista (24 May 2019)
When a quantum system is coupled to several heat baths at different temperatures, it eventually reaches a non-equilibrium steady state (NESS) featuring stationary internal heat currents. On the one hand, these currents cause decoherence and produce entropy in the system; on the other hand, their existence also induces correlations between different parts of the system. In this work, we explore this twofold aspect of NESSs. Using phase-space techniques we calculate the Wigner entropy production in general linear networks of harmonic nodes. Working in the ubiquitous limit of weak internal coupling and weak dissipation, we obtain simple closed-form expressions for the entropic contribution of each individual quasi-probability current. Our analysis also shows that it is exclusively the (reversible) internal dynamics that maintains the stationary (irreversible) entropy production. From the informational point of view, we address how to quantify the amount of information that disconnected parts of a quantum chain share in a non-equilibrium steady state. As we show, this is more precisely captured by the conditional mutual information (CMI), a more general quantifier of tripartite correlations than the usual mutual information. As an application, we apply our framework to the paradigmatic problem of energy transfer through a chain of oscillators subject to self-consistent internal baths that can be used to tune the transport from ballistic to diffusive. We find that the entropy production scales with different power-law behaviors in the ballistic and diffusive regimes, hence allowing us to quantify the "entropic cost of diffusivity". We also compute the CMI for arbitrary chain sizes and thus find the scaling rules connecting information sharing and diffusivity. Finally, we discuss how this new perspective on the characterization of non-equilibrium systems may be applied to understand the issue of local equilibration in non-equilibrium states.
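As a discrete toy analogue of the CMI used here (the thesis works with Gaussian states in phase space, which is not reproduced), the identity I(A;C|B) = H(A,B) + H(B,C) − H(B) − H(A,B,C) can be evaluated directly from a joint distribution:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability array (any shape)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def conditional_mutual_information(p_abc):
    """I(A;C|B) for a discrete joint distribution p(a, b, c), axes = (A, B, C)."""
    p_ab = p_abc.sum(axis=2)
    p_bc = p_abc.sum(axis=0)
    p_b = p_abc.sum(axis=(0, 2))
    return entropy(p_ab) + entropy(p_bc) - entropy(p_b) - entropy(p_abc)

# For a Markov chain A -> B -> C, I(A;C|B) = 0; correlations between the chain
# ends that are not mediated by the middle show up as a positive CMI.
rng = np.random.default_rng(1)
p = rng.random((2, 3, 2))
p /= p.sum()
print(f"I(A;C|B) = {conditional_mutual_information(p):.4f} bits")
```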
86
Stochastic density ratio estimation and its application to feature selection / Estimação estocástica da razão de densidades e sua aplicação em seleção de atributos
Braga, Ígor Assis (23 October 2014)
The estimation of the ratio of two probability densities is an important statistical tool in supervised machine learning. In this work, we introduce new methods of density ratio estimation based on the solution of a multidimensional integral equation involving cumulative distribution functions. The resulting methods use the novel V-matrix, a concept that does not appear in previous density ratio estimation methods. Experiments demonstrate the good potential of this new approach against previous methods. Mutual information (MI) estimation is a key component in feature selection and essentially depends on density ratio estimation. Using one of the methods of density ratio estimation proposed in this work, we derive a new estimator, VMI, and compare it experimentally to previously proposed MI estimators. Experiments conducted solely on mutual information estimation show that VMI compares favorably to previous estimators. Experiments applying MI estimation to feature selection in classification tasks show that better MI estimation leads to better feature selection performance. Parameter selection greatly impacts the classification accuracy of kernel-based Support Vector Machines (SVM). However, this step is often overlooked in experimental comparisons, for it is time consuming and requires familiarity with the inner workings of SVM. In this work, we propose procedures for SVM parameter selection that are economical in their running time. In addition, we propose the use of a non-linear kernel function, the min kernel, that can be applied to both low- and high-dimensional cases without adding another parameter to the selection process. The combination of the proposed parameter selection procedures and the min kernel yields a convenient way of economically extracting good classification performance from SVM. The Regularized Least Squares (RLS) regression method is another kernel method that depends on proper selection of its parameters. When training data is scarce, traditional parameter selection often leads to poor regression estimation. In order to mitigate this issue, we explore a kernel that is less susceptible to overfitting, the additive INK-splines kernel. Then, we consider alternative parameter selection methods to cross-validation that have been shown to perform well for other regression methods. Experiments conducted on real-world datasets show that the additive INK-splines kernel outperforms both the RBF and the previously proposed multiplicative INK-splines kernel. They also show that the alternative parameter selection procedures fail to consistently improve performance. Still, we find that the Finite Prediction Error method with the additive INK-splines kernel performs comparably to cross-validation.
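The min kernel mentioned above has a particularly compact form, K(x, y) = Σ_i min(x_i, y_i). Below is a minimal sketch of its use through scikit-learn's precomputed-kernel interface (the data are synthetic; because the kernel itself has no width parameter, only the SVM cost C remains to be selected, which is the parameter-economy point made above):

```python
import numpy as np
from sklearn.svm import SVC

def min_kernel(X, Y):
    """Gram matrix of the min kernel K(x, y) = sum_i min(x_i, y_i).
    Assumes non-negative features."""
    return np.minimum(X[:, None, :], Y[None, :, :]).sum(axis=2)

rng = np.random.default_rng(0)
X_train = rng.random((80, 10))
y_train = (X_train[:, 0] + X_train[:, 1] > 1.0).astype(int)
X_test = rng.random((20, 10))

clf = SVC(kernel="precomputed", C=1.0)
clf.fit(min_kernel(X_train, X_train), y_train)          # Gram matrix: train vs train
predictions = clf.predict(min_kernel(X_test, X_train))  # Gram matrix: test vs train
```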
87
Plans prédictifs à taille fixe et séquentiels pour le krigeage / Fixed-size and sequential designs for kriging
Abtini, Mona (30 August 2018)
In recent years, computer simulation models are increasingly used to study complex phenomena. Such problems usually rely on very large, sophisticated simulation codes that are very expensive in computing time. The exploitation of these codes becomes a problem, especially when the objective requires a significant number of evaluations of the code. In practice, the code is replaced by a global approximation model, often called a metamodel, most commonly a Gaussian process (kriging) adjusted to a design of experiments, i.e. to observations of the model output obtained on a small number of simulations. Space-filling designs, which spread the design points evenly over the entire feasible input region, are the most used designs. This thesis consists of two parts; the main focus of both is the construction of designs of experiments adapted to kriging, one of the most popular metamodels. Part I considers the construction of space-filling designs of fixed size adapted to kriging prediction. This part starts by studying the effect of the Latin Hypercube constraint (the design most used in practice with kriging) on maximin-optimal designs. This study shows that when the design has a small number of points, adding the Latin Hypercube constraint is useful because it mitigates the drawbacks of maximin-optimal configurations (the position of the majority of points at the boundary of the input space). Following this study, a uniformity criterion called radial discrepancy is proposed in order to measure the uniformity of the design points according to their distance to the boundary of the input space. We then show that the minimax-optimal design is the closest design to the IMSE design (the design adapted to prediction by kriging) but is also very difficult to evaluate, and we introduce a proxy for the minimax-optimal design based on the maximin-optimal design. Finally, we present an optimized implementation of the simulated annealing algorithm to find maximin-optimal designs; our aim here is to minimize the probability of the simulated annealing falling into a local minimum configuration. The second part of the thesis concerns a slightly different problem. If X_N is a space-filling design of N points, there is no guarantee that any n points of X_N (1 ≤ n ≤ N) constitute a space-filling design. In practice, however, we may have to stop the simulations before the full realization of the design. The aim of this part is therefore to propose a new methodology to construct sequences of space-filling designs (nested designs) of experiments X_n for any n between 1 and N that are all adapted to kriging prediction. We introduce a method to generate nested designs based on information criteria, particularly the Mutual Information criterion, which measures the reduction of the prediction uncertainty over the whole domain before and after observing the response at the design points. This method ensures good quality for all the designs generated, 1 ≤ n ≤ N. A key difficulty of this method is that the time needed to generate an MI-sequential design in the high-dimensional case is very large. To address this issue, a particular implementation, which calculates the determinant of a given matrix by partitioning it into blocks, has been proposed; this implementation allows a significant reduction of the computational cost of MI-sequential designs.
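As a rough illustration of the maximin/Latin-Hypercube interplay studied in Part I, the sketch below searches for a maximin Latin Hypercube design by random restarts; it is a crude stand-in for the tuned simulated-annealing optimization of the thesis (all names hypothetical):

```python
import numpy as np
from scipy.spatial.distance import pdist

def latin_hypercube(n, d, rng):
    """One random Latin Hypercube sample of n points in [0, 1]^d."""
    cells = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (cells + rng.random((n, d))) / n

def maximin_lhs(n, d, n_restarts=2000, seed=0):
    """Keep the LHS constraint; maximize the smallest inter-point distance."""
    rng = np.random.default_rng(seed)
    best, best_score = None, -np.inf
    for _ in range(n_restarts):
        x = latin_hypercube(n, d, rng)
        score = pdist(x).min()
        if score > best_score:
            best, best_score = x, score
    return best, best_score

design, min_dist = maximin_lhs(n=20, d=2)
print(f"smallest pairwise distance: {min_dist:.3f}")
```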
88
Classification d'images RSO polarimétriques à haute résolution spatiale sur site urbain / High-resolution polarimetric SAR image classification on urban areas
Soheili Majd, Maryam (28 April 2014)
In this research, our aim is to assess the potential of a single-look, high-spatial-resolution polarimetric radar image for the classification of urban areas. For that purpose, we concentrate on classes corresponding to different kinds of roofs, objects and ground surfaces. At first, we propose a univariate statistical analysis of polarimetric and texture attributes that can be used in a classification algorithm. We perform a statistical analysis of the descriptors and show that the Fisher distribution is suitable for most of them. We then propose a modification of the maximum likelihood algorithm based on the Fisher distribution and train it with all of our attributes. We obtain a significant improvement in our results with the new algorithm, but a limitation appears in recognizing some roofs. The shape of rectangular buildings is then recognized by morphological operations on the radar amplitude image. This spatial information is introduced into the Fisher-based classification process as a constraint term, and we show that classification results are improved; in particular, it overcomes classification ambiguities between flat-roof pixels and tree pixels. In a second step, some well-known supervised classification algorithms are used: maximum likelihood based on the complex Gaussian distribution (univariate) and on the multivariate complex Gaussian using the coherency matrix, as well as the support vector machine as a nonparametric classification algorithm. Moreover, a feature selection based on a Genetic Algorithm using Mutual Information (GA-MI) is adapted to provide an optimal feature subset to the classification method. To illustrate the efficiency of subset selection based on GA-MI, we compare the optimal subset with different target decompositions based on different scattering mechanisms, including the Pauli, Krogager, Freeman, Yamaguchi, Barnes, Holm, Huynen and Cloude decompositions. Our experiments are based on an image of a suburban area, acquired by the airborne RAMSES SAR sensor of ONERA in 2006, with a spatial spacing of 35 cm. The results highlight the potential of such data to discriminate some urban land cover types.
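A simplified illustration of the MI half of GA-MI: scoring candidate attributes by their mutual information with the class label, here with scikit-learn's estimator on synthetic data (the genetic search over subsets is not reproduced):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Synthetic stand-in for a table of polarimetric/texture attributes per pixel.
X, y = make_classification(n_samples=500, n_features=12, n_informative=4,
                           random_state=0)
mi_scores = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(mi_scores)[::-1]
print("attributes ranked by MI with the class label:", ranking)
```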
89
Analyse et conception de code espace-temps en blocs pour transmissions MIMO codées
EL FALOU, Ammar (23 May 2013)
Most of the modern wireless communication systems, such as WiMAX, DVB-NGH, WiFi, HSPA+ and 4G, have adopted the use of multiple antennas at the transmitter and the receiver, called multiple-input multiple-output (MIMO). Space-time coding for MIMO systems is a promising technology to increase the data rate and enhance the reliability of wireless communications. Space-time block codes (STBCs) are commonly designed according to the rank-determinant criteria, which are suitable at high signal-to-noise ratios (SNRs). In contrast, wireless communication standards employ MIMO technology with capacity-approaching forward-error correcting (FEC) codes like turbo codes and low-density parity-check (LDPC) codes, ensuring low error rates even at low SNRs. In this thesis, we investigate the design of STBCs for MIMO systems with capacity-approaching FEC codes. We start by proposing a non-asymptotic STBC design criterion based on maximizing the bitwise mutual information (BMI) between transmitted bits and soft estimated bits at a specific target SNR. According to the BMI criterion, we optimize several conventional STBCs. Their design parameters are shown to be SNR-dependent, leading to the proposal of adaptive STBCs. The proposed adaptive STBCs offer identical or better performance than standard WiMAX profiles for all coding rates, without increasing the detection complexity. Among them, the proposed adaptive trace-orthonormal STBC can pass continuously from spatial multiplexing, suitable at low SNRs and therefore at low coding rates, to the Golden code, optimal at high SNRs. Uncorrelated channels, correlated channels and transmit antenna selection are considered, and we design adaptive STBCs for these cases offering identical or better performance than conventional non-adaptive STBCs. In addition, conventional STBCs are designed so as to achieve the asymptotic diversity-multiplexing tradeoff (DMT) frontier. Recently, the finite-SNR DMT has been proposed to characterize the DMT at finite SNRs. Our last contribution consists of the derivation of the exact finite-SNR DMT for MIMO channels with dual antennas at the transmitter and/or the receiver. Both uncorrelated and correlated Rayleigh fading channels are considered. It is shown that at realistic SNRs, achievable diversity gains are significantly lower than asymptotic values. This finite-SNR analysis could provide new insights into the design of STBCs at operational SNRs.
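For readers unfamiliar with STBCs, the sketch below encodes symbol pairs with the classical Alamouti code, the simplest orthogonal STBC; it is a generic stand-in, not the adaptive trace-orthonormal or Golden codes optimized in the thesis:

```python
import numpy as np

def alamouti_encode(symbols):
    """Alamouti STBC: map pairs (s1, s2) to 2x2 codewords (time x antenna).

    t1: antennas send (s1, s2);  t2: antennas send (-conj(s2), conj(s1)).
    """
    s = np.asarray(symbols, dtype=complex).reshape(-1, 2)
    codewords = np.empty((s.shape[0], 2, 2), dtype=complex)
    codewords[:, 0, 0] = s[:, 0]
    codewords[:, 0, 1] = s[:, 1]
    codewords[:, 1, 0] = -np.conj(s[:, 1])
    codewords[:, 1, 1] = np.conj(s[:, 0])
    return codewords

qpsk = np.exp(1j * np.pi / 4 * np.array([1, 3]))  # two QPSK symbols
print(alamouti_encode(qpsk))
```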
90
Discovery and Analysis of Aligned Pattern Clusters from Protein Family Sequences
Lee, En-Shiun Annie (28 April 2015)
Protein sequences are essential for encoding molecular structures and functions. Consequently, biologists invest substantial resources and time discovering functional patterns in proteins. Using high-throughput technologies, biologists are generating an increasing amount of data. Thus, the major challenge in biosequencing today is the ability to conduct data analysis in an efficient and productive manner. Conserved amino acids in proteins reveal important functional domains within protein families. Conversely, less conserved amino acid variations within these protein sequence patterns reveal areas of evolutionary and functional divergence.
Exploring protein families using existing methods such as multiple sequence alignment is computationally expensive, so pattern search is used instead. However, at present, combinatorial methods of pattern search generate a large set of solutions, and probabilistic methods require richer representations. These approaches require biological ground truth about the input sequences, such as gene names or taxonomic species, as class labels based on traditional classification practice in order to train a model for predicting unknown sequences. However, these algorithms are inherently biased by mislabelling and may not be able to reveal class characteristics in a detailed and succinct manner.
A novel pattern representation called an Aligned Pattern Cluster (AP Cluster), as developed in this dissertation, is compact yet rich. It captures conservations and variations of amino acids, covers more sequences with lower entropy, and greatly reduces the number of patterns. AP Clusters contain statistically significant patterns with variations; their importance is confirmed by the following biological evidence: 1) Most of the discovered AP Clusters correspond to binding segments, and their aligned columns correspond to binding sites, as verified by Pfam, PROSITE, and the three-dimensional structure. 2) By compacting strongly correlated functional information together, AP Clusters are able to reveal class characteristics for taxonomical classes, gene classes and other functional classes, or incorrect class labelling. 3) AP Clusters that co-occur on the same homologous protein sequences are spatially close in the protein's three-dimensional structure.
These results demonstrate the power and usefulness of AP Clusters. They bring together similar, statistically significant patterns with their variations and align them to reveal protein regional functionality, class characteristics, and binding and interacting sites for the study of protein-protein and protein-drug interactions, for the differentiation of cancer tumour types, for targeted gene therapy, and for drug target discovery.
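The conservation-versus-variation idea behind AP Clusters can be made concrete with per-column Shannon entropies over a small aligned block (the segments below are hypothetical, not from the dissertation):

```python
from collections import Counter
from math import log2

def column_entropies(aligned_segments):
    """Per-column Shannon entropy (bits) of equal-length aligned segments.
    Low entropy marks conserved sites; higher entropy marks variation sites."""
    entropies = []
    for column in zip(*aligned_segments):
        counts = Counter(column)
        total = sum(counts.values())
        entropies.append(-sum(c / total * log2(c / total) for c in counts.values()))
    return entropies

segments = ["GKSTL", "GKSTV", "GKSSL", "GRSTL"]  # hypothetical aligned patterns
print([round(h, 2) for h in column_entropies(segments)])
```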