131

Bioinformatics Tools for Finding the Vocabularies of Genomes

Petri, Eric D.C. 02 October 2008 (has links)
No description available.
132

Scalable Task Parallel Programming in the Partitioned Global Address Space

Dinan, James S. 02 September 2010 (has links)
No description available.
133

Spacecraft & Hybrid Rocket Motor Flight Model Design for a Deep Space Mission: Scalable Hybrid Rocket Motor for Small Satellite Propulsion

Molas Roca, Pau January 2019 (has links)
In this thesis, the design and particularities of a unique and revolutionary scalable propulsion system are presented. A spacecraft mechanical design is included together with a mission definition, aiming to provide a context for an in-space technology demonstration of a Hybrid Rocket Motor (HRM) as a satellite thruster. Rocket motors have been around for many decades, with their use mainly focused on launch vehicles and large satellites, thus restricting access to space to institutions with big budgets. To overcome this limitation, the application of a cost-effective type of rocket motor without a heritage of space utilization is explored: the implementation of an HRM as a satellite thruster. In Chapter 2, the characteristics of this particular type of chemical rocket motor are presented in detail. The HRM applied for the present mission is a particular case of an in-house developed motor design method. As presented in Chapter 7, a scalable and versatile mechanical and propulsion design has been elaborated following the maturation of a scalability software tool (Appendix A). The combination of these constitutes a valuable tool allowing for fast and accurate motor design for the desired scenario. Taking advantage of this straightforward tool, an attractive mission was defined to provide a meaningful context for the maiden use of an HRM in space. A micro-satellite deep space mission, defined in Chapter 3, was chosen to validate the tool and prove the capabilities of Hybrid Rocket Motors (HRMs), showing the benefits of their use over other propulsion systems already available, specifically in the small satellite family. The spacecraft design was tackled with the aim of supporting the motor's scalable concept while complying with the mission requirements and space standards. The outcome is an easily adaptable satellite design, justified in Chapter 8. The structural simulations performed to validate the developed design are outlined in Appendix C. Ultimately, this thesis work intends to provide the space community with a noteworthy product, opening access to interplanetary missions given the reduced mission costs of small satellites equipped with an HRM as propulsion system. Arising from the thesis content, research papers (Part V) have been published and presented at distinguished conferences, contributing to space development.
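The scalability software of Appendix A is not reproduced in this record, but the kind of first-order relation such a sizing tool rests on — the empirical hybrid regression-rate law r_dot = a·G_ox^n combined with an ideal thrust estimate — can be sketched as below. All coefficients, propellant properties and operating values in the sketch are illustrative assumptions, not figures from the thesis.

```python
import math

# Illustrative first-order hybrid rocket motor sizing sketch.
# All constants below are assumed placeholder values, not thesis data.
A_REG, N_REG = 1e-4, 0.62        # regression-rate coefficients (SI units, assumed)
RHO_FUEL = 930.0                 # fuel grain density [kg/m^3], HTPB-like (assumed)
ISP, G0 = 250.0, 9.80665         # specific impulse [s] (assumed), standard gravity

def size_motor(mdot_ox, port_diameter, grain_length):
    """Estimate fuel regression rate, O/F ratio and ideal thrust at one operating point."""
    port_area = math.pi * (port_diameter / 2) ** 2
    g_ox = mdot_ox / port_area                    # oxidizer mass flux [kg/m^2/s]
    r_dot = A_REG * g_ox ** N_REG                 # regression rate [m/s]
    burn_area = math.pi * port_diameter * grain_length
    mdot_fuel = RHO_FUEL * burn_area * r_dot      # fuel mass flow [kg/s]
    of_ratio = mdot_ox / mdot_fuel
    thrust = (mdot_ox + mdot_fuel) * ISP * G0     # ideal thrust [N]
    return r_dot, of_ratio, thrust

r, of, f = size_motor(mdot_ox=0.05, port_diameter=0.03, grain_length=0.25)
print(f"regression rate {r*1000:.2f} mm/s, O/F {of:.1f}, thrust {f:.0f} N")
```

Wrapping such a relation in a loop over port geometries and oxidizer flows is what makes the sizing "scalable": the same code point-designs motors of very different thrust classes.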
134

Scalability Analysis and Optimization for Large-Scale Deep Learning

Pumma, Sarunya 03 February 2020 (has links)
Despite its growing importance, scalable deep learning (DL) remains a difficult challenge. Scalability of large-scale DL is constrained by many factors, including those deriving from data movement and data processing. DL frameworks rely on large volumes of data to be fed to the computation engines for processing. However, current hardware trends showcase that data movement is already one of the slowest components in modern high performance computing systems, and this gap is only going to increase in the future. This includes data movement needed from the filesystem, within the network subsystem, and even within the node itself, all of which limit the scalability of DL frameworks on large systems. Even after data is moved to the computational units, managing this data is not easy. Modern DL frameworks use multiple components---such as graph scheduling, neural network training, gradient synchronization, and input pipeline processing---to process this data in an asynchronous uncoordinated manner, which results in straggler processes and consequently computational imbalance, further limiting scalability. This thesis studies a subset of the large body of data movement and data processing challenges that exist in modern DL frameworks. For the first study, we investigate file I/O constraints that limit the scalability of large-scale DL. We first analyze the Caffe DL framework with Lightning Memory-Mapped Database (LMDB), one of the most widely used file I/O subsystems in DL frameworks, to understand the causes of file I/O inefficiencies. Based on our analysis, we propose LMDBIO---an optimized I/O plugin for scalable DL that addresses the various shortcomings in existing file I/O for DL. Our experimental results show that LMDBIO significantly outperforms LMDB in all cases and improves overall application performance by up to 65-fold on 9,216 CPUs of the Blues and Bebop supercomputers at Argonne National Laboratory. Our second study deals with the computational imbalance problem in data processing. For most DL systems, the simultaneous and asynchronous execution of multiple data-processing components on shared hardware resources causes these components to contend with one another, leading to severe computational imbalance and degraded scalability. We propose various novel optimizations that minimize resource contention and improve performance by up to 35% for training various neural networks on 24,576 GPUs of the Summit supercomputer at Oak Ridge National Laboratory---the world's largest supercomputer at the time of writing of this thesis. / Doctor of Philosophy / Deep learning is a method for computers to automatically extract complex patterns and trends from large volumes of data. It is a popular methodology that we use every day when we talk to Apple Siri or Google Assistant, when we use self-driving cars, or even when we witnessed IBM Watson be crowned as the champion of Jeopardy! While deep learning is integrated into our everyday life, it is a complex problem that has gotten the attention of many researchers. Executing deep learning is a highly computationally intensive problem. On traditional computers, such as a generic laptop or desktop machine, the computation for large deep learning problems can take years or decades to complete. Consequently, supercomputers, which are machines with massive computational capability, are leveraged for deep learning workloads. 
The world's fastest supercomputer today, for example, is capable of performing almost 200 quadrillion floating point operations every second. While that is impressive, for large problems, unfortunately, even the fastest supercomputers today are not fast enough. The problem is not that they do not have enough computational capability, but that deep learning problems inherently rely on a lot of data---the entire concept of deep learning centers around the fact that the computer would study a huge volume of data and draw trends from it. Moving and processing this data, unfortunately, is much slower than the computation itself and with the current hardware trends it is not expected to get much faster in the future. This thesis aims at making deep learning executions on large supercomputers faster. Specifically, it looks at two pieces associated with managing data: (1) data reading---how to quickly read large amounts of data from storage, and (2) computational imbalance---how to ensure that the different processors on the supercomputer are not waiting for each other and thus wasting time. We first analyze each performance problem to identify the root cause of it. Then, based on the analysis, we propose several novel techniques to solve the problem. With our optimizations, we are able to significantly improve the performance of deep learning execution on a number of supercomputers, including Blues and Bebop at Argonne National Laboratory, and Summit---the world's fastest supercomputer---at Oak Ridge National Laboratory.
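LMDBIO itself is not described here in enough detail to reproduce, but the underlying idea — each worker reads only its own contiguous shard of the training data instead of funneling all samples through a single reader — can be sketched as follows. The file name, dataset layout and sample size are hypothetical placeholders, and the sketch is not the LMDBIO implementation.

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Assumed dataset layout: N fixed-size samples stored back to back in one flat file.
N_SAMPLES = 1_000_000
SAMPLE_BYTES = 3 * 224 * 224          # e.g. one uint8 image per sample (assumed)
PATH = "train.bin"                    # hypothetical file name

per_rank = N_SAMPLES // size          # contiguous shard owned by this rank
start = rank * per_rank

# Memory-map only this rank's byte range; pages are faulted in lazily by the OS,
# so no rank reads data it will never train on.
shard = np.memmap(PATH, dtype=np.uint8, mode="r",
                  offset=start * SAMPLE_BYTES,
                  shape=(per_rank, SAMPLE_BYTES))

for step in range(per_rank):
    sample = shard[step]   # `sample` would be handed to the framework's input pipeline here
```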
135

LH*RSP2P: A New Scalable and Distributed Data Structure for Peer-to-Peer Environments

Yakouben, Hanafi 14 May 2013 (has links)
We propose a new scalable and distributed data structure termed LH*RSP2P, designed for peer-to-peer (P2P) environments. Application data forms a file of records identified by primary keys. Records reside in buckets on peers, addressed by distributed linear hashing (LH*). Splits create new buckets dynamically to accommodate inserts. Key access to a record uses at most one forwarding hop. A scan of the file proceeds in at most two rounds. These results are among the best at present. An LH*RSP2P file is also protected against churn: parity calculation recovers from the unavailability of up to k buckets, where k ≥ 1 is a scalable parameter. A new type of query, qualified as sure, also protects against access to any out-of-date bucket. We prove the properties of our SDDS formally and validate them through a prototype implementation and experiments. LH*RSP2P appears useful for Big Data manipulations, especially over RamClouds.
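The classical LH* addressing that LH*RSP2P builds on can be sketched directly from the published LH* algorithms: a client computes a bucket address from its possibly outdated file image (i', n'), and a server that does not own the key forwards the request, with at most one extra hop. The sketch below illustrates that standard scheme, not the thesis prototype.

```python
def client_address(key_hash, image_level, image_split):
    """Client-side LH addressing with a possibly outdated file image (i', n')."""
    a = key_hash % (2 ** image_level)
    if a < image_split:                       # this bucket has already split
        a = key_hash % (2 ** (image_level + 1))
    return a

def server_check(key_hash, my_addr, my_level):
    """LH*-style server address verification: serve the key locally or forward it,
    which for LH* files happens at most once more."""
    a = key_hash % (2 ** my_level)
    if a != my_addr:
        a2 = key_hash % (2 ** (my_level - 1))
        if my_addr < a2 < a:
            a = a2
        return ("forward", a)
    return ("serve", my_addr)

# A client whose image lags behind the real file may hit a wrong bucket; the
# contacted server then forwards, and LH* guarantees at most one such hop.
print(client_address(key_hash=13, image_level=2, image_split=0))   # -> bucket 1
print(server_check(key_hash=13, my_addr=1, my_level=3))            # -> ('forward', 5)
```

In the full protocol the correct server also sends an image-adjustment message back to the client, so later requests reach the right bucket directly; that bookkeeping is omitted here.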
136

Scalable Low Power Issue Queue And Store Queue Design For Superscalar Processors

Vivekanandham, Rajesh 12 1900 (has links)
A large instruction window is a key requirement for exploiting greater instruction-level parallelism in out-of-order superscalar processors. Along with the instruction window size, the sizes of various other structures, including the issue queue, store queue and register file, need to increase as well. However, the cycle time and energy consumption of conventional large monolithic Content Addressable Memories (CAMs), the underlying structure of most conventional issue queue and store queue designs, worsen rapidly with an increase in size. This results in a three-way trade-off involving ILP, clock frequency and energy consumption. In this thesis, we propose efficient designs for the issue queue and the store queue that improve circuit latency and energy consumption while minimizing the loss in IPC. We propose the Scalable Low power Issue Queue (SLIQ) design, which segments the issue queue structure to reduce latency. This is complemented with a fast wakeup index to a consumer in the issue queue for every instruction. As this consumer instruction can be woken up directly, without any delay, the IPC loss faced by the pipelined issue queue is mitigated. Also, as the scheme incorporates a pipelined broadcast, the indices are not required for correctness and can simply be gang-invalidated on branch mispredictions. The IPC loss of an 8-segment SLIQ is within 2.3% for the entire SPEC CPU2000 benchmark suite while achieving a 39.3% reduction in issue latency. Further, in the SLIQ design unnecessary broadcasts to the higher segments are avoided most of the time, since in a large majority of cases an instruction has a single consumer. This consumer is woken up either by direct indexing or by broadcast in the first segment of the SLIQ. This enables the 8-segment SLIQ to reduce energy consumption and the energy-delay product by 48.3% and 67.4%, respectively, on average. SLIQ also allows architects to segment the issue queue carefully so that the latency of the issue logic stays within the per-pipeline-stage latency goals of the design. We also propose the Scalable Low power Store Queue (SLSQ) to address similar problems associated with the store queue data forwarding logic. We extend the state-of-the-art Store Vector based disambiguator to also predict the index of the store that will forward to a given load. SLSQ adds marginally to the hardware budget, but predicts the store queue index of the forwarding store with an accuracy of 99.5% on average. SLSQ thus eliminates unnecessary address broadcasts and compares, and reduces the energy consumption of the store-to-load forwarding logic by 78.4% and 91.6% for the SPEC Int and FP suites, respectively. Another variant of SLSQ eliminates the need for a CAM in the forwarding logic and achieves a 49.9% reduction in store-to-load data forwarding latency while incurring a minimal IPC loss of less than 0.1% on average for the entire SPEC CPU2000 benchmark suite.
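As a rough behavioural illustration of the mechanism described above — a segmented issue queue whose tag broadcast advances one segment per cycle, combined with a direct index that wakes a known single consumer immediately — consider the following sketch. It is a simplified model under assumed parameters (8 segments, one consumer index per producer) and always walks every segment, whereas the actual SLIQ design gates broadcasts to higher segments; it is not the thesis hardware.

```python
from collections import deque

NUM_SEGMENTS = 8   # assumed, matching the 8-segment configuration discussed above

class Entry:
    """One issue-queue entry: an instruction waiting on producer tags."""
    def __init__(self, tag, src_tags):
        self.tag = tag
        self.waiting = set(src_tags)

class SegmentedIQ:
    def __init__(self):
        self.segments = [dict() for _ in range(NUM_SEGMENTS)]  # index -> Entry
        self.broadcasts = deque()   # each item: [next_segment_to_visit, producer_tag]

    def insert(self, seg, idx, entry):
        self.segments[seg][idx] = entry

    def complete(self, producer_tag, consumer=None):
        """Producer finishes: wake its known single consumer immediately via the
        direct index, then launch a pipelined broadcast for any other waiters."""
        if consumer is not None:
            seg, idx = consumer
            e = self.segments[seg].get(idx)
            if e is not None:
                e.waiting.discard(producer_tag)
        self.broadcasts.append([0, producer_tag])

    def cycle(self):
        """Each in-flight broadcast visits one more segment per cycle."""
        for b in list(self.broadcasts):
            seg, tag = b
            for e in self.segments[seg].values():
                e.waiting.discard(tag)
            b[0] += 1
            if b[0] == NUM_SEGMENTS:
                self.broadcasts.remove(b)

    def ready(self):
        """Entries whose operands are all resolved and may be selected for issue."""
        return [e.tag for s in self.segments for e in s.values() if not e.waiting]

# Tiny usage example: i2 (in a high segment) depends on i1; the direct index wakes
# it right away, without waiting for the broadcast to reach its segment.
iq = SegmentedIQ()
iq.insert(5, 0, Entry(tag="i2", src_tags=["i1"]))
iq.complete("i1", consumer=(5, 0))
print(iq.ready())   # ['i2'] immediately, before any broadcast cycles
```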
137

Scalable Sparse Bayesian Nonparametric and Matrix Tri-factorization Models for Text Mining Applications

Ranganath, B N January 2017 (has links) (PDF)
Hierarchical Bayesian models and matrix factorization methods provide an unsupervised way to learn latent components of data from grouped or sequence data. For example, in document data, a latent component corresponds to a topic, with each topic a distribution over a vocabulary of words. For many applications, there exist sparse relationships between the domain entities and the latent components of the data. Traditional approaches to topic modelling do not take these sparsity considerations into account. Modelling these sparse relationships helps in extracting relevant information, leading to improvements in topic accuracy and scalable solutions. In this thesis, we explore these sparsity relationships for different applications such as text segmentation, topical analysis and entity resolution in dyadic data through Bayesian and matrix tri-factorization approaches, proposing scalable solutions. In our first work, we address the problem of segmentation of a collection of sequence data such as documents using probabilistic models. Existing state-of-the-art hierarchical Bayesian models are connected to the notion of Complete Exchangeability or Markov Exchangeability. Bayesian nonparametric models based on the notion of Markov Exchangeability, such as HDP-HMM and Sticky HDP-HMM, allow very restricted permutations of latent variables in grouped data (topics in documents), which in turn leads to computational challenges for inference. At the other extreme, models based on Complete Exchangeability, such as HDP, allow arbitrary permutations within each group or document, and inference is significantly more tractable as a result, but segmentation is not meaningful using such models. To overcome these problems, we explored a new notion of exchangeability called Block Exchangeability that lies between Markov Exchangeability and Complete Exchangeability, for which segmentation is meaningful but inference is computationally less expensive than for both Markov and Complete Exchangeability. Parametrically, Block Exchangeability has a sparser set of transition parameters, linear in the number of states, compared to the quadratic order for Markov Exchangeability, which in turn is less than that for Complete Exchangeability, where the parameters are on the order of the number of documents. For this, we propose a nonparametric Block Exchangeable Model (BEM) based on the new notion of Block Exchangeability, which we show to be a superclass of Complete Exchangeability and a subclass of Markov Exchangeability. We propose a scalable inference algorithm for BEM to infer the topics for words and the segment boundaries associated with topics for a document using a collapsed Gibbs sampling procedure. Empirical results show that BEM outperforms state-of-the-art nonparametric models in terms of scalability and generalization ability, and shows nearly the same segmentation quality on a news dataset, a product review dataset and a synthetic dataset. Interestingly, we can tune the scalability by varying the block size through a parameter in our model, for a small trade-off in segmentation quality. In addition to exploring the association between documents and words, we also explore sparse relationships for dyadic data, where associations between one pair of domain entities, such as (documents, words), and associations between another pair, such as (documents, users), are completely observed.
We motivate the analysis of such dyadic data by introducing an additional discrete dimension, which we call topics, and explore sparse relationships between the domain entities and the topics, such as user-topic and document-topic associations respectively. In our second work, for this problem of sparse topical analysis of dyadic data, we propose a formulation using sparse matrix tri-factorization. This formulation requires sparsity constraints not only on the individual factor matrices, but also on the product of two of the factors. To the best of our knowledge, this problem of sparse matrix tri-factorization has not been studied before. We propose a solution that introduces a surrogate for the product of factors and enforces sparsity on this surrogate as well as on the individual factors through L1-regularization. The resulting optimization problem is efficiently solvable in an alternating minimization framework over sub-problems involving individual factors, using the well-known FISTA algorithm. For the sub-problems that are constrained, we use a projected variant of the FISTA algorithm. We also show that our formulation leads to independent sub-problems when solving for a factor matrix, thereby supporting parallel implementation and leading to a scalable solution. We perform experiments over bibliographic and product review data to show that the proposed framework, based on the sparse tri-factorization formulation, results in better generalization ability and factorization accuracy compared to baselines that use sparse bi-factorization. Even though the second work performs sparse topical analysis for dyadic data, finding sparse topical associations for the users, user references with different names could belong to the same entity and those with the same name could belong to different entities. The problem of entity resolution is widely studied in the research community, where the goal is to identify the real users associated with the user references in the documents. Finally, we focus on the problem of entity resolution in dyadic data, where associations between one pair of domain entities, such as documents-words, and associations between another pair, such as documents-users, are observed; an example is bibliographic data. In our final work, for this problem of entity resolution in bibliographic data, we propose a Bayesian nonparametric Sparse Entity Resolution Model (SERM), exploring the sparse relationships between the grouped data, involving a grouping of the documents, and the topics/author entities in the group. Further, we also exploit the sparseness between an author entity and its associated author aliases. Grouping of the documents is achieved with the stick-breaking prior for Dirichlet processes (DP). To achieve sparseness, we propose a solution that introduces separate Indian Buffet Process (IBP) priors over topics and author entities for the groups, and a k-NN mechanism for selecting author aliases for the author entities. We propose a scalable inference procedure for SERM by appropriately combining a partially collapsed Gibbs sampling scheme from the Focused Topic Model (FTM), the inference scheme used for the parametric IBP prior, and the k-NN mechanism. We perform experiments over the bibliographic datasets Citeseer and Rexa to show that the proposed SERM model improves the accuracy of entity resolution by finding relevant author entities through modelling sparse relationships, and is scalable when compared to the state-of-the-art baseline.
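The sparse tri-factorization objective can be illustrated with a deliberately simplified alternating proximal-gradient sketch: plain ISTA-style soft-thresholding steps with a fixed, assumed step size stand in for the projected FISTA scheme and the product surrogate used in the thesis, so this illustrates the formulation rather than the proposed algorithm.

```python
import numpy as np

def soft_threshold(A, t):
    """Proximal operator of the L1 norm."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def sparse_tri_factorize(X, k1, k2, lam=0.1, step=1e-3, iters=500, seed=0):
    """Minimize 0.5*||X - U S V^T||_F^2 + lam*(||U||_1 + ||S||_1 + ||V||_1)
    by alternating ISTA-style updates (a simplified stand-in for the projected
    FISTA scheme; step size and iteration count are assumed)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.standard_normal((n, k1)) * 0.1
    S = rng.standard_normal((k1, k2)) * 0.1
    V = rng.standard_normal((m, k2)) * 0.1
    for _ in range(iters):
        R = U @ S @ V.T - X                                   # residual
        U = soft_threshold(U - step * (R @ V @ S.T), step * lam)
        R = U @ S @ V.T - X
        S = soft_threshold(S - step * (U.T @ R @ V), step * lam)
        R = U @ S @ V.T - X
        V = soft_threshold(V - step * (R.T @ U @ S), step * lam)
    return U, S, V

X = np.random.default_rng(1).standard_normal((50, 40))
U, S, V = sparse_tri_factorize(X, k1=5, k2=4)
print(U.shape, S.shape, V.shape)
```

Because each factor update only touches that factor, the three sub-problems can be distributed across workers, which is the property the thesis exploits for scalability.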
138

Design and Characterization of SRAMs for Ultra Dynamic Voltage Scalable (U-DVS) Systems

Viveka, K R January 2016 (has links) (PDF)
The ever-expanding range of applications for embedded systems continues to offer new challenges (and opportunities) to chip manufacturers. Applications ranging from exciting high-resolution gaming to routine tasks like temperature control need to be supported on increasingly small devices with shrinking dimensions and tighter energy budgets. These systems benefit greatly from the capability to operate over a wide range of supply voltages, known as ultra dynamic voltage scaling (U-DVS). This refers to systems capable of operating from nominal voltages down to sub-threshold voltages. Memories play an important role in these systems, with future chips estimated to have over 80% of chip area occupied by memories. This thesis presents the design and characterization of an ultra dynamic voltage scalable memory (SRAM) that functions from nominal voltages down to sub-threshold voltages without the need for external support. The key contributions of the thesis are as follows: 1) A variation-tolerant reference generation for single-ended sensing: We present a reference generator for U-DVS memories that tracks the memory over a wide range of voltages and is tunable to allow functioning down to sub-threshold voltages. Replica columns are used to generate the reference voltage, which allows the technique to track slow changes such as temperature and aging. A few configurable cells in the replica column are found to be sufficient to cover the whole range of voltages of interest. The use of a tunable delay line to generate timing is shown to help in overcoming the effects of process variations. 2) A random-sampling based tuning algorithm: Tuning is necessary to overcome the increased effects of variation at lower voltages. We present a random-sampling based BIST tuning algorithm that significantly speeds up tuning, ensuring that the time required to tune is comparable to that of a single MBIST run. Further, the use of redundancy after delay tuning enables maximum utilization of the redundancy infrastructure to reduce power consumption and enhance performance. 3) Testing and characterization for U-DVS systems: Testing and characterization is an important challenge in U-DVS systems that has remained largely unexplored. We propose an iterative technique that allows the realization of an on-chip oscilloscope with minimal area overhead. The all-digital nature of the technique makes it simple to design and implement across technology nodes. Combining the proposed techniques allows the designed 4 Kb SRAM array to function from 1.2 V down to 310 mV, with reads functioning down to 190 mV. This contributes towards moving ultra-wide voltage operation a step closer to implementation in commercial designs.
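The abstract does not spell out the tuning algorithm, so the following is only one plausible reading of a random-sampling BIST-style tuning loop: test a random subset of addresses at each candidate delay code rather than the whole array, keep the most aggressive code whose sample passes, and hand any residual failures to redundancy. Every function, parameter and ordering assumption here is a hypothetical placeholder, not the thesis circuit or flow.

```python
import random

def tune_delay(num_words, delay_codes, sample_size, bist_read):
    """Hypothetical random-sampling tuning loop. `bist_read(addr, code)` is a
    placeholder for the on-chip read test at one delay-line setting; lower
    codes are assumed to mean shorter (more aggressive) delays."""
    random.seed(0)
    sample = random.sample(range(num_words), sample_size)
    for code in sorted(delay_codes):                 # most aggressive setting first
        if all(bist_read(addr, code) for addr in sample):
            # One full pass afterwards maps residual failing addresses to redundancy.
            failures = [a for a in range(num_words) if not bist_read(a, code)]
            return code, failures
    return max(delay_codes), []                      # fall back to the slowest setting

# Hypothetical usage with a fake read model in which codes >= 3 always pass.
result = tune_delay(num_words=4096, delay_codes=range(8), sample_size=128,
                    bist_read=lambda addr, code: code >= 3)
print(result)   # -> (3, [])
```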
139

Ten thousand applications in ten minutes: Evaluating scalable recruitment, evaluation and screening methods of candidates for sales jobs

Kirk, Stephen January 2017 (has links)
While personnel evaluation has been extensively covered in the literature, little is known about evaluation procedures that screen a large number of applicants. The aim of this research was to investigate whether candidates for sales positions can be evaluated in a scalable way (where the number of applications has little impact on the cost of evaluation) for an on-demand sales platform. The study consists of interviews with the recruiters and growth leads of the studied firm, a case study of a firm that has omitted resumes from its salesperson recruitment process, and sample tests performed on candidates for sales positions. Further, some data on salespeople was collected and analysed. In summary, the study links the findings to the restrictions of a process that requires scalability. Previous research outlines how various indicators (personality facets, biodata, and optimism) predict sales performance in salespeople. The mental ability of candidates is relevant especially for the work training phase. Some of these findings were supported by the case study. While traditional resumes contain information predicting sales ability, some sales managers argue that they are obsolete. Previous research shows that recruiters risk drawing broad generalizations based on resume content. Video resumes have some potential, but currently have technical and ethical limitations. Personality and mental ability tests show predictive ability for sales performance, and are scalable. Previous research discusses the limitations of many personality tests being commercial, resulting in restrictions on how they may be modified, limited transparency of their scoring, and validity studies that are hard to conduct. Another limitation of personality tests in evaluation settings is that they are prone to faking. The study also suggests future research topics, such as how culture defines what an ideal salesperson is, and extending these findings to areas other than sales.
140

Topics in Modern Bayesian Computation

Qamar, Shaan January 2015 (has links)
Collections of large volumes of rich and complex data have become ubiquitous in recent years, posing new challenges in methodological and theoretical statistics alike. Today, statisticians are tasked with developing flexible methods capable of adapting to the degree of complexity and noise in increasingly rich data gathered across a variety of disciplines and settings. This has spurred the need for novel multivariate regression techniques that can efficiently capture a wide range of naturally occurring predictor-response relations, identify important predictors and their interactions, and do so even when the number of predictors is large but the sample size remains limited.
Meanwhile, efficient model fitting tools must evolve quickly to keep pace with the rapidly growing dimension and complexity of the data they are applied to. Aided by the tremendous success of modern computing, Bayesian methods have gained great popularity in recent years. These methods provide a natural probabilistic characterization of uncertainty in the parameters and in predictions. In addition, they provide a practical way of encoding model structure that can lead to large gains in statistical estimation and more interpretable results. However, this flexibility is often hindered in applications to modern data, which are increasingly high dimensional, both in the number of observations n and the number of predictors p. Here, computational complexity and the curse of dimensionality typically render posterior computation inefficient. In particular, Markov chain Monte Carlo (MCMC) methods, which remain the workhorse for Bayesian computation (owing to their generality and asymptotic accuracy guarantees), typically suffer data-processing and computational bottlenecks as a consequence of (i) the need to hold the entire dataset (or available sufficient statistics) in memory at once; and (ii) having to evaluate the (often expensive to compute) data likelihood at each sampling iteration.
This thesis divides into two parts. The first part concerns itself with developing efficient MCMC methods for posterior computation in the high-dimensional, large-n large-p setting. In particular, we develop an efficient and widely applicable approximate inference algorithm that extends MCMC to the online data setting, and separately propose a novel stochastic search sampling scheme for variable selection in high-dimensional predictor settings. The second part of this thesis develops novel methods for structured sparsity in the high-dimensional, large-p small-n regression setting. Here, statistical methods should scale well with the predictor dimension and be able to efficiently identify low-dimensional structure so as to facilitate optimal statistical estimation in the presence of limited data. Importantly, these methods must be flexible enough to accommodate potentially complex relationships between the response and its associated explanatory variables. The first work proposes a nonparametric additive Gaussian process model to learn predictor-response relations that may be highly nonlinear and include numerous lower-order interaction effects, possibly in different parts of the predictor space. A second work proposes a novel class of Bayesian shrinkage priors for multivariate regression with a tensor-valued predictor. Dimension reduction is achieved using a low-rank additive decomposition for the latter, enabling a highly flexible and rich structure within which excellent cell estimation and region selection may be obtained through state-of-the-art shrinkage methods. In addition, the methods developed in these works come with strong theoretical guarantees. / Dissertation
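As a small illustration of the additive Gaussian process construction mentioned for the first work — a covariance built as a sum of one-dimensional RBF kernels, so the regression surface decomposes into additive components — the sketch below computes a standard GP posterior mean under such a kernel. The hyperparameters and toy data are illustrative assumptions, not taken from the dissertation.

```python
import numpy as np

def rbf_1d(a, b, lengthscale):
    """One-dimensional squared-exponential kernel matrix."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def additive_kernel(X1, X2, lengthscales):
    """Additive covariance: k(x, x') = sum_j k_j(x_j, x'_j)."""
    return sum(rbf_1d(X1[:, j], X2[:, j], l) for j, l in enumerate(lengthscales))

def gp_posterior_mean(X_train, y_train, X_test, lengthscales, noise=0.1):
    """Standard GP regression posterior mean under the additive kernel."""
    K = additive_kernel(X_train, X_train, lengthscales)
    K_star = additive_kernel(X_test, X_train, lengthscales)
    alpha = np.linalg.solve(K + noise ** 2 * np.eye(len(X_train)), y_train)
    return K_star @ alpha

# Toy additive function of the first two coordinates, with noise.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(100, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.standard_normal(100)
X_test = rng.uniform(-2, 2, size=(10, 3))
print(gp_posterior_mean(X, y, X_test, lengthscales=[1.0, 1.0, 1.0]))
```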
