91 |
Search Interfaces for Integrating Crowdsourced Code Snippets within Development Environments
Wightman, Douglas 01 February 2013
In this thesis we report on the design and evaluation of interfaces to support crowdsourced programming tasks. We present WordMatch and SnipMatch: two programming tools that can incorporate crowdsourced source code. The design of these tools is informed by an investigation of crowdsourcing; specifically, Crowdsourced Human-Based Computation (CHC) systems, which organize tasks performed by humans. Recommendations include methods to obtain and maintain users who are highly motivated to participate and methods to improve task performance. WordMatch, a novel programming environment for specifying direct answers for search queries, builds on this work by introducing a parameterized search interface that can be easily understood by end users. In a laboratory study, we found that people with basic computer literacy could be taught to create complex direct answers with minimal training. Finally, evaluations of SnipMatch, a search interface for curated source code snippets, demonstrate that features from WordMatch are applicable to general programming tasks. Participants in our longitudinal study reported that SnipMatch was effective both as a tool for reducing context switching and as a memory aid. / Thesis (Ph.D, Computing) -- Queen's University, 2013-01-31 12:50:58.15
|
92 |
General Purpose MCMC Sampling for Bayesian Model Averaging
Boyles, Levi Beinarauskas 26 September 2014
In this thesis we explore the problem of inference for Bayesian model averaging. Many popular topics in Bayesian analysis, such as Bayesian nonparametrics, can be cast as model averaging problems. Model averaging problems offer unique difficulties for inference, as the parameter space is not fixed, and may be infinite. As such, there is little existing work on general purpose MCMC algorithms in this area. We introduce a new MCMC sampler, which we call Retrospective Jump sampling, that is suitable for general purpose model averaging. In the development of Retrospective Jump, some practical issues arise in the need for an MCMC sampler for finite dimensions that is suitable for multimodal target densities; we introduce Refractive Sampling as a sampler suitable in this regard. Finally, we evaluate Retrospective Jump on several model averaging and Bayesian nonparametric problems, and develop a novel latent feature model with hierarchical column structure which uses Retrospective Jump for inference.
|
93 |
Effective Algorithms for the Satisfiability of Quantifier-Free Formulas Over Linear Real and Integer Arithmetic
King, Tim 19 December 2014
A core technique of modern tools for formally reasoning about computing systems is generating and dispatching queries to automated theorem provers, including Satisfiability Modulo Theories (SMT) provers. SMT provers aim at the tight integration of decision procedures for propositional satisfiability and decision procedures for fixed first-order theories, known as theory solvers. This thesis presents several advancements in the design and implementation of theory solvers for quantifier-free linear real, integer, and mixed integer and real arithmetic. These are implemented within the SMT system CVC4. We begin by formally describing the Satisfiability Modulo Theories problem and the role of theory solvers within CVC4. We discuss known techniques for building solvers for quantifier-free linear real, integer, and mixed integer and real arithmetic around the Simplex for SMT algorithm. We give several small improvements to theory solvers using this algorithm and describe the implementation and theory of this algorithm in detail. To extend the class of problems that the theory solver can robustly support, we borrow and adapt several techniques from linear programming (LP) and mixed integer programming (MIP) solvers which come from the tradition of optimization. We propose a new decision procedure for quantifier-free linear real arithmetic that replaces the Simplex for SMT algorithm with a variant of the Simplex algorithm that performs a form of optimization: minimizing the sum of infeasibilities. In this thesis, we additionally describe techniques for leveraging LP and MIP solvers to improve the performance of SMT solvers without compromising correctness. Previous efforts to leverage such solvers in the context of SMT have concluded that in addition to being potentially unsound, such solvers are too heavyweight to compete in the context of SMT.
We present an empirical comparison against other state-of-the-art SMT tools to demonstrate the effectiveness of the proposed solutions.
|
94 |
Positive-Unlabeled Learning in the Context of Protein Function Prediction
Youngs, Noah 19 December 2014
With the recent proliferation of large, unlabeled data sets, a particular subclass of semisupervised learning problems has become more prevalent. Known as positive-unlabeled learning (PU learning), this scenario provides only positive labeled examples, usually just a small fraction of the entire dataset, with the remaining examples unknown and thus potentially belonging to either the positive or negative class. Since the vast majority of traditional machine learning classifiers require both positive and negative examples in the training set, a new class of algorithms has been developed to deal with PU learning problems.
A canonical example of this scenario is topic labeling of a large corpus of documents. Once the size of a corpus reaches into the thousands, it becomes largely infeasible to have a curator read even a sizable fraction of the documents, and annotate them with topics. In addition, the entire set of topics may not be known, or may change over time, making it impossible for a curator to annotate which documents are NOT about certain topics. Thus a machine learning algorithm needs to be able to learn from a small set of positive examples, without knowledge of the negative class, and knowing that the unlabeled training examples may contain an arbitrary number of additional but as yet unknown positive examples.
Another example of a PU learning scenario recently garnering attention is the protein function prediction problem (PFP problem). While the number of organisms with fully sequenced genomes continues to grow, the progress of annotating those sequences with the biological functions that they perform lags far behind. Machine learning methods have already been successfully applied to this problem, but with many organisms having a small number of positive annotated training examples, and the lack of availability of almost any labeled negative examples, PU learning algorithms have the potential to make large gains in predictive performance.
The first part of this dissertation motivates the protein function prediction problem, explores previous work, and introduces novel methods that improve upon previously reported benchmarks for a particular type of learning algorithm, known as Gaussian Random Field Label Propagation (GRFLP). In addition, we present improvements to the computational efficiency of the GRFLP algorithm, and a modification to the traditional structure of the PFP learning problem that allows for simultaneous prediction across multiple species.
The second part of the dissertation focuses specifically on the positive-unlabeled aspects of the PFP problem. Two novel algorithms are presented, and rigorously compared to existing PU learning techniques in the context of protein function prediction. Additionally, we take a step back and examine some of the theoretical considerations of the PU scenario in general, and provide an additional novel algorithm applicable in any PU context. This algorithm is tailored for situations in which the labeled positive examples are a small fraction of the set of true positive examples, and where the labeling process may be subject to some type of bias rather than being a random selection of true positives (arguably some of the most difficult PU learning scenarios).
The third and fourth sections return to the PFP problem, examining the power of tertiary structure as a predictor of protein function, as well as presenting two case studies of function prediction performance on novel benchmarks. Lastly, we conclude with several promising avenues of future research into both PU learning in general, and the protein function prediction problem specifically.
|
95 |
Social and Emotional Characteristics of Speech-based In-Vehicle Information Systems : Impact on Attitude and Driving Behaviour / Sociala och Emotionella Egenskaper hos Talbaserade Informationssystem för Bilar : Effekter på Bilförares Attityder och Körbeteenden
Jonsson, Ing-Marie January 2009
Modern vehicles use advanced information systems to provide and control a wide variety of functions and features. Even modest vehicles today are equipped with systems that control diverse functions from air-conditioning to high quality audio/video systems. Since driving requires the use of eyes and hands, voice interaction has become more widely used by in-vehicle systems. Due to the technical complexity involved in voice recognition, focus has been on issues of speech recognition. Speech generation is comparatively simple, but what effect does the choice of voice have on the driver? We know from human-human interaction that social cues of the voice itself influence attitude and interpretation of information. Introducing speech-based communication with the car changes the relationship between driver and vehicle. So, for in-vehicle information systems, does the spoken voice matter? The work presented in this thesis studies the effects of the voice used by in-vehicle systems. A series of studies were used to answer the following questions: Do the characteristics of voices used by an in-vehicle system affect driver’s attitude? Do the characteristics of voice used by an in-vehicle system affect driver’s performance? Are social reactions to voice communication the same in the car environment as in the office environment? Results show that voices do matter! Voices trigger social and emotional effects that impact both attitude and driving performance. Moreover, there is not one effective voice that works for all drivers. Therefore an in-vehicle system that knows its driver and possibly adapts to its driver can be the most effective. Finally, an interesting observation from these studies is that social reactions to voice communication in the car are different from those in the office. Similarity attraction, an otherwise solid finding in social science, did not hold in all studies.
It is hypothesized that this difference can be related to the different kinds of task demands when driving a car or working in an office environment. / Today's vehicles are usually equipped with advanced information systems that provide and control a wide range of functions and services, from climate control to audio and video systems. Because driving requires the driver's eyes on the road and hands on the wheel, voice-controlled systems are becoming increasingly common. Development of speech-based in-vehicle systems usually focuses on speech recognition, since generating intelligible speech is comparatively simple. But what effect does the chosen voice have? We know that voices influence both attitude and the interpretation of information in communication between people. Introducing speech-based in-vehicle systems will therefore change the relationship between driver and vehicle. The question, however, is how the characteristics of the voice affect the driver. The work presented here studies the effects of the voices used by speech systems in cars, through a series of studies carried out to answer the following questions: Do the characteristics of a voice in an in-vehicle information system affect the driver's attitude? Do the characteristics of a voice in an in-vehicle information system affect the driver's driving ability? Are reactions to speech-based systems in cars the same as for speech-based systems in home or office environments? The results show that voices affect drivers! Voices trigger social and emotional reactions that affect both attitude and driving behaviour. Moreover, one voice does not suit everyone! An in-vehicle system that knows and adapts to its driver's mood is most effective. The results also suggest that reactions to voices and speech-based systems are not the same in cars as in homes or offices. So-called similarity-attraction, being attracted to people or characteristics that resemble oneself, a factor that has usually been shown to play a major role, was not observed in all studies.
One hypothesis is that the difference in reaction may be related to the nature of the task: driving a car places different demands than working in a home or office environment.
|
96 |
Multi view image : surveillance and tracking
Black, James January 2004
No description available.
|
97 |
A new visual query language and query optimization for mobile GPS
Elsidani Elariss, Haifa January 2008
In recent years computer applications have been deployed to manage spatial data with Geographic Information Systems (GIS) to store and analyze data related to domains such as transportation and tourism. Recent developments have shown that there is an urgent need to develop systems for mobile devices and particularly for Location Based Services (LBS) such as proximity analysis, which helps in finding the nearest neighbors, for example restaurants, and the facilities that are located within a circular area around the user's location, known as a buffer area, for example all restaurants within 100 meters. The mobile market potential extends across geographical and cultural boundaries. Hence the visualization of queries becomes important, especially since the existing visual query languages have a number of limitations. They are not tailored for mobile GIS and they do not support dynamic complex queries (DCQ) and visual query formation. Thus, the first aim of this research is to develop a new visual query language (IVQL) for mobile GIS that handles static queries and DCQ for proximity analysis. IVQL is designed and implemented using smiley icons that visualize operators, values, and objects. The evaluation results reveal that it has expressive power, an easy-to-use user interface, easy query building, and high user satisfaction. There is also a need for new optimization strategies that consider the scale of mobile user queries. Existing query optimization strategies are based on the sharing and push-down paradigms and they do not cover multiple-DCQ (MDCQ) for proximity analysis. This leads to the second aim of this thesis, which is to develop the query melting processor (QMP) that is responsible for processing MDCQs. QMP is based on the new Query Melting paradigm, which consists of the sharing paradigm and query optimization, and is implemented by a new strategy, "Melting Ruler".
Moreover, with the increase in the volume of cost-sensitive mobile users, the need emerges to develop a time cost optimizer for processing MDCQs. Thus, the third aim of the thesis is to develop a new Decision Making Mechanism for time cost optimization (TCOP) and prove its cost effectiveness. TCOP is based on the new paradigm "Sharing global execution plans by MDCQs with similar scenarios". The experimental evaluation results, using a case study based on the map of Paris, proved that significant savings in time can be achieved by employing the newly developed strategies.
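The buffer-area query mentioned in the abstract above (all restaurants within 100 meters of the user) can be illustrated with a minimal sketch. The function names, the sample points of interest, and the use of the haversine great-circle distance are assumptions made here for illustration only; the abstract does not say how IVQL or QMP compute distances or evaluate queries.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def buffer_query(user, places, radius_m):
    """Return (name, distance) pairs for places inside the buffer, nearest first."""
    hits = []
    for name, lat, lon in places:
        d = haversine_m(user[0], user[1], lat, lon)
        if d <= radius_m:
            hits.append((name, d))
    return sorted(hits, key=lambda item: item[1])


# Hypothetical points of interest around a hypothetical user location in Paris.
user_location = (48.8584, 2.2945)
restaurants = [
    ("A", 48.8586, 2.2950),  # a few tens of meters away
    ("B", 48.8590, 2.2945),  # due north of the user, still inside 100 m
    ("C", 48.8700, 2.3000),  # more than 1 km away
]

nearby = buffer_query(user_location, restaurants, radius_m=100)
```

Sorting the hits by distance means the same function also answers a nearest-neighbor query: the first element of the result is the closest place inside the buffer.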
|
98 |
Genomic signal processing for enhanced microarray data clustering
Sungoor, Ala M. H. January 2009
Genomic signal processing is a new area of research that combines genomics with digital signal processing (DSP) methodologies for enhanced genetic data analysis. Microarray is a well-known technology for the evaluation of thousands of gene expression profiles. By considering these profiles as digital signals, the power of DSP methods can be applied to produce robust and unsupervised clustering of microarray samples. This can be achieved by transforming expression profiles into spectral components which are interpreted as a measure of profile similarity. This thesis introduces enhanced signal processing algorithms for robust clustering of microarray gene expression samples. The main aim of the research is to design and validate novel genomic signal processing methodologies for microarray data analysis based on different DSP methods. More specifically, clustering algorithms based on linear prediction coding, wavelet decomposition and fractal dimension methods, combined with a vector quantisation algorithm, are applied and compared on a set of test microarray datasets. These techniques take microarray gene expression samples as input and produce arrays of predictive coefficients associated with the microarray data, which are quantised in discrete levels and then used for sample clustering. A variety of standard microarray datasets are used in this work to validate the robustness of these methods compared to conventional methods. Two well-known validation approaches, i.e. the Silhouette and Davies-Bouldin index methods, are applied to evaluate internally and externally the genomic signal processing clustering results. In conclusion, the results demonstrate that genomic signal processing based methods outperform traditional methods by providing higher clustering accuracy. Moreover, the study shows that the local features of the gene expression signals are better clustered using wavelets compared to the other DSP methods.
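The Silhouette index named in the abstract above as one of the validation measures can be sketched briefly. This follows the standard textbook definition of the silhouette width with Euclidean distance; the toy points and labels are hypothetical and are not taken from the thesis, whose own implementation and datasets are not described here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette width over all points; requires at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    widths = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Hypothetical toy samples: two tight pairs that are far apart.
samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(samples, [0, 0, 1, 1])  # groups the nearby points together
bad = silhouette_score(samples, [0, 1, 0, 1])   # groups the distant points together
```

A score near 1 indicates compact, well-separated clusters, and a negative score indicates that points sit closer to another cluster than to their own, which is why the first labelling scores higher than the second.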
|
99 |
Investigating optical flow and tracking techniques for recovering motion within image sequences
Corvee, Etienne January 2005
Analysing objects interacting in a 3D environment and captured by a video camera requires knowledge of their motions. Motion estimation provides such information, and consists of recovering 2D image velocity, or optical flow, of the corresponding moving 3D objects. A gradient-based optical flow estimator is implemented in this thesis to produce a dense field of velocity vectors across an image. An iterative and parameterised approach is adopted which fits planar motion models locally on the image plane. Motion is then estimated using a least-squares minimisation approach. The possible approximations of the optical flow derivative are shown to differ greatly when the magnitude of the motion increases. However, the widely used derivative term remains the optimal approximation to use in the range of accuracies of the gradient-based estimators, i.e. small motion magnitudes. Gradient-based estimators do not estimate motion robustly when noise, large motions and multiple motions are present across object boundaries. A robust statistical and multi-resolution estimator is developed in this study to address these limitations. Despite significant improvement in performance, the multiple motion problem remains a major limitation. A confidence measurement is designed around optical flow covariance to represent motion accuracy, and is shown to visually represent the lack of robustness across motion boundaries. The recent hyperplane technique is also studied as a global motion estimator but proved unreliable compared to the gradient-based approach. A computationally expensive optical flow estimator is then designed for the purpose of detecting at frame-rate moving objects occluding background scenes which are composed of static objects captured by moving pan and tilt cameras. This was achieved by adapting the estimator to perform global motion estimation, i.e. estimating the motion of the background scenes.
Moving objects are segmented from a thresholding operation on the grey-level differences between motion compensated background frames and captured frames. Filtering operations on small object dimensions and using moving edge information produced reliable results with small levels of noise. The issue of tracking moving objects is studied with the specific problem of data correspondence in occlusion scenarios.
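The gradient-based, least-squares motion estimation described in the abstract above can be sketched in its simplest form. The sketch assumes a single translational motion (u, v) over the whole patch rather than the planar motion models the thesis fits, and it is not iterative, robust, or multi-resolution; the function name, the synthetic brightness pattern, and the chosen shift are all hypothetical.

```python
import math


def estimate_translation(frame1, frame2):
    """Least-squares (u, v) for one translational motion over the whole patch.

    Solves the normal equations of Ix*u + Iy*v + It = 0 summed over interior
    pixels. Frames are lists of rows, indexed as frame[y][x].
    """
    sxx = sxy = syy = sxt = syt = 0.0
    height, width = len(frame1), len(frame1[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            ix = (frame1[y][x + 1] - frame1[y][x - 1]) / 2.0  # central difference in x
            iy = (frame1[y + 1][x] - frame1[y - 1][x]) / 2.0  # central difference in y
            it = frame2[y][x] - frame1[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("gradient matrix is singular (aperture problem)")
    # Cramer's rule on [[sxx, sxy], [sxy, syy]] @ [u, v] = [-sxt, -syt]
    u = (-sxt * syy + syt * sxy) / det
    v = (-syt * sxx + sxt * sxy) / det
    return u, v


def brightness(x, y):
    """Hypothetical smooth brightness pattern used to synthesise two frames."""
    return math.sin(0.3 * x) + math.cos(0.2 * y)


true_u, true_v = 0.2, -0.1  # small sub-pixel shift, where the linearisation holds
size = 20
frame_a = [[brightness(x, y) for x in range(size)] for y in range(size)]
frame_b = [[brightness(x - true_u, y - true_v) for x in range(size)] for y in range(size)]

u, v = estimate_translation(frame_a, frame_b)
```

On this synthetic pair the estimate lands close to the true shift. The sketch relies on the motion being small, which matches the abstract's remark that gradient-based estimators are accurate only for small motion magnitudes.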
|
100 |
Motion estimation and segmentation of colour image sequences
Amanatidis, Dimitrios E. January 2008
The principal objective of this thesis is to develop improved motion estimation and segmentation techniques that meet the image-processing requirements of the post-production industry. Starting with a rigorous taxonomy of existing image segmentation techniques, we proceed by focusing on motion estimation by means of optical flow calculation. A parametric motion model based method to estimate optical flow fields on three consecutive frames is developed and tested on a number of real colour sequences. Initial estimates are robustly refined in an iterative scheme and are enhanced by colour probability distribution information to enable foreground/background segmentation in a maximum a posteriori pixel classification scheme. Experiments show the significant contribution of the colour part towards a well-segmented image. Additionally, a very accurate variational optical flow computation method based on brightness constancy, gradient constancy and spatiotemporal smoothness constraints is modified and implemented so that it can robustly estimate global motion over three consecutive frames. Motion is enhanced by colour evidence in a similar manner and the method adopts the same probabilistic labelling procedure. After a comparison of the two methods on the same colour sequences, a third neural network based method is implemented, which initially estimates motion by employing two twin-layer optical flow calculating Cellular Neural Networks (CNNs) and proceeds in a similar manner (incorporating colour information and probabilistically classifying pixels), leading to similar or improved quality results with the added advantage of significantly accelerated performance. Moreover, another CNN is employed with the task of offering spatial and temporal pixel compatibility constraint support, further improving the quality of the segmented images.
Weights are used to control the respective contributing terms, enabling optimization of the segmentation results for each sequence individually. Finally, as a case study of CNN implementation in hardware (FPGA), Handel-C, a C-like, high-level, parallel hardware description language, is used to allow for rapid translation of our algorithms to efficient hardware.
|