521 | Towards faster web page loads over multiple network paths. Salameh, Lynne. January 2018.
The rising popularity of mobile devices as the main way people access the web has fuelled a corresponding need for faster web downloads on these devices. Emerging web protocols like HTTP/2 and QUIC employ several features that minimise page load times, but fail to take advantage of the availability of at least two interfaces on today's mobile devices. On the other hand, this spread of devices with access to multiple paths has prompted the design of Multipath TCP (MPTCP), a transport protocol that pools bandwidth across these paths. Although MPTCP was originally evaluated for bandwidth-limited bulk transfers, in this work we determine whether using MPTCP can reduce web page load times, which are often latency bound. To investigate the behaviour of web browsing over MPTCP, we instrumented the Chrome web browser's retrieval of 300 popular web sites in sufficient detail to compute their dependency graph structure. Furthermore, we implemented PCP, an emulation framework that uses these dependency graphs to ask "what-if" questions about the interactions between a wide range of web site designs, varied network conditions, and different web and transport protocols. Using PCP, we first confirm previous results with respect to the improvements HTTP/2 offers over HTTP/1.1. One obstacle, though, is that many web sites have been sharded to improve performance with HTTP/1.1, spreading their content across multiple subdomains. We therefore examine whether the advice to unshard these domains is beneficial. We find that unsharding is generally advantageous, but not consistently so. Finally, we examine the behaviour of HTTP/2 over MPTCP. We find that MPTCP can improve web page load times under some regimes; in other cases, using regular TCP on the "best" path is more advantageous. We present enhancements to multipath web browsing that allow it to perform as well as or better than regular TCP on the best path.
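A minimal sketch of the kind of "what-if" computation such a dependency graph enables is shown below: page load time estimated as the longest (critical) path through per-object fetch times. The graph, timings and single-number fetch model are hypothetical illustrations; the PCP emulator described above models protocols and network conditions in far more detail.

```python
from functools import lru_cache

# Hypothetical per-object fetch times (ms) and dependency graph for a toy page.
fetch_ms = {"html": 120, "css": 60, "js": 90, "hero.jpg": 150, "font.woff": 40}
depends_on = {                      # object -> objects that must finish first
    "html": [],
    "css": ["html"],
    "js": ["html"],
    "font.woff": ["css"],
    "hero.jpg": ["js", "css"],
}

@lru_cache(maxsize=None)
def finish_time(obj: str) -> int:
    """Earliest completion time of an object given its dependencies."""
    start = max((finish_time(dep) for dep in depends_on[obj]), default=0)
    return start + fetch_ms[obj]

page_load_time = max(finish_time(obj) for obj in fetch_ms)
print(page_load_time, "ms")         # critical path: html -> js -> hero.jpg = 360 ms
```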
|
522 | Inferring user needs and tasks from user interactions. Mehrotra, R. January 2018.
The need for search often arises from a broad range of complex information needs or tasks (such as booking travel, buying a house, etc.) which lead to lengthy search processes characterised by distinct stages and goals. While existing search systems are adept at handling simple information needs, they offer limited support for tackling complex tasks. Accurate task representations could be useful in aptly placing users in the task-subtask space and would enable systems to contextually target the user, provide them with better query suggestions, personalisation and recommendations, and help in gauging satisfaction. The major focus of this thesis is to work towards task-based information retrieval systems - search systems which are adept at understanding, identifying and extracting tasks as well as supporting users' complex search task missions. This thesis focuses on two major themes: (i) developing efficient algorithms for understanding and extracting search tasks from user logs and (ii) leveraging the extracted task information to better serve the user via different applications. Based on log analysis of terabyte-scale data from a real-world search engine, a detailed analysis of user interactions with search engines is provided. On the task extraction side, two Bayesian non-parametric methods are proposed to extract subtasks from a complex task and to recursively extract hierarchies of tasks and subtasks. A novel coupled matrix-tensor factorisation model is proposed that represents users based on their topical interests and task behaviours. Beyond personalisation, the thesis demonstrates that task information provides better context to learn from, and proposes a novel neural task context embedding architecture to learn query representations. Finally, the thesis examines implicit signals of user interactions and considers the problem of predicting user satisfaction when engaged in complex search tasks. A unified multi-view deep sequential model is proposed to make query- and task-level satisfaction predictions.
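As a purely illustrative sketch of the task-extraction problem (not the Bayesian non-parametric models proposed in the thesis), the snippet below groups consecutive queries from a log into tasks by lexical Jaccard similarity; the similarity threshold and example log are hypothetical.

```python
# Naive task segmentation of a query log by lexical overlap between consecutive queries.

def jaccard(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def segment_into_tasks(queries, threshold=0.3):
    tasks, current = [], [queries[0]]
    for prev, query in zip(queries, queries[1:]):
        if jaccard(prev, query) >= threshold:
            current.append(query)          # enough shared terms: same task
        else:
            tasks.append(current)          # otherwise start a new task
            current = [query]
    tasks.append(current)
    return tasks

log = ["cheap flights to rome", "rome flights april", "rome hotels city centre",
       "best pizza in rome", "python pandas groupby"]
print(segment_into_tasks(log))
```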
|
523 | Editing fluid simulations with jet particles. Hodgson, Julian. January 2018.
Fluid simulation is an important topic in computer graphics in the pursuit of adding realism to films, video games and virtual environments. The results of a fluid simulation are hard to edit in a way that provides a physically plausible solution. Edits need to preserve the incompressibility condition in order to create natural-looking water and smoke simulations. In this thesis we present an approach that provides a simple, artist-friendly interface for designing and editing complex fluid-like flows that are guaranteed to be incompressible in two and three dimensions. Key to our method is a formulation for the design of flows using jet particles. Jet particles are Lagrangian solutions to a regularised form of Euler's equations, and their velocity fields are divergence-free, which motivates their use in computer graphics. We constrain their dynamics to design divergence-free flows and utilise them effectively in a modern visual effects pipeline. Using just a handful of jet particles we produce visually convincing flows that implicitly satisfy the incompressibility condition. We demonstrate an interactive tool in two dimensions for designing a range of divergence-free deformations. Further, we describe methods to couple these flows with existing simulations in order to give the artist creative control beyond the initial outcome. We present examples of local temporal edits to smoke simulations in 2D and 3D. The resulting methods provide promising new ways to design and edit fluid-like deformations and to create general deformations in 3D modelling. We show how to represent existing divergence-free velocity fields using jet particles, and design new vector fields for use in fluid control applications. Finally we provide an efficient implementation for deforming grids, meshes, volumes, level sets, vectors and tensors, given a jet particle flow.
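The incompressibility property that motivates jet particles can be illustrated with a much simpler construction: in 2D, any velocity field derived from a scalar stream function is divergence-free by design. The sketch below, with a hypothetical Gaussian stream function, is only an illustration of that principle, not the jet-particle formulation itself.

```python
import numpy as np

# In 2D, a velocity field built from a stream function psi as
#   v = (d(psi)/dy, -d(psi)/dx)
# is divergence-free by construction.

def gaussian_psi(x, y, cx=0.5, cy=0.5, strength=1.0, radius=0.15):
    """A smooth, localised stream function (hypothetical parameters)."""
    return strength * np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * radius ** 2))

n = 128
xs, ys = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n), indexing="ij")
psi = gaussian_psi(xs, ys)
h = 1.0 / (n - 1)

u = np.gradient(psi, h, axis=1)    # d(psi)/dy
v = -np.gradient(psi, h, axis=0)   # -d(psi)/dx

# Discrete divergence should be close to zero everywhere (up to finite-difference error).
div = np.gradient(u, h, axis=0) + np.gradient(v, h, axis=1)
print("max |div|:", np.abs(div).max())
```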
|
524 | Optimal real-time bidding for display advertising. Zhang, W. January 2016.
Real-Time Bidding (RTB) is revolutionising display advertising by facilitating a real-time auction for each ad impression. Because they are able to use impression-level data, such as user cookies and context information, advertisers can adaptively bid for each ad impression. It is therefore important that an advertiser designs an effective bidding strategy, which can be abstracted as a function mapping the information of a specific ad impression to a bid price. Exactly how this bidding function should be designed is a non-trivial problem, one which involves multiple factors such as the campaign-specific key performance indicator (KPI), the campaign lifetime auction volume and the budget. This thesis is focused on the design of automatic solutions to this problem of creating optimised bidding strategies for RTB auctions: strategies which are optimal from the perspective of an advertiser agent, that is, which maximise the campaign's KPI subject to the constraints of auction volume and budget. The problem is mathematically formulated as a functional optimisation framework in which the optimal bidding function can be derived without any restriction on its functional form. Beyond single-campaign bid optimisation, the proposed framework can be extended to multi-campaign cases, where a portfolio-optimisation solution of auction volume reallocation is performed to maximise the overall profit with a controlled risk. On the model learning side, an unbiased learning scheme is proposed to address the data bias problem resulting from ad auction selection, where we derive a "bid-aware" gradient descent algorithm to train unbiased models. Moreover, the problem of robustly achieving the expected KPIs in a dynamic RTB market is addressed with a feedback control mechanism for bid adjustment. To support the theoretical derivations, extensive experiments are carried out based on large-scale real-world data. The proposed solutions have been deployed in three commercial RTB systems in China and the United States. The online A/B tests have demonstrated substantial improvements of the proposed solutions over strong baselines.
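For illustration only, the sketch below shows the shape of the problem: a bidding function mapping a predicted click-through rate to a bid price, with a crude budget-pacing adjustment. It uses the common linear-bidding baseline with hypothetical parameters, not the optimal bidding function derived in the thesis.

```python
# Illustrative linear bidding baseline with naive budget pacing (all values hypothetical).

def linear_bid(predicted_ctr: float,
               base_bid: float = 80.0,      # bid price at the campaign's average CTR
               avg_ctr: float = 0.001) -> float:
    """Map an impression's predicted CTR to a bid price, proportional to the CTR."""
    return base_bid * predicted_ctr / avg_ctr

def paced_bid(predicted_ctr: float, budget_left: float, time_left_fraction: float,
              planned_spend_rate: float) -> float:
    """Scale the bid down when spending runs ahead of the planned budget pace."""
    bid = linear_bid(predicted_ctr)
    pace = min(1.0, budget_left / max(planned_spend_rate * time_left_fraction, 1e-9))
    return bid * pace

print(paced_bid(predicted_ctr=0.0023, budget_left=450.0,
                time_left_fraction=0.4, planned_spend_rate=1000.0))
```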
|
525 | The blind software engineer: improving the non-functional properties of software by means of genetic improvement. Bruce, Bobby R. January 2018.
Life, even in its most basic of forms, continues to amaze mankind with the complexity of its design. When analysing this complexity it is easy to see why the idea of a grand designer has been such a prevalent idea in human history. If it is assumed that intelligence is required to undertake a complex engineering feat, such as developing a modern computer system, then it is logical to assume that a creature, even one as basic as an earthworm, is the product of an even greater intelligence. Yet, as Darwin observed, intelligence is not a requirement for the creation of complex systems. Evolution, a phenomenon without consciousness or intellect, can, over time, create systems of grand complexity and order. From this observation a question arises: is it possible to develop techniques inspired by Darwinian evolution to solve engineering problems without engineers? The first to ask such a question was Alan Turing, a person considered by many to be the father of computer science. In 1948 Turing proposed three approaches he believed could solve complex problems without the need for human intervention. The first was a purely logic-driven search. This arose a decade later in the form of general problem-solving algorithms. Though successful on toy problems that could be sufficiently formalised, such approaches proved infeasible for real-world problems. The second approach Turing called 'cultural search'. This approach would store libraries of information to then reference and provide solutions to particular problems in accordance with this information. This is similar to what we would now refer to as an expert system. Though the first expert system is hard to date due to differences in definition, the development is normally attributed to Feigenbaum, Buchanan, Lederberg, and Sutherland for their work, originating in the 1960s, on the DENDRAL system. Turing's last proposal was an iterative, evolutionary technique which he later expanded on, stating: "We cannot expect to find a good child-machine at the first attempt. One must experiment with teaching one machine and see how well it learns. One can then try another and see if it is better or worse. There is an obvious connection between this process and evolution". Though a primitive proposal in comparison to modern techniques, Turing clearly identified the foundation of what we now refer to as Evolutionary Computation (EC). EC borrows principles from biological evolution and adapts them for use in computer systems. Despite EC initially appearing to be an awkward melding of the two orthogonal disciplines of biology and computer science, useful ideas from evolutionary theory can be utilised in engineering processes. Just as man dreamt of flight from watching birds, EC researchers dream of self-improving systems from observing evolutionary processes. Despite these similarities, evolutionary-inspired techniques in computer science have yet to build complex software systems from scratch. Though they have been successfully utilised to solve complex problems, such as classification and clustering, there is a general acceptance that, as in nature, these evolutionary processes take vast amounts of time to create complex structures from simple starting points. Even the best computer systems cannot compete with nature's ability to evaluate many millions of variants in parallel over the course of millennia. It is for this reason that research into modifying and optimising already existing software, a process known as Genetic Improvement, has blossomed.
Genetic Improvement (commonly referred to as 'GI') modifies existing software using search-based techniques with respect to some objective. These search-based techniques are typically evolutionary and, if not, are based on iterative improvement, which we may view as a form of evolution. GI sets out to solve the 'last mile' problems of software development: problems that arise close to completion, such as bugs or sub-optimal performance. It is the genetic improvement of non-functional properties, such as execution time and energy consumption, with which we concern ourselves in this thesis, as we find it to be the most interesting and exciting area of research. It is hoped that those referencing this thesis may share the same vision: that the genetic improvement of non-functional properties has the potential to transform software development, and that the work presented here is a step towards that goal. The thesis is divided into six chapters (inclusive of this 'Introduction' chapter). In Chapter 2 we explain the background material necessary to understand the content discussed in the following chapters. From this, in Chapter 3, we highlight our investigations into the novel non-functional property of energy consumption which, in part, includes a study of how energy may be reduced via the approximation of output. We then expand on this in Chapter 4 by discussing our investigations into the applicability of GI in the domain of approximate computing, which covers a study into optimising the non-functional properties of software running on novel hardware - in this case, Android tablet devices. We then show, in Chapter 5, early research into how GI may be used to specialise software for specific hardware targets; in particular, how GI may automatically modify sequential code to run on GPUs. Finally, in Chapter 6 we discuss relevant work currently being undertaken in the area of genetic improvement, and provide the reader with clear and concise take-away messages from this thesis.
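A minimal sketch of a genetic-improvement-style search loop is given below: repeatedly mutate a program variant, measure a non-functional fitness (here, wall-clock time guarded by a test run), and keep improvements. The mutation operators, fitness measure and acceptance rule are hypothetical stand-ins for the techniques discussed in this thesis.

```python
import random
import time

def mutate(program_lines):
    """Produce a variant by deleting, swapping or copying a randomly chosen line."""
    variant = list(program_lines)
    op = random.choice(["delete", "swap", "copy"])
    i, j = random.randrange(len(variant)), random.randrange(len(variant))
    if op == "delete":
        del variant[i]
    elif op == "swap":
        variant[i], variant[j] = variant[j], variant[i]
    else:
        variant.insert(i, variant[j])
    return variant

def fitness(program_lines, run_and_test):
    """Lower is better: time taken to run the tests if they pass, else infinity."""
    start = time.perf_counter()
    passed = run_and_test(program_lines)
    elapsed = time.perf_counter() - start
    return elapsed if passed else float("inf")

def improve(program_lines, run_and_test, generations=500):
    """Hill-climbing search over program variants for a faster, still-correct program."""
    best, best_fit = program_lines, fitness(program_lines, run_and_test)
    for _ in range(generations):
        candidate = mutate(best)
        cand_fit = fitness(candidate, run_and_test)
        if cand_fit < best_fit:          # keep only strictly improving variants
            best, best_fit = candidate, cand_fit
    return best
```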
|
526 | Learn to automate GUI tasks from demonstration. Intharah, Thanapong. January 2018.
This thesis explores and extends Computer Vision applications in the context of Graphical User Interface (GUI) environments to address the challenges of Programming by Demonstration (PbD). These are challenges in PbD that can be addressed through innovations in Computer Vision when GUIs are treated as an application domain, analogous to automotive or factory settings. Existing PbD systems were restricted to particular application domains or special application interfaces. Although they use the term Demonstration, these systems did not actually see what the user performed; rather, they listened to the demonstrations through internal communication with the operating system. Machine Vision and Human-in-the-Loop Machine Learning are used to circumvent many of these restrictions, allowing the PbD system to watch the demonstration as another human observer would. This thesis will demonstrate that our prototype PbD systems allow non-programmer users to easily create their own automation scripts for their repetitive and looping tasks. Our PbD systems take their input from sequences of screenshots, and sometimes from easily available keyboard and mouse sniffer software. It will also be shown that the problem of inconsistent human demonstration can be remedied with our proposed Human-in-the-Loop Computer Vision techniques. Lastly, the problem is extended to learning from demonstration videos. Due to the sheer complexity of computer desktop GUI manipulation videos, attention is focused on the domain of video game environments. The initial studies illustrate that it is possible to teach a computer to watch gameplay videos and to estimate which buttons the user pressed.
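The vision-based idea of watching the screen rather than hooking operating-system events can be sketched, very roughly, with off-the-shelf template matching: locate an image of a GUI element in a screenshot and act on it. The file name, confidence threshold and single-action script below are hypothetical; the prototype systems described above go well beyond this.

```python
import cv2
import numpy as np
import pyautogui

def click_button(template_path: str, confidence: float = 0.8) -> bool:
    """Find a GUI element on screen by template matching and click its centre."""
    template = cv2.imread(template_path)              # small image of the target button
    screen = cv2.cvtColor(np.array(pyautogui.screenshot()), cv2.COLOR_RGB2BGR)

    result = cv2.matchTemplate(screen, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if max_val < confidence:                          # button not visible on screen
        return False

    h, w = template.shape[:2]
    pyautogui.click(max_loc[0] + w // 2, max_loc[1] + h // 2)
    return True

click_button("ok_button.png")                          # hypothetical template file
```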
|
527 | Inferring the global financial network from high-dimensional time-series of stock returns. Tungsong, Sachapon. January 2018.
Connectedness in a financial network refers to the structure of interlinkages among financial institutions, which encompasses three aspects: which institutions are linked, how many of the institutions are linked, and the magnitude of the linkages. This research measures time-varying connectedness in the global financial network using the following two frameworks: (1) vector autoregression-forecast error variance decomposition (VAR-FEVD) and (2) the information filtering network-based algorithm LoGo-TMFG. In the first framework we construct a full connectedness network in which each financial institution is linked to the others using VAR. By contrast, in the second framework we construct a sparse connectedness network in which only significant links are kept and insignificant links are set to zero using LoGo-TMFG, a novel sparse modeling algorithm. We show that both frameworks reveal strong variations of connectedness during past crises, but the connectedness measure computed on the sparse network can distinguish major crises better than that computed on the full network. This suggests that sparse modeling using the LoGo-TMFG algorithm increases the signal-to-noise ratio in the data and improves the interpretability of the connectedness measure, which leads to better statistical inference. In the first framework we analyze bank returns in North America, the European Union, and Southeast Asia from 2005 to 2016. We find that the North American system has the highest connectedness, suggesting that it is the most interconnected system. We perform Granger causality and transfer entropy tests, which indicate that the connectedness of the North American system led that of the EU and Southeast Asia. Through our analysis we make technical improvements to the VAR-FEVD methodology and deal with the issues of outliers and overfitting of the VAR model. In the second framework we study rolling windows of high-dimensional datasets comprising companies in the financial sector (GICS 40) globally from 1990 to 2016. Analyzing the global financial network as a system of ten economic regions, we find that the regions become more interconnected over time, as evidenced by the increase in the number and size of inter-regional links. In addition, the regions are more interconnected during crises than during normal periods. North America and Europe, the two dominant regions, were connected to all other regions over the sample period from 1990 to 2016, and the links between these two regions were much stronger than those between the other regions. We find that North America, especially the U.S., was dominated by banks (GICS 4010), as they were the most impactful and vulnerable industry throughout the entire sample period. For the other regions, the dominant industry alternates between diversified financials (GICS 4020) and banks (GICS 4010). In this framework we contribute to the literature by addressing high dimensionality in financial data using the novel LoGo-TMFG algorithm, in what is the first application of the algorithm to connectedness measurement. In addition, our datasets are unique and much larger than those in other studies, with each rolling window containing up to 4,310 financial companies. By analyzing rolling windows of data, each of which contains companies that were active during the corresponding three-year period, we address the survivorship bias issue that many other studies do not.
Our research findings are especially beneficial for policy makers, e.g., central banks, who can use our connectedness metrics to enhance systemic risk monitoring. Practitioners on macro research or macro trading desks at banks or asset managers can also make use of both the methodologies we used and the research findings.
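To illustrate the VAR-FEVD side of the analysis, the sketch below computes Diebold-Yilmaz-style "to", "from", "net" and total connectedness measures from a forecast error variance decomposition share matrix. The three-institution matrix is hypothetical and the measures are generic, not the exact metrics reported in the thesis.

```python
import numpy as np

# Row i holds the forecast error variance shares of institution i attributable to
# shocks in each institution j (rows sum to 1). Hypothetical 3-institution example.
fevd = np.array([
    [0.70, 0.20, 0.10],
    [0.15, 0.75, 0.10],
    [0.05, 0.25, 0.70],
])

n = fevd.shape[0]
off_diag = fevd - np.diag(np.diag(fevd))

to_others = off_diag.sum(axis=0)        # spillovers transmitted by each institution
from_others = off_diag.sum(axis=1)      # spillovers received by each institution
net = to_others - from_others           # net transmitter (>0) or net receiver (<0)
total_connectedness = off_diag.sum() / n

print("to others:   ", to_others)
print("from others: ", from_others)
print("net:         ", net)
print("total connectedness:", round(total_connectedness, 3))
```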
|
528 | Representation learning for anomaly detection in computer vision. Andrews, Jerone Theodore Alexander. January 2018.
This thesis is a collection of three engineering-based research contributions, aiming to detect anomalous images without a priori knowledge of the anomaly class. However, devising discriminative data representations in such settings is patently problematic. Obviating the need for explicit prior domain knowledge, this work roots itself in representation learning, using deep convolutional neural networks charged with solving pseudo-tasks. To begin, we investigate unsupervised auto-associative sparse dictionary learning to infer a set of basic elements. Significantly, we show that these elements are not unique to the training data and can be utilised for the faithful reconstruction of anomalous images. Furthermore, we highlight that encoded representations do not always improve upon those in raw pixel space. Moving away from reconstruction-based approaches, in our second contribution we propose a novel deep distance metric learning approach, generating freely available supervisory signals that exist within visual data. Importantly, we demonstrate that the learnt appearance features can be effectively combined with generic pretrained image representations. Finally, premised on the notion that learning to recognise one kind of object assists with identifying another, we explore supervised inductive transfer learning. Representations are induced by learning to discriminate between different sub-concepts of the normal data, using fine-grained semantic labels. By forming a distribution over the sub-concepts of the normal class, we are able to detect previously unseen samples that deviate from the overarching concept. Notably, we show that current out-of-distribution detectors which utilise the maximum softmax probability as an anomaly score are incapable of illuminating the similarity of a novel sample to a universal concept of normality.
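A minimal sketch of the maximum-softmax-probability baseline criticised above: score a sample by one minus its largest softmax probability and flag it when the score exceeds a threshold. The logits and the threshold are hypothetical.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max(axis=-1, keepdims=True)   # stabilise before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def anomaly_score(logits: np.ndarray) -> np.ndarray:
    """Higher score = less confident classifier = more likely out-of-distribution."""
    return 1.0 - softmax(logits).max(axis=-1)

logits = np.array([[9.1, 0.3, 0.2],     # confidently classified 'normal' sample
                   [1.2, 1.1, 1.0]])    # uncertain sample, flagged as anomalous
scores = anomaly_score(logits)
print(scores > 0.5)                      # hypothetical decision threshold
```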
|
529 | Software restructuring: understanding longitudinal architectural changes and refactoring. Paixao, Matheus. January 2018.
The complexity of software systems increases as the systems evolve. As degradation of a system's structure accumulates, maintenance effort and defect-proneness tend to increase. In addition, developers often opt to employ sub-optimal solutions in order to achieve short-term goals, a phenomenon that has recently been called technical debt. In this context, software restructuring serves as a way to alleviate and/or prevent structural degradation. Restructuring of software is usually performed at either higher or lower levels of granularity, where the former indicates broader changes to the system's structural architecture and the latter indicates refactorings applied to fewer, more localised code elements. Although tools to assist architectural changes and refactoring are available, there is still no evidence that these approaches are widely adopted by practitioners. Hence, an understanding of how developers perform architectural changes and refactoring on a daily basis, and in the context of the software development processes they adopt, is necessary. Current software development is iterative and incremental, with short cycles of development and release. Thus, tools and processes that enable this development model, such as continuous integration and code review, are widespread among software engineering practitioners. Hence, this thesis investigates how developers perform longitudinal and incremental architectural changes and refactoring during code review through a wide range of empirical studies that consider different moments of the development lifecycle, different approaches, different automated tools and different analysis mechanisms. Finally, the observations and conclusions drawn from these empirical investigations extend the existing knowledge of how developers restructure software systems, so that future studies can leverage this knowledge to propose new tools and approaches that better fit developers' working routines and development processes.
|
530 | An algorithmic investigation of conviction narrative theory: applications in business, finance and economics. Nyman, R. B. E. January 2016.
The thesis aims to make conviction narrative theory (CNT) operational and to test its validity via a combination of text analysis, network analysis and machine learning techniques. CNT is a theory of decision-making asserting that, when faced with uncertainty, agents are able to act by constructing narratives that yield conviction. The developed methodology is directed by CNT and therefore limits problems related to spurious correlations frequently encountered in studies using large datasets. The thesis provides empirical support for the theory and shows how it can be used to understand the economy and financial markets. The thesis develops a relative sentiment shift (RSS) methodology that captures emotional variables within text archives, and tests the extent to which these can be measured accurately enough to establish the causal economic and financial relationships hypothesised by the theory to exist at the macro level. Better-than-economic-consensus forecasts of the Michigan Consumer Confidence index, statistically significant explanatory power for real US GDP growth, and evidence of causality from relative sentiment to the VIX, the most widely used measure of financial market volatility, are obtained in the process. On the micro level, the RSS methodology is applied to particular narratives to test theoretical expectations, showing how it can be used to measure the emergence of phantastic object narratives: narratives for which anxiety substantially disappears despite the existence of conflicting evidence. To illustrate the importance of the overall ecology of narratives for understanding shifts in macro sentiment and financial stability, as well as to qualitatively relate the macro and micro approaches, a form of dynamic content network analysis is applied. Using this narrative model, measures of the degree of formation of a dominant narrative are shown to correlate with RSS and to Granger-cause indicators of financial stability, such as the VIX and the S&P 500 index.
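A toy sketch of a relative-sentiment-style text score is given below: count "excitement" and "anxiety" terms in a document and combine them into a single relative measure. The word lists and normalisation are hypothetical and far cruder than the RSS methodology developed in the thesis.

```python
# Toy relative-sentiment score: (excitement terms - anxiety terms) / total tokens.

EXCITEMENT = {"confident", "optimistic", "boom", "strong", "opportunity"}
ANXIETY = {"worried", "fear", "crisis", "uncertain", "downturn"}

def relative_sentiment(text: str) -> float:
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    n_excite = sum(t in EXCITEMENT for t in tokens)
    n_anx = sum(t in ANXIETY for t in tokens)
    total = len(tokens)
    return (n_excite - n_anx) / total if total else 0.0

print(relative_sentiment("Markets remain strong and investors are optimistic about "
                         "the opportunity, although some fear a downturn."))
```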
|