31 |
Novel Deep Learning Models for Spatiotemporal Predictive Tasks
Le, Quang 23 November 2022
Spatiotemporal Predictive Learning (SPL) is an essential research topic with many practical, real-world applications, e.g., motion detection, video generation, precipitation forecasting, and traffic flow prediction. The challenges in this field arise from the many characteristics of the data in both the time and space domains, and they vary with the specific task. Spatial analysis refers to the study of spatial features, such as location, latitude, longitude, elevation, the shape of objects, and other patterns. Temporal analysis concerns the time steps and time intervals of the data points in a sequence, also known as interval recording or time sampling. There are typically two types of time sampling: regular time sampling, in which the time interval is assumed to be fixed, and irregular time sampling, in which the time interval is arbitrary. The latter is closely related to the continuous-time prediction task, where data lie in continuous space. An efficient spatiotemporal predictive method therefore has to model spatial features properly under the given type of time sampling.
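The distinction between regular and irregular time sampling can be illustrated with a minimal sketch. The function names and the tolerance parameter below are illustrative assumptions, not part of the thesis; the sketch only checks whether consecutive time intervals are equal.

```python
def time_intervals(timestamps):
    """Return the gaps between consecutive timestamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def is_regularly_sampled(timestamps, tol=1e-9):
    """True when every interval equals the first one (within `tol`).

    `tol` is a hypothetical tolerance chosen for this example.
    """
    deltas = time_intervals(timestamps)
    return all(abs(d - deltas[0]) <= tol for d in deltas)


regular = [0, 1, 2, 3]          # fixed interval of 1
missing_step = [0, 1, 3, 4]     # the frame at t=2 is missing

print(is_regularly_sampled(regular))       # True
print(is_regularly_sampled(missing_step))  # False
print(time_intervals(missing_step))        # [1, 2, 1]
```

A sequence with a missing time step, as in the second list, is one simple instance of irregular sampling.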
In this thesis, we take advantage of Machine Learning (ML) and Deep Learning (DL) methods, which have achieved promising performance in many complicated computational tasks, and propose three DL-based models for Spatiotemporal Sequence Prediction (SSP) under several types of time sampling. First, we design the Trajectory Gated Recurrent Unit Attention (TrajGRU-Attention) model with novel attention mechanisms, namely Motion-based Attention (MA), to improve the performance of standard Convolutional Recurrent Neural Networks (ConvRNNs) in SSP tasks. In particular, the TrajGRU-Attention model alleviates the impact of the vanishing gradient, which causes blurry long-term predictions, and handles both regularly and irregularly sampled time series. Consequently, this model works effectively across different scenarios of spatiotemporal sequential data, especially time series with missing time steps. Second, building on the idea of Neural Ordinary Differential Equations (NODEs), we propose the Trajectory Gated Recurrent Unit integrating Ordinary Differential Equation techniques (TrajGRU-ODE) as a continuous time-series model. By combining Ordinary Differential Equation (ODE) techniques with the TrajGRU neural network, this model can perform continuous-time spatiotemporal prediction tasks and generate outputs with high accuracy. Compared to TrajGRU-Attention, TrajGRU-ODE benefits from the development of efficient and accurate ODE solvers. Finally, we combine the two models to create TrajGRU-Attention-ODE. NODEs are still at an early stage of research, and recent ODE-based models were designed for relatively simple tasks. In this thesis, we train the models on several video datasets to verify their ability in practical applications.
To evaluate the performance of the proposed models, we select four available spatiotemporal datasets of varying complexity: MovingMNIST, MovingMNIST++, and two real-life datasets, the HKO-7 weather radar dataset and KTH Action. On each dataset, we train, validate, and test with distinct types of time sampling to assess the prediction ability of our models. In summary, the experimental results on the four datasets indicate that the proposed models generate accurate and sharp predictions. Notably, the proposed models outperform state-of-the-art ODE-based approaches on SSP tasks under different interval-recording conditions.
|
32 |
On the use of $\alpha$-stable random variables in Bayesian bridge regression, neural networks and kernel processes
Jorge E Loria (18423207) 23 April 2024
The first chapter considers l_α-regularized linear regression, also termed Bridge regression. For α ∈ (0, 1), Bridge regression enjoys several statistical properties of interest, such as sparsity and near-unbiasedness of the estimates (Fan & Li, 2001). However, the main difficulty lies in the non-convex nature of the penalty for these values of α, which makes optimization challenging; usually it is only possible to find a local optimum. To address this issue, Polson et al. (2013) took a sampling-based, fully Bayesian approach to this problem, using the correspondence between the Bridge penalty and a power exponential prior on the regression coefficients. However, their sampling procedure relies on Markov chain Monte Carlo (MCMC) techniques, which are inherently sequential and do not scale to large problem dimensions. Cross-validation approaches are similarly computation-intensive. To this end, our contribution is a novel non-iterative method to fit a Bridge regression model. The main contribution lies in an explicit formula for Stein’s unbiased risk estimate for the out-of-sample prediction risk of Bridge regression, which can then be optimized to select the desired tuning parameters, allowing us to completely bypass MCMC as well as computation-intensive cross-validation approaches. Our procedure yields results in a fraction of the computational time of iterative schemes, without any appreciable loss in statistical performance.
Next, we build upon the classical and influential work of Neal (1996), who proved that the infinite-width scaling limit of a Bayesian neural network with one hidden layer is a Gaussian process when the network weights have bounded prior variance. Neal’s result has been extended to networks with multiple hidden layers and to convolutional neural networks, also with Gaussian process scaling limits. The tractable properties of Gaussian processes then allow straightforward posterior inference and uncertainty quantification, considerably simplifying the study of the limit process compared to a network of finite width. Neural network weights with unbounded variance, however, pose unique challenges. In this case, the classical central limit theorem breaks down, and it is well known that the scaling limit is an α-stable process under suitable conditions. However, the current literature is primarily limited to forward simulations under these processes, and the problem of posterior inference under such a scaling limit remains largely unaddressed, unlike in the Gaussian process case. To this end, our contribution is an interpretable and computationally efficient procedure for posterior inference, using a conditionally Gaussian representation, that allows full use of the Gaussian process machinery for tractable posterior inference and uncertainty quantification in the non-Gaussian regime.
Finally, we extend the previous chapter by considering a natural extension to deep neural networks through kernel processes. Kernel processes (Aitchison et al., 2021) generalize to deeper networks the notion proved by Neal (1996) by describing the non-linear transformation in each layer as a covariance matrix (kernel) of a Gaussian process. In this way, each successive layer transforms the covariance matrix of the previous layer by a covariance function. However, the covariance obtained by this process loses any possibility of representation learning, since the covariance matrix is deterministic. To address this, Aitchison et al. (2021) proposed deep kernel processes using Wishart and inverse Wishart matrices for each layer in deep neural networks. Nevertheless, their approach requires using a process that does not emerge from the limit of a classic neural network structure. We introduce α-stable kernel processes (α-KP) for learning posterior stochastic covariances in each layer. Our results show that our method substantially outperforms the approach proposed by Aitchison et al. (2021) on both simulated data and the benchmark Boston dataset.
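For reference, the Bridge regression problem mentioned in the first chapter of the abstract above is commonly written as a penalized least-squares objective. The notation below ($y$, $X$, $\beta$, $\lambda$, $p$) and the factor $\tfrac{1}{2}$ are assumptions made for this sketch, not taken from the thesis; the correspondence with the power exponential prior holds up to a rescaling of $\lambda$ by the noise variance.

```latex
% Bridge (l_alpha-penalized) regression; non-convex for 0 < alpha < 1
\hat{\beta}
  = \arg\min_{\beta \in \mathbb{R}^{p}}
    \; \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^2
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert^{\alpha},
  \qquad \alpha \in (0, 1),\; \lambda > 0 .

% Bayesian reading: under Gaussian noise, \hat{\beta} is the posterior mode
% when each coefficient has an independent power exponential prior
p(\beta_j \mid \lambda) \;\propto\; \exp\!\bigl(-\lambda \lvert \beta_j \rvert^{\alpha}\bigr).
```

Setting $\alpha = 1$ would recover the lasso penalty and $\alpha = 2$ the ridge penalty; the range $\alpha \in (0, 1)$ considered in the abstract is the one for which the penalty is non-convex.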
|
33 |
Optimizing Bike Sharing Systems: Dynamic Prediction Using Machine Learning and Statistical Techniques and Rebalancing
Almannaa, Mohammed Hamad 07 May 2019
The large increase in on-road vehicles over the years has resulted in cities facing challenges in providing high-quality transportation services. Traffic jams are a clear sign that cities are overwhelmed, and that current transportation networks and systems cannot accommodate the current demand without a change in policy, infrastructure, transportation modes, and commuter mode choice. In response to this problem, cities in a number of countries have started putting a threshold on the number of vehicles on the road by deploying a partial or complete ban on cars in the city center. For example, in Oslo, leaders have decided to completely ban privately-owned cars from its center by the end of 2019, making it the first European city to totally ban cars in the city center. Instead, public transit and cycling will be supported and encouraged in the banned-car zone, and hundreds of parking spaces in the city will be replaced by bike lanes.
As a government effort to support bicycling and offer alternative transportation modes, bike-sharing systems (BSSs) have been introduced in over 50 countries. BSSs aim to encourage people to travel by bike by distributing bicycles at stations located across a service area. Residents and visitors can borrow a bike from any station and then return it to any station near their destination. Bicycles are considered an affordable, easy-to-use, and healthy transportation mode, and BSSs show significant transportation, environmental, and health benefits.
As the use of BSSs has grown, imbalances in the system have become an issue and an obstacle to further growth. Imbalance occurs when bikers cannot drop off or pick up a bike because the bike station is either full or empty. This problem has been investigated extensively by many researchers and policy makers, and several solutions have been proposed. There are three major ways to address the rebalancing issue: static, dynamic, and incentivized. The incentivized approaches involve the users in the balancing effort: the operating company incentivizes them to change their destination in favor of keeping the system balanced. The other two approaches, static and dynamic, deal with the movement of bikes between stations, either during or at the end of the day, to overcome station imbalances. They both assume that the location and number of bike stations are fixed and that only the bikes can be moved. This is a realistic assumption given that current BSSs have only fixed stations. However, cities are dynamic, and their geographical and economic growth affects the distribution of trips and thus constantly changes BSS user behavior. In addition, work-related bike trips cause certain stations to face high demand during weekdays, while these same stations face low demand on weekends and thus may be of little use. Moreover, fixed stations fail to accommodate big events such as football games, holidays, or sudden weather changes.
This dissertation proposes a new generation of BSSs in which we assume some of the bike stations can be portable. This approach takes advantage of both types of BSSs: dock-based and dock-less. Towards this goal, a BSS optimization framework was developed at both the tactical and operational level. Specifically, the framework consists of two levels: predicting bike counts at stations using fast, online, and incremental learning approaches and then balancing the system using portable stations. The goal is to propose a framework to solve the dynamic bike sharing repositioning problem, aiming at minimizing the unmet demand, leading to increased user satisfaction and reducing repositioning/rebalancing operations.
This dissertation contributes to the field in five ways. First, a multi-objective supervised clustering algorithm was developed to identify the similarity of bike usage with respect to time events. Second, a dynamic, easy-to-interpret, rapid approach to predict bike counts at stations in a BSS was developed. Third, a univariate inventory model using a Markov chain process that provides an optimal range of bike levels at stations was created. Fourth, the advantages of portable bike stations were investigated using an agent-based simulation approach as a proof of concept. Fifth, mathematical and heuristic approaches were proposed to balance bike stations. / Doctor of Philosophy / Large urban areas are often associated with traffic congestion, high carbon monoxide and carbon dioxide (CO/CO2) emissions, fuel waste, and associated decreases in productivity. The estimated loss attributed to missed productivity and wasted fuel increased from $87.2 to $115 between 2007 and 2009. Driving in congested areas also results in long trip times. For instance, in 1993, drivers experienced trips that were 1.2 min/km longer in congested conditions.
As a result, commuters are encouraged to leave their cars at home and use public transportation modes instead. However, public transportation modes fail to deliver commuters to their exact destination. Users have to walk some distance, which is commonly called the “last mile”. Bike sharing systems (BSSs) have started to fill this gap, offering a flexible and convenient transportation mode for commuters around the clock. This is in addition to individual financial savings, health benefits, and reductions in congestion and emissions. Recent reports have shown BSSs spreading to over 50 countries.
This notable expansion of BSSs also brings daily logistical challenges due to the imbalanced demand, causing some stations to run empty while others become full. Rebalancing the bike inventory in a BSS is crucial to ensure customer satisfaction and the whole system’s effectiveness. Most of the operating costs are also associated with rebalancing. The current rebalancing approaches assume stations are fixed and thus don’t take into account that the demand changes from weekday to weekend as well as from peak to non-peak hours, making some stations useless during specific days of the week and times of day. Furthermore, cities change continually with regard to demographics or structures and thus the distribution of trips also changes continually, leading to re-installation of stations to accommodate the dynamic change, which is both impractical and costly.
In this dissertation, we propose a new generation of BSS in which we assume some stations are portable, meaning they can move during the day. They can be either stand-alone or an extension of existing stations with the goal of accommodating the dynamic changes in the distribution of trips during the day. To implement our new BSSs, we developed a BSS optimization framework. This framework consists of two components: predicting the bike counts at stations using fast approaches and then balancing the system using portable stations. The goal is to propose a framework to solve the dynamic bike sharing repositioning problem, aiming at minimizing the unmet demand, leading to increased user satisfaction and reducing repositioning/rebalancing operations.
This dissertation contributes to the field in five ways. First, a novel algorithm was developed to identify the similarity of bike usage with respect to time events. Second, easy-to-interpret and rapid approaches to predict bike counts at stations in a BSS were developed. Third, an inventory model using statistical techniques that provides an optimal range of bike levels at stations was created. Fourth, the advantages of portable bike stations were investigated. Fifth, a mathematical approach was proposed to balance bike stations.
|
34 |
Inference and Learning with Planning Models
Aineto García, Diego 02 September 2022
Inference and learning are the acts of reasoning about some collected evidence in order to reach a logical conclusion regarding the process that originated it. In the context of a state-space model, inference and learning are usually concerned with explaining an agent's past behaviour, predicting its future actions, or identifying its model. In this thesis, we present a framework for inference and learning in the state-space model underlying the classical planning model, and formulate a palette of inference and learning problems under this unifying umbrella. We also develop effective planning-based approaches to solve these problems using off-the-shelf, state-of-the-art planning algorithms. We will show that several core inference and learning problems that previous research has treated as disconnected can be formulated in a cohesive way and solved following homogeneous procedures using the proposed framework. Further, our work opens the way for new applications of planning technology, as it highlights the features that make the state-space model of classical planning different from other models. / The work developed in this doctoral thesis has been possible thanks to the FPU16/03184 fellowship that I have enjoyed for the duration of my PhD studies. I have also been supported by my advisors’ grants TIN2017-88476-C2-1-R, TIN2014-55637-C2-2-R-AR, and RYC-2015-18009. / Aineto García, D. (2022). Inference and Learning with Planning Models [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/185355
|
35 |
TDNet: A Generative Model for Taxi Demand Prediction / TDNet: En Generativ Modell för att Prediktera Taxiefterfrågan
Svensk, Gustav January 2019
Supplying the right number of taxis in the right place at the right time is very important for taxi companies. This paper presents the machine learning model Taxi Demand Net (TDNet), which predicts short-term taxi demand in different zones of a city. It is based on WaveNet, a causal dilated convolutional neural net for time-series generation. TDNet uses historical demand from recent years and transforms features such as time of day, day of week, and day of month into 26-hour taxi demand forecasts for all zones in a city. It has been applied to one city in northern Europe and one in South America. In northern Europe, an error of one taxi or less per hour per zone was achieved in 64% of the cases; in South America the number was 40%. In both cities, it beat the SARIMA and stacked ensemble benchmarks. This performance was achieved by tuning the hyperparameters with a Bayesian optimization algorithm. Additionally, weather and holiday features were added as input features in the northern European city, but they did not improve the accuracy of TDNet.
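The causal dilated convolution that WaveNet-style models build on can be sketched in a few lines. This is a minimal pure-Python illustration with hypothetical function names, zero padding on the left, and a single channel; it is not TDNet's implementation. The output at time t depends only on inputs at times t, t - d, t - 2d, and so on, where d is the dilation.

```python
def causal_dilated_conv1d(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation], treating x as 0 before t = 0."""
    n = len(x)
    y = [0.0] * n
    for t in range(n):
        for k, w in enumerate(weights):
            src = t - k * dilation
            if src >= 0:          # never look at future or pre-sequence samples
                y[t] += w * x[src]
    return y


def receptive_field(kernel_size, dilations):
    """Number of past-and-present inputs seen by a stack of dilated causal layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


demand = [1, 2, 3, 4, 5]  # made-up hourly demand values
print(causal_dilated_conv1d(demand, [1, 1], dilation=2))  # [1.0, 2.0, 4.0, 6.0, 8.0]
print(receptive_field(2, [1, 2, 4, 8]))                   # 16
```

Doubling the dilation from layer to layer lets the receptive field grow exponentially with depth, which is why this design suits long hourly histories.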
|
36 |
Choice Under Uncertainty: Violations of Optimality in Decision Making
Rodenburg, Kathleen 11 June 2013
This thesis investigates how subjects behave in an individual binary choice decision task with the option to purchase, or observe for free, additional information before reaching a decision. In Part 1 of this thesis, I review the literature on psychology and economics experiments designed to test decision tasks that involve purchasing and observing information from an imperfect message prior to making a terminal action choice. This review identifies areas of research that warrant further investigation and provides enhancements for the experiment conducted in Parts 2 and 3. In Parts 2 and 3, I conduct an experiment to test how subjects behave in such a task. I find that subjects’ behaviour over time converges toward optimal decisions prior to observing an imperfect information signal. However, when subjects observe an imperfect information signal prior to their terminal choice, there is greater deviation from optimal behaviour. In addition to behaviour reflective of a risk-neutral BEU maximizer, I find status quo bias, over-weighting of the informational value of the message received, and an influence of past, statistically independent outcomes on future choices. The subjects’ willingness to pay (WTP) to use the additional information gathered from an imperfect message service when making a final decision was on average less than the risk-neutral BEU willingness-to-pay benchmark. Moreover, as the informative value of the message increased, causing the BEU valuation to increase, subjects under-estimated the value of the message signal to a greater degree. Although risk attitudes may have influenced the subjects’ WTP decisions, they do not account for the increasingly conservative WTP behaviour when information became more valuable. Additionally, the findings from this study suggest that individuals adopt different decision rules depending on both personal attributes (i.e. skillset, gender, experience) and the context and environment in which the decision task is conducted. / SSHRC grant: Social Sciences and Humanities Research Council via Dr. Bram Cadsby, Professor, Department of Economics, University of Guelph
|
37 |
Essays on Inflation Expectations, Heterogeneous Agents, and the Use of Approximated Solutions in the Estimation of DSGE models
Ormeño Sánchez, Arturo 21 September 2011
In this thesis I evaluate departures from three common assumptions in macroeconomic modeling and estimation, namely the Rational Expectations (RE) hypothesis, the representative agent assumption, and the use of first-order approximations in the estimation of dynamic stochastic general equilibrium (DSGE) models. In the first chapter I determine how the use of survey data on inflation expectations in the estimation of a model alters the evaluation of the RE assumption in comparison to an alternative assumption, namely learning. In the second chapter, I use heterogeneous agent models to determine the relationship between income volatility and the demand for durable goods. In the third chapter I evaluate whether the use of first-order approximations in the estimation of a model could affect the evaluation of the determinants of the Great Moderation.
|
38 |
La construction des compétences d'enseignement des enseignants-chercheurs novices de l'université en France / Constructing the teaching competences of novice faculty members in France
Kiffer, Sacha 14 December 2016
How to teach in academia is most often learnt on the job (Knight, Tait & Yorke, 2006). But what does the phrase “on-the-job learning” genuinely mean in this case? This doctoral dissertation aims to identify the learning practices through which novice academics in France construct their teaching competences. A survey was carried out amongst novice academics, asking them to describe how eight learning models may have contributed to the process of constructing their teaching competences. Results show that novices’ practices are eclectic and mainly informal. While public authorities have for some time been developing formal training structures targeted at all newly hired academics, this research suggests that the variety of practices and the aspiration of novices to informality should also be taken into account.
|
39 |
Cardinality Estimation with Local Deep Learning Models
Woltmann, Lucas, Hartmann, Claudio, Thiele, Maik, Habich, Dirk, Lehner, Wolfgang 14 June 2022
Cardinality estimation is a fundamental task in database query processing and optimization. Unfortunately, the accuracy of traditional estimation techniques is poor, resulting in non-optimal query execution plans. With the recent expansion of machine learning into the field of data management, there is a general notion that data analysis, especially with neural networks, can lead to better estimation accuracy. Up to now, all proposed neural network approaches to cardinality estimation have followed a global approach that considers the whole database schema at once. These global models are prone to sparse training data, leading to misestimates for queries that were not represented in the sample space used for generating training queries. To overcome this issue, we introduce a novel local-oriented approach in this paper, where the local context is a specific sub-part of the schema. As we will show, this leads to a better representation of data correlation and thus better estimation accuracy. Compared to global approaches, our novel approach achieves an improvement of two orders of magnitude in accuracy and a factor of four in training time for local models.
|
40 |
Applications of Formal Explanations in ML
Smyrnioudis, Nikolaos January 2023
The most performant Machine Learning (ML) classifiers have been labeled black boxes due to the complexity of their decision process. eXplainable Artificial Intelligence (XAI) methods aim to alleviate this issue by crafting an interpretable explanation for a model's prediction. A drawback of most XAI methods is that they are heuristic and suffer from limitations such as non-determinism and locality. Formal Explanations (FE) have been proposed as a way to explain the decisions of classifiers by extracting a set of features that guarantees the prediction. In this thesis we explore these guarantees for different use cases: speeding up inference for tree-based Machine Learning classifiers, curriculum learning using said classifiers, and reducing training data. We find that under the right circumstances we can achieve up to a 6x speedup by partially compiling the model to a set of rules that are extracted using formal explainability methods.
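The idea of a feature set that guarantees a prediction can be illustrated on a toy decision tree. The sketch below is a hypothetical example, not the thesis's method: the tuple-based tree encoding, the function names, and the deletion-based search are all assumptions made here. The sufficiency check is exact when each feature is tested at most once on any root-to-leaf path and conservative otherwise.

```python
# Hypothetical toy encoding (an assumption for this sketch):
#   ("leaf", label)  or  ("split", feature_index, threshold, left, right)
# An instance goes left when x[feature_index] <= threshold.

def predict(node, x):
    while node[0] == "split":
        _, feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
    return node[1]


def reachable_labels(node, x, fixed):
    """Labels reachable when only the features in `fixed` keep x's values."""
    if node[0] == "leaf":
        return {node[1]}
    _, feature, threshold, left, right = node
    if feature in fixed:
        branch = left if x[feature] <= threshold else right
        return reachable_labels(branch, x, fixed)
    # A free feature may take any value, so both branches stay possible.
    return reachable_labels(left, x, fixed) | reachable_labels(right, x, fixed)


def is_sufficient(tree, x, fixed):
    """True when fixing `fixed` alone forces the tree's prediction for x."""
    return reachable_labels(tree, x, fixed) == {predict(tree, x)}


def minimal_explanation(tree, x):
    """Deletion-based search for a subset-minimal sufficient feature set."""
    fixed = set(range(len(x)))
    for feature in sorted(fixed):
        if is_sufficient(tree, x, fixed - {feature}):
            fixed.discard(feature)
    return fixed


tree = ("split", 0, 0.5,
        ("leaf", "A"),
        ("split", 1, 0.5, ("leaf", "B"), ("leaf", "C")))
x = [0.0, 1.0, 7.0]
print(predict(tree, x))              # A
print(minimal_explanation(tree, x))  # {0}
```

Here feature 0 alone guarantees the prediction "A", so any instance sharing that feature value can be classified by a single rule without traversing the rest of the tree, which is the intuition behind compiling a model to rules.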
|