Global ETD Search

51	Detecting and Mitigating Rumors in Social Media Islam, Mohammad Raihanul 19 June 2020 (has links) The penetration of social media today enables the rapid spread of breaking news and other developments to millions of people across the globe within hours. However, such pervasive use of social media by the general masses to receive and consume news is not without its attendant negative consequences as it also opens opportunities for nefarious elements to spread rumors or misinformation. A rumor generally refers to an interesting piece of information that is widely disseminated through a social network and whose credibility cannot be easily substantiated. A rumor can later turn out to be true or false or remain unverified. The spread of misinformation and fake news can lead to deleterious effects on users and society. The objective of the proposed research is to develop a range of machine learning methods that will effectively detect and characterize rumor veracity in social media. Since users are the primary protagonists on social media, analyzing the characteristics of information spread w.r.t. users can be effective for our purpose. For our first problem, we propose a method of computing user embeddings from underlying social networks. For our second problem, we propose a long short-term memory (LSTM) based model that can classify whether a story discussed in a thread can be categorized as a false, true, or unverified rumor. We demonstrate the utility of user features computed from the first problem to address the second problem. For our third problem, we propose a method that uses user profile information to detect rumor veracity. This method has the advantage of not requiring the underlying social network, which can be tedious to compute. For the last problem, we investigate a rumor mitigation technique that recommends fact-checking URLs to rumor debunkers, i.e., social network users who are very passionate about disseminating true news. 
Here, we incorporate the influence of other users on rumor debunkers in addition to their previous URL sharing history to recommend relevant fact-checking URLs. / Doctor of Philosophy / A rumor is generally defined as an interesting piece of a story that cannot be authenticated easily. On social networks, a user can generally find an interesting piece of news or story and may share (retweet) it. A story that initially appears plausible can later turn out to be false or remain unverified. The propagation of false rumors on social networks has a detrimental effect on user experience. Therefore, rumor veracity detection is important and is drawing interest in social network research. In this thesis, we develop various machine learning models that detect rumor veracity. For this purpose, we exploit different types of information regarding users, such as profile details and connectivity with other users. Moreover, we propose a rumor mitigation technique that recommends fact-checking URLs to social network users who are passionate about debunking rumors. Here, we leverage techniques similar to those used by e-commerce sites to recommend products. Rumor Veracity Detection Rumor Mitigation Social Network Analysis Deep learning (Machine learning)
52	A Profit-Neutral Double-price-signal Retail Electricity Market Solution for Incentivizing Price-responsive DERs Considering Network Constraints Cai, Mengmeng 23 June 2020 (has links) Emerging technologies, including distributed energy resources (DERs), internet-of-things and advanced distribution management systems, are revolutionizing the power industry. They provide benefits like higher operation flexibility and lower bulk grid dependency, and are moving the modern power grid towards a decentralized, interconnected and intelligent direction. Consequently, the emphasis of the system operation management has been shifted from the supply-side to the demand-side. It calls for a reconsideration of the business model for future retail market operators. To address this need, this dissertation proposes an innovative retail market solution tailored to market environments penetrated with price-responsive DERs. The work is presented from aspects of theoretical study, test-bed platform development, and experimental analysis, within which two topics relevant to the retail market operation are investigated in depth. The first topic covers the modeling of key retail market participants. With regard to price-insensitive participants, fixed loads are treated as the representative. Deep learning-based day-ahead load forecasting models are developed in this study, utilizing both recurrent and convolutional neural networks, to predict the part of demands that keep fixed regardless of the market price. With regard to price-sensitive participants, battery storages are selected as the representative. An optimization-based battery arbitrage model is developed in this study to represent their price-responsive behaviors in response to a dynamic price. The second topic further investigates how the retail market model and pricing strategy should be designed to incentivize these market participants. 
Different from existing works, this study innovatively proposes a profit-neutral double-price-signal retail market model. Such a design differentiates elastic prosumers, who actively offer flexibilities to the system operation, from normal inelastic consumers/generators, based on their sensitivities to the market price. Two price signals, namely retail grid service price and retail energy price, are then introduced to separately quantify values of the flexibility, provided by elastic participants, and the electricity commodity, sold/bought to/from inelastic participants. Within the proposed retail market, a non-profit retail market operator (RMO) manages and settles the market through determining the price signals and supplementary subsidy to minimize the overall system cost. In response to the announced retail grid service price, elastic prosumers adjust their day-ahead operating schedules to maximize their payoffs. Given the interdependency between decisions made by the RMO and elastic participants, a retail pricing scheme, formulated based on a bi-level optimization framework, is proposed. Additional efforts are made on merging and linearizing the original non-convex bi-level problem into a single-level mixed-integer linear programming problem to ensure the computational efficiency of the retail pricing tool. Case studies are conducted on a modified IEEE 34-bus test-bed system, simulating both physical operations of the power grid and financial interactions inside the retail market. Experimental results demonstrate promising properties of the proposed retail market solution: First of all, it is able to provide cost-saving benefits to inelastic customers and create revenues for elastic customers at the same time, justifying the rationalities of these participants to join the market. 
Second, the addition of the grid service subsidy not only strengthens the profitability of the elastic customer, but also ensures that the benefit enjoyed per customer will not be compromised by the competition brought by a growing number of participants. Furthermore, it is able to properly capture impacts from line losses and voltage constraints on the system efficiency and stability, so as to derive practical pricing solutions that respect the system operating rules. Last but not least, it encourages the technology improvement of elastic assets, as elastic assets in better condition are more profitable and yield larger electricity bill savings for inelastic customers. Above all, the superiority of the proposed retail market solution is proven. It can serve as a promising start for the retail electricity market reconstruction. / Doctor of Philosophy / The electricity market plays a critical role in ensuring the economic and secure operation of the power system. The progress made by distributed energy resources (DERs) has reshaped the modern power industry, bringing a larger proportion of price-responsive behaviors to the demand-side. It challenges the traditional wholesale-only electricity market and calls for an addition of retail markets to better utilize distributed and elastic assets. Therefore, this dissertation aims to offer a reliable and computationally affordable retail market solution to bridge this knowledge gap. Different from existing works, this study assumes that the retail market is managed by a profit-neutral retail market operator (RMO), who oversees and facilitates the system operation for maximizing the system efficiency rather than making profits. Market participants are categorized into two groups: inelastic participants and elastic participants, based on their sensitivity to the market price. 
The motivation behind this design is that instead of treating elastic participants as normal customers, it is more reasonable to treat them as grid service providers who offer operational flexibilities that benefit the system efficiency. Correspondingly, a double-signal pricing scheme is proposed, such that the flexibility, provided by elastic participants, and the electricity commodity, generated/consumed by inelastic participants, are separately valued by two distinct prices, namely retail grid service price and retail energy price. A grid service subsidy is also introduced in the pricing system to provide supplementary incentives to elastic customers. These two price signals in addition to the subsidy are determined by the RMO via solving a bi-level optimization problem given the interdependency between the prices and reaction of elastic participants. Experimental results indicate that the proposed retail market model and pricing scheme are beneficial for both types of market participants, practical for the network-constrained real-world implementation, and supportive for the technology improvement of elastic assets. Retail Electricity Market Load Forecasting Battery Arbitrage Bi-level Optimization Deep learning (Machine learning)
53	A Deep Learning Approach to Predict Accident Occurrence Based on Traffic Dynamics Khaghani, Farnaz 05 1900 (has links) Traffic accidents are of concern for traffic safety; 1.25 million deaths are reported each year. Hence, it is crucial to have access to real-time data and rapidly detect or predict accidents. Predicting the occurrence of a highway car accident accurately any significant length of time into the future is not feasible since the vast majority of crashes occur due to unpredictable human negligence and/or error. However, rapid traffic incident detection could reduce incident-related congestion and secondary crashes, alleviate the waste of vehicles’ fuel and passengers’ time, and provide appropriate information for emergency response and field operation. While the focus of most previously proposed techniques is predicting the number of accidents in a certain region, the problem of predicting the accident occurrence or fast detection of the accident has been little studied. To address this gap, we propose a deep learning approach and build a deep neural network model based on long short term memory (LSTM). We apply it to forecast the expected speed values on freeways’ links and identify the anomalies as potential accident occurrences. Several detailed features such as weather, traffic speed, and traffic flow of upstream and downstream points are extracted from big datasets. We assess the proposed approach on a traffic dataset from Sacramento, California. The experimental results demonstrate the potential of the proposed approach in identifying the anomalies in speed value and matching them with accidents in the same area. We show that this approach can handle a high rate of rapid accident detection and be implemented in real-time travelers’ information or emergency management systems. / M.S. 
/ Rapid traffic accident detection/prediction is essential for scaling down non-recurrent congestion caused by traffic accidents, avoiding secondary accidents, and accelerating emergency system responses. In this study, we propose a framework that uses large-scale historical traffic speed and traffic flow data along with the relevant weather information to obtain robust traffic patterns. The predicted traffic patterns can be coupled with the real traffic data to detect anomalous behavior that often results in traffic incidents on the roadways. Our framework consists of two major steps. First, we estimate the speed values of traffic at each point based on the historical speed and flow values of locations before and after each point on the roadway. Second, we compare the estimated values with the actual ones and flag those that differ significantly as anomalies. The anomaly points are the potential points and times at which an accident occurs and causes a change in the normal behavior of the roadways. Our study shows the potential of the approach in detecting accidents, with promising performance in detecting the accident occurrence at a time close to the actual time of occurrence. Deep learning (Machine learning) LSTM Bi-directional LSTM Anomaly Detection Database management
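The two-step framework described in entry 53 (estimate expected speeds, then flag observations that deviate strongly from them) can be illustrated with a minimal Python sketch. It assumes the predicted speeds are already available from some forecasting model. The function name `flag_speed_anomalies`, the median-absolute-deviation rule, and the 3.5 threshold are illustrative assumptions and are not taken from the thesis.

```python
from statistics import median


def flag_speed_anomalies(predicted, observed, threshold=3.5):
    """Return indices whose residual (observed - predicted) is an outlier.

    Assumption: outliers are scored with a robust z-score based on the
    median absolute deviation (MAD); the thesis may use a different rule.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")

    residuals = [o - p for o, p in zip(observed, predicted)]
    center = median(residuals)
    mad = median(abs(r - center) for r in residuals)
    if mad == 0:
        # All residuals are (nearly) identical, so nothing stands out.
        return []

    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
    return [
        i for i, r in enumerate(residuals)
        if abs(r - center) / scale > threshold
    ]


# Hypothetical link speeds in mph: a sudden drop at index 5.
predicted = [60, 60, 60, 60, 60, 60, 60, 60]
observed = [61, 59, 60, 62, 58, 25, 60, 61]
anomalies = flag_speed_anomalies(predicted, observed)
```

A robust scale is used here so that a single large drop does not inflate the threshold that is supposed to detect it. Flagged indices would then be matched against accident records for the same link and time window.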
54	Modified Kernel Principal Component Analysis and Autoencoder Approaches to Unsupervised Anomaly Detection Merrill, Nicholas Swede 01 June 2020 (has links) Unsupervised anomaly detection is the task of identifying examples that differ from the normal or expected pattern without the use of labeled training data. Our research addresses shortcomings in two existing anomaly detection algorithms, Kernel Principal Component Analysis (KPCA) and Autoencoders (AE), and proposes novel solutions to improve the performance of both in the unsupervised setting. Anomaly detection has several useful applications, such as intrusion detection, fault monitoring, and vision processing. More specifically, anomaly detection can be used in autonomous driving to identify obscured signage or to monitor intersections. Kernel techniques are desirable because of their ability to model highly non-linear patterns, but they are limited in the unsupervised setting due to their sensitivity to parameter choices and the absence of a validation step. Additionally, conventional KPCA suffers from a quadratic time and memory complexity in the construction of the Gram matrix and a cubic time complexity in its eigendecomposition. The problem of tuning the Gaussian kernel parameter, $\sigma$, is solved using the mini-batch stochastic gradient descent (SGD) optimization of a loss function that maximizes the dispersion of the kernel matrix entries. Secondly, the computational time is greatly reduced, while still maintaining high accuracy, by using an ensemble of small, \textit{skeleton} models and combining their scores. The performance of traditional machine learning approaches to anomaly detection plateaus as the volume and complexity of data increase. Deep anomaly detection (DAD) involves the application of multilayer artificial neural networks to identify anomalous examples. AEs are fundamental to most DAD approaches. 
Conventional AEs rely on the assumption that a trained network will learn to reconstruct normal examples better than anomalous ones. In practice, however, given sufficient capacity and training time, an AE will generalize to reconstruct even very rare examples. Three methods are introduced to more reliably train AEs for unsupervised anomaly detection: Cumulative Error Scoring (CES) leverages the entire history of training errors to minimize the importance of early stopping, and Percentile Loss (PL) training aims to prevent anomalous examples from contributing to parameter updates. Lastly, early stopping via knee detection aims to limit the risk of overtraining. Ultimately, the two modified methods proposed in this research, Unsupervised Ensemble KPCA (UE-KPCA) and the modified training and scoring AE (MTS-AE), demonstrate improved detection performance and reliability compared to many baseline algorithms across a number of benchmark datasets. / Master of Science / Anomaly detection is the task of identifying examples that differ from the normal or expected pattern. The challenge of unsupervised anomaly detection is distinguishing normal and anomalous data without the use of labeled examples to demonstrate their differences. This thesis addresses shortcomings in two anomaly detection algorithms, Kernel Principal Component Analysis (KPCA) and Autoencoders (AE), and proposes new solutions to apply them in the unsupervised setting. Ultimately, the two modified methods, Unsupervised Ensemble KPCA (UE-KPCA) and the Modified Training and Scoring AE (MTS-AE), demonstrate improved detection performance and reliability compared to many baseline algorithms across a number of benchmark datasets. Machine learning Deep learning (Machine learning) Anomaly Detection Autoencoder Kernel Principal Component Analysis
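The two autoencoder ideas named in entry 54, Percentile Loss (PL) and Cumulative Error Scoring (CES), can be pictured with a small Python sketch. This is only one plausible reading of the abstract: it assumes PL averages the lowest-loss fraction of a batch and CES sums each example's reconstruction error over all epochs. The function names and the `keep_fraction` parameter are hypothetical, and the thesis's exact definitions may differ.

```python
def percentile_loss(per_example_losses, keep_fraction=0.9):
    """Average only the smallest losses in a batch.

    Assumption: examples with the largest reconstruction loss are the
    likeliest anomalies, so they are excluded from the training signal.
    """
    if not per_example_losses:
        raise ValueError("need at least one loss value")
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")

    ranked = sorted(per_example_losses)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    kept = ranked[:n_keep]
    return sum(kept) / len(kept)


def cumulative_error_scores(error_history):
    """Sum each example's reconstruction error across all epochs.

    error_history[e][i] is the error of example i at epoch e. A higher
    total is read as "more anomalous" under the assumption above.
    """
    return [sum(errors) for errors in zip(*error_history)]


# Hypothetical batch: the last example reconstructs very poorly.
batch_loss = percentile_loss([1.0, 2.0, 3.0, 100.0], keep_fraction=0.5)

# Hypothetical history: two epochs, two examples.
scores = cumulative_error_scores([[1.0, 4.0], [0.5, 3.0]])
```

Because the cumulative score depends on the whole training trajectory rather than on the final epoch alone, the choice of stopping point matters less, which matches the motivation the abstract gives for CES.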
55	Increasing Accessibility of Electronic Theses and Dissertations (ETDs) Through Chapter-level Classification Jude, Palakh Mignonne 07 July 2020 (has links) Great progress has been made to leverage the improvements made in natural language processing and machine learning to better mine data from journals, conference proceedings, and other digital library documents. However, these advances do not extend well to book-length documents such as electronic theses and dissertations (ETDs). ETDs contain extensive research data; stakeholders -- including researchers, librarians, students, and educators -- can benefit from increased access to this corpus. Challenges arise while working with this corpus owing to the varied nature of disciplines covered as well as the use of domain-specific language. Prior systems are not tuned to this corpus. This research aims to increase the accessibility of ETDs by the automatic classification of chapters of an ETD using machine learning and deep learning techniques. This work utilizes an ETD-centric target classification system. It demonstrates the use of custom trained word and document embeddings to generate better vector representations of this corpus. It also describes a methodology to leverage extractive summaries of chapters of an ETD to aid in the classification process. Our findings indicate that custom embeddings and the use of summarization techniques can increase the performance of the classifiers. The chapter-level labels generated by this research help to identify the level of interdisciplinarity in the corpus. The automatic classifiers can also be further used in a search engine interface that would help users to find the most appropriate chapters. / Master of Science / Electronic Theses and Dissertations (ETDs) are submitted by students at the end of their academic study. These works contain research information pertinent to a given field. 
Increasing the accessibility of such documents will be beneficial to many stakeholders including students, researchers, librarians, and educators. In recent years, a great deal of research has been conducted to better extract information from textual documents with the use of machine learning and natural language processing. However, these advances have not been applied to increase the accessibility of ETDs. This research aims to perform the automatic classification of chapters extracted from ETDs. That will reduce the human effort required to label the key parts of these book-length documents. Additionally, when considered by search engines, such categorization can aid users to more easily find the chapters that are most relevant to their research. Electronic Theses and Dissertations Classification Machine learning Deep learning (Machine learning) Natural Language Processing
56	Land Cover Quantification using Autoencoder based Unsupervised Deep Learning Manjunatha Bharadwaj, Sandhya 27 August 2020 (has links) This work aims to develop a deep learning model for land cover quantification through hyperspectral unmixing using an unsupervised autoencoder. Land cover identification and classification is instrumental in urban planning, environmental monitoring and land management. With the technological advancements in remote sensing, hyperspectral imagery which captures high resolution images of the earth's surface across hundreds of wavelength bands, is becoming increasingly popular. The high spectral information in these images can be analyzed to identify the various target materials present in the image scene based on their unique reflectance patterns. An autoencoder is a deep learning model that can perform spectral unmixing by decomposing the complex image spectra into its constituent materials and estimating their abundance compositions. The advantage of using this technique for land cover quantification is that it is completely unsupervised and eliminates the need for labelled data which generally requires years of field survey and formulation of detailed maps. We evaluate the performance of the autoencoder on various synthetic and real hyperspectral images consisting of different land covers using similarity metrics and abundance maps. The scalability of the technique with respect to landscapes is assessed by evaluating its performance on hyperspectral images spanning across 100m x 100m, 200m x 200m, 1000m x 1000m, 4000m x 4000m and 5000m x 5000m regions. Finally, we analyze the performance of this technique by comparing it to several supervised learning methods like Support Vector Machine (SVM), Random Forest (RF) and multilayer perceptron using F1-score, Precision and Recall metrics and other unsupervised techniques like K-Means, N-Findr, and VCA using cosine similarity, mean square error and estimated abundances. 
The land cover classification obtained using this technique is compared to the existing United States National Land Cover Database (NLCD) classification standard. / Master of Science / This work aims to develop an automated deep learning model for identifying and estimating the composition of the different land covers in a region using hyperspectral remote sensing imagery. With the technological advancements in remote sensing, hyperspectral imagery which captures high resolution images of the earth's surface across hundreds of wavelength bands, is becoming increasingly popular. As every surface has a unique reflectance pattern, the high spectral information contained in these images can be analyzed to identify the various target materials present in the image scene. An autoencoder is a deep learning model that can perform spectral unmixing by decomposing the complex image spectra into its constituent materials and estimate their percent compositions. The advantage of this method in land cover quantification is that it is an unsupervised technique which does not require labelled data which generally requires years of field survey and formulation of detailed maps. The performance of this technique is evaluated on various synthetic and real hyperspectral datasets consisting of different land covers. We assess the scalability of the model by evaluating its performance on images of different sizes spanning over a few hundred square meters to thousands of square meters. Finally, we compare the performance of the autoencoder based approach with other supervised and unsupervised deep learning techniques and with the current land cover classification standard. Deep learning (Machine learning) Autoencoder Land Cover Hyperspectral Imagery Spectral Unmixing Reflectance Spectra
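Entry 56 compares estimated and reference spectra using cosine similarity and mean square error. A minimal Python sketch of those two metrics follows. The function names and the sample reflectance values are illustrative assumptions, and the thesis may compute the metrics per band, per pixel, or per endmember in a different way.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two spectra (1.0 means same shape)."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero spectrum")
    return dot / (norm_a * norm_b)


def mean_square_error(a, b):
    """Average squared difference between two spectra, band by band."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bands")
    if not a:
        raise ValueError("spectra must contain at least one band")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical reflectance values over four bands.
reference = [0.10, 0.20, 0.40, 0.30]
estimated = [0.20, 0.40, 0.80, 0.60]  # same shape, twice the brightness

similarity = cosine_similarity(reference, estimated)
error = mean_square_error(reference, estimated)
```

The example shows why the two metrics are usually reported together: cosine similarity ignores overall brightness and compares only spectral shape, while mean square error also penalizes differences in magnitude.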
57	Representational Capabilities of Feed-forward and Sequential Neural Architectures Sanford, Clayton Hendrick January 2024 (has links) Despite the widespread empirical success of deep neural networks over the past decade, a comprehensive understanding of their mathematical properties remains elusive, which limits the abilities of practitioners to train neural networks in a principled manner. This dissertation provides a representational characterization of a variety of neural network architectures, including fully-connected feed-forward networks and sequential models like transformers. The representational capabilities of neural networks are most famously characterized by the universal approximation theorem, which states that sufficiently large neural networks can closely approximate any well-behaved target function. However, the universal approximation theorem applies exclusively to two-layer neural networks of unbounded size and fails to capture the comparative strengths and weaknesses of different architectures. The thesis addresses these limitations by quantifying the representational consequences of random features, weight regularization, and model depth on feed-forward architectures. It further investigates and contrasts the expressive powers of transformers and other sequential neural architectures. Taken together, these results apply a wide range of theoretical tools—including approximation theory, discrete dynamical systems, and communication complexity—to prove rigorous separations between different neural architectures and scaling regimes. Computer science Neural networks (Computer science) Deep learning (Machine learning) Computer networks--Scalability
58	Going Deeper with Images and Natural Language Ma, Yufeng 29 March 2019 (has links) One aim in the area of artificial intelligence (AI) is to develop a smart agent with high intelligence that is able to perceive and understand the complex visual environment around us. More ambitiously, it should be able to interact with us about its surroundings in natural languages. Thanks to the progress made in deep learning, we've seen huge breakthroughs towards this goal over the last few years. The developments have been extremely rapid in visual recognition, in which machines now can categorize images into multiple classes, and detect various objects within an image, with an ability that is competitive with or even surpasses that of humans. Meanwhile, we also have witnessed similar strides in natural language processing (NLP). It is quite often for us to see that now computers are able to almost perfectly do text classification, machine translation, etc. However, despite much inspiring progress, most of the achievements made are still within one domain, not handling inter-domain situations. The interaction between the visual and textual areas is still quite limited, although there has been progress in image captioning, visual question answering, etc. In this dissertation, we design models and algorithms that enable us to build in-depth connections between images and natural languages, which help us to better understand their inner structures. In particular, first we study how to make machines generate image descriptions that are indistinguishable from ones expressed by humans, which as a result also achieved better quantitative evaluation performance. Second, we devise a novel algorithm for measuring review congruence, which takes an image and review text as input and quantifies the relevance of each sentence to the image. The whole model is trained without any supervised ground truth labels. 
Finally, we propose a brand new AI task called Image Aspect Mining, to detect visual aspects in images and identify aspect level rating within the review context. On the theoretical side, this research contributes to multiple research areas in Computer Vision (CV), Natural Language Processing (NLP), interactions between CV and NLP, and Deep Learning. Regarding impact, these techniques will benefit related users such as the visually impaired, customers reading reviews, merchants, and AI researchers in general. / Doctor of Philosophy / One aim in the area of artificial intelligence (AI) is to develop a smart agent with high intelligence that is able to perceive and understand the complex visual environment around us. More ambitiously, it should be able to interact with us about its surroundings in natural languages. Thanks to the progress made in deep learning, we’ve seen huge breakthroughs towards this goal over the last few years. The developments have been extremely rapid in visual recognition, in which machines now can categorize images into multiple classes, and detect various objects within an image, with an ability that is competitive with or even surpasses that of humans. Meanwhile, we also have witnessed similar strides in natural language processing (NLP). It is quite often for us to see that now computers are able to almost perfectly do text classification, machine translation, etc. However, despite much inspiring progress, most of the achievements made are still within one domain, not handling inter-domain situations. The interaction between the visual and textual areas is still quite limited, although there has been progress in image captioning, visual question answering, etc. In this dissertation, we design models and algorithms that enable us to build in-depth connections between images and natural languages, which help us to better understand their inner structures. 
In particular, first we study how to make machines generate image descriptions that are indistinguishable from ones expressed by humans, which as a result also achieved better quantitative evaluation performance. Second, we devise a novel algorithm for measuring review congruence, which takes an image and review text as input and quantifies the relevance of each sentence to the image. The whole model is trained without any supervised ground truth labels. Finally, we propose a brand new AI task called Image Aspect Mining, to detect visual aspects in images and identify aspect level rating within the review context. On the theoretical side, this research contributes to multiple research areas in Computer Vision (CV), Natural Language Processing (NLP), interactions between CV&NLP, and Deep Learning. Regarding impact, these techniques will benefit related users such as the visually impaired, customers reading reviews, merchants, and AI researchers in general. Image Captioning Quasi-Supervised Learning Image Aspect Mining GANs Deep learning (Machine learning)
59	Generating Canonical Sentences from Question-Answer Pairs of Deposition Transcripts Mehrotra, Maanav 15 September 2020 (has links) In the legal domain, documents of various types are created in connection with a particular case, such as testimony of people, transcripts, depositions, memos, and emails. Deposition transcripts are one such type of legal document, which consists of conversations between the different parties in the legal proceedings that are recorded by a court reporter. Court reporting has been traced back to 63 B.C. It has transformed from the initial scripts of "Cuneiform", "Running Script", and "Grass Script" to Certified Access Real-time Translation (CART). Since the boom of digitization, there has been a shift to storing these in the PDF/A format. Deposition transcripts are in the form of question-answer (QA) pairs and can be quite lengthy for common people to read. This motivates the development of an automatic text-summarization method for such transcripts. Present-day summarization systems do not support this form of text, so the transcripts must be processed first. This creates a need to parse such documents and extract QA pairs as well as any relevant supporting information. These QA pairs can then be converted into complete canonical sentences, i.e., in a declarative form, from which we could extract some insights and which could be used for further downstream tasks. This work investigates this conversion, including the use of deep-learning techniques for such transformations. / Master of Science / In the legal domain, documents of various types are created in connection with a particular case, such as the testimony of people, transcripts, memos, and emails. Deposition transcripts are one such type of legal document, which consists of conversations between a lawyer and one of the parties in the legal proceedings, captured by a court reporter. Since the boom of digitization, there has been a shift to storing these in the PDF/A format. 
Deposition transcripts are in the form of question-answer (QA) pairs and can be quite lengthy. Though automatic summarization could help, present-day systems do not work well with such texts. This creates a need to parse these documents and extract QA pairs as well as any relevant supporting information. The QA pairs can then be converted into canonical sentences, i.e., in a declarative form, from which we can extract insights and support downstream tasks. This work describes these conversions, as well as the use of deep-learning techniques for such transformations. Natural Language Processing Deep learning (Machine learning) Legal Tech Legal Depositions
60	Application of Machine Learning to Multi Antenna Transmission and Machine Type Resource Allocation Emenonye, Don-Roberts Ugochukwu 11 September 2020 (has links) Wireless communication systems are a well-researched area in electrical engineering that has continually evolved over the past decades. This constant evolution and development have led to well-formulated theoretical baselines in terms of reliability and efficiency. However, most communication baselines are derived by splitting the baseband communications into a series of modular blocks like modulation, coding, channel estimation, and orthogonal frequency modulation. Subsequently, these blocks are independently optimized. Although this has led to a very efficient and reliable process, a theoretical verification of the optimality of this design process is not feasible due to the complexities of each individual block. In this work, we propose two modifications to these conventional wireless systems. First, with the goal of designing better space-time block codes for improved reliability, we propose to redesign the transmit and receive blocks of the physical layer. We replace a portion of the transmit chain, from modulation to antenna mapping, with a neural network. Similarly, the receiver/decoder is also replaced with a neural network. In other words, the first part of this work focuses on jointly optimizing the transmit and receive blocks to produce a set of space-time codes that are resilient to Rayleigh fading channels. We compare our results to the conventional orthogonal space-time block codes for multiple antenna configurations. The second part of this work investigates the possibility of designing a distributed multiagent reinforcement learning-based multi-access algorithm for machine type communication (MTC). 
This work recognizes that cellular networks are being proposed as a solution for the connectivity of machine type devices (MTDs) and that one of the most crucial aspects of scheduling in cellular connectivity is the random access procedure. The random access process is used by conventional cellular users to receive an allocation for uplink transmissions. This process usually requires six resource blocks. Because transmission of cellular data usually requires more than six resource blocks, it is relatively efficient for cellular users to perform the random access process in order to establish a connection. Moreover, as long as cellular users maintain synchronization, they do not have to undertake the random access process every time they have data to transmit. They can maintain a connection with the base station (BS) through discontinuous reception. On the other hand, the random access process is unsuitable for MTDs because MTDs usually have small-sized packets. Hence, performing the random access process to transmit such small-sized packets is highly inefficient. Also, most MTDs are power constrained, so they turn off when they have no data to transmit. This means that they lose their connection and cannot maintain any form of discontinuous reception. Hence, they perform the random access process each time they have data to transmit. Given these observations, explicit scheduling is undesirable for MTC. To overcome these challenges, we propose bypassing the entire scheduling process by using a grant-free resource allocation scheme. In this scheme, MTDs pseudo-randomly transmit their data in random access slots. Note that this results in the possibility of a large number of collisions during the random access slots. To alleviate the resulting congestion, we exploit a heterogeneous network and investigate the optimal MTD-BS association which minimizes the long-term congestion experienced in the overall cellular network. 
Our results show that we can derive the optimal MTD-BS association when the number of MTDs is less than the total number of random access slots. / Master of Science / Wireless communication systems are a well-researched area of engineering that has continually evolved over the past decades. This constant evolution and development has led to well-formulated theoretical baselines in terms of reliability and efficiency. This two-part thesis investigates the possibility of improving these wireless systems with machine learning. First, with the goal of designing more resilient codes for transmission, we propose to redesign the transmit and receive blocks of the physical layer. We focus on jointly optimizing the transmit and receive blocks to produce a set of transmit codes that are resilient to channel impairments. We compare our results to the current conventional codes for various transmit and receive antenna configurations. The second part of this work investigates the possibility of designing a distributed multi-access scheme for machine type devices (MTDs). In this scheme, MTDs pseudo-randomly transmit their data by randomly selecting time slots. This results in the possibility of a large number of collisions occurring during these slots. To alleviate the resulting congestion, we employ a heterogeneous network and investigate the optimal association between MTDs and base stations (BSs), which minimizes the long-term congestion experienced in the overall network. Our results show that we can derive the optimal MTD-BS association when the number of MTDs is less than the total number of slots. Machine Type Communication Space Time Block Coding Deep learning (Machine learning) Reinforcement Learning
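As a rough illustration of why collisions matter in the grant-free scheme summarized in entry 60, the sketch below assumes (this is not stated in the abstract) that each of N MTDs independently picks one of K random access slots uniformly at random. Under that assumption, a given device avoids a collision with probability (1 - 1/K)^(N-1). The function names and parameter values are hypothetical and are not taken from the thesis; the code only compares the closed form with a small simulation.

```python
import random


def collision_free_probability(num_devices: int, num_slots: int) -> float:
    """Closed form under the uniform-choice assumption: probability that
    none of the other devices picks the slot chosen by a given device."""
    return (1.0 - 1.0 / num_slots) ** (num_devices - 1)


def simulate_collision_free_fraction(
    num_devices: int, num_slots: int, trials: int = 20000, seed: int = 0
) -> float:
    """Monte Carlo estimate of the fraction of transmissions that land
    in a slot used by exactly one device."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        # Each device picks one slot uniformly at random (assumption).
        choices = [rng.randrange(num_slots) for _ in range(num_devices)]
        counts = [0] * num_slots
        for slot in choices:
            counts[slot] += 1
        # A transmission succeeds only if its slot was chosen exactly once.
        successes += sum(1 for slot in choices if counts[slot] == 1)
    return successes / (trials * num_devices)


if __name__ == "__main__":
    # Hypothetical sizes: 10 devices contending for 20 slots.
    print(collision_free_probability(10, 20))
    print(simulate_collision_free_fraction(10, 20))
```

Even with fewer devices than slots, a noticeable share of transmissions collide under this simple model, which is consistent with the abstract's motivation for spreading MTDs across base stations.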
