  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Relationship Between Mean, Median, Mode with Unimodal Grouped Data

Zheng, Shimin, Mogusu, Eunice, Veeranki, Sreenivas P., Quinn, Megan 03 November 2015
Background: It is widely believed that the median of a unimodal distribution is "usually" between the mean and the mode for right-skewed or left-skewed distributions. However, this is not always true, especially with grouped data. For some research, analyses must be conducted on grouped data because complete raw data are not always available. A gap exists in the body of research on the mean-median-mode inequality for grouped data. Methods: For grouped data, the median is $Me = L + \frac{n/2 - F}{f_m} \times d$ and the mode is $Mo = L + \frac{D_1}{D_1 + D_2} \times d$, where $L$ is the lower boundary of the median/modal group, $n$ is the total frequency, $F$ and $G$ are the cumulative frequencies of the groups before and after the median/modal group respectively, $D_1 = f_m - f_{m-1}$ and $D_2 = f_m - f_{m+1}$, $f_m$ is the frequency of the median/modal group, and $f_{m-1}$ and $f_{m+1}$ are the premodal and postmodal group frequencies respectively. We assume that there are $k$ groups with $k$ odd, that the group width $d$ is the same for each group, and that the mode and median both lie within the $((k+1)/2)$th group. Necessary and sufficient conditions are derived for each of the six arrangements of mean, median and mode. Results: Table available at https://apha.confex.com/apha/143am/webprogram/Paper326538.html Conclusion: For grouped data, the mean, median and mode can occur in any of the six possible orders.
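As a rough illustration of the formulas in the Methods above, the following Python sketch computes the grouped mean, median and mode for equal-width groups. The function name `grouped_stats`, the example frequencies, and the use of group midpoints for the mean are assumptions made for this example; they are not taken from the paper.

```python
def grouped_stats(freqs, first_lower, width):
    """Return (mean, median, mode) for equal-width grouped data.

    Assumes an odd number of groups, with the median and the mode both
    falling in the middle group, as in the abstract's setup.
    """
    k = len(freqs)
    m = (k + 1) // 2 - 1            # zero-based index of the middle group
    n = sum(freqs)                  # total frequency
    lower = first_lower + m * width # L: lower boundary of the middle group
    cum_before = sum(freqs[:m])     # F: cumulative frequency before that group
    f_m = freqs[m]

    # Assumption: each group is represented by its midpoint when averaging.
    mean = sum(f * (first_lower + (i + 0.5) * width)
               for i, f in enumerate(freqs)) / n
    median = lower + ((n / 2 - cum_before) / f_m) * width
    d1 = f_m - freqs[m - 1]         # D1: modal minus premodal frequency
    d2 = f_m - freqs[m + 1]         # D2: modal minus postmodal frequency
    mode = lower + (d1 / (d1 + d2)) * width
    return mean, median, mode


# Hypothetical frequencies for five groups of width 10 starting at 0.
mean, median, mode = grouped_stats([2, 5, 10, 6, 2], first_lower=0, width=10)
print(mean, median, mode)
```

For these made-up frequencies the mean is below the median, which is below the mode. Other frequency patterns give other orderings, which is the point the abstract makes.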
12

Grouped variable selection in high dimensional partially linear additive Cox model

Liu, Li 01 December 2010
In the analysis of survival outcomes supplemented with both clinical information and high-dimensional gene expression data, the traditional Cox proportional hazards model fails to meet some emerging needs in biological research. First, the number of covariates is generally much larger than the sample size. Second, predicting an outcome with individual gene expressions is inadequate because a gene's expression is regulated by multiple biological processes and functional units. There is a need to understand the impact of changes at a higher level such as molecular function, cellular component, biological process, or pathway. The change at a higher level is usually measured with a set of gene expressions related to the biological process. That is, we need to model the outcome with gene sets as variable groups, and the gene sets may partially overlap. In this thesis, we investigate the impact of a penalized Cox regression procedure on regularization, parameter estimation, variable group selection, and nonparametric modeling of nonlinear effects with a time-to-event outcome. We formulate the problem as a partially linear additive Cox model with high-dimensional data. We group genes into gene sets and approximate the nonparametric components by truncated series expansions with B-spline bases. After grouping and approximation, the problem of variable selection becomes that of selecting groups of coefficients in a gene set or in an approximation. We apply the group Lasso to obtain an initial solution path and reduce the dimension of the problem, and then update the whole solution path with the adaptive group Lasso. We also propose a generalized group Lasso method to provide more freedom in specifying the penalty and in excluding covariates from being penalized. A modified Newton-Raphson method is designed for stable and rapid computation. The core programs are written in the C language.
A user-friendly R interface is implemented to perform all the calculations by calling the core programs. We demonstrate the asymptotic properties of the proposed methods. Simulation studies are carried out to evaluate the finite-sample performance of the proposed procedure using several tuning parameter selection methods for choosing the point on the solution path as the final estimator. We also apply the proposed approach to two real data examples.
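To make the idea of selecting whole groups of coefficients concrete, here is a minimal Python sketch of the standard group Lasso penalty and the group soft-thresholding step that sets an entire group to zero. It is not the thesis's C/R implementation. The function names, the square-root-of-group-size weights, and the restriction to non-overlapping groups are assumptions for illustration.

```python
import math


def group_lasso_penalty(beta, groups, lam):
    """lam * sum over groups of sqrt(group size) * Euclidean norm of the group.

    `groups` is a list of index lists; non-overlapping groups are assumed.
    """
    total = 0.0
    for idx in groups:
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        total += math.sqrt(len(idx)) * norm
    return lam * total


def group_soft_threshold(vec, t):
    """Shrink a coefficient group toward zero; zero it out if its norm <= t."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= t:
        return [0.0] * len(vec)   # the whole group is dropped together
    scale = 1.0 - t / norm
    return [scale * v for v in vec]


# Hypothetical coefficients: one strong group and one weak group.
print(group_soft_threshold([3.0, 4.0], 2.5))   # shrunk but kept
print(group_soft_threshold([0.3, 0.4], 1.0))   # removed as a group
print(group_lasso_penalty([3.0, 4.0, 0.0, 0.0, 0.0], [[0, 1], [2, 3, 4]], 1.0))
```

Because the threshold acts on the norm of the whole group, coefficients enter or leave the model together, which is what makes the penalty suitable for gene sets or spline basis coefficients.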
13

The Asymptotic Loss of Information for Grouped Data

Felsenstein, Klaus, Pötzelberger, Klaus January 1995
We study the loss of information (measured in terms of the Kullback-Leibler distance) caused by observing "grouped" data (observing only a discretized version of a continuous random variable). We analyse the asymptotic behaviour of the loss of information as the partition becomes finer. In the case of a univariate observation, we compute the optimal rate of convergence and characterize asymptotically optimal partitions (into intervals). In the multivariate case we derive the asymptotically optimal regular sequences of partitions. Furthermore, we compute the asymptotically optimal transformation of the data when a sequence of partitions is given. Examples demonstrate the efficiency of the suggested discretizing strategy even for few intervals. (author's abstract) / Series: Forschungsberichte / Institut für Statistik
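The notion of information lost by grouping can be illustrated with a toy discrete analogue. The paper itself treats continuous variables and asymptotics, which this sketch does not attempt. The distributions `p` and `q`, the partition, and the function names below are all assumptions chosen for the example.

```python
import math


def kl(p, q):
    """Kullback-Leibler distance sum(p_i * log(p_i / q_i)), natural log."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def group(dist, partition):
    """Merge cells of a discrete distribution according to a partition."""
    return [sum(dist[i] for i in cell) for cell in partition]


# Hypothetical fine-grained distributions over four cells.
p = [0.1, 0.4, 0.3, 0.2]
q = [0.25, 0.25, 0.25, 0.25]
partition = [[0, 1], [2, 3]]   # observe only which half the value falls in

fine = kl(p, q)
coarse = kl(group(p, partition), group(q, partition))
loss = fine - coarse           # information lost by grouping; never negative
print(fine, coarse, loss)
```

In this example the two grouped distributions coincide, so grouping destroys all of the information that distinguished `p` from `q`. A finer or better-placed partition would retain more of it.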
14

On Learning from Collective Data

Xiong, Liang 01 December 2013
In many machine learning problems and application domains, the data are naturally organized by groups. For example, a video sequence is a group of images, an image is a group of patches, a document is a group of paragraphs/words, and a community is a group of people. We call such data collective data. In this thesis, we study how and what we can learn from collective data. Usually, machine learning focuses on individual objects, each of which is described by a feature vector and studied as a point in some metric space. When approaching collective data, researchers often reduce the groups into vectors to which traditional methods can be applied. We, on the other hand, try to develop machine learning methods that respect the collective nature of data and learn from them directly. Several different approaches were taken to address this learning problem. When a group consists of unordered discrete data points, it can naturally be characterized by its sufficient statistic, the histogram. For this case we develop efficient methods to address the outliers and temporal effects in the data based on matrix and tensor factorization methods. To learn from groups that contain multi-dimensional real-valued vectors, we develop both generative methods based on hierarchical probabilistic models and discriminative methods using group kernels based on new divergence estimators. With these tools, we can accomplish various tasks such as classification, regression, clustering, anomaly detection, and dimensionality reduction on collective data. We further consider the practical side of the divergence-based algorithms. To reduce their time and space requirements, we evaluate and find methods that can effectively reduce the size of the groups with little impact on the accuracy. We also propose the conditional divergence, along with an efficient estimator, to correct the sampling biases that might be present in the data.
Finally, we develop methods to learn in cases where some divergences are missing, caused by either insufficient computational resources or extreme sampling biases. In addition to designing new learning methods, we will use them to help the scientific discovery process. In our collaboration with astronomers and physicists, we see that the new techniques can indeed help scientists make the best of data.
15

How does monetary policy affect income inequality in Japan? Evidence from grouped data

Feldkircher, Martin, Kakamu, Kazuhiko January 2018
We examine the effects of monetary policy on income inequality in Japan using a novel econometric approach that jointly estimates the Gini coefficient based on micro-level grouped data of households and the dynamics of macroeconomic quantities. Our results indicate different effects on income inequality for different types of households: A monetary tightening increases inequality when income data are based on households whose head is employed (workers' households), while the effect reverses over the medium term when considering a broader definition of households. Differences in the relative strength of the transmission channels can account for this finding. Finally, we demonstrate that the proposed joint estimation strategy leads to more informative inference, while the frequently used two-step estimation approach yields inconclusive results. / Series: Working Papers in Regional Science
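For readers unfamiliar with computing a Gini coefficient from grouped data, the sketch below uses the simple trapezoidal approximation to the Lorenz curve. This is a common textbook shortcut and is not the joint estimation approach proposed in the paper. The function name and the quintile income shares are made up for illustration.

```python
def gini_from_groups(pop_shares, income_shares):
    """Approximate Gini from grouped data via a piecewise-linear Lorenz curve.

    Groups must be ordered from poorest to richest, and each list of
    shares must sum to 1. Income is treated as equal within each group,
    so within-group inequality is ignored.
    """
    lorenz_prev = 0.0
    area_twice = 0.0
    for p, s in zip(pop_shares, income_shares):
        lorenz_next = lorenz_prev + s
        # Twice the trapezoid area under the Lorenz curve for this group.
        area_twice += p * (lorenz_prev + lorenz_next)
        lorenz_prev = lorenz_next
    return 1.0 - area_twice


# Hypothetical income quintiles: each holds 20% of households.
quintiles = [0.2] * 5
shares = [0.05, 0.10, 0.15, 0.25, 0.45]
print(gini_from_groups(quintiles, shares))
```

The hypothetical shares above give a Gini of roughly 0.38, and perfectly equal shares give 0. This approximation is the kind of first step a two-step procedure might rely on, before any macroeconomic modelling.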
16

Lateral Resistance of Grouped Piles Near 20-ft Tall MSE Abutment Wall with Strip Reinforcements

Farnsworth, Zachary Paul 10 August 2020
A team from Brigham Young University and I performed full-scale lateral load tests on individual and grouped 12.75x0.375 inch pipe piles spaced at varying distances behind an MSE wall. The individually loaded pile, which acted as a control, was spaced at 4.0 pile diameters from the wall face, and the three grouped piles, which were loaded in unison, were spaced at 3.0, 2.8, and 1.8 pile diameters from the wall face and transversely spaced at 4.7 pile diameters center-to-center. The purpose of these tests was to determine the extent of group effects on lateral pile resistance, induced loads in soil reinforcements, and MSE wall panel deflections compared to those previously observed in individually laterally loaded piles behind MSE walls. The computer model LPILE was used in my analysis of the measured test data. The p-multipliers back-calculated with LPILE were 0.25, 0.60, and 0.25 for the grouped piles spaced at 3.0, 2.8, and 1.8 pile diameters from the wall, respectively. These values are lower than those predicted for piles at the same pile-to-wall spacings using the most recent equation for computing p-multipliers. I propose the use of an additional p-multiplier for grouped piles near an MSE wall, a group-effect p-multiplier, to account for discrepancies between individual and grouped pile behaviors. The group-effect p-multipliers were 0.35, 0.91, and 0.74 for the grouped piles spaced at 3.0, 2.8, and 1.8 pile diameters from the wall, respectively. The average group-effect p-multiplier was 0.66. Additionally, I used LPILE to analyze test data from Pierson et al. (2009), who had previously performed full-scale lateral load tests of individual and grouped shafts. In that analysis, the group of three 3-foot diameter concrete shafts spaced at 2.0 shaft diameters from the wall face and transversely spaced at 5.0 shaft diameters center-to-center had an average group-effect p-multiplier of 0.78.
As in previous studies, the induced forces in soil reinforcements in this study were greatest either near the locations of the test piles or at the MSE wall face. The most recent equation for calculating the maximum induced force in a soil reinforcement strip was reasonably effective in predicting the measured maximum loads when superimposed between the test piles, with 65% and 85% of the data points falling within the one and two standard deviation boundaries, respectively, of the original data used to develop the equation. Deflection of the MSE wall panels was greater during the grouped pile test than was previously observed for individually loaded piles under similar pile head deflections--with a maximum wall deflection of 0.31 inch compared to the previous average of 0.10 inch for pile head deflections of about 1.25 inches.
17

Product Layout Optimization for Autonomous Warehouses with Grouped Products

Nilsson, Max, Olsson, Hampus January 2020
To utilize space better, warehouses stack their products on top of each other. This increases the risk of injury for workers when storing and retrieving the products. Some warehouses counteract this by using robots to retrieve products to a picking area where a human worker picks the products needed to fulfill an order. This means that it is important for the robots to be effective when retrieving products to reduce the time the worker spends waiting in the picking area. This thesis focuses on the grouping of products in the containers when they are stored in the warehouse. The robots will then retrieve one container at a time and if the grouping of products is done correctly this should decrease the number of retrievals required to fulfill an order. In order to make the decision on which products to group together, an application was developed that data mined previous orders that the warehouse had received in an attempt to extract information about the products. With the help of this information the application then suggests different product layouts that focus on different goals when they are created. The different layouts are then compared against each other in order to determine which layout technique produces the best results. This algorithm has been named the PLO-algorithm. The results showed that when a product is placed with the PLO-algorithm, the most important aspect to consider is the relations it has with the other products it is grouped with. The results also showed that data mining orders that are too old can have a negative impact on the result if not handled correctly. The results also showed that when constructing the warehouse you should try to avoid restrictions that affect which products can be placed together as much as possible since these restrictions can impact the effectiveness of the warehouse in a negative way. 
The thesis concludes that there is a clear gain in effectiveness for warehouses that have a planned layout for their products. It is recommended to data mine previous orders to extract relations between the products if possible, since this piece of information showed the best results in this thesis. It is also in the warehouse's best interest to avoid as many restrictions as possible that affect which products can be placed together, since these can impact the results in a negative way. It is also beneficial not to include data that is too old in the data mining, since this can impact the results in a negative way if not handled correctly. / To make better use of their space, warehouses stack their products vertically. This raises the risk of personal injury when products are retrieved and stored, and some warehouses address this by using robots that retrieve and store the products in the warehouse. The robots move the products to and from a picking zone, where a human worker picks the products needed for an order. It is therefore important that the robots retrieve products efficiently, to reduce the waiting time for workers in the picking zone. In an attempt to make the robots more efficient, this thesis focuses on the grouping of products in the containers. The decision about which products to group together in the same container matters because storing the right products together reduces the number of retrievals and returns needed to fulfill an order. To support this decision, an application was built that analyzed previous orders received by the warehouse in an attempt to extract information about the products. The application then generates different product placement proposals, each focusing on a different goal, to examine which goal is most important when a product is placed.
The algorithm in this application has been named the PLO-algorithm. The results showed that when a product is placed with the PLO-algorithm, it is important to group it with products to which it has strong relations. The results also showed that data that is too old should not be analyzed, since older relations between products that no longer hold can harm the result unless the algorithm handles them in some way. The results also showed that, when the warehouse is designed, restrictions that limit how products can be placed should be avoided where possible, since they can reduce the warehouse's efficiency. The conclusion is that a warehouse can gain a great deal from having a plan when deciding how its products are placed. If previous orders can be analyzed for relations between products, this is recommended, since it gave the best results in this study. It is also to the warehouse's advantage to avoid restrictions on its storage system when it is built, since this allows more combinations when products are grouped. Finally, the thesis shows that, with the data used, it was beneficial not to analyze data that is too old, since this gives worse results.
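The abstract does not describe how the PLO-algorithm extracts relations between products. One simple way to mine such relations from past orders is to count how often each pair of products appears in the same order, sketched below in Python. The function name `pair_counts` and the sample orders are hypothetical, and this is not the PLO-algorithm itself.

```python
from collections import Counter
from itertools import combinations


def pair_counts(orders):
    """Count how many orders contain each unordered pair of products."""
    counts = Counter()
    for order in orders:
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(order)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical order history.
orders = [
    ["bolt", "nut", "washer"],
    ["bolt", "nut"],
    ["nut", "washer"],
    ["paint"],
]
counts = pair_counts(orders)
print(counts.most_common(2))
```

Pairs with high counts are candidates for sharing a container. Restricting `orders` to a recent time window is one way to act on the finding that outdated orders can hurt the result.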
18

The Relationship Between the Mean, Median, and Mode with Grouped Data

Zheng, Shimin, Mogusu, Eunice, Veeranki, Sreenivas P., Quinn, Megan, Cao, Yan 03 May 2016
It is widely believed that the median is “usually” between the mean and the mode for skewed unimodal distributions. However, this inequality does not always hold, especially with grouped data. Because complete raw data are often unavailable, it is all the more important to evaluate this characteristic in grouped data. There is a gap in the current statistical literature on assessing the mean–median–mode inequality for grouped data. The study aims to evaluate the relationship between the mean, median, and mode with unimodal grouped data; derive conditions for their inequalities; and present their application.
19

Distributionally Robust Learning under the Wasserstein Metric

Chen, Ruidi 29 September 2019
This dissertation develops a comprehensive statistical learning framework that is robust to (distributional) perturbations in the data using Distributionally Robust Optimization (DRO) under the Wasserstein metric. The learning problems that are studied include: (i) Distributionally Robust Linear Regression (DRLR), which estimates a robustified linear regression plane by minimizing the worst-case expected absolute loss over a probabilistic ambiguity set characterized by the Wasserstein metric; (ii) Groupwise Wasserstein Grouped LASSO (GWGL), which aims at inducing sparsity at a group level when there exists a predefined grouping structure for the predictors, through defining a specially structured Wasserstein metric for DRO; (iii) Optimal decision making using DRLR informed K-Nearest Neighbors (K-NN) estimation, which selects among a set of actions the optimal one through predicting the outcome under each action using K-NN with a distance metric weighted by the DRLR solution; and (iv) Distributionally Robust Multivariate Learning, which solves a DRO problem with a multi-dimensional response/label vector, as in Multivariate Linear Regression (MLR) and Multiclass Logistic Regression (MLG), generalizing the univariate response model addressed in DRLR. A tractable DRO relaxation is derived for each problem, establishing a connection between robustness and regularization and yielding upper bounds on the prediction and estimation errors of the solution. The accuracy and robustness of the estimator are verified through a series of synthetic and real data experiments. The experiments with real data are all associated with various health informatics applications, an application area which motivated the work in this dissertation. In addition to estimation (regression and classification), this dissertation also considers outlier detection applications.
20

MAKING A GROUPED-DATA FREQUENCY TABLE: DEVELOPMENT AND EXAMINATION OF THE ITERATION ALGORITHM

Lohaka, Hippolyte O. January 2007
No description available.
