31 |
ENHANCE NMF-BASED RECOMMENDATION SYSTEMS WITH AUXILIARY INFORMATION IMPUTATION
Alghamedy, Fatemah, 01 January 2019
This dissertation studies the factors that reduce the accuracy of collaborative filtering recommendation systems based on nonnegative matrix factorization (NMF). The keystone of a recommendation system is the rating, which expresses a user's opinion about an item. One of the most significant problems in recommendation systems is the lack of ratings. This is known as the "cold-start" problem, and it is most evident for New-Users, who have not rated any item, and New-Items, which have not received any rating.
Traditional recommendation systems assume that users are independent and identically distributed and ignore the connections among users, whereas recommendation is in fact a social activity. This dissertation aims to enhance NMF-based recommendation systems by using imputation while limiting the errors that imputation introduces into the system. External information, such as the trust network and item categories, is incorporated into NMF-based recommendation systems through the imputation.
The proposed approaches impute various subsets of the missing ratings. The subsets are defined by the total number of ratings a user or item has before the imputation: for example, imputing the missing ratings of New-Users, of New-Items, or of cold-start users or items that have few ratings. In addition, several factors that affect prediction accuracy when imputation is used with NMF-based recommendation systems are analyzed. These factors include the total number of ratings of the user or item before the imputation, the total number of imputed ratings for each user and item, the average of the imputed rating values, and the imputed rating values themselves. Several strategies are also applied to select the subset of missing ratings for imputation so as to increase prediction accuracy and limit imputation error. Moreover, the proposed method is compared with some popular methods that likewise use imputation to handle the lack of ratings but differ in the source of the imputed ratings.
Experiments on several large datasets are conducted to examine the proposed approaches and analyze the effects of the imputation on accuracy. Users and items are divided into three groups based on their total number of ratings before the imputation is applied, and the recommendation accuracy of each group is calculated. The results show that the imputation enhances the recommendation system by enabling it to recommend items to New-Users, introduce New-Items to users, and increase accuracy for cold-start users and items. However, the analyzed factors play important roles in the recommendation accuracy and in limiting the error introduced by the imputation.
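The subset-selection step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `impute_sparse_users`, the threshold `max_known`, and the toy matrix are hypothetical, and the item mean is used as a placeholder source of imputed values, whereas the dissertation draws them from auxiliary information such as the trust network. The NMF step itself is not shown.

```python
from statistics import mean

def impute_sparse_users(ratings, max_known=0):
    """Fill missing ratings (None) only for users with at most `max_known` ratings.

    Assumption: the item mean stands in for the auxiliary-information source
    (trust network, item categories) used in the dissertation.
    """
    n_items = len(ratings[0])
    observed = [r for row in ratings for r in row if r is not None]
    global_mean = mean(observed)

    # Mean of the observed ratings per item; fall back to the global mean
    # for items that have no rating at all.
    item_means = []
    for j in range(n_items):
        col = [row[j] for row in ratings if row[j] is not None]
        item_means.append(mean(col) if col else global_mean)

    imputed = []
    for row in ratings:
        known = sum(r is not None for r in row)
        if known <= max_known:
            # User is in the selected subset: fill every missing entry.
            imputed.append([item_means[j] if r is None else r
                            for j, r in enumerate(row)])
        else:
            # User has enough ratings: leave the row untouched.
            imputed.append(list(row))
    return imputed

# Hypothetical toy data: rows are users, columns are items.
ratings = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [None, None, None, None],  # a New-User with no ratings
    [1, 1, 5, None],
]
filled = impute_sparse_users(ratings, max_known=0)
```

With `max_known=0` only the New-User row is filled; raising the threshold would extend the imputation to cold-start users with a few ratings, which is the kind of subset choice the abstract says affects accuracy.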
|
32 |
Neural network imputation : a new fashion or a good tool
Amer, Safaa R., 07 June 2004
Most statistical surveys and data collection studies encounter missing data. A common
solution to this problem is to discard observations with missing data while reporting
the percentage of missing observations in different output tables. Imputation is a tool
used to fill in the missing values. This dissertation introduces the missing data
problem as well as traditional imputation methods (e.g. hot deck, mean imputation,
regression, Markov Chain Monte Carlo, Expectation-Maximization, etc.). The use of
artificial neural networks (ANN), a data mining technique, is proposed as an effective
imputation procedure. During ANN imputation, computational effort is minimized
while accounting for sample design and imputation uncertainty. The mechanism and
use of ANN in imputation for complex survey designs is investigated.
Imputation methods are not all equally good, and none are universally good. However,
simulation results and applications in this dissertation show that regression, Markov
chain Monte Carlo, and ANN yield comparable results. Artificial neural networks
could be considered as implicit models that take into account the sample design
without making strong parametric assumptions. Artificial neural networks make few
assumptions about the data, are asymptotically good, and are robust to multicollinearity
and outliers. Overall, ANN could be time- and resource-efficient for an experienced
user compared to other conventional imputation techniques. / Graduation date: 2005
|
33 |
Multiple comparisons using multiple imputation under a two-way mixed effects interaction model
Kosler, Joseph Stephen, January 2006
Thesis (Ph. D.)--Ohio State University, 2006. / Title from first page of PDF file. Includes bibliographical references (p. 233-237).
|
34 |
Essays on Unions, Wages and Performance: Evidence from Latin America
Rios, Fernando, 13 August 2013
Unions are one of the most important institutions in labor markets, and are capable of affecting workers (wages) and employers (performance). Despite the relevance unions have had worldwide, most of the literature has concentrated on the economic effects of unions in the U.S. and other developed countries, with few studies concentrating on what unions do in developing countries.
Because developing countries differ markedly from developed countries in economic development, legal settings and institutions, conclusions reached in the broader literature might not apply to developing countries. This dissertation aims to fill this gap in the literature by studying the economic effects of unions on wages and performance in selected developing countries in Latin America: Argentina, Bolivia, Chile, Mexico, Panama and Uruguay.
The first essay focuses on the impact of unions on the wage distribution in Bolivia and Chile, using the novel Recentered Influence Function decomposition. Although the two countries differ considerably in economic development and institutions, the estimates indicate that unions have similar effects, increasing wages and reducing wage inequality at the top of the distribution. These results are similar to those found when replicating the methodology with U.S. data. The results suggest that the common economic and political forces that govern the role of unions as collective bargaining units transcend other contextual differences in these countries.
The second essay analyzes the impact of unions on the economic performance of establishments in the manufacturing sector in Argentina, Bolivia, Chile, Mexico, Panama and Uruguay. Using an augmented Cobb-Douglas production function, the essay finds that unions have a positive, but small, effect on productivity, with the exception of Argentina. Analyses of alternative measures of performance show that, for most cases, the positive productivity effects barely offset the higher union compensation; that unions show no relationship with sales growth; and that unionized establishments usually reduce investment in capital and R&D. While no single narrative can explain all observed effects across countries, the results provide a step toward understanding the role of unions in economic performance in developing countries.
|
35 |
Imputation, Estimation and Missing Data in Finance
DiCesare, Giuseppe, January 2006
Suppose X is a diffusion process, possibly multivariate, and suppose that there are various segments of the components of X that are missing. This happens, for example, if X is the price of various assets and these prices are only observed at specific discrete trading times. Imputation (or conditional simulation) of the missing pieces of the sample paths of X is discussed in several settings. When X is a Brownian motion the conditioned process is a tied-down Brownian motion or a Brownian bridge process. In the special case of Gaussian stochastic processes the problem is simplified since the conditional finite-dimensional distributions of the process are multivariate Normal. For more general diffusion processes, including those with jump components, an acceptance-rejection simulation algorithm is introduced which enables one to sample from the exact conditional distribution without appealing to approximate time step methods such as the popular Euler or Milstein schemes. The method is referred to as pathwise imputation. Its practical implementation relies only on the basic elements of simulation while its theoretical justification depends on the pathwise properties of stochastic processes and in particular Girsanov's theorem. The method allows for the complete characterization of the bridge paths of complicated diffusions using only Brownian bridge interpolation. The imputation methods discussed are applied to estimation, variance reduction and exotic option pricing.
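For the simplest setting named in this abstract, a Brownian motion observed at two times, the missing segment can be imputed by sampling a Brownian bridge. The sketch below covers only that Gaussian case and is not the acceptance-rejection algorithm of the thesis; the function name `brownian_bridge` and the `sigma` and `rng` parameters are assumptions made for illustration. It uses the standard fact that, given X(s) = x and X(t1) = x1, the value at an intermediate time u is Normal with mean x + (u - s)/(t1 - s) * (x1 - x) and variance sigma^2 * (u - s)(t1 - u)/(t1 - s).

```python
import random

def brownian_bridge(t0, x0, t1, x1, n_steps, sigma=1.0, rng=random):
    """Sample sigma * Brownian motion on [t0, t1], tied to x0 at t0 and x1 at t1.

    Points are drawn one after another; each draw conditions on the point
    just sampled and on the fixed right endpoint.
    """
    dt = (t1 - t0) / n_steps
    times, path = [t0], [x0]
    s, x = t0, x0
    for k in range(1, n_steps):
        u = t0 + k * dt
        # Conditional mean: linear interpolation between (s, x) and (t1, x1).
        cond_mean = x + (u - s) / (t1 - s) * (x1 - x)
        # Conditional variance of the bridge at time u.
        cond_var = sigma ** 2 * (u - s) * (t1 - u) / (t1 - s)
        x = rng.gauss(cond_mean, cond_var ** 0.5)
        s = u
        times.append(u)
        path.append(x)
    times.append(t1)
    path.append(x1)
    return times, path

# Hypothetical example: fill in a path between observations at t=0 and t=1.
times, path = brownian_bridge(0.0, 1.0, 1.0, 3.0, n_steps=10,
                              rng=random.Random(0))
```

Every sampled path starts and ends exactly at the observed values, so repeated calls give independent imputations of the unobserved segment.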
|
37 |
Comparison of Imputation Methods on Estimating Regression Equation in MNAR Mechanism
Pan, Wensi, January 2012
In this article, we give an overview of the missing data problem, introduce three missing data mechanisms and study general solutions to them when estimating a linear regression equation. When data are partly missing, there are two common ways to address the problem. One is to ignore the records with missing values. The other is to impute the missing observations. Imputation methods are preferred since they provide full datasets. We observed that there is no general imputation solution under the missing not at random (MNAR) mechanism. To check the performance of existing imputation methods in a regression model, a simulation study is set up. Listwise deletion, simple imputation and multiple imputation are compared, focusing on their effect on parameter estimates and standard errors. The simulation results illustrate that listwise deletion provides reliable parameter estimates. Simple imputation performs better than multiple imputation in a model with a high determination coefficient. Multiple imputation, which offers a suitable solution for missing at random (MAR), is not valid for MNAR.
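A simulation of this general kind can be sketched as follows. It is not the study's actual design: the model y = 1 + 2x + noise, the sample size, the seed, and the MNAR rule (y is missing whenever it exceeds the sample's 70th percentile) are all assumptions chosen for illustration, and multiple imputation is left out. The sketch compares the slope estimated from the full data, after listwise deletion, and after mean imputation of the response.

```python
import random

def ols(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(42)
n = 5000
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

# Assumed MNAR rule: missingness depends on the unobserved value of y itself.
cutoff = sorted(ys)[int(0.7 * n)]
missing = [y > cutoff for y in ys]

# Listwise deletion: drop every record whose y is missing.
kept = [(x, y) for x, y, m in zip(xs, ys, missing) if not m]
listwise = ols([x for x, _ in kept], [y for _, y in kept])

# Mean imputation: replace each missing y with the mean of the observed y.
obs_mean = sum(y for _, y in kept) / len(kept)
ys_filled = [obs_mean if m else y for y, m in zip(ys, missing)]
mean_imp = ols(xs, ys_filled)

# Benchmark that is unavailable in practice: the fit on the complete data.
full = ols(xs, ys)

print("slope, full data:        %.3f" % full[1])
print("slope, listwise deletion: %.3f" % listwise[1])
print("slope, mean imputation:   %.3f" % mean_imp[1])
```

Under this particular missingness rule both approaches understate the slope, and mean imputation more so; a different MNAR rule or a different set of imputed variables can change the ranking, so the output should not be read as reproducing the study's findings.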
|
38 |
La détermination du patrimoine public responsable
Bouteiller, Julien, 13 October 2001
The question of determining the liable public estate arises first from a procedural requirement: the administrative judge makes the admissibility of a full-jurisdiction action (recours de plein contentieux) conditional on the defendant actually being the liable person. This leads one to ask what criteria the judge uses to identify the liable public entity, and it reveals, as most scholars confirm, that they amount to applying the "functional criterion for the imputation of damage". A preliminary study of the differences between the notions of imputation and causation leads, in the first part, to a survey of the situations, simple or complex, in which the liable estate is determined. This phenomenology ends with the finding that the functional criterion of imputation cannot be regarded as universally applicable, given the distortions the judge imposes on it, particularly in matters of police powers. The second part is therefore devoted to a closer examination of the functions served by holding public-law legal persons liable, drawing in particular on the tool bequeathed by Dean HAURIOU, institutional analysis...
|
39 |
On the use of multiple imputation in handling missing values in longitudinal studies
Chan, Pui-shan, 陳佩珊, January 2004
published_or_final_version / Medical Sciences / Master / Master of Medical Sciences
|
40 |
Praleistų reikšmių įrašymo metodų efektyvumas turizmo tyrime / Efficiency of missing data imputation methods in the survey on tourism
Binkytė, Kristina, 08 September 2009
In this work, we examined several missing data imputation methods, applying them to the first two items of question 2.6 of the statistical survey on outbound tourism: package tour and transport expenses. We analyzed the efficiency of the imputation methods on complete data by artificially creating missing values and filling them in with several imputation methods. Having both the real and the imputed values, we could compare the parameter estimates.
Missing data can arise randomly or non-randomly, so we applied the imputation methods in three cases: when missing data arise randomly, and when they arise from non-response by the respondents who had the highest or the lowest travel expenses. We applied distribution-based, mean, random, ratio and multiple imputation methods, both with and without imputation classes. We propose performing the same efficiency study for the remaining items of question 2.6, identifying the most suitable imputation method, and then applying it to the real missing data. In addition, the variance component due to imputation should be taken into account, because its contribution to the overall variance estimate is considerable. After the missing data are imputed, the parameter estimation procedures can be applied, and no other information is lost, as it would be if questionnaires with missing data were discarded.
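The evaluation design described here, which hides known values, imputes them, and compares the resulting estimate with the one from the complete data, can be sketched for two of the listed methods. The figures below are invented toy data, and `nights` is a hypothetical auxiliary variable chosen only to make ratio imputation possible; the abstract does not say which auxiliary variable was used.

```python
def mean_impute(y, missing):
    """Replace the values at the `missing` indices with the observed mean."""
    obs = [v for i, v in enumerate(y) if i not in missing]
    fill = sum(obs) / len(obs)
    return [fill if i in missing else v for i, v in enumerate(y)]

def ratio_impute(y, x, missing):
    """Replace missing y[i] with R * x[i], where R = sum(observed y) / sum(their x)."""
    num = sum(v for i, v in enumerate(y) if i not in missing)
    den = sum(v for i, v in enumerate(x) if i not in missing)
    r = num / den
    return [r * x[i] if i in missing else v for i, v in enumerate(y)]

# Invented complete data: transport expenses and trip length in nights.
expenses = [110.0, 190.0, 310.0, 390.0, 520.0]
nights = [2.0, 4.0, 6.0, 8.0, 10.0]

# Fictitious missingness: pretend respondents 1 and 4 did not answer.
masked = {1, 4}

true_mean = sum(expenses) / len(expenses)
mean_est = sum(mean_impute(expenses, masked)) / len(expenses)
ratio_est = sum(ratio_impute(expenses, nights, masked)) / len(expenses)

print("true mean:            ", true_mean)
print("after mean imputation:", mean_est)
print("after ratio imputation:", ratio_est)
```

Because the masked values are known, the error of each method can be measured directly, which is what makes it possible to choose a method before applying it to the real missing data.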
|