About
The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations (NDLTD). Our metadata is collected from universities around the world. If you manage a university, consortium, or country archive and want to be added, details can be found on the NDLTD website.
1

Methods for longitudinal data measured at distinct time points

Xiong, Xiaoqin January 2010 (has links)
For longitudinal data where the response and time-dependent predictors within each individual are measured at distinct time points, traditional longitudinal models such as generalized linear mixed effects models or marginal models cannot be applied directly. Instead, some preprocessing, such as smoothing, is required to temporally align the response and predictors. In Chapter 2, we propose a binning method that results in equally spaced time bins for both the response and predictor(s); after binning, traditional models can therefore be applied. The proposed binning approach was applied to a longitudinal hemodialysis study to look for possible contemporaneous and lagged effects between occurrences of a health event (i.e., infection) and levels of a protein marker of inflammation (i.e., C-reactive protein). Both Poisson mixed effects models and zero-inflated Poisson (ZIP) mixed effects models were applied to the binned data, and some important biological findings about contemporaneous and lagged associations were uncovered. In addition, a simulation study was conducted to investigate various properties of the binning approach. In Chapter 3, asymptotic properties are derived for the fixed effects association parameter estimates following binning, under different data scenarios. In addition, we propose leave-one-subject-out cross-validation algorithms for bin size selection. In Chapter 4, in order to identify levels of a predictor that might be indicative of recently occurred events, we propose a generalized mixed effects regression tree (GMRTree) based method, which estimates the tree by a standard tree method such as CART and estimates the random effects by a generalized linear mixed effects model. One of the main steps in this method is to use a linearization technique to convert the longitudinal count response into a continuous surrogate response. Simulations have shown that the GMRTree method can effectively detect the underlying tree structure in an applicable longitudinal dataset, and has better predictive performance than either a standard tree approach without random effects or a generalized linear mixed effects model, assuming the underlying model indeed has a tree structure. We have also applied this method to two longitudinal datasets, one from the aforementioned hemodialysis study and the other from an epilepsy study.
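The abstract describes binning only in outline; as a rough illustration of the idea, here is a minimal Python sketch (all variable names, bin widths, and data values are hypothetical, not taken from the thesis) that aggregates irregularly timed measurements into equally spaced bins so a response and a predictor can be aligned:

```python
# Illustrative sketch only, not the thesis code: bin irregularly timed
# longitudinal measurements into equal-width time bins so response and
# predictor are temporally aligned.
import numpy as np
import pandas as pd

def bin_longitudinal(df, time_col, value_col, bin_width, agg):
    """Aggregate one subject's irregularly timed series into equal time bins."""
    edges = np.arange(0.0, df[time_col].max() + bin_width, bin_width)
    labels = edges[:-1]                       # label each bin by its left edge
    binned = df.assign(bin=pd.cut(df[time_col], bins=edges, labels=labels,
                                  include_lowest=True))
    # observed=False keeps empty bins so both series share the same index
    return binned.groupby("bin", observed=False)[value_col].agg(agg)

# Hypothetical data: infection events and CRP measured at distinct times.
events = pd.DataFrame({"t": [0.3, 1.7, 2.2, 5.9], "infection": [0, 1, 1, 0]})
crp = pd.DataFrame({"t": [0.9, 2.5, 4.1, 5.2], "crp": [8.0, 21.5, 13.2, 9.7]})

y = bin_longitudinal(events, "t", "infection", bin_width=2.0, agg="sum")
x = bin_longitudinal(crp, "t", "crp", bin_width=2.0, agg="mean")
aligned = pd.concat({"infections": y, "crp": x}, axis=1)
print(aligned)
```

Once aligned like this, lagged associations can be examined by shifting the predictor column (e.g., `aligned["crp"].shift(1)`), and the binned counts can be passed to a Poisson or ZIP mixed effects model.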
2

Identifying responders to melphalan and dexamethasone for newly diagnosed multiple myeloma patients

Esmaeili, Abbas 22 July 2008 (has links)
Background: The MY7 clinical trial compared dexamethasone plus melphalan (MD) with prednisone plus melphalan (MP) in multiple myeloma treatment and found no statistically significant difference in overall survival (OS) between the two groups. However, individual patients responded to treatment differently. We aimed to identify patients who might have benefited from dexamethasone and to characterize them by their baseline demographic and clinical factors. Methods: First, a prognostic model for OS was developed on the MP arm. The estimated coefficients and baseline hazard were applied to the MD arm to derive martingale residuals (MR). Classification and regression tree analysis was performed to identify independent predictive factors for OS, with MR used as the response variable. All covariates, in categorical form, were used as independent variables to develop the predictive model in the MD arm; the MP arm was divided accordingly. Subgroups with negative mean MR (survived longer than expected) were candidates for positive responders, while those with positive mean MR (survived shorter than expected) were candidates for negative responders. The mean MR in each subgroup and p-values from comparisons of OS (log-rank test stratified by subgroup) were used to combine the appropriate subgroups into positive- or negative-responder groups. Results: A total of 97 patients (42%) in the MD arm were identified as positive responders, and their OS (median of 44.5 months) was significantly longer than that (median of 33 months) in the corresponding subgroups of the MP arm (HR = 0.56, 95% CI 0.4-0.8; p = 0.0014). All positive responders had three common baseline characteristics: age ≤75 years, calcium concentration ≤2.6 mmol/L, and Durie-Salmon stage 2 or 3. Among patients with ECOG performance status <2, those with either HGB ≥100 mg/dl, or HGB <100 mg/dl with WBC ≥4,000 and <4 lytic bone lesions, were categorized as positive responders. Among patients with ECOG performance status ≥2, males with >3 lytic bone lesions were positive responders. Negative responders (HR = 1.56, 95% confidence interval 1.1-2.2; p = 0.006) included patients aged >75, or aged ≤75 with calcium concentration >2.6 mmol/L, or aged ≤75 with calcium concentration ≤2.6 mmol/L but with Durie-Salmon stage 1. Conclusions: Further studies are warranted to evaluate the validity of these hypotheses. / Thesis (Master, Community Health & Epidemiology) -- Queen's University, 2008-07-21 13:46:53.748
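The two-step procedure in the Methods can be sketched as follows. This is an illustrative reconstruction, not the thesis code; the column names, file names, covariate coding, and tuning values are hypothetical, and covariates are assumed already numerically coded:

```python
# Sketch of the idea: fit a Cox model on the control (MP) arm, carry its
# coefficients and baseline hazard to the treatment (MD) arm to form
# martingale residuals, then grow a regression tree on those residuals.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from sklearn.tree import DecisionTreeRegressor

covariates = ["age", "calcium", "stage"]          # hypothetical baseline factors

def martingale_residuals(cph, df, time_col="os_months", event_col="died"):
    # r_i = delta_i - H0(t_i) * exp(x_i' beta), using the MP-arm fit
    h0 = cph.baseline_cumulative_hazard_.iloc[:, 0]   # step function of time
    H0_at_t = np.interp(df[time_col], h0.index.values, h0.values)
    risk = cph.predict_partial_hazard(df).values      # exp(x' beta)
    return df[event_col].values - H0_at_t * risk

mp = pd.read_csv("my7_mp_arm.csv")                # hypothetical arm files
md = pd.read_csv("my7_md_arm.csv")

cph = CoxPHFitter().fit(mp[covariates + ["os_months", "died"]],
                        duration_col="os_months", event_col="died")
md_resid = martingale_residuals(cph, md)

# CART on residuals: leaves with negative mean residual are candidate
# positive-responder subgroups.
tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=20)
tree.fit(md[covariates], md_resid)
leaf_means = pd.Series(md_resid).groupby(tree.apply(md[covariates])).mean()
print(leaf_means)
```

The sign convention matches the abstract: a martingale residual is observed minus expected events, so a leaf with a negative mean identifies patients who survived longer than the MP-based model predicts.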
3

The Approach-dependent, Time-dependent, Label-constrained Shortest Path Problem and Enhancements for the CART Algorithm with Application to Transportation Systems

Jeenanunta, Chawalit 30 July 2004 (has links)
In this dissertation, we consider two important problems pertaining to the analysis of transportation systems. The first is an approach-dependent, time-dependent, label-constrained shortest path problem that arises in the context of the Route Planner Module of the Transportation Analysis Simulation System (TRANSIMS), which has been developed by the Los Alamos National Laboratory for the Federal Highway Administration. This is a variant of the shortest path problem defined on a transportation network composed of a set of nodes and a set of directed arcs, such that each arc has an associated label designating a mode of transportation and an associated travel time function that depends on the time of arrival at the tail node as well as on the node via which this node was approached. The last feature is a new concept injected into the time-dependent, label-constrained shortest path problem and is used to model turn penalties in transportation networks: the time spent at an intersection before entering the next link depends on whether we travel straight through the intersection, make a right turn, or make a left turn. Accordingly, we model this situation by incorporating within each link's travel time function a dependence on the link via which its tail node was approached. We propose two effective algorithms to solve this problem by adapting two efficient existing algorithms to handle time dependency and label constraints, the Partitioned Shortest Path (PSP) algorithm and the Heap-Dijkstra (HP-Dijkstra) algorithm, and we present related theoretical complexity results. In addition, we explore various heuristic methods to curtail the search: an Augmented Ellipsoidal Region Technique (A-ERT) and a Distance-Based A-ERT, along with some variants, restrict the search for an optimal path between a given origin and destination to more promising subsets of the network, which speeds up computation without sacrificing optimality. We also incorporate an approach-dependent delay estimation function and, in concert with a search tree level-based technique, derive a total estimated travel time that is used as a key to prioritize node selections or to sort elements in the heap. As soon as we reach the destination node, while it is within some p% of the minimum key value of the heap, we terminate the search. We name the versions of PSP and HP-Dijkstra that employ this method the Early Terminated PSP (ET-PSP) and Early Terminated Heap-Dijkstra (ETHP-Dijkstra) algorithms. All of these procedures are compared with the original Route Planner Module within TRANSIMS, which is implemented in the Linux operating system using C++ and the g++ GNU compiler. Extensive computational testing has been conducted using available data from the Portland, Oregon, and Blacksburg, Virginia, transportation networks to investigate the efficacy of the developed procedures. In particular, we have tested twenty-five different combinations of network curtailment and algorithmic strategies on three test networks: the Blacksburg-light, the Blacksburg-full, and the BigNet network. The results indicate that the Heap-Dijkstra algorithm implementations are much faster than the PSP algorithmic approaches for solving the underlying problem exactly. Furthermore, among the curtailment schemes, ETHP-Dijkstra with p = 5% yields the best overall results, producing solutions within 0.37-1.91% of optimality while decreasing CPU effort by 56.68% on average, as compared with applying the best available exact algorithm.
The second part of this dissertation is concerned with the Classification and Regression Tree (CART) algorithm and its application to the Activity Generation Module of TRANSIMS. The CART algorithm has been popularly used in various contexts by transportation engineers and planners to correlate a set of independent household demographic variables with certain dependent activity or travel time variables. However, the algorithm lacks an automated mechanism for deriving classification trees based on optimizing specified objective functions and handling desired side constraints that govern the structure of the tree and the statistical and demographic nature of its leaf nodes. Using a novel set partitioning formulation, we propose new tree development and, more importantly, optimal pruning strategies to accommodate such objective functions and side constraints, and we establish the theoretical validity of our approach. This general enhancement of the CART algorithm is then applied to the Activity Generator module of TRANSIMS. Related computational results are presented using real data pertaining to the Portland, Oregon, and Blacksburg, Virginia, transportation networks to demonstrate the flexibility and effectiveness of the proposed approach in classifying data, as well as to examine its numerical performance. The results indicate that a variety of objective functions and constraints can be readily accommodated to efficiently control the structural information captured by the developed classification tree, as desired by the planner or analyst and dependent on the scope of the application at hand. / Ph. D.
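To make the approach-dependent, time-dependent, label-constrained idea concrete, here is a minimal Python sketch. The actual Route Planner is C++ within TRANSIMS; this toy assumes FIFO travel times (under which a Dijkstra-style search remains exact) and uses entirely hypothetical arc data:

```python
# Illustrative sketch, not the TRANSIMS code: a Dijkstra variant whose state
# is (node, approach_node), so each arc's travel time can depend on arrival
# time AND on how the tail node was approached; arcs carry mode labels that
# are checked against an allowed set.
import heapq
import itertools

def td_label_shortest_path(arcs, origin, dest, depart, allowed_modes):
    """arcs: dict tail -> list of (head, mode, travel_time_fn), where
    travel_time_fn(arrival_time, approach_node) -> nonnegative delay."""
    tie = itertools.count()                     # heap tiebreaker
    best = {(origin, None): depart}             # state: (node, approach_node)
    heap = [(depart, next(tie), origin, None)]
    while heap:
        t, _, u, appr = heapq.heappop(heap)
        if t > best.get((u, appr), float("inf")):
            continue                            # stale heap entry
        if u == dest:
            return t                            # earliest arrival time
        for v, mode, tt in arcs.get(u, []):
            if mode not in allowed_modes:       # label constraint
                continue
            tv = t + tt(t, appr)                # time- and approach-dependent
            if tv < best.get((v, u), float("inf")):
                best[(v, u)] = tv
                heapq.heappush(heap, (tv, next(tie), v, u))
    return None

# Toy network: entering c from b costs extra when b was approached from a,
# mimicking a turn penalty.
arcs = {
    "a": [("b", "walk", lambda t, ap: 2.0)],
    "b": [("c", "walk", lambda t, ap: 1.0 + (3.0 if ap == "a" else 0.0))],
}
print(td_label_shortest_path(arcs, "a", "c", 0.0, {"walk"}))  # -> 6.0
```

The key point is that the search state is a (node, approach-node) pair rather than a node alone, which is what lets each link's travel time function encode turn penalties.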
4

A Study on the Time-Dependent Corrosion Behavior of Steel Structural Members at Concrete Boundaries

KAINUMA, Shigenobu; HOSOMI, Naofumi; KIM, In-Tae; ITOH, Yoshito 01 1900 (has links)
No description available.
5

Application of the CART Decision Tree to the Evaluation of Mutual Funds

Hsu, Chiny-Yin 04 August 2006 (has links)
No description available.
6

A Fundamental Study on the Evaluation and Prediction of Corrosion Behavior of Long Steel Members in Marine Environments

ITOH, Yoshito; GOTO, Atsushi; HOSOMI, Naofumi; KAINUMA, Shigenobu 20 May 2009 (has links)
No description available.
7

Regression Tree-Based Methodology for Customizing Building Energy Benchmarks to Individual Commercial Buildings

January 2013 (has links)
abstract: According to the U.S. Energy Information Administration, commercial buildings account for about 40% of the United States' energy consumption, of which office buildings consume a major portion. Gauging the extent to which an individual building consumes energy in excess of its peers is the first step in initiating energy efficiency improvements. Energy benchmarking offers an initial assessment of building energy performance without rigorous evaluation. Energy benchmarking tools based on the Commercial Buildings Energy Consumption Survey (CBECS) database are investigated in this thesis. This study proposes a new benchmarking methodology based on decision trees, in which a relationship between energy use intensities (EUI) and building parameters (continuous and categorical) is developed for different building types. This methodology was applied to the medium office and school building types contained in the CBECS database. The Random Forest technique was used to find the most influential parameters affecting building energy use intensities. Subsequently, significant correlations between EUIs and CBECS variables were identified. Other than floor area, some of the important variables were number of workers, location, number of PCs, and main cooling equipment. The coefficient of variation was used to evaluate the effectiveness of the new model. The customization technique proposed in this thesis was compared with another benchmarking model that is widely used by building owners and designers, namely ENERGY STAR's Portfolio Manager. That tool relies on standard linear regression methods, which can handle only continuous variables. The proposed model uses data mining techniques and was found to perform slightly better than Portfolio Manager. The broader impact of the proposed benchmarking methodology is that it allows important categorical variables to be identified and then incorporated in a local, rather than global, model framework for EUI pertinent to the building type. The ability to identify and rank the important variables is of great importance in the practical implementation of benchmarking tools that rely on query-based building and HVAC variable filters specified by the user. / Dissertation/Thesis / M.S. Built Environment 2013
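As a rough sketch of the workflow the abstract describes (ranking variables with a Random Forest, then building a local tree-based benchmark), with hypothetical CBECS-style column and file names rather than the thesis's actual data:

```python
# Illustrative sketch only: rank candidate variables by Random Forest
# importance, grow a regression tree on the top ones, and benchmark each
# building's EUI against the mean of its leaf (its local peer group).
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

df = pd.read_csv("cbecs_offices.csv")            # hypothetical extract
y = df["eui_kbtu_per_sqft"]
X = pd.get_dummies(df[["sqft", "workers", "num_pcs", "climate_zone",
                       "main_cooling"]])         # one-hot the categoricals

rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X, y)
importance = pd.Series(rf.feature_importances_, index=X.columns)
top = importance.nlargest(8).index               # most influential variables

tree = DecisionTreeRegressor(min_samples_leaf=30, random_state=0).fit(X[top], y)
leaf = tree.apply(X[top])                        # peer group for each building
peer_mean = y.groupby(leaf).transform("mean")
df["benchmark_ratio"] = y / peer_mean            # >1: above local peer benchmark
```

Because the tree splits on one-hot columns, categorical variables such as climate zone or cooling equipment participate in defining the peer group, which is the local-versus-global distinction the abstract emphasizes.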
8

Population Modeling of the Rainwater Killifish, Lucania parva, in Florida Bay Using Multivariate Regression Trees

Marcum, Pamela C. 23 August 2013 (has links)
Modeling is a powerful tool that can be used to identify important relationships between organisms and their habitat (Guisan & Zimmermann, 2000). Understanding how the two relate to one another is important for conserving and managing ecosystems, but the extreme complexity of those ecosystems makes such relationships very difficult to characterize fully. Unlike many other modeling techniques, Multivariate Regression Trees (MRTs) are not limited by a priori assumptions, pre-determined relationships, transformations, or correlations. MRTs provide both explanation and prediction of ecological data by producing simple models that are easy to interpret. This study used MRTs to evaluate and model relationships between Lucania parva and the environment and habitat of Florida Bay. Counts were transformed to presence-absence and abundance groupings. Models were first run using a variety of combinations of response variables and all explanatory variables. Results of these models were used to select the best combination of response and explanatory variables in an effort to create a best-fit model. Models indicated that Lucania parva populations are found in the dense (cover ≥50%), shallow-water (<1.8 m) grass beds that occur in the western portion of Florida Bay. The best-fit model explained 63.7% of the variance, with a predictive error of 0.43.
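Ecological MRTs are typically fit with R's mvpart; as an illustrative analogue only, a multi-output regression tree in Python captures the same idea of splitting on habitat variables to minimize within-node variance of a multivariate response. All column and file names below are hypothetical:

```python
# Illustrative analogue of a De'ath-style multivariate regression tree, not
# the thesis code: a multi-output regression tree splits sites on habitat
# variables so that the multivariate response is homogeneous within leaves.
import pandas as pd
from sklearn.tree import DecisionTreeRegressor, export_text

df = pd.read_csv("florida_bay_trawls.csv")       # hypothetical survey extract
habitat = df[["depth_m", "seagrass_cover_pct", "salinity", "temperature_c"]]
response = df[["lucania_parva_abund", "lucania_parva_presence"]]

mrt = DecisionTreeRegressor(max_depth=3, min_samples_leaf=25, random_state=0)
mrt.fit(habitat, response)                       # multi-output splits
print(export_text(mrt, feature_names=list(habitat.columns)))
print("variance explained:", mrt.score(habitat, response))  # cf. 63.7% above
```

Each leaf of such a tree is directly interpretable as a habitat type (for example, dense shallow grass beds) with a characteristic response profile, which is the explanatory appeal the abstract highlights.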
9

Mean and Realized Volatility Smooth Transition Models Applied to Return Forecasting and Automatic Trading

CAMILA ROSA EPPRECHT 30 March 2009 (has links)
The main goal of this dissertation is to compare the performance of linear and nonlinear models for forecasting the returns of 23 assets in the American stock market. The Heteroscedastic STAR-Tree model is proposed, applying the STAR-Tree (Smooth Transition AutoRegression Tree) methodology to heteroscedastic time series. As intraday asset return and realized volatility data are available, the return series are transformed by dividing each return by its realized volatility, which yields approximately homoscedastic series. The model is a combination of the STAR (Smooth Transition AutoRegression) methodology and the CART (Classification and Regression Tree) algorithm, and the resulting model can be interpreted as a smooth transition multiple-regime regression. Model specification is done by Lagrange Multiplier tests that indicate the node to be split and the corresponding transition variable. The comparison models are the Mean model, the Naive method, ARX linear models, and Neural Networks. The forecasting models were evaluated through statistical and financial measures. The financial results are based on an automatic trading rule that signals when to buy and sell each stock. The Heteroscedastic STAR-Tree model's statistical performance was equivalent to that of the other models, but its financial performance was superior for most of the series. The STAR-Tree methodology was also applied to forecasting realized volatility, and the forecasts were used in a financial leverage analysis.
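The smooth-transition building block that the abstract combines with CART-style splitting can be sketched briefly. This illustrates general STAR mechanics with made-up parameter values, not the dissertation's estimation code:

```python
# Illustrative sketch of a two-regime STAR forecast: a logistic transition
# function blends two AR(1) regimes, and each return is first standardized
# by its realized volatility, as described in the abstract.
import numpy as np

def logistic_transition(s, gamma, c):
    """G(s; gamma, c) in (0, 1): smooth switch between regimes around s = c."""
    return 1.0 / (1.0 + np.exp(-gamma * (s - c)))

def star_predict(r_lag, s, phi1, phi2, gamma, c):
    """Two-regime STAR forecast: (1 - G) * regime 1 + G * regime 2."""
    g = logistic_transition(s, gamma, c)
    return (1.0 - g) * (phi1 * r_lag) + g * (phi2 * r_lag)

# Hypothetical data: standardize returns by realized volatility, then forecast.
returns = np.array([0.012, -0.008, 0.021, -0.015])
rvol = np.array([0.010, 0.011, 0.018, 0.016])
z = returns / rvol                               # approximately homoscedastic
forecast = star_predict(r_lag=z[-1], s=z[-2],    # lagged value as transition var
                        phi1=0.3, phi2=-0.2, gamma=5.0, c=0.0)
print(forecast)
```

In the STAR-Tree setting, each CART-style split corresponds to one such smooth transition, so the fitted model behaves like a regression tree whose regime boundaries are soft rather than sharp.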
