  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
211

臨床實驗藥量特性之研究 / Characterizing dose response curve in clinical trials

方廷企 Unknown Date
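In the pharmaceutical industry, dose-response studies are widely used in pharmacology, toxicology, and clinical trials. They play an important role both in Phase I trials, which assess drug safety, and in Phase II trials, which assess efficacy. Dose-response studies deepen our understanding of drug development and can shorten the time to marketing approval. In most cases, the linear relationship between pharmacokinetic parameters and dose is of particular interest, and several classical methods built on the general linear model serve this purpose. However, these classical methods often run into two difficulties that violate their assumptions: (i) heterogeneous variability across dose levels and (ii) non-normality. To address these problems, we assess dose linearity and characterize the dose-response curve with a nonparametric test procedure that compares slopes between dose levels, and we illustrate the method with a crossover experiment. / The problem of characterizing the dose-response curve for a pharmacokinetic parameter over a specific dose range is considered. In many cases, it is of interest to determine dose linearity (or dose proportionality) between the pharmacokinetic parameters and dose levels. For this purpose, several classical methods based on a general linear model procedure are available. However, two difficulties are commonly encountered, namely (i) heterogeneity of the variability at different dose levels and (ii) violation of the normality assumption, and these often make the classical methods inapplicable. To account for these problems, we propose a general nonparametric test procedure that compares the slopes at different dose levels to assess dose linearity and to characterize the dose-response curve. An example concerning the dose response of a compound, based on a four-way crossover experiment, is presented.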
212

Truncated Data Problems In Helical Cone-Beam Tomography

Anoop, K P 06 1900
This report delves into two of the major truncated data problems in helical cone-beam tomography: axial truncation and lateral truncation. The problem of axial truncation, classically known as the long object problem, was a major challenge in the development of helical scan tomography. Generalizations of the Feldkamp method (FDK) from the circular scan to the helical scan trajectory were known to give reasonable solutions to the problem. The FDK methods are approximate in nature and hence provide only an approximate solution to the long object problem. Recently, many methods that provide an exact solution to this problem have been developed, the major breakthrough being Katsevich’s algorithm, which is exact and efficient and also requires less detector area than the Feldkamp methods. The first part of the report deals with implementation strategies for methods capable of handling axial truncation. Here, we specifically look at Katsevich’s exact and efficient solution to the long object problem and at the class of approximate solutions provided by the generalized FDK formulae. The latter half of the report looks at the lateral truncation problem and suggests new methods to handle such truncation in helical scan CT. Simulation results show that reconstructing from laterally truncated projection data as if it were complete gives severe artifacts that even penetrate into the field of view (FOV). A row-by-row data completion approach using linear prediction is introduced for helical scan truncated data. An extension and improvement of this technique, the windowed linear prediction approach, is also introduced. The efficacy of both techniques is shown using simulations with standard phantoms. Various image quality measures for the resulting reconstructed images are used to evaluate the performance of the proposed methods against an existing technique.
Motivated by a study of the autocorrelation and partial autocorrelation functions of the projection data, the use of a non-stationary linear model, the ARIMA model, is proposed for data completion. The new model is first validated in the 2D truncated data situation. A method of incorporating the parallel-beam data consistency condition into this new method is also considered. Performance evaluation of the new method with the consistency condition shows that it can outperform the existing techniques. Simulation experiments show the efficacy of the ARIMA model for data completion in both 2D and 3D truncated data scenarios. The model is shown to work well for the laterally truncated helical cone-beam case.
213

Modeling framework for socioeconomic analysis of managed lanes

Khoeini, Sara 08 June 2015
Managed lanes are a form of congestion pricing that uses occupancy and toll payment requirements to utilize capacity more efficiently. The main research question of this study is how socio-spatial characteristics affect users’ travel behavior toward managed lanes. This research is a case study of the conversion of a High Occupancy Vehicle (HOV) lane to a High Occupancy Toll (HOT) lane, implemented on I-85 in Atlanta in 2011. To minimize the cost and maximize the size of the collected data, an innovative and cost-effective modeling framework for socioeconomic analysis of managed lanes has been developed. Instead of surveys, this research is based on the observation of one and a half million license plates, matched to household locations, collected over a two-year study period. Purchased marketing data, which include detailed household socioeconomic characteristics, supplemented the household corridor usage information derived from license plate observations. Generalized linear models have been used to link users’ travel behavior to socioeconomic attributes. Furthermore, GIS raster analysis methods have been utilized to visualize and quantify the impact of the HOV-to-HOT conversion on the corridor commutershed. At the local level, this study conducted a comprehensive socio-spatial analysis of the Atlanta I-85 HOV-to-HOT conversion. At the general scale, this study enhances managed lanes’ travel demand models with respect to users’ characteristics and introduces a comprehensive modeling framework for the socioeconomic analysis of managed lanes. The methods developed through this research will inform future Traffic and Revenue Studies and help to better predict the socio-spatial characteristics of the target market.
214

Prédiction de l'évolution de la scoliose idiopathique de l'adolescent à l'aide des paramètres tridimensionnels du rachis

Nault, Marie-Lyne 08 1900
Adolescent idiopathic scoliosis is a 3D deformity of the spine. The literature contains many studies on predicting its evolution and identifying risk factors for progression. For now, the established risk factors are the magnitude of the deformity, skeletal maturity, and curve type. Several other areas have been explored, including genetic, biochemical, mechanical, postural, and topographic aspects, without adding much precision to the prediction of evolution. Advances in technology now make it possible to generate 3D reconstructions of the spine from standard radiographs and to measure 3D parameters. Integrating these 3D parameters into a predictive model is a still-unexplored avenue that is entirely logical given that the deformity itself is three-dimensional. The general objective of this thesis is to develop a model that predicts the Cobb angle at skeletal maturity from the information available at the first visit, namely the initial Cobb angle, the curve type, bone age, and 3D spine parameters. In a first study, a bone age index was developed based on ossification of the iliac apophysis and the status of the triradiate cartilage. The index has three stages; the second stage, defined by a closed triradiate cartilage with at most 1/3 ossification of the iliac apophysis, is the period during which adolescent idiopathic scoliosis progresses fastest. A second, retrospective study highlighted the potential of 3D parameters to improve the prediction of evolution. It showed that, at the first visit, five 3D spine parameters differ between a group of patients who will eventually undergo surgery and a group who will not progress.
These parameters are: mean 3D wedging of the apical discs, intervertebral rotation at the lower junction of the curve, torsion, and the height/width ratio of the T6 vertebral body and of the whole spine. The last two studies are based on a prospective cohort of 133 patients with adolescent idiopathic scoliosis followed from their first hospital visit to skeletal maturity. The first of these highlighted morphological differences at the first visit between patients who progressed by more than 6° and those who progressed by less. Differences were found for kyphosis, the angle of the plane of maximal deformity, intervertebral rotation at the apex, torsion, and several slenderness parameters. The second developed a predictive model based on a general linear model that includes the bone age index developed in the first study, the curve type, the Cobb angle at the first visit, the angle of the plane of maximal deformity, the 3D wedging of the T3-T4, T8-T9, and T11-T12 discs, and the sum of the 3D wedging of all thoracic and lumbar discs. The coefficient of multiple determination for this model is 0.715. The predictive model reinforces the importance of considering idiopathic scoliosis in three dimensions and will help optimize the prediction of evolution at the first visit. / Prediction of curve progression remains a challenge for clinicians at the first visit of a patient with adolescent idiopathic scoliosis. Prediction of progression is based on curve type, curve magnitude, and skeletal or chronological age. Failure to accurately predict the risk of progression can lead to inadequate treatment, as well as unnecessary medical visits and radiographs.
Three-dimensional evaluation is currently more popular in the scoliosis research community, either for classification or for treatment planning. The global objective of this thesis was to develop a predictive model of the final Cobb angle in adolescent idiopathic scoliosis based on 3D spine parameters and on skeletal age, curve type, and curve magnitude. The first study developed a skeletal maturity system based on ossification of the iliac apophysis and triradiate cartilage status, two details available on routine radiographs. A three-stage system was developed, with the second stage associated with the rapid acceleration phase of the deformity. The second stage corresponds to a closed triradiate cartilage and at most 1/3 ossification of the iliac apophysis. The second study was a retrospective case-control study. It showed the potential of 3D parameters to help predict the evolution. Comparing 3D spinal parameters at the first visit between a group of patients eventually treated with surgery and a group who had no progression revealed statistically significant differences in mean 3D wedging of the apical discs, lower junctional vertebral axial rotation, torsion, and the T6 and whole-spine height/width ratios. The last study was a prospective cohort study based on an adolescent idiopathic scoliosis group followed from first visit to skeletal maturity. A general linear model was developed based on 3D spinal parameters, curve type, Cobb angle at first visit, and the three-stage skeletal maturity system developed in the first study. The resulting predictive model has a coefficient of determination of 0.715. The 3D predictors included were the angle of the plane of maximal curvature, the 3D wedging of the T3-T4, T8-T9, and T11-T12 discs, and the sum of the 3D wedging of all thoracic and lumbar discs. The predictive model reinforces the importance of considering adolescent idiopathic scoliosis from a three-dimensional point of view.
This model will also optimise the prediction of evolution at first visit.
215

Sélection de modèle d'imputation à partir de modèles bayésiens hiérarchiques linéaires multivariés

Chagra, Djamila 06 1900
Abstract: The technique known as multiple imputation seems to be the most suitable technique for solving the problem of non-response. The literature mentions methods that model the nature and structure of missing values. One of the most popular methods is the PAN algorithm of Schafer and Yucel (2002). The imputations yielded by this method are based on a multivariate linear mixed-effects model for the response variable. A Bayesian hierarchical clustered and more flexible extension of PAN is given by the BHLC model of Murua et al. (2005). The main goal of this work is to study the problem of model selection for multiple imputation in terms of efficiency and accuracy of missing-value predictions. We propose a measure of performance linked to the prediction of missing values. The measure is a mean squared error, and hence, in addition to the variance associated with the multiple imputations, it includes a measure of bias in the prediction. We show that this measure is more objective than the most common variance measure of Rubin. Our measure is computed by increasing the number of missing values in the data by a small proportion, that is, by treating some observed values as if they were also missing. The performance of the imputation model is then assessed through the prediction error associated with these pseudo-missing values. In order to study the problem objectively, we have devised several simulations.
Data were generated according to different explicit models that assumed particular error structures. Several missing-value prior distributions as well as error-term distributions are then hypothesized. Our study investigates whether the true error structure of the data has an effect on the performance of the different hypothesized choices for the imputation model. We concluded that the answer is yes. Moreover, the choice of missing-value prior distribution seems to be the most important factor for accuracy of predictions. In general, the most effective choices for good imputations are a Student's t distribution with different cluster variances for the error term, and a missing-value normal prior with data-driven mean and variance, or a missing-value regularizing normal prior with large variance (a ridge-regression-like prior). Finally, we have applied our ideas to a real problem dealing with health outcome observations associated with a large number of countries around the world. Keywords: missing values, multiple imputation, Bayesian hierarchical linear model, mixed-effects model. / The software used is Splus and R.
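The evaluation measure described in this abstract (hide a small proportion of observed values, impute them several times, and score the result with a mean squared error that combines imputation variance and prediction bias) can be sketched as follows. This is a minimal illustration, not the PAN or BHLC implementation: the function name `pseudo_missing_mse`, its parameters, and the stand-in imputer (independent normal draws using the empirical mean and standard deviation of the remaining observed values) are all assumptions made for the example.

```python
import random
import statistics


def pseudo_missing_mse(values, hide_fraction=0.1, n_imputations=20, seed=0):
    """Score a toy imputer on artificially hidden ("pseudo-missing") values.

    `values` is a list of floats, with None marking truly missing entries.
    Returns (mse, variance_part, squared_bias_part), averaged over the
    hidden entries. Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    observed_idx = [i for i, v in enumerate(values) if v is not None]
    n_hide = max(1, round(hide_fraction * len(observed_idx)))
    hidden_idx = rng.sample(observed_idx, n_hide)
    hidden = set(hidden_idx)

    # Stand-in imputation model: normal with the empirical mean and spread
    # of the values that are still observed after hiding.
    kept = [values[i] for i in observed_idx if i not in hidden]
    mu = statistics.fmean(kept)
    sigma = statistics.pstdev(kept)

    mse_terms, variance_terms, bias_terms = [], [], []
    for i in hidden_idx:
        draws = [rng.gauss(mu, sigma) for _ in range(n_imputations)]
        centre = statistics.fmean(draws)
        variance_terms.append(statistics.pvariance(draws))
        bias_terms.append((centre - values[i]) ** 2)
        mse_terms.append(statistics.fmean([(d - values[i]) ** 2 for d in draws]))

    return (
        statistics.fmean(mse_terms),
        statistics.fmean(variance_terms),
        statistics.fmean(bias_terms),
    )


data = [1.0, 2.0, None, 4.0, 5.0, 6.0, None, 8.0, 9.0, 10.0, 3.0, 7.0]
mse, variance_part, bias_part = pseudo_missing_mse(data)
print(mse, variance_part, bias_part)
```

In this sketch the mean squared error splits exactly into the variance across imputations plus the squared prediction bias, which is the property the abstract relies on when it says the measure captures bias as well as variance.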
216

DirichletReg: Dirichlet Regression for Compositional Data in R

Maier, Marco J. 18 January 2014
Dirichlet regression models can be used to analyze a set of variables lying in a bounded interval that sum up to a constant (e.g., proportions, rates, compositions, etc.) exhibiting skewness and heteroscedasticity, without having to transform the data. There are two parametrizations for the presented model: one using the common Dirichlet distribution's alpha parameters, and one reparametrizing the alphas to set up a mean-and-dispersion-like model. By applying appropriate link functions, a GLM-like framework is set up that allows for the analysis of such data in a straightforward and familiar way, because interpretation is similar to multinomial logistic regression. This paper gives a brief theoretical foundation and describes the implementation as well as application (including worked examples) of Dirichlet regression methods implemented in the package DirichletReg (Maier, 2013) in the R language (R Core Team, 2013). (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
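The two parametrizations mentioned in this abstract can be related with a few lines of arithmetic. The sketch below is written in Python rather than R and does not use the DirichletReg package; all function names are hypothetical. It assumes the usual mapping in which the precision is the sum of the alphas and each mean is an alpha divided by that sum, with a multinomial-logit link (first category as reference) for the means.

```python
import math


def alphas_to_mean_precision(alphas):
    """Common parametrization -> (means, precision)."""
    precision = sum(alphas)
    return [a / precision for a in alphas], precision


def mean_precision_to_alphas(means, precision):
    """(means, precision) -> common parametrization."""
    return [m * precision for m in means]


def means_from_linear_predictors(etas):
    """Multinomial-logit link; the reference category has its predictor fixed at 0.

    `etas` holds the linear predictors of the non-reference categories.
    """
    weights = [1.0] + [math.exp(e) for e in etas]
    total = sum(weights)
    return [w / total for w in weights]


means, precision = alphas_to_mean_precision([2.0, 3.0, 5.0])
print(means, precision)
print(mean_precision_to_alphas(means, precision))
print(means_from_linear_predictors([0.0, 0.0]))
```

Because the means always sum to one, the mean-and-precision form reads much like a multinomial logistic regression, which is the familiarity the abstract points to.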
217

聯準會模型的國際普遍性與門檻回歸應用 / The International Test and the Threshold Regressive Analysis of the Fed model

潘彥君 Unknown Date
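This thesis tests whether the Fed model holds in six Asian markets: mainland China, India, Malaysia, Singapore, Taiwan, and Thailand. We first run cointegration tests to examine the long-run relationship among the variables; we then test whether the linear indicator model exhibits nonlinear threshold autoregressive behavior. The empirical results show that, under the cointegration tests, stock prices, stock returns, and ten-year bond yields in the six countries are cointegrated in the long run, and that the fitted nonlinear TAR model has greater explanatory power than the linear AR model. / This paper studies the Fed model in six Asian countries: China, India, Malaysia, Singapore, Taiwan, and Thailand. We apply cointegration tests to examine the long-run relationship and build a nonlinear threshold autoregressive (TAR) model relating the long-term government bond yield, the stock index, and the earnings index. Our empirical results show that such a long-run relationship does exist in those countries. In addition, the explanatory power of the TAR model is better than that of the linear AR model.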
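A two-regime threshold autoregressive model of the kind this abstract compares against a linear AR model can be sketched as below. This is a generic self-exciting TAR(1) fitted by grid search over the threshold on a simulated series, not the thesis's specification or data; the function names, the simulated coefficients, and the minimum regime size are assumptions for illustration.

```python
import random


def ols_ssr(x, y):
    """Fit y = a + b*x by least squares; return (a, b, sum of squared residuals)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = 0.0 if sxx == 0 else sxy / sxx
    a = mean_y - b * mean_x
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, ssr


def fit_tar1(series, min_regime_size=10):
    """Grid-search the threshold of a two-regime TAR(1); return (threshold, ssr)."""
    lagged, current = series[:-1], series[1:]
    best = None
    for threshold in sorted(set(lagged)):
        low = [(x, y) for x, y in zip(lagged, current) if x <= threshold]
        high = [(x, y) for x, y in zip(lagged, current) if x > threshold]
        if len(low) < min_regime_size or len(high) < min_regime_size:
            continue  # skip thresholds that leave a regime too small to fit
        ssr = ols_ssr(*zip(*low))[2] + ols_ssr(*zip(*high))[2]
        if best is None or ssr < best[1]:
            best = (threshold, ssr)
    return best


def simulate_tar(n, seed=1):
    """Simulate a series whose AR coefficient depends on the sign of the last value."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        coef = 0.8 if y[-1] <= 0 else -0.5
        y.append(coef * y[-1] + rng.gauss(0.0, 1.0))
    return y


series = simulate_tar(300)
threshold, tar_ssr = fit_tar1(series)
ar_ssr = ols_ssr(series[:-1], series[1:])[2]
print(threshold, tar_ssr, ar_ssr)
```

Because the pooled AR fit is a special case of the two-regime fit, the in-sample residual sum of squares of the TAR model can never be larger, so a formal comparison would also need a threshold test or an out-of-sample evaluation.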
218

Applications of Spatio-temporal Analytical Methods in Surveillance of Ross River Virus Disease

Hu, Wenbiao January 2005
The incidence of many arboviral diseases is largely associated with social and environmental conditions. Ross River virus (RRV) is the most prevalent arboviral disease in Australia. It has long been recognised that the transmission pattern of RRV is sensitive to socio-ecological factors including climate variation, population movement, mosquito-density and vegetation types. This study aimed to assess the relationships between socio-environmental variability and the transmission of RRV using spatio-temporal analytic methods. Computerised data files of daily RRV disease cases and daily climatic variables in Brisbane, Queensland during 1985-2001 were obtained from the Queensland Department of Health and the Australian Bureau of Meteorology, respectively. Available information on other socio-ecological factors was also collected from relevant government agencies as follows: 1) socio-demographic data from the Australia Bureau of Statistics; 2) information on vegetation (littoral wetlands, ephemeral wetlands, open freshwater, riparian vegetation, melaleuca open forests, wet eucalypt, open forests and other bushland) from Brisbane City Council; 3) tidal activities from the Queensland Department of Transport; and 4) mosquito-density from Brisbane City Council. Principal components analysis (PCA) was used as an exploratory technique for discovering spatial and temporal pattern of RRV distribution. The PCA results show that the first principal component accounted for approximately 57% of the information, which contained the four seasonal rates and loaded highest and positively for autumn. K-means cluster analysis indicates that the seasonality of RRV is characterised by three groups with high, medium and low incidence of disease, and it suggests that there are at least three different disease ecologies. 
The variation in spatio-temporal patterns of RRV indicates a complex ecology that is unlikely to be explained by a single dominant transmission route across these three groupings. Therefore, there is need to explore socio-economic and environmental determinants of RRV disease at the statistical local area (SLA) level. Spatial distribution analysis and multiple negative binomial regression models were employed to identify the socio-economic and environmental determinants of RRV disease at both the city and local (ie, SLA) levels. The results show that RRV activity was primarily concentrated in the northeast, northwest and southeast areas in Brisbane. The negative binomial regression models reveal that RRV incidence for the whole of the Brisbane area was significantly associated with Southern Oscillation Index (SOI) at a lag of 3 months (Relative Risk (RR): 1.12; 95% confidence interval (CI): 1.06 - 1.17), the proportion of people with lower levels of education (RR: 1.02; 95% CI: 1.01 - 1.03), the proportion of labour workers (RR: 0.97; 95% CI: 0.95 - 1.00) and vegetation density (RR: 1.02; 95% CI: 1.00 - 1.04). However, RRV incidence for high risk areas (ie, SLAs with higher incidence of RRV) was significantly associated with mosquito density (RR: 1.01; 95% CI: 1.00 - 1.01), SOI at a lag of 3 months (RR: 1.48; 95% CI: 1.23 - 1.78), human population density (RR: 3.77; 95% CI: 1.35 - 10.51), the proportion of indigenous population (RR: 0.56; 95% CI: 0.37 - 0.87) and the proportion of overseas visitors (RR: 0.57; 95% CI: 0.35 - 0.92). It is acknowledged that some of these risk factors, while statistically significant, are small in magnitude. However, given the high incidence of RRV, they may still be important in practice. The results of this study suggest that the spatial pattern of RRV disease in Brisbane is determined by a combination of ecological, socio-economic and environmental factors. 
The possibility of developing an epidemic forecasting system for RRV disease was explored using the multivariate Seasonal Auto-regressive Integrated Moving Average (SARIMA) technique. The results of this study suggest that climatic variability, particularly precipitation, may have played a significant role in the transmission of RRV disease in Brisbane. This finding cannot entirely be explained by confounding factors such as other socio-ecological conditions because they have been unlikely to change dramatically on a monthly time scale in this city over the past two decades. SARIMA models show that monthly precipitation at a lag of 2 months (coefficient = 0.004, p = 0.031) was statistically significantly associated with RRV disease. It suggests that there may be 50 more cases a year for an increase of 100 mm precipitation on average in Brisbane. The predictive values in the model were generally consistent with actual values (root-mean-square error (RMSE): 1.96). Therefore, this model may have applications as a decision support tool in disease control and risk-management planning programs in Brisbane. Polynomial distributed lag (PDL) time series regression models were fitted to examine the associations between rainfall, mosquito density and the occurrence of RRV after adjusting for season and auto-correlation. The PDL model was used because rainfall and mosquito density can affect not merely RRV occurring in the same month, but in several subsequent months. The rationale for the use of the PDL technique is that it increases the precision of the estimates. We developed an epidemic forecasting model to predict incidence of RRV disease. The results show that 95% and 85% of the variation in the RRV disease was accounted for by the mosquito density and rainfall, respectively. The predictive values in the model were generally consistent with actual values (RMSE: 1.25). The model diagnosis reveals that the residuals were randomly distributed with no significant auto-correlation.
The results of this study suggest that PDL models may be better than SARIMA models (R-square increased and RMSE decreased). The findings of this study may facilitate the development of early warning systems for the control and prevention of this widespread disease. Further analyses were conducted using classification trees to identify major mosquito species of Ross River virus (RRV) transmission and explore the threshold of mosquito density for RRV disease in Brisbane, Australia. The results show that Ochlerotatus vigilax (RR: 1.028; 95% CI: 1.001 - 1.057) and Culex annulirostris (RR: 1.013, 95% CI: 1.003 - 1.023) were significantly associated with RRV disease cycles at a lag of 1 month. The presence of RRV was associated with average monthly mosquito density of 72 Ochlerotatus vigilax and 52 Culex annulirostris per light trap. These results may also have applications as a decision support tool in disease control and risk management planning programs. As RRV has significant impact on population health, industry, and tourism, it is important to develop an epidemic forecast system for this disease. The results of this study show the disease surveillance data can be integrated with social, biological and environmental databases. These data can provide additional input into the development of epidemic forecasting models. These attempts may have significant implications in environmental health decision-making and practices, and may help health authorities determine public health priorities more wisely and use resources more effectively and efficiently.
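The relative risks quoted in this abstract come from count regression models, which typically use a log link, so a coefficient and its standard error translate into a relative risk and confidence interval by exponentiation. A minimal sketch of that conversion follows; the function name `relative_risk` and the standard error used in the example are hypothetical, chosen only for illustration and not taken from the thesis.

```python
import math


def relative_risk(beta, std_error, z=1.96):
    """Return (RR, lower, upper) for a log-link coefficient and its standard error."""
    return (
        math.exp(beta),
        math.exp(beta - z * std_error),
        math.exp(beta + z * std_error),
    )


# Illustrative inputs only: a coefficient equal to log(1.12) and an assumed
# standard error of 0.025.
rr, lower, upper = relative_risk(beta=math.log(1.12), std_error=0.025)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The interval is symmetric on the log scale, which is why reported intervals such as those above look slightly asymmetric around the relative risk itself.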
219

[en] COREFERENCE RESOLUTION FOR THE ENGLISH LANGUAGE / [pt] RESOLUÇÃO DE CO-REFERÊNCIA PARA A LÍNGUA INGLESA

ADRIEL GARCIA HERNANDEZ 28 July 2017
[en] One of the problems found in natural language processing systems is the difficulty of identifying textual elements that refer to the same entity; this phenomenon is called coreference.
Solving this problem is an integral part of discourse comprehension, since it allows language users to connect the pieces of speech information concerning the same entity. Consequently, coreference resolution is a key task in natural language processing. Despite the large efforts of existing research, the current performance of coreference resolution systems has not yet reached a satisfactory level. In this work, we describe a structure learning system for unrestricted coreference resolution that explores two techniques: latent coreference trees and automatic entropy-guided feature induction. The latent tree modeling makes the learning problem computationally feasible, since it incorporates a relevant hidden structure. Additionally, using an automatic feature induction method, we can efficiently build enhanced nonlinear models using linear model learning algorithms, namely the structured and sparse perceptron algorithm. We evaluate the system on the English portion of the CoNLL-2012 Shared Task closed-track data set. The proposed system obtains 62.24 percent on the competition's official score. This result is below the 65.73 percent state-of-the-art performance for this task. Nevertheless, our solution significantly reduces the time needed to obtain the clusters of a document: our system takes 0.35 seconds per document on the test set, while the state of the art takes 5 seconds per document.
220

Análise dos resultados da avaliação do SAEB/2003 via regressão linear múltipla / SAEB/2003: analysis of results in Ceará state using multilinear regression

TROMPIERI FILHO, Nicolino January 2007
TROMPIERI FILHO, Nicolino. Análise dos resultados da avaliação do SAEB/2003 via regressão linear múltipla. 2007. 99f. Tese (Doutorado em Educação) – Universidade Federal do Ceará, Faculdade de Educação, Programa de Pós-Graduação em Educação Brasileira, Fortaleza-CE, 2007. / This study aimed to identify, in the SAEB/2003 results for Ceará, a multiple linear model whose variables best explain mathematics test performance in four random samples: fourth-grade students in public schools, fourth-grade students in private schools, eighth-grade students in public schools, and eighth-grade students in private schools. From the variables describing the principal, the teacher, the student, and the physical condition of the school, those significantly related to mathematics test performance in each sample were selected, and a linear regression was fitted to each sample with the stepwise method in SPSS (Statistical Package for the Social Sciences). Four multiple linear models were obtained whose variables give a greater degree of explanation of mathematics test performance in each sample. The study concludes that this type of analysis supports the formulation of public policies that go beyond seeking to improve teaching through isolated interventions.
