Global ETD Search

191	Redução de dimensionalidade aplicada à diarização de locutor / Dimensionality reduction applied to speaker diarization Silva, Sérgio Montazzolli January 2013 A large amount of multimedia data is generated every day. These data come from various sources, such as radio and television broadcasts, recordings of lectures and meetings, telephone conversations, and videos and photos captured by mobile phone. As a result, interest in automatic multimedia data transcription has grown in recent years; in voice processing, the notable areas are Speaker Recognition, Speech Recognition, Speaker Diarization and Speaker Tracking. The development of these areas has been driven and guided by NIST, which periodically runs evaluations of the state of the art. Since 2000, Speaker Diarization has emerged as one of the main research fronts in voice data transcription, and NIST has evaluated it several times over the last decade. The objective of the task is to find the number of speakers in an audio recording and to label their respective speech segments without any information being provided in advance. In other words, the goal is to answer the question "Who spoke when?". A major problem in this area is obtaining a good model for each speaker in the audio, given the limited amount of information available and the high dimensionality of the data. In this work, besides building a Speaker Diarization System, we address this problem by reducing the dimensionality of the data through statistical analysis. 
We use Principal Component Analysis, Linear Discriminant Analysis and the recently introduced Fisher Linear Semi-Discriminant Analysis. The latter uses a static initialization method; we propose a dynamic one based on the detection of speaker change points. We also investigate how these analyses behave when multiple short-term parameterizations of the acoustic signal are used simultaneously. The results show that it is possible to maintain, or even improve, system performance while substantially reducing the number of dimensions. This speeds up the execution of Machine Learning algorithms and reduces the amount of memory required to store the data. Natural language processing; Computational voice; Speaker diarization; Discriminant analysis; Dimensionality reduction
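As an illustration of the discriminant-analysis step named in entry 191, the following is a minimal sketch of two-class Fisher Linear Discriminant Analysis that projects 2-D feature vectors onto a single dimension. It is not the thesis's system: the toy feature values, the two-speaker setup and all function names are assumptions made for this example, and plain Python is used only to keep it self-contained.

```python
# Two-class Fisher LDA in pure Python: reduce 2-D points to 1-D.
# All data and names here are hypothetical, chosen for illustration.

def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, m):
    # Within-class scatter: sum of outer products of the centered points.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Fisher direction: w = Sw^{-1} (m_a - m_b)
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def project(points, w):
    # Each 2-D point becomes a single number along direction w.
    return [p[0] * w[0] + p[1] * w[1] for p in points]

# Hypothetical 2-D features for two speakers.
speaker_a = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
speaker_b = [(4.0, 4.5), (4.5, 4.0), (5.0, 4.8), (4.2, 5.1)]

w = fisher_direction(speaker_a, speaker_b)
proj_a = project(speaker_a, w)
proj_b = project(speaker_b, w)
print(min(proj_a) > max(proj_b))  # the 1-D projections do not overlap
```

With this toy data the two classes remain separable after the reduction from two dimensions to one, which is the property the abstract relies on when it reports that performance can be kept while dimensions are removed.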
192	Classificação de gasolinas comerciais através de métodos estatísticos multivariáveis. / Classification of commercial gasoline through multivariable statistical methods. Marcelo Aparecido Mendonça 29 March 2005 This work studies the application of multivariable statistical methods to the classification of commercial gasoline in accordance with the applicable legislation in Brazil. At present, the ANP bases the classification of gasoline on lower and upper bounds defined for a number of physico-chemical properties. The objective of this work is to propose a methodology for pre-sorting the samples collected by the Fuel Quality Monitoring Program by means of a classification method. The method uses NIR spectroscopy, a fast and non-destructive technique, as the analytical method. This would make it possible to reduce the number of physico-chemical tests, since they would not need to be performed systematically on every sample, reducing costs and increasing the number of gas stations that can be monitored. NIR analyses produce large quantities of data, which calls for multivariable statistical techniques to establish the classification methodologies. The well-established PCA and PLS techniques are used for data compression, and LDA and QDA for sample classification. The data analyzed comprise the physico-chemical properties and NIR spectra of a set of 216 commercial gasoline samples, used to design the classification models, and of another set of 50 samples, used to validate them. The models tested were the combinations PCA-LDA, PCA-QDA, PLS-LDA and PLS-QDA, PLS (regression), and graphical analysis of the score plots (biplot). The best performance was obtained with the score plots, followed by PLS regression, PLS-QDA, PCA-QDA and PLS-QDA. 
Some steps remain before the classification of commercial gasoline through NIR becomes practical. However, this study has shown that the methodology is technically feasible. classification; discriminant analysis; gasoline; multivariable methods; PCA; PLS
194	Contribuição ao estudo da solvência empresarial: uma análise de modelos de previsão - estudo exploratório aplicado em empresas mineiras / Contribution to the study of the business solvency: an analysis of forecast models. Poueri do Carmo Mário 06 February 2002 This study is a retrospective analysis of models developed in Brazil for forecasting company insolvency, and aims to evaluate the application of quantitative methods to the analysis of financial statements. Evaluating whether a company is a going concern is considered relevant, and where facts to the contrary can be identified, forecasting models are important for decisions on extending credit, both in financial intermediation by banks and in commercial transactions between suppliers and clients. 
From the latter, inferences can be made about whether or not a composition of debt (concordata) should be granted to a company, with the models serving as tools for analyzing the company's capacity to fulfill the composition agreement, an issue explored in this research. By applying the models to a sample of companies that had applied for composition of debt, it was possible to evaluate whether the models retained their capacity to distinguish the companies that would succeed in it. The statistical tool used is Discriminant Analysis, a multivariate analysis technique that seeks to classify the data into two specific groups, defined in this study as solvent companies and insolvent companies. It was found that the assumptions required by Discriminant Analysis can limit, but not invalidate, these models. The sample data need to be assessed to verify whether the technique can be used, and the models need to be recalculated recurrently. This limitation was reduced when the models were used together or in an integrated way, as the tests showed. The study also built a model that combines the best indicators of the models analyzed, yielding a forecasting model that can be considered hybrid or mixed. This model was tested for its capacity to evaluate whether the companies would complete their composition of debt and for its capacity to discriminate between the two groups described above (Solvent and Insolvent), both consisting of companies located in Belo Horizonte, Betim and Contagem. As noted, the use of these models has limitations, starting with the Discriminant Analysis tool itself. 
Nevertheless, using them can make more objective the decision on whether to grant a composition of debt to a company, or even a special credit line to a customer of a supplier or of a bank that is in this situation. It was therefore found possible to forecast, from the financial statements of the companies studied, their tendency toward solvency or insolvency, and to assess whether they would succeed with the composition of debt. Bankruptcy; Concordat; Discriminant Analysis; Forecast Models; Insolvency
195	The use of permanent maxillary and mandibular canines in sex and age determination in a South African sample Ackermann, Anja January 2013 Dental anthropologists study the variation around the common shared patterns of teeth. These differences in the development, size and morphology of teeth are often used to help estimate the age and sex of unknown individuals. The aim of the study was two-fold. Firstly, it was determined whether sexually dimorphic characteristics exist in the size of permanent canines of South Africans, and whether these differences are large enough to make them usable as a method of determining sex from unknown remains. For this purpose the mesiodistal and buccolingual crown diameters and the maxillary/mandibular canine index were used. Secondly, the Lamendin technique of age estimation was tested and adapted to a South African sample. The study therefore aimed to assess the usability of human permanent canines in determining two demographic characteristics, namely sex and age, in a South African sample. A sample of known sex, age and population group was obtained from the Pretoria Bone Collection (University of Pretoria, South Africa) and the Raymond A. Dart Collection (University of Witwatersrand, Johannesburg, South Africa). The canines of 498 skulls were measured from four groups, namely black males, black females, white males and white females. The age of the sample ranged from 20 to 90 years. Using discriminant function analysis, it was possible to differentiate between the sexes with a relatively good accuracy of up to 87%. It was also evident that the two populations differed from one another in tooth size. Lamendin’s method of age estimation yielded poor precision and accuracy. Periodontosis was better correlated with age than root transparency, where the highest R² value was 0.35. 
In summary, the dimensions of the canine appear useful for estimating sex, provided the population group is known. The Lamendin technique, however, gave relatively poor results even though new population-specific formulae were created for the black and white populations of this sample. It could only estimate the age of the sample with an R² value of 0.41 and mean errors ranging from 12.02 to 15.76 years. / Dissertation (MSc)--University of Pretoria, 2013. / Anatomy / unrestricted Canines; Mesiodistal; Buccolingual; Sex; Age; Lamendin; Discriminant; Periodontosis; Root height; UCTD
196	Comparison of sexually dimorphic patterns in the postcrania of South Africans and North Americans Krüger, Gabriele Christa January 2015 While postcraniometric sex estimation has shown promising results in North American (NA) samples, methods and standards for sex estimation in South Africa (SA) are restricted by incomplete samples and a lack of robust statistical techniques. The purpose of this study was to evaluate the accuracy of sex estimation from the postcrania of modern South Africans using multivariate statistics and to compare the expression of sexual dimorphism in black, white and coloured groups. The study analysed the skeletons of a total of 360 SA black, white and coloured individuals and the data of 240 NA black and white individuals (equal sex and ancestry). Sexual dimorphism was expressed in sympercents, which were compared among the three SA groups and with the NA individuals. The creation of different bone models and a variety of multivariate models revealed the potential of multivariate techniques. Comparisons of linear discriminant analysis (LDA), flexible discriminant analysis (FDA) and logistic regression indicated which model provided the greatest discriminatory power between sex and sex-ancestry groups in SA. Among the SA groups, coloureds were the most sexually dimorphic; overall, however, NA individuals showed the greatest differences between the sexes. Multivariate classification accuracies using bone models (various measurements from individual bones) ranged between 75% and 91%, whereas classification accuracies using multivariate subsets (combinations of measurements from different bones) ranged from 85% to 98%. When classifying into sex and ancestry, a multivariate subset using eight measurements achieved classification accuracies of up to 80%. Overall, FDA achieved the best results, whereas logistic regression achieved the lowest results for both bone models and multivariate subsets. 
Postcranial bones achieve classification accuracies comparable to the pelvis and higher than metric or morphological techniques using the cranium in SA. Large differences in sexual dimorphism between NA and SA warrant the creation of population-specific standards and custom databases for SA. / Dissertation (MSc)--University of Pretoria, 2015. / Anatomy / MSc / Unrestricted Physical Anthropology; Sex estimation; Sexual dimorphism; Flexible Discriminant Analysis (FDA); Sympercents; Anthropology; UCTD
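Entry 196 expresses sexual dimorphism in sympercents. A sympercent is commonly defined as 100 times the difference of the natural logarithms of two values, which makes the measure symmetric in the values being compared. The sketch below assumes that definition; the function name and the measurement values are hypothetical and are not taken from the dissertation.

```python
import math

def sympercent(a, b):
    """Symmetric percentage difference: 100 * (ln a - ln b)."""
    return 100.0 * (math.log(a) - math.log(b))

# Hypothetical mean values (mm) of one bone measurement, for illustration.
male_mean = 460.0
female_mean = 430.0

dimorphism = sympercent(male_mean, female_mean)
reverse = sympercent(female_mean, male_mean)

# Swapping the arguments only flips the sign, unlike an ordinary percentage.
print(round(dimorphism, 2), round(reverse, 2))
```

Because the value is the same size in both directions, it can be compared across groups without choosing one sex as the reference, which is presumably why it suits a comparison between populations.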
197	A comparative study of the capital structures of liquid and liquidity-stressed banks Momberume, Richard 24 July 2013 M.Comm. (Financial Management) / The costs of the 2007-09 financial crisis to global economies have resulted in new central bank rules to strengthen financial institutions. The question of whether there were any significant differences in capital structure between banks that were liquid and those that were liquidity-constrained during the 2007-2009 global financial crisis still needs to be answered. Theoretical models of corporate failure partly explain how bank capital management affects whether a bank fails. This study investigates the differences in capital ratios between banks that were liquidity-stressed and those that were liquid. A comparative analysis of selected banking capital ratios was carried out, followed by a discriminant analysis to determine whether there is a relationship between the capital structures of liquid and liquidity-stressed banks. It was found that the capital structures of liquid and liquidity-stressed banks differed, but that capital ratios on their own could not be used as an early warning sign of bank failure. Bank liquidity; Bank failures; Discriminant analysis; Global Financial Crisis, 2008-2009; Basel II (2004)
198	Real-time Embedded Age and Gender Classification in Unconstrained Video Azarmehr, Ramin January 2015 Recently, automatic demographic classification has found its way into embedded applications such as targeted advertising on mobile devices and in-car warning systems for elderly drivers. In this thesis, we present a complete framework for video-based gender classification and age estimation that performs accurately on embedded systems in real time and under unconstrained conditions. We propose a segmental dimensionality reduction technique utilizing Enhanced Discriminant Analysis (EDA) to minimize memory and computational requirements and to enable these classifiers to be implemented on resource-limited embedded systems, which is not achievable with existing resource-intensive approaches. On a multi-resolution feature vector we achieved a compression ratio of up to 99.5% for training data storage, and a maximum performance of 20 frames per second on an embedded Android platform. We also introduce several novel improvements, such as face alignment using the nose and an illumination normalization method for unconstrained environments using bilateral filtering. These improvements help to suppress textural noise, normalize skin color, and rectify face localization errors. A non-linear Support Vector Machine (SVM) classifier, together with a discriminative demography-based classification strategy, is used to improve both the accuracy and the performance of classification. We performed several cross-database evaluations on different controlled and uncontrolled databases to assess the generalization capability of the classifiers. Our experiments demonstrated accuracies competitive with resource-demanding state-of-the-art approaches. Support Vector Machine; Linear Discriminant Analysis; Principal Component Analysis; Local Binary Patterns
199	Catégorisation par mesures de dissimilitude et caractérisation d'images en multi échelle / Classification by dissimilarity data and Multiresolution Image Analysis Manolova, Agata 11 October 2011 In this thesis, we introduce the "Shape Coefficient" metric for classifying dissimilarity data. The approach is inspired by geometrical discriminant analysis, and decision rules are defined to mimic the behavior of the linear and quadratic classifiers. The number of parameters is limited (two per class). We also extend and improve this fast and advantageous approach to learn solely from dissimilarity representations by exploiting the effectiveness of the Support Vector Machine classifier. As an application context for dissimilarity-based classification, we use image retrieval with a multiscale image representation based on the "Reduced Differential Pyramid". An application to face description is developed. Classification results obtained with the Shape Coefficient and with an adapted version of Support Vector Machines, on databases from real-world applications, are presented and compared with other dissimilarity-based classification methods. The proposed method proves highly robust, with performance equal to or better than state-of-the-art algorithms. / The dissimilarity representation is an alternative to the use of features in the recognition of real-world objects such as images, spectra and time signals. Instead of an absolute characterization of objects by a set of features, the expert or the system is asked to define a measure that estimates the dissimilarity between pairs of objects. Such a measure may also be defined for structural representations such as strings and graphs. 
The dissimilarity representation is potentially able to bridge structural and statistical pattern recognition. In this thesis we introduce a new, fast, Mahalanobis-like metric, the “Shape Coefficient”, for the classification of dissimilarity data. Our approach is inspired by Geometrical Discriminant Analysis, and we have defined decision rules to mimic the behavior of the linear and quadratic classifiers. The number of parameters is limited (two per class). We also extend and improve this fast and advantageous adaptive approach to learn only from dissimilarity representations by using the effectiveness of the Support Vector Machines classifier for real-world classification tasks. Several methods for incorporating dissimilarity representations are presented, investigated and compared to the “Shape Coefficient” in this thesis: Pekalska and Duin prototype dissimilarity-based classifiers; Haasdonk's kernel-based SVM classifier; and the KNN classifier. Numerical experiments on artificial and real data show interesting behavior compared to Support Vector Machines and to the KNN classifier: (a) lower or equivalent error rate, (b) equivalent CPU time, (c) more robustness with sparse dissimilarity data. The experimental results on real-world dissimilarity databases show that the “Shape Coefficient” can be an alternative to these known methods and can be as effective as them in terms of classification accuracy. Classification; Categorization; Image Classification; Image Description; Dissimilarity data; Discriminant Analysis
200	Effects of age and schooling on 22 ability and achievement tests Gambrell, James Lamar 01 May 2013 Although much educational research has investigated the relative effectiveness of different educational interventions and policies, little is known about the absolute net benefits of K-12 schooling independent of growth due to chronological age and out-of-school experience. The nearly universal policy of age tracking in schools makes this a difficult topic to investigate. However, a quasi-experimental regression discontinuity design can be used to separate observed test score differences between grades into independent age and schooling components, yielding an estimate of the net effects of school exposure at each grade level. In this study, a multilevel version of this design was applied to scores on 22 common ability and achievement tests from two major standardized test batteries. The ability battery contained 9 measures of Verbal, Quantitative, and Figural reasoning. The achievement battery contained 13 measures in the areas of Language, Mathematics, Reading, Social Studies, Science, and Sources of Information. The analysis was based on a sample of over 20,000 students selected from a longitudinal database collected by a large U.S. parochial school system. The theory of fluid (Gf) and crystallized (Gc) intelligence predicts that these tests will show systematically different levels of sensitivity to schooling. Indeed, the achievement (Gc) tests were found to be three times more sensitive to schooling than they were to aging (one-year effect sizes of .41 versus .15), whereas the ability (Gf) tests were equally influenced by age (.18) and schooling (.19). Nonetheless, the schooling effect on most Gf tests was substantial, especially when the compounding over a typical school career is considered. 
This replicates the results of previous investigations of age and schooling using regression discontinuity methods and once again contradicts common interpretations of fluid ability. Different measures of a construct often exhibited varying levels of school sensitivity. Those tests that were less sensitive to schooling generally required reading, reasoning, transfer, synthesis, or translation; posed a wider range of questions; and/or presented problems in an unfamiliar format. Quantitative reasoning tests showed more sensitivity to schooling than figural reasoning tests, while verbal reasoning tests occupied a middle ground between the two. Schooling had the most impact on basic arithmetic skills and mathematical concepts, and a significantly weaker impact on the solution of math word problems. School-related gains on isolated language skills were much larger than gains on solving grammar problems in context. The weakest schooling impact overall was on reading comprehension where effects were no larger than those on verbal ability measures. An interesting dichotomy was found between spelling and paper folding (a measure of figural and spatial reasoning). Spelling skills showed robust schooling effects but a consistently negative age slope, a puzzling result which indicates that younger students in each group outperformed older students. Paper folding showed the opposite pattern, a large age effect and a small but consistently negative schooling effect. Results serve to rebut skepticism about both the impact of schooling on test scores and the validity of distinctions between ability and achievement. It is argued that the regression discontinuity design has great potential in the measurement of school effectiveness, while also offering a source of validity evidence for test developers and test users. Implications for theories of cognitive ability and future research on schooling effects are discussed. 
Age Effect; Discriminant Validity; Fluid Intelligence; Intelligence; Schooling Effect; Test Validity; Educational Psychology
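Entry 200 separates score differences between grades into an age component and a schooling component. The sketch below shows the basic idea on noise-free toy data, under stated assumptions: students within a grade differ in age by up to a year, the within-grade slope of score on age estimates the age effect, and whatever remains of the gap between adjacent grades is attributed to schooling. It is a simplified, single-level illustration rather than the multilevel design used in the dissertation, and the constants, function names and data are all hypothetical (the two effect values are borrowed from the abstract only as convenient toy coefficients).

```python
# Simplified age/schooling decomposition on noise-free toy data.
# AGE_EFFECT and SCHOOL_EFFECT are illustrative values, not a reanalysis.
AGE_EFFECT = 0.15
SCHOOL_EFFECT = 0.41

def make_students(grades=(3, 4)):
    """Return (grade, age_in_years, score) rows; ages in a grade span one year."""
    rows = []
    for grade in grades:
        for month in range(12):
            age = grade + 5 + month / 12.0
            score = AGE_EFFECT * age + SCHOOL_EFFECT * grade
            rows.append((grade, age, score))
    return rows

def grade_means(rows, grade):
    """Mean age and mean score of the students in one grade."""
    subset = [r for r in rows if r[0] == grade]
    n = len(subset)
    return sum(r[1] for r in subset) / n, sum(r[2] for r in subset) / n

def within_grade_age_slope(rows):
    """Pooled slope of score on age after centering within each grade."""
    num = den = 0.0
    for grade in sorted({r[0] for r in rows}):
        mean_age, mean_score = grade_means(rows, grade)
        for g, age, score in rows:
            if g == grade:
                num += (age - mean_age) * (score - mean_score)
                den += (age - mean_age) ** 2
    return num / den

def schooling_jump(rows, lower, upper, age_slope):
    """Between-grade score gap minus the part explained by the age gap."""
    age_lo, score_lo = grade_means(rows, lower)
    age_hi, score_hi = grade_means(rows, upper)
    return (score_hi - score_lo) - age_slope * (age_hi - age_lo)

students = make_students()
age_slope = within_grade_age_slope(students)
school_gain = schooling_jump(students, 3, 4, age_slope)
print(round(age_slope, 3), round(school_gain, 3))
```

On real data the scores would be noisy and students would be nested in schools, which is where a multilevel model of the kind the abstract mentions would come in.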
