1 |
A hybrid approach to automatic text summarization
Yuan, Li-An 18 October 2007
Automatic text summarization can save users' time when reading text documents. Its objective is to extract the essential sentences that cover almost all the concepts of a document, so that users can grasp the ideas the document addresses simply by reading the summary. This research develops a hybrid automatic text summarization approach, KCS, to enhance the quality of summaries.
The approach consists of two major components. First, it employs the K-mixture probabilistic model to calculate term weights statistically. It then identifies term relationships between nouns and nouns as well as between nouns and verbs, which yields the connective strength (CS) of nouns. With the connective strengths available, sentence scores can be calculated and ranked for extraction.
We conduct three experiments to evaluate the proposed approach. The quality of a summary is measured by how much it increases text classification accuracy, with the classifier, Naïve Bayes, kept the same across all experiments. The results show that the K-mixture model contributes more to document classification than the traditional TFIDF weighting scheme. It is, however, still no better than CS, a more complex linguistic approach. More importantly, the proposed approach, KCS, performs best among all approaches considered. This implies that KCS can extract more representative sentences from the document, which supports its feasibility in text summarization applications.
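As a rough illustration of the extract-and-rank step described in this abstract, the sketch below scores sentences by the average weight of their terms and keeps the top-ranked ones. It uses plain TF-IDF weights (the baseline the abstract compares against), not the K-mixture weights or connective strengths of KCS, whose formulas are not given here. The function names `tokenize`, `tfidf_weights`, and `summarize` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_weights(sentences):
    """Weight each term by tf * log(N / df), treating sentences as the units."""
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    tf = Counter(t for d in docs for t in d)
    df = Counter(t for d in docs for t in set(d))
    return {t: tf[t] * math.log(n / df[t]) for t in tf}


def summarize(sentences, k=1):
    """Return the k highest-scoring sentences in their original order."""
    weights = tfidf_weights(sentences)
    scored = []
    for i, s in enumerate(sentences):
        toks = tokenize(s)
        # Average term weight, so long sentences are not favoured by length alone.
        score = sum(weights[t] for t in toks) / max(len(toks), 1)
        scored.append((score, i))
    top = sorted(scored, key=lambda p: (-p[0], p[1]))[:k]
    return [sentences[i] for _, i in sorted(top, key=lambda p: p[1])]
```

Swapping `tfidf_weights` for another weighting function would leave the ranking step unchanged, which is the sense in which the abstract's experiments compare weighting schemes.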
|
2 |
A profile of direct foreign investment in Ohio: A nonparametric statistical approach
Wolf, Milton A. January 1993
No description available.
|
3 |
Statistical approach to the elastic property extraction and planar elastic response of polycrystalline thin-films
Choi, Jaehwan 29 September 2004
No description available.
|
4 |
A Survays On Fading Channel Over West - Java Area for Flight Test Radio Telemetering Purposes
Soelaiman, Adi Dharma, Pudjiastuti, Rina 10 1900
International Telemetering Conference Proceedings / October 17-20, 1988 / Riviera Hotel, Las Vegas, Nevada / This paper discusses one approach to characterizing West Java's air and ground segment as a black box for radio wave propagation, especially in the L-band range, by evaluating both topographical data and the radio reception pattern measured at a ground-based telemetry receiving system. All measured signals are random and are assumed to be stationary and ergodic. To characterize the channel for polarization diversity reception, statistical analysis is applied to the measured signal strength of both the RHCP and LHCP components of 1531 MHz waves transmitted from the NC212-200 PK-NZJ aircraft. Computer-calculated correlograms of the measured data are shown, focusing on a radio corridor at radial 265E relative to the ground-based receiving antenna. Curves of the predicted multipath gain factor are also presented to provide theoretical background. At the time of writing, further field experiments on the subject are being conducted.
|
5 |
Comparative Study On Ground Vibrations Prediction By Statistical And Neural Networks Approaches At Tuncbilek Coal Mine, Panel Byh
Akeil, Salah 01 June 2004
In this thesis, ground vibrations induced by bench blasting at the Tunçbilek Coal Mine, Panel BYH, were measured to determine the site-specific attenuation and to assess the structural damage risk. A statistical approach is applied to the collected data, and from the analysis an attenuation relationship is established for predicting the peak particle velocity (PPV) and for calculating the maximum allowable charge per delay. The vibration frequencies are also analyzed to investigate the damage potential to the structures of Tunçbilek Township. A new approach to predicting the peak particle velocity is also proposed: a neural network technique from the field of artificial intelligence is put forward as an alternative to the statistical technique.
According to the USBM (1980) criteria, the findings of this study indicate that bench blasting at the Tunçbilek coal mine, Panel BYH, poses no damage risk to the structures in Tunçbilek Township. It is therefore concluded that the damage claims put forward by the inhabitants of Tunçbilek Township had no scientific basis. It is also concluded that the empirical statistical technique is not the only acceptable approach to predicting the peak particle velocity: the alternative neural network approach also gives satisfactory accuracy when its predictions are compared with an additional set of recorded PPV data.
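The abstract does not state the form of its attenuation relationship. The sketch below assumes the widely used scaled-distance power law, PPV = K · (R / √W)^(−β), with R the distance and W the charge per delay, and fits K and β by least squares in log-log space. The function names and the units in the parameter names are hypothetical, and any data passed in would be the reader's own, not the thesis measurements.

```python
import math


def fit_attenuation(distances_m, charges_kg, ppv_mm_s):
    """Fit PPV = K * SD**(-beta), with scaled distance SD = R / sqrt(W)."""
    xs = [math.log(r / math.sqrt(w)) for r, w in zip(distances_m, charges_kg)]
    ys = [math.log(v) for v in ppv_mm_s]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return math.exp(intercept), -slope  # (K, beta)


def predict_ppv(k, beta, distance_m, charge_kg):
    """Predicted peak particle velocity at a given distance and charge."""
    return k * (distance_m / math.sqrt(charge_kg)) ** (-beta)


def max_charge_per_delay(k, beta, distance_m, ppv_limit):
    """Largest charge whose predicted PPV at distance_m stays at ppv_limit."""
    sd_min = (ppv_limit / k) ** (-1.0 / beta)  # smallest admissible scaled distance
    return (distance_m / sd_min) ** 2
```

Under this assumed law, the maximum allowable charge follows directly from inverting the fitted relationship at the distance of the nearest structure.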
|
6 |
Probabilistic Analysis of Brake Noise : A Hierarchical Multi-fidelity Statistical Approach
Venkatesan, Sreedhar, Banglore Hanumantha Raju, Hariprasad January 2018
Computer Aided Engineering (CAE) driven analysis is gaining ground in the automotive industry. Prediction of brake noise using CAE techniques has become popular due to its low overall cost compared with physical testing. However, the presence of several uncertain parameters that affect brake noise, together with the lack of basic understanding of brake noise, makes it difficult to make reliable decisions based on CAE analysis. The confidence level in CAE techniques therefore has to be increased to ensure reliability and robustness in the CAE solutions that support design work. One way to achieve reliability in CAE analysis is investigated in this thesis: incorporating the effects of different sources of uncertainty and variability in the analysis and estimating the probability of design failure (the probability of brake noise above a certain threshold). While incorporating the uncertainties in the CAE analysis ensures robustness, it is computationally intensive. This thesis aims to gain an understanding of one type of brake noise, creep groan, and to bring robustness into the CAE analysis while reducing computational time. A probabilistic analysis technique called the hierarchical multi-fidelity statistical approach is explored to estimate the probability of design failure, or design robustness, more quickly. It incorporates the stochasticity of the input parameters when running simulations. The method applies a hierarchy of approximations to the system response, computed with variations in mesh resolution, in the number of modes, in the solver time step, etc. Finally, it uses probability theory to relate the information provided by the approximate solutions to the target failure estimate. With this method, reliable data on the probability of design failure were obtained for every simulation at reduced computational time. It also provided information about the critical parameters that influence brake noise, which was valuable for design management. According to the simulation results, estimation of the probability of design failure by this method proved reliable in the case of brake noise, and the method can be considered robust. / Computer Aided Engineering (CAE) driven analysis is gaining ground in the automotive industry, and brake noise is one area where CAE simulations are receiving more attention. The presence of several uncertain parameters that affect brake noise, together with the lack of basic understanding of brake noise, makes it difficult to make reliable decisions based on deterministic CAE analyses alone. The confidence level in CAE analyses therefore has to be increased to ensure their robustness. One way to achieve this is to incorporate the effects of different sources of uncertainty and variability in the CAE analysis and to estimate the probability of design failure. Such a reliability measure (i.e. the probability that a noise event occurs or that the noise level exceeds a threshold) can give car manufacturers an idea of the cost of warranty claims due to brake noise and can serve as a metric for evaluating different design solutions before the final design goes to production. On the one hand, high-fidelity models of the brake/chassis system are generally computationally intensive, so often only a limited number of simulation runs are feasible for uncertainty analysis and design failure risk assessment. On the other hand, analyses of low-fidelity models, typically based on simplifying assumptions made during the development phase, are fast but not always accurate. Striking a good balance between efficiency and accuracy/robustness is an important task when dealing with uncertainty/risk analysis of such complex dynamical systems. To address these issues, a hierarchical multi-fidelity statistical approach has been adopted in this study to estimate the probability of design failure. It employs a hierarchy of approximations to the system response, computed at different fidelities by surrogate modelling, coarser spatial/temporal mesh resolution, a changed solver time step, etc., and uses probability theory to relate the information provided by the approximate solutions to the target failure estimate. This approach opens up the possibility of using low-fidelity models to accelerate the uncertainty quantification of complex brake/chassis systems while providing an unbiased estimate of the system design failure risk/reliability. It also enables management of design changes during fast iterations of the design process. The approach is used to study one brake noise issue, creep groan, to understand its root cause, and to provide design proposals.
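One generic way to combine cheap and expensive models into an unbiased failure estimate is a two-level telescoping estimator: the failure frequency of the low-fidelity model on many samples, plus a correction from a small number of paired runs of both models. The sketch below illustrates that general idea only; it is not claimed to be the estimator used in the thesis, and the function name and the callables passed to it are hypothetical stand-ins for real simulations.

```python
import random


def two_level_failure_probability(sample_input, response_lo, response_hi,
                                  threshold, n_lo, n_hi, rng):
    """Estimate P(response_hi(x) > threshold) from many cheap and few costly runs."""
    # Level 0: failure frequency of the cheap model on many samples.
    lo_fail = sum(response_lo(sample_input(rng)) > threshold for _ in range(n_lo))
    p_lo = lo_fail / n_lo
    # Level 1: correction from paired runs of both models on the same inputs.
    correction = 0
    for _ in range(n_hi):
        x = sample_input(rng)
        correction += (response_hi(x) > threshold) - (response_lo(x) > threshold)
    # E[I_lo] + E[I_hi - I_lo] = E[I_hi], so the sum is unbiased for the
    # high-fidelity failure probability.
    return p_lo + correction / n_hi
```

Because the correction term can be negative, a single estimate may fall slightly outside [0, 1] when the sample sizes are small. The closer the low-fidelity model tracks the high-fidelity one, the fewer expensive runs the correction needs.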
|
7 |
Raspodela i profil zagađujućih jedinjenja u abiotskim i biotskim matriksima multivarijacionom analizom / Distribution and profile of pollutants in biotic and abiotic samples by multivariate statistical approach
Đurišić-Mladenović Nataša 16 November 2012
The dissertation analyses the presence of various persistent pollutants in abiotic and biotic samples from different regions, including soil samples from Novi Sad and its surrounding settlements. The pollutants are of both organic origin (polycyclic aromatic hydrocarbons, polychlorinated biphenyls and organochlorine pesticides) and inorganic origin (heavy elements). The results obtained were combined with relevant data from international publications into data sets whose scope goes beyond the local interest of the individual studies. Multivariate analysis of these data sets was used to determine the pollution level of the investigated samples relative to results from the literature, to examine the structure of the multidimensional data sets in order to analyse the distribution of persistent pollutants in the selected matrices, and to identify common pollution sources. By applying different (mathematical) pretreatments to the data and then analysing them with selected multivariate methods, the influence of pretreatment on the results and on their interpretability was assessed, along with the relationships between the observed variables and the grouping of samples.
The specific aims of the investigation were to:
- determine the similarities and differences that arise from different ways of expressing analytical results (absolute concentration values as opposed to relative percentage fractions, so-called compositional data) within the data sets and when extracting information from multidimensional data sets by multivariate methods;
- determine the influence of different ways of preparing (processing) the data before applying multivariate methods, in order to obtain more complete information for better interpretation of the data and to reduce the dimensionality of the data sets;
- examine regional and temporal differences and/or similarities in the presence of the observed compounds in abiotic and biotic matrices, in order to identify the dominant pollution sources in particular areas and time periods, while characterising the experimentally examined samples relative to samples from other regions.
The results are unique examples of applying multivariate methods to data sets compiled from studies around the world on the occurrence of persistent pollutants in selected abiotic and biotic samples, and thus contribute to the analysis of their general distribution.
|
8 |
AN INQUIRY INTO THE APPLICABILITY OF KANTOROVICH'S APPROACH TO THE THERMODYNAMIC OPTIMIZATION
Dai, Cong 10 1900
The purpose of this research has been to reassess the Ag-Mg system using the CALPHAD technique. Compared with previous assessments, we carry out the optimization by fitting calculations to the original data instead of second-hand information. Moreover, we use a two sub-lattice model and a four sub-lattice model based on compound energy formalism to simulate both first-order and second-order transformations between the FCC phase and the L1₂ phase. Undoubtedly, the CALPHAD technique has achieved a degree of maturity, but its deficiencies are regularly ignored.
In this thesis, we develop an interval method based on Kantorovich's idea to overcome the shortcomings of the CALPHAD technique. Both advantages and disadvantages of the interval method are discussed. We also present an example of the interval approach on thermodynamic optimization of the Ag-Mg melt. The results suggest that this method would be helpful as a pre-optimization tool. / Master of Applied Science (MASc)
|
9 |
Critério estatístico para obtenção de valores de NSPT para previsão da capacidade de carga de estacas por métodos semi empíricos. / Statistical criteria to evaluate NSPT data to predict the load bearing capacity of pile by semi-empiric methods.
Fernando de Paula Vieira 11 February 2015
One of the most challenging tasks in geotechnical engineering is the selection of soil parameter values from field or laboratory tests for use in analytical or numerical models during foundation design. Given the uncertainties inherent in SPT tests and the variety of approaches to using NSPT values, this study proposes a statistical criterion for obtaining NSPT values, based on 95% confidence intervals around the fitted simple linear regression line between the random variable NSPT and depth. The NSPT values obtained by this criterion were used to predict the load bearing capacity of 19 isolated continuous flight auger piles using three semi-empirical methods: Aoki-Velloso (1975) with coefficients modified by Monteiro (1997), Décourt & Quaresma (1978) modified by Décourt (1996), and Alonso (1996). The failure loads of these 19 piles, tested by static load tests, were obtained by the extrapolation methods of Van Der Veen (1953) and Décourt (1996) and were used for comparison and consequent validation of the statistical criterion. Additionally, on the basis of item 6.2.1.2.1 of ABNT NBR 6122:2010, "Resistance calculated by semi-empirical method", the safety factors relative to the design loads were evaluated, including under the premise of recognizing representative regions and taking into account the number of SPT tests performed, which reduces the uncertainty of the parameters and points to a lower safety factor. The dissertation emphasizes the advantages of an adequate statistical treatment of geotechnical parameters, in line with recommendations already present in international standards such as the Eurocode. The proposed criterion allows and encourages rational analyses and decisions among all interested parties (consumers, designers, inspectors, contractors and the scientific community), promoting more objective and harmonious discussion of the subject.
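The confidence limits around a simple linear regression line can be sketched as follows. This assumes the standard confidence band for the mean response, ŷ ± t · s · √(1/n + (x₀ − x̄)²/Sxx). The abstract does not say whether this band or a prediction band is used, nor which limit is taken as the design value, so both are left open. The function name is hypothetical, and the Student t critical value is passed in by the caller rather than computed.

```python
import math


def regression_confidence_band(depths, n_values, t_crit):
    """Fit n_values against depths; return a function giving (lower, fit, upper)."""
    n = len(depths)
    mx = sum(depths) / n
    my = sum(n_values) / n
    sxx = sum((x - mx) ** 2 for x in depths)
    sxy = sum((x - mx) * (y - my) for x, y in zip(depths, n_values))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(depths, n_values))
    s = math.sqrt(sse / (n - 2))

    def band(depth):
        fit = intercept + slope * depth
        half = t_crit * s * math.sqrt(1.0 / n + (depth - mx) ** 2 / sxx)
        return fit - half, fit, fit + half

    return band
```

For a two-sided 95% band with six observations (four degrees of freedom), `t_crit` would be 2.776. The band is narrowest at the mean depth and widens toward the ends of the borehole.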
|
10 |
Approche hybride pour la reconnaissance automatique de la parole en langue arabe / Hybrid approach for automatic speech recognition for the Arabic language
Masmoudi Dammak, Abir 21 September 2016
The development of a speech recognition system requires a large amount of resources, namely large text and speech corpora and a pronunciation dictionary. These resources, however, are not directly available for Arabic dialects. Developing an automatic speech recognition (ASR) system for Arabic dialects therefore faces multiple difficulties, namely the lack of large amounts of resources and the absence of a standard orthography, since these dialects are spoken rather than written. In this context, the work of this thesis contributes to the development of an ASR system for the Tunisian dialect. The first part of the contributions consists of developing a variant of CODA (Conventional Orthography for Dialectal Arabic) for the Tunisian dialect. This convention is designed to provide a detailed description of the guidelines applied to the Tunisian dialect. Following the CODA guidelines, we built our corpus, named TARIC (Tunisian Arabic Railway Interaction Corpus), in the domain of the SNCFT. Beyond these resources, a pronunciation dictionary is indispensable for developing an ASR system. Accordingly, the second part of the contributions aims to create a grapheme-to-phoneme (G2P) conversion system that generates this phonetic dictionary automatically. All the resources described above are used to adapt the LIUM laboratory's ASR system for MSA to the Tunisian dialect in the SNCFT domain. The evaluation of our system yielded a WER of 22.6% on the test set.
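For readers unfamiliar with the WER figure quoted in this abstract, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between the recognizer output and the reference transcript, divided by the number of reference words. The sketch below is a minimal version of that definition; the function name is hypothetical, it assumes whitespace tokenization and a non-empty reference, and it is not the scoring tool used in the thesis.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the processed ref prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.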
|