Global ETD Search

1	Automatic structure and keyphrase analysis of scientific publications
Constantin, Alexandru. January 2014.
Purpose. This work addresses an escalating problem in scientific publishing: articles are being published at an accelerating rate in formats that are difficult to process automatically. The manual labour required to organise a comprehensive corpus of relevant literature has long been impractical, which has reduced research efficiency and delayed scientific advancement. Two complementary approaches meant to alleviate this problem are detailed and improved beyond the current state of the art: logical structure recovery of articles and keyphrase extraction.
Methodology. The first approach targets the issue of flat-format publishing. It performs a structural analysis of the camera-ready PDF article and recognises its fine-grained organisation into logical units. The second approach applies a keyphrase extraction algorithm that relies on rhetorical information from the recovered structure to better delineate an article's true points of focus. An account of the scientific article's function, content and structure is provided, along with insights into how logical components such as section headings or the bibliography can be automatically identified and used for higher-quality keyphrase extraction.
Findings. Structure recovery can be carried out independently of an article's formatting specifics by exploiting conventional dependencies between logical components. In addition, access to an article's logical structure is beneficial across term extraction approaches, reducing input noise and making it easier to emphasise regions of interest.
Value. The first part of this work details a novel method for recovering the rhetorical structure of scientific articles that is competitive with state-of-the-art machine learning techniques, yet requires no layout-specific tuning or prior training. The second part showcases a keyphrase extraction algorithm that outperforms other solutions on an established benchmark, yet relies on neither collection statistics nor external knowledge sources.
005
2	Document Analysis and Recognition
WATANABE, Toyohide. 20 March 1999.
No description available.
Keywords: layout recognition, document types, logical structure, layout structure, bottom-up, top-down, document model
3	An XML document representation method based on structure and content : application in technical document classification
Chagheri, Samaneh. 27 September 2012.
The rapid growth in the number of documents stored electronically presents a challenge for automatic document classification. Traditional classification systems treat documents as plain text, but documents are becoming more and more structured. For example, XML is the best-known and most widely used standard for representing structured documents. Such documents carry supplementary information about the organisation of their content, represented by elements such as titles, sections and captions. To take the information held in the logical structure into account, we propose an approach to representing structured documents that is based on both the document's logical structure and its textual content. Our approach extends the traditional document representation model known as the Vector Space Model (VSM). We have tried to integrate structural information into all phases of building the document representation: feature extraction, feature selection and feature weighting.
Our second contribution applies this generic approach to a real domain: the classification of technical documents. We aim to apply our proposal to a collection of technical documents stored electronically at CONTINEW, a company specializing in technical document auditing. These documents are in legacy, presentation-oriented formats in which the logical structure is not accessible. We therefore propose a document understanding approach that detects the logical structure of documents from their physical presentation. In this way, a heterogeneous collection of documents in different physical presentations and storage formats is transformed into a homogeneous collection of XML documents sharing the same logical schema. This contribution is based on supervised learning, in which each logical element is described by its physical characteristics. In conclusion, our proposal supports the whole document processing workflow, from the document's original format to the determination of its class. The classification algorithm used in our system is SVM.
Keywords: Computer science, Structured document, XML document, Supervised classification, Logical structure, Restructuring
006.740 72
4	L'énoncé averbal en allemand et en kabyle (berbère) / Non-verbal utterance in German and Kabyle (Berber)
Bouzidi, Said. 10 July 2015.
This study compares the functioning of non-verbal utterances in German and Kabyle (Berber), using Zemb's (1978) semantico-logical triad as a theoretical framework, i.e. the theme (what is being talked about), the rheme (what is said about the theme) and the phème (the place of articulation of modalisation and negation), as applied by Behr and Quintin (1996) and Behr (2013) to the categorisation of German non-verbal utterances. We posit that each language has morphosyntactic, contextual and situational means that contribute to the construction of non-verbal utterances, and that these means are more extensive in Kabyle. We also hypothesise that there are unique semantico-logical structures which can be expressed through varied morphosyntactic structures. Finally, we assume that non-verbal utterances express all modalities and have morphological and/or contextual means of locating them within the temporal framework.
Among other results, we observed that non-verbal utterances are more frequent in Kabyle owing to grammaticalized predicative structures, except for those that syntactically continue the preceding segment, whose frequency in German is due to scrambling. At the syntactic level, the pre- or postposition of the theme relative to the rheme is subject to language-specific constraints, i.e. the state of the noun in Kabyle and the definiteness of the noun phrase in German; constraints specific to non-verbal utterances appear in the preference for rheme-theme order in German. Non-verbal utterances express all modalities; they are located in time by circumstantial adjuncts, by certain demonstratives or by the context; and nominalisations used as existential rhemes express telic and atelic aspectuality.
Keywords: German, Berber, Kabyle, Contrastive linguistics, Semantico-logical structure, Morphosyntactic structure, Intonational structure, Non-verbal predication, Non-verbal utterance
435 493.3
5	Průzkum tržní orientace našich firem / Research on market orientation of our firms
Vlasová, Tereza. January 2008.
The thesis deals with the market orientation of high-tech companies based in the city of Brno. It measures the effects of market orientation on business performance, analyzes the relationship between market orientation and business performance for selected companies, and evaluates the logical structure of the questionnaire used for the marketing research. Based on the research results, the thesis proposes solutions for the companies studied, points out their weaknesses, and concludes with proposals for the structure of the questionnaire and its future use in researching companies' market orientation.

