Global ETD Search

271	Leyline : a provenance-based desktop search system using graphical sketchpad user interface Ghorashi, Seyed Soroush 07 December 2011 While there are powerful keyword search systems that index all kinds of resources, including emails and web pages, people have trouble recalling semantic facts such as the name, location, edit dates and keywords that uniquely identify resources in their personal repositories. Reusing information exacerbates this problem. A rarely used approach is to leverage episodic memory of file provenance. Provenance is traditionally defined as "the history of ownership of a valued object". In terms of documents, we consider not only the ownership but also the operations performed on the document, especially those that relate it to other people, events, or resources. This thesis investigates the potential advantages of using provenance data in desktop search, and consists of two manuscripts. First, a numerical analysis using field data from a longitudinal study shows that provenance information can effectively be used to identify files and resources in realistic repositories. We introduce the Leyline, the first provenance-based search system that supports dynamic relations between files and resources such as copy/paste, save as, and file rename. The Leyline allows users to search by drawing search queries as graphs in a sketchpad. The Leyline overlays provenance information that may help users identify targets or explore information flow. A limited controlled experiment showed that this approach is feasible in terms of time and effort. Second, we explore the design of the Leyline and compare it to previous provenance-based desktop search systems, including their underlying assumptions and focus, search coverage and flexibility, and features and limitations.
/ Graduation date: 2012. Provenance; Documents; User Interface; Keyword Search; Information Search and Retrieval; Query formulation; Search process; File Organization; Desktop Search; User interaction styles; User-centered design
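The kind of provenance query described in the Leyline abstract can be illustrated with a small sketch. It stores relations such as save as, rename and copy/paste as labelled edges and follows a sequence of relations from a starting file. The class, method names, edge labels and file names are hypothetical and are not taken from the thesis; the real system uses a sketchpad interface rather than code.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Toy store of labelled, directed relations between resources."""

    def __init__(self):
        # source -> list of (relation, target)
        self._edges = defaultdict(list)

    def add(self, source, relation, target):
        self._edges[source].append((relation, target))

    def targets(self, source, relation):
        """Resources reached from `source` by one edge labelled `relation`."""
        return [t for r, t in self._edges[source] if r == relation]

    def match_path(self, start, relations):
        """Resources reached from `start` by following `relations` in order."""
        frontier = [start]
        for relation in relations:
            frontier = [t for s in frontier for t in self.targets(s, relation)]
        return sorted(set(frontier))


graph = ProvenanceGraph()
graph.add("report_v1.doc", "save_as", "report_v2.doc")
graph.add("report_v2.doc", "rename", "final_report.doc")
graph.add("budget.xls", "copy_paste", "report_v2.doc")

# Query: "the file that was saved as a new copy and then renamed".
result = graph.match_path("report_v1.doc", ["save_as", "rename"])
print(result)
```

A path query is only one possible query shape; a graph drawn by a user could also branch, which this sketch does not cover.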
272	Ontologias de domínio na interpretação de consultas a bancos de dados relacionais / Domain ontologies in query interpretation to relational databases Marins, Walquíria Fernandes 08 October 2015 There is a huge amount of data and information stored digitally. Part of this data is available for consultation through the Web; however, a significant portion is hidden because of the way it is stored and cannot be retrieved by traditional search engines. As a result, users face a common and growing challenge when searching for specific information. The challenge is aggravated by users' lack of preparation in formulating searches and by the inherent limitations of search technologies, which, because of syntactic differences, fail to find data that is probably relevant. Aiming to expand the possibilities for semantic interpretation of a query, this work proposes an approach to interpreting natural-language queries to relational databases that uses domain ontologies as a tool for their interpretation and semantic enrichment.
With the approach implemented and evaluated against a scenario of queries without semantic enrichment, we observe that it contributes satisfactorily to identifying the user's intention. / Há uma quantidade enorme de dados e informações armazenadas digitalmente. Uma parte desse volume de dados está disponível para consultas através da Web, entretanto, uma parcela significativa está oculta, devido à sua forma de armazenamento, e não pode ser recuperada pelos mecanismos tradicionais de busca. Isso faz com que os usuários enfrentem um desafio comum e crescente para buscar e encontrar informações específicas. Este desafio é potencializado pelo despreparo dos usuários em formular buscas e pelas limitações inerentes às tecnologias de pesquisa que, em virtude de diferenças sintáticas, não conseguem encontrar dados provavelmente relevantes. Visando ampliar as possibilidades de interpretação semântica de uma consulta, este trabalho propõe uma abordagem de interpretação de consultas em linguagem natural a bancos de dados relacionais através do uso de ontologias de domínio como instrumento para sua interpretação e enriquecimento semântico. Com a implementação da abordagem e sua apreciação em relação a um cenário de consultas sem o enriquecimento semântico observa-se que a abordagem contribui satisfatoriamente para identificar a intenção do usuário. Bancos de dados Web; Bancos de dados relacionais; Consulta por palavras-chave; Ontologia de domínio; Interpretação semântica da consulta; Web databases; Relational databases; Keyword-based query; Domain ontology; Semantic query interpretation
273	以型態組合為主的關鍵詞擷取技術在學術寫作字彙上的研究 / A pattern approach to keyword extraction for academic writing vocabulary 邵智捷, Shao, Chih Chieh Unknown Date 隨著時間的推移演進，人們瞭解到將知識經驗著作成文獻典籍保存下來供後人研究開發的重要性。時至今日，以英語為主的學術寫作論文成為全世界最主要的研究交流媒介。而對於英語為非母語的研究專家而言，在進行英語學術寫作上常常會遇到用了不適當的字彙或搭配詞導致無法確切的傳達自己的研究成果，或是在表達上過於貧乏的問題，因此英語學術寫作字彙與搭配詞的學習與使用就顯得相當重要。在本研究中，我們藉由收集大量不同國家以及不同研究領域的學術論文為基礎，建構現實中實際使用的語料庫，並且建立數種詞性標籤型態，使用關鍵詞擷取(Keyword Extraction)技術從中擷取出學術著作中常用的學術寫作字彙候選詞，當作是學術常用寫作字彙之初步結果，隨即將候選詞導入關鍵詞分析的指標形態模型，將候選詞依照指標特徵選出具有代表指標意義的進一步候選詞。在實驗方面，透過對不同範圍的樣本資料進行篩選，並導入統計上的方法對字彙進行不同領域共通性的分析檢證，再加上輔助篩選的機制後，最後求得名詞和動詞分別在學術寫作中常用的字彙，也以此字彙為基礎，發掘出語料庫中常用的搭配詞組合，提出以英語為外國語的研究學者以及學生在學術寫作上的常用字彙與搭配詞組合作為參考，在學術寫作上能夠提供更多樣性且正確的研究論述的協助。 / Over time, people have come to recognize the importance of recording their knowledge and experience in written works and preserving them for future research. Today, research papers written in English are the world's leading medium of scholarly communication. Researchers who are not native English speakers often use inappropriate vocabulary or collocations, so they fail to convey their results accurately or express them poorly. Learning and using appropriate English academic vocabulary and collocations is therefore very important. In this study, we built a corpus of real academic papers from different countries and research fields. A keyword extraction technique based on several part-of-speech (PoS) tag patterns is used to capture candidate academic writing vocabulary that is common in academic works, which serves as the initial result. The candidate words are then passed to an index model for keyword analysis, which selects further, more representative candidates according to their index characteristics.
In the experiments, sample data from different ranges were filtered, and statistical methods were used to analyze and verify how common the vocabulary is across fields. With an auxiliary filtering mechanism added, we obtained the nouns and verbs most commonly used in academic writing. Based on this vocabulary, we identified common collocations in the corpus and offer them, together with the vocabulary, as a reference for researchers and students who use English as a foreign language. We hope the study helps them write richer and more accurate research papers. 關鍵字擷取; 英語學習; 學術字彙; 學術字彙列表; 詞性標籤型態; keyword extraction; English learning; academic vocabulary; academic word list; AWL; PoS tag patterns
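A minimal sketch of pattern-based candidate extraction as outlined in the abstract above, assuming the corpus has already been tokenized and PoS-tagged. The tag patterns, the Penn Treebank-style tags and the two sample sentences are illustrative assumptions; the thesis does not list its actual patterns here.

```python
from collections import Counter

# Hypothetical PoS tag patterns (Penn Treebank-style tags).
PATTERNS = [("JJ", "NN"), ("NN", "NN"), ("NN",), ("VB",)]


def extract_candidates(tagged_sentences, patterns=PATTERNS):
    """Count word sequences whose tag sequence equals one of the patterns."""
    counts = Counter()
    for sentence in tagged_sentences:
        for pattern in patterns:
            n = len(pattern)
            for i in range(len(sentence) - n + 1):
                window = sentence[i:i + n]
                if tuple(tag for _, tag in window) == pattern:
                    phrase = " ".join(word.lower() for word, _ in window)
                    counts[phrase] += 1
    return counts


corpus = [
    [("We", "PRP"), ("propose", "VB"), ("a", "DT"),
     ("novel", "JJ"), ("method", "NN")],
    [("The", "DT"), ("method", "NN"), ("uses", "VBZ"),
     ("keyword", "NN"), ("extraction", "NN")],
]
candidates = extract_candidates(corpus)
print(candidates.most_common(3))
```

The resulting counts are only the first stage; the filtering by index characteristics and the cross-field statistical checks described in the abstract would operate on this candidate list.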
274	Ανάπτυξη μεθόδου με σκοπό την αναγνώριση και εξαγωγή θεματικών λέξεων κλειδιών από διευθύνσεις ιστοσελίδων του ελληνικού Διαδικτύου / Keyword identification within Greek URLs Βονιτσάνου, Μαρία-Αλεξάνδρα 16 January 2012 Η αύξηση της διαθέσιμης Πληροφορίας στον Παγκόσμιο Ιστό είναι ραγδαία. Η παρατήρηση αυτή παρότρυνε πολλούς ερευνητές να επικεντρώσουν το έργο τους στην εξαγωγή χρήσιμων γνωρισμάτων από διαδικτυακά έγγραφα, όπως ιστοσελίδες, εικόνες, βίντεο, με σκοπό την ενίσχυση της διαδικασίας κατηγοριοποίησης ιστοσελίδων. Ένας πόρος που περιέχει πληροφορία και δεν έχει διερευνηθεί διεξοδικά για γλώσσες εκτός της αγγλικής, είναι η διεύθυνση ιστοσελίδας (URL - Uniform Resource Locator). Το κίνητρο της διπλωματικής αυτής εργασίας είναι το γεγονός ότι ένα σημαντικό υποσύνολο των χρηστών του διαδικτύου δείχνει ενδιαφέρον για δικτυακούς πόρους, των οποίων οι διευθύνσεις URL περιλαμβάνουν όρους προερχόμενους από τη μητρική τους γλώσσα (η οποία δεν είναι η αγγλική), γραμμένους με λατινικούς χαρακτήρες. Προτείνεται μέθοδος η οποία θα αναγνωρίζει και θα εξάγει τις λέξεις-κλειδιά από διευθύνσεις ιστοσελίδων (URLs), εστιάζοντας στο ελληνικό Διαδίκτυο και συγκεκριμένα σε URLs που περιέχουν ελληνικούς όρους. Το κύριο ζήτημα της προτεινόμενης μεθόδου είναι ότι οι ελληνικές λέξεις μπορούν να μεταγλωττίζονται με λατινικούς χαρακτήρες σύμφωνα με πολλούς διαφορετικούς τρόπους, καθώς και το γεγονός ότι τα URLs μπορούν να περιέχουν περισσότερες της μιας λέξεις χωρίς κάποιο διαχωριστικό. Παρόλη την ύπαρξη προηγούμενων προσεγγίσεων για την επεξεργασία ελληνικού διαδικτυακού περιεχομένου, όπως αναζητήσεις στο ελληνικό διαδίκτυο και αναγνώριση οντότητας σε ελληνικές ιστοσελίδες, καμία από τις παραπάνω δεν βασίζεται σε διευθύνσεις URL. Επιπλέον, έχουν αναπτυχθεί πολλές τεχνικές για την κατηγοριοποίηση ιστοσελίδων με βάση κυρίως τις διευθύνσεις URL, αλλά καμία δεν διερευνά την περίπτωση του ελληνικού διαδικτύου.
Η προτεινόμενη μέθοδος περιέχει δύο βασικά στοιχεία: το μεταγλωττιστή και τον κατακερματιστή. Ο μεταγλωττιστής, βασισμένος σε ένα ελληνικό λεξικό και ένα σύνολο κανόνων, μετατρέπει τις λέξεις που είναι γραμμένες με λατινικούς χαρακτήρες σε ελληνικούς όρους ενώ παράλληλα ο κατακερματιστής τμηματοποιεί τη διεύθυνση URL σε λέξεις με νόημα, εξάγοντας, έτσι τελικά ελληνικούς όρους που αποτελούν λέξεις κλειδιά. Η πειραματική αξιολόγηση της προτεινόμενης μεθόδου σε δείγμα ελληνικών URLs αποδεικνύει ότι μπορεί να αξιοποιηθεί εποικοδομητικά στην αυτόματη αναγνώριση λέξεων-κλειδιών σε ελληνικά URLs. / The available information on the WWW is increasing rapidly. This observation has prompted many researchers to focus their work on extracting useful features from web documents that would enhance the task of web classification. A quite informative resource that has not been thoroughly explored for languages other than English is the uniform resource locator (URL). Motivated by the fact that a significant part of Web users is interested in web resources whose URLs contain terms from their non-English native languages, written using Latin characters, we propose a method that successfully identifies and extracts keywords within URLs, focusing on the Greek Web and especially on URLs containing Greek terms. The main difficulty of this approach is that Greek words can be transliterated to Latin characters in many different ways, based on how the words are pronounced rather than on how they are written. Although there are previous attempts on similar issues, such as Greek web searches and entity recognition in Greek web pages, none of them is based on URLs. In addition, there are many techniques for web page categorization based mainly on URLs, but none explores the case of Greek terms. The proposed method uses a three-step approach. First, a normalized URL is divided into its basic components according to the URI syntax (scheme :// host / path-elements / document . extension).
The domain part is split wherever punctuation marks or numbers appear. Second, domain tokens are segmented into meaningful tokens using a set of transliteration rules and a Greek dictionary. Finally, to identify useful keywords, a score is assigned to each extracted keyword based on its length and on whether the word is nested in another word. The algorithm is evaluated on a random sample of 1,000 URLs collected manually. We perform a human-based evaluation comparing the automatically extracted keywords with keywords extracted manually when no information other than the URL is available. The results look promising. Τμηματοποίηση λέξεων; 025.042 2; Greeklish to Greek transliteration; Keyword extraction; Uniform resource locator; Word segmentation
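The three steps in the abstract above can be sketched roughly as follows. This is a simplified illustration, not the thesis algorithm: the transliteration rules are replaced by a tiny hypothetical lookup table from Latin-character forms to Greek words, the example URL is invented, and the scoring rule (word length, halved when the word is nested in a longer match) is an assumption chosen only to show the idea.

```python
import re
from urllib.parse import urlparse

# Hypothetical mini-dictionary: Latin-character (Greeklish) form -> Greek word.
GREEKLISH = {"kai": "και", "kairos": "καιρός", "ellada": "ελλάδα", "nea": "νέα"}


def domain_tokens(url):
    """Step 1: take the host part and split it on punctuation and digits."""
    host = urlparse(url).hostname or ""
    return [t for t in re.split(r"[\W\d_]+", host) if t]


def find_words(token, dictionary=GREEKLISH):
    """Step 2: return (start, end, form) for every dictionary word in token."""
    hits = []
    for start in range(len(token)):
        for end in range(start + 1, len(token) + 1):
            if token[start:end] in dictionary:
                hits.append((start, end, token[start:end]))
    return hits


def score_keywords(token, dictionary=GREEKLISH):
    """Step 3: score = length, halved when nested inside a longer match."""
    hits = find_words(token, dictionary)
    scores = {}
    for start, end, form in hits:
        nested = any(
            s <= start and end <= e and (e - s) > (end - start)
            for s, e, _ in hits
        )
        scores[dictionary[form]] = (end - start) * (0.5 if nested else 1.0)
    return scores


url = "http://www.kairosellada24.gr/nea/"
for token in domain_tokens(url):
    print(token, score_keywords(token))
```

Here "kai" is found inside "kairos", so it receives a lower score than the longer word that contains it, which mirrors the nesting criterion mentioned in the abstract.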
275	Σχεδιασμός και υλοποίηση δημοσιογραφικού RDF portal με μηχανή αναζήτησης άρθρων Χάιδος, Γεώργιος 11 June 2013 (has links) Το Resource Description Framework (RDF) αποτελεί ένα πλαίσιο περιγραφής πόρων ως μεταδεδομένα για το σημασιολογικό ιστό. Ο σκοπός του σημασιολογικού ιστού είναι η εξέλιξη και επέκταση του υπάρχοντος παγκόσμιου ιστού, έτσι ώστε οι χρήστες του να μπορούν ευκολότερα να αντλούν συνδυασμένη την παρεχόμενη πληροφορία. Ο σημερινός ιστός είναι προσανατολισμένος στον άνθρωπο. Για τη διευκόλυνση σύνθετων αναζητήσεων και σύνθεσης επιμέρους πληροφοριών, ο ιστός αλλάζει προσανατολισμό, έτσι ώστε να μπορεί να ερμηνεύεται από μηχανές και να απαλλάσσει το χρήστη από τον επιπλέον φόρτο. Η πιο φιλόδοξη μορφή ενσωμάτωσης κατάλληλων μεταδεδομένων στον παγκόσμιο ιστό είναι με την περιγραφή των δεδομένων με RDF triples αποθηκευμένων ως XML. Το πλαίσιο RDF περιγράφει πόρους, ορισμένους με Uniform Resource Identifiers (URI’s) ή literals με τη μορφή υποκείμενου-κατηγορήματος-αντικειμένου. Για την ορθή περιγραφή των πόρων ενθαρρύνεται από το W3C η χρήση υπαρχόντων λεξιλογίων και σχημάτων , που περιγράφουν κλάσεις και ιδιότητες. Στην παρούσα εργασία γίνεται υλοποίηση ενός δημοσιογραφικού RDF portal. Για τη δημιουργία RDF/XML, έχουν χρησιμοποιηθεί τα λεξιλόγια και σχήματα που συνιστούνται από το W3C καθώς και των DCMI και PRISM. Επίσης χρησιμοποιείται για την περιγραφή typed literals to XML σχήμα του W3C και ένα σχήμα του portal. Η δημιουργία των μεταδεδομένων γίνεται αυτόματα από το portal με τη χρήση των στοιχείων που συμπληρώνονται στις φόρμες δημοσίευσης άρθρων και δημιουργίας λογαριασμών. Για τον περιορισμό του χώρου αποθήκευσης τα μεταδεδομένα δεν αποθηκεύονται αλλά δημιουργούνται όταν ζητηθούν. Στην υλοποίηση έχει δοθεί έμφαση στην ασφάλεια κατά τη δημιουργία λογαριασμών χρήστη με captcha και κωδικό ενεργοποίησης με hashing. 
Για τη διευκόλυνση του έργου του αρθρογράφου, έχει εισαχθεί και επεκταθεί ο TinyMCE Rich Text Editor, o οποίος επιτρέπει τη μορφοποίηση του κειμένου αλλά και την εισαγωγή εικόνων και media. Ο editor παράγει αυτόματα HTML κώδικα από το εμπλουτισμένο κείμενο. Οι δυνατότητες του editor επεκτάθηκαν κυρίως με τη δυνατότητα για upload εικόνων και media και με την αλλαγή κωδικοποίησης για συμβατότητα με τα πρότυπα της HTML5. Για επιπλέον συμβατότητα με την HTML5 εισάγονται από το portal στα άρθρα ετικέτες σημασιολογικής δομής. Εκτός από τα άρθρα που δημιουργούνται με τη χρήση του Editor, δημοσιοποιούνται και άρθρα από εξωτερικές πηγές. Στη διαδικασία που είναι αυτόματη και επαναλαμβανόμενη, γίνεται επεξεργασία και αποθήκευση μέρους των δεδομένων των εξωτερικών άρθρων. Στον αναγνώστη του portal παρουσιάζεται ένα πρωτοσέλιδο και σελίδες ανά κατηγορία με τα πρόσφατα άρθρα. Στο portal υπάρχει ενσωματωμένη μηχανή αναζήτησης των άρθρων, με πεδία για φιλτράρισμα χρονικά, κατηγορίας, αρθρογράφου-πηγής αλλά και λέξεων κλειδιών. Οι λέξεις κλειδιά προκύπτουν από την περιγραφή του άρθρου στη φόρμα δημιουργίας ή αυτόματα. Όταν τα άρθρα προέρχονται από εξωτερικές πηγές, η διαδικασία είναι υποχρεωτικά αυτόματη. Για την αυτόματη ανεύρεση των λέξεων κλειδιών από ένα άρθρο χρησιμοποιείται η συχνότητα της λέξης στο άρθρο, με τη βαρύτητα που δίνεται από την HTML για τη λέξη (τίτλος, έντονη γραφή), κανονικοποιημένη για το μέγεθος του άρθρου και η συχνότητα του λήμματος της λέξης σε ένα σύνολο άρθρων που ανανεώνεται. Για την ανάκτηση των άρθρων χρησιμοποιείται η τεχνική των inverted files για όλες τις λέξεις κλειδιά. Για τη μείωση του όγκου των δεδομένων και την επιτάχυνση απάντησης ερωτημάτων, αφαιρούνται από την περιγραφή λέξεις που παρουσιάζουν μεγάλη συχνότητα και μικρή αξία ανάκτησης πληροφορίας “stop words”. 
Η επιλογή μιας αντιπροσωπευτικής λίστας με stop words πραγματοποιήθηκε με τη χρήση ενός σώματος κειμένων από άρθρα εφημερίδων, τη μέτρηση της συχνότητας των λέξεων και τη σύγκριση τους με τη λίστα stop words της Google. Επίσης για τον περιορισμό του όγκου των δεδομένων αλλά και την ορθότερη απάντηση των ερωτημάτων, το portal κάνει stemming στις λέξεις κλειδιά, παράγοντας όρους που μοιάζουν με τα λήμματα των λέξεων. Για το stemming έγινε χρήση της διατριβής του Γεώργιου Νταή του Πανεπιστημίου της Στοκχόλμης που βασίζεται στη Γραμματική της Νεοελληνικής Γραμματικής του Μανώλη Τριανταφυλλίδη. Η επιστροφή των άρθρων στα ερωτήματα που περιλαμβάνουν λέξεις κλειδιά γίνεται με κατάταξη εγγύτητας των λέξεων κλειδιών του άρθρου με εκείνο του ερωτήματος. Γίνεται χρήση της συχνότητας των λέξεων κλειδιών και της συχνότητας που έχουν οι ίδιες λέξεις σε ένα σύνολο άρθρων που ανανεώνεται. Για την αναζήτηση γίνεται χρήση θησαυρού συνώνυμων λέξεων. / The Resource Description Framework (RDF) is a framework for describing resources as metadata in the Semantic Web. The aim of the Semantic Web is to develop and extend the existing Web so that users can more easily obtain the supplied information in combined form. Today's Web is human-oriented. To facilitate complex queries and the combination of the acquired data, the Web is changing orientation so that it can be interpreted by machines, relieving the user of the extra burden. The most ambitious way of incorporating appropriate metadata into the Web is to describe the data with RDF triples stored as XML. The RDF framework describes resources, identified by Uniform Resource Identifiers (URIs) or literals, in subject-predicate-object form. The W3C encourages the use of existing RDF vocabularies to describe classes and properties. In this work, a news RDF portal has been developed. The RDF/XML is created using vocabularies and schemas recommended by the W3C, as well as those of DCMI and PRISM.
The metadata is created automatically from the data supplied when a new article is published. To make the journalist's job easier, a rich text editor, which enables formatting text and inserting images and media, has been integrated and extended. The editor automatically generates HTML code from text edited in a graphical environment. Its capabilities were extended to support image and media uploading and to change the encoding for better compatibility with HTML5 standards. Apart from articles uploaded with the editor, the portal integrates articles published by external sources, in a process that is fully automatic and recurring. The user of the portal is presented with a front page and articles categorized by theme. The portal includes a search engine with fields for filtering by time, category, journalist or source, and keywords. Keywords can be supplied by the publisher or selected automatically; when articles are integrated from external sources, the selection is necessarily automatic. Automatic keyword selection uses the frequency of each word in the article, with extra weight for words that the HTML emphasizes (e.g. title, bold, underlined), normalized for the size of the article, together with the frequency of the word's stem in a set of articles that were already uploaded. To retrieve articles, the search engine uses an inverted-file index over all keywords. To reduce the data volume and speed up query processing, words that have high frequency and low information-retrieval value ("stop words") are removed. A representative list of stop words was chosen by using a corpus of newspaper articles, measuring word frequencies and comparing them with Google's list of stop words. To further reduce the volume of data and increase recall for queries, the portal stems the keywords.
For stemming, the rule-based algorithm presented in the thesis of George Ntais (Stockholm University), which is based on the Modern Greek grammar of Manolis Triantafyllidis, was used. Articles returned for keyword queries are ranked by how closely the keywords under which the article is indexed match those of the query. To enhance the search engine, the portal also includes synonyms of the query words. Σημασιολογικός ιστός; Ανεστραμένα αρχεία; Μηχανή αναζήτησης; Λημματοποίηση; Ανάκτηση πληροφορίας; 025.042 7; Semantic web; Inverted files; Search engine; Stemming; Keyword indexing; Information retrieval; Resource Description Framework (RDF); Stopwords
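The retrieval pipeline described in the abstract above (stop-word removal, stemming, an inverted file, frequency-based ranking) can be sketched as follows. Everything concrete here is an assumption for illustration: the stop-word list is a toy list, the stemmer is a placeholder that only strips a plural "s" (it is not the Ntais algorithm for Greek), the sample articles are invented, and the tf * idf score stands in for the portal's ranking, whose exact formula the abstract does not give.

```python
import math
from collections import Counter, defaultdict

STOP_WORDS = {"the", "a", "of", "and", "in", "is"}  # illustrative list only


def stem(word):
    """Placeholder stemmer: strips a plural 's'. Not the Ntais algorithm."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def keywords(text):
    """Lower-case, drop stop words, then stem what is left."""
    return [stem(w) for w in text.lower().split() if w not in STOP_WORDS]


def build_index(articles):
    """Inverted file: keyword -> {article_id: term frequency}."""
    index = defaultdict(dict)
    for article_id, text in articles.items():
        for term, freq in Counter(keywords(text)).items():
            index[term][article_id] = freq
    return index


def search(index, n_articles, query):
    """Rank articles by summed tf * idf over the query keywords."""
    scores = Counter()
    for term in keywords(query):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(n_articles / len(postings)) + 1.0
        for article_id, freq in postings.items():
            scores[article_id] += freq * idf
    return [article_id for article_id, _ in scores.most_common()]


articles = {
    1: "elections in the city and the results of the elections",
    2: "the city council is planning new parks",
    3: "football results",
}
index = build_index(articles)
print(search(index, len(articles), "election results"))
```

Because the query is stemmed with the same function as the articles, "election" matches "elections", which is the recall benefit the abstract attributes to stemming.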
276	Automatisk extraktion av nyckelord ur ett kundforum / Automatic keyword extraction from a customer forum Ekman, Sara January 2018 Konversationerna i ett kundforum rör sig över olika ämnen och språket är inkonsekvent. Texterna uppfyller inte de krav som brukar ställas på material inför automatisk nyckelordsextraktion. Uppsatsen undersöker hur nyckelord automatiskt kan extraheras ur ett kundforum trots dessa svårigheter. Fokus i undersökningen ligger på tre aspekter av nyckelordsextraktion. Den första faktorn rör hur den etablerade nyckelordsextraktionsmetoden TFIDF presterar jämfört med fyra metoder som skapas med hänsyn till materialets ovanliga struktur. Nästa faktor som testas är om olika sätt att räkna ordfrekvens påverkar resultatet. Den tredje faktorn är hur metoderna presterar om de endast använder inläggen, rubrikerna eller båda texttyperna i sina extraktioner. Icke-parametriska test användes för utvärdering av extraktionerna. Ett antal Friedmans test visar att metoderna i några fall skiljer sig åt gällande förmåga att identifiera relevanta nyckelord. I post-hoc-test mellan de högst presterande metoderna ses en av de nya metoderna i ett fall prestera signifikant bättre än de andra nya metoderna men inte bättre än TFIDF. Ingen skillnad hittades mellan användning av olika texttyper eller sätt att räkna ordfrekvens. För framtida forskning rekommenderas reliabilitetstest av manuellt annoterade nyckelord. Ett större stickprov bör användas än det i aktuell studie och olika förslag ges för att förbättra rättning av extraherade nyckelord. / Conversations in a customer forum span different topics, and the language is inconsistent. The texts do not meet the requirements usually placed on material for automatic keyword extraction. This essay examines how keywords can be extracted automatically from a customer forum despite these difficulties. The study focuses on three aspects of keyword extraction.
The first factor concerns how the established keyword extraction method TFIDF performs compared with four methods created with the unusual structure of the material in mind. The next factor is whether different ways of calculating word frequency affect the result. The third factor is how the methods perform when they use only posts, only titles, or both in their extractions. Non-parametric tests were conducted to evaluate the extractions. A number of Friedman tests show that the methods in some cases differ in their ability to identify relevant keywords. In post-hoc tests between the highest-performing methods, one of the new methods in one case performs significantly better than the other new methods but not better than TFIDF. No difference was found between the use of different text types or ways of calculating word frequency. For future research, reliability testing of manually annotated keywords is recommended. A larger sample should be used than in the current study, and several suggestions are given for improving the scoring of extracted keywords. Automatic keyword extraction; Information extraction; Noisy text; TFIDF; User generated text; Användargenererad text; Automatisk nyckelordsextraktion; Brusig text; Informationsextraktion; TFIDF; General Language Studies and Linguistics
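For readers unfamiliar with the TFIDF baseline named in the abstract above, a minimal sketch follows. It uses one common formulation (relative term frequency times the logarithm of inverse document frequency); the thesis compares several ways of counting word frequency and does not state which formula it uses, so the formula, the function name and the three sample forum posts are assumptions.

```python
import math
from collections import Counter


def tfidf_keywords(documents, top_n=2):
    """Return the top_n highest-scoring TF-IDF terms for each document."""
    tokenized = [doc.lower().split() for doc in documents]
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(tokenized)
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_n])
    return results


posts = [
    "router keeps dropping the connection",
    "the invoice shows the wrong amount on the invoice",
    "the router light blinks red",
]
extracted = tfidf_keywords(posts)
print(extracted)
```

A word such as "the", which occurs in every post, gets an inverse document frequency of zero and therefore never ranks as a keyword, while "invoice", repeated in a single post, ranks first there.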
277	Automatiserade regressionstester avseende arbetsflöden och behörigheter i ProjectWise. : En fallstudie om ProjectWise på Trafikverket / Automated regression tests regarding workflows and permissions in ProjectWise. Ograhn, Fredrik, Wande, August January 2016 (has links) Test av mjukvara görs i syfte att se ifall systemet uppfyller specificerade krav samt för att hitta fel. Det är en viktig del i systemutveckling och involverar bland annat regressionstestning. Regressionstester utförs för att säkerställa att en ändring i systemet inte medför att andra delar i systemet påverkas negativt. Dokumenthanteringssystem hanterar ofta känslig data hos organisationer vilket ställer höga krav på säkerheten. Behörigheter i system måste därför testas noggrant för att säkerställa att data inte hamnar i fel händer. Dokumenthanteringssystem gör det möjligt för flera organisationer att samla sina resurser och kunskaper för att nå gemensamma mål. Gemensamma arbetsprocesser stöds med hjälp av arbetsflöden som innehåller ett antal olika tillstånd. Vid dessa olika tillstånd gäller olika behörigheter. När en behörighet ändras krävs regressionstester för att försäkra att ändringen inte har gjort inverkan på andra behörigheter. Denna studie har utförts som en kvalitativ fallstudie vars syfte var att beskriva utmaningar med regressionstestning av roller och behörigheter i arbetsflöden för dokument i dokumenthanteringssystem. Genom intervjuer och en observation så framkom det att stora utmaningar med dessa tester är att arbetsflödens tillstånd följer en förutbestämd sekvens. För att fullfölja denna sekvens så involveras en enorm mängd behörigheter som måste testas. Det ger ett mycket omfattande testarbete avseende bland annat tid och kostnad. Studien har riktat sig mot dokumenthanteringssystemet ProjectWise som förvaltas av Trafikverket. Beslutsunderlag togs fram för en teknisk lösning för automatiserad regressionstestning av roller och behörigheter i arbetsflöden åt ProjectWise. 
Utifrån en kravinsamling tillhandahölls beslutsunderlag som involverade Team Foundation Server (TFS), Coded UI och en nyckelordsdriven testmetod som en teknisk lösning. Slutligen jämfördes vilka skillnader den tekniska lösningen kan utgöra mot manuell testning. Utifrån litteratur, dokumentstudie och förstahandserfarenheter visade sig testautomatisering kunna utgöra skillnader inom ett antal identifierade problemområden, bland annat tid och kostnad. / Software testing is done to see whether the system meets specified requirements and to find bugs. It is an important part of system development and involves, among other things, regression testing. Regression tests are performed to ensure that a change in the system does not adversely affect other parts of the system. Document management systems often handle sensitive data for organizations, which places high demands on security. Permissions in the system therefore have to be tested thoroughly to ensure that data does not fall into the wrong hands. Document management systems make it possible for several organizations to pool their resources and knowledge to achieve common goals. Common work processes are supported by workflows that contain a number of states, and different permissions apply in different states. When a permission changes, regression tests are required to ensure that the change has not affected other permissions. This study was conducted as a qualitative case study whose purpose was to describe the challenges of regression testing of roles and permissions in document workflows in a document management system. Interviews and an observation showed that a major challenge of these tests is that workflow states follow a predetermined sequence. Completing this sequence involves a huge number of permissions that must be tested, which makes the testing work very extensive in terms of time and cost.
The study focused on the document management system ProjectWise, managed by Trafikverket. Decision support was produced for a technical solution for automated regression testing of roles and permissions in workflows for ProjectWise. Based on requirements gathering, the decision support involved Team Foundation Server (TFS), Coded UI and a keyword-driven test method as the technical solution. Finally, the differences between the technical solution and manual testing were compared. Based on literature, a document study and first-hand experience, test automation proved able to make a difference in a number of identified problem areas, including time and cost. Workflows; permissions; roles; regression test; test automation; document management system; keyword driven testing; Coded UI; TFS. Arbetsflöden; behörigheter; roller; regressionstest; testautomatisering; dokumenthanteringssystem; nyckelordsdriven testmetod; Coded UI; TFS. Information Systems
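The idea of keyword-driven regression testing of permissions per workflow state, as discussed in the abstract above, can be sketched in a few lines. This is a generic illustration only: it does not use ProjectWise, TFS or Coded UI, and the states, roles, actions, keyword names and permission matrix are all hypothetical.

```python
# Hypothetical permission matrix: workflow state -> role -> allowed actions.
PERMISSIONS = {
    "draft":    {"author": {"read", "write"}, "reviewer": set()},
    "review":   {"author": {"read"}, "reviewer": {"read", "write"}},
    "approved": {"author": {"read"}, "reviewer": {"read"}},
}


def can(state, role, action):
    """Stand-in for asking the system under test whether an action is allowed."""
    return action in PERMISSIONS.get(state, {}).get(role, set())


# Keyword implementations: each keyword maps to a small checking function.
KEYWORDS = {
    "should_allow": lambda state, role, action: can(state, role, action),
    "should_deny": lambda state, role, action: not can(state, role, action),
}

# Test table: one row per check, written with keywords instead of code.
TEST_TABLE = [
    ("should_allow", "draft", "author", "write"),
    ("should_deny", "draft", "reviewer", "read"),
    ("should_allow", "review", "reviewer", "write"),
    ("should_deny", "approved", "author", "write"),
]


def run_suite(table):
    """Run every row and return the rows that failed."""
    return [row for row in table if not KEYWORDS[row[0]](*row[1:])]


failures = run_suite(TEST_TABLE)
print("failed rows:", failures)
```

Keeping the checks in a table means that a permission change is covered by rerunning the same rows, and new states or roles are added as data rather than as new test code.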
278	Détection automatique de chutes de personnes basée sur des descripteurs spatio-temporels : définition de la méthode, évaluation des performances et implantation temps-réel / Automatic human fall detection based on spatio-temporal descriptors : definition of the method, evaluation of the performance and real-time implementation Charfi, Imen 21 October 2013 Nous proposons une méthode supervisée de détection de chutes de personnes en temps réel, robuste aux changements de point de vue et d’environnement. La première partie consiste à rendre disponible en ligne une base de vidéos DSFD enregistrées dans quatre lieux différents et qui comporte un grand nombre d’annotations manuelles propices aux comparaisons de méthodes. Nous avons aussi défini une métrique d’évaluation qui permet d’évaluer la méthode en s’adaptant à la nature du flux vidéo et la durée d’une chute, et en tenant compte des contraintes temps réel. Dans un second temps, nous avons procédé à la construction et l’évaluation des descripteurs spatio-temporels STHF, calculés à partir des attributs géométriques de la forme en mouvement dans la scène ainsi que leurs transformations, pour définir le descripteur optimisé de chute après une méthode de sélection d’attributs. La robustesse aux changements d’environnement a été évaluée en utilisant les SVM et le Boosting. On parvient à améliorer les performances par la mise à jour de l’apprentissage par l’intégration des vidéos sans chutes enregistrées dans l’environnement définitif. Enfin, nous avons réalisé une implantation de ce détecteur sur un système embarqué assimilable à une caméra intelligente basée sur un composant SoC de type Zynq. Une démarche de type Adéquation Algorithme Architecture a permis d’obtenir un bon compromis performance de classification/temps de traitement. / We propose a supervised approach for detecting falls in home environments that is robust to changes of location and point of view.
First, we made publicly available a realistic dataset, acquired in four different locations, containing a large number of manual annotations suitable for comparing methods. We also defined a new metric, adapted to real-time tasks, for evaluating fall detection performance in a continuous video stream. Then, we built the initial spatio-temporal descriptor, named STHF, using several combinations of transformations of geometrical features, and derived an optimised set of spatio-temporal descriptors through an automatic feature selection step. We propose a realistic and pragmatic protocol that improves performance by updating the training in the current location with recordings of normal activities. Finally, we implemented the fall detector on a Zynq-based hardware platform similar to a smart camera. An Algorithm-Architecture Adequacy step provides a good trade-off between classification performance and processing time. Détection de chute temps réel; Descripteurs spatio-temporels; Sélection d’attributs; SVM; Boosting; Base de vidéos de chute; System on Chip (SoC); 006.3; 006.6; 621.39
279	Le métabolisme des acides gras monoinsaturés et la prolifération des cellules cancéreuses coliques : rôle de la Stéaroyl-CoA Désaturase-1 et effets des isomères conjugués de l'acide linoléique / Monounsaturated fatty acid metabolism and colon cancer cells proliferation : role of Stearoyl-CoA desaturase-1 and effect of conjugated linoleic acid isomers Pierre, Anne-Sophie 21 December 2012 (has links) Cancer cell metabolism adapts to the macromolecule requirements of the proliferating cell and to signals from the tumour microenvironment. Monounsaturated fatty acid (MUFA) biosynthesis is thus increased in colon cancer cells and is associated with increased activity of stearoyl-CoA desaturase (SCD), the rate-limiting enzyme of this synthesis. Polyunsaturated fatty acids (PUFA) such as the conjugated linoleic acid (CLA) isomers c9,t11-CLA and t10,c12-CLA both inhibit SCD activity and have anti-tumour effects, whose molecular mechanisms remain to be clarified. In this context, the aims of this work were to evaluate the role of SCD-1 in the survival of colon cancer cells (CCC) and the underlying regulatory mechanisms, and to provide new insight into the regulation behind the anti-proliferative effect of CLA. We first showed that silencing SCD-1 expression leads to CHOP-dependent apoptosis of CCC, whereas it has no effect on non-cancerous cells. We then studied in vitro the effects of two CLA isomers, c9,t11-CLA and t10,c12-CLA, on CCC viability. Our results show that only t10,c12-CLA induces apoptotic death of CCC without affecting the survival of untransformed colon cells. It also appears to be the only isomer that reduces MUFA biosynthesis in CCC. The cell death induced by t10,c12-CLA in CCC depends on the activation of endoplasmic reticulum (ER) stress via the production of reactive oxygen species. This work provides new information on the role of SCD-1 in colon cancer cell survival and on the mechanisms of action of t10,c12-CLA. It supports the hypothesis that MUFA biosynthesis is a possible therapeutic target in the treatment of colorectal cancer. Stéaroyl-CoA désaturase-1 (SCD-1) Stress du réticulum endoplasmique (RE) Cellules cancéreuses coliques No english keyword 541.39 571.6 571.9 572.8 616.9
280	On-line marketingová komunikace / On-line marketing communication Skácelová, Pavlína January 2021 (has links) The subject of this diploma thesis is online marketing communication using search engine marketing tools. It defines online marketing and search engine marketing, describes their tools and explains how they are used. The analytical part of the thesis applies analyses of the macro and micro environment, an analysis of the marketing mix and an associated SEO analysis. It also examines the development of website traffic to date and its possible future development. Based on the findings of the analytical part, the final part of the thesis designs online marketing communication using search engine marketing.
