Global ETD Search

1	Μηχανισμοί και τεχνικές διαχείρισης, επεξεργασίας, ανάλυσης, κατηγοριοποίησης, εξαγωγής περίληψης και προσωποποίησης συχνά ανανεώσιμων δεδομένων του παγκόσμιου ιστού για παρουσίαση σε σταθερές και κινητές συσκευές / Mechanisms and techniques for the management, processing, analysis, categorization, summarization and personalization of frequently updated World Wide Web data for presentation on desktop and mobile devices Πουλόπουλος, Βασίλειος 01 November 2010 We live in an era of technological developments and leaps, with the Internet becoming one of the chief expressions of new technological trends. However, the way it operates and is structured is extremely heterogeneous, so users often reach a dead end when searching for information. Moreover, the existence of millions of domains makes searching for information difficult. The research carried out here focuses on websites that serve as sources of news, specifically news agencies and blogs. A simple search revealed more than 40 websites of major news agencies in America. This means that anyone looking for a news story, and particularly for all its aspects, has to visit most, if not all, of these websites to find material on the topic of interest. Attempts have been made to solve this "problem", or at least this laborious process, through RSS feeds, through the personalized websites offered by the major news agencies, and even through the search mechanisms those agencies provide. In every case, however, there are significant drawbacks that often lead the user to a dead end again. Feeds do not filter information; they supply users' RSS readers with a flood of items that do not concern the users or are even a nuisance to them. 
For example, adding just two (2) feeds from major Greek news portals led to the receipt of more than 1000 news items daily. On the other hand, using the microsites that websites offer forces users to visit every website of interest. As for search engines, even the largest of them often return millions of results for user queries, or information that is not up to date. Finally, because the news agencies' websites were not built to offer extensive news search services, they frequently either offer no search service at all or offer one that cannot return structured results and, instead of helping users locate the information they seek, disorients them. / We live in an era of technological advances and great technological leaps, in which the Internet has become a primary showcase for technology trends. Nevertheless, the way the WWW operates and is built is extremely uneven, and this leads users to dead ends when they try to locate information. Moreover, the existence of billions of domains makes it difficult to record all this information. Our research focuses on websites that are sources of information, specifically news portals and informational blogs. A simple Internet search turned up more than 40 large-scale press agencies in America. This means that anyone searching for information, and more specifically for a news article in all its versions, has to visit all of these websites. This problem, or at least this tedious task, is of major concern to the research community. Many solutions have been proposed to overcome these issues, using RSS feeds, personalized microsites, or even analytical search applications. 
In every case there are many disadvantages that again lead the user to a dead end. RSS feeds do not filter information; they fill the user's RSS reader with large amounts of information, most of which is of no concern to the user. For example, simply adding 2 RSS feeds from large Greek portals led to the receipt of more than 1000 news articles within a day! On the other hand, the microsites that many websites support are a solution only if the user visits every single website and, of course, has and maintains an account on each of them. Search engines are an alternative but lately, due to the expansion of the WWW, simple queries often return millions of results, or the first results retrieved are outdated. Finally, the websites of the major news agencies are not built to offer extensive search facilities, and thus they usually offer search results through a large well-known search engine (e.g. Google). Accordingly, our research further focuses on the study of techniques and mechanisms that try to solve the everyday problem of staying informed about the news and forming a well-rounded opinion on an issue. The idea is simple and stems from the problem of the Internet: instead of letting users do all the searching for the news and information that meet their needs, we collect all the information and present it directly to them, showing only the information that matches their profile. This sounds simple and logical, but the implementation has to satisfy a number of prerequisites. The constraints are that Internet users speak different languages and want to read the news in their mother tongue, and that users want access to the information from everywhere. This implies that we need a mechanism that collects news articles from many, if not all, news agencies worldwide so that everybody can be informed. 
The news articles that we collect should be further analyzed before being presented to the users. In parallel, we need to apply text pre-processing, categorization and automatic summarization so that the articles can be presented to the user in a personalized manner. Finally, the mechanism is able to construct and maintain a user profile and present only the articles that match that profile, not all the articles collected by the system. Clearly, this is not a simple procedure. In essence it is a multilevel modular mechanism that implements and uses advanced algorithms at every level in order to achieve the required result. We refer to eight different mechanisms that lead to the desired result. The systems are: 1. Retrieval of news and articles from the Internet (advaRSS system). 2. HTML page analysis and useful text extraction (CUTER system). 3. Preprocessing and natural language processing in order to extract keywords. 4. Categorization subsystem that constructs ontologies which assign texts to categories. 5. Article grouping mechanism (web application level). 6. Automatic text summarization. 7. Web-based user personalization mechanism. 8. Application-based user personalization mechanism. The subsystems and the system architecture are presented in figure 1. Fetching articles and news from the WWW involves algorithms that fetch data from the large database that is the Internet. In this research we have included algorithms for instant retrieval of articles, and the mechanism also has a component for fetching the HTML pages that contain news articles. As a next step, given that we hold HTML pages with articles, we have procedures for efficient extraction of the useful text. The HTML pages include the body of the article as well as information unrelated to the article, such as advertisements. 
Our mechanism introduces algorithms and systems that extract the original body of the text from these pages and omit any irrelevant information. As a further step of the same mechanism, we try to extract multimedia related to the article. These mechanisms communicate directly with the Internet. 025.042 2 Categorization Personalization Users' profile Internet Press clipping
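The useful-text extraction step described in the abstract above (separating the article body from advertisements and other unrelated page content) can be illustrated with a minimal sketch. This is not the CUTER algorithm, whose details the abstract does not give. It assumes a simple heuristic instead: keep block elements that contain enough text and have a low share of link text. The names `BlockCollector` and `extract_article_text`, and the thresholds `min_chars` and `max_link_density`, are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: content inside these tags is never article text.
SKIP_TAGS = {"script", "style", "nav", "aside", "footer", "header"}
# Assumption: these tags delimit candidate text blocks.
BLOCK_TAGS = {"p", "div", "article", "section", "td", "li"}


class BlockCollector(HTMLParser):
    """Collects text per block element and counts how much of it is link text."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # list of (text, link_chars)
        self._text = []
        self._link_chars = 0
        self._skip_depth = 0
        self._in_link = 0

    def flush(self):
        # Close the current block: normalize whitespace and store it if non-empty.
        text = " ".join("".join(self._text).split())
        if text:
            self.blocks.append((text, self._link_chars))
        self._text = []
        self._link_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.flush()
        elif tag == "a":
            self._in_link += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.flush()
        elif tag == "a":
            self._in_link = max(0, self._in_link - 1)

    def handle_data(self, data):
        if self._skip_depth:
            return
        self._text.append(data)
        if self._in_link:
            self._link_chars += len(data.strip())


def extract_article_text(html, min_chars=80, max_link_density=0.3):
    """Return the blocks that look like article body text, one per line."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    parser.flush()
    kept = [
        text
        for text, link_chars in parser.blocks
        if len(text) >= min_chars and link_chars / len(text) <= max_link_density
    ]
    return "\n".join(kept)
```

A heuristic of this kind is cheap and needs no per-site templates, but it will miss short paragraphs and can be fooled by link-heavy articles, so the thresholds would need tuning on real pages.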
2	Υλοποίηση προσωποποιημένης πολυμεσικής εφαρμογής ηλεκτρονικού εμπορίου με λειτουργίες χωρικής αναζήτησης / Implementation of personalized multimedia e-commerce application with spatial search features Μηναδάκης, Νίκος 25 January 2012 The aim of this thesis is to create a complete online store that provides users with, among other things, spatial product search and personalization. The application supports all the functions of a modern online store and adds a number of innovative features. Specifically, it supports a shopping cart and credit card orders through a virtual banking transaction system, numerous product search functions, different kinds of personalization, multiple levels of security using encryption, user account creation, a forum and more. The application's innovations include spatial product search by postal code, lookup of corresponding prices in other stores, and social personalization using weights on the personalization factors. Particular emphasis has also been placed on the maintainability of the system and on user friendliness. The technologies used are mainly HTML, CSS, PHP, PostgreSQL, Smarty and the Linux operating system. / The purpose of this thesis is the development of a comprehensive online store that supports all the functions of a modern e-shop plus a host of innovative features. Specifically, the application supports a shopping cart, an order pipeline using a virtual banking system, many product search features, different kinds of personalization, multiple levels of security using encryption, user accounts, a forum, etc. Innovations in the application include spatial search for products using postal codes, search for corresponding prices in other e-shops, and social personalization using weights on the personalization factors. 
Special attention was also given to the maintainability of the system and to user friendliness. The application was implemented using HTML, CSS, PHP, PostgreSQL, Smarty and the Linux operating system. 025.042 2 Spatial search Personalization E-commerce E-shops
3	Ανάπτυξη δυναμικής web based εφαρμογής με σκοπό την απεικόνιση της βιοϊατρικής πληροφορίας στον παγκόσμιο χάρτη, μέσω φορητών συσκευών / Developing dynamic web based application for visualization of biomedical information on the world map through mobile devices Μαντιδάκης, Γεώργιος 15 May 2012 The aim of this diploma thesis is to visualize biomedical information in an original way, with emphasis both on the design and development of a user-friendly interface and on my own acquisition of technical knowledge. To this end, existing applications and technologies and the capabilities they offer were studied, always in relation to the requirements and needs of physicians. / The subject of this diploma thesis is the visualization of biomedical information in an original way, emphasizing both the design and development of a user-friendly interface and my own acquisition of technical knowledge. Thus, having studied existing applications and technologies and the capabilities they offer, I tried to build a system that always takes into account the demands and needs of doctors. Data visualization Biomedical data Databases 025.042 2 Biodatabases Mapping data jQuery JVector JavaScript PHP HTML MySQL server
4	Μελέτη και συγκριτική αξιολόγηση μεθόδων δόμησης περιεχομένου ιστοτόπων : εφαρμογή σε ειδησεογραφικούς ιστοτόπους / Study and comparative evaluation of website content structuring methods: application to news websites Στογιάννος, Νικόλαος-Αλέξανδρος 20 April 2011 Organizing a website's content appropriately, so that information is easier to find and users can successfully complete their typical tasks, is one of the primary goals of website designers. Existing techniques from the field of Human-Computer Interaction that contribute to this goal are often ignored because of their demands on time and money. For news websites in particular, both their size and the daily addition and modification of the information they provide make more efficient techniques for organizing their content necessary. In this thesis we investigate the effectiveness of a method called AutoCardSorter, proposed in the literature for the semi-automatic categorization of web pages based on the semantic relations of their content, in the context of organizing the information of news websites. For this purpose a total of five studies were conducted, in which the categorizations produced by participants in corresponding open and closed card sorting studies were compared, both quantitatively and qualitatively, with the results of the AutoCardSorter technique. Analysis of the results showed that AutoCardSorter produced article groupings that agree closely with those of the study participants, but in a significantly more efficient way, confirming earlier similar studies on websites in other subject areas. In addition, the studies showed that a slightly modified version of AutoCardSorter places new articles into pre-existing categories with a considerably lower rate of agreement with the placements chosen by the participants. 
The thesis concludes by presenting directions for improving the effectiveness of AutoCardSorter, both in the context of organizing the content of news websites and more generally. / The proper structuring of a website's content, so as to increase the findability of the information provided and to ease typical user tasks, is one of the primary goals of website designers. The existing methods from the field of HCI that assist designers in this are often neglected because of their high cost and the human resources they demand. This holds even more for news sites, whose size and daily content updates call for improved and more efficient techniques. In this thesis we investigate the efficiency of a novel method called AutoCardSorter, which has been proposed in the literature for semi-automatic content categorization based on the semantic similarity of web page content. To accomplish this we conducted five comparative studies in which the method was compared with the primary variants of the classic card sorting method (open and closed). The analysis of the results showed that AutoCardSorter suggested article categories highly consistent with those suggested by a group of human subjects participating in the card sorting studies, although in a much more efficient way. This confirms the results of similar previous studies on websites of other themes (e.g. travel, education). Moreover, the studies showed that a modified version of the method places articles under pre-existing categories with significantly less agreement with the categorization suggested by the participants. The thesis concludes by proposing different ways to improve the method's efficiency, both in the context of news sites and in general. 025.042 2 Human-computer interaction Semantic similarity measures Latent semantic analysis (LSA) Information architecture Card sorting
5	Ανάπτυξη μεθόδου με σκοπό την αναγνώριση και εξαγωγή θεματικών λέξεων κλειδιών από διευθύνσεις ιστοσελίδων του ελληνικού Διαδικτύου / Keyword identification within Greek URLs Βονιτσάνου, Μαρία-Αλεξάνδρα 16 January 2012 The information available on the World Wide Web is growing rapidly. This observation has prompted many researchers to focus their work on extracting useful features from web documents such as web pages, images and videos, in order to improve web page categorization. One resource that carries information and has not been thoroughly explored for languages other than English is the web address (URL, Uniform Resource Locator). The motivation for this diploma thesis is the fact that a significant subset of Internet users is interested in web resources whose URLs include terms from their native language (which is not English) written in Latin characters. A method is proposed that identifies and extracts keywords from web addresses (URLs), focusing on the Greek Web and specifically on URLs that contain Greek terms. The main challenge for the proposed method is that Greek words can be transliterated into Latin characters in many different ways, and that URLs may contain more than one word with no separator. Although there are earlier approaches to processing Greek web content, such as searches on the Greek Web and entity recognition in Greek web pages, none of them is based on URLs. In addition, many techniques have been developed for categorizing web pages based mainly on URLs, but none examines the case of the Greek Web. 
The proposed method has two basic components: the transliterator and the segmenter. The transliterator, based on a Greek dictionary and a set of rules, converts words written in Latin characters into Greek terms, while the segmenter splits the URL into meaningful words, ultimately yielding Greek terms that serve as keywords. Experimental evaluation of the proposed method on a sample of Greek URLs shows that it can be used productively for the automatic identification of keywords in Greek URLs. / The information available on the WWW is increasing rapidly. This observation has prompted many researchers to focus their work on extracting useful features from web documents that would enhance the task of web classification. A quite informative resource that has not been thoroughly explored for languages other than English is the uniform resource locator (URL). Motivated by the fact that a significant share of Web users is interested in web resources whose URLs contain terms from their non-English native languages, written in Latin characters, we propose a method that successfully identifies and extracts keywords within URLs, focusing on the Greek Web and especially on URLs containing Greek terms. The main issue for this approach is that Greek words can be transliterated into Latin characters in many different ways, based on how the words are pronounced rather than on how they are written. Although there have been previous attempts on similar issues, such as Greek web searches and entity recognition in Greek web pages, none of them is based on URLs. In addition, there are many techniques for web page categorization based mainly on URLs, but none explores the case of Greek terms. The proposed method uses a three-step approach. First, a normalized URL is divided into its basic components, according to the URI syntax (scheme :// host / path-elements / document . extension). 
The domain part is split at punctuation marks and numbers. Second, domain tokens are segmented into meaningful tokens using a set of transliteration rules and a Greek dictionary. Finally, in order to identify useful keywords, a score is assigned to each extracted keyword based on its length and on whether the word is nested in another word. The algorithm is evaluated on a random sample of 1,000 URLs collected manually. We perform a human-based evaluation, comparing the keywords extracted automatically with the keywords extracted manually when no information other than the URL is available. The results look promising. 025.042 2 Greeklish to Greek transliteration Keyword extraction Uniform resource locator Word segmentation
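The three steps described in the abstract above (split the URL into components and tokens, segment tokens against a dictionary, score by length and nesting) can be illustrated with a minimal sketch. It is not the thesis implementation. The lexicon `LEXICON` is a hypothetical four-word toy table with a single Latin spelling per Greek word, whereas the thesis uses transliteration rules that allow many spellings. The function names and the `nested_penalty` factor are also assumptions, and path elements are tokenized the same way as the host purely for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical toy lexicon: Latin-character ("Greeklish") spelling -> Greek word.
# A real system would accept several transliteration variants per word.
LEXICON = {
    "agora": "αγορά",
    "ora": "ώρα",
    "nea": "νέα",
    "kairos": "καιρός",
}


def url_tokens(url):
    """Step 1: take host and path, then split on punctuation and digits."""
    parts = urlsplit(url.lower())
    raw = f"{parts.hostname or ''}/{parts.path}"
    return [t for t in re.split(r"[^a-z]+", raw) if t]


def find_matches(token):
    """Step 2: find every (start, end) span of the token that is a lexicon word."""
    matches = []
    for start in range(len(token)):
        for end in range(start + 1, len(token) + 1):
            if token[start:end] in LEXICON:
                matches.append((start, end))
    return matches


def score_keywords(url, nested_penalty=0.5):
    """Step 3: score each keyword by length, penalizing words nested in longer ones."""
    scores = {}
    for token in url_tokens(url):
        matches = find_matches(token)
        for start, end in matches:
            nested = any(
                s <= start and end <= e and (s, e) != (start, end)
                for s, e in matches
            )
            score = (end - start) * (nested_penalty if nested else 1.0)
            word = LEXICON[token[start:end]]
            scores[word] = max(scores.get(word, 0.0), score)
    return sorted(scores.items(), key=lambda kv: -kv[1])


if __name__ == "__main__":
    # "agoranea" contains "agora", "nea" and the nested word "ora".
    print(score_keywords("http://www.agoranea24.gr/kairos/index.html"))
```

In this toy example the nested word "ora" inside "agora" is kept but ranked last, which is one simple way to reflect the nesting criterion the abstract mentions.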
