691. Paan: a tool for back-propagating changes to projected documents. Kim, Jongwook, 08 July 2011.
Research in Software Product Line Engineering (SPLE) has traditionally focused on product derivation. Prior work has explored the automated derivation of products by module composition, but it has so far neglected propagating changes (edits) made in a product back to the product line definition. It should be possible to update the features of a derived product locally and to propagate these changes back to the product line definition automatically; otherwise the entire product line has to be revised manually to make the changes permanent, which is the current state of practice and a very error-prone process. To address these issues, we present a tool called Paan for creating product lines of MS Word documents with back-propagation support. Paan is a diff-based tool that ignores unchanged fragments and reveals fragments that are changed, added or deleted. It takes a document with variation points (VPs) as input and shreds it into building blocks called tiles; only those tiles that are new or have changed need to be updated in the tile repository. In this way, changes in composed documents can be back-propagated to their original feature module definitions. A document is synthesized by retrieving the appropriate tiles and composing them.
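To make the tile-based, diff-driven back-propagation concrete, here is a minimal sketch in which document fragments are hashed into content-addressed tiles and only new or changed tiles are written to the repository. It is an illustration of the general idea rather than Paan itself; the paragraph-level shredding rule, the hash-based tile identifiers and the repository layout are assumptions made for this example.

```python
import hashlib

def shred(document: str) -> list[str]:
    # Assumed splitting rule: one tile per blank-line-separated fragment.
    return [block.strip() for block in document.split("\n\n") if block.strip()]

def tile_id(text: str) -> str:
    # Content-addressed tiles: identical fragments map to the same identifier.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def back_propagate(edited_document: str, repository: dict[str, str]) -> list[str]:
    """Store only new or changed tiles; unchanged tiles are ignored (diff-based)."""
    updated = []
    for tile in shred(edited_document):
        tid = tile_id(tile)
        if tid not in repository:        # fragment is new or has changed
            repository[tid] = tile
            updated.append(tid)
    return updated

def synthesize(tile_ids: list[str], repository: dict[str, str]) -> str:
    # A document is rebuilt by retrieving the appropriate tiles and composing them.
    return "\n\n".join(repository[tid] for tid in tile_ids)

if __name__ == "__main__":
    repo: dict[str, str] = {}
    back_propagate("Feature A: base text.\n\nFeature B: base text.", repo)
    # A local edit to feature A produces exactly one new tile in the repository.
    print(back_propagate("Feature A: locally edited text.\n\nFeature B: base text.", repo))
```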
692. Medical document management system using XML. Chan, Wai-man (陳偉文), January 2001.
Published or final version. Computer Science and Information Systems. Master of Philosophy.
693. Virtualios organizacijos dokumentų valdymo sistema / Virtual Organisation Document Management System. Sturis, Ričardas, 22 September 2004.
The main goal of this Master's thesis is to analyse the basics of the virtual organisation and its needs for a document management system. A virtual organisation is one way for organisations to collaborate, and it requires a system that guarantees document management and control.
694. Dokumentų valdymo sistemos metaduomenų apdorojimo modelio sudarymas ir tyrimas / Analysis and Development of Metadata Processing Model for Document Management. Žukaitis, Rimantas, 25 May 2004.
The increasing use of personal computers and the Internet in organizations has made it possible to create, edit and share documents among employees. Document management, however, becomes troublesome, especially when several employees can contribute changes to a single document: it is hard to locate the latest version of a document or to determine which changes were made by which employee. Document management systems aim to solve these problems, but they are often either highly specialized and very costly to implement, or general-purpose and hard to customize for the organization's business domain. The inability to customize often stems from the strict and inflexible metadata model used in the document management system. The aim of this work is to propose an abstract model for defining and processing document metadata, based on an XML data definition language and the concept of an XML data processing pipeline. The proposed model is both general-purpose and highly flexible, so it can be applied to any business domain and customized to reflect features specific to that domain.
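As a rough illustration of the proposed idea (metadata defined in XML and processed by a pipeline of XML transformations), the sketch below validates and then enriches a small metadata record. The element names and the two stages are invented for this example and are not taken from the thesis.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Assumed XML metadata record for a document (element names are illustrative).
METADATA = """
<document>
  <title>Quarterly report</title>
  <author>J. Smith</author>
  <status>draft</status>
</document>
"""

def validate(doc: ET.Element) -> ET.Element:
    # Stage 1: reject records that are missing required fields.
    for field in ("title", "author"):
        if doc.find(field) is None:
            raise ValueError(f"missing required metadata field: {field}")
    return doc

def enrich(doc: ET.Element) -> ET.Element:
    # Stage 2: add domain-specific metadata without touching the core definition.
    ET.SubElement(doc, "registered").text = date.today().isoformat()
    return doc

def run_pipeline(xml_text: str, stages) -> str:
    # The pipeline is a list of XML-to-XML transformations applied in order, so a
    # deployment can be customized by adding, removing or swapping stages.
    doc = ET.fromstring(xml_text)
    for stage in stages:
        doc = stage(doc)
    return ET.tostring(doc, encoding="unicode")

print(run_pipeline(METADATA, [validate, enrich]))
```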
695. Tekstinių dokumentų išsaugojimo ir išrinkimo metodų dokumentų valdymo sistemoje tyrimas / Storage and Retrieval Methods of Text Documents in Document Management Systems. Kažukauskas, Audrys, 27 May 2004.
Document management systems give organizations greater control over the lifecycle of documents, from creation through review, storage, retrieval and dissemination all the way to their destruction, and they make it easier to classify and reuse information. This work deals with the storage and retrieval of text documents in document management systems, focusing on the choice of an effective document format and on the means and methods for storing and retrieving documents. It suggests using XML as the base document format and a relational database management system as the backend storage of the document management system. A new modification of the standard Edge method for storing and retrieving XML documents in relational database management systems is introduced, and the results of a performance experiment are presented. The experiment shows that the modified Edge method outperforms its standard analogue.
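For background, the standard Edge method that the thesis modifies shreds an XML tree into a single edge table in the relational database. The sketch below shows this baseline scheme with SQLite; it follows the commonly described Edge layout, not the modification introduced in the thesis.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Standard Edge table: one row per edge of the XML tree.
SCHEMA = """
CREATE TABLE edge (
    source  INTEGER,   -- id of the parent node (0 for the virtual root)
    ordinal INTEGER,   -- position among the parent's children
    name    TEXT,      -- element name
    target  INTEGER,   -- id of the child node, NULL for leaves
    value   TEXT       -- text content of leaves, NULL for inner nodes
);
"""

def shred_xml(xml_text: str, conn: sqlite3.Connection) -> None:
    """Shred an XML document into rows of the edge table."""
    conn.executescript(SCHEMA)
    next_id = [1]

    def insert(parent_id: int, ordinal: int, element: ET.Element) -> None:
        node_id = next_id[0]
        next_id[0] += 1
        is_leaf = len(element) == 0
        conn.execute(
            "INSERT INTO edge VALUES (?, ?, ?, ?, ?)",
            (parent_id, ordinal, element.tag,
             None if is_leaf else node_id,
             (element.text or "").strip() if is_leaf else None),
        )
        for position, child in enumerate(element, start=1):
            insert(node_id, position, child)

    insert(0, 1, ET.fromstring(xml_text))
    conn.commit()

conn = sqlite3.connect(":memory:")
shred_xml("<doc><title>Report</title><body>Some text</body></doc>", conn)
# Retrieval example: the text of every <title> element.
print(conn.execute("SELECT value FROM edge WHERE name = 'title'").fetchall())
```

Retrieving elements along a path expression then amounts to self-joins on this single table, which is where the performance of the storage scheme matters most.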
696. Savivaldybių darbuotojų kompetenciją Europos Sąjungos struktūrinių fondų lėšomis finansuojamų aplinkos projektų valdyme / Abilities of municipality employees in the field of environmental projects supported from EU Structural Funds. Ložytė, Aurelija, 29 January 2008.
When Lithuania became a full member of the European Union in May 2004, it qualified for support from the EU Structural Funds. The support for 2004-2006 is provided under the Single Programming Document for 2004-2006 (hereinafter – the SPD), approved by the Government of the Republic of Lithuania and by the European Commission. The SPD established five investment priorities, for whose implementation about 895 million Euro (3.09 billion litas) of EU Structural Funds support was allocated to Lithuania; almost 85 million litas (about 24 million Euro) of this was allocated to environment protection. Assistance from the European Regional Development Fund for environmental projects is provided under SPD measure 1.3 „Improvement of Environment Quality and Prevention of Environmental Damage“. Environmental projects initiated by municipal administrations are also eligible for support from the EU Structural Funds.
The aim of this paper is to evaluate the competence of municipal administration employees in managing environmental projects financed by the EU Structural Funds and to suggest how the implementation of these projects could be improved. The information needed to achieve this aim was collected through a questionnaire survey and an analysis of the literature on the research topic. The results show that municipal employees have sufficient experience, knowledge and skills for project management, so the employees who took part in the study can be considered competent to implement projects initiated by municipalities. It should be noted, however, that a heavy direct workload, insufficient knowledge of EU structural support management and a marked lack of training still hinder the successful initiation and implementation of projects financed by the EU Structural Funds... [to full text]
697. Investigating the Efficacy of XML and Stylesheets to Render Electronic Courseware for Multiple Learning Styles. du Toit, Masha, 01 June 2007.
The objective of this project was to test the efficacy of using Extensible Markup Language (XML) - in particular the DocBook 5.0b5 schema - and Extensible Stylesheet Language Transformation (XSLT) to render electronic courseware that can be dynamically re-formatted according to a student’s individual learning style.
The text of a typical lesson was marked up in XML according to the DocBook schema, and several XSLT stylesheets were created to transform the XML document into different versions, each according to particular learning needs. These learning needs were drawn from the Felder-Silverman learning style model. The notes had links to trigger JavaScript functions that allowed the student to reformat the notes to produce different views of the lesson.
The dynamic notes were tested on twelve users who filled out a feedback questionnaire. Feedback was largely positive. It suggested that users were able to navigate according to their learning style. There were some usability issues caused by lack of compatibility of the program with some browsers. However, the user test is not the most critical part of the evaluation. It served to confirm that the notes were usable, but the analysis of the use of XSLT and DocBook is the key aspect of this project. It was found that XML, and in particular the DocBook schema, was a useful tool in these circumstances, being easy to learn, well supported and having the appropriate structure for a project of this type.
The use of XSLT, on the other hand, was not so straightforward. Learning a declarative language was a challenge, as was using XSLT to transform the notes in the way this project required. A particular problem was the need to move content from one area of the document to another - to hide it in some cases and reveal it in others. This proved awkward to achieve with XSLT and does not take proper advantage of the strengths of the technology. The fact that the XSLT processor uses the DOM API, which requires loading the entire XML document into memory, is particularly problematic here, where the document is constantly transformed and re-transformed. The way stylesheets are assigned, as well as the need to use DOM objects to edit the source tree, necessitated JavaScript to achieve the required usability. These mechanisms limited browser compatibility and caused the program to freeze on older machines. The problems with browser compatibility and the synchronous loading of data are not insurmountable, however, and can be overcome with appropriate use of JavaScript and of asynchronous data retrieval, as made possible by AJAX.
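As an illustration of the kind of transformation discussed above, the sketch below applies a parameterised XSLT stylesheet to a small DocBook-like fragment with Python and lxml, producing a different view of the notes for each requested style. The markup, the stylesheet and the "verbal"/"visual" roles are invented for this example; the project itself performed the transformation in the browser via JavaScript.

```python
from lxml import etree

# A tiny DocBook-style lesson fragment (illustrative, not the project's markup).
LESSON = """
<section>
  <title>Recursion</title>
  <para role="verbal">Recursion is a function calling itself with a smaller input.</para>
  <para role="visual">See the call-tree diagram for factorial(3).</para>
</section>
"""

# An XSLT stylesheet that keeps only paragraphs matching the requested style.
STYLESHEET = """
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="style" select="'verbal'"/>
  <xsl:template match="section">
    <div>
      <h2><xsl:value-of select="title"/></h2>
      <xsl:apply-templates select="para[@role = $style]"/>
    </div>
  </xsl:template>
  <xsl:template match="para">
    <p><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>
"""

transform = etree.XSLT(etree.XML(STYLESHEET))
lesson = etree.XML(LESSON)
# Re-running the transform with a different parameter yields a different view,
# which is the effect the project achieved client-side with JavaScript.
verbal_view = transform(lesson, style=etree.XSLT.strparam("verbal"))
visual_view = transform(lesson, style=etree.XSLT.strparam("visual"))
print(etree.tostring(verbal_view, pretty_print=True).decode())
print(etree.tostring(visual_view, pretty_print=True).decode())
```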
698. Improving searchability of automatically transcribed lectures through dynamic language modelling. Marquard, Stephen, 01 December 2012.
Recording university lectures through lecture capture systems is increasingly common. However, a single continuous audio recording is often unhelpful for users, who may wish to navigate quickly to a particular part of a lecture, or locate a specific lecture within a set of recordings.
A transcript of the recording can enable faster navigation and searching. Automatic speech recognition (ASR) technologies may be used to create automated transcripts, to avoid the significant time and cost involved in manual transcription.
Low accuracy of ASR-generated transcripts may however limit their usefulness. In particular, ASR systems optimized for general speech recognition may not recognize the many technical or discipline-specific words occurring in university lectures. To improve the usefulness of ASR transcripts for the purposes of information retrieval (search) and navigating within recordings, the lexicon and language model used by the ASR engine may be dynamically adapted for the topic of each lecture.
A prototype is presented which uses the English Wikipedia as a semantically dense, large language corpus to generate a custom lexicon and language model for each lecture from a small set of keywords. Two strategies for extracting a topic-specific subset of Wikipedia articles are investigated: a naïve crawler which follows all article links from a set of seed articles produced by a Wikipedia search from the initial keywords, and a refinement which follows only links to articles sufficiently similar to the parent article. Pair-wise article similarity is computed from a pre-computed vector space model of Wikipedia article term scores generated using latent semantic indexing.
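The similarity-gated crawl described above can be sketched as follows: links are followed only when the linked article's vector is close enough to its parent's. The link and vector lookups and the 0.5 threshold are placeholders; in the thesis the article vectors come from a pre-computed latent semantic indexing model, which is not reproduced here.

```python
from collections import deque

import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def similarity_crawl(seeds, links_of, vector_of, threshold=0.5, limit=1000):
    """Collect a topic-specific subset of articles starting from seed titles.

    links_of:  title -> list of linked article titles  (placeholder lookup)
    vector_of: title -> pre-computed article vector     (placeholder lookup)
    """
    collected, frontier = set(seeds), deque(seeds)
    while frontier and len(collected) < limit:
        parent = frontier.popleft()
        for child in links_of(parent):
            if child in collected:
                continue
            # Follow a link only if the child article is similar enough to its
            # parent; the naive crawler simply follows every link.
            if cosine(vector_of(parent), vector_of(child)) >= threshold:
                collected.add(child)
                frontier.append(child)
    return collected

# Toy illustration with hand-made vectors and links (stand-ins for Wikipedia and LSI).
vectors = {"Speech recognition": np.array([1.0, 0.0]),
           "Language model":     np.array([0.9, 0.1]),
           "Baseball":           np.array([0.0, 1.0])}
links = {"Speech recognition": ["Language model", "Baseball"],
         "Language model": [], "Baseball": []}
print(similarity_crawl(["Speech recognition"], links.__getitem__, vectors.__getitem__))
```

The article texts collected this way would then feed the lecture-specific lexicon and language model.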
The CMU Sphinx4 ASR engine is used to generate transcripts from thirteen recorded lectures from Open Yale Courses, using the English HUB4 language model as a reference and the two topic-specific language models generated for each lecture from Wikipedia.
Three standard metrics – Perplexity, Word Error Rate and Word Correct Rate – are used to evaluate the extent to which the adapted language models improve the searchability of the resulting transcripts, and in particular improve the recognition of specialist words. Ranked Word Correct Rate is proposed as a new metric better aligned with the goals of improving transcript searchability and specialist word recognition.
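For concreteness, the sketch below computes Word Error Rate and Word Correct Rate from an edit-distance alignment of a reference transcript against an ASR hypothesis, using the standard definitions; the specific weighting behind Ranked Word Correct Rate is defined in the thesis and is not reproduced here.

```python
def align_counts(ref: list[str], hyp: list[str]) -> tuple[int, int, int, int]:
    """Return (substitutions, deletions, insertions, hits) from a Levenshtein alignment."""
    # dp[i][j] = (cost, subs, dels, ins, hits) for ref[:i] vs hyp[:j]
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0, 0)
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j, 0)
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            match = ref[i - 1] == hyp[j - 1]
            c, s, d, n, h = dp[i - 1][j - 1]
            diag = (c + (0 if match else 1), s + (0 if match else 1), d, n, h + (1 if match else 0))
            c, s, d, n, h = dp[i - 1][j]
            up = (c + 1, s, d + 1, n, h)          # deletion
            c, s, d, n, h = dp[i][j - 1]
            left = (c + 1, s, d, n + 1, h)        # insertion
            dp[i][j] = min(diag, up, left)
    _, subs, dels, ins, hits = dp[len(ref)][len(hyp)]
    return subs, dels, ins, hits

def wer(ref: list[str], hyp: list[str]) -> float:
    subs, dels, ins, _ = align_counts(ref, hyp)
    return (subs + dels + ins) / max(len(ref), 1)

def wcr(ref: list[str], hyp: list[str]) -> float:
    *_, hits = align_counts(ref, hyp)
    return hits / max(len(ref), 1)

ref = "latent semantic indexing builds a vector space model".split()
hyp = "latest semantic indexing builds vector space model".split()
print(round(wer(ref, hyp), 3), round(wcr(ref, hyp), 3))
```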
Analysis of recognition performance shows that the language models derived using the similarity-based Wikipedia crawler outperform models created using the naïve crawler, and that transcripts using similarity-based language models have better perplexity and Ranked Word Correct Rate scores than those created using the HUB4 language model, but worse Word Error Rates.
It is concluded that English Wikipedia may successfully be used as a language resource for unsupervised topic adaptation of language models to improve recognition performance for better searchability of lecture recording transcripts, although possibly at the expense of other attributes such as readability.
699. 一個對單篇中文文章擷取關鍵字之演算法 / A Keyword Extraction Algorithm for Single Chinese Document. Wu, Tai Hsun (吳泰勳), date unknown.
In the past 14 years, the Taiwan e-Learning and Digital Archives Program has built digital archives of organisms, archaeology, geology and other material, organized into 15 topics. The goal of the work presented in this thesis is to automatically extract keywords from documents in the digital archives, and the techniques developed along with the work can be used to build a connection between the digital archives and news articles. Because news articles constantly introduce new words or new uses of words, this thesis proposes an algorithm that extracts keywords from a single Chinese document without using a corpus or dictionary. Given a Chinese document, the algorithm first uses a bigram-based approach to divide it into bigrams of Chinese characters, so the smallest unit of a term is two characters (for example, 中文). Next, it calculates the term frequencies of the bigrams and filters out those with low frequencies, clusters the remaining frequent terms, and finally calculates chi-square values to produce the keywords most related to the topic of the document. The co-occurrence of terms is used as an indicator of their importance: if a term's distribution of co-occurrence over the frequent terms deviates strongly from the overall distribution, the term is likely to be a keyword.
Unlike English, where words are separated by delimiters, Chinese text has no spaces between characters, which makes word segmentation a challenging task. The proposed algorithm performs Chinese word segmentation with the bigram-based approach, and the segmented words are compared with those produced by CKIP and the Stanford Chinese Segmenter. Comparisons are presented for different settings: whether or not infrequent terms are filtered out, and whether or not frequent terms are grouped by a clustering algorithm, combined with scoring by chi-square value or by term frequency. The dataset used in the experiments is downloaded from the Academia Sinica Digital Resources site, and the ground truth is provided by Gainwisdom (撈智網), developed by the Computer Systems and Communication Lab of the Institute of Information Science, Academia Sinica. According to the experimental results, some of the segmented words given by the bigram-based approach are the same as those given by CKIP or the Stanford Chinese Segmenter, while others have stronger connections to the topics of the documents. The main advantage of the bigram-based approach is that it does not require a corpus or dictionary. The algorithm was developed with the aim of promoting the digital archives: by extracting topic keywords from articles on currently popular topics and linking them to related archive materials, it is hoped to drive a new wave of interest in the digital archives.
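A minimal sketch of the bigram and chi-square co-occurrence scoring described above is given below, assuming sentence-level co-occurrence and omitting the clustering of frequent terms; the thresholds and the toy input are illustrative only.

```python
from collections import Counter
from itertools import product

def bigrams(sentence: str) -> list[str]:
    # Smallest unit is two characters, e.g. "中文".
    return [sentence[i:i + 2] for i in range(len(sentence) - 1)]

def extract_keywords(sentences: list[str], n_frequent: int = 5, top_k: int = 5):
    grams_per_sentence = [set(bigrams(s)) for s in sentences]
    freq = Counter(g for grams in grams_per_sentence for g in grams)
    frequent = [g for g, _ in freq.most_common(n_frequent)]     # frequent-term set

    # Co-occurrence counts between every term and each frequent term (per sentence).
    cooc = Counter()
    for grams in grams_per_sentence:
        for t, f in product(grams, frequent):
            if t != f and f in grams:
                cooc[(t, f)] += 1

    # Chi-square: how far a term's co-occurrence with the frequent terms
    # deviates from what the frequent terms' overall frequencies would predict.
    total = sum(freq[f] for f in frequent)
    scores = {}
    for t in freq:
        n_t = sum(cooc[(t, f)] for f in frequent)
        if n_t == 0:
            continue
        chi2 = 0.0
        for f in frequent:
            expected = n_t * freq[f] / total
            if expected > 0:
                chi2 += (cooc[(t, f)] - expected) ** 2 / expected
        scores[t] = chi2
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

doc = ["數位典藏計畫典藏國家文物", "數位典藏資料與時事互動", "關鍵字作為數位典藏資料與時事的橋樑"]
print(extract_keywords(doc))
```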
700. Cluster-Based Term Weighting and Document Ranking Models. Murugesan, Keerthiram, 01 January 2011.
A term weighting scheme measures the importance of a term in a collection, and a document ranking model uses these term weights to score or rank documents in the collection. We present a series of cluster-based term weighting and document ranking models based on TF-IDF and Okapi BM25. These models compute inter-cluster and intra-cluster frequency components from the generated clusters and use them, in addition to the term and document frequency components, to weight the importance of a term. In this thesis, we show how these models outperform the TF-IDF and Okapi BM25 models in document clustering and ranking.
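The sketch below shows one way a cluster-aware weight could be assembled on top of standard TF-IDF, combining an intra-cluster factor (how widespread the term is within the document's cluster) with an inter-cluster factor (how few clusters contain it). The particular combination and names are assumptions for illustration; the exact models defined in the thesis are not reproduced here.

```python
import math

def cluster_tfidf(term: str, doc: list[str], docs: list[list[str]],
                  clusters: list[int], doc_index: int) -> float:
    """Cluster-aware TF-IDF-style weight (illustrative combination, not the thesis's)."""
    n_docs = len(docs)
    tf = doc.count(term) / max(len(doc), 1)

    # Standard inverse document frequency over the whole collection.
    df = sum(1 for d in docs if term in d)
    idf = math.log((n_docs + 1) / (df + 1)) + 1

    # Intra-cluster frequency: share of documents in this document's cluster containing the term.
    cid = clusters[doc_index]
    members = [d for d, c in zip(docs, clusters) if c == cid]
    intra = sum(1 for d in members if term in d) / len(members)

    # Inter-cluster factor: terms confined to few clusters are more discriminative.
    n_clusters = len(set(clusters))
    clusters_with_term = len({c for d, c in zip(docs, clusters) if term in d})
    inter = math.log((n_clusters + 1) / (clusters_with_term + 1)) + 1

    return tf * idf * intra * inter

docs = [["term", "weighting", "scheme"], ["document", "ranking", "model"],
        ["cluster", "based", "weighting"], ["okapi", "bm25", "ranking"]]
clusters = [0, 1, 0, 1]          # e.g. output of a clustering algorithm
print(cluster_tfidf("weighting", docs[0], docs, clusters, 0))
```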