1

Research Data Services Maturity in Academic Libraries

Kollen, Christine, Kouper, Inna, Ishida, Mayu, Williams, Sarah, Fear, Kathleen 01 1900 (has links)
An ACRL white paper from 2012 reported that, at that time, only a small number of academic libraries in the United States and Canada offered research data services (RDS), but many were planning to do so within the next two years (Tenopir, Birch, and Allard, 2012). By 2013, 74% of Association of Research Libraries (ARL) survey respondents offered RDS and an additional 23% were planning to do so (Fearon, Gunia, Pralle, Lake, and Sallans, 2013). Academic libraries recognize that the landscape of services changes quickly and that they need to support the changing needs of research and instruction. In their efforts to implement RDS, libraries often respond to pressures originating outside the library, such as national or funder mandates for data management planning and data sharing. To provide effective support for researchers and instructors, though, libraries must be proactive and develop new services that look forward while still accommodating the human, technological, and intellectual capital accumulated over the decades. Setting the stage for data curation in libraries means creating visionary approaches that supersede institutional differences while still accommodating diversity in implementation. How do academic libraries work towards that? This chapter will combine a historical overview of RDS thinking and implementation, based on the existing literature, with an empirical analysis of ARL libraries' current RDS goals and activities. The latter is based on a study we conducted in 2015 that included a content analysis of North American research library web pages and interviews with library leaders and administrators of ARL libraries. Using historical and our own data, we will synthesize the current state of RDS implementation across ARL libraries. Further, we will examine models of research data management maturity (see, for example, Qin, Crowston, and Flynn, 2014) and discuss how such models compare to our own three-level classification of services and activities offered at libraries: basic, intermediate, and advanced. Our analysis will conclude with a set of recommendations for next steps, i.e., actions and resources that a library might consider to expand its RDS to the next maturity level.

References

Fearon, D. Jr., Gunia, B., Pralle, B. E., Lake, S., & Sallans, A. L. (2013). Research data management services (ARL SPEC Kit 334). Washington, D.C.: ARL. Retrieved from http://publications.arl.org/Research-Data-Management-Services-SPEC-Kit-334/

Tenopir, C., Birch, B., & Allard, S. (2012). Academic libraries and research data services: Current practices and plans for the future. ACRL. Retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/Tenopir_Birch_Allard.pdf

Qin, J., Crowston, K., & Flynn, C. (2014). A capability maturity model for research data management [Wiki]. Retrieved from http://rdm.ischool.syr.edu/xwiki/bin/view/CMM+for+RDM/WebHome
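As an illustration of the three-level classification mentioned above, the following minimal Python sketch assigns a library an overall maturity level from the services it offers. The service names and their level assignments are hypothetical examples for illustration, not the chapter's actual rubric.

```python
# Illustrative sketch only: service names and level assignments are
# hypothetical, not the rubric used in the chapter.
from enum import IntEnum

class Maturity(IntEnum):
    BASIC = 1
    INTERMEDIATE = 2
    ADVANCED = 3

# Hypothetical mapping of RDS offerings to maturity levels.
SERVICE_LEVELS = {
    "dmp_consulting": Maturity.BASIC,
    "web_guides": Maturity.BASIC,
    "metadata_support": Maturity.INTERMEDIATE,
    "repository_deposit": Maturity.INTERMEDIATE,
    "data_curation": Maturity.ADVANCED,
}

def classify(services: set) -> Maturity:
    """Return the highest maturity level among the services a library offers."""
    levels = [SERVICE_LEVELS[s] for s in services if s in SERVICE_LEVELS]
    return max(levels) if levels else Maturity.BASIC

print(classify({"web_guides", "repository_deposit"}).name)  # INTERMEDIATE
```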
2

Data Management and Curation: Services and Resources

Kollen, Christine, Bell, Mary 18 October 2016 (has links)
Poster from University of Arizona 2016 IT Summit / Are you or the researchers you work with writing a grant proposal that requires a data management plan? Are you working on a research project and have questions about how to effectively and efficiently manage your research data? Are you interested in sharing your data with other researchers? We can help! For the past several years, the University of Arizona (UA) Libraries, in collaboration with the Office of Research and Discovery and the University Information Technology Services, has been providing data management services and resources to the campus. We are interested in tailoring our services and resources to what you need. We conducted a research data management survey in 2014 and are currently working on the Data Management and Data Curation and Publication (DMDC) pilot. This poster will describe what data management and curation services we are currently providing, and ask for your feedback on potential new data management services and resources.
3

Curating Digital Research Data

Smith, MacKenzie 23 April 2012 (has links)
'Data Management and Curation' Breakout session from the Living the Future 8 Conference, April 23-24, 2012, University of Arizona Libraries, Tucson, AZ.
4

Developing Data Management Services: What Support do Researchers Need?

Kollen, Christine 18 October 2016 (has links)
Presented at the University of Arizona 2016 IT Summit / The past several years have seen an increasing emphasis on providing access to the results of research, both publications and data. The majority of federal grant funding agencies require that researchers include a data management plan as part of their grant proposals. In response, the University of Arizona Libraries, in collaboration with the Office of Research and Discovery and the University Information Technology Services, has been providing data management services and resources to the campus for the past several years. In 2014, we conducted a research data management survey to find out how UA researchers manage their research data, determine the demand for existing services, and identify new services that UA researchers need. In the fall of 2015, the Data Management and Data Publication and Curation (DMDC) Pilot was started to determine what specific services and tools, including training, support, and the needed technology infrastructure, researchers require to effectively and efficiently manage and curate their research data. This presentation will describe the data management services we are currently offering, discuss findings from the 2014 survey, and present initial results from the DMDC pilot.
5

Supporting Metadata Management for Data Curation: Problem and Promise

Westbrooks, Elaine L. 02 May 2008 (has links)
Breakout session from the Living the Future 7 Conference, April 30-May 3, 2008, University of Arizona Libraries, Tucson, AZ. / Research communities and libraries are on the verge of reaching a saturation point with regard to the number of published reports documenting, planning, and defining e-science, e-research, cyberscholarship, and data curation. Despite the volumes of literature, little research is devoted to metadata maintenance and infrastructure. Libraries are poised to contribute metadata expertise to campus-wide data curation efforts; however, traditional and costly library methods of metadata creation and management must be replaced with cost-effective models that focus on the researcher's data collection and analysis process. In such a model, library experts collaborate with researchers in building tools for metadata creation and maintenance, which in turn contribute to the long-term sustainability, organization, and preservation of data. This presentation will introduce one of Cornell University Library's collaborative efforts, curating the 2003 Northeast Blackout Data. The goal of the project is to make Blackout data accessible so that it can serve as a catalyst for innovative cross-disciplinary research that will produce better scientific understanding of the technology and communications that failed during the Blackout. Library staff collaborated with three groups: engineering faculty at Cornell, government power experts, and power experts in the private sector. Finally, the core components of the metadata management methodology will be outlined and defined. Rights management emerged as the biggest challenge for the Blackout project.
6

Leveraging big data for competitive advantage in a media organisation

Nartey, Cecil Kabu January 2015 (has links)
Thesis submitted in fulfilment of the requirements for the degree Master of Technology: Information Technology in the Faculty of Informatics and Design at the Cape Peninsula University of Technology / Data sources often emerge with the potential to transform and drive business, allowing never-envisaged business value to be derived. These data sources change the way business enacts and models value generation. As a result, sellers are compelled to capture value by collecting data about the business elements that drive change. Some of these elements, such as the customer and products, generate data as part of transactions, which necessitates placing the business element at the centre of the organisation's data curation journey in order to reveal changes and how these elements affect the business model. Data in business represents information translated into a format convenient for transfer. Data holds the relevant markers needed to measure business elements and provides the relevant metrics to monitor, steer and forecast business to attain enterprise goals. Data forms the building blocks of information within an organisation, allowing knowledge and facts to be obtained. At its lowest level of abstraction, it provides a platform from which insights and knowledge can be derived as a direct input for business decision-making, as these decisions steer business into profitable situations. Because of this, organisations have had to adapt or change their business models to derive business value for sustainability, profitability and transformation. An organisation's business model is a conceptual representation of how the organisation obtains and delivers value to prospective customers (the service beneficiary). In the process of delivering value to the service beneficiaries, data is generated. Generated data leads to business knowledge, which can be leveraged to re-engineer the business model. The business model dictates which information and technology assets are needed for a balanced, profitable and optimised operation. The information assets represent value-holding documented facts, and they go hand in hand with technology assets. The technology assets within an organisation are the technologies (computers, communications and databases) that support the automation of well-defined tasks as the organisation seeks to remain relevant to its clientele. What has become apparent is that companies find it difficult to leverage the opportunities that data, and for that matter Big Data (BD), offers them. A data curation journey enables a seller to strategise and collect insightful data to influence how business may be conducted in a sustainable and profitable way, while positioning the curating firm in a state of 'information advantage'. While much of the discussion surrounding the concept of BD has focused on programming models (such as Hadoop) and technology innovations usually referred to as disruptive technologies (such as the Internet of Things and the automation of knowledge work), the real driver of technology and business is BD economics: the combination of open-source data management and advanced analytics software with commodity-based, scale-out architectures that are comparatively cheaper than the sustainable technologies prevalent in industry.
Hadoop, though widely misconstrued as one, is not an integration platform; it is a model that helps determine data value while bringing on board an optimised way of curating data cheaply as part of the integration architecture. The objectives of the study were to explore how an organisation can exploit the opportunities BD offers, such as leveraging insights to enable business transformation. This is accomplished by assessing the level of BD integration with the business model using the BD Business Model Maturation Index. Guidelines with subsequent recommendations are proposed for curation procedures aimed at improving the curation process. A qualitative research methodology was adopted. The research design frames the research as a single case study, the philosophy as interpretivist, the approach as data collection through interviews, and the strategy as a review of the method of analysis deployed in the study. Themes that emerged from the categorised data indicate a divergence of business elements into primary business elements and secondary supporting business elements. Furthermore, results show that data curation still hinges firmly on traditional data curation processes, which diminishes the benefits associated with BD curation. Results suggest a guided data curation process, optimised by persistence hybridisation, as an enabler for gaining information advantage. The research also evaluated the level of integration of BD into the case business model to extrapolate results leading to guidelines and recommendations for BD curation.
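An index-based assessment of this kind can be pictured as scoring the business model on a handful of dimensions and averaging the results. The dimensions, scale, and scores in the sketch below are invented placeholders, not the instrument used in the thesis.

```python
# Hypothetical sketch of a maturation-index style assessment: dimensions,
# 1-5 scale, and scores are placeholders, not the thesis's instrument.
DIMENSIONS = ["data_collection", "analytics", "integration_with_model"]

def maturation_index(scores: dict) -> float:
    """Average the 1-5 scores across the dimensions that were assessed."""
    assessed = [scores[d] for d in DIMENSIONS if d in scores]
    return sum(assessed) / len(assessed)

print(maturation_index(
    {"data_collection": 3, "analytics": 2, "integration_with_model": 2}
))  # ~2.33: BD only partially integrated with the business model
```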
7

Open Archival Information System (OAIS) as a data curation standard in the World Data Centre

Laughton, Paul Arthur 06 June 2012 (has links)
D. Litt. et Phil. / The use of data in science has evolved to a new level in e-science. Collaboration in e-science is important as scientists, engineers and technologists work together to solve scientific problems through the collection and analysis of large data sets. These experiments can generate enormous amounts of data, creating a need for more efficient storage, management and processing of data. Data needs to be managed effectively to ensure possible future use for secondary analysis and further experimentation. The practice of data curation deals with the management of data, with the objective of sustaining data as a resource for future use. A number of frameworks and models have been developed to address the curation of data, but only the Open Archival Information System (OAIS) has been accepted internationally. The World Data Centre (WDC) is an organisation that was established to ensure access to scientific data for a number of different scientific disciplines. This organisation consists of 52 individual data centres (iWDCs) that are members of the WDC and are responsible for the curation of scientific data. Because the data curation practices and needs of each iWDC differ, the purpose of this study is to determine to what extent it is possible to develop a framework for the curation of data in the WDC. This study used a mixed-methods research design, collecting data from an online survey (quantitative data) and a multiple-case case study (qualitative data). All the iWDCs were invited to participate in the online survey, which was created to quantify OAIS functional model compatibility; sampling for the case study was conducted based on the OAIS functional model compatibility scores. Based on the findings from this study, suggestions towards a suitable framework for the curation of data in the WDC are made. The key outcomes of this research include a quantitative OAIS functional model compatibility test and suggestions towards a suitable framework for the curation of data. These suggestions should in future be tested in the newly formed World Data System (WDS) and adjustments made to create a viable framework for curating data in the WDS.
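To make the idea of a quantitative compatibility test concrete, here is a minimal sketch that scores survey answers against the six OAIS functional entities. The yes/no question structure and unweighted scoring are assumptions for illustration; the thesis's actual instrument is not reproduced here.

```python
# Hedged sketch: the real survey's questions and weighting are not shown;
# this simply reports, per OAIS functional entity, the fraction of
# conformant ("yes") answers a data centre gave.
OAIS_ENTITIES = [
    "Ingest", "Archival Storage", "Data Management",
    "Administration", "Preservation Planning", "Access",
]

def compatibility_scores(answers: dict) -> dict:
    """Map each OAIS functional entity to a 0-1 compatibility score."""
    scores = {}
    for entity in OAIS_ENTITIES:
        responses = answers.get(entity, [])
        if responses:  # skip entities the respondent left blank
            scores[entity] = sum(responses) / len(responses)
    return scores

# A hypothetical iWDC response: True = practice in place, False = absent.
iwdc_answers = {"Ingest": [True, True, False], "Access": [True, False]}
print(compatibility_scores(iwdc_answers))  # {'Ingest': ~0.67, 'Access': 0.5}
```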
8

Italianising English words with G2P techniques in TTS voices. An evaluation of different models

Grassini, Francesco January 2024 (has links)
Text-to-speech voices have come a long way in terms of naturalness and are closer to human-sounding than ever. Among the problems that persist, however, is the pronunciation of foreign words. The experiments conducted in this thesis use grapheme-to-phoneme (G2P) models to tackle this issue and, more specifically, to adjust the erroneous pronunciation of English words to an Italian English accent in Italian-speaking voices. We curated a dataset of words collected during recording sessions with an Italian voice actor reading general conversational sentences, and manually transcribed their pronunciation in Italian English. In the second stage, we augmented the dataset by collecting the most common surnames in Great Britain and the United States, phonetically transcribing them with a rule-based phoneme-mapping algorithm previously deployed by the company, and then manually adjusting the pronunciations to Italian English. Thirdly, using the massively multilingual ByT5 model, a Transformer G2P model pre-trained on 100 languages, as well as its tokenizer-dependent counterparts T5_base and T5_small, and an LSTM with attention based on OpenNMT, we performed 10-fold cross-validation with the curated dataset. The results show that augmenting the data benefited every model. In terms of PER, WER and accuracy, the transformer-based ByT5_small strongly outperformed its T5_small and T5_base counterparts even with a third or two-thirds of the training data. The second-best performing model, the attention-based LSTM built with the OpenNMT framework, also outperformed the T5 models, achieved the second-best accuracy in our experiments, and was the 'lightest' in terms of trainable parameters (2M) in comparison to ByT5 (299M) and the T5 models (60 and 200M).
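For readers unfamiliar with the setup, the sketch below shows how a byte-level ByT5 checkpoint can be loaded and queried for G2P conversion with the Hugging Face transformers library. It assumes the public google/byt5-small checkpoint; the thesis's fine-tuned weights and its Italian English dataset are not public, so without seq2seq fine-tuning on (word, transcription) pairs the generated output is not a meaningful transcription.

```python
# Sketch under stated assumptions: Hugging Face transformers and the public
# google/byt5-small checkpoint. ByT5 operates on raw bytes, which is why no
# language-specific subword tokenizer is needed for grapheme input.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tok = AutoTokenizer.from_pretrained("google/byt5-small")
model = T5ForConditionalGeneration.from_pretrained("google/byt5-small")

def g2p(word: str) -> str:
    """Generate a phoneme string for a word (meaningful only after
    fine-tuning on grapheme-to-phoneme pairs such as those above)."""
    inputs = tok(word, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=32)
    return tok.decode(output_ids[0], skip_special_tokens=True)

print(g2p("smith"))
```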
9

Ein längeres Leben für Deine Daten! / Let your data live longer!

Schäfer, Felix 20 April 2016 (has links) (PDF)
Data life cycle and research data management plans are just two of many key terms used in the current discussion about digital research data. But what do they mean, on the one hand for an individual scholar, and on the other for a digital infrastructure like IANUS? The presentation explains some of these terms and shows how IANUS is dealing with them in order to enhance the reusability of unique data. The presentation starts with an overview of the different disciplines, research methods and types of data which together characterise modern research on ancient cultures. Digital data is produced in nearly all scientific processes and has gained a dominant role, as the stakeholder analysis and the evaluation of test data collections carried out by IANUS in 2013 clearly demonstrate. Nevertheless, despite their high relevance, digital files and folders are in danger with regard to their accessibility and reusability in the near and distant future. Not only do storage devices, software applications and file formats slowly but steadily become obsolete; the relevant information (i.e. the metadata) needed to understand all the produced bits and bytes will also be lost over the years. Urgent questions therefore concern how we can prevent, or at least reduce, a foreseeable loss of digital information, and what we will do with all the results that do not find their way into publications. As the discipline-specific national centre for research data in archaeology and ancient studies, IANUS tries to answer these questions and to establish different services in this context. The slides give an overview of the centre's structure, its state of development and its planned targets. The primary service (scheduled for autumn 2016) will be the long-term preservation, curation and publication of digital research data to ensure their reusability, and will be open to any person and institution. One existing offering is the "IT-Empfehlungen für den nachhaltigen Umgang mit digitalen Daten in den Altertumswissenschaften" (IT recommendations for the sustainable handling of digital data in the ancient studies), which provide information and advice about data management, file formats and project documentation, and offer instructions on how to deposit data collections for archiving and dissemination. External experts are cordially invited to contribute and write missing recommendations as new authors.
10

A multi-layered approach to information extraction from tables in biomedical documents

Milosevic, Nikola January 2018 (has links)
The quantity of literature in the biomedical domain is growing exponentially, and it is becoming impossible for researchers to cope with this ever-increasing amount of information. Text mining provides methods that can improve access to information of interest through information retrieval, information extraction and question answering. However, most of these systems focus on information presented in the main body of the text while ignoring other parts of the document, such as tables and figures. Tables are a potentially important component of research presentation, as authors often include more detailed information in tables than in the textual sections of a document. Tables allow the presentation of large amounts of information in relatively limited space, due to their structural flexibility and ability to present multi-dimensional information. Table processing encapsulates specific challenges that table mining systems need to take into account, including the variety of visual and semantic structures in tables, the variety of information presentation formats, and dense content in table cells. The work presented in this thesis examines a multi-layered approach to information extraction from tables in biomedical documents. We propose a representation model of tables, which describes table structures and how they are read, and a method for table structure disentangling and information extraction consisting of: (1) table detection, (2) functional analysis, (3) structural analysis, (4) semantic tagging, (5) pragmatic analysis, (6) cell selection, and (7) syntactic processing and extraction. In order to validate our approach, show its potential and identify remaining challenges, we applied our methodology to two case studies. The aim of the first case study was to extract baseline characteristics of clinical trials (number of patients, age, gender distribution, etc.) from tables. The second case study explored how the methodology can be applied to relationship extraction, examining the extraction of drug-drug interactions. Our method performed functional analysis with a precision of 0.9425, recall of 0.9428 and F1-score of 0.9426. Relationships between cells were recognized with a precision of 0.9238, recall of 0.9744 and F1-score of 0.9484. The information extraction methodology is the state of the art in table information extraction, recording F1-scores of 0.82-0.93 for demographic data, adverse event and drug-drug interaction extraction, depending on the complexity of the task and the available semantic resources. The presented methodology demonstrates that information can be efficiently extracted from tables in the biomedical literature. Information extraction from tables can be important for enhancing data curation, information retrieval, question answering and decision support systems with additional information from tables that cannot be found in other parts of the document.
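As a toy illustration of the layered idea (with table detection and the richer structural and pragmatic analyses simplified away), the sketch below treats row 0 of a small clinical-trial-style table as the functional header, tags data rows semantically by keywords in their stub cells, and selects the corresponding cells. The table and tag vocabulary are invented for the example, not taken from the thesis.

```python
# Toy sketch: functional analysis is reduced to "row 0 is the header",
# semantic tagging to a keyword lookup, and cell selection to a column zip.
# The table and tag vocabulary are invented for illustration.
TABLE = [
    ["Characteristic", "Group A", "Group B"],   # functional header row
    ["Patients (n)",   "120",     "118"],
    ["Mean age",       "54.2",    "53.9"],
]

SEMANTIC_TAGS = {"patients": "demographic.n", "age": "demographic.age"}

def extract(table: list) -> dict:
    """Tag each data row via its stub cell, then map tags to header->value cells."""
    header, results = table[0][1:], {}
    for row in table[1:]:
        for keyword, tag in SEMANTIC_TAGS.items():
            if keyword in row[0].lower():
                results[tag] = dict(zip(header, row[1:]))
    return results

print(extract(TABLE))
# {'demographic.n': {'Group A': '120', 'Group B': '118'},
#  'demographic.age': {'Group A': '54.2', 'Group B': '53.9'}}
```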
