261 |
A Constitutional Study of Hate Speech Regulation: Contemporary Issues Concerning Freedom of Expression (ヘイト・スピーチ規制に関する憲法学的考察 : 表現の自由を巡る現代的課題). Higaki, Shinji (桧垣 伸次), 17 September 2015
Doctor of Laws, Doshisha University
|
262 |
Data Fusion and Text Mining for Supporting Journalistic Work. Zsombor, Vermes, January 2022
During the past several decades, journalists have been struggling with the ever-growing amount of data on the internet. Investigating the validity of sources or finding similar articles for a story can consume a lot of time and effort, and these problems are further amplified by the shrinking staffs of news agencies. The solution is to empower the remaining professional journalists with digital tools created by computer scientists. This thesis project is inspired by the idea of supporting journalistic work with interactive visual interfaces and artificial intelligence. More specifically, within the scope of this project we created a backend module that supports several text mining methods, such as keyword extraction, named entity recognition, sentiment analysis and fake news classification, as well as data collection from various data sources, to help professionals in the field of journalism. To implement our system, we first gathered requirements from several researchers and practitioners in journalism, media studies and computer science, and then reviewed the literature on current approaches. Results are evaluated both quantitatively, through individual component benchmarks, and qualitatively, by analyzing the outcomes of semi-structured interviews with collaborating and external domain experts. Our results show that there is a correspondence between the domain experts' perceived value and the performance of the components in the individual evaluations. This suggests that there is potential in this research area and that future work would be welcomed by the journalistic community.
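As a rough illustration of the kind of text-mining backend described above, the following sketch shows keyword extraction and named entity recognition built on spaCy. The function names, the choice of the en_core_web_sm model, and the frequency-based keyword ranking are illustrative assumptions, not the thesis's actual implementation.

```python
# Illustrative sketch only: spaCy-based keyword extraction and NER.
# Assumes the small English model was installed with:
#   python -m spacy download en_core_web_sm
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")

def extract_keywords(text: str, top_n: int = 10) -> list[str]:
    """Rank lemmatized noun chunks by frequency as a crude keyword list."""
    doc = nlp(text)
    chunks = [chunk.lemma_.lower() for chunk in doc.noun_chunks
              if not all(tok.is_stop for tok in chunk)]
    return [kw for kw, _ in Counter(chunks).most_common(top_n)]

def extract_entities(text: str) -> list[tuple[str, str]]:
    """Return (entity text, entity label) pairs found by the NER component."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]

if __name__ == "__main__":
    article = ("The news agency in Stockholm published three investigations "
               "into municipal budgets during 2021.")
    print(extract_keywords(article))
    print(extract_entities(article))
```

Sentiment analysis and fake news classification could be plugged in as further functions behind the same backend interface.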
|
263 |
Text Simplification and Keyphrase Extraction for Swedish. Lindqvist, Ellinor, January 2019
Attempts have been made in Sweden to increase the readability of texts addressed to the public, and ongoing projects are still being conducted by disability associations, private companies and Swedish authorities. In this thesis project, we explore automatic approaches to increasing readability through text simplification and keyphrase extraction, with the goal of facilitating text comprehension and readability for people with reading difficulties. A combination of handwritten rules and monolingual machine translation was used to simplify the syntactic and lexical content of Swedish texts, and noun phrases were extracted to provide the reader with a short summary of the textual content. A user evaluation was conducted to compare the original and simplified versions of the same text, and several texts and their simplified versions were also evaluated using established readability metrics. Although a manual evaluation showed that the implemented rules generally worked as intended on the sentences they targeted, the results from the user evaluation and the readability metrics did not show improvements. We believe that further additions to the rule set, targeting a wider range of linguistic structures, have the potential to improve the results.
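A toy sketch of the rule-based side of such a pipeline is shown below. The substitution dictionary, the single clause-splitting rule and the crude noun-phrase surrogate are invented for illustration; they are not the rule set or the machine-translation component used in the thesis.

```python
# Illustrative sketch only: a toy rule-based lexical simplifier and a crude
# keyphrase extractor for Swedish text. The substitution dictionary and the
# single rewriting rule are invented examples, not the thesis's rule set.
import re

LEXICAL_SUBSTITUTIONS = {          # complex word -> simpler synonym (assumed)
    "erhålla": "få",               # "obtain" -> "get"
    "avsevärt": "mycket",          # "considerably" -> "a lot"
}

def simplify(sentence: str) -> str:
    """Apply word-level substitutions and split coordinated clauses at 'samt'."""
    words = [LEXICAL_SUBSTITUTIONS.get(w.lower(), w) for w in sentence.split()]
    text = " ".join(words)
    # One example syntactic rule: break overly long coordinated clauses.
    return re.sub(r"\s+samt\s+", ". ", text)

def noun_phrase_summary(sentences: list[str], max_phrases: int = 5) -> list[str]:
    """Crude surrogate for noun-phrase extraction: capitalized word spans."""
    phrases = re.findall(r"\b[A-ZÅÄÖ][a-zåäö]+(?: [a-zåäö]+)?\b", " ".join(sentences))
    return phrases[:max_phrases]
```

In practice a proper parser or chunker would replace the regex-based surrogate, but the structure, rules applied per sentence plus a phrase list as summary, mirrors the approach described in the abstract.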
|
264 |
OPEN—Enabling Non-expert Users to Extract, Integrate, and Analyze Open Data. Braunschweig, Katrin; Eberius, Julian; Thiele, Maik; Lehner, Wolfgang, 27 January 2023
Government initiatives for more transparency and participation have led to an increasing amount of structured data on the web in recent years. Many of these datasets have great potential. For example, a situational analysis and meaningful visualization of the data can help point out social or economic issues and raise people's awareness. Unfortunately, the ad-hoc analysis of this so-called Open Data can prove very complex and time-consuming, partly due to a lack of efficient system support. On the one hand, search functionality is required to identify relevant datasets. Common document retrieval techniques used in web search, however, are not optimized for Open Data and do not address the semantic ambiguity inherent in it. On the other hand, semantic integration is necessary to perform analysis tasks across multiple datasets. Doing so in an ad-hoc fashion, however, requires more flexibility and easier integration than most data integration systems provide. It is apparent that an optimal management system for Open Data must combine aspects of both classic approaches. In this article, we propose OPEN, a novel concept for the management and situational analysis of Open Data within a single system. In our approach, we extend a classic database management system with support for the identification and dynamic integration of public datasets. As most web users lack the experience and training required to formulate structured queries in a DBMS, we add support for non-expert users to our system, for example through keyword queries. Furthermore, we address the challenge of indexing Open Data.
|
265 |
Lexeme "hope" in the Russian and Chinese linguistic pictures of the world (based on dictionary data and the results of a psycholinguistic experiment): master's thesis. Chen, Y. (Чэнь, Я.), January 2019
National mentality is expressed in language, in particular through keywords, one of which is hope. The data of the explanatory dictionaries of the Russian and Chinese languages allow us to present a lexicographic description of the lexeme hope and to compare its meanings in the two languages. A psycholinguistic experiment conducted among Russian and Chinese respondents revealed the lexical meanings that exist in actual use: while for Russian respondents hope is associated with expectation, a dream and a female name, for Chinese respondents it is associated with success and motivation.
|
266 |
A Platform for Aligning Academic Assessments to Industry and Federal Job Postings. Parks, Tyler J., 07 1900
The proposed tool provides users with a platform for a side-by-side comparison of classroom assessments and job posting requirements. Using techniques and methodologies from NLP, machine learning, data analysis, and data mining, the employed algorithm analyzes job postings and classroom assessments, extracts and classifies the skill units within them, and then compares sets of skills from the different input volumes. This effectively provides a predicted alignment between academic and career sources, both federal and industrial. The compiled tool results indicate an overall accuracy score of 82% and an alignment score of 75.5% between the input assessments and the job postings overall. In other words, the 50 UNT assessments and 5,000 industry and federal job postings examined demonstrate a compatibility (alignment) of 75.5%, and this measure was calculated using a tool operating at an 82% precision rate.
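A minimal sketch of how a skill-set comparison could yield an alignment percentage is given below. The skill lexicon, the verbatim keyword lookup and the overlap-based score are assumed stand-ins; the platform's actual extraction, classification and alignment metric are not reproduced here.

```python
# Illustrative sketch only: comparing extracted skill sets and reporting an
# alignment percentage. Skill extraction is stubbed with a lexicon lookup;
# the overlap-based score is an assumed stand-in for the tool's real metric.
ASSUMED_SKILL_LEXICON = {"python", "sql", "machine learning", "data analysis",
                         "communication", "git"}

def extract_skills(text: str) -> set[str]:
    """Naive skill spotting: lexicon terms that occur verbatim in the text."""
    lowered = text.lower()
    return {skill for skill in ASSUMED_SKILL_LEXICON if skill in lowered}

def alignment_score(assessment_text: str, posting_texts: list[str]) -> float:
    """Fraction of assessment skills that also appear in the postings."""
    assessment_skills = extract_skills(assessment_text)
    posting_skills = set().union(*(extract_skills(p) for p in posting_texts))
    if not assessment_skills:
        return 0.0
    return len(assessment_skills & posting_skills) / len(assessment_skills)

if __name__ == "__main__":
    course = "Students practice SQL, data analysis and Git workflows."
    jobs = ["Seeking analyst with SQL and Python.", "Data analysis role, Git a plus."]
    print(f"alignment: {alignment_score(course, jobs):.1%}")
```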
|
267 |
AI Enabled Cloud RAN Test Automation: Automatic Test Case Prediction Using Natural Language Processing and Machine Learning Techniques. Santosh Nimbhorkar, Jeet, January 2023
The Cloud Radio Access Network (RAN) is a technology used in the telecommunications industry. It provides a flexible, scalable, and cost-effective solution for managing and delivering seamless wireless network services. However, testing Cloud RAN applications poses formidable challenges due to their complex nature, resulting in potential delays in product delivery and amplified costs. Test automation is one approach to tackling these challenges: by automating the testing process, we can reduce manual effort, enhance the accuracy and efficiency of testing procedures, and ultimately expedite the delivery of high-quality products. Artificial intelligence (AI) and machine learning (ML) can further aid Cloud RAN testing by helping to identify and address complex issues quickly. The goal of this thesis is a data-driven approach to Cloud RAN test automation. Machine learning and natural language processing techniques are used to automatically predict test cases from test instructions. The test instructions are analyzed and keywords are extracted from them using natural language processing; the performance of two keyword extraction techniques is compared, and spaCy was the best-performing keyword extractor. Test script prediction from these keywords is done using two approaches: using test script names and using test script contents. Random Forest was the best-performing model for both approaches, both when the data were oversampled and when they were undersampled.
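The sketch below shows the general shape of such a prediction step: free-text test instructions are turned into features and fed to a Random Forest classifier that predicts a test script label. The example instructions, script names and TF-IDF featurization are assumptions for illustration; the thesis's actual spaCy-based keyword features, data and parameters are not reproduced.

```python
# Illustrative sketch only: predicting a test script from free-text test
# instructions with TF-IDF features and a Random Forest classifier. The
# instructions and script labels below are invented examples.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

instructions = [
    "verify cell setup succeeds after node restart",
    "check throughput during handover between cells",
    "verify alarm is raised when link goes down",
    "measure throughput with maximum connected users",
]
scripts = ["cell_setup.py", "handover_perf.py", "alarm_check.py", "handover_perf.py"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      RandomForestClassifier(n_estimators=200, random_state=0))
model.fit(instructions, scripts)

print(model.predict(["check handover throughput for a moving user"]))
```

Oversampling or undersampling of the minority script classes, as evaluated in the thesis, would be applied to the training data before the fit step.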
|
268 |
Exploratory Ad-Hoc Analytics for Big Data. Eberius, Julian; Thiele, Maik; Lehner, Wolfgang, 19 July 2023
In a traditional relational database management system, queries can only be defined over attributes defined in the schema, but they are guaranteed to give a single, definitive answer structured exactly as specified in the query. In contrast, an information retrieval system allows the user to pose queries without knowledge of a schema, but the result will be a top-k list of possible answers, with no guarantees about the structure or content of the retrieved documents. In this chapter, we present Drill Beyond, a novel IR/RDBMS hybrid system in which the user seamlessly queries a relational database together with a large corpus of tables extracted from a web crawl. The system allows full SQL queries over a relational database, but additionally enables the user to use arbitrary additional attributes in the query that need not be defined in the schema. The system then processes this semi-specified query by computing a top-k list of possible query evaluations, each based on different candidate web data sources, thus mixing properties of the two worlds of RDBMS and IR systems.
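To make the semi-specified-query idea concrete, the following toy sketch extends a local table with an attribute it does not contain by joining candidate web tables and returning ranked alternative evaluations. The pandas-based data, the coverage score and the ranking are assumptions for illustration only, not Drill Beyond's actual query processing or storage model.

```python
# Illustrative sketch only: answering a query for an attribute ("population")
# missing from the local table by joining candidate web tables and returning
# a ranked top-k list of alternative evaluations.
import pandas as pd

local = pd.DataFrame({"city": ["Dresden", "Leipzig", "Chemnitz"],
                      "revenue": [120, 95, 40]})

candidate_web_tables = {
    "web_table_17": pd.DataFrame({"city": ["Dresden", "Leipzig"],
                                  "population": [556000, 601000]}),
    "web_table_42": pd.DataFrame({"city": ["Dresden", "Chemnitz", "Leipzig"],
                                  "population": [554000, 246000, 597000]}),
}

def top_k_evaluations(k: int = 2):
    """Join each candidate source and rank the results by join coverage."""
    results = []
    for name, table in candidate_web_tables.items():
        joined = local.merge(table, on="city", how="left")
        coverage = joined["population"].notna().mean()  # fraction of rows filled
        results.append((coverage, name, joined))
    results.sort(key=lambda r: r[0], reverse=True)
    return results[:k]

for score, source, answer in top_k_evaluations():
    print(f"source={source} coverage={score:.2f}")
    print(answer)
```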
|
269 |
Keyword Search over Federated RDF Graphs by Exploring their Schemas. Torres Izquierdo, Yenier, 28 July 2017
The Resource Description Framework (RDF) was adopted as a W3C recommendation in 1999 and is today a standard for exchanging data on the Web. Indeed, a large amount of data has been converted to RDF, often as multiple datasets physically distributed over different locations. The SPARQL Protocol and RDF Query Language (SPARQL) was officially introduced in 2008 to retrieve RDF data and to provide endpoints for querying distributed sources. An alternative way to access RDF datasets is to use keyword-based queries, an area that has been extensively researched, with a recent focus on Web content. This dissertation describes a strategy to compile keyword-based queries into federated SPARQL queries over distributed RDF datasets, under the assumption that each RDF dataset has a schema and that the federation has a mediated schema. The compilation process of the federated SPARQL query is explained in detail, including how to compute the set of external joins between the generated local subqueries, how to combine, with the help of UNION clauses, the results of local queries that have no joins between them, and how to construct the TARGET clause according to the composition of the WHERE clause. Finally, the dissertation covers experiments with real-world data to validate the implementation.
|
270 |
#Historia: Metadata as a Resource in Historical Research (#Historia : Metadata som resurs i historieforskning). Boström, Hanna, January 2020
During the 2000s, a large amount of significant research has been produced and disseminated through databases. An important link in the dissemination of knowledge is academia, which today accounts for the largest proportion of scientific publications. This historiographically oriented study maps and examines a part of the Swedish historical research and history writing that took place during the 2000s. The scientific discipline investigated within the humanities is the science of history, limited to the research results that students have produced at Swedish universities and university colleges in the subject of history. The source material consists of student essays published in the database Digitala vetenskapliga arkivet (DiVA), today regarded as the most widely used national system for publication data, with over 400,000 published full texts that have been downloaded more than 53 million times. Through empirical and theoretical study and the use of both quantitative and qualitative methods, metadata are analyzed to answer the question of what students in the Swedish education system, at universities and university colleges, have written history about during the 2000s.
To obtain answers, bibliometrics served as the area of knowledge, and the question of which keywords dominated and were most frequently used in the tagging (definitions) of the research results was asked. The sub-question of how the use of keywords varies over time was used to bring out trends in the results. The theoretical framework of the study, and the reading of the quantitative results, was based on Kuhn's theory of paradigms. The results indicate that Gender, Use of History, Archaeology, Historical Consciousness, History Didactics, Identity, Osteology, the Second World War, Discourse Analysis, the Cold War, the Sami, Laboratory Archaeology and Educational History are some of the leading subject areas that students wrote history about during the 2000s. The results also show that the national paradigm dominates students' historical research, although the United States, the Soviet Union, Yugoslavia, Japan, Finland, Sápmi and Israel occur frequently. In conclusion, the study shows that metadata can be used as a resource in historical research while broadening the historical perspective.
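A small sketch of how keyword frequencies and their trend over time might be tabulated from such metadata is shown below. The CSV file name and the "Year"/"Keywords" columns are assumptions about the data layout; actual DiVA exports may be structured differently, and this is not the thesis's own pipeline.

```python
# Illustrative sketch only: counting keyword frequencies in thesis metadata
# and tabulating them per publication year. The file name and the
# "Year"/"Keywords" columns are assumed, not a real DiVA export format.
from collections import Counter
import pandas as pd

records = pd.read_csv("diva_history_essays.csv")        # assumed export
records["Keywords"] = records["Keywords"].fillna("").str.lower()

# Overall keyword frequency across all essays.
all_keywords = Counter(
    kw.strip()
    for cell in records["Keywords"]
    for kw in cell.split(";") if kw.strip()
)
print(all_keywords.most_common(15))

# Trend over time: one count per (year, keyword) pair.
per_year = (records.assign(keyword=records["Keywords"].str.split(";"))
                   .explode("keyword")
                   .assign(keyword=lambda d: d["keyword"].str.strip())
                   .query("keyword != ''")
                   .groupby(["Year", "keyword"]).size()
                   .rename("count").reset_index())
print(per_year.sort_values(["Year", "count"], ascending=[True, False]).head(20))
```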
|