41

Matching ECSF Prescribed Cyber Security Skills with the Swedish Job Market : Evaluating the Effectiveness of a Language Model

Ahmad, Al Ghaith, Abd Ulrahman, Ibrahim January 2023 (has links)
Background: As the demand for cybersecurity professionals continues to rise, it is crucial to identify the key skills necessary to thrive in this field. This research project sheds light on the cybersecurity skills landscape by analyzing the recommendations provided by the European Cybersecurity Skills Framework (ECSF), examining the most required skills in the Swedish job market, and investigating the common skills identified through the findings. The project utilizes the large language model ChatGPT to classify common cybersecurity skills and evaluates its accuracy compared to human classification.

Objective: The primary objective of this research is to examine the alignment between the European Cybersecurity Skills Framework (ECSF) and the specific skill demands of the Swedish cybersecurity job market. This study aims to identify common skills and evaluate the effectiveness of a Language Model (ChatGPT) in categorizing jobs based on ECSF profiles. Additionally, it seeks to provide valuable insights for educational institutions and policymakers aiming to enhance workforce development in the cybersecurity sector.

Methods: The research begins with a review of the European Cybersecurity Skills Framework (ECSF) to understand its recommendations and methodology for defining cybersecurity skills, as well as to delineate the cybersecurity profiles along with their corresponding key cybersecurity skills as outlined by the ECSF. Subsequently, a Python-based web crawler is implemented to gather data on cybersecurity job announcements from the Swedish Employment Agency's website. This data is analyzed to identify the most frequently required cybersecurity skills sought by employers in Sweden. The Language Model (ChatGPT) is utilized to classify these positions according to ECSF profiles. Concurrently, two human agents manually categorize the jobs to serve as a benchmark for evaluating the accuracy of the Language Model, allowing for a comprehensive assessment of its performance.

Results: The study thoroughly reviews and cites the recommended skills outlined by the ECSF, offering a comprehensive European perspective on key cybersecurity skills (Tables 4 and 5). Additionally, it identifies the most in-demand skills in the Swedish job market, as illustrated in Figure 6. The research reveals the match between ECSF-prescribed skills in different profiles and those sought after in the Swedish cybersecurity market. The skills of the profiles 'Cybersecurity Implementer' and 'Cybersecurity Architect' emerge as particularly critical, representing over 58% of the market demand. The research further highlights shared skills across various profiles (Table 7).

Conclusion: This study highlights the match between the European Cybersecurity Skills Framework (ECSF) recommendations and the evolving demands of the Swedish cybersecurity job market. Through a review of ECSF-prescribed skills and a thorough examination of the Swedish job landscape, the research identifies crucial areas of alignment. Significantly, the skills associated with the 'Cybersecurity Implementer' and 'Cybersecurity Architect' profiles emerge as central, collectively constituting over 58% of market demand. This emphasizes the urgent need for educational programs to adapt and harmonize with industry requisites. Moreover, the study advances our understanding of the Language Model's effectiveness in job categorization.
The findings hold significant implications for workforce development strategies and educational policies within the cybersecurity domain, underscoring the pivotal role of informed skills development in meeting the evolving needs of the cybersecurity workforce.
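To make the pipeline concrete, the following is a minimal Python sketch of the two steps the abstract describes: collecting job advertisements and asking a chat model to assign an ECSF profile. The job-ad endpoint, response fields, profile subset, prompt wording and model name are illustrative assumptions, not the thesis' actual crawler or prompts.

```python
# Illustrative sketch only: the job-search endpoint, response fields, ECSF
# profile subset and prompt are assumptions, not the thesis' implementation.
import requests
from openai import OpenAI

JOB_API = "https://example.org/job-ads/search"  # placeholder for the real job-ad source
ECSF_PROFILES = [  # subset of the 12 ECSF profiles, for illustration
    "Cybersecurity Implementer", "Cybersecurity Architect", "Cybersecurity Auditor",
    "Chief Information Security Officer (CISO)", "Cyber Incident Responder",
    "Cyber Threat Intelligence Specialist",
]

def fetch_job_ads(query: str = "cybersecurity", limit: int = 50) -> list[dict]:
    """Download job advertisements matching the query (response structure is hypothetical)."""
    resp = requests.get(JOB_API, params={"q": query, "limit": limit}, timeout=30)
    resp.raise_for_status()
    return resp.json().get("hits", [])

def classify_ad(client: OpenAI, ad_text: str) -> str:
    """Ask a chat model to map one job ad to a single ECSF profile name."""
    prompt = (
        "Classify the following job advertisement into exactly one of these "
        f"ECSF profiles: {', '.join(ECSF_PROFILES)}.\n\n{ad_text}\n\n"
        "Answer with the profile name only."
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # model choice is an assumption
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()

if __name__ == "__main__":
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    for ad in fetch_job_ads():
        label = classify_ad(client, ad.get("description", ""))
        print(ad.get("headline", "untitled"), "->", label)
```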
42

A Platform for Aligning Academic Assessments to Industry and Federal Job Postings

Parks, Tyler J. 07 1900 (has links)
The proposed tool will provide users with a platform to access a side-by-side comparison of classroom assessment and job posting requirements. Using techniques and methodologies from NLP, machine learning, data analysis, and data mining, the employed algorithm analyzes job postings and classroom assessments, extracts and classifies the skill units within them, and then compares the sets of skills from the different input volumes. This effectively provides a predicted alignment between academic and career sources, both federal and industrial. The compilation of tool results indicates an overall accuracy score of 82% and an alignment score of only 75.5% between the input assessments and the overall job postings. These results indicate that the 50 UNT assessments and 5,000 industry and federal job postings examined demonstrate a compatibility (alignment) of 75.5%, and that this measure was calculated using a tool operating at an 82% precision rate.
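As an illustration of the idea of extracting skill units and scoring their overlap, here is a minimal sketch under strong simplifying assumptions (a fixed keyword lexicon and a coverage-style overlap score); the platform's actual NLP/ML classification and its accuracy and alignment metrics are not reproduced here.

```python
# Minimal illustration of skill extraction and an alignment score between two
# text collections. The skill lexicon and the coverage-style overlap are
# simplifying assumptions, not the platform's actual classifier or metric.
import re

SKILL_LEXICON = {
    "python", "java", "sql", "machine learning", "data mining",
    "cloud", "statistics", "network security", "communication",
}

def extract_skills(text: str) -> set[str]:
    """Return lexicon skills mentioned in a piece of text (case-insensitive)."""
    lowered = text.lower()
    return {skill for skill in SKILL_LEXICON
            if re.search(r"\b" + re.escape(skill) + r"\b", lowered)}

def alignment_score(assessments: list[str], postings: list[str]) -> float:
    """Share of skills demanded in postings that are covered by assessments."""
    taught = set().union(*(extract_skills(t) for t in assessments)) if assessments else set()
    demanded = set().union(*(extract_skills(t) for t in postings)) if postings else set()
    return len(taught & demanded) / len(demanded) if demanded else 0.0

if __name__ == "__main__":
    courses = ["Students write Python programs and apply machine learning to data."]
    jobs = ["We need Python, SQL and cloud experience, plus communication skills."]
    print(f"alignment: {alignment_score(courses, jobs):.1%}")  # 1 of 4 demanded skills -> 25.0%
```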
43

Evaluating information content of earnings calls to predict bankruptcy using machine learning techniques

Ghaffar, Arooba January 2022 (has links)
This study investigates the prediction of firms' health, in terms of bankruptcy and non-bankruptcy, based on the sentiments extracted from their earnings calls. Bankruptcy prediction has long been a critical topic in the world of accounting and finance. A firm's economic health is the current financial condition of the firm and is crucial to its stakeholders such as creditors, investors, shareholders, partners, and even customers and suppliers. Various methodologies and strategies have been proposed in the research domain for predicting company bankruptcy more promptly and accurately. Conventionally, financial risk prediction has been based solely on historic financial data; however, an increasing number of finance papers have also analyzed textual data in the last few years. A company's earnings calls are a key source of information for investigating its current financial condition, how the business is doing, and what the expectations are for the next quarters. During the call, management offers an overview of recent performance and provides guidance on expectations for the next quarter. The earnings call summaries provided by management can be used to extract the CEO's sentiments through sentiment analysis. In the last decade, machine learning based techniques have been proposed to achieve accurate predictions of firms' economic health. Even though most of these techniques work well in a limited context, from a broader perspective they are unable to retrieve the true semantics of the earnings calls, which results in lower accuracy in predicting the actual condition of firms' economic health. Thus, state-of-the-art machine learning and deep learning techniques are used in this thesis to improve the accuracy of predicting firms' health from earnings calls. Various machine learning and deep learning methods were applied to a web-scraped earnings call dataset, and the results show that long short-term memory (LSTM) performs best among the compared models.
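A condensed sketch of the kind of LSTM text classifier such a study might train on earnings-call transcripts is shown below; the vocabulary size, sequence length, layer sizes and toy labels are assumptions, not the thesis' configuration.

```python
# Condensed sketch of an LSTM text classifier for bankruptcy prediction.
# Vocabulary size, sequence length, layer sizes and the toy labels are
# illustrative assumptions, not the configuration used in the thesis.
import numpy as np
import tensorflow as tf

calls = [
    "strong quarter with record revenue and improved full-year guidance",
    "liquidity concerns, covenant breaches and going-concern doubt",
]
labels = np.array([0, 1])  # 0 = non-bankrupt, 1 = bankrupt (toy labels)

# Turn raw transcript text into padded integer token sequences.
vectorize = tf.keras.layers.TextVectorization(max_tokens=20_000, output_sequence_length=200)
vectorize.adapt(calls)

model = tf.keras.Sequential([
    vectorize,                                       # text -> integer token ids
    tf.keras.layers.Embedding(input_dim=20_000, output_dim=64),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of bankruptcy
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(np.array(calls), labels, epochs=3, batch_size=2, verbose=0)
print(model.predict(np.array(calls), verbose=0).ravel())
```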
44

Data mining historical insights for a software keyword from GitHub and Libraries.io; GraphQL / Datautvinning av historiska insikter för ett mjukvara nyckelord från GitHub och Libraries.io; GraphQL

Bodemar, Gustaf January 2022 (has links)
This paper explores an approach to extracting historical insights about a software keyword by data mining GitHub and Libraries.io. We test our method using the keyword GraphQL to see what insights we can gain. We managed to plot several timelines of how repositories and software libraries related to our keyword were created over time. We could also perform a rudimentary analysis of how active those items were. We also extracted programming language data associated with each repository and library from GitHub and Libraries.io. With this data we could, at worst, correlate which programming languages were associated with each item or, in the best case, predict which implementations of GraphQL they used. Through our attempt we found many problems and caveats that needed to be dealt with, but we still concluded that extracting historical insights by data mining GitHub and Libraries.io is worthwhile.
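The collection step can be illustrated with a short sketch against GitHub's public search API (pagination, authentication and rate limits omitted); the Libraries.io call is only indicated in a comment, since its response fields are an assumption here.

```python
# Sketch of mining repository-creation timelines and language data for a
# keyword from GitHub. Pagination, authentication and rate limits are ignored.
from collections import Counter
import requests

KEYWORD = "graphql"

def github_repos(keyword: str) -> list[dict]:
    """Fetch up to 100 repositories matching the keyword from the GitHub search API."""
    resp = requests.get(
        "https://api.github.com/search/repositories",
        params={"q": keyword, "per_page": 100},
        headers={"Accept": "application/vnd.github+json"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["items"]

repos = github_repos(KEYWORD)
created_per_year = Counter(r["created_at"][:4] for r in repos)      # creation timeline
languages = Counter(r["language"] for r in repos if r["language"])  # associated languages

print("repositories created per year:", dict(sorted(created_per_year.items())))
print("most common languages:", languages.most_common(5))

# Libraries.io exposes a comparable search API (parameter and field names
# below are assumptions; an API key is required):
# requests.get("https://libraries.io/api/search",
#              params={"q": KEYWORD, "api_key": "YOUR_KEY"}, timeout=30)
```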
45

[en] ALUMNI TOOL: INFORMATION RECOVERY OF PERSONAL DATA ON THE WEB IN AUTHENTICATED SOCIAL NETWORKS / [pt] ALUMNI TOOL: RECUPERAÇÃO DE DADOS PESSOAIS NA WEB EM REDES SOCIAIS AUTENTICADAS

LUIS GUSTAVO ALMEIDA 02 August 2018 (has links)
[pt] O uso de robôs de busca para coletar informações para um determinado contexto sempre foi um problema desafiante e tem crescido substancialmente nos últimos anos. Por exemplo, robôs de busca podem ser utilizados para capturar dados de redes sociais profissionais. Em particular, tais redes permitem estudar as trajetórias profissionais dos egressos de uma universidade, e responder diversas perguntas, como por exemplo: Quanto tempo um ex-aluno da PUC-Rio leva para chegar a um cargo de relevância? No entanto, um problema de natureza comum a este cenário é a impossibilidade de coletar informações devido a sistemas de autenticação, impedindo um robô de busca de acessar determinadas páginas e conteúdos. Esta dissertação aborda uma solução para capturar dados, que contorna o problema de autenticação e automatiza o processo de coleta de dados. A solução proposta coleta dados de perfis de usuários de uma rede social profissional para armazenamento em banco de dados e posterior análise. A dissertação contempla ainda a possibilidade de adicionar diversas outras fontes de dados dando ênfase a uma estrutura de armazém de dados. / [en] The use of search bots to collect information for a given context has grown substantially in recent years. For example, search bots may be used to capture data from professional social networks. In particular, such social networks facilitate studying the professional trajectories of the alumni of a given university and answering several questions, such as: How long does a former student of PUC-Rio take to reach a management position? However, a common problem in this scenario is the inability to collect information due to authentication systems, which prevent a search robot from accessing certain pages and content. This dissertation presents a solution for capturing data that circumvents the authentication problem and automates the data collection process. The proposed solution collects data from user profiles of a professional social network for later database storage and analysis. The dissertation also contemplates the possibility of adding several other data sources, with emphasis on a data warehouse structure.
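As a rough illustration of the storage side of such a solution, the sketch below persists collected profile records into SQLite for later analysis; the schema and fields are invented for the example, and the authenticated collection itself is reduced to a placeholder function.

```python
# Illustrative storage layer for collected alumni profiles. The schema, the
# fields and the collect_profiles() placeholder are assumptions; the actual
# tool authenticates against a professional social network before collecting.
import sqlite3

def collect_profiles() -> list[dict]:
    # Placeholder for the authenticated crawler described in the dissertation.
    return [{"name": "Example Alumna", "degree_year": 2010,
             "current_title": "Engineering Manager", "company": "ACME"}]

def store_profiles(profiles: list[dict], db_path: str = "alumni.db") -> None:
    """Persist profile records so trajectory queries can be run later."""
    con = sqlite3.connect(db_path)
    con.execute("""CREATE TABLE IF NOT EXISTS profile (
                       name TEXT, degree_year INTEGER,
                       current_title TEXT, company TEXT)""")
    con.executemany(
        "INSERT INTO profile VALUES (:name, :degree_year, :current_title, :company)",
        profiles,
    )
    con.commit()
    con.close()

if __name__ == "__main__":
    store_profiles(collect_profiles())
```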
46

Získávání znalostí z veřejných semistrukturovaných dat na webu / Knowledge Discovery in Public Semistructured Data on the Web

Kefurt, Pavel January 2016 (has links)
The first part of the thesis deals with the methods and tools that can be used to retrieve data from websites and with the tools used for data mining. The second part is devoted to a practical demonstration of the entire process. The website of the Czech Dance Sport Federation, available at www.csts.cz, is used as the source website.
47

Comparação dos procedimentos de "imprint" e escarificação no diagnóstico da Leishmaniose Tegumentar Americana / Comparison of the imprint and scraping procedures in the diagnosis of American Tegumentary Leishmaniasis

Mello, Cintia Xavier de January 2011 (has links)
Fundação Oswaldo Cruz. Instituto de Pesquisa Clínica Evandro Chagas. Rio de Janeiro, RJ, Brasil / A leishmaniose tegumentar americana (LTA) é uma doença infecciosa, causada por parasitas do gênero Leishmania, que apresenta características complexas em diferentes aspectos. O diagnóstico, sempre que possível, deve ser feito com base em evidências epidemiológicas, aspecto clínico e exames laboratoriais. Para pesquisa direta do parasito, são utilizados os procedimentos de escarificação e “imprint”, sendo a escarificação o método mais rápido, de menor custo e de fácil execução. Baseado nas características dos exames diretos para a confirmação dos casos de LTA, o Ministério da Saúde tem incentivado a implantação do procedimento de escarificação em todos os laboratórios centrais de saúde pública, sendo importantes tanto o conhecimento dos parâmetros de acurácia deste método quanto a padronização relacionada a forma de coleta e leitura, para que possa ser aplicado de forma uniformizada em todo o Brasil. Neste estudo, objetivamos avaliar a sensibilidade dos métodos diretos (“imprint” e escarificação), comparados com o teste padrão de referência (cultura). Além disso, buscamos estabelecer critérios de coleta e leitura com fins de uniformização do método para propor sua aplicação em diferentes regiões brasileiras. A população do estudo foi constituída de 110 pacientes com suspeita clínica de LTA que foram atendidos no Laboratório de Vigilância em Leishmanioses (VigiLeish/IPEC/Fiocruz) para avaliação clínica e coleta das amostras. Dentre os 110 pacientes analisados 40 foram confirmados com LTA. O “imprint” foi positivo em 28 pacientes conferindo sensibilidade de 70%, a escarificação realizada em borda externa foi positiva em 17 pacientes e em borda interna em 25 alcançando sensibilidade de 42,5% e 62,5% respectivamente. Além de mais sensível o material obtido da borda interna da lesão apresentou uma maior quantidade de células brancas e menos hemácias facilitando a leitura da lâmina. Os parâmetros de acurácia encontrados para os métodos diretos foram satisfatórios demonstrando que essas metodologias podem ser implantadas em todo o Brasil para o diagnóstico da leishmaniose tegumentar americana. / American Tegumentary Leishmaniasis (ATL) is an infectious disease caused by parasites of the genus Leishmania, which has complex characteristics in different aspects. The diagnosis, whenever possible, should be based on epidemiological evidence, clinical aspect and laboratory tests. Imprint and scraping procedures are used for direct detection of the parasite. Scraping is the quickest, lowest-cost and easiest to conduct. Based on the characteristics of direct examination for the confirmation of ATL cases, the Brazilian Ministry of Health has encouraged the implementation of the scraping procedure in all central public health laboratories. Thus, knowledge of the accuracy parameters of this procedure and the standardization of collection and reading methods are important for its application in a uniform manner throughout Brazil.
In this study, we aimed to evaluate the sensitivity of the direct methods (imprint and scraping) compared with the reference standard test (culture). In addition, we sought to establish collection and reading criteria with the purpose of standardizing the method and to propose its application in different Brazilian regions. The study population comprised 110 patients with clinical suspicion of ATL who were treated at the Laboratory of Leishmaniasis Surveillance (VigiLeish/IPEC/Fiocruz) for clinical evaluation and sample collection. Among the 110 patients studied, 40 were confirmed with ATL. The imprint was positive in 28 patients, yielding a sensitivity of 70%; scraping of the outer edge was positive in 17 patients and of the inner edge in 25, reaching sensitivities of 42.5% and 62.5%, respectively. In addition to being more sensitive, the material obtained from the inner edge of the lesion presented a larger amount of white cells and fewer red cells, favoring slide reading. The accuracy parameters found for the direct methods were satisfactory, showing that they may be implemented in all Brazilian regions for the diagnosis of American tegumentary leishmaniasis.
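For reference, the reported sensitivities follow directly from the confirmed-case counts above, with culture as the reference standard:

```latex
\text{Sensitivity} = \frac{\text{test-positive confirmed cases}}{\text{all confirmed cases}},\qquad
\text{imprint: } \frac{28}{40} = 70\%,\quad
\text{outer-edge scraping: } \frac{17}{40} = 42.5\%,\quad
\text{inner-edge scraping: } \frac{25}{40} = 62.5\%.
```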
48

Služba pro ověření spolehlivosti a pečlivosti českých advokátů / A Service for Verification of Czech Attorneys

Jílek, Radim January 2017 (has links)
This thesis deals with the design and implementation of an Internet service that makes it possible to objectively assess and verify the reliability and diligence of Czech lawyers based on publicly available data from several courts. The aim of the thesis is to create this service and put it into operation. The results of the work are the programs that carry out the partial steps needed to realize this goal.
49

Evaluation of web scraping methods : Different automation approaches regarding web scraping using desktop tools / Utvärdering av webbskrapningsmetoder : Olika automatiserings metoder kring webbskrapning med hjälp av skrivbordsverktyg

Oucif, Kadday January 2016 (has links)
A lot of information can be found and extracted from the semantic web in different forms through web scraping, with many techniques having emerged over time. This thesis is written with the objective of evaluating different web scraping methods in order to develop an automated, performance-reliable, easily implemented and solid extraction process. A number of parameters are set to better evaluate and compare existing techniques. A matrix of desktop tools is examined and two are chosen for evaluation. The evaluation also includes learning how to set up the scraping process with so-called agents. A number of links are scraped using the presented techniques, with and without executing JavaScript from the web sources. Prototypes built with the chosen techniques are presented, with Content Grabber as the final solution. The result is a better understanding of the subject along with a cost-effective extraction process consisting of different techniques and methods, where a good understanding of the structure of the web sources facilitates the data collection. To sum it all up, the result is discussed and presented with regard to the chosen parameters. / En hel del information kan bli funnen och extraherad i olika format från den semantiska webben med hjälp av webbskrapning, med många tekniker som uppkommit med tiden. Den här rapporten är skriven med målet att utvärdera olika webbskrapnings metoder för att i sin tur utveckla en automatiserad, prestandasäker, enkelt implementerad och solid extraheringsprocess. Ett antal parametrar är definierade för att utvärdera och jämföra befintliga webbskrapningstekniker. En matris av skrivbords verktyg är utforskade och två är valda för utvärdering. Utvärderingen inkluderar också tillvägagångssättet till att lära sig sätta upp olika webbskrapnings processer med så kallade agenter. Ett nummer av länkar blir skrapade efter data med och utan exekvering av JavaScript från webbsidorna. Prototyper med de utvalda teknikerna testas och presenteras med webbskrapningsverktyget Content Grabber som slutlig lösning. Resultatet utav det hela är en bättre förståelse kring ämnet samt en prisvärd extraheringsprocess bestående utav blandade tekniker och metoder, där en god vetskap kring webbsidornas uppbyggnad underlättar datainsamlingen. Sammanfattningsvis presenteras och diskuteras resultatet med hänsyn till valda parametrar.
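The with/without JavaScript contrast that the evaluation draws can also be illustrated outside desktop tools: the sketch below counts the links obtained from a static fetch versus a browser-rendered page. The target URL is a placeholder, and Selenium stands in for the commercial tools purely for demonstration.

```python
# Illustrative comparison of link extraction with and without JavaScript
# execution. The URL is a placeholder; Content Grabber (used in the thesis)
# is a desktop tool, so Selenium stands in here purely for demonstration.
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

URL = "https://example.com/"  # placeholder target

# Static fetch: only links present in the raw HTML are visible.
static_html = requests.get(URL, timeout=30).text
static_links = [a.get("href") for a in BeautifulSoup(static_html, "html.parser").find_all("a")]

# Rendered fetch: links injected by JavaScript become visible as well.
options = Options()
options.add_argument("--headless=new")
driver = webdriver.Chrome(options=options)
driver.get(URL)
rendered_links = [a.get_attribute("href") for a in driver.find_elements(By.TAG_NAME, "a")]
driver.quit()

print(f"static: {len(static_links)} links, rendered: {len(rendered_links)} links")
```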
50

Less Detectable Web Scraping Techniques / Mindre Detekterbara Webbskrapningstekniker

Färholt, Fredric January 2021 (has links)
Web scraping is an efficient way of gathering data, and it has also become much easier to perform and offers a high success rate. People no longer need to be tech-savvy when scraping data since several easy-to-use platform services exist. This study conducts experiments to see if people can scrape in an undetectable fashion using a popular and intelligent JavaScript library (Puppeteer). Three web scraper algorithms, two of which use movement patterns from real-world web users, demonstrate how to retrieve information automatically from the web. They operate on a website built for this research that utilizes known semi-security mechanisms, a honeypot, and activity logging, making it possible to collect and evaluate data from both the algorithms and the website. The result shows that it may be possible to construct a web scraper algorithm with less detectability using Puppeteer. One of the algorithms reveals that it is possible to control computer performance using built-in methods in Puppeteer. / Webbskrapning är ett effektivt sätt att hämta data på, det har även blivit en aktivitet som är enkel att genomföra och chansen att en lyckas är hög. Användare behöver inte längre vara fantaster inom teknik när de skrapar data, det finns idag mängder olika och lättanvändliga plattformstjänster. Den här studien utför experiment för att se hur personer kan skrapa på ett oupptäckbart sätt med ett populärt och intelligent JavaScript bibliotek (Puppeteer). Tre webbskrapningsalgoritmer, där två av dem använder rörelsemönster från riktiga webbanvändare, demonstrerar hur en kan samla information. Webbskrapningsalgoritmerna har körts på en hemsida som ingått i experimentet med kännbar säkerhet, honeypot, och aktivitetsloggning, något som gjort det möjligt att samla och utvärdera data från både algoritmerna och hemsidan. Resultatet visar att det kan vara möjligt att skrapa på ett oupptäckbart sätt genom att använda Puppeteer. En av algoritmerna avslöjar även möjligheten att kontrollera prestanda genom att använda inbyggda metoder i Puppeteer.
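Puppeteer is a Node.js library; to keep a single language for the examples in this compilation, the sketch below uses pyppeteer, its Python port, to show the kind of human-like behaviour such algorithms build in: randomized pauses, stepped mouse movement and skipping hidden (honeypot) links. The URL, selectors and timings are assumptions, not the algorithms evaluated in the study.

```python
# Sketch of "human-like" scraping behaviour with pyppeteer (the Python port of
# Puppeteer). Randomized pauses, stepped mouse movement and skipping hidden
# (honeypot) links illustrate the idea; the URL, selectors and timings are
# assumptions, not the algorithms evaluated in the study.
import asyncio
import random
from pyppeteer import launch

URL = "https://example.com/"  # placeholder target

async def scrape_politely() -> None:
    browser = await launch(headless=True)
    page = await browser.newPage()
    await page.goto(URL)

    # Move the mouse in small steps, as a human would, before interacting.
    await page.mouse.move(random.randint(100, 400), random.randint(100, 400), steps=25)
    await asyncio.sleep(random.uniform(1.0, 3.0))  # irregular think-time

    visible_links = []
    for link in await page.querySelectorAll("a"):
        # Honeypot links are typically present in the DOM but not rendered.
        shown = await page.evaluate(
            "el => el.offsetParent !== null && "
            "getComputedStyle(el).visibility !== 'hidden'", link)
        if shown:
            href = await (await link.getProperty("href")).jsonValue()
            visible_links.append(href)

    print(f"collected {len(visible_links)} visible links")
    await browser.close()

asyncio.run(scrape_politely())
```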
