51. Exploring State-of-the-Art Natural Language Processing Models with Regards to Matching Job Adverts and Resumes. Rückert, Lise; Sjögren, Henry (January 2022).
The ability to automate the process of comparing and matching resumes with job adverts is a growing research field. This can be done using Natural Language Processing (NLP), the area of machine learning that enables a model to learn human language. This thesis explores and evaluates the application of the state-of-the-art NLP model SBERT to the task of comparing and calculating a measure of similarity between text extracted from resumes and adverts. It also investigates what type of data produces the best-performing model on this task. The results show that SBERT can quickly be trained on unlabeled data from the HR domain using a Triplet network, and achieves high performance when tested on various tasks. The models are shown to be bilingual, to handle unseen vocabulary, and to capture the meaning and descriptive context of entire sentences rather than single words. The conclusion is therefore that the models have a sound understanding of semantic similarity and relatedness. In some cases, however, the models become binary in their similarity scores, and it is hard to tune a single model that exhaustively covers a domain as diverse as HR. A model fine-tuned on clean, generic data extracted from adverts shows the best overall performance in terms of loss and consistency.
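The thesis is not accompanied by code here; as a minimal illustration of the scoring step the abstract describes (cosine similarity between SBERT embeddings of an advert and a resume), the sketch below uses the sentence-transformers library. The checkpoint name and example texts are placeholders, not the models or data used in the thesis, and the triplet-loss fine-tuning step is omitted.

```python
# Minimal sketch: scoring advert/resume similarity with a pre-trained SBERT
# model (placeholder checkpoint and texts; fine-tuning with a triplet loss,
# as in the thesis, is not shown).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

advert = "We are looking for a backend developer with Python and SQL experience."
resume = "Five years of experience building Python services and relational databases."

# Encode both texts into fixed-size sentence embeddings.
emb_advert, emb_resume = model.encode([advert, resume], convert_to_tensor=True)

# Cosine similarity in [-1, 1]; higher means a closer semantic match.
score = util.cos_sim(emb_advert, emb_resume).item()
print(f"advert/resume similarity: {score:.3f}")
```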
52. Increasing CNN representational power using absolute cosine value regularization. Singleton, William S.
Indiana University-Purdue University Indianapolis (IUPUI) / The Convolutional Neural Network (CNN) is a mathematical model designed to distill input information into a more useful representation. This distillation process removes information through a series of dimensionality reductions, which ultimately grant the model the ability to resist noise and generalize effectively. However, CNNs often contain elements that contribute little toward useful representations. This thesis aims to provide a remedy for this problem by introducing Absolute Cosine Value Regularization (ACVR), a regularization technique hypothesized to increase the representational power of CNNs by using a gradient descent orthogonalization algorithm to force the vectors that constitute the filters of any given convolutional layer to occupy unique positions in their respective spaces. This method should, in theory, lead to a more effective balance between information loss and representational power, ultimately increasing network performance. The thesis develops the mathematics and intuition behind ACVR and goes on to propose Dynamic ACVR (D-ACVR). It also examines the effects of ACVR on the filters of a low-dimensional CNN, the effects of ACVR and D-ACVR on traditional convolutional filters in VGG-19, and, finally, regularization of the pointwise filters in MobileNetV1.
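The abstract does not give the regularizer's exact form; the sketch below is one plausible reading of ACVR — penalising the mean absolute pairwise cosine between the flattened filters of a convolutional layer so that gradient descent pushes them toward orthogonality. The function name, toy classification head, and weighting factor lambda_acvr are illustrative assumptions, not the thesis's reference implementation.

```python
# Illustrative sketch of an absolute-cosine-value penalty on conv filters
# (an interpretation of ACVR, not the thesis's reference implementation).
import torch
import torch.nn.functional as F

def acvr_penalty(conv_weight: torch.Tensor) -> torch.Tensor:
    """Mean absolute pairwise cosine similarity between flattened filters."""
    n = conv_weight.shape[0]                       # number of filters
    unit = F.normalize(conv_weight.reshape(n, -1), dim=1)
    cos = unit @ unit.t()                          # n x n pairwise cosines
    off_diag = ~torch.eye(n, dtype=torch.bool, device=cos.device)
    return cos[off_diag].abs().mean()              # 0 when filters are orthogonal

# Usage: add the penalty for a convolutional layer to the task loss.
conv = torch.nn.Conv2d(3, 16, kernel_size=3)
x, target = torch.randn(8, 3, 32, 32), torch.randint(0, 16, (8,))
lambda_acvr = 0.1                                  # assumed weighting factor
logits = conv(x).mean(dim=(2, 3))                  # toy head for illustration
loss = F.cross_entropy(logits, target) + lambda_acvr * acvr_penalty(conv.weight)
loss.backward()
```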
53. Design and analysis of Discrete Cosine Transform-based watermarking algorithms for digital images. Development and evaluation of blind Discrete Cosine Transform-based watermarking algorithms for copyright protection of digital images using handwritten signatures and mobile phone numbers. Al-Gindy, Ahmed M.N. (January 2011).
This thesis deals with the development and evaluation of blind discrete cosine transform-based watermarking algorithms for copyright protection of digital still images using handwritten signatures and mobile phone numbers. The new algorithms take into account the perceptual capacity of each low-frequency coefficient inside the Discrete Cosine Transform (DCT) blocks before embedding the watermark information, and they are suitable for grey-scale and colour images. Handwritten signatures are used instead of pseudo-random numbers. The watermark is inserted in the green channel of RGB colour images and the luminance channel of YCrCb images. Mobile phone numbers are used as watermarks for images captured by mobile phone cameras. The information is embedded multiple times, and a shuffling scheme is applied to ensure that no spatial correlation exists between the original host image and the multiple watermark copies. Multiple embedding increases the robustness of the watermark against attacks, since each watermark copy is individually reconstructed and verified before an averaging process is applied; the averaging reduces the number of errors in the extracted information. The developed watermarking methods are shown to be robust against JPEG compression, removal attacks, additive noise, cropping, scaling, small degrees of rotation, affine transformations, contrast enhancement, low-pass and median filtering, and StirMark attacks. The algorithms have been examined using a library of approximately 40 colour images of size 512 × 512 with 24 bits per pixel, together with their grey-scale versions. Several evaluation techniques were used in the experiments, with different watermarking strengths and different signature sizes, including the peak signal-to-noise ratio, normalized correlation and the structural similarity index. The performance of the proposed algorithms has been compared to other algorithms, and better invisibility with stronger robustness has been achieved.
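The algorithms themselves are developed in the thesis; the sketch below only illustrates the basic operation of additive embedding in a low-frequency DCT coefficient of each 8 × 8 block. It omits the perceptual-capacity weighting, shuffling, multiple embedding and averaging extraction described above, and the coefficient position and embedding strength are assumed values.

```python
# Generic sketch of additive watermarking in a low-frequency DCT coefficient
# of 8x8 blocks (illustrative only; not the thesis's exact algorithm, which
# also weights the embedding by each coefficient's perceptual capacity).
import numpy as np
from scipy.fft import dctn, idctn

def embed_bits(channel: np.ndarray, bits, strength: float = 8.0) -> np.ndarray:
    """Embed one bit per 8x8 block into a low-frequency coefficient."""
    out = channel.astype(float).copy()
    h, w = out.shape
    blocks = [(r, c) for r in range(0, h - 7, 8) for c in range(0, w - 7, 8)]
    for bit, (r, c) in zip(bits, blocks):
        block = dctn(out[r:r + 8, c:c + 8], norm="ortho")
        block[2, 1] += strength if bit else -strength   # assumed low-frequency slot
        out[r:r + 8, c:c + 8] = idctn(block, norm="ortho")
    return np.clip(out, 0, 255)

# Example: watermark the green channel of an RGB image (as in the thesis).
rgb = np.random.randint(0, 256, (64, 64, 3)).astype(float)
rgb[:, :, 1] = embed_bits(rgb[:, :, 1], bits=[1, 0, 1, 1, 0])
```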
54. Matrix Approximation And Image Compression. Padavana, Isabella R. (June 2024).
This thesis concerns the mathematics and application of various methods for approximating matrices, with a particular eye towards the role such methods play in image compression. An image is stored as a matrix in which each entry records the intensity of the corresponding pixel, so image compression is essentially equivalent to matrix approximation. First, we look at the singular value decomposition (SVD), one of the central tools for analyzing a matrix, and show that, in a precise sense, a truncated SVD gives the best low-rank approximation of any matrix. However, the SVD has some serious shortcomings as an approximation method in the context of digital images. The second method we consider is the discrete Fourier transform, which, unlike the SVD, does not require the storage of basis vectors. We describe the fast Fourier transform, a remarkably efficient method for computing the discrete Fourier transform, and how it can be used to reduce the information in a matrix. Finally, we look at the discrete cosine transform, which reduces the complexity of the calculation further by restricting to a real basis, and at how a filter can be applied to adjust the relative importance of the data encoded by the discrete cosine transform prior to compression. In addition, we developed code implementing the ideas explored in the thesis and demonstrating them on examples.
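As an illustration of the first method discussed, the sketch below computes a rank-k approximation from a truncated SVD; by the Eckart-Young theorem this is the best rank-k approximation in the Frobenius norm. The rank and the stand-in image matrix are arbitrary example values.

```python
# Sketch of rank-k matrix approximation via the truncated SVD
# (the stand-in image and the rank are placeholders).
import numpy as np

def low_rank(image: np.ndarray, k: int) -> np.ndarray:
    """Best rank-k approximation in the Frobenius norm (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(image, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# A rank-k approximation stores k*(m + n + 1) numbers instead of m*n.
grey = np.random.rand(256, 256)          # stand-in for a greyscale image matrix
approx = low_rank(grey, k=20)
rel_err = np.linalg.norm(grey - approx) / np.linalg.norm(grey)
print(f"relative Frobenius error at rank 20: {rel_err:.3f}")
```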
55. Video extraction for fast content access to MPEG compressed videos. Jiang, Jianmin; Weng, Y. (June 2009).
As existing video processing technology is primarily developed in the pixel domain, yet digital video is stored in compressed format, any application of those techniques to compressed videos would require decompression. For discrete cosine transform (DCT)-based MPEG compressed videos, the computing cost of standard row-by-row and column-by-column inverse DCT (IDCT) transforms for a block of 8 × 8 elements requires 4096 multiplications and 4032 additions, although a practical implementation only requires 1024 multiplications and 896 additions. In this paper, we propose a new algorithm to extract videos directly from the MPEG compressed domain (DCT domain) without a full IDCT, described in three extraction schemes: 1) video extraction in 2 × 2 blocks with four coefficients; 2) video extraction in 4 × 4 blocks with four DCT coefficients; and 3) video extraction in 4 × 4 blocks with nine DCT coefficients. The computing cost incurred is only 8 additions and no multiplications for the first scheme, 2 multiplications and 28 additions for the second scheme, and 47 additions (no multiplications) for the third scheme. Extensive experiments were carried out, and the results reveal that: 1) the extracted video maintains competitive quality in terms of visual perception and inspection, and 2) the extracted videos preserve the content well in comparison with fully decompressed ones in terms of histogram measurement. As a result, the proposed algorithm provides useful tools for bridging the gap between the pixel domain and the compressed domain to facilitate content analysis with low latency and high efficiency, such as in surveillance video, interactive multimedia, and image processing applications.
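The paper's extraction formulas are not reproduced in the abstract; as a rough illustration of the underlying idea — recovering a reduced-resolution view directly from DCT coefficients instead of running a full 8 × 8 IDCT — the sketch below builds a standard "DC image", one value per block, from the DC coefficients. This is a simpler scheme than the three proposed above, and the orthonormal DCT scaling is an assumption.

```python
# Rough illustration of compressed-domain extraction: build a reduced
# "DC image" (one pixel per 8x8 block) straight from DCT coefficients,
# with no inverse DCT. This is simpler than the paper's 2x2 / 4x4
# extraction schemes and is shown only to convey the idea.
import numpy as np

def dc_image(dct_blocks: np.ndarray) -> np.ndarray:
    """dct_blocks: shape (H/8, W/8, 8, 8) of per-block DCT coefficients
    (orthonormal scaling assumed). Returns an (H/8, W/8) image."""
    # For an orthonormal 8x8 DCT, the DC coefficient equals 8 * block mean.
    return dct_blocks[:, :, 0, 0] / 8.0

# Example with random stand-in coefficients for a 720x480 frame.
coeffs = np.random.randn(60, 90, 8, 8)
thumb = dc_image(coeffs)      # 60 x 90 thumbnail, computed without any IDCT
```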
56. Bandwidth Limited 320 Mbps Transmitter. Anderson, Christopher (1996).
International Telemetering Conference Proceedings / October 28-31, 1996 / Town and Country Hotel and Convention Center, San Diego, California / With every new spacecraft that is designed comes a greater density of information to be stored once it is in operation. This, coupled with the desire to reduce the number of ground stations needed to download this information from the spacecraft, places new requirements on telemetry transmitters: they must be capable of data rates of 320 Mbps and beyond.

Although the necessary bandwidth is available for some non-bandwidth-limited transmissions in Ka-Band and above, many systems will continue to rely on narrower allocations down to X-Band. These systems will require filtering of the modulation to meet spectral limits, and this filtering must not introduce high levels of inter-symbol interference (ISI) into the transmission.

These constraints have been addressed at CE by implementing a DSP technique that pre-filters a QPSK symbol set to achieve bandwidth-limited 320 Mbps operation. The implementation operates within the speed range of currently available radiation-hardened digital technologies and consumes less power than traditional high-speed FIR techniques.
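The transmitter's DSP pre-filter is not described in the abstract; purely to illustrate the general principle of band-limiting a QPSK symbol stream with a pulse-shaping filter, the sketch below applies a root-raised-cosine response as a stand-in. The roll-off, oversampling factor and filter span are assumed values, not the design referred to above.

```python
# Illustrative QPSK pulse-shaping sketch (generic root-raised-cosine filter
# as a stand-in for the transmitter's DSP pre-filter; all parameters below
# are assumptions for illustration).
import numpy as np

def rrc_taps(beta: float, sps: int, span: int) -> np.ndarray:
    """Root-raised-cosine impulse response, `span` symbols at `sps` samples/symbol."""
    t = np.arange(-span * sps / 2, span * sps / 2 + 1) / sps
    taps = np.zeros_like(t)
    for i, ti in enumerate(t):
        if np.isclose(ti, 0.0):
            taps[i] = 1.0 - beta + 4 * beta / np.pi
        elif np.isclose(abs(ti), 1 / (4 * beta)):
            taps[i] = (beta / np.sqrt(2)) * ((1 + 2 / np.pi) * np.sin(np.pi / (4 * beta))
                                             + (1 - 2 / np.pi) * np.cos(np.pi / (4 * beta)))
        else:
            num = np.sin(np.pi * ti * (1 - beta)) + 4 * beta * ti * np.cos(np.pi * ti * (1 + beta))
            den = np.pi * ti * (1 - (4 * beta * ti) ** 2)
            taps[i] = num / den
    return taps / np.sqrt(np.sum(taps ** 2))

bits = np.random.randint(0, 2, 2000)
symbols = (1 - 2 * bits[0::2]) + 1j * (1 - 2 * bits[1::2])    # QPSK mapping
sps = 4                                                        # samples per symbol
upsampled = np.zeros(len(symbols) * sps, dtype=complex)
upsampled[::sps] = symbols
shaped = np.convolve(upsampled, rrc_taps(beta=0.35, sps=sps, span=8))  # band-limited waveform
```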
57. CFD investigation of the atmospheric boundary layer under different thermal stability conditions. Pieterse, Jacobus Erasmus (2013).
Thesis (MScEng)--Stellenbosch University, 2013. / ENGLISH ABSTRACT: An accurate description of the atmospheric boundary layer (ABL) is a prerequisite for computational fluid dynamics (CFD) wind studies. This includes taking into account the thermal stability of the atmosphere, which can be stable, neutral or unstable, depending on the nature of the surface fluxes of momentum and heat. The diurnal variation between stable and unstable conditions in the Namib Desert interdune was measured and quantified using the wind velocity and temperature profiles that describe the thermally stratified atmosphere, as derived from Monin-Obukhov similarity theory. The implementation of this thermally stratified atmosphere in CFD was examined in this study using Reynolds-averaged Navier-Stokes (RANS) turbulence models. The temperature, velocity and turbulence profiles had to be maintained along an extensive computational domain while simultaneously allowing full variation in pressure and density through the ideal gas law. This included implementing zero heat transfer from the surface, through the boundary layer, under neutral conditions so that the adiabatic lapse rate could be sustained. Buoyancy effects were included by adding weight to the fluid, leading to the emergence of the hydrostatic pressure field and the resultant density changes expected in the real atmosphere. The CFD model was validated against measured data from the literature for the flow over a cosine hill in a wind tunnel. The standard k-ε and SST k-ω turbulence models, modified for gravity effects, represented the data most accurately. The flow over an idealised transverse dune immersed in the thermally stratified ABL was also investigated. It was found that flow recovery was enhanced and re-attachment occurred earlier under unstable conditions, while flow recovery and re-attachment took longer under stable conditions. Flow acceleration over the crest of the dune was also greater under unstable conditions, and the effect of the dune was felt to much greater heights in the atmosphere under unstable conditions through enhanced vertical velocities. Under stable conditions, vertical velocities were reduced and the influence on the flow higher up in the atmosphere was much smaller than for unstable or neutral conditions. This showed that the assumption of neutral conditions could lead to an incomplete picture of the flow conditions that influence any particular case of interest. / AFRIKAANSE OPSOMMING: 'n Akkurate beskrywing van die atmosferiese grenslaag (ABL) is 'n voorvereiste
vir wind studies met berekenings-vloeimeganika (CFD). Dit sluit in die
inagneming van die termiese stabiliteit van die atmosfeer, wat stabiel, neutraal of
onstabiel kan wees, afhangende van die aard van die oppervlak vloed van
momentum en warmte. Die daaglikse variasie tussen stabiele en onstabiele
toestande in die Namib Woestyn interduin is gemeet en gekwantifiseer deur
gebruik te maak van die wind snelheid en temperatuur profiele wat die termies
gestratifiseerde atmosfeer, soos afgelei deur Monin-Obukhov teorie, beskryf. Die
implementering van hierdie termies gestratifiseerde atmosfeer in CFD is in hierdie
studie aangespreek deur gebruik te maak van RANS turbulensie modelle. Die
handhawing van die temperatuur, snelheid en turbulensie profiele in die lengte
van 'n uitgebreide berekenings domein is nodig, en terselfdertyd moet toegelaat
word vir volledige variasie in die druk en digtheid, deur die ideale gaswet. Dit
sluit in die implementering van zero hitte-oordrag vanaf die grond onder neutrale
toestande sodat die adiabatiese vervaltempo volgehou kan word. Drykrag effekte
is ingesluit deur die toevoeging van gewig na die vloeistof, wat lei tot die
ontwikkeling van die hidrostatiese druk veld, en die gevolglike digtheid
veranderinge, wat in die werklike atmosfeer verwag word. Die CFD-model is
gevalideer teen gemete data, vanaf die literatuur, vir die vloei oor 'n kosinus
heuwel in 'n windtonnel. Die standaard k-ε en SST k-ω turbulensie modelle, met
veranderinge vir swaartekrag effekte, het die data mees akkuraat voorgestel. Die
vloei oor 'n geïdealiseerde transversale duin gedompel in die termies
gestratifiseerde ABL is ook ondersoek. Daar is bevind dat die vloei herstel is
versterk en terug-aanhegging het vroeër plaasgevind in onstabiele toestande,
terwyl vloei herstel en terug-aanhegging langer gevat het in stabiele toestande.
Daar is ook bevind dat vloei versnelling oor die kruin van die duin groter was
onder onstabiele toestande. Die effek van die duin op die vloei hoër op in die
atmosfeer is ook op hoër afstande onder onstabiele toestande gevoel, deur middel
van verhoogte vertikale snelhede. Onder stabiele toestande, is vertikale snelhede
verminder, en die invloed op die vloei hoër op in die atmosfeer was veel minder
as vir onstabiel of neutrale toestande. Dit het getoon dat die aanname van neutrale
toestande kan lei tot 'n onvolledige beeld van die vloei toestande wat 'n invloed op
'n bepaalde geval kan hê.
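The thesis derives its inlet profiles from Monin-Obukhov similarity theory; as a hedged illustration, the sketch below evaluates the log-law wind profile with Businger-Dyer stability corrections, with stable, neutral and unstable regimes selected through the Obukhov length. The friction velocity, roughness length and Obukhov lengths are arbitrary example values, not the Namib measurements or the exact profile forms used in the thesis.

```python
# Illustrative Monin-Obukhov velocity profile with Businger-Dyer stability
# corrections (example parameter values only; not the thesis's measured data).
import numpy as np

KAPPA = 0.41  # von Karman constant

def psi_m(zeta):
    """Integrated stability correction for momentum (Businger-Dyer forms)."""
    zeta = np.asarray(zeta, dtype=float)
    stable = -5.0 * zeta                                   # stable branch (zeta >= 0)
    x = (1 - 16 * np.minimum(zeta, 0.0)) ** 0.25           # unstable branch (zeta < 0)
    unstable = (2 * np.log((1 + x) / 2) + np.log((1 + x**2) / 2)
                - 2 * np.arctan(x) + np.pi / 2)
    return np.where(zeta < 0, unstable, stable)

def wind_profile(z, u_star=0.3, z0=1e-3, L=np.inf):
    """Mean wind speed at heights z for friction velocity u_star, roughness z0,
    Obukhov length L (L > 0 stable, L < 0 unstable, |L| -> inf neutral)."""
    z = np.asarray(z, dtype=float)
    return (u_star / KAPPA) * (np.log(z / z0) - psi_m(z / L) + psi_m(z0 / L))

z = np.array([2.0, 10.0, 50.0, 100.0])
print(wind_profile(z, L=np.inf))    # neutral
print(wind_profile(z, L=200.0))     # weakly stable
print(wind_profile(z, L=-200.0))    # weakly unstable
```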
58. Sumarizace českých textů z více zdrojů / Multi-source Text Summarization for Czech. Brus, Tomáš (January 2012).
This work focuses on the summarization task for a set of articles on the same topic. It discusses several possible approaches to summarization and ways to assess the quality of the resulting summaries. The implementation of the described algorithms and their application to selected texts form part of the work. The input texts come from several Czech news servers and are represented as deep syntactic trees (the so-called tectogrammatical layer).
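The abstract does not name the summarization algorithms; purely as an illustration of one common extractive baseline for multi-source input (not necessarily the approach taken in the thesis, which works on tectogrammatical trees rather than surface sentences), the sketch below scores sentences by TF-IDF similarity to the centroid of all articles and keeps the highest-scoring ones.

```python
# One common extractive baseline for multi-source summarization, shown only
# for illustration (the thesis's actual algorithms are not reproduced here).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def centroid_summary(sentences, n_keep=3):
    """Keep the sentences closest (in TF-IDF space) to the centroid of all input."""
    vec = TfidfVectorizer()
    X = vec.fit_transform(sentences)
    centroid = np.asarray(X.mean(axis=0))                  # 1 x vocab centroid vector
    scores = cosine_similarity(X, centroid).ravel()
    best = sorted(sorted(range(len(sentences)), key=lambda i: -scores[i])[:n_keep])
    return [sentences[i] for i in best]                    # preserve original order

articles = [
    "The government approved the new budget on Tuesday.",
    "Opposition parties criticised the spending plan.",
    "The budget increases funding for public transport.",
    "Weather forecasts predict rain for the weekend.",
]
print(centroid_summary(articles, n_keep=2))
```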
59. Word2vec2syn: Synonymidentifiering med Word2vec / Word2vec2syn: Synonym Identification using Word2vec. Pettersson, Tove (January 2019).
Inom NLP (eng. natural language processing) är synonymidentifiering en av de språkvetenskapliga utmaningarna som många antar. Fodina Language Technology AB är ett företag som skapat ett verktyg, Termograph, ämnad att samla termer inom företag och hålla den interna språkanvändningen konsekvent. En metodkombination bestående av språkteknologiska strategier utgör synonymidentifieringen och Fodina önskar ett större täckningsområde samt mer dynamik i framtagningsprocessen. Därav syftade detta arbete till att ta fram en ny metod, utöver metodkombinationen, för just synonymidentifiering. En färdigtränad Word2vec-modell användes och den inbyggda funktionen för cosinuslikheten användes för att få fram synonymer och skapa kluster. Modellen validerades, testades och utvärderades i förhållande till metodkombinationen. Valideringen visade att modellen skattade inom ett rimligt mänskligt spann i genomsnitt 60,30 % av gångerna och Spearmans korrelation visade på en signifikant stark korrelation. Testningen visade att 32 % av de bearbetade klustren innehöll matchande synonymförslag. Utvärderingen visade att i de fall som förslagen inte matchade så var modellens synonymförslag korrekta i 5,73 % av fallen jämfört med 3,07 % för metodkombinationen. Den interna reliabiliteten för utvärderarna visade på en befintlig men svag enighet, Fleiss Kappa = 0,19, CI(0,06, 0,33). Trots viss osäkerhet i resultaten påvisas ändå möjligheter för vidare användning av word2vec-modeller inom Fodinas synonymidentifiering. / One of the main challenges in the field of natural language processing (NLP) is synonym identification. Fodina Language Technology AB is the company behind the tool Termograph, which aims to collect terms and provide a consistent language within companies. A combination of multiple methods from the field of language technology constitutes the synonym identification, and Fodina would like to improve the coverage and increase the dynamics of the working process. The focus of this thesis was therefore to evaluate a new method for synonym identification beyond the combination already in use. A pre-trained Word2vec model was used, and for the synonym identification the built-in function for cosine similarity was applied in order to create clusters. The model was validated, tested and evaluated relative to the method combination. The validation indicated that the model made estimations within a fair human-based range in an average of 60.30% of cases, and Spearman's correlation indicated a strong significant correlation. The testing showed that 32% of the processed synonym clusters contained matching synonym suggestions. The evaluation showed that, in the cases where the clusters did not match, the synonym suggestions from the model were correct in 5.73% of all cases compared to 3.07% for the method combination. The inter-rater reliability indicated a slight agreement, Fleiss' Kappa = 0.19, CI(0.06, 0.33). Despite some uncertainty in the results, opportunities for further use of Word2vec models within Fodina's synonym identification are nevertheless demonstrated.
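As an illustration of the core operation described above — cosine similarity between word vectors as a synonym signal — the sketch below uses gensim with a downloadable pre-trained model as a stand-in. The model name, similarity threshold and query words are placeholders, not the Swedish model or Fodina's pipeline.

```python
# Sketch of cosine-similarity synonym candidates from a pre-trained Word2vec
# model via gensim (model name, threshold, and words are placeholders).
import gensim.downloader as api

model = api.load("word2vec-google-news-300")     # stand-in pre-trained vectors

def synonym_candidates(term, top_n=10, threshold=0.7):
    """Return neighbours whose cosine similarity to `term` exceeds the threshold."""
    return [(word, sim) for word, sim in model.most_similar(term, topn=top_n)
            if sim >= threshold]

print(synonym_candidates("car"))
print(model.similarity("car", "automobile"))     # raw cosine similarity
```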
60. As funções seno e cosseno: diagnóstico de dificuldades de aprendizagem através de sequências didáticas com diferentes mídias / The sine and cosine functions: diagnosing learning difficulties through didactic sequences with different media. Souza, Edílson Paiva de (December 2010).
This research aims to diagnose the learning difficulties of high school students in relation to the concepts of the trigonometric functions sine and cosine. The investigation is based on the principles of Didactic Engineering and on the theory of Semiotic Representation Registers developed by Raymond Duval. The didactic sequence presented is informed by an analysis of high school textbooks and by research on the use of graphing software in the teaching and learning process. The tools used in applying the sequence were pencil and paper and the software Graphmatic. The sequence was applied with second-year students at a public high school in the city of São Paulo. The protocols produced by eight pairs of students over four sessions were analysed and led to the conclusion that the use of technology, through a dynamic teaching process supported by the graphing software, increased the students' knowledge of the concepts of the sine and cosine functions. / Esta pesquisa tem como objetivo diagnosticar as dificuldades de aprendizagem de alunos do Ensino Médio em relação aos conceitos das funções trigonométricas seno e cosseno. A investigação está fundamentada nos princípios da Engenharia Didática e embasada na Teoria dos Registros de Representação Semiótica de Raymond Duval. A sequência didática apresentada orienta-se nas análises de livros didáticos do Ensino Médio e pesquisas que utilizaram o software gráfico no processo de ensino aprendizagem para melhoria do conhecimento. As ferramentas utilizadas na aplicação da sequência foram o lápis e o papel e o software Graphmatic. A sequência foi aplicada com alunos do segundo ano do Ensino Médio, de uma escola pública da capital de São Paulo. Foram analisados os protocolos de oito duplas que participaram de quatro sessões. Os dados coletados foram analisados e levaram a concluir que a utilização da tecnologia, através de um processo de ensino dinâmico proporcionado pelo software gráfico Graphmatic, propiciou um aumento no conhecimento sobre os conceitos das funções seno e cosseno.