Global ETD Search

1	Cross-categorical Intensification: The Case of Cantonese -gwai2 Ye, Jinwei January 2021 No description available. Linguistics; -gwai2; Cantonese; cross-categorical intensification; grammatical distribution; semantic meaning; unified account
2	Text ranking based on semantic meaning of sentences / Textrankning baserad på semantisk betydelse hos meningar Stigeborn, Olivia January 2021 Matching a suitable candidate to a client is an important part of a consulting company's work. Recruiters spend a great deal of time and effort reading what may be hundreds of resumes to find a suitable candidate. Natural language processing can perform this as a ranking task, in which the resumes of the most suitable candidates are ranked highest, so that recruiters need to look only at the top-ranked resumes and can quickly get candidates out in the field. Earlier research has used methods that count specific keywords in resumes to decide whether a candidate has a given experience. The main goal of this thesis is to use the semantic meaning of the resume text to gain a deeper understanding of a candidate’s level of experience. It also evaluates whether the model can run on-device and whether the database can contain a mix of English and Swedish resumes. An algorithm was created that uses the word embedding model DistilRoBERTa, which is capable of capturing the semantic meaning of text. The algorithm was evaluated by generating a job description from each resume in the form of a summary of that resume. The run time, the memory usage, and the rank achieved by the wanted candidate were documented and used to analyze the results. A classification was considered correct when the candidate whose resume was used to generate the job description was ranked in the top 10. Measured this way, the algorithm achieved an accuracy of 68.3%, which shows that it is capable of ranking resumes. It can rank both Swedish and English resumes, with an accuracy of 67.7% for Swedish resumes and 74.7% for English ones. The run time was fast enough, at an average of 578 ms, but the memory usage was too large for the algorithm to be used on-device. In conclusion, the semantic meaning of resumes can be used to rank them; possible future work would be to combine this method with a keyword-counting method to investigate whether the accuracy would increase. Natural language processing; Word Embedding; Resume Ranking; Semantic meaning; Computer Sciences
3	Явление межъязыковой псевдоэквивалентности в русских и английских медиатекстах : магистерская диссертация / The phenomenon of interlanguage pseudo-equivalence in Russian and English media texts Григорьева, Н. А., Grigoryeva, N. A. January 2024 The study examines pseudo-equivalents in Russian and English. Continuous sampling identified 1,360 English lexical units, which form 1,512 pseudo-equivalent word pairs with 982 Russian lexical units. The unit of research is a Russian or English lexeme that belongs to a pseudo-equivalent pair. The quantitative difference is explained by extensive homonymy in both languages. The first part of the study considers theoretical issues related to interlanguage pseudo-equivalence and media text: it analyzes the basic concepts of translation theory, interpretations of the concept of "pseudo-equivalence", and classifications of pseudo-equivalents, then describes the features of media text and gives a classification of media texts by functional genre type. The second part classifies the selected pseudo-equivalents in two ways: by volume of semantic meaning, into absolute, partial, and contextual; and by part of speech, into pairs whose members belong to the same part of speech (noun, adjective, verb, adverb) and pairs whose members belong to different parts of speech (noun, adjective, verb, adverb, preposition, pronoun, interjection). The third part examines how the selected pseudo-equivalents function in English media texts (news, information analytics, and journalism) and in their Russian translations, using data from a parallel corpus. It considers cases in which pseudo-equivalent pairs are formed, as well as the translation solutions used when a pair was not formed. The results can be used in lexicography, lexicology, the teaching of translation theory, and the compilation of textbooks. MASTER'S THESIS; PSEUDO-EQUIVALENTS; VOLUME OF SEMANTIC MEANING; PARTIAL AFFILIATION; MEDIA TEXT; TRANSLATION SOLUTION
4	Cluster Analysis with Meaning : Detecting Texts that Convey the Same Message / Klusteranalys med mening : Detektering av texter som uttrycker samma sak Öhrström, Fredrik January 2018 Textual duplicates can be hard to detect because they differ in wording but have similar semantic meaning. At Etteplan, a technical documentation company, many writers accidentally rewrite existing instructions that explain procedures. These "duplicates" clutter the database and represent duplicated work, and the condition of the database will only deteriorate as the company expands. This thesis attempts to map where the problem is worst and to estimate how many duplicates there are. The corpus is small but written in a controlled natural language called Simplified Technical English. The method charts the problem using document embeddings from doc2vec, clustering with HDBSCAN*, and validation with the Density-Based Clustering Validation index (DBCV). A survey was sent out to determine a threshold value at which documents stop being duplicates, and this value was then used to calculate a theoretical duplicate count. nlp; text mining; clustering; semantic meaning; text clustering; semantic duplicates; simplified technical english; duplicate detection; dbcv; doc2vec; etteplan; Computer Sciences
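5	由動詞及UP或DOWN組成之動詞片語與介系詞片語連用之分析 / Analysis of the co-occurrence of the VP-UP/DOWN construction and the P-NP construction 李旻倩 Unknown Date Many past studies have examined the semantics of English prepositions. Many scholars have focused on individual prepositions (e.g., Boers, 1996; Lindstromberg, 2010), while others have analyzed the semantics of verb phrases consisting of a verb and a preposition and of prepositional phrases consisting of a preposition and a noun (e.g., Larsen-Freeman & Celce-Murcia, 1999; Lindner, 1983; Quirk, Greenbaum, Leech, & Svartvik, 1985). Most of these studies were carried out within a single-preposition framework, and few have examined constructions that contain two prepositions. The construction studied in this thesis is a verb phrase consisting of a verb and UP/DOWN, followed by a prepositional phrase consisting of IN and a noun, so the construction contains two co-occurring prepositions. The analysis covers the semantics of the two prepositions, of the verb phrase, and of the prepositional phrase, as well as the semantic relations among all the meanings in the construction. The study adopts and adjusts earlier semantic categories for prepositions, verb phrases, and prepositional phrases, and uses the adjusted categories to analyze the construction. The results show that, in this construction, the two prepositions mostly carry metaphorical concepts, most verb phrases express completed action, and the prepositional phrases mostly depict spatial concepts or states. The meanings within the construction are also related to one another, and UP and DOWN do not have fully contrasting meanings in this construction. Unlike earlier studies, which focused on a single preposition or a single phrase, this study approaches a construction containing two prepositions from three angles. The author hopes that the results can later be applied to comparing learners' performance with this construction and can contribute further to the teaching of prepositions. BNC (British National Corpus); verb phrase; prepositional phrase; two co-occurring prepositions; semantic meaning analysis; verb-preposition sequence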
