Global ETD Search

41	探索性資料分析方法在文本資料中的應用─以「新青年」雜誌為例 / A Study of Exploratory Data Analysis on Text Data ── A Case study based on New Youth Magazine 潘艷艷, Pan, Yan Yan Unknown Date (has links) 隨著經濟繁榮和網絡發展的日新月異，線上線下每時每刻都產生龐大數據，其中約有80%的文字、影像等非結構化數據，如何量化和採取適合的分析方法，成為有效提取有價值信息及對其加以利用的關鍵。針對文字類型的資料，本文提出探索性資料分析方法，並以《新青年》雜誌的語言變化為例，呈現如何選取文本特徵並對其量化及分析的過程。首先，本文以卷為分析單位，多角度量化《新青年》雜誌各卷的文本結構，包括文本用字、用句、文言和白虛字使用以及常用字詞共用等方面，通過多種圖表相結合的呈現方式，窺探《新青年》雜誌語言變化歷程以及轉變特點。這其中既包括了對文言文到白話文轉變機制的探索，也包括白話語言演化的探索。其次，根據各卷初探的結果，尋找可區隔文言文和白話文兩種語言形式的文本特徵變數，再以《新青年》第一卷和第七卷為訓練樣本，結合主成分和羅吉斯迴歸，對文、白兩種語言形式的文章進行分類訓練，再利用第四卷進行測試。結果證實，所提取的文本變數能夠有效實現對文、白兩種語言形式的文章的區分。此外，本文亦根據前述初探結果以及人文學者經驗，探索《新青年》雜誌後期語言形式的變化，即從五四運動時期的白話文至以「紅色中文」為特徵的白話文（二戰之後中國使用的白話文）的變化。以第七卷和第十一卷為樣本進行訓練，結果證實這兩卷語言形式存在明顯區別；並加入台灣《聯合報》和中國大陸的《人民日報》進行分類預測，發現兩類報刊的語言偏向有明顯差異，值得後續深入研究。 / Enormous amounts of data are produced every day, driven by the rapid development of computer technology and the economy. Unstructured data, such as text, pictures, and videos, account for nearly 80 percent of all data created. Whether useful information can be extracted from such data depends on choosing appropriate methods for quantifying and analyzing it. To that end, we propose a standard operating process of exploratory data analysis (EDA) for text and demonstrate it with a case study of language change in New Youth magazine. First, taking the volume as the unit of analysis, we quantify the texts of New Youth magazine from several perspectives, including the use of characters, sentences, and function words and the share of common vocabulary. We aim to detect both the shift from classical Chinese (文言文) to modern vernacular Chinese (白話文) and the evolution of the vernacular itself. Then, guided by the results of the exploratory analysis, we use the first and seventh volumes of New Youth magazine as training data to develop a classification model that combines principal component analysis with logistic regression, and we apply the model to the fourth volume as test data. The results show that classical and vernacular Chinese can be classified successfully. 
Next, we examine the change from the vernacular Chinese of the May 4th Movement to the vernacular shaped by the advocacy of socialism. We use the seventh and eleventh volumes of New Youth magazine as training data and again develop a classification model. We then apply this model to the United Daily News from Taiwan and the People’s Daily from Mainland China. We find that the two newspapers differ markedly: the style of the United Daily News is closer to that of the seventh volume, while the style of the People’s Daily is closer to that of the eleventh volume. This suggests that the language of the People’s Daily is likely to have been influenced by the Soviet Union. 非結構化數據 文本分析 探索性資料分析 主成分分析 羅吉斯迴歸 Unstructured Data Text Analysis Exploratory data Analysis Principal Component Analysis Logistic Regression
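The classification step in record 41 can be illustrated with a small sketch. It is not the thesis's method: the abstract combines principal component analysis with logistic regression on features it does not list here, whereas this sketch omits the PCA step and fits a plain logistic regression on two made-up features. The marker-character sets, the toy sentences, and the names `features`, `train_logistic` and `predict` are all assumptions for illustration.

```python
import math

# Hypothetical marker sets (assumption): the abstract does not list its features.
CLASSICAL = "之乎者也矣焉"
VERNACULAR = "的了嗎呢們這"

def features(text):
    """Share of classical and of vernacular function characters in a text."""
    n = max(len(text), 1)
    return [sum(text.count(c) for c in CLASSICAL) / n,
            sum(text.count(c) for c in VERNACULAR) / n]

def train_logistic(X, y, lr=1.0, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(w, b, text):
    """Return 1 (vernacular) or 0 (classical) for a text."""
    z = sum(wj * xj for wj, xj in zip(w, features(text))) + b
    return 1 if z > 0 else 0

# Toy training sentences (illustrative only, not from New Youth magazine).
TRAIN = [
    ("學而時習之不亦說乎", 0),
    ("知之者不如好之者", 0),
    ("三人行必有我師焉", 0),
    ("我們今天去了學校", 1),
    ("這是我的書", 1),
    ("你吃了飯嗎", 1),
]
X = [features(text) for text, _ in TRAIN]
y = [label for _, label in TRAIN]
w, b = train_logistic(X, y)

print(predict(w, b, "他們的書呢"), predict(w, b, "之乎者也"))
```

In the setup the abstract describes, whole volumes would supply the training and test texts, and the principal components of a larger feature set would replace the two hand-picked shares used here.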
42	Remote sensing of rapidly draining supraglacial lakes on the Greenland Ice Sheet Williamson, Andrew Graham January 2018 (has links) Supraglacial lakes in the ablation zone of the Greenland Ice Sheet (GrIS) often drain rapidly (in hours to days) by hydraulically-driven fracture (“hydrofracture”) in the summer. Hydrofracture can deliver large meltwater volumes to the ice-bed interface and open-up surface-to-bed connections, thereby routing surface meltwater to the subglacial system, altering basal water pressures and, consequently, the velocity profile of the GrIS. The study of rapidly draining lakes is thus important for developing coupled hydrology and ice-dynamics models, which can help predict the GrIS’s future mass balance. Remote sensing is commonly used to identify the location, timing and magnitude of rapid lake-drainage events for different regions of the GrIS and, with the increased availability of high-quality satellite data, may be able to offer additional insights into the GrIS’s surface hydrology. This study uses new remote-sensing datasets and develops novel analytical techniques to produce improved knowledge of rapidly draining lake behaviour in west Greenland over recent years. While many studies use 250 m MODerate-resolution Imaging Spectroradiometer (MODIS) imagery to monitor intra- and inter-annual changes to lakes on the GrIS, no existing research with MODIS calculates changes to individual and total lake volume using a physically-based method. The first aim of this research is to overcome this shortfall by developing a fully-automated lake area and volume tracking method (“the FAST algorithm”). For this, various methods for automatically calculating lake areas and volumes with MODIS are tested, and the best techniques are incorporated into the FAST algorithm. 
The FAST algorithm is applied to the land-terminating Paakitsoq and marine-terminating Store Glacier regions of west Greenland to investigate the incidence of rapid lake drainage in summer 2014. The validation and application of the FAST algorithm show that lake areas and volumes (using a physically-based method) can be calculated accurately using MODIS, that the new algorithm can identify rapidly draining lakes reliably, and that it therefore has the potential to be used widely across the GrIS to generate novel insights into rapidly draining lakes. The controls on rapid lake drainage remain unclear, making it difficult to incorporate lake drainage into models of GrIS hydrology. The second aspect of this study therefore investigates whether various hydrological, morphological, glaciological and surface-mass-balance controls can explain the incidence of rapid lake drainage on the GrIS. These potential controlling factors are examined within an Exploratory Data Analysis statistical technique to elicit statistical similarities and differences between the rapidly and non-rapidly draining lake types. The results show that the lake types are statistically indistinguishable for almost all factors, except lake area. It is impossible, therefore, to elicit an empirically-supported, deterministic method for predicting hydrofracture in models of GrIS hydrology. A frequent problem in remote sensing is the need to trade-off high spatial resolution for low temporal resolution, or vice versa. The final element of this thesis overcomes this problem in the context of monitoring lakes on the GrIS by adapting the FAST algorithm (to become “the FASTER algorithm”) to use with a combined Landsat 8 and Sentinel-2 satellite dataset. 
The FASTER algorithm is applied to a large, predominantly land-terminating region of west Greenland in summers 2016 and 2017 to track changes to lakes, identify rapidly draining lakes, and ascertain how much extra information can be generated by using the two satellites together rather than individually. The FASTER algorithm can monitor changes to lakes at both high spatial (10 to 30 m) and temporal (~3 days) resolution, overcoming the limitation of low spatial or temporal resolution associated with previous remote sensing of lakes on the GrIS. The combined dataset identifies many more rapid lake-drainage events than would be possible with Landsat 8 or Sentinel-2 alone, due to their low temporal resolutions, or with MODIS, due to its inferior spatial resolution.
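How rapid drainage might be flagged in a lake-volume time series can be sketched as follows. This is not the FAST or FASTER algorithm, whose rules the abstract does not give. The function name `rapid_drainage_events`, the 4-day window, the 80 percent loss threshold, and the volume figures are all hypothetical; only the roughly 3-day revisit interval comes from the abstract.

```python
def rapid_drainage_events(series, max_days=4, min_loss=0.8):
    """Flag rapid drainage in a list of (day_of_year, volume_m3) observations.

    Returns (start_day, end_day) pairs where the volume fell by at least
    min_loss (a fraction) within max_days. Both thresholds are illustrative
    assumptions, not values taken from the thesis.
    """
    events = []
    for i, (day0, vol0) in enumerate(series):
        if vol0 <= 0:
            continue  # nothing left to drain
        for day1, vol1 in series[i + 1:]:
            if day1 - day0 > max_days:
                break  # later observations fall outside the window
            if (vol0 - vol1) / vol0 >= min_loss:
                events.append((day0, day1))
                break
    return events

# Made-up observations at a 3-day revisit interval.
fast_lake = [(180, 1.0e6), (183, 1.2e6), (186, 0.1e6), (189, 0.05e6)]
slow_lake = [(180, 1.0e6), (183, 0.9e6), (186, 0.8e6), (189, 0.7e6)]

print(rapid_drainage_events(fast_lake))
print(rapid_drainage_events(slow_lake))
```

The window shows why revisit time matters: with observations many days apart, a lake that empties within the window and one that drains slowly across it could look the same.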
43	Analýza úrovně kvality života pomocí shlukové analýzy a porovnání s Human Development Indexem / Analysis of the Quality of life using cluster analysis and comparison with the Human Development Index Pánková, Barbara January 2015 (has links) Quality of life is a frequently discussed topic today. The term is nonetheless defined with considerable ambiguity and inconsistency, since there is neither a universally accepted definition nor a well-developed theoretical model. Several international organizations monitor quality of life using a variety of indicators; one of them is the United Nations Development Programme. This organization annually publishes the Human Development Index, which divides the world's countries into four groups according to their level of development: low, medium, high and very high. The aim of this thesis is to analyze the quality of life in 125 countries using cluster analysis, specifically Ward's method. Quality of life is evaluated on the basis of 19 demographic and economic indicators, which include life expectancy, literacy rate, access to drinking water and infant mortality rate. The cluster analysis grouped the countries into clusters according to their similarity. It produced six clusters, which were then compared with the results of the Human Development Index. The clusters closely reflect the division commonly used to characterize developing and developed countries. Each of the six clusters can be clearly described and characterized in terms of quality of life, and the clusters can be labelled as poorest developing, low developed, moderately developed, medium development, high development and very high development countries. 
Based on the results, it can be stated that this analysis is consistent with other indicators of quality of life and that the resulting clusters are identical to the commonly used division of countries.
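Ward's method, named in record 43, can be sketched in a few lines. At each step it merges the two clusters whose union adds least to the within-cluster sum of squares. The function name `ward_clusters`, the two pre-scaled indicators and the six data points are assumptions for illustration; the thesis itself uses 19 indicators for 125 countries.

```python
def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion until k clusters remain.

    Returns a list of clusters, each a sorted list of indices into points.
    """
    clusters = [[i] for i in range(len(points))]
    dim = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members) for d in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if clusters a and b merge.
        ca, cb = centroid(a), centroid(b)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > k:
        _, i, j = min(
            ((merge_cost(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda t: t[0],
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Made-up, pre-scaled indicator values for six hypothetical countries,
# e.g. (life expectancy, literacy rate) mapped to the range 0..1.
countries = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
             (0.90, 0.80), (0.80, 0.90), (0.85, 0.95)]

print(ward_clusters(countries, 2))
```

Indicators measured in different units would normally be standardized first, since Ward's criterion is based on squared Euclidean distance and would otherwise be dominated by the indicator with the largest numeric range.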
44	Ordenação evolutiva de anúncios em publicidade computacional / Evolutionary ad ranking for computational advertising Marcos Eduardo Bolelli Broinizi 15 June 2015 (has links) Otimizar simultaneamente os interesses dos usuários, anunciantes e publicadores é um grande desafio na área de publicidade computacional. Mais precisamente, a ordenação de anúncios, ou ad ranking, desempenha um papel central nesse desafio. Por outro lado, nem mesmo as melhores fórmulas ou algoritmos de ordenação são capazes de manter seu status por um longo tempo em um ambiente que está em constante mudança. Neste trabalho, apresentamos uma análise orientada a dados que mostra a importância de combinar diferentes dimensões de publicidade computacional por meio de uma abordagem evolutiva para ordenação de anúncios afim de responder a mudanças de forma mais eficaz. Nós avaliamos as dimensões de valor comercial, desempenho histórico de cliques, interesses dos usuários e a similaridade textual entre o anúncio e a página. Nessa avaliação, nós averiguamos o desempenho e a correlação das diferentes dimensões. Como consequência, nós desenvolvemos uma abordagem evolucionária para combinar essas dimensões. Essa abordagem é composta por três partes: um repositório de configurações para facilitar a implantação e avaliação de experimentos de ordenação; um componente evolucionário de avaliação orientado a dados; e um motor de programação genética para evoluir fórmulas de ordenação de anúncios. Nossa abordagem foi implementada com sucesso em um sistema real de publicidade computacional responsável por processar mais de quatorze bilhões de requisições de anúncio por mês. De acordo com nossos resultados, essas dimensões se complementam e nenhuma delas deve ser neglicenciada. Além disso, nós mostramos que a combinação evolucionária dessas dimensões não só é capaz de superar cada uma individualmente, como também conseguiu alcançar melhores resultados do que métodos estáticos de ordenação de anúncios. 
/ Simultaneously optimizing the interests of users, advertisers and publishers is a formidable challenge in online advertising. More concretely, the ranking of advertisements, or ad ranking, has a central role in this challenge. However, even the best ranking formula or algorithm cannot withstand the ever-changing environment of online advertising for a long time. In this work, we present a data-driven analysis that shows the importance of combining different aspects of online advertising through an evolutionary approach for ad ranking in order to effectively respond to changes. We evaluated aspects ranging from bid values and previous click performance to user behavior and interests, including the textual similarity between ad and page. In this evaluation, we assessed commercial performance along with the correlation between different aspects. As a consequence, we proposed an evolutionary approach for combining these aspects. This approach was composed of three parts: a configuration repository to facilitate deployment and evaluation of ranking experiments; an evolutionary data-based evaluation component; and a genetic programming engine to evolve ad ranking formulae. Our approach was successfully implemented in a real online advertising system that processes more than fourteen billion ad requests per month. According to our results, these aspects complement each other and none of them should be neglected. Moreover, we showed that the evolutionary combination of these aspects not only outperformed each of them individually, but was also able to achieve better overall results than static ad ranking methods. Análise de componentes principais Análise exploratória de dados Programação genética Publicidade computacional Publicidade contextualizada Publicidade digital Publicidade online Computational advertising Contextual advertising Exploratory data analysis Genetic programming Learning to advertising Online advertising Principal component analysis
45	Regression Models to Predict Coastdown Road Load for Various Vehicle Types Singh, Yuvraj January 2020 (has links) No description available. Automotive Engineering Mechanical Engineering Statistics coastdown testing fuel economy certification chassis dynamometer testing road load statistical modeling regression exploratory data analysis kernel density estimation wind tunnel testing aerodynamic drag
46	Zwangsmobilität und Verkehrsmittelorientierung junger Erwachsener / Forced mobility and orientation towards transport modes of young adults: Creation of a typology Wittwer, Rico 23 January 2015 (has links) (PDF) In der Mobilitätsforschung entstand in den vergangenen Jahrzehnten eine breite Wissensbasis für das Verständnis von Verkehrsursachen und Zusammenhängen, die das Verkehrsverhalten determinieren. Mit der Entwicklung von Verkehrsmodellen lag das Forschungsinteresse zunächst primär bei Ökonomen und Ökonometrikern sowie Verkehrsingenieuren. Bald kamen andere Wissenschaftsbereiche wie die Psychologie oder die Geowissenschaften hinzu, welche sich in der Folge zunehmend mit dem Thema Mobilität befassten und die zur Erklärung des menschlichen Verhaltens ganz unterschiedliche Methoden und Maßstäbe nutzten. Heute versuchen zumeist handlungsorientierte Ansätze, auf Individualebene, Faktoren zu bestimmen, die Aufschluss über die Verhaltensvariabilität in der Bevölkerung geben und damit einen möglichst großen Beitrag zur Varianzaufklärung leisten. Werden Einflussfaktoren in geeigneter Weise identifiziert und quantifiziert, können Defizite und Chancen erkannt und das Verhalten steuernde Maßnahmen entworfen werden. Mit deren Hilfe wird ungewollten Entwicklungen entgegengesteuert. Junge Erwachsene stellen aufgrund ihrer sehr unterschiedlichen Phasen im Lebenszyklus, z. B. gerade anstehender oder abgeschlossener Ausbildung, Umzug in eine eigene Wohnung, Familiengründung, Neuorientierung in Arbeitsroutinen oder das Einleben in ein anderes Lebensumfeld einer fremden Stadt, intuitiv eine sehr heterogene Gruppe dar. Die Modellierung des Verhaltens ist für diese Altersgruppe besonders schwierig. Aus der Komplexität dieser Problemstellung heraus ist ersichtlich, dass fundierte Analysen zur Mobilität junger Erwachsener notwendig sind, um verkehrsplanerische Defizite aufzudecken und Chancen zu erkennen. 
Der methodische Schwerpunkt des Beitrages liegt auf der Bildung einer Typologie des Verkehrsverhaltens junger Erwachsener. Die verwendete Datengrundlage ist das „Deutsche Mobilitätspanel – MOP“. Dabei wird der Versuch unternommen, zunächst Variablen aller relevanten Dimensionen des handlungsorientierten, aktivitätsbasierten Verkehrsverhaltens zusammenzustellen und für eine entsprechende Analyse aufzubereiten. Im Anschluss werden geeignete und in den Sozialwissenschaften erprobte Verfahren zur Ähnlichkeitsmessung eingesetzt, um möglichst verhaltensähnliche Personen zu typologisieren. Im Weiteren finden konfirmatorische Analysetechniken Anwendung, mit deren Hilfe Verhaltenshintergründe erklärt und inferenzstatistisch geprüft werden. Als Ergebnis wird eine clusteranalytische Typologisierung vorgestellt, die im Anschluss anhand soziodemografischer Indikatoren und raumstruktureller Kriterien der Lagegunst beschrieben wird. Aufgrund der gewonnenen Erkenntnisse können objektive und im Idealfall quantifizierbare, d. h. prognosefähige Merkmale zur Bildung verkehrssoziologischer und weitgehend verhaltensähnlicher Personengruppen genutzt werden. / Over the last few decades of mobility research, a wide base of knowledge for understanding travel determinants and causal relationships in mobility behavior has been established. The development of travel models was at first of interest primarily to economists and econometricians as well as transportation engineers. They were soon joined by other scientific areas such as psychology or the geosciences, which as a result increasingly addressed the theme of mobility and used quite different methodologies and criteria for explaining human behavior. Today, activity-oriented approaches generally attempt to determine individual-level factors that provide information on behavioral variability within the population, thereby contributing greatly to explaining variances. 
If explanatory factors can be properly identified and quantified, then deficiencies and opportunities can be recognized and measures for influencing behavior can be designed. With their help, undesirable developments can be counteracted. Because they are at very different stages of life, e.g. upcoming or recently completed education, moving into their own apartment, starting a family, settling into a work routine or adapting to a new environment in an unfamiliar city, young adults are intuitively a very heterogeneous group. Modeling the behavior of this age group is particularly difficult. The complexity of this problem makes it clear that well-founded analyses of the mobility of young adults are necessary in order to uncover deficiencies and recognize opportunities in transportation planning. The methodological focus of this work is on creating a typology of young adults’ travel behavior. The data come from the “Deutsches Mobilitätspanel – MOP” (German Mobility Panel). An attempt is made to gather variables for all relevant dimensions of decision-oriented, activity-based travel behavior and to prepare them for a corresponding analysis. Afterward, appropriate similarity-measurement methods that are proven in the social sciences are used to group persons who are as behaviorally similar as possible. In addition, confirmatory data analysis is used to explain the determinants of behavior and to test them through inferential statistics. The resulting typology from the cluster analysis is presented and then described using sociodemographic indicators and spatial criteria of accessibility. The findings make it possible to use objective and, ideally, quantifiable and therefore forecastable characteristics for identifying transport-sociological groups of persons who display largely similar travel behavior. 
Junge Erwachsene Verkehrssoziologie Mobilität Verkehrsverhalten Zwangsmobilität Verkehrsmittelwahl Typologie Klassifizierung exploratorische Datenanalyse konfirmatorische Datenanalyse Dimensionsreduktion Faktorenanalyse Clusteranalyse Diskriminanzanalyse logistische Regression verhaltenshomogene Personengruppen Young adults sociology of transport mobility travel behavior forced mobility choice of transport modes typology classification exploratory data analysis confirmatory data analysis reduction of dimension factor analysis cluster analysis discriminant analysis logistic regression homogeneous groups of persons ddc:620 rvk:ZO 3300 rvk:QR 800
47	Zwangsmobilität und Verkehrsmittelorientierung junger Erwachsener: Eine Typologisierung Wittwer, Rico 12 December 2014 (has links) In der Mobilitätsforschung entstand in den vergangenen Jahrzehnten eine breite Wissensbasis für das Verständnis von Verkehrsursachen und Zusammenhängen, die das Verkehrsverhalten determinieren. Mit der Entwicklung von Verkehrsmodellen lag das Forschungsinteresse zunächst primär bei Ökonomen und Ökonometrikern sowie Verkehrsingenieuren. Bald kamen andere Wissenschaftsbereiche wie die Psychologie oder die Geowissenschaften hinzu, welche sich in der Folge zunehmend mit dem Thema Mobilität befassten und die zur Erklärung des menschlichen Verhaltens ganz unterschiedliche Methoden und Maßstäbe nutzten. Heute versuchen zumeist handlungsorientierte Ansätze, auf Individualebene, Faktoren zu bestimmen, die Aufschluss über die Verhaltensvariabilität in der Bevölkerung geben und damit einen möglichst großen Beitrag zur Varianzaufklärung leisten. Werden Einflussfaktoren in geeigneter Weise identifiziert und quantifiziert, können Defizite und Chancen erkannt und das Verhalten steuernde Maßnahmen entworfen werden. Mit deren Hilfe wird ungewollten Entwicklungen entgegengesteuert. Junge Erwachsene stellen aufgrund ihrer sehr unterschiedlichen Phasen im Lebenszyklus, z. B. gerade anstehender oder abgeschlossener Ausbildung, Umzug in eine eigene Wohnung, Familiengründung, Neuorientierung in Arbeitsroutinen oder das Einleben in ein anderes Lebensumfeld einer fremden Stadt, intuitiv eine sehr heterogene Gruppe dar. Die Modellierung des Verhaltens ist für diese Altersgruppe besonders schwierig. Aus der Komplexität dieser Problemstellung heraus ist ersichtlich, dass fundierte Analysen zur Mobilität junger Erwachsener notwendig sind, um verkehrsplanerische Defizite aufzudecken und Chancen zu erkennen. Der methodische Schwerpunkt des Beitrages liegt auf der Bildung einer Typologie des Verkehrsverhaltens junger Erwachsener. 
Die verwendete Datengrundlage ist das „Deutsche Mobilitätspanel – MOP“. Dabei wird der Versuch unternommen, zunächst Variablen aller relevanten Dimensionen des handlungsorientierten, aktivitätsbasierten Verkehrsverhaltens zusammenzustellen und für eine entsprechende Analyse aufzubereiten. Im Anschluss werden geeignete und in den Sozialwissenschaften erprobte Verfahren zur Ähnlichkeitsmessung eingesetzt, um möglichst verhaltensähnliche Personen zu typologisieren. Im Weiteren finden konfirmatorische Analysetechniken Anwendung, mit deren Hilfe Verhaltenshintergründe erklärt und inferenzstatistisch geprüft werden. Als Ergebnis wird eine clusteranalytische Typologisierung vorgestellt, die im Anschluss anhand soziodemografischer Indikatoren und raumstruktureller Kriterien der Lagegunst beschrieben wird. Aufgrund der gewonnenen Erkenntnisse können objektive und im Idealfall quantifizierbare, d. h. prognosefähige Merkmale zur Bildung verkehrssoziologischer und weitgehend verhaltensähnlicher Personengruppen genutzt werden. / Over the last few decades of mobility research, a wide base of knowledge for understanding travel determinants and causal relationships in mobility behavior has been established. The development of travel models was at first of interest primarily to economists and econometricians as well as transportation engineers. They were soon joined by other scientific areas such as psychology or the geosciences, which as a result increasingly addressed the theme of mobility and used quite different methodologies and criteria for explaining human behavior. Today, activity-oriented approaches generally attempt to determine individual-level factors that provide information on behavioral variability within the population, thereby contributing greatly to explaining variances. If explanatory factors can be properly identified and quantified, then deficiencies and opportunities can be recognized and measures for influencing behavior can be conceptualized. 
With their help, undesirable developments can be counteracted. Because they are at very different stages of life, e.g. upcoming or recently completed education, moving into their own apartment, starting a family, settling into a work routine or adapting to a new environment in an unfamiliar city, young adults are intuitively a very heterogeneous group. Modeling the behavior of this age group is particularly difficult. The complexity of this problem makes it clear that well-founded analyses of the mobility of young adults are necessary in order to uncover deficiencies and recognize opportunities in transportation planning. The methodological focus of this work is on creating a typology of young adults’ travel behavior. The data come from the “Deutsches Mobilitätspanel – MOP” (German Mobility Panel). An attempt is made to gather variables for all relevant dimensions of decision-oriented, activity-based travel behavior and to prepare them for a corresponding analysis. Afterward, appropriate similarity-measurement methods that are proven in the social sciences are used to group persons who are as behaviorally similar as possible. In addition, confirmatory data analysis is used to explain the determinants of behavior and to test them through inferential statistics. The resulting typology from the cluster analysis is presented and then described using sociodemographic indicators and spatial criteria of accessibility. The findings make it possible to use objective and, ideally, quantifiable and therefore forecastable characteristics for identifying transport-sociological groups of persons who display largely similar travel behavior. info:eu-repo/classification/ddc/620 ddc:620
48	Money Laundering Detection using Tree Boosting and Graph Learning Algorithms / Detektion av Penningtvätt med hjälp av Trädalgoritmer och Grafinlärningsalgoritmer Frumerie, Rickard January 2021 (has links) In this master's thesis we focus on using machine learning methods to detect money laundering in financial transaction networks, in order to demonstrate that they can be used as a complement to, or instead of, the more commonly used rule-based systems. The graph learning method graph convolutional networks (GCN) has been a hot topic in the field since it was shown to scale well with data size in 2018. However, typical GCN models cannot use edge features, which is why this thesis combines the GCN model with a node and edge neural network (NENN) to solve this problem. The new method is compared with an already established machine learning method for financial transactions, namely the tree boosting method XGBoost. Because of confidentiality concerns surrounding financial transaction data, the machine learning algorithms are tested on two carefully constructed, synthetically generated data sets, which are produced by agent-based simulations and resemble real financial data. The results showed the viability and superiority of the new implementation of the GCN model, which is the preferable method for connectively structured data, that is, when a transaction or account is analyzed in the context of its financial environment. The XGBoost method, on the other hand, showed better results when examining transactions independently: it was able to identify fraudulent and non-fraudulent patterns more accurately from the transactional features themselves. / I detta examensarbete fokuserar vi på användandet av maskininlärningsmetoder för att detektera penningtvätt i finansiella transaktionsnätverk, med målet att demonstrera att dessa kan användas som ett komplement till eller i stället för de mer vanligt använda regelbaserade systemen. 
Grafinlärningsmetoden graph convolutional networks (GCN) har varit ett hett ämne inom området sedan metoden under 2018 visades fungera bra för stora datamängder. Däremot kan inte en vanlig GCN-modell använda kantinformation, vilket är varför denna avhandling kombinerar GCN-modellen med node and edge neural networks (NENN) för att mer effektivt detektera penningtvätt. Denna nya metod kommer att jämföras med en redan etablerad maskininlärningsmetod för finansiella transaktioner, nämligen tree boosting (XGBoost). På grund av sekretessanledningar för finansiella transaktionsdata var maskininlärningsalgoritmerna testade på två noggrant konstruerade syntetiskt genererade datamängder som från agentbaserade simuleringar liknar riktiga finansiella data. Resultaten visade på applikationsmöjligheter och överlägsenhet för den nya implementationen av GCN-modellen vilken är att föredra för relationsstrukturerade data, det vill säga när transaktioner och konton analyseras i kontexten av deras finansiella omgivning. Å andra sidan visar XGBoost bättre resultat på att examinera transaktioner individuellt eftersom denna metod mer precist kan identifiera bedrägliga och icke-bedrägliga mönster från de transaktionella funktionerna. Tree boosting XGBoost graph convolutional networks (GCN) node and edge neural networks (NENN) exploratory data analysis (EDA) anti money laundering (AML) financial graph networks. Trädalgoritmer XGBoost convolutions grafnätverk (GCN) nod och kant neurala nätverk (NENN) utforskande dataanalys penningtvättsbekämpning (AML) finansiella grafnätverk. Probability Theory and Statistics Sannolikhetsteori och statistik
