591 |
Decision-support tool for identifying locations of shared mobility hubs : A case study in Amsterdam
Podestà, Pietro January 2022
Shared mobility is considered a more sustainable alternative to private modes. Nonetheless, its sudden and sometimes “out of control” emergence poses issues that need to be addressed. Lack of regulation and mismanagement of public space cause sidewalks and city roads to be overcrowded with shared vehicles (especially in the case of micromobility). This causes nuisance and safety concerns and hinders the societal benefits shared mobility may provide. Shared mobility hubs have the potential to address these issues. The research was carried out within the context of SmartHubs, an EIT Urban Mobility project initiated in 2021 by a diverse consortium of 7 cities, companies, and universities to develop and validate effective and economically viable mobility hub solutions. This degree project aims to improve the Decision-Support-Tool (DST) developed by SmartHubs to identify locations for shared-mobility hubs with high potential to drive sustainable travel. To achieve that, the thesis proposes a methodology for determining smart hub locations and their corresponding utilities, based on a combination of GIS cluster analysis of free-floating shared mobility parking patterns and a stated-preference study. The potential hub locations were determined from the cluster analysis of free-floating trip characteristics. Using the stated-preference survey data, the thesis develops a model to estimate the probability of parking at the hub as a function of explanatory variables, including walking distance, reward policies and the parking situation. The model testing results showed that the proposed methodology predicts hub usage demand well and improves on the DST originally developed in the SmartHubs project.
|
592 |
Cluster-assisted Grading : Comparison of different methods for pre-processing, text representation and cluster analysis in cluster-assisted short-text grading / Kluster-assisterad rättning : Jämförelse av olika metoder för bearbetning, textrepresentation och klusteranalys i kluster-assisterad rättning
Båth, Jacob January 2022
School teachers spend approximately 30 percent of their time grading exams and other assessments. With education becoming increasingly digitized, a research field has emerged that aims to reduce the time spent on grading by automating it. This is an easy task for multiple-choice questions but much harder for open-ended questions requiring free-text answers, even though the latter have been shown to be superior for knowledge assessment and learning consolidation. While previous work has reported promising results of up to 90 percent grading accuracy, it is still problematic to use a system that gives the wrong grade in 10 percent of the cases. This has given rise to research that focuses on assisting teachers in the grading process instead of fully replacing them. Cluster analysis has been the most popular tool for this, grouping similar answers together and letting teachers process groups of answers at once instead of evaluating each answer one at a time. This approach has been shown to decrease the time spent on grading substantially; however, the methods for performing the clustering vary widely between studies, leaving no apparent choice of methodology for real-world implementation. Using several techniques for pre-processing, text representation and clustering, this work compared various methods for clustering free-text answers by evaluating them on a dataset containing almost 400 000 student answers. The results showed that using all of the tested pre-processing techniques led to the best performance, although the difference from using minimal pre-processing was small. Sentence embeddings were the best-performing text representation; however, it remains to be answered how they should be used when spelling and grammar are part of the assessment, as they lack the ability to identify such errors. 
A suitable choice of clustering algorithm is one where the number of clusters can be specified, as determining this automatically proved to be difficult. Teachers can then easily adjust the number of clusters based on their judgement. / Skollärare spenderar ungefär 30 procent av sin tid på rättning av prov och andra bedömningar. I takt med att mer utbildning digitaliseras, försöker forskare hitta sätt att automatisera rättning för att minska den administrativa bördan för lärare. Flervalsfrågor har fördelen att de enkelt kan rättas automatiskt, medan öppet ställda frågor som kräver ett fritt formulerat svar har visat sig vara ett bättre verktyg för att mäta elevers förståelse. Dessa typer av frågor är däremot betydligt svårare att rätta automatiskt, vilket lett till forskning inom automatisk rättning av dessa. Även om tidigare forskning har lyckats uppnå resultat med upp till 90 procents träffsäkerhet, är det fortfarande problematiskt att det blir fel i de resterande 10 procenten av fallen. Detta har lett till forskning som fokuserar på underlätta för lärare i rättningen, istället för att ersätta dem. Klusteranalys har varit det mest populära tillvägagångssättet för att åstadkomma detta, där liknande svar grupperas tillsammans, vilket möjliggör rättning av flera svar samtidigt. Denna metod har visat sig minska rättningstiden signifikant, däremot har metoderna för att göra klusteranalysen varierat brett, vilket gör det svårt att veta hur en implementering i ett verkligt scenario bör se ut. Genom att använda olika tekniker för textbearbetning, textrepresentation och val av klusteralgoritm, jämför detta arbete olika metoder för att klustra fritext-svar, genom att utvärdera dessa på nästan 400 000 riktiga elevsvar. Resultatet visar att mer textbearbetning generellt är bättre, även om skillnaderna är små. Användning av så kallade sentence embeddings ledde till bäst resultat när olika tekniker för textrepresentation jämfördes. 
Däremot har denna teknik svårare att identifiera grammatik- och stavningsfel, hur detta ska hanteras är en fråga för framtida forskning. Ett lämpligt val av klustringsalgoritm är en där antalet kluster kan bestämmas av användaren, då det visat sig svårt att bestämma det automatiskt. Lärare kan då justera antalet kluster ifall det skulle vara för få eller för många.
|
593 |
Deep Learning Approaches for Clustering Source Code by Functionality / Djupinlärningsmetoder för gruppering av källkod efter funktionalitet
Hägglund, Marcus January 2021
With the rise of artificial intelligence, applications for machine learning can be found in nearly every aspect of modern life, from healthcare and transportation to software services like recommendation systems. Consequently, there are now more developers engaged in the field than ever - with the number of implementations rapidly increasing by the day. In order to meet the new demands, it would be useful to provide services that allow for an easy orchestration of a large number of repositories. Enabling users to easily share, access and search for source code would be beneficial for both research and industry alike. A first step towards this is to find methods for clustering source code by functionality. The problem of clustering source code has previously been studied in the literature. However, the proposed methods have so far not leveraged the capabilities of deep neural networks (DNN). In this work, we investigate the possibility of using DNNs to learn embeddings of source code for the purpose of clustering by functionality. In particular, we evaluate embeddings from Code2Vec and cuBERT models for this specific purpose. From the results of our work we conclude that both Code2Vec and cuBERT are capable of learning such embeddings. Among the different frameworks that we used to fine-tune cuBERT, we found the best performance for this task when fine-tuning the model under the triplet loss criterion. With this framework, the model was capable of learning embeddings that yielded the most compact and well-separated clusters. We found that a majority of the cluster assignments were semantically coherent with respect to the functionalities implemented by the methods. With these results, we have found evidence indicating that it is possible to learn embeddings of source code that encode the functional similarities among the methods. Future research could therefore aim to further investigate the possible applications of the embeddings learned by the different frameworks. 
/ Med den avsevärda ökningen av användandet av artificiell intelligens går det att finna tillämpningar för maskininlärningsalgoritmer i nästan alla aspekter av det moderna livet, från sjukvård och transport till mjukvarutjänster som rekommendationssystem. Till följd av detta så är det fler utvecklare än någonsin engagerade inom området, där antalet nya implementationer ökar för var dag. För att möta de nya kraven skulle det vara användbart att kunna tillhandahålla tjänster som möjliggör en enkel hantering av ett stort antal kodförråd. Att göra det möjligt för användare att enkelt dela, komma åt och söka efter källkod skulle vara till nytta inom både forskning och industri. Ett första steg mot detta är att hitta metoder som gör det möjligt att klustra källkod med avseende på funktionalitet. Problemet med klustring av källkod är något som har tidigare studerats. De föreslagna metoderna har dock hittills inte utnyttjat kapaciteten hos djupa neurala nätverk (DNN). I detta arbete undersöker vi möjligheten att använda DNN för inlärning av inbäddningar av källkod i syfte att klustra med avseende på funktionalitet. I synnerhet så utvärderar vi inbäddningar från Code2Vec- och cuBERT-modeller för detta specifika ändamål. Från resultatet av vårt arbete drar vi slutsatsen att både Code2Vec och cuBERT har kapacitet för att lära sig sådana inbäddningar. Bland de olika ramverken som vi undersökte för att finjustera cuBERT, fann vi att modellen som finjusterades under triplet-förlustkriteriet var bäst lämpad för denna uppgift. Med detta ramverk kunde modellen lära sig inbäddningar som resulterade i de mest kompakta och välseparerade klusterna, där en majoritet av klustertilldelningarna var semantiskt sammanhängande med avseende på funktionaliteten som metoderna implementerade. Med dessa resultat har vi funnit belägg som tyder på att det är möjligt att lära sig inbäddning av källkod som bevarar och återger funktionella likheter mellan metoder. 
Framtida forskning kan därför syfta till att ytterligare undersöka de olika möjliga användningsområdena för de inbäddningar som lärts in inom de olika ramverken.
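The triplet loss criterion named in the abstract can be illustrated with a minimal sketch. This uses the standard margin-based formulation on plain Python lists; the embedding values, the margin of 1.0 and the function names are assumptions chosen for illustration, not the thesis's cuBERT fine-tuning code.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical 2-D embeddings of three methods (illustrative values only).
anchor = [0.0, 0.0]      # embedding of some method
positive = [3.0, 4.0]    # method with the same functionality, distance 5 from anchor
negative = [6.0, 8.0]    # method with different functionality, distance 10 from anchor

# Positive already closer than negative by more than the margin: no penalty.
loss_separated = triplet_loss(anchor, positive, negative)
# Roles swapped: the "same functionality" example is farther away, so the loss is positive.
loss_violated = triplet_loss(anchor, negative, positive)
```

Minimizing this quantity over many triplets pulls functionally similar methods together and pushes dissimilar ones apart, which is consistent with the compact, well-separated clusters the abstract reports.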
|
594 |
Elements of musically conveyed emotion: Insights from musical and perceptual analyses of historic preludes
Anderson, Cameron J. January 2021
This thesis comprises two manuscripts prepared for scholarly journals. Chapter 2 comprises an article entitled “Exploring Historic Changes in Musical Communication: Deconstructing Emotional Cues in Preludes by Bach and Chopin.”, which examines emotion perception in historic prelude sets by J.S. Bach and F. Chopin. This work connects psychological research on perceived musical emotion to musicological research describing changes in music structure. Using a technique called commonality analysis to deconstruct cues’ individual and joint roles in predicting participants’ perceived emotions, the chapter clarifies how music’s conveyed emotion can differ in compositions from different eras. Chapter 3 comprises an article entitled “Parsing Musical Patterns in Prelude Sets: Bridging Qualitative and Quantitative Epistemologies in Historical Music Research”. This chapter bridges gaps between qualitative and quantitative research on music history through an analytical approach engaging with both fields. Specifically, cluster analyses of Bach and Chopin’s preludes reveal notable differences in the composers’ expressive toolkits, consistent with work from historical and empirical music research. Through a novel analytical framework, the chapter illustrates a method for detecting groups of pieces demarcated by salient musical differences, assessing cues’ importance within these groups, and determining the most influential cue values for each group. Together, these articles provide new insight into the subtle sonic relationships influencing musical meaning and emotion perception. / Thesis / Master of Science (MSc) / Music’s capacity to express emotion has received considerable attention in psychological and musicological research. 
Whereas efforts from psychology clarify the musical cues for emotion through perceptual experiments, efforts from musicology track changes in compositional practice over time—finding changing relationships between music’s cues for emotion in historically diverse compositions. To date, the implications of these changing musical relationships for emotion perception remain unclear. This thesis analyzes musical scores and listeners’ emotion ratings to gain insight into music’s structural changes throughout history and their implications for perceived emotion. By applying statistical techniques to (i) detect musical patterns in prelude sets by J.S. Bach and F. Chopin and (ii) clarify how cue relationships influence emotion perception, this thesis sheds light on the relationship between music’s historic context and its emotional meaning.
|
595 |
DO THE CAUSES OF POVERTY VARY BY NEIGHBORHOOD TYPE?
Kandula, Uday Bhaskar January 2012
No description available.
|
596 |
How Do Socio-Demographics and The Built Environment Affect Individual Accessibility Based on Activity Space as A Transport Exclusion Indicator?
Chen, Na 08 November 2016
No description available.
|
597 |
INVESTIGATION OF THE CONSONANT ENDINGS OF THE CHAOSHAN DIALECT: A RESULT OF LANGUAGE CONTACT AND HORIZONTAL TRANSMISSION
Chen, Jin 08 May 2020
This thesis studies the inter-group variation of the consonant endings among five principal subgroups of the Chaoshan dialect, a branch of the South Min dialect in Eastern Guangdong Province, from the perspective of language contact and horizontal transmission. I conduct a quantitative study to present the synchronic variance of the consonant endings among five Chaoshan subgroups and the diachronic variance from Middle Chinese to modern Chaoshan dialect on a numerical scale.
The current literature tends to treat the change of the consonant endings as a process of weakening governed by regular rules. My research findings challenge this conventional view. First, the change of the consonant endings from Middle Chinese to the five subgroups of the modern Chaoshan dialect is irregular, an exception to the linguistic laws proposed in the existing literature. Second, I find that some characters that had no consonant ending, or a weakened ending, in 19th-century Chaozhou have reverted to having a consonant ending in the modern Chaoshan dialect. This reversal contradicts the weakening hypothesis, which regards the change of the consonant endings as a process of simplification. Finally, my quantitative research shows that the Chaozhou dialect of the 19th century is much closer to the modern Xiamen dialect than to the five subgroups of the modern Chaoshan dialect in terms of the relatedness of consonant endings, which is the opposite of the prediction that languages become increasingly different and have no subsequent contact with other daughter languages after separating from the proto-language.
I propose that the actual situation of the consonant endings in different subgroups of the Chaoshan dialect can be better explained from the perspective of language contact and horizontal transmission. The interaction between Han Chinese and non-Han Chinese is the primary reason for the change of the consonant endings of the Chaoshan dialect. In addition, the language contact between Chaoshan aborigines and migrants from Fujian Province led to the divergence of the consonant endings in different Chaoshan subgroups. Population structure and other social factors determine which phonetic features survive after repeated rounds of horizontal transmission.
|
598 |
Methods of Determining the Number of Clusters in a Data Set and a New Clustering Criterion
Yan, Mingjin 29 December 2005
In cluster analysis, a fundamental problem is to determine the best estimate of the number of clusters, which has a decisive effect on the clustering results. However, a limitation in current applications is that no convincingly acceptable solution to the best-number-of-clusters problem is available, owing to the high complexity of real data sets. In this dissertation, we tackle the problem of estimating the number of clusters, with particular attention to very complicated data that may contain multiple types of cluster structure. Two new methods of choosing the number of clusters are proposed, which have been shown empirically to be highly effective given clear and distinct cluster structure in a data set. In addition, we propose a sequential type of clustering approach, called multi-layer clustering, by combining these two methods. Multi-layer clustering not only functions as an efficient method of estimating the number of clusters but also, by superimposing a sequential idea, improves the flexibility and effectiveness of any existing one-layer clustering method. Empirical studies have shown that multi-layer clustering is more efficient than one-layer clustering approaches, especially in detecting clusters in complicated data sets. The multi-layer clustering approach has been successfully applied to clustering the WTCHP microarray data, and the results can be interpreted very well in light of existing biological knowledge.
Choosing an appropriate clustering method is another critical step in clustering. K-means clustering is one of the most popular clustering techniques used in practice. However, the k-means method tends to generate clusters containing a nearly equal number of objects, which is referred to as the “equal-size” problem. We propose a clustering method that competes with the k-means method. Our newly defined method is aimed at overcoming the “equal-size” problem associated with the k-means method while maintaining its advantage of computational simplicity. Advantages of the proposed method over k-means clustering have been demonstrated empirically using simulated data with low dimensionality. / Ph. D.
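The number-of-clusters problem described in this abstract can be illustrated with a small, generic sketch: run k-means for several candidate values of k and compare the within-cluster sum of squares. This is a common textbook heuristic and is not either of the two methods the dissertation proposes; the data values, the deterministic initialization and all function names are assumptions made for the example.

```python
def assign(data, centers):
    """Assign each point to its nearest center; returns one list per center."""
    clusters = [[] for _ in centers]
    for x in data:
        nearest = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
        clusters[nearest].append(x)
    return clusters

def kmeans_1d(points, k, iterations=50):
    """Lloyd's algorithm on 1-D data with a deterministic, evenly spaced start."""
    data = sorted(points)
    # Initial centers: k values spread evenly through the sorted data
    # (assumes these picks are distinct, which holds for the toy data below).
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = assign(data, centers)
        # Keep the old center if a cluster happens to be empty.
        new_centers = [sum(c) / len(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(data, centers)

def within_ss(centers, clusters):
    """Total within-cluster sum of squared distances to each cluster's center."""
    return sum((x - m) ** 2 for m, c in zip(centers, clusters) for x in c)

# Hypothetical data: three well-separated groups of unequal size (6, 3 and 2 points).
data = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 5.0, 5.2, 4.8, 9.0, 9.1]

wss = {}    # k -> within-cluster sum of squares
sizes = {}  # k -> sorted cluster sizes
for k in range(1, 5):
    centers, clusters = kmeans_1d(data, k)
    wss[k] = within_ss(centers, clusters)
    sizes[k] = sorted(len(c) for c in clusters)
```

On this toy data the within-cluster sum of squares falls sharply up to k = 3 and only slightly afterwards, which is the kind of signal such heuristics rely on. With clear structure like this the choice is easy; the abstract's point is that real data sets are rarely this clean.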
|
599 |
Die Rolle von Unternehmen beim Verkehrsverhalten im Personenwirtschaftsverkehr
Hebes, Paul 23 November 2011
Eine steigende Anzahl Beschäftigter ist im Berufsalltag mobil. Zur Erbringung von Dienstleistungen und zum Zwecke von Geschäftsreisen führen Mitarbeiter regelmäßig Fahrten mit dem Motorisierten Individualverkehr durch. Der so entstehende Personenwirtschaftsverkehr belastet vor allem in den hochverdichteten Innenstadtbereichen die Infrastruktur, die Umwelt und die Gesellschaft. In der deutschen wie in der internationalen Forschung ist trotz seiner Relevanz wenig darüber bekannt, wie sich der Personenwirtschaftsverkehr im Straßenraum manifestiert und welche Faktoren das Verkehrsverhalten bestimmen. Die vorliegende Dissertationsschrift nutzt zwei empirische Datensätze um die Kenntnislücken zum Personenwirtschaftsverkehr zu schließen, die Studie ‚Kraftfahrzeugverkehr in Deutschland, KiD 2002‘ und die ‚Dienstleistungsverkehrsstudie, DLVS‘. Die neuen Erkenntnisse ermöglichen eine verbesserte Modellierung des (Personen-)Wirtschaftsverkehrs und erleichtern die Planung und Lenkung kommunaler (städtischer) Verkehre. Die Ergebnisse dieser Arbeit zeigen, dass zwischen vier charakteristischen Verkehrsverhalten unterschieden werden kann. Im Rahmen des Personenwirtschaftsverkehrs gibt es sowohl Tourenmuster, die sich durch wenige Stopps und eine geringe Verkehrsleistung auszeichnen als auch Fahrzeuge, die zahlreiche Ziele am Tag ansteuern und eine hohe Verkehrsbeteiligung aufweisen. Die statistischen Analysen belegen außerdem, dass sich die Tourenmuster von Fahrzeugen unterscheiden, die entweder ausschließlich dienstlich oder aber auch privat eingesetzt werden dürfen. Die Berechnung von multivariaten Regressionsmodellen beweist, dass sowohl interne Strukturfaktoren und interne Prozessfaktoren als auch externe Strukturfaktoren und externe Prozessfaktoren eine Rolle beim Verkehrsverhalten spielen. Das bedeutet, die unternehmensbezogenen Faktoren, vor allem aber die Unternehmensstrukturen, sind mit ausschlaggebend dafür, welches der vier Verkehrsverhalten Firmenfahrzeuge aufweisen. 
/ More and more employees are mobile during working hours. To provide services and for business trips, employees use motor vehicles regularly. The emerging service-related traffic burdens the infrastructure, the environment and the society, particularly in high density urban areas. Despite its relevance there is little German and international research on travel behavior of service-related traffic. Even less is known about what factors might influence tour characteristics of service-related traffic. To close this gap of knowledge this dissertation utilizes two data sets for empirical research, ‘Kraftfahrzeugverkehr in Deutschland, KiD 2002’ (‘Motor Vehicle Traffic in Germany’) and ‘Service-Related Traffic’. The findings allow enhanced commercial transport- and service-related traffic modeling and facilitate urban transport planning and direction. The empirical results show that four typical travel patterns can be differentiated. Against the background of service-related traffic there are on the one hand vehicles which are characterized by only a few stops and little road performance per day. On the other hand many cars visit numerous customers and participate a lot in traffic. Statistical analyses also prove that travel patterns differ, depending on an exclusive business or a permitted private use of corporate vehicles. The calculation of multivariate regression models shows that four corporate factor groups, namely internal structures and internal processes as well as external structures and external processes, play a role in travel behavior. This means that company-related factors, especially corporate structure, are decisive for corporate vehicles’ travel patterns.
|
600 |
應用資料採礦於零售通路業之商品力矩陣分析-以某連鎖藥妝銷售資料為例 / The Application of Data Mining on Commodity Competitiveness Matrix Analysis of Retailing Industry-Case Study of Chained Drugstore Sales Data
賴柏龍, Lai, Po Lung Unknown Date
由於台灣國人所得提高,生活水準跟著日漸提高,近年來更是意識到健康對個人及家庭的重要性,因此國內健康食品與藥品市場在這幾年蓬勃地發展,特別是連鎖藥妝的普及,結合藥品、健康食品與開架式保養品、化妝品銷售,提供專業藥師諮詢服務,成為複合式的經營模式。但近年來連鎖藥妝零售業者也面臨來自外商連鎖藥妝、本土連鎖藥妝、地區性連鎖藥局等不同體系的競爭,因此藥品及化粧品零售業者普遍認同,目前經營上所面臨之困難主要為「同業競爭激烈」。
商品力為一連鎖藥妝零售業者成功的重要因素,具體展現在商品多樣性、商品獲利性、商品價格競爭力、商品獨特性…等不同的面向。目前藥品及化粧品零售業中,確實大部分的業者都有商品企劃或設計的需求,但有商品企劃或設計部門者僅為少數。利用資料採礦技術,將能在不大量增加人事費用的情況下,有效率地協助進行商品企劃或設計,進而提升連鎖藥妝零售業者的商品力。
本研究將針對資料採礦在連鎖藥妝上的應用進行探討,包含以下研究目的:
1. 利用資料採礦中之集群分析建置商品力矩陣,代表他們的屬性與價值。透過商品力矩陣釐清各商品的定位,幫助決策者優化商品組合,針對各商品執行妥善策略安排。
2. 依循集群分析後的結果,更進一步進行商品分類的關聯規則分析。幫助決策者將集群分析之成果化為實務決策之參考,優化商品組合,針對各商品執行妥善策略安排,也為關聯規則的整理帶來新的應用方式。
3. 根據上述兩模型建置之結果,對H連鎖藥妝提出具體可行之行銷策略建議。
本研究利用資料採礦中的Two-step Cluster模型建置出H連鎖藥妝中各項商品的商品力矩陣,此矩陣的兩軸分別為「個別商品的平均毛利」及「個別商品的年交易筆數」,將各種商品概略分為明星、樂透、忠狗、問號四大類商品,分別代表他們不同的屬性與價值。同時配合關聯規則分析,提出具體可行之候選規則篩選模式:
1. 樂透型商品,應用方式有兩種,將樂透型商品放在Apriori模型中的後項,找出導購向樂透型商品的潛在模式;將樂透型商品放在Apriori模型中的前項,並將後項商品作為加價購搭售促銷標的,提升購買樂透型商品的意願。
2. 忠狗型商品,應用方式也有兩種,將忠狗型商品放在Apriori模型中的前項,找出可能導購的商品標的,推出合適的加價購搭售促銷活動;另外也可以藉由觀察忠狗型商品的消費行為,進而提供適當的促銷、推薦,提高其他品項交叉銷售的可能性。
/ As incomes in Taiwan have risen, living standards have improved, and people have become more aware of how important health is to individuals and families. As a result, the market for dietary supplements and drugs has flourished in recent years. Chained drugstores in particular have spread widely, becoming hybrid stores that combine sales of drugs, dietary supplements, open-shelf skincare products and cosmetics with professional pharmacist consultation. At the same time, these retailers face competition from foreign drugstore chains, local drugstore chains and regional pharmacy chains, so drug and cosmetics retailers generally agree that their main operating difficulty is intense industry competition.
Commodity competitiveness is a key success factor for chained drugstores and shows in product diversity, profitability, price competitiveness, uniqueness and so on. Most drug and cosmetics retailers need product planning or design, but only a few have a dedicated department for it. Data mining can support product planning and design efficiently, without a large increase in personnel costs, and thereby raise the commodity competitiveness of chained drugstores.
This study examines the application of data mining to chained drugstores, with the following objectives:
1. Build a commodity competitiveness matrix using cluster analysis, representing the attributes and value of each product. Positioning products on the matrix helps decision makers optimize the product mix and apply an appropriate strategy to each product.
2. Building on the cluster analysis results, carry out association rule analysis on the product categories. This helps decision makers turn the clustering results into references for practical decisions, and it offers a new way of organizing association rules.
3. Based on the results of the two models, propose concrete and feasible marketing strategies for the H drugstore chain.
This study uses the Two-step Cluster model to build the commodity competitiveness matrix for the products of the H drugstore chain. The two axes of the matrix are the “average gross margin of the individual product” and the “annual number of transactions of the individual product”. Products are broadly divided into four categories, Star, Lottery, Greyfriars and Question Mark, each representing different attributes and value. Combined with association rule analysis, the study proposes a practical way of filtering candidate rules:
1. Lottery products, which can be used in two ways:
Place Lottery products as consequents in the Apriori model to find the latent patterns that lead customers to buy them.
Place Lottery products as antecedents and offer the consequent products as add-on purchase promotions, raising customers' willingness to buy Lottery products.
2. Greyfriars products, which can also be used in two ways:
Place Greyfriars products as antecedents to find products they may lead customers to buy, and launch suitable add-on purchase promotions.
Observe the purchasing behavior around Greyfriars products and offer appropriate promotions and recommendations, raising the likelihood of cross-selling other items.
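The rule-filtering idea above (fixing a product category as the consequent and looking for antecedents that lead to it) can be sketched with the standard support, confidence and lift measures. This is only an illustration of those measures on single-item antecedents, not the full Apriori algorithm and not the study's actual model; the transactions, the product names, the choice of `face_mask` as a stand-in "Lottery" product and the 0.6 confidence threshold are all hypothetical.

```python
# Hypothetical baskets; each set is one transaction.
transactions = [
    {"vitamins", "face_mask"},
    {"vitamins", "face_mask", "shampoo"},
    {"vitamins", "shampoo"},
    {"face_mask"},
    {"shampoo", "toothpaste"},
    {"vitamins", "face_mask", "toothpaste"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_metrics(antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    s_both = support(antecedent | consequent)
    confidence = s_both / support(antecedent)
    lift = confidence / support(consequent)
    return s_both, confidence, lift

# Treat "face_mask" as the product whose purchase we want to encourage
# and keep the single-item antecedents whose rule confidence is high enough.
target = {"face_mask"}
min_confidence = 0.6  # assumed threshold for the example
items = sorted(set().union(*transactions) - target)
candidates = [item for item in items
              if rule_metrics({item}, target)[1] >= min_confidence]

vit_support, vit_confidence, vit_lift = rule_metrics({"vitamins"}, target)
```

Swapping the roles, with the chosen product as the antecedent, gives the second use described in the list: the consequents of such rules are the natural candidates for add-on purchase promotions.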
|