  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
891

Text mining of online book reviews for non-trivial clustering of books and users

Lin, Eric 14 August 2013
Indiana University-Purdue University Indianapolis (IUPUI) / The classification of consumable media by mining relevant text for their identifying features is a subjective process. Previous attempts to perform this type of feature mining have generally been limited in scope due to limited access to user data. Many of these studies used human domain knowledge to evaluate the accuracy of the features extracted using these methods. In this thesis, we mine book review text to identify nontrivial features of a set of similar books. We compare books by looking for shared characteristics, ultimately clustering the books in our data set. We use the same mining process to identify a corresponding set of characteristics in users. Finally, we evaluate the quality of our methods by examining the correlation between our similarity metric and user ratings.
892

Viejo Period Architecture in the Casas Grandes Region of Northern Mexico

Jensen, Samuel J. 24 April 2023
The Casas Grandes region of northern Mexico is an understudied, though important, part of the culture area that has come to be known as the Northwest/Southwest (NW/SW). What studies have been conducted in the Casas Grandes region have focused on the Medio Period (approximately 1200-1450 AD) and the large site of Paquimé. Only a small amount of research has been conducted on the preceding Viejo Period (approximately 700-1200 AD). In this thesis, I create a clearing house of published Viejo Period architectural features excavated in the Casas Grandes region. I also analyze those features to develop our understanding of the materials and technological choices used to construct these features, and to evaluate the validity of sub-regional zones which have begun to develop within the archaeological literature from this area. These analyses include a qualitative analysis of the excavated architectural features as well as statistical clustering methods, a Principal Components Analysis, and a Correspondence Analysis of available architectural data. I ultimately propose revisions to the existing architectural typology for the Viejo Period and the abandonment of the concept of sub-regional zones within the Casas Grandes region. I also observe some emerging patterns within the architectural data and suggest that further research is needed to fully understand the distribution of architectural features throughout the region.
893

A comparative study on a practical use case for image clustering based on common shareability and metadata / En jämförande studie i ett praktiskt användningsfall för bildklustring baserat på gemensamt delade bilder och dess metadata

Dackander, Erik January 2018
As the amount of data increases every year, the need for effective structuring of data grows with it. This thesis investigates and compares how four clustering algorithms perform on a practical use case for images. The four algorithms are Affinity Propagation, BIRCH, Rectifying Self-Organizing Maps (RSOM), and Deep Embedded Clustering (DEC). The algorithms are given the image metadata and the image content, the latter extracted using a pre-trained deep convolutional neural network. The results demonstrate that, while there are variations in the data, Affinity Propagation and BIRCH show the most potential among the four algorithms. Furthermore, when metadata is available, it improves the results of the algorithms that can process the extreme values it can cause. For Affinity Propagation, the mean share score improves by 5.6 percentage points and the silhouette score by 0.044. For BIRCH, the mean share score improves by 1.9 percentage points and the silhouette score by 0.051. RSOM and DEC could not process the metadata. / Allt eftersom datamängderna ökar för varje år som går så ökar även behovet av att strukturera datan på ett bra sätt. Detta arbete syftar till att undersöka och jämföra hur väl fyra olika klustringsalgoritmer fungerar för ett praktiskt användningsfall med bilder. De fyra algoritmerna som används är Affinity Propagation, BIRCH, Rectifying Self-Organizing Maps och Deep Embedded Clustering. Algoritmerna hade bildernas metadata samt deras innehåll, framtaget med hjälp av ett deep convolutional neural network, att använda för klustringen. Resultaten visar att även om det finns stora variationer i utfallen, visar Affinity Propagation och BIRCH den största potentialen av de fyra algoritmerna. Vidare verkar metadatan, när den finns tillgänglig, förbättra resultaten för de klustringsalgoritmer som kunde hantera de extremvärden som metadatan kunde ge upphov till. För Affinity Propagation förbättrades den genomsnittliga delningspoängen med 5,6 procentenheter och dess silhouette index ökade med 0.044. BIRCHs genomsnittliga delningspoäng ökade med 1,9 procentenheter samt dess silhouette index förbättrades med 0.051. RSOM och DEC kunde inte processa metadatan.
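To put the reported silhouette improvements (0.044 and 0.051) in context, the following is a minimal sketch of the standard silhouette score, which ranges from -1 to 1. It is not taken from the thesis: the function name `silhouette_score` and the choice of Euclidean distance are assumptions made for illustration, since the abstract does not state which distance measure was used.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (illustrative sketch)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # common convention: a singleton cluster contributes 0
        # a: mean distance from p to the other members of its own cluster
        # (dist(p, p) == 0, so dividing by len(own) - 1 excludes p itself)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance from p to the members of another cluster
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy data: two tight, well-separated groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])
bad = silhouette_score(points, [0, 1, 0, 1])
```

On this toy data the natural grouping scores about 0.90, while the deliberately mixed labeling scores below zero, so a change of a few hundredths, as reported above, is a modest shift on this scale.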
894

Deinterleaving of radar pulses with batch processing to utilize parallelism / Gruppering av radar pulser med batch-bearbetning för att utnyttja parallelism

Lind, Emma, Stahre, Mattias January 2020
The threat level (specifically in this thesis, for aircraft) in an environment can be determined by analyzing radar signals. This task is critical and has to be solved fast and with high accuracy. The received electromagnetic pulses have to be identified in order to classify a radar emitter. Usually, there are several emitters transmitting radar pulses at the same time in an environment. These pulses need to be sorted into groups, where each group contains pulses from the same emitter. This thesis aims to find a fast and accurate solution to sort the pulses in parallel. The selected approach analyzes batches of pulses in parallel to exploit the advantages of a multi-threaded Central Processing Unit (CPU) or a Graphics Processing Unit (GPU). First, a suitable clustering algorithm had to be selected. Second, an optimal batch size had to be determined to achieve high clustering performance and to rapidly process the batches of pulses in parallel. A quantitative method based on experiments was used to measure clustering performance, execution time, system response, and parallelism as a function of batch size when using the selected clustering algorithm. The algorithm selected for clustering the data was Density-based Spatial Clustering of Applications with Noise (DBSCAN) because of its advantages, such as not having to specify the number of clusters in advance, its ability to find clusters of arbitrary shape in a data set, and its low time complexity. The evaluation showed that implementing parallel batch processing is possible while still achieving high clustering performance, compared to a sequential implementation that used the maximum likelihood method. An optimal batch size in terms of data points and cutoff time is hard to determine since the batch size is very dependent on the input data. Therefore, one batch size might not be optimal in terms of clustering performance and system response for all streams of data. 
A solution could be to determine optimal batch sizes in advance for different streams of data, then adapt the batch size to the stream of data. However, with a high level of parallelism, an additional delay is introduced that depends on the difference between the time it takes to collect data points into a batch and the time it takes to process the batch; thus, the system will be slower to output its result for a given batch compared to a sequential system. For a time-critical system, a high level of parallelism might be unsuitable since it leads to slower response times. / Genom analysering av radarsignaler i en miljö kan hotnivån bestämmas. Detta är en kritisk uppgift som måste lösas snabbt och med bra noggrannhet. För att kunna klassificera en specifik radar måste de elektromagnetiska pulserna identifieras. Vanligtvis sänder flera emittrar ut radarpulser samtidigt i en miljö. Dessa pulser måste sorteras i grupper, där varje grupp innehåller pulser från en och samma emitter. Målet med denna avhandling är att ta fram ett sätt att snabbt och korrekt sortera dessa pulser parallellt. Den valda metoden använder grupper av data som analyserades parallellt för att nyttja fördelar med en multitrådad Central Processing Unit (CPU) eller en Graphics Processing Unit (GPU). Först behövde en klustringsalgoritm väljas och därefter en optimal gruppstorlek för den valda algoritmen. Gruppstorleken baserades på att grupperna kunde behandlas parallellt och snabbt, samt uppnå tillförlitlig klustring. En kvantitativ metod användes som baserades på experiment genom att mäta klustringens tillförlitlighet, exekveringstid, systemets svarstid och parallellitet som en funktion av gruppstorlek med avseende på den valda klustringsalgoritmen. 
Density-based Spatial Clustering of Applications with Noise (DBSCAN) valdes som algoritm på grund av dess förmåga att hitta kluster av olika former och storlekar utan att på förhand ange antalet kluster för en mängd datapunkter, samt dess låga tidskomplexitet. Resultaten från utvärderingen visade att det är möjligt att implementera ett system med grupper av pulser och uppnå bra och tillförlitlig klustring i jämförelse med en sekventiell implementation av maximum likelihood-metoden. En optimal gruppstorlek i antal datapunkter och cutoff tid är svårt att definiera då storleken är väldigt beroende på indata. Det vill säga, en gruppstorlek måste inte nödvändigtvis vara optimal för alla typer av indataströmmar i form av tillförlitlig klustring och svarstid för systemet. En lösning skulle vara att definiera optimala gruppstorlekar i förväg för olika indataströmmar, för att sedan kunna anpassa gruppstorleken efter indataströmmen. Det uppstår en fördröjning i systemet som är beroende av differensen mellan tiden det tar att skapa en grupp och exekveringstiden för att bearbeta en grupp. Denna fördröjning innebär att en parallell grupp-implementation aldrig kommer kunna vara lika snabb på att producera sin utdata som en sekventiell implementation. Detta betyder att det i ett tidskritiskt system förmodligen inte är optimalt att parallellisera mycket eftersom det leder till långsammare svarstid för systemet.
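The batch approach described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the pulse features, `eps`, `min_pts`, and the helper names are hypothetical, the DBSCAN below is a plain O(n²) version without a spatial index, and CPython threads only demonstrate the structure (true CPU parallelism would need processes or a GPU).

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN; returns one label per point, NOISE for outliers."""
    labels = [None] * len(points)

    def neighbours(i):
        # Every point within eps of points[i], including the point itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nb)
    return labels


def split_into_batches(pulses, batch_size):
    """Cut the pulse stream into consecutive fixed-size batches."""
    return [pulses[i:i + batch_size] for i in range(0, len(pulses), batch_size)]


def cluster_batches(pulses, batch_size, eps, min_pts, workers=4):
    """Cluster each batch independently; labels are local to each batch."""
    batches = split_into_batches(pulses, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda b: dbscan(b, eps, min_pts), batches))


# Hypothetical two-feature pulses from two emitters plus one stray pulse.
pulses = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
          (9.0, 0.0)]
per_batch_labels = cluster_batches(pulses, batch_size=3, eps=0.3, min_pts=3)
```

Because each batch is clustered on its own, cluster labels from different batches are unrelated; matching them across batches is a separate step that this sketch leaves out. It also shows why batch size matters: a batch that splits one emitter's pulses too thinly can fall below `min_pts` and be reported as noise.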
895

Concentric Layout, A New Scientific Data Layout For Matrix Data Set In Hadoop File System

Cheng, Lu 01 January 2010
The data generated by scientific simulations, sensors, monitors, and optical telescopes has grown at a dramatic rate. To analyze the raw data efficiently in both time and space, a data preprocessing operation is needed to achieve better performance in the data analysis phase. Current research shows an increasing trend of adopting the MapReduce framework for large-scale data processing. However, the data access patterns that generally apply to scientific data sets are not directly supported by the current MapReduce framework. This gap between the requirements of analytics applications and the properties of the MapReduce framework motivates us to provide support for these data access patterns in MapReduce. In our work, we studied the data access patterns in matrix files and proposed a new concentric data layout to facilitate matrix data access and analysis in the MapReduce framework. The concentric data layout maintains the dimensional property at the chunk level. In contrast to the continuous data layout adopted by default in the current Hadoop framework, the concentric data layout stores the data from the same sub-matrix in one chunk. This matches well with matrix operations such as computation. The concentric data layout preprocesses the data beforehand and optimizes the subsequent run of the MapReduce application. The experiments indicate that the concentric data layout improves overall performance: it reduces the execution time by 38% when the file size is 16 GB, relieves the data overhead phenomenon, and increases the effective data retrieval rate by 32% on average.
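The idea of keeping a sub-matrix inside one chunk can be illustrated with a small sketch that counts how many chunks a sub-matrix read touches under two layouts. This is a generic blocked layout, assumed here for illustration only; the abstract does not describe the concentric layout in enough detail to reproduce it, and the matrix size, chunk size, and function names are hypothetical.

```python
def chunks_row_major(n_cols, chunk_elems, r0, r1, c0, c1):
    """Chunks touched when elements are stored row by row in fixed-size chunks."""
    touched = set()
    for r in range(r0, r1):
        for c in range(c0, c1):
            # Linear position of element (r, c), then the chunk that holds it.
            touched.add((r * n_cols + c) // chunk_elems)
    return touched


def chunks_blocked(block, r0, r1, c0, c1):
    """Chunks touched when every block x block sub-matrix is its own chunk."""
    return {(r // block, c // block)
            for r in range(r0, r1)
            for c in range(c0, c1)}


# Hypothetical setup: a 64 x 64 matrix with 256 elements per chunk,
# i.e. 4 full rows per chunk (row-major) or one 16 x 16 block (blocked).
# Read the 16 x 16 sub-matrix covering rows 16..31 and columns 16..31.
row_major = chunks_row_major(n_cols=64, chunk_elems=256, r0=16, r1=32, c0=16, c1=32)
blocked = chunks_blocked(block=16, r0=16, r1=32, c0=16, c1=32)
```

In this setup the row-major layout touches 4 chunks and the blocked layout touches 1, which is the kind of saving in fetched data that a sub-matrix-aware layout aims for.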
896

Feature Pruning For Action Recognition In Complex Environment

Nagaraja, Adarsh 01 January 2011
A significant number of action recognition research efforts use spatio-temporal interest point detectors for feature extraction. Although the extracted features provide useful information for recognizing actions, a significant number of them contain irrelevant motion and background clutter. In many cases, the extracted features are included as is in the classification pipeline, and sophisticated noise removal techniques are subsequently used to alleviate their effect on classification. We introduce a new action database, created from the Weizmann database, that reveals a significant weakness in systems based on popular cuboid descriptors. Experiments show that introducing complex backgrounds, stationary or dynamic, into the video causes a significant degradation in recognition performance. Moreover, this degradation cannot be fixed by fine-tuning the system or selecting better interest points. Instead, we show that the problem lies at the descriptor level and must be addressed by modifying descriptors.
897

Определение эффективных подгрупп в социальной группе на основе применения методологии анализа социальных сетей (SNA-методологии) : магистерская диссертация / Detection of effective subgroups in a social group on the basis of SNA-methodology implementation

Муравьев, А. А., Muravyov, A. A. January 2020
В магистерской диссертации производится сравнительный анализ четырех программных инструментов, которые поддерживают методологию анализа социальных сетей (SNA - методологию), и могут быть использованы для решения задачи формирования эффективных команд. В терминах SNA-методологии это есть поиск подгрупп в социальной группе. Приводится описание наиболее известных алгоритмов кластеризации, а также уровень поддержки этих алгоритмов существующими программными инструментами. В результате определяется наиболее эффективный алгоритм и наиболее удобный программный инструмент для решения данной задачи. / This master's dissertation presents a comparative analysis of four software tools that support the social network analysis methodology (SNA methodology) and can be used to solve the problem of forming effective teams. In terms of the SNA methodology, this is a search for subgroups in a social group. The best-known clustering algorithms are described, along with the level of support for these algorithms in existing software tools. As a result, the most effective algorithm and the most convenient software tool for solving this problem are determined.
898

New Clustering and Feature Selection Procedures with Applications to Gene Microarray Data

Xu, Yaomin January 2008
No description available.
899

Development of a Landslide Hazard Rating System for Selected Counties in Northeastern Ohio

Dalqamouni, Ahmad Yousef 07 March 2011
No description available.
900

Explication of Political User-Generated Content and Theorizing about Its Effects on Democracy with a Mix-of-Attributes Approach and Documenting Attribute Presence with a Quantitative Content Analysis

Dylko, Ivan B. 25 July 2011
No description available.
