841

Evolution of Software Documentation Over Time : An analysis of the quality of software documentation

Tévar Hernández, Helena January 2020
Software developers, maintainers, and testers rely on documentation to understand the code they are working with. However, software documentation is perceived as a waste of effort because it is usually outdated. How documentation evolves through a set of releases may show whether there is any relationship between time and quality. The results could help future developers and managers to improve the quality of their documentation and decrease the time developers use to analyze code. Previous studies showed that documentation used to be scarce and low in quality; thus, this research has investigated different variables to check if the quality of the documentation changes over time. Therefore, we have created a tool that would extract and calculate the quality of the comments in code blocks, classes, and methods. The results have agreed with the previous studies. The quality of the documentation is affected to some extent through the releases, with a tendency to decrease.
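To make the kind of measurement described above concrete, the sketch below shows how comments could be extracted per module, class, and method from Python source with the standard `ast` module and scored with a crude presence/length heuristic. It is not the thesis' actual tool (which defines its own quality model and may target another language); the scoring weights and the length threshold are illustrative assumptions.

```python
import ast

def documentation_quality(source: str) -> float:
    """Crude documentation-quality score for one source file.

    Score = average over documentable nodes (module, classes, functions)
    of a docstring presence/length heuristic. The 20-character threshold
    is an arbitrary illustrative choice, not the thesis' metric.
    """
    tree = ast.parse(source)
    nodes = [tree] + [
        n for n in ast.walk(tree)
        if isinstance(n, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    scores = []
    for node in nodes:
        doc = ast.get_docstring(node)
        if not doc:
            scores.append(0.0)      # undocumented
        elif len(doc) < 20:
            scores.append(0.5)      # documented, but very short
        else:
            scores.append(1.0)      # documented with some substance
    return sum(scores) / len(scores)

if __name__ == "__main__":
    sample = '''
"""Module docstring."""
class Greeter:
    def greet(self, name):
        """Return a greeting for name."""
        return f"Hello, {name}"
'''
    print(f"quality score: {documentation_quality(sample):.2f}")
```

Running such a score over a project's release tags would produce the quality-over-time series that the thesis analyzes.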
842

Mining user similarity in online social networks : analysis, modeling and applications

Han, Xiao 21 May 2015
Online Social Networks (OSNs) (e.g., Facebook, Twitter and LinkedIn) have gained overwhelming popularity and accumulated massive digital data about human society. These massive data, representing individuals' personal and social information, provide us with unprecedented opportunities to study, analyze and model the complex network structure, human connections, people similarity, etc. Meanwhile, OSNs have triggered a large number of profitable applications and services which seek to maintain vibrant connections and advance users' experience. In this context, how to devise such applications and services, especially how to extract and exploit effective social features from the massive available data to enhance the applications and services, has received much attention. This dissertation, aiming to enhance the social applications and services, investigates three critical and practical issues in OSNs: (1) How can we explore potential friends for a user to establish and enlarge her social connections? (2) How can we discover interesting content for a user to satisfy her personal tastes? (3) How can we inform a user of the exposure risk of her private information to preserve her privacy? Drawing on the insights about people's similarity in social science, and in particular on the homophily principle that similar people (e.g., of the same age, education or profession) are more likely to communicate, trust and share information with each other than dissimilar ones, this dissertation studies the widespread similarity principle in OSNs in terms of whether similar users would be close in their social relationships, similar in their interests, or approximate in their geo-distance, relying on 500K user profiles collected from Facebook; it further explores solutions to effectively leverage the observed similarity principle to address the aforementioned practical issues through four applications and services: (1) Effects of user similarity on link prediction for new users: based on the limited information obtained during a new user's registration, together with the attributes and links of existing users in an OSN, we study how the similarity between two users affects the probability that they become friends, and propose an efficient link prediction model for new users. (2) Mining user similarity for content discovery in social P2P networks: we examine how the similarity and knowledge of OSN participants could benefit their content discovery in P2P networks; we build a social P2P network model in which each peer assigns more weight to its OSN friends that have higher similarity and more knowledge, and, using a random walk with restart method, we present a new content discovery algorithm on top of the proposed model. (3) Interest similarity inspection, prediction and application: we present detailed empirical studies on interest similarity and reveal that people are likely to exhibit similar tastes if they share similar demographic information (e.g., age, location) or if they are friends; accordingly, given a new user whose interests (...)
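The content-discovery component rests on a random walk with restart over the weighted social P2P graph. Below is a minimal sketch of that primitive, assuming a small adjacency matrix whose weights already encode similarity and knowledge; the toy graph, weights, and restart probability are illustrative, not taken from the dissertation.

```python
import numpy as np

def random_walk_with_restart(weights: np.ndarray, seed: int,
                             restart: float = 0.15, tol: float = 1e-9) -> np.ndarray:
    """Stationary visiting probabilities of a walk that restarts at `seed`.

    `weights[i, j]` is the (similarity/knowledge) weight peer i places on peer j.
    """
    # Column-normalize so each column sums to 1 (transition probabilities).
    col_sums = weights.sum(axis=0)
    transition = weights / np.where(col_sums == 0, 1.0, col_sums)

    n = weights.shape[0]
    restart_vec = np.zeros(n)
    restart_vec[seed] = 1.0

    p = restart_vec.copy()
    while True:
        p_next = (1 - restart) * transition @ p + restart * restart_vec
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next

# Toy 4-peer network: peer 0 is the querying peer.
w = np.array([[0, 3, 1, 0],
              [3, 0, 2, 1],
              [1, 2, 0, 4],
              [0, 1, 4, 0]], dtype=float)
scores = random_walk_with_restart(w, seed=0)
print("relevance of peers to peer 0:", np.round(scores, 3))
```

Peers with high stationary probability are the ones the querying peer would poll first for content.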
843

Statistically and computationally efficient hypothesis tests for similarity and dependency

Bounliphone, Wacha 30 January 2017
The dissertation presents novel statistically and computationally efficient hypothesis tests for relative similarity and dependency, and for precision matrix estimation. The key methodology adopted in this thesis is the class of U-statistic estimators, which yield minimum-variance unbiased estimates of a parameter. The first part of the thesis focuses on relative similarity tests applied to the problem of model selection. Probabilistic generative models provide a powerful framework for representing data, but model selection in this generative setting can be challenging. To address this issue, we provide a novel non-parametric hypothesis test of relative similarity and test whether a first candidate model generates a data sample significantly closer to a reference validation set than a second candidate model does. Subsequently, the second part of the thesis develops a novel non-parametric statistical hypothesis test for relative dependency. Tests of dependence are important tools in statistical analysis, and several canonical tests for the existence of dependence have been developed in the literature. However, the question of whether a dependency exists at all is often only a first step: determining whether one dependence is stronger than another is frequently necessary for decision making. We present a statistical test which determines whether one variable is significantly more dependent on a first target variable or on a second. Finally, a novel method for structure discovery in a graphical model is proposed. Making use of the result that zeros of a precision matrix encode conditional independencies, we develop a test that estimates and bounds an entry of the precision matrix. Methods for structure discovery in the literature typically make restrictive distributional (e.g. Gaussian) or sparsity assumptions that may not apply to a data sample of interest. Consequently, we derive a new test that makes use of results for U-statistics applied to the covariance matrix, which then implies a bound on the precision matrix.
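For intuition about U-statistic-based similarity testing, the sketch below computes an unbiased U-statistic estimate of the squared maximum mean discrepancy (MMD²) between each of two candidate sample sets and a reference set, and compares the two point estimates. This is only an illustration under a Gaussian kernel with an assumed bandwidth and synthetic data; a complete relative test would also need the (co)variance of the two statistics to produce a calibrated p-value, which the sketch omits.

```python
import numpy as np

def gaussian_kernel(a: np.ndarray, b: np.ndarray, bandwidth: float = 1.0) -> np.ndarray:
    """Gram matrix K[i, j] = exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def mmd2_unbiased(x: np.ndarray, y: np.ndarray, bandwidth: float = 1.0) -> float:
    """Unbiased U-statistic estimate of MMD^2 between samples x and y."""
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    # Exclude diagonal terms so the estimate is unbiased.
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * kxy.mean()

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(200, 2))   # validation set
model_a   = rng.normal(0.1, 1.0, size=(200, 2))   # candidate close to the reference
model_b   = rng.normal(1.0, 1.5, size=(200, 2))   # candidate further away
mmd_a = mmd2_unbiased(model_a, reference)
mmd_b = mmd2_unbiased(model_b, reference)
print(f"MMD^2(A, ref) = {mmd_a:.4f}, MMD^2(B, ref) = {mmd_b:.4f}")
print("model A is relatively closer" if mmd_a < mmd_b else "model B is relatively closer")
```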
844

Destabilization of protein-based emulsions caused by bacteriostatic emulsifiers

Matsumiya, Kentaro 24 March 2014
Kyoto University / 0048 / New degree system, doctorate by dissertation / Doctor of Agricultural Science / Otsu No. 12820 / Ron-No-Haku No. 2793 / New system||Agr||1025 (University Library) / Dissertation||H26||N4815 (Faculty of Agriculture Library) / 31307 / Division of Agronomy, Graduate School of Agriculture, Kyoto University / (Chief examiner) Prof. Yasuki Matsumura, Prof. Reiko Urade, Prof. Shuji Adachi / Eligible under Article 4, Paragraph 2 of the Degree Regulations / Doctor of Agricultural Science / Kyoto University / DFAM
845

Bilingual Lexicon Induction Framework for Closely Related Languages

Arbi, Haza Nasution 25 September 2018
Kyoto University / 0048 / New degree system, doctorate by coursework / Doctor of Informatics / Ko No. 21395 / Johaku No. 681 / New system||Info||117 (University Library) / Department of Social Informatics, Graduate School of Informatics, Kyoto University / (Chief examiner) Prof. Toru Ishida, Prof. Masatoshi Yoshikawa, Prof. Tatsuya Kawahara / Eligible under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
846

Dynamic Information Density for Image Classification in an Active Learning Framework

Morgan, Joshua Edward 01 May 2020
No description available.
847

Automated Extraction of Insurance Policy Information : Natural Language Processing techniques to automate the process of extracting information about the insurance coverage from unstructured insurance policy documents.

Hedberg, Jacob, Furberg, Erik January 2023
This thesis investigates Natural Language Processing (NLP) techniques to extract relevant information from long and unstructured insurance policy documents. The goal is to reduce the amount of time required by readers to understand the coverage within the documents. The study uses predefined insurance policy coverage parameters, created by industry experts, to represent what is covered in the policy documents. Three NLP approaches are used to classify the text sequences as insurance parameter classes. The thesis shows that using SBERT to create vector representations of text, so that cosine similarities can be calculated between them, is an effective approach: the top-scoring sequences for each parameter are assigned that parameter class. This approach significantly reduces the number of sequences a user needs to read, but misclassifies some positive examples. To improve the model, the parameter definitions and training data were combined into a support set. Similarity scores were calculated between all sequences and the support sets for each parameter using different pooling strategies. This few-shot classification approach performed well for the use case, improving the model's performance significantly. In conclusion, this thesis demonstrates that NLP techniques can be applied to help understand unstructured insurance policy documents. The model developed in this study can be used to extract important information and reduce the time needed to understand the contents of an insurance policy document. A human expert would, however, still be required to interpret the extracted text. The balance between the amount of relevant information and the amount of text shown depends on how many of the top-scoring sequences are classified for each parameter. This study also identifies some limitations of the approach depending on the available data. Overall, this research provides insight into the potential implications of NLP techniques for information extraction and the insurance industry.
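A minimal sketch of the SBERT ranking step is given below, using the sentence-transformers library. The checkpoint name, the coverage-parameter definitions, and the policy sentences are placeholders rather than the thesis' data, and the few-shot support sets and pooling strategies described above are not reproduced here.

```python
from sentence_transformers import SentenceTransformer, util

# Any pretrained SBERT checkpoint works for the sketch; the thesis does not
# prescribe this particular one.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical coverage-parameter definitions and policy text sequences.
parameters = {
    "fire_damage": "The policy covers damage to the insured property caused by fire.",
    "water_damage": "The policy covers damage caused by leaking or burst water pipes.",
}
sequences = [
    "Loss or damage resulting from fire or lightning is indemnified up to the sum insured.",
    "The insurer is not liable for losses caused by war or nuclear incidents.",
    "Damage due to the escape of water from fixed pipes is covered subject to the excess.",
]

param_emb = model.encode(list(parameters.values()), convert_to_tensor=True)
seq_emb = model.encode(sequences, convert_to_tensor=True)

# Cosine similarity between every parameter definition and every policy sequence.
scores = util.cos_sim(param_emb, seq_emb)

for i, name in enumerate(parameters):
    best = int(scores[i].argmax())
    print(f"{name}: best match (score {float(scores[i][best]):.2f}) -> {sequences[best]}")
```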
848

Contributions for Handling Big Data Heterogeneity. Using Intuitionistic Fuzzy Set Theory and Similarity Measures for Classifying Heterogeneous Data

Ali, Najat January 2019
A huge amount of data is generated daily by digital technologies such as social media, web logs, traffic sensors, online transactions, tracking data, videos, and so on. This has led to the archiving and storage of larger and larger datasets, many of which are multi-modal or contain different types of data, which contributes to the problem now known as "Big Data". In the area of Big Data, the volume, variety and velocity problems remain difficult to solve. The work presented in this thesis focuses on the variety aspect of Big Data. For example, data can come in various and mixed formats for the same feature (attribute) or for different features, and can be identified mainly by one of the following data types: real-valued, crisp and linguistic values. The increasing variety and ambiguity of such data are particularly challenging to process and make it difficult to build accurate machine learning models. Therefore, data heterogeneity requires new methods of analysis and modelling techniques to enable useful information extraction and the modelling of achievable tasks. In this thesis, new approaches are proposed for handling heterogeneous Big Data. These include two techniques for filtering heterogeneous data objects: Two-Dimensional Similarity Space (2DSS) for data described by numeric and categorical features, and Three-Dimensional Similarity Space (3DSS) for real-valued, crisp and linguistic data. Both filtering techniques are used in this research to reduce the noise in the initial dataset and make the dataset more homogeneous. Furthermore, a new similarity measure based on intuitionistic fuzzy set theory is proposed. The proposed measure is used to handle the heterogeneity and ambiguity within crisp and linguistic data. In addition, new combined similarity models are proposed which allow for a comparison between heterogeneous data objects represented by a combination of crisp and linguistic values. Diverse examples are used to illustrate and discuss the efficiency of the proposed similarity models. The thesis also presents a modification of the k-Nearest Neighbour classifier, called k-Nearest Neighbour Weighted Average (k-NNWA), to classify heterogeneous datasets described by real-valued, crisp and linguistic data. Finally, the thesis introduces a novel classification model, called FCCM (Filter Combined Classification Model), for heterogeneous data classification. The proposed model combines the advantages of the 3DSS and the k-NNWA classifier and outperforms the latter algorithm. All the proposed models and techniques have been applied to weather datasets and evaluated using accuracy, F-score and ROC area measures. The experiments revealed that the proposed filtering techniques are an efficient approach for removing noise from heterogeneous data and improving the performance of classification models. Moreover, the experiments showed that the proposed similarity measure for intuitionistic fuzzy data is capable of handling the fuzziness of heterogeneous data, and that intuitionistic fuzzy set theory offers some promise in solving some Big Data problems by handling the uncertainty and heterogeneity of the data.
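As background for the kind of measure the abstract describes, the sketch below computes a standard normalized-distance similarity between two objects described by intuitionistic fuzzy values (membership, non-membership, hesitation). It is a textbook formulation, not the specific measure or the 2DSS/3DSS filters proposed in the thesis, and the example values are invented.

```python
from dataclasses import dataclass

@dataclass
class IFV:
    """Intuitionistic fuzzy value: membership mu and non-membership nu."""
    mu: float
    nu: float

    @property
    def pi(self) -> float:
        # Hesitation degree completes the triple: mu + nu + pi = 1.
        return 1.0 - self.mu - self.nu

def ifs_similarity(a: list[IFV], b: list[IFV]) -> float:
    """Similarity = 1 - normalized Hamming distance over (mu, nu, pi) triples."""
    assert len(a) == len(b)
    distance = sum(
        abs(x.mu - y.mu) + abs(x.nu - y.nu) + abs(x.pi - y.pi)
        for x, y in zip(a, b)
    ) / (2 * len(a))
    return 1.0 - distance

# Two objects described over three features as intuitionistic fuzzy values.
obj1 = [IFV(0.7, 0.2), IFV(0.5, 0.4), IFV(0.9, 0.0)]
obj2 = [IFV(0.6, 0.3), IFV(0.4, 0.4), IFV(0.8, 0.1)]
print(f"IFS similarity: {ifs_similarity(obj1, obj2):.3f}")
```

A weighted k-NN classifier could then use such similarities in place of a crisp distance when features are linguistic or uncertain.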
849

AUTOMATED BRIDGE INSPECTION IMAGE LOCALIZATION AND RETRIEVAL BASED ON GPS-REFINED SIMILARITY LEARNING

Benjamin Eric Wogen (15315859) 24 April 2023
The inspection of highway bridge structures in the United States is a task critical to the national transportation system. Inspection images contain abundant visual information that can be exploited to streamline bridge assessment and management tasks. However, historical inspection images often go unused in subsequent assessments because they are disorganized and unlabeled. Further, due to the lack of GPS metadata and to visual ambiguity, it is often difficult for other inspectors to identify the location on the bridge where past images were taken. While many approaches are being considered toward fully or semi-automated methods for bridge inspection, there are research opportunities to develop practical tools for inspectors to make use of the images already in a database. In this study, a deep learning-based image similarity technique is combined with image geolocation data to localize and retrieve historical inspection images based on a current query image. A Siamese convolutional neural network (SCNN) is trained and validated on a gathered dataset of over 1,000 real-world bridge deck images collected by the Indiana Department of Transportation. A composite similarity (CS) metric is created for effective image ranking, and the overall method is validated on a subset of images from eight bridges. The results show promise for implementation into existing databases and for other similar structural inspections, showing up to an 11-fold improvement in successful image retrieval when compared to random image selection.
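A sketch of the composite ranking idea follows, combining a visual-embedding similarity with a GPS proximity term. The embeddings are random stand-ins for the SCNN's output, and the weighting, decay constant, and coordinates are illustrative assumptions rather than the study's calibrated composite similarity metric.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def haversine_m(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance in meters between two GPS fixes."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = np.sin((lat2 - lat1) / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    return float(2 * 6_371_000 * np.arcsin(np.sqrt(a)))

def composite_similarity(query_emb, query_gps, cand_emb, cand_gps,
                         alpha: float = 0.7, decay_m: float = 30.0) -> float:
    """alpha weights visual similarity; GPS proximity decays exponentially with distance."""
    visual = cosine(query_emb, cand_emb)                  # stand-in for SCNN similarity
    proximity = np.exp(-haversine_m(*query_gps, *cand_gps) / decay_m)
    return alpha * visual + (1 - alpha) * proximity

rng = np.random.default_rng(1)
query = (rng.normal(size=128), (40.4259, -86.9081))       # hypothetical deck photo + GPS
candidates = [(rng.normal(size=128), (40.4259, -86.9080)),
              (rng.normal(size=128), (40.4300, -86.9200))]
ranked = sorted(range(len(candidates)),
                key=lambda i: composite_similarity(query[0], query[1], *candidates[i]),
                reverse=True)
print("retrieval order of historical images:", ranked)
```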
850

Feature-based Comparison and Generation of Time Series

Kegel, Lars, Hahmann, Martin, Lehner, Wolfgang 17 August 2022
For more than three decades, researchers have been developing generation methods for the weather, energy, and economic domains. These methods provide generated datasets for purposes such as system evaluation and data availability. However, despite the variety of approaches, there is no comparative, cross-domain assessment of generation methods and their expressiveness. We present a similarity measure that analyzes generation methods with respect to general time series features. By this means, users can compare generation methods and validate whether a generated dataset is considered similar to a given dataset. Moreover, we propose a feature-based generation method that evolves cross-domain time series datasets. This method outperforms other generation methods regarding feature-based similarity.
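A sketch of the feature-based comparison idea is given below: each series is summarized by a small feature vector, and two datasets are compared by the distance between their averaged feature vectors. The chosen features and the distance are simple stand-ins for the richer feature set and measure used in the paper.

```python
import numpy as np

def features(series: np.ndarray) -> np.ndarray:
    """Small descriptive feature vector: mean, std, lag-1 autocorrelation, linear trend slope."""
    x = np.asarray(series, dtype=float)
    centered = x - x.mean()
    acf1 = (centered[:-1] @ centered[1:]) / (centered @ centered)
    slope = np.polyfit(np.arange(len(x)), x, 1)[0]
    return np.array([x.mean(), x.std(), acf1, slope])

def dataset_similarity(ds_a: list[np.ndarray], ds_b: list[np.ndarray]) -> float:
    """Similarity in (0, 1]: inverse distance between mean feature vectors."""
    fa = np.mean([features(s) for s in ds_a], axis=0)
    fb = np.mean([features(s) for s in ds_b], axis=0)
    return 1.0 / (1.0 + np.linalg.norm(fa - fb))

rng = np.random.default_rng(2)
t = np.arange(200)
original  = [10 + 0.05 * t + np.sin(t / 7) + rng.normal(0, 0.5, 200) for _ in range(20)]
generated = [10 + 0.05 * t + np.sin(t / 7) + rng.normal(0, 0.6, 200) for _ in range(20)]
unrelated = [rng.normal(0, 5, 200) for _ in range(20)]

print(f"original vs generated: {dataset_similarity(original, generated):.3f}")
print(f"original vs unrelated: {dataset_similarity(original, unrelated):.3f}")
```

A generated dataset that scores close to 1 against the original would be judged a faithful reproduction of the original's feature profile.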
