1 |
SVM-based algorithms for aligning ontologies using literature
Xu, Wei January 2008
<p>Ontologies are one of the key technologies used in establishing the Semantic Web. Many ontologies have now been developed, and it is critical to understand the relationships between their terms; that is, the ontologies need to be aligned.</p><p>This thesis deals with an approach for finding relationships between ontologies using literature, by classifying documents related to terms in the ontologies.</p><p>In this project the general method from [1] is used, but in the classifier generation part a new SVM-based classifier is implemented using LPU and SVM<em><sup>light</sup></em>. We evaluate our approach and compare it to previous approaches.</p>
|
2 |
Feature Ranking for Text Classifiers
Makrehchi, Masoud January 2007
Feature selection based on feature ranking has received much
attention from researchers in the field of text classification. The
major reasons are the scalability, ease of use, and fast computation
of ranking methods.
However, compared to the search-based feature selection methods such
as wrappers and filters, they suffer from poor performance. This is
linked to their major deficiencies, including: (i) feature ranking
is problem-dependent; (ii) they ignore term dependencies, including
redundancies and correlation; and (iii) they usually fail in
unbalanced data.
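
As a rough illustration of what filter-style feature ranking looks like in practice, the sketch below scores each term against a class label and keeps the top-k terms. The abstract does not name a specific ranking measure, so the chi-square statistic used here is an assumption, and the function names (`chi_square_scores`, `top_k_features`) and the toy corpus are hypothetical.

```python
from collections import Counter


def chi_square_scores(docs, labels, positive):
    """Score each term by the chi-square statistic of its 2x2 term/class table.

    Assumption: chi-square is used only as a familiar ranking measure;
    it is not claimed to be the measure studied in the thesis.
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos
    df_pos, df_neg = Counter(), Counter()
    for text, y in zip(docs, labels):
        terms = set(text.lower().split())  # document frequency, not term frequency
        (df_pos if y == positive else df_neg).update(terms)

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # positive documents containing the term
        b = df_neg[term]   # negative documents containing the term
        c = n_pos - a      # positive documents without the term
        d = n_neg - b      # negative documents without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def top_k_features(scores, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus, for illustration only.
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "project meeting notes",
]
labels = ["spam", "spam", "ham", "ham"]

scores = chi_square_scores(docs, labels, positive="spam")
selected = top_k_features(scores, k=3)
print(selected)
```

Each term is scored independently of every other term, which is what makes ranking fast and scalable, and also why it cannot see redundancy between terms: the drawback listed as (ii) above.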
While using feature ranking methods for dimensionality reduction, we
should be aware of these drawbacks, which arise from the function of
feature ranking methods. In this thesis, a set of solutions is
proposed to handle the drawbacks of feature ranking and boost their
performance. First, an evaluation framework called feature
meta-ranking is proposed to evaluate ranking measures. The framework
is based on a newly proposed Differential Filter Level Performance
(DFLP) measure. It is proved that, in ideal cases, the performance
of a text classifier is a monotonic, non-decreasing function of the
number of features. Then we theoretically and empirically validate
the effectiveness of DFLP as a meta-ranking measure to evaluate and
compare feature ranking methods. The meta-ranking framework is also
examined on a stopword extraction problem. We use the framework to
select an appropriate feature ranking measure for building
domain-specific stoplists. The proposed framework is evaluated with
SVM and Rocchio text classifiers on six benchmark data sets. The
meta-ranking method suggests that in searching for a proper feature
ranking measure, the backward feature ranking is as important as the
forward one.
Second, we show that the destructive effect of term redundancy gets
worse as we decrease the feature ranking threshold. It implies that
for aggressive feature selection, an effective redundancy reduction
should be performed as well as feature ranking. An algorithm based
on extracting term dependency links using an information theoretic
inclusion index is proposed to detect and handle term dependencies.
The dependency links are visualized by a tree structure called a
term dependency tree. By grouping the nodes of the tree into two
categories, including hub and link nodes, a heuristic algorithm is
proposed to handle the term dependencies by merging or removing the
link nodes. The proposed method of redundancy reduction is evaluated
by SVM and Rocchio classifiers for four benchmark data sets.
According to the results, redundancy reduction is more effective on
weak classifiers since they are more sensitive to term redundancies.
The results also suggest that for feature ranking methods that
compact the information into a small number of features, aggressive
feature selection is not recommended.
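
The idea of removing redundant terms after ranking can be sketched as follows. This is a simplified stand-in, not the thesis's algorithm: the abstract does not define its information-theoretic inclusion index, so a plain document-frequency inclusion ratio is assumed here, and the term dependency tree with hub and link nodes is not modeled. All names (`doc_sets`, `inclusion`, `drop_redundant`) and the 0.9 threshold are hypothetical.

```python
def doc_sets(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for i, text in enumerate(docs):
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(i)
    return index


def inclusion(index, a, b):
    """Fraction of documents containing term a that also contain term b.

    Assumption: a simple co-occurrence ratio, used in place of the
    information-theoretic inclusion index mentioned in the abstract.
    """
    return len(index[a] & index[b]) / len(index[a])


def drop_redundant(ranked_terms, index, threshold=0.9):
    """Walk the ranking from best to worst and skip any term that is
    (almost) always accompanied by a term that was already kept."""
    kept = []
    for term in ranked_terms:
        if any(inclusion(index, term, k) >= threshold for k in kept):
            continue  # redundant with a higher-ranked term
        kept.append(term)
    return kept


# Hypothetical toy corpus and ranking, for illustration only.
docs = [
    "new york stock market",
    "new york city weather",
    "stock market crash",
    "weather forecast today",
]
index = doc_sets(docs)
ranked = ["york", "new", "stock", "market", "weather"]

kept = drop_redundant(ranked, index)
print(kept)
```

In this toy ranking, "new" always co-occurs with "york" and "market" always co-occurs with "stock", so both are skipped and the feature budget goes to terms that carry distinct information.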
Finally, to deal with class imbalance at the feature level using
ranking methods, a local feature ranking scheme called the reverse
discrimination approach is proposed. The proposed method is applied
to a highly unbalanced social network discovery problem. In this
case study, the problem of learning a social network is translated
into a text classification problem using newly proposed actor and
relationship modeling. Since social networks are usually sparse
structures, the corresponding text classifiers become highly
unbalanced. Experimental assessment of the reverse discrimination
approach validates the effectiveness of the local feature ranking
method to improve the classifier performance when dealing with
unbalanced data. The application itself suggests a new approach to
learn social structures from textual data.
|
5 |
Detektering av fusk vid användning av AI: En studie av detektionsmetoder / Detection of cheating when using AI: A study of detection methods
Ennajib, Karim; Liang, Tommy January 2023
This report analyzes and tests various methods aimed at distinguishing human-generated solutions to tasks and texts from those generated by artificial intelligence. Recently, the use of artificial intelligence has seen a significant increase, especially among students. The purpose of this study is to determine whether it is currently possible to detect if a college student in electrical engineering is using AI to cheat. In this report, solutions to tasks and texts generated by the program ChatGPT are tested using a general methodology and external AI-based tools. The research covers the areas of mathematics, programming, and written text. The results of the investigation suggest that it is not possible to detect AI-assisted cheating in the subjects of mathematics and programming. In the case of text, cheating by using an AI can be detected to some extent.
|