Global ETD Search

41	Social training : aprendizado semi supervisionado utilizando funções de escolha social / Social-Training: Semi-Supervised Learning Using Social Choice Functions Alves, Matheus January 2017 (has links) Given the huge quantity of data currently being generated, only a small portion of it can be manually labeled by human experts. This is a common challenge for machine learning applications. Semi-supervised learning addresses this problem by handling unlabeled data alongside labeled data. However, if only a limited quantity of labeled examples is available, the performance of the machine learning task (e.g., classification) can be unsatisfactory. Many solutions address this issue by using a classifier ensemble, because this increases the diversity of the classifiers. Algorithms such as co-training and tri-training use multiple views of the data or multiple learning algorithms to improve the classification of unlabeled instances through simple majority agreement. There are also approaches that extend this idea and adopt less trivial voting processes to define the labels, such as weighted majority voting. Nevertheless, these solutions require the labels to reach a certain confidence level before they are used for training. Hence, not all available information is used: information associated with low confidence levels is disregarded completely. This work proposes an approach called social-training, which uses all information available in the semi-supervised learning task. For this, multiple heterogeneous classifiers are trained with the labeled data and generate diverse classifications for the same unlabeled instances. Social-training then aggregates these results into a single label by means of social choice functions that work with rank aggregation over the instances. Specifically, the solution addresses binary classification cases. The results show that working with the full ranking, i.e., labeling all unlabeled instances, is able to reduce the classification error for some of the UCI data sets used. Machine learning Knowledge management Semi-supervised learning Social choice functions Classifier ensembles
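The aggregation step described in record 41 can be illustrated with a small sketch. It is not the thesis's implementation: the abstract does not say which social choice functions are used, so the Borda count here is an assumption, as are the function names `borda_aggregate` and `label_all` and the `positive_fraction` cut-off used to turn the aggregated ranking into binary labels.

```python
from typing import List


def borda_aggregate(score_lists: List[List[float]]) -> List[int]:
    """Borda points per instance.

    Each inner list holds one classifier's positive-class scores for the
    same unlabeled instances. Within one classifier, an instance earns one
    point for every instance ranked below it (ties broken by index).
    """
    n = len(score_lists[0])
    points = [0] * n
    for scores in score_lists:
        # Instance indices sorted from lowest to highest score.
        order = sorted(range(n), key=lambda i: (scores[i], i))
        for rank, idx in enumerate(order):
            points[idx] += rank
    return points


def label_all(score_lists: List[List[float]],
              positive_fraction: float = 0.5) -> List[int]:
    """Label every instance: the top share of the aggregated ranking gets 1.

    positive_fraction is a hypothetical cut-off chosen for illustration.
    """
    points = borda_aggregate(score_lists)
    n = len(points)
    k = round(n * positive_fraction)
    # Highest Borda points first; ties broken by index.
    ranked = sorted(range(n), key=lambda i: (-points[i], i))
    labels = [0] * n
    for idx in ranked[:k]:
        labels[idx] = 1
    return labels
```

Because every instance receives a position in the aggregated ranking, low-confidence predictions still contribute points instead of being discarded, which is the property the abstract emphasizes.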
42	Instrumental Justifications of Popular Rule Ingham, Sean January 2012 (has links) Ordinary citizens are rarely charged with making consequential decisions in representative democracies. Almost all consequential decisions are delegated to elected representatives or political appointees. On what basis should we judge whether decisions should be placed in the hands of ordinary citizens or delegated to political elites? I argue that decision-making authority should be allocated in whatever way an assembly of randomly selected citizens would choose, given reasonable beliefs about the consequences of their possible choices. The standard I defend is a variation of the principal-agent model of political representation, in which the people are viewed as a principal and officeholders as their agents. As it is usually formulated, the objectives of the people are defined by the preferences of the majority. I draw on this formulation in chapter 4 to explain why the majority might rationally prefer to delegate authority to a citizens’ assembly instead of an elected legislature and why they might rationally view citizens’ assemblies with distrust, when they are organized and administered by elites. But the standard formulation of the principal-agent model does not provide a coherent standard when the will of the majority is not well-defined. Several chapters on social choice theory explain this problem and why political theorists’ previous responses to it have been unconvincing. In light of this problem, I argue for a revisionary understanding of the principal-agent model, according to which the people and its will are identified not with the preferences of the majority but rather with the decisions of a citizens’ assembly. To motivate this approach I offer a critique of the recent literature on “epistemic democracy,” which describes an alternative form of justification for empowering ordinary citizens. 
Appeals to expertise and knowledge have historically figured prominently in justifications of political exclusion and hierarchy, but epistemic democrats put them to use in defending participatory forms of democratic politics. Epistemic democrats claim that decision processes in which inexpert, ordinary citizens participate can exhibit greater “collective wisdom” than elite- or expert-dominated decision-making. Chapters 2 and 3 explain why these arguments sit uncomfortably with the nature of disagreements in politics. / Government political science philosophy citizens' assemblies democratic theory epistemic democracy populism social choice theory
43	Υπολογιστικά ζητήματα στην κοινωνική επιλογή : μελέτη των ψηφοφοριών Dodgson / Computational Issues in Social Choice: A Study of Dodgson Elections Καρανικόλας, Νικόλαος 27 April 2009 (has links) Voting is a popular method for distributed decision making and has traditionally been the subject of social choice theory, whose central issue is how to reach consensus on a socially good decision given the preferences of voters over a set of alternatives (or candidates). Several voting systems have appeared in the related literature since the first ones were proposed by Borda and the Marquis de Condorcet at the end of the 18th century. While most related studies have focused on properties of voting systems for government elections or decision making in committees, the emergence of large-scale applications for data mining, classification, and retrieval has put voting on the research agenda of computer science. Indeed, problems like rank aggregation can be thought of as elections. In rank aggregation, we are given a set of different rankings (e.g., the results from different web search engines on a particular query) over the same set of data (e.g., web pages related to the query), and the objective is to select a single ranking that is close to all rankings according to a well-defined criterion. In this example, the different web search engines are the voters, each web page corresponds to a candidate, and the objective according to which the single ranking is computed is the voting rule. Clearly, in such applications, deciding the winner of the election is not the only issue; usually, a complete ranking of the candidates is required. In this thesis we first familiarize the reader with the main methods of social choice theory. We then focus on two methods, Dodgson's and Young's. These two voting rules are designed to find the candidate that is closest to being a Condorcet winner, under two different notions of closeness. The score of a given candidate is known to be NP-hard to compute under both rules. We therefore propose two approximation algorithms for Dodgson's method, which compute the Dodgson score of a given candidate in an election with N candidates. The first is a greedy deterministic algorithm, while the second is randomized. Both algorithms have an approximation ratio of O(log N). While the greedy algorithm is computationally superior, we show that the randomized one has the advantage of being monotonic, which is a desirable property from a social choice point of view. We further observe that it follows from the work of McCabe-Dansted that the Dodgson score cannot be approximated within sublogarithmic factors by polynomial-time deterministic algorithms unless P = NP, and by polynomial-time randomized algorithms unless RP = NP. We prove a more explicit inapproximability result of (1-ε) ln m, under the assumption that problems in NP do not have algorithms running in quasi-polynomial time; this implies that the approximation ratio achieved by our greedy algorithm is optimal up to a factor of 2. Some of the results mentioned above establish that there are sharp discrepancies between the Dodgson ranking and the rankings produced by other rank aggregation rules. Some of these rules (e.g., Borda and Copeland) are polynomial-time computable, so the corresponding results can be viewed as negative results regarding the approximability of the Dodgson ranking by polynomial-time algorithms. We show that it is NP-hard to distinguish between the case where a given alternative is the unique Dodgson winner and the case where it is in the last O(√m) positions in any Dodgson ranking. Finally, we show that it is NP-hard to approximate the Young score within any factor. Specifically, we show that it is NP-hard to distinguish between the case where the Young score of a given alternative is 0 and the case where the score is greater than 0. As a corollary we obtain an inapproximability result for the Young ranking. The main results of this thesis were presented at the ACM-SIAM Symposium on Discrete Algorithms (SODA09). 324.650 285 Elections Social choice Alternatives Voters
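To make the quantity studied in record 43 concrete, the following sketch computes a Dodgson score by exhaustive search. It is not one of the thesis's approximation algorithms: it takes time exponential in the number of voters and is only meant for tiny profiles. It relies on the standard definition of the Dodgson score as the minimum number of adjacent swaps in the ballots needed to make a candidate a Condorcet winner, and on the fact that only swaps moving that candidate upward can help. The function names are illustrative assumptions.

```python
from itertools import product
from typing import List, Optional


def is_condorcet_winner(profile: List[List[str]], c: str) -> bool:
    """True if c beats every other candidate in a strict majority of ballots."""
    n = len(profile)
    for d in set(profile[0]) - {c}:
        wins = sum(1 for ballot in profile if ballot.index(c) < ballot.index(d))
        if 2 * wins <= n:
            return False
    return True


def raise_candidate(ballot: List[str], c: str, k: int) -> List[str]:
    """Return a copy of ballot with c moved k positions toward the top."""
    pos = ballot.index(c)
    rest = [x for x in ballot if x != c]
    rest.insert(pos - k, c)
    return rest


def dodgson_score(profile: List[List[str]], c: str) -> int:
    """Minimum total number of upward adjacent swaps of c that make c a
    Condorcet winner, found by trying every combination of rises."""
    max_rises = [ballot.index(c) for ballot in profile]
    best: Optional[int] = None
    for rises in product(*(range(m + 1) for m in max_rises)):
        cost = sum(rises)
        if best is not None and cost >= best:
            continue  # cannot improve on the best cost found so far
        shifted = [raise_candidate(b, c, k) for b, k in zip(profile, rises)]
        if is_condorcet_winner(shifted, c):
            best = cost
    # Moving c to the top of every ballot always succeeds, so best is set.
    assert best is not None
    return best
```

On the three-voter cycle `[["a","b","c"], ["b","c","a"], ["c","a","b"]]` no candidate is a Condorcet winner, and one swap is enough to make `a` one, so its Dodgson score is 1.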
44	Stochastic Mechanisms for Truthfulness and Budget Balance in Computational Social Choice Dufton, Lachlan Thomas January 2013 (has links) In this thesis, we examine stochastic techniques for overcoming game theoretic and computational issues in the collective decision making process of self-interested individuals. In particular, we examine truthful, stochastic mechanisms, for settings with a strong budget balance constraint (i.e. there is no net flow of money into or away from the agents). Building on past results in AI and computational social choice, we characterise affine-maximising social choice functions that are implementable in truthful mechanisms for the setting of heterogeneous item allocation with unit demand agents. We further provide a characterisation of affine maximisers with the strong budget balance constraint. These mechanisms reveal impossibility results and poor worst-case performance that motivate us to examine stochastic solutions. To adequately compare stochastic mechanisms, we introduce and discuss measures that capture the behaviour of stochastic mechanisms, based on techniques used in stochastic algorithm design. When applied to deterministic mechanisms, these measures correspond directly to existing deterministic measures. While these approaches have more general applicability, in this work we assess mechanisms based on overall agent utility (efficiency and social surplus ratio) as well as fairness (envy and envy-freeness). We observe that mechanisms can (and typically must) achieve truthfulness and strong budget balance using one of two techniques: labelling a subset of agents as "auctioneers" who cannot affect the outcome, but collect any surplus; and partitioning agents into disjoint groups, such that each partition solves a subproblem of the overall decision making process. 
Worst-case analysis of random-auctioneer and random-partition stochastic mechanisms shows large improvements over deterministic mechanisms for heterogeneous item allocation. In addition to this allocation problem, we apply our techniques to envy-freeness in the room assignment-rent division problem, for which no truthful deterministic mechanism is possible. We show how stochastic mechanisms give an improved probability of envy-freeness and low expected level of envy for a truthful mechanism. The random-auctioneer technique also improves the worst-case performance of the public good (or public project) problem. Communication and computational complexity are two other important concerns of computational social choice. Both the random-auctioneer and random-partition approaches offer a flexible trade-off between low complexity of the mechanism, and high overall outcome quality measured, for example, by total agent utility. They enable truthful and feasible solutions to be incrementally improved on as the mechanism receives more information and is allowed more processing time. The majority of our results are based on optimising worst-case performance, since this provides guarantees on how a mechanism will perform, regardless of the agents that use it. To complement these results, we perform empirical, average-case analyses on our mechanisms. Finally, while strong budget balance is a fixed constraint in our particular social choice problems, we show empirically that this can improve the overall utility of agents compared to a utility-maximising assignment that requires a budget-imbalanced mechanism. Artificial Intelligence Game Theory Mechanism Design Computational Social Choice Auction Design Resource Allocation Computer Science
46	Equity and efficiency considerations of public higher education Barbaro, Salvatore. January 1900 (has links) Thesis (Ph. D.) - University of Göttingen, 2004.
47	Distributive preferences in social dilemmas / Kazemi, Ali, January 2006 (has links) Diss. Göteborg : Göteborgs universitet, 2007. / Includes 4 papers.
48	Equity and efficiency considerations of public higher education Barbaro, Salvatore. January 1900 (has links) Thesis (Ph. D.)--University of Göttingen, 2004. / Description based on print version record.
