Global ETD Search

1	Scheduling for Reliability : complexity and Algorithms Dufossé, Fanny 06 September 2011 (has links) (PDF) This thesis deals with the mapping and the scheduling of workflows. In this context, we consider unreliable platforms, with processors subject to failures. In a first part, we consider a particular model of streaming applications : the filtering services. In this context, we aim at minimizing period and latency. We first neglect communication costs. In this model, we study scheduling problems on homogeneous and heterogeneous platforms. Then, the impact of communication costs on scheduling problems of a filtering application is studied. Finally, we consider the scheduling problem of such an application on a chain of processors. The theoretical complexity of any variant of this problem is proved. This filtering property can model the reliability of processors. The results of some computations are successfully computed, and some other ones are lost. We consider the more frequent failure types : transient failures. We aim efficient and reliable schedules. The complexity of many variants of this problem is proved. Two heuristics are proposed and compared using using simulations. Even if transient failures are the most common failures in classical grids, some particular type of platform are more concerned by other type of problems. Desktop grids are especially unstable. In this context, we want to execute iterative applications. All tasks are executed, then a synchronization occurs, and so on. Two variants of this problem are considered : applicationsof independent tasks, and applications where all tasks need to be executed at same speed. In both cases, the problem is first theoretically studied, then heuristics are proposed and compared using simulations. [INFO:INFO_OH] Computer Science/Other Streaming applications Multicriteria optimization Heuristics Linear programs Heterogeneous platforms Complexity results Desktop grids Transient failures Reliability
2	Shop-Scheduling Problems with Transportation Knust, Sigrid 26 September 2000 (has links) In this thesis scheduling problems with transportation aspects are studied. Classical scheduling models for problems with multiple operations are the so-called shop-scheduling models. In these models jobs consisting of different operations have to be planned on certain machines in such a way that a given objective function is minimized. Each machine may process at most one operation at a time and operations belonging to the same job cannot be processed simultaneously. We generalize these classical shop-scheduling problems by assuming that the jobs additionally have to be transported between the machines. This transportation has to be done by robots which can handle at most one job at a time. Besides transportation times which occur for the jobs during their transport, also empty moving times are considered which arise when a robot moves empty from one machine to another. Two types of problems are distinguished: on the one hand, problems without transportation conflicts (i.e. each transportation can be performed without delay), and on the other hand, problems where transportation conflicts may arise due to a limited capacity of transport robots. In the first part of this thesis several new complexity results are derived for flow-shop problems with a single robot. Since very special cases of these problems are already NP-hard, in the second part of this thesis some techniques are developed for dealing with these hard problems in practice. We concentrate on the job-shop problem with a single robot and the makespan objective. At first we study the subproblem which arises for the robot when some scheduling decisions for the machines have already been made. The resulting single-machine problem can be regarded as a generalization of the traveling salesman problem with time windows where additionally minimal time-lags between certain jobs have to be respected and the makespan has to be minimized. For this single-machine problem we adapt immediate selection techniques used for other scheduling problems and calculate lower bounds based on linear programming and the technique of column generation. On the other hand, to determine upper bounds for the single-machine problem we develop an efficient local search algorithm which finds good solutions in reasonable time. This algorithm is integrated into a local search algorithm for the job-shop problem with a single robot. Finally, the proposed algorithms are tested on different test data and computational results are presented. scheduling transportation job-shop problem traveling salesman problem complexity results lower bounds immediate selection local search 90B35 27 - Mathematik ddc:510
3	Scheduling for Reliability : complexity and Algorithms / Ordonnancement pour la Fiabilité : complexité et algorithmes Dufossé, Fanny 06 September 2011 (has links) Les travaux présentés dans cette thèse portent sur le placement et l’ordonnancement d’applications de flots de données. On se place dans le contexte de plates-formes composées de processeurs sujets à des pannes. Dans une première partie, on considère un type particulier d’applications de flots de données: les services filtrants. On étudie l'ordonnancement de telles applications sur des plates-formes homogènes et hétérogènes, d'abord sans tenir compte des coûts de communication, puis en les incluant dans le modèle. On considère enfin l’ordonnancement d’un tel calcul sur une chaîne de processeurs. Le comportement d’un service filtrant est comparable à celui d’un calcul effectué sur un processeur non fiable: certains résultats vont être calculés, et d’autres perdus. On étudie le modèle des pannes transitoires. On veut effectuer un calcul à la fois fiable et efficace. La complexité de différentes variantes de ce problème est démontrée. Deux heuristiques sont décrites, puis comparées expérimentalement. Si les pannes transitoires sont les pannes les plus fréquemment rencontrées sur des grilles de calculs classiques, certains types de plates-formes rencontrent d’autres types de défaillances. Les grilles de volontaires sont particulièrement instables. Sur ce type de plate-forme, on veut exécuter des calculs itératifs. Cette application est constituée soit de tâches indépendantes, soit de tâches couplées, qui doivent être calculées ensemble et au même rythme. Dans chaque cas, le problème est d’abord étudié théoriquement, puis des heuristiques sontproposées, et leur performances sont comparées. / This thesis deals with the mapping and the scheduling of workflows. In this context, we consider unreliable platforms, with processors subject to failures. In a first part, we consider a particular model of streaming applications : the filtering services. In this context, we aim at minimizing period and latency. We first neglect communication costs. In this model, we study scheduling problems on homogeneous and heterogeneous platforms. Then, the impact of communication costs on scheduling problems of a filtering application is studied. Finally, we consider the scheduling problem of such an application on a chain of processors. The theoretical complexity of any variant of this problem is proved. This filtering property can model the reliability of processors. The results of some computations are successfully computed, and some other ones are lost. We consider the more frequent failure types : transient failures. We aim efficient and reliable schedules. The complexity of many variants of this problem is proved. Two heuristics are proposed and compared using using simulations. Even if transient failures are the most common failures in classical grids, some particular type of platform are more concerned by other type of problems. Desktop grids are especially unstable. In this context, we want to execute iterative applications. All tasks are executed, then a synchronization occurs, and so on. Two variants of this problem are considered : applicationsof independent tasks, and applications where all tasks need to be executed at same speed. In both cases, the problem is first theoretically studied, then heuristics are proposed and compared using simulations. Applications de flots de données Optimisation multi-critères Heuristiques Programmes linéaires Plate-formes hétérogènes Complexité Grilles de volontaires Pannes transitoires Streaming applications Multicriteria optimization Heuristics Linear programs Heterogeneous platforms Complexity results Desktop grids Transient failures Reliability
4	Agrégation de classements avec égalités : algorithmes, guides à l'utilisateur et applications aux données biologiques / Rank aggregation with ties : algorithms, user guidance et applications to biologicals data Brancotte, Bryan 25 September 2015 (has links) L'agrégation de classements consiste à établir un consensus entre un ensemble de classements (éléments ordonnés). Bien que ce problème ait de très nombreuses applications (consensus entre les votes d'utilisateurs, consensus entre des résultats ordonnés différemment par divers moteurs de recherche...), calculer un consensus exact est rarement faisable dans les cas d'applications réels (problème NP-difficile). De nombreux algorithmes d'approximation et heuristiques ont donc été conçus. Néanmoins, leurs performances (en temps et en qualité de résultat produit) sont très différentes et dépendent des jeux de données à agréger. Plusieurs études ont cherché à comparer ces algorithmes mais celles-ci n’ont généralement pas considéré le cas (pourtant courant dans les jeux de données réels) des égalités entre éléments dans les classements (éléments classés au même rang). Choisir un algorithme de consensus adéquat vis-à-vis d'un jeu de données est donc un problème particulièrement important à étudier (grand nombre d’applications) et c’est un problème ouvert au sens où aucune des études existantes ne permet d’y répondre. Plus formellement, un consensus de classements est un classement qui minimise le somme des distances entre ce consensus et chacun des classements en entrés. Nous avons considérés (comme une grande partie de l’état-de-art) la distance de Kendall-Tau généralisée, ainsi que des variantes, dans nos études. Plus précisément, cette thèse comporte trois contributions. Premièrement, nous proposons de nouveaux résultats de complexité associés aux cas que l'on rencontre dans les données réelles où les classements peuvent être incomplets et où plusieurs éléments peuvent être classés à égalité. Nous isolons les différents « paramètres » qui peuvent expliquer les variations au niveau des résultats produits par les algorithmes d’agrégation (par exemple, utilisation de la distance de Kendall-Tau généralisée ou de variantes, d’un pré-traitement des jeux de données par unification ou projection). Nous proposons un guide pour caractériser le contexte et le besoin d’un utilisateur afin de le guider dans le choix à la fois d’un pré-traitement de ses données mais aussi de la distance à choisir pour calculer le consensus. Nous proposons finalement une adaptation des algorithmes existants à ce nouveau contexte. Deuxièmement, nous évaluons ces algorithmes sur un ensemble important et varié de jeux de données à la fois réels et synthétiques reproduisant des caractéristiques réelles telles que similarité entre classements, la présence d'égalités, et différents pré-traitements. Cette large évaluation passe par la proposition d’une nouvelle méthode pour générer des données synthétiques avec similarités basée sur une modélisation en chaîne Markovienne. Cette évaluation a permis d'isoler les caractéristiques des jeux de données ayant un impact sur les performances des algorithmes d'agrégation et de concevoir un guide pour caractériser le besoin d'un utilisateur et le conseiller dans le choix de l'algorithme à privilégier. Une plateforme web permettant de reproduire et étendre ces analyses effectuée est disponible (rank-aggregation-with-ties.lri.fr). Enfin, nous démontrons l'intérêt d'utiliser l'approche d'agrégation de classements dans deux cas d'utilisation. Nous proposons un outil reformulant à-la-volé des requêtes textuelles d'utilisateur grâce à des terminologies biomédicales, pour ensuite interroger de bases de données biologiques, et finalement produire un consensus des résultats obtenus pour chaque reformulation (conqur-bio.lri.fr). Nous comparons l'outil à la plateforme de références et montrons une amélioration nette des résultats en qualité. Nous calculons aussi des consensus entre liste de workflows établie par des experts dans le contexte de la similarité entre workflows scientifiques. Nous observons que les consensus calculés sont très en accord avec les utilisateurs dans une large proportion de cas. / The rank aggregation problem is to build consensus among a set of rankings (ordered elements). Although this problem has numerous applications (consensus among user votes, consensus between results ordered differently by different search engines ...), computing an optimal consensus is rarely feasible in cases of real applications (problem NP-Hard). Many approximation algorithms and heuristics were therefore designed. However, their performance (time and quality of product loss) are quite different and depend on the datasets to be aggregated. Several studies have compared these algorithms but they have generally not considered the case (yet common in real datasets) that elements can be tied in rankings (elements at the same rank). Choosing a consensus algorithm for a given dataset is therefore a particularly important issue to be studied (many applications) and it is an open problem in the sense that none of the existing studies address it. More formally, a consensus ranking is a ranking that minimizes the sum of the distances between this consensus and the input rankings. Like much of the state-of-art, we have considered in our studies the generalized Kendall-Tau distance, and variants. Specifically, this thesis has three contributions. First, we propose new complexity results associated with cases encountered in the actual data that rankings may be incomplete and where multiple items can be classified equally (ties). We isolate the different "features" that can explain variations in the results produced by the aggregation algorithms (for example, using the generalized distance of Kendall-Tau or variants, pre-processing the datasets with unification or projection). We propose a guide to characterize the context and the need of a user to guide him into the choice of both a pre-treatment of its datasets but also the distance to choose to calculate the consensus. We finally adapt existing algorithms to this new context. Second, we evaluate these algorithms on a large and varied set of datasets both real and synthetic reproducing actual features such as similarity between rankings, the presence of ties and different pre-treatments. This large evaluation comes with the proposal of a new method to generate synthetic data with similarities based on a Markov chain modeling. This evaluation led to the isolation of datasets features that impact the performance of the aggregation algorithms, and to design a guide to characterize the needs of a user and advise him in the choice of the algorithm to be use. A web platform to replicate and extend these analyzes is available (rank-aggregation-with-ties.lri.fr). Finally, we demonstrate the value of using the rankings aggregation approach in two use cases. We provide a tool to reformulating the text user queries through biomedical terminologies, to then query biological databases, and ultimately produce a consensus of results obtained for each reformulation (conqur-bio.lri.fr). We compare the results to the references platform and show a clear improvement in quality results. We also calculate consensus between list of workflows established by experts in the context of similarity between scientific workflows. We note that the computed consensus agree with the expert in a very large majority of cases. Agrégation de classements Agrégation de préférences Top-k Topk Classement de Kemeny optimal Solution exact Guidance Benchmark Résultat de complexité Interrogation de sources biomédicale NP-difficile Rank aggregation Preference aggregation Top-k Topk Optimal Kemey ranking Exact solution Guidance Benchmark Complexity results Querying biomedical sources NP-Hard

1

Page generated in 0.0402 seconds