Global ETD Search

61	Bayesian optimization for selecting training and validation data for supervised machine learning : using Gaussian processes both to learn the relationship between sets of training data and model performance, and to estimate model performance over the entire problem domain / Bayesiansk optimering för val av träning- och valideringsdata för övervakad maskininlärning Bergström, David January 2019 Validation and verification in machine learning is an open problem that becomes increasingly important as its applications become more critical. Among these applications are autonomous vehicles and medical diagnostics. These systems all need to be validated before being put into use, or else the consequences might be fatal. This master’s thesis focuses on improving both the training and the validation of machine learning models in cases where data can be either generated or collected based on a chosen position. Examples include taking and labeling photos at the position, or running a simulation that generates data from the chosen positions. The approach is twofold. The first part concerns modeling the relationship between any fixed-size set of positions and some real-valued performance measure. The second part involves calculating such a performance measure by estimating the performance over a region of positions. The result is two different algorithms, both variations of Bayesian optimization. The first algorithm models the relationship between a set of points and some performance measure while also optimizing the function, thus finding the set of points that yields the highest performance. The second algorithm uses Bayesian optimization to approximate the integral of performance over the region of interest. The resulting algorithms are validated in two different simulated environments. 
The resulting algorithms are applicable not only to machine learning: they can be used to optimize any function that takes a set of positions and returns a value, though they are more suitable when the function is expensive to evaluate. Bayesian optimization AutoML supervised learning Computer Sciences Datavetenskap (datalogi)
62	Upplevelse av gruppträning, self-efficacy samt underlättande och hindrande faktorer för träning hos en grupp kvinnor med kvarstående besvär efter förlossning : - En kvalitativ intervjustudie / Experience of group training, exercise self-efficacy, facilitating factors and barriers for exercise amongst a group of women with persistent postpartum problems : - A qualitative interview study Hanser, Maria, Holm, Sara January 2019 Bakgrund: Den fysiska aktivitetsnivån sänks för många i samband med graviditet och kan vara kvarstående en längre tid efter förlossning. Det finns begränsad kunskap om vilka faktorer som stärker self-efficacy (S-E) att utföra träning, underlättande och hindrande faktorer för initiering eller återupptagande av fysisk aktivitet och träning efter förlossning. Syfte: Syftet var att undersöka upplevelser efter deltagande i gruppträning på en vårdcentral hos kvinnor med kvarstående besvär efter förlossning, S-E till fortsatt träning på egen hand samt hindrande och underlättande faktorer för träning. Design och metod: En kvalitativ deskriptiv design användes och fem semistrukturerade intervjuer genomfördes. Vid databearbetning användes en kvalitativ innehållsanalys. Resultat: Resultatet beskriver betydelsen av ledarledd gruppträning och dess innehåll, S-E till- och underlättande samt hindrande faktorer för träning på egen hand, individuella strategier för träning, betydelsen av att ha drivkraft och omgivningsfaktorer som påverkar träningen. Konklusion: Informanterna beskrev positiva aspekter gällande ledarledd gruppträning med andra mödrar. Fler träningstillfällen och ytterligare vägledning beskrevs kunna stärka S-E för träning på egen hand. Olika underlättande och hindrande faktorer påverkade om kvinnorna tränade på egen hand eller inte. Denna information kan vara till nytta för fysioterapeuter och barnmorskor för att främja fysisk aktivitet och träning efter förlossning. 
/ Background: The physical activity level decreases among many women during pregnancy, and this decline may persist for a long time postpartum. There is limited knowledge about the factors, such as self-efficacy (S-E) and barriers for exercise, influencing physical activity postpartum. Objectives: The aim of this study was to investigate how women with postpartum complications experienced group training in primary healthcare. The purpose was also to analyze their S-E for self-managed exercise and facilitating factors and barriers for exercise. Design and method: A qualitative descriptive study design was used. Five semi-structured interviews were conducted and analyzed with qualitative content analysis. Results: The informants described the impact of supervised group training and how self-efficacy and different factors influenced self-managed exercise. They described individual exercise strategies, the importance of motivation and environmental factors affecting the exercise. Conclusion: Different positive aspects emerged regarding supervised exercise and exercise in a group with other mothers. Additional supervised exercise and further guidance were highlighted as ways to enhance S-E in individual exercise. Different facilitating factors and barriers for exercise affected whether or not the informants exercised on their own. This information can be of use for physiotherapists and midwives to promote physical activity and exercise postpartum. Self-efficacy barriers postpartum supervised exercise exercise physiotherapy Physiotherapy Sjukgymnastik
63	Unsupervised categorization : perceptual shift, strategy development, and general principles Colreavy, Erin Patricia January 2008 Unsupervised categorization is the task of classifying novel stimuli without external feedback or guidance, and is important for everyday decisions such as deciding whether emails fall into 'interesting Categorization (Psychology) Cognition Perception Unsupervised categorization Perception Supervised categorization
64	Learning object boundary detection from motion data Ross, Michael G., Kaelbling, Leslie P. 01 1900 This paper describes the initial results of a project to create a self-supervised algorithm for learning object segmentation from video data. Developmental psychology and computational experience have demonstrated that the motion segmentation of objects is a simpler, more primitive process than the detection of object boundaries by static image cues. Therefore, motion information provides a plausible supervision signal for learning the static boundary detection task and for evaluating performance on a test set. A video camera and previously developed background subtraction algorithms can automatically produce a large database of motion-segmented images for minimal cost. The purpose of this work is to use the information in such a database to learn how to detect the object boundaries in novel images using static information, such as color, texture, and shape. / Singapore-MIT Alliance (SMA) machine learning self-supervised algorithm motion segmentation object boundary detection
65	Stable Mixing of Complete and Incomplete Information Corduneanu, Adrian, Jaakkola, Tommi 08 November 2001 An increasing number of parameter estimation tasks involve the use of at least two information sources, one complete but limited, the other abundant but incomplete. Standard algorithms such as EM (or em) used in this context are unfortunately not stable in the sense that they can lead to a dramatic loss of accuracy with the inclusion of incomplete observations. We provide a more controlled solution to this problem through differential equations that govern the evolution of locally optimal solutions (fixed points) as a function of the source weighting. This approach permits us to explicitly identify any critical (bifurcation) points leading to choices unsupported by the available complete data. The approach readily applies to any graphical model in O(n^3) time where n is the number of parameters. We use the naive Bayes model to illustrate these ideas and demonstrate the effectiveness of our approach in the context of text classification problems. AI semi-supervised learning incomplete data EM stable estimation
66	Validating Co-Training Models for Web Image Classification Zhang, Dell, Lee, Wee Sun 01 1900 Co-training is a semi-supervised learning method that is designed to take advantage of the redundancy that is present when the object to be identified has multiple descriptions. Co-training is known to work well when the multiple descriptions are conditionally independent given the class of the object. The presence of multiple descriptions of objects in the form of text, images, audio and video in multimedia applications appears to provide redundancy in a form that may be suitable for co-training. In this paper, we investigate the suitability of utilizing text and image data from the Web for co-training. We perform measurements to find indications of conditional independence in the texts and images obtained from the Web. Our measurements suggest that conditional independence is likely to be present in the data. Our experiments, conducted within a relevance feedback framework to test whether a method that exploits the conditional independence outperforms methods that do not, also indicate that better performance can indeed be obtained by designing algorithms that exploit this form of redundancy when it is present. / Singapore-MIT Alliance (SMA) Co-Training Machine Learning Multimedia Data Mining Semi-Supervised Learning
67	Novel Measures on Directed Graphs and Applications to Large-Scale Within-Network Classification Mantrach, Amin 25 October 2010 Ces dernières années, les réseaux sont devenus une source importante d’informations dans différents domaines aussi variés que les sciences sociales, la physique ou les mathématiques. De plus, la taille de ces réseaux n’a cessé de grandir de manière conséquente. Ce constat a vu émerger de nouveaux défis, comme le besoin de mesures précises et intuitives pour caractériser et analyser ces réseaux de grande taille en un temps raisonnable. La première partie de cette thèse introduit une nouvelle mesure de similarité entre deux noeuds d’un réseau dirigé et pondéré : la covariance “sum-over-paths”. Celle-ci a une interprétation claire et précise : en dénombrant tous les chemins possibles deux noeuds sont considérés comme fortement corrélés s’ils apparaissent souvent sur un même chemin – de préférence court. Cette mesure dépend d’une distribution de probabilités, définie sur l’ensemble infini dénombrable des chemins dans le graphe, obtenue en minimisant l'espérance du coût total entre toutes les paires de noeuds du graphe sachant que l'entropie relative totale injectée dans le réseau est fixée à priori. Le paramètre d’entropie permet de biaiser la distribution de probabilité sur un large spectre : allant de marches aléatoires naturelles où tous les chemins sont équiprobables à des marches biaisées en faveur des plus courts chemins. Cette mesure est alors appliquée à des problèmes de classification semi-supervisée sur des réseaux de taille moyenne et comparée à l’état de l’art. La seconde partie de la thèse introduit trois nouveaux algorithmes de classification de noeuds au sein d’un large réseau dont les noeuds sont partiellement étiquetés. Ces algorithmes ont un temps de calcul linéaire en le nombre de noeuds, de classes et d’itérations, et peuvent dès lors être appliqués sur de larges réseaux. 
Ceux-ci ont obtenu des résultats compétitifs en comparaison à l’état de l’art sur le large réseau de citations de brevets américains et sur huit autres jeux de données. De plus, durant la thèse, nous avons collecté un nouveau jeu de données, déjà mentionné : le réseau de citations de brevets américains. Ce jeu de données est maintenant disponible pour la communauté pour la réalisation de tests comparatifs. La partie finale de cette thèse concerne la combinaison d’un graphe de citations avec les informations présentes sur ses noeuds. De manière empirique, nous avons montré que des données basées sur des citations fournissent de meilleurs résultats de classification que des données basées sur des contenus textuels. Toujours de manière empirique, nous avons également montré que combiner les différentes sources d’informations (contenu et citations) doit être considéré lors d’une tâche de classification de textes. Par exemple, lorsqu’il s’agit de catégoriser des articles de revues, s’aider d’un graphe de citations extrait au préalable peut améliorer considérablement les performances. Par contre, dans un autre contexte, quand il s’agit de directement classer les noeuds du réseau de citations, s’aider des informations présentes sur les noeuds n’améliorera pas nécessairement les performances. La théorie, les algorithmes et les applications présentés dans cette thèse fournissent des perspectives intéressantes dans différents domaines. / In recent years, networks have become a major data source in various fields ranging from social sciences to mathematical and physical sciences. Moreover, the size of available networks has grown substantially as well. This has brought with it a number of new challenges, like the need for precise and intuitive measures to characterize and analyze large-scale networks in a reasonable time. The first part of this thesis introduces a novel measure between two nodes of a weighted directed graph: the sum-over-paths covariance. 
It has a clear and intuitive interpretation: two nodes are considered highly correlated if they often co-occur on the same, preferably short, paths. This measure depends on a probability distribution over the (usually infinite) countable set of paths through the graph, which is obtained by minimizing the total expected cost between all pairs of nodes while fixing the total relative entropy spread in the graph. The entropy parameter makes it possible to bias the probability distribution over a wide spectrum, going from natural random walks (where all paths are equiprobable) to walks biased towards shortest paths. This measure is then applied to semi-supervised classification problems on medium-size networks and compared to state-of-the-art techniques. The second part introduces three novel algorithms for within-network classification in large-scale networks, i.e., classification of nodes in partially labeled graphs. The algorithms have a computing time that is linear in the number of edges, classes and steps, and hence can be applied to large-scale networks. They obtained competitive results in comparison to state-of-the-art techniques on the large-scale U.S. patents citation network and on eight other data sets. Furthermore, during the thesis, we collected a novel benchmark data set: the U.S. patents citation network. This data set is now available to the community for benchmarking purposes. The final part of the thesis concerns the combination of a citation graph with information on its nodes. We show that citation-based data provide better results for classification than content-based data. We also show empirically that combining both sources of information (content-based and citation-based) should be considered when facing a text categorization problem. For instance, when classifying journal papers, extracting an external citation graph may considerably boost the performance. 
However, in another context, when we have to directly classify the nodes of the citation network, the help of features on nodes will not necessarily improve the results. The theory, algorithms and applications presented in this thesis provide interesting perspectives in various fields. semi-supervised classification large scale graphs betweenness centrality graph kernels
68	Comparison of Two Diet and Exercise Approaches on Weight Loss and Health Outcomes in Women Mardock, Michelle 1967- 14 March 2013 The purpose of this study was to determine the effects of following either the Curves® Fitness and Weight Management Plan or the Weight Watchers® Momentum™ Plan on body composition and markers of health and fitness in previously sedentary obese women. Fifty-one women (age 35±8 yrs; height 163±7 cm; weight 90±1 kg; BMI 34±5 kg/m²; 47±7% body fat) were randomized to participate in the Curves® (C) or Weight Watchers® (W) weight loss programs for 16 wks. Participants in the C group (n=24) followed a 1,200 kcal/d diet for 1 wk; 1,500 kcal/d diet for 3 wks (~30%:45% CHO:PRO); and 2,000 kcal/d for 2 wks (45:30) and repeated this diet while participating in a supervised Curves® with Zumba program 3 d/wk. Remaining subjects (n=27) followed the W point-based diet program, received weekly group counseling, and were encouraged to exercise. Body composition, anthropometrics, resting energy expenditure (REE), lipid biomarkers, and hormone concentrations were assessed at 0, 4, 10, and 16 weeks. Maximal cardiopulmonary exercise capacity and upper and lower body isotonic strength and endurance were assessed at 0 and 16 weeks. Data were analyzed using multivariate analysis of variance (MANOVA) for repeated measures. MANOVA of body composition data revealed overall time (Wilks’ Lambda p=0.001) and time by diet effects (Wilks’ Lambda p=0.003). Subjects in both groups lost a similar amount of total mass (C -2.4±2.0, -4.1±3.4, -5.1±3.9; W -2.3±2.3, -4.5±3.0, -5.5±4.6 kg, p=0.78). However, subjects in the C group tended to have a greater reduction in percent body fat (C -3.3±5.2, -3.2±4.6, -4.7±5.4; W 0.6±6.7, -0.6±8.3, -1.4±8.1%, p=0.10) and body fat mass (C -3.9±5.5, -4.6±5.3, -6.4±5.9; W -0.4±5.7, -2.1±6.7, -2.9±7.8 kg, p=0.09), while maintaining FFM (C 1.5±4.3, 0.52±3.7, 1.3±4.0; W -1.8±5.4, -2.4±5.8, -2.5±5.1, p=0.01). 
While both groups had increases in cardiovascular fitness, the C group experienced improvements in upper body muscular endurance (C 1.4±3.9; W -1.2±2.4 repetitions, p=0.006). Both groups experienced improvements in lipid biomarkers; however, only the C group experienced a moderate increase in HDL-c. Results indicate that participants following the C program experienced more favorable changes in body composition and markers of fitness and health than participants in the W program. Comparative Effectiveness Comparison Supervised Exercise Program Exercise Weight Loss Weight
69	Active Learning with Semi-Supervised Support Vector Machines Chinaei, Leila January 2007 A significant problem in many machine learning tasks is that it is time-consuming and costly to gather the labeled data needed to train the learning algorithm to a reasonable level of performance. In reality, it is often the case that a small amount of labeled data is available and that more unlabeled data could be labeled on demand at a cost. If the labeled data is obtained by a process outside of the control of the learner, then the learner is passive. If the learner picks the data to be labeled, then this becomes active learning. This has the advantage that the learner can pick data to gain specific information that will speed up the learning process. Support Vector Machines (SVMs) have many properties that make them attractive as a learning algorithm for many real-world applications, including classification tasks. Some researchers have proposed algorithms for active learning with SVMs, i.e., algorithms for choosing the next unlabeled instance to obtain a label for. Their approach is supervised in nature, since they do not consider all unlabeled instances while looking for the next instance. In this thesis, we propose three new algorithms for applying active learning to SVMs in a semi-supervised setting, which takes advantage of the presence of all unlabeled points. By reducing the number of experiments needed, the suggested approaches might yield considerable savings in costly classification problems where finding the training data for a classifier is expensive. Active Learning Semi-Supervised Support Vector Machines Computer Science
