21

On the use of transport and optimal control methods for Monte Carlo simulation

Heng, Jeremy January 2016 (has links)
This thesis explores ideas from transport theory and optimal control to develop novel Monte Carlo methods for efficient statistical computation. The first project considers the problem of constructing a transport map between two given probability measures. In the Bayesian formalism, this approach is natural when one introduces a curve of probability measures connecting the prior to the posterior by tempering the likelihood function. The main idea is to move samples from the prior using an ordinary differential equation (ODE), constructed by solving the Liouville partial differential equation (PDE) which governs the time evolution of measures along the curve. In this work, we first study the regularity that solutions of the Liouville equation should satisfy to guarantee the validity of this construction. We place an emphasis on understanding these issues, as this explains the difficulties associated with previously reported solutions. After ensuring that the flow transport problem is well defined, we give a constructive solution. However, this result is only formal, as the representation is given in terms of intractable integrals. For computational tractability, we propose a novel approximation of the PDE which yields an ODE whose drift depends on the full conditional distributions of the intermediate measures. Even when the ODE is time-discretized and the full conditional distributions are approximated numerically, the resulting distribution of mapped samples can be evaluated and used as a proposal within Markov chain Monte Carlo and sequential Monte Carlo (SMC) schemes. We then illustrate experimentally that the resulting algorithm can outperform state-of-the-art SMC methods at a fixed computational complexity. The second project aims to exploit ideas from optimal control to design more efficient SMC methods. The key idea is to control the proposal distribution induced by time-discretized Langevin dynamics so as to minimize the Kullback-Leibler divergence of the extended target distribution from the proposal. The optimal value functions of the resulting optimal control problem can then be approximated using algorithms developed in the approximate dynamic programming (ADP) literature. We introduce a novel iterative scheme to perform ADP, provide a theoretical analysis of the proposed algorithm, and demonstrate that it can provide significant gains over state-of-the-art methods at a fixed computational complexity.
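To make the tempering idea concrete, here is a minimal sketch (a toy one-dimensional example with an assumed N(0, 1) prior and a Gaussian likelihood centred at 2; the thesis's ODE-based transport map and MCMC moves are not shown) of moving prior samples toward the posterior along a curve of tempered distributions:

```python
import numpy as np

# Toy tempered-SMC sketch: pi_t(x) is proportional to prior(x) * likelihood(x)**t,
# interpolating from the prior (t=0) to the posterior (t=1).
rng = np.random.default_rng(0)
N, K = 1000, 20                               # particles, tempering steps

def log_likelihood(x):                        # assumed toy likelihood: N(2, 1)
    return -0.5 * (x - 2.0) ** 2

x = rng.normal(size=N)                        # exact samples from the prior N(0, 1)
logw = np.zeros(N)
for k in range(K):
    logw += (1.0 / K) * log_likelihood(x)     # reweight along the tempering curve
    w = np.exp(logw - logw.max()); w /= w.sum()
    if 1.0 / np.sum(w ** 2) < N / 2:          # resample when the ESS drops
        x = rng.choice(x, size=N, p=w)
        logw[:] = 0.0

w = np.exp(logw - logw.max()); w /= w.sum()
print("estimated posterior mean:", np.sum(w * x))  # true posterior mean here is 1.0
```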
22

Análise de benchmarking com foco na satisfação dos usuários de transporte coletivo : normalização, análise envoltória de dados e clusterização / Benchmarking analysis focused on public transport user satisfaction: normalization, data envelopment analysis, and clustering

Barcelos, Mariana Müller January 2016 (has links)
Attracting users to public transport and retaining those who already use it is essential to fostering more sustainable cities; improving the quality of urban bus transit and taking the user's perspective into account is therefore important. Benchmarking is a recognized quality-management tool that allows systems to be compared, good-practice references to be identified, and exchanges of experience to be promoted. In this context, combining benchmarking with user-satisfaction surveys of public transport has great potential to make management more effective and more focused on users' needs and desires. Comparing the perceptions of users of different systems, however, poses several challenges due to the lack of standardization in data collection and to the subjectivity and sociocultural biases of respondents. This work presents three methods that seek to overcome these challenges and enable benchmarking analyses with user-satisfaction data from different cities. The first analysis normalizes satisfaction scores to reduce social and cultural biases. The second applies Data Envelopment Analysis (DEA) to identify transport systems that are efficient from their users' point of view. The third applies cluster analysis to identify relations between user profiles and the corresponding satisfaction ratings in different cities. The methods prove adequate for comparing systems, allowing the identification of goals, priorities, and benchmarks, and an understanding of the particularities of different user groups. The analyses involve different degrees of complexity in application and in data collection. Each method provides a distinct view of the available data, allowing benchmarks to be defined and assisting in setting improvement guidelines.
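As an illustration of the first method, here is one plausible form of the score normalization (an illustrative assumption; the abstract does not specify the exact procedure): ratings are z-scored within each city, removing each city's own rating level and spread before cities are compared.

```python
import numpy as np

# Hypothetical 1-5 satisfaction ratings from two cities; per-city z-scoring
# is one plausible way to reduce sociocultural rating biases before comparison.
scores = {
    "city_a": np.array([4.1, 3.8, 4.5, 3.9, 4.2]),
    "city_b": np.array([2.9, 3.2, 2.7, 3.0, 3.1]),
}
normalized = {
    city: (s - s.mean()) / s.std(ddof=1)   # remove each city's level and spread
    for city, s in scores.items()
}
for city, z in normalized.items():
    print(city, np.round(z, 2))
```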
24

Caracterização geoquímica marinha e avaliação do impacto das atividades antrópicas e de exploração de petróleo sobre os sedimentos da plataforma continental do estado de Sergipe e sul do estado do Alagoas / Marine geochemical characterization and evaluation of the impact of anthropogenic activities and oil exploration on sediments of the continental shelf of Sergipe state and southern Alagoas state

Barbosa, Ariadna Cristina Gomes 15 September 2010 (has links)
The environmental impact of oil, sewage, industry, mining, and the leaching of soils and urban surfaces along the Brazilian coast has generated mixed effluents, with trace metals a particular problem because they are deposited in marine sediments and are simultaneously toxic, persistent, and bioaccumulative. Sediments act as an operational compartment that plays an essential role in redistribution to the marine biota, since they preserve the historical record of contamination of the water body. Oil exploration extends to the offshore environment, occupying a prominent position in the energy matrix and accounting for most of the national supply; oil and gas activities can constitute a significant source of pollution. The aim of this study was to examine sediment profiles of the continental shelf off Sergipe and southern Alagoas, checking whether trace-metal concentrations vary with depth. Three cores (PCM-9(A), PCM-9(B), Est.2(A)) were collected to obtain vertical sediment profiles, with samples taken at various depths. The metals Co, Cr, Fe, Li, Mn, Ni, Pb, Zn, Al, and Cu were extracted (total and partial extractions) and determined by atomic absorption spectrometry (AAS). In the partial extraction, recovery was above 90%, ranging from 92% to 111.1%; in the total extraction, recovery ranged from 77.6% to 99%. These values, obtained for the elements studied, indicate good accuracy of the methodology, and relative standard deviations below 10% express its precision. The distribution of trace-metal concentrations in the sediment profiles was shown to vary with depth: in cores PCM-9(A) and PCM-9(B) metal concentrations increased from base to surface, while in core Est.2(A) they decreased gradually. Iron proved to be the best geochemical normalizer. The calculated enrichment factors were low, indicating little enrichment, except for Pb, Co, Mn, and Al, which showed moderate values. The geochemical background values estimated by the three procedures used differed, which can generate unrealistic characterizations of the state of pollution. Trace-metal values are below the sediment quality guidelines, so adverse effects on the aquatic biota are unlikely.
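For reference, the enrichment factor mentioned above is conventionally computed by normalizing each metal against a reference element; a short sketch with iron as the normalizer, as the abstract reports (a textbook definition assumed to match the thesis's usage, with hypothetical numeric values):

```python
# EF = (Me / Fe)_sample / (Me / Fe)_background
def enrichment_factor(me_sample, fe_sample, me_background, fe_background):
    return (me_sample / fe_sample) / (me_background / fe_background)

# Hypothetical Pb and Fe concentrations (mg/kg) against an estimated background:
print(enrichment_factor(25.0, 30000.0, 12.0, 28000.0))  # ~1.94, i.e. low enrichment
```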
25

Believable and Manipulable Facial Behaviour in a Robotic Platform using Normalizing Flows / Trovärda och Manipulerbara Ansiktsuttryck i en Robotplattform med Normaliserande Flöde

Alias, Kildo January 2021 (has links)
Implicit communication is important in interaction because it plays a role in conveying an individual's internal mental states. For example, emotional expressions shown through unintended facial gestures can communicate underlying affective states. People can infer mental states from implicit cues and have strong expectations of what those cues mean. This is true for human-human interactions as well as human-robot interactions. A normalizing-flow model is used as a generative model that can produce facial gestures and head movements. The invertible nature of the normalizing flow makes it possible to manipulate attributes of the generated gestures; the output is manipulated along the two dimensions commonly used to describe affective state, valence and arousal. The model in this work is capable of generating facial expressions that look real and human-like, and it can manipulate the generated output to change the perceived affective state of the facial expressions.
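A toy sketch of the invertibility property the abstract relies on (an illustrative affine map, not the thesis's model): encode a gesture to a latent code, shift one latent attribute, and decode back.

```python
import numpy as np

W = np.array([[1.0, 0.5],
              [0.0, 1.0]])                  # invertible linear map
b = np.array([0.1, -0.2])

def f(x):                                   # gesture features -> latent code
    return W @ x + b

def f_inv(z):                               # latent code -> gesture features
    return np.linalg.solve(W, z - b)

x = np.array([0.3, 0.7])                    # a "facial gesture" feature vector
z = f(x)
z[0] += 1.0                                 # shift a latent attribute (e.g. valence)
x_edited = f_inv(z)                         # decode the manipulated gesture
print(x_edited)
```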
26

Analyzing the Negative Log-Likelihood Loss in Generative Modeling / Analys av log-likelihood-optimering inom generativa modeller

Espuña I Fontcuberta, Aleix January 2022 (has links)
Maximum-Likelihood Estimation (MLE) is a classic model-fitting method from probability theory. However, it has been argued repeatedly that MLE is inappropriate for synthesis applications, since its priorities are at odds with important principles of human perception, and that, e.g., Generative Adversarial Networks (GANs) are a more appropriate choice. In this thesis, we put these ideas to the test and explore the effect of MLE in deep generative modelling, using image generation as our example application. Unlike previous studies, we apply a new methodology that allows us to isolate the effects of the training paradigm from several common confounding factors of variation, such as the model architecture and the properties of the true data distribution. The thesis addresses two main questions. First, we ask whether models trained via Non-Saturating Generative Adversarial Networks (NSGANs) are capable of producing more realistic images than the exact same architecture trained by directly minimizing the Negative Log-Likelihood (NLL) loss function instead (which is equivalent to MLE). We compare the two training paradigms using the MNIST dataset and a normalizing-flow architecture known as Real NVP, which can explicitly represent a very broad family of density functions. We use the Fréchet Inception Distance (FID) as an algorithmic estimate of subjective image quality. Second, we analyze how the NLL loss behaves in the presence of model misspecification, i.e., when the model architecture is not capable of representing the true data distribution, and compare the resulting training curves and performance to those produced by models without misspecification. In order to control for and study different degrees of model misspecification, we create a realistic-looking, but actually synthetic, toy version of the classic MNIST dataset: the examples in the dataset look like MNIST, but they have in fact been generated by a Real NVP architecture with known weights, so the true distribution that generated the image data is known. We are not aware of this type of large-scale, realistic-looking toy problem having been used in prior work. Our results show, first, that models trained via NLL perform unexpectedly well in terms of FID, and that a Real NVP trained via an NSGAN approach is unstable during training, even at the Nash equilibrium, which is the global optimum onto which the NSGAN training updates are supposed to converge. Second, the experiments on synthetic data show that models with different degrees of misspecification reach different NLL losses on the training set, but all of them exhibit qualitatively similar convergence behavior. However, looking at the validation NLL loss reveals an important overfitting effect due to the finite size of the synthetic dataset: the models that in theory are able to perfectly describe the true data distribution achieve worse validation NLL losses in practice than some misspecified models, whose reduced complexity acts as a regularizer that helps them generalize better. At the same time, we observe that overfitting has a much stronger negative effect on the validation NLL loss than on image quality as measured by the FID score. We also conclude that overparameterized models, with too many parameters and degrees of freedom, should be avoided, as they are not only slow and frequently unstable to train, even using the NLL loss, but they also overfit heavily and produce poorer images. Throughout the thesis, our results highlight the complex and non-intuitive relationship between the NLL loss and perceptual image quality as measured by the FID score.
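For context, the NLL loss minimized by a normalizing flow such as Real NVP follows from the change-of-variables formula; a one-dimensional sketch with a toy invertible map (not the thesis's architecture):

```python
import numpy as np

# log p(x) = log N(f(x); 0, 1) + log |df/dx|  for the toy map f(x) = a*x + b
a, b = 2.0, 0.5

def nll(x):
    z = a * x + b                                     # forward pass of the "flow"
    log_pz = -0.5 * (z ** 2 + np.log(2.0 * np.pi))    # standard-normal base density
    log_det = np.log(abs(a))                          # log-Jacobian of f
    return float(-(log_pz + log_det).mean())

x = np.random.default_rng(0).normal(size=1000)
print("NLL:", nll(x))
```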
27

Undervisning av elever i behov utav särskilt stöd : Fyra skolors arbetssätt / Teaching students in need of special support: four schools' ways of working

Eriksson, Malin January 2007 (has links)
Today's schools agree that there are students who need special support in school, but they do not agree on how this support is best provided. That is why, in this essay, I have chosen to look more closely at four different compulsory schools and their teaching of students requiring special support. I chose to look at two community schools and two open schools. The aim of this essay is to see whether the teaching of students in need of special support differs or is the same across the four schools. One of the theories I have used is Haug's theory of segregated and included integration. I have used qualitative research interviews, interviewing one person from each school's management. The results show that it is not the way the schools teach these students that matters most; instead, the schools regard the contact between families and the school, and the staff's attitude towards the students, as the most important factors in their work with these students.
29

Intersex - A Challenge for Human Rights and Citizenship Rights

Brömdal, Annette January 2006 (has links)
The purpose of this dissertation is to study the Intersex phenomenon in South Africa, meaning the interplay between the dual sex and gender norms in society. To this end, the treatment provided by some medical institutions, and the view some non-medical institutions take of this 'treatment', have been studied in relation to the Intersex infant's human rights and citizenship rights. The thesis has also investigated how young Intersex children are included or excluded, and mentioned or not mentioned, within South Africa's legal system and within the UN Convention on the Rights of the Child. Furthermore, because Intersex children are viewed as 'different' on two accounts, their status as infants and their being born with an atypical congenital physical sexual differentiation, the thesis's theoretical framework looks at the phenomenon from three perspectives: 'the politics of difference', human rights, and citizenship rights directed towards infants. These frameworks have been used to pose questions to the empirical data: how are Intersex infants 'treated' in relation to their status as 'different', and in relation to being recognized, respected, and allowed to take part in deciding whether to impose surgery or not? Moreover, what 'treatment' serves the best interest of the Intersex child? This has been investigated through semi-structured interviews. In conclusion, some of the dissertation's most important findings are that, since South African society, like many others, lives strongly by the belief that there are only two sexes and genders, Intersex infants do not fit in and become walking pathologies who must be 'fixed' to become 'normal'. Moreover, since most genital corrective surgeries are imposed without being medically or surgically necessary, and are generally imposed before the age of consent (18), the children concerned are generally not asked for their opinion regarding the surgery. Lastly, because early corrective surgery can have devastating lifelong consequences, the child's human rights and citizenship rights are a matter of concern. These conclusions do not, however, ignore the consequences one has to endure as the price of being 'different'.
30

Conditional generative modeling for images, 3D animations, and video

Voleti, Vikram 07 1900 (has links)
Generative modeling for computer vision has shown immense progress in the last few years, revolutionizing the way we perceive, understand, and manipulate visual data. This rapidly evolving field has witnessed advancements in image generation, 3D animation, and video prediction that unlock diverse applications across multiple fields including entertainment, design, healthcare, and education. As the demand for sophisticated computer vision systems continues to grow, this dissertation attempts to drive innovation in the field by exploring novel formulations of conditional generative models and innovative applications in images, 3D animations, and video. Our research focuses on architectures that offer reversible transformations between noise and visual data, and on the application of encoder-decoder architectures for generative tasks and 3D content manipulation. In all instances, we incorporate conditional information to enhance the synthesis of visual data, improving the efficiency of the generation process as well as the generated content. Previously successful generative techniques that are reversible between noise and data include normalizing flows and denoising diffusion models. The continuous variant of normalizing flows is powered by Neural Ordinary Differential Equations (Neural ODEs) and has shown some success in modeling real image distributions; however, such models often involve a huge number of parameters and long training times. Denoising diffusion models have recently gained huge popularity for their generalization capabilities, especially in text-to-image applications. In this dissertation, we introduce the use of Neural ODEs to model video dynamics using an encoder-decoder architecture, demonstrating their ability to predict future video frames despite being trained solely to reconstruct current frames. In our next contribution, we propose a conditional variant of continuous normalizing flows that enables higher-resolution image generation based on lower-resolution input. This allows us to achieve image quality comparable to regular normalizing flows while significantly reducing the number of parameters and the training time. Our next contribution focuses on a flexible encoder-decoder architecture for accurate estimation and editing of full 3D human pose. We present a comprehensive pipeline that takes human images as input, automatically aligns a user-specified 3D human or non-human character with the pose of the human, and facilitates pose editing based on partial input information. We then proceed to use denoising diffusion models for image and video generation. Regular diffusion models use a Gaussian process to add noise to clean images. In our next contribution, we derive the relevant mathematical details for denoising diffusion models that use non-isotropic Gaussian processes, present non-isotropic noise, and show that the quality of the generated images is comparable with the original formulation. In our final contribution, we devise a novel framework building on denoising diffusion models that is capable of solving all three video tasks of prediction, generation, and interpolation. We perform ablation studies using this framework and show state-of-the-art results on multiple datasets. Our contributions are published articles at peer-reviewed venues. Overall, our research aims to make a meaningful contribution to the pursuit of more efficient and flexible generative models, with the potential to shape the future of computer vision.
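For context, the isotropic Gaussian forward (noising) process that the non-isotropic contribution generalizes is standard in denoising diffusion models; a minimal sketch with an assumed linear beta schedule:

```python
import numpy as np

# Standard DDPM forward process: x_t = sqrt(alpha_bar_t)*x_0 + sqrt(1 - alpha_bar_t)*eps
rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)        # assumed linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)

def q_sample(x0, t):
    eps = rng.normal(size=x0.shape)       # isotropic Gaussian noise
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

x0 = rng.normal(size=(8, 8))              # stand-in for a clean image
x_mid = q_sample(x0, 500)                 # partially noised
x_end = q_sample(x0, T - 1)               # nearly pure Gaussian noise
```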
