Global ETD Search

181	Medical image captioning based on Deep Architectures / Medicinsk bild textning baserad på Djupa arkitekturer Moschovis, Georgios January 2022 Diagnostic Captioning is described as “the automatic generation of a diagnostic text from a set of medical images of a patient collected during an examination” [59], and it can help inexperienced doctors and radiologists reduce clinical errors or help experienced professionals increase their productivity. In this context, tools that would help medical doctors produce higher quality reports in less time could be of high interest for medical imaging departments, as well as significantly impact deep learning research within the biomedical domain, which makes it particularly interesting for people involved in industry and researchers alike. In this work, we attempted to develop Diagnostic Captioning systems, based on novel Deep Learning approaches, to investigate to what extent Neural Networks are capable of performing medical image tagging, as well as automatically generating a diagnostic text from a set of medical images. Towards this objective, the first step is concept detection, which boils down to predicting the relevant tags for X-ray images, whereas the ultimate goal is caption generation. To this end, we further participated in the ImageCLEFmedical 2022 evaluation campaign, addressing both the concept detection and the caption prediction tasks by developing baselines based on Deep Neural Networks (including image encoders, classifiers and text generators), in order to get a quantitative measure of the proposed architectures’ performance [28]. My contribution to the evaluation campaign, as part of this work and on behalf of the NeuralDynamicsLab group at KTH Royal Institute of Technology, within the School of Electrical Engineering and Computer Science, ranked 4th in the former and 5th in the latter task [55, 68] among 12 groups included within the top-10 best performing submissions in both tasks.
/ Diagnostisk textning avser automatisk generering från en diagnostisk text från en uppsättning medicinska bilder av en patient som samlats in under en undersökning och den kan hjälpa oerfarna läkare och radiologer, minska kliniska fel eller hjälpa erfarna yrkesmän att producera diagnostiska rapporter snabbare [59]. Därför kan verktyg som skulle hjälpa läkare och radiologer att producera rapporter av högre kvalitet på kortare tid vara av stort intresse för medicinska bildbehandlingsavdelningar, såväl som leda till inverkan på forskning om djupinlärning, vilket gör den domänen särskilt intressant för personer som är involverade i den biomedicinska industrin och djupinlärningsforskare. I detta arbete var mitt huvudmål att utveckla system för diagnostisk textning, med hjälp av nya tillvägagångssätt som används inom djupinlärning, för att undersöka i vilken utsträckning automatisk generering av en diagnostisk text från en uppsättning medicinska bilder är möjlig. Mot detta mål är det första steget konceptdetektering som går ut på att förutsäga relevanta taggar för röntgenbilder, medan slutmålet är bildtextgenerering. Jag deltog i ImageCLEF Medical 2022-utvärderingskampanjen, där jag deltog med att ta itu med både konceptdetektering och bildtextförutsägelse för att få ett kvantitativt mått på prestandan för mina föreslagna arkitekturer [28]. Mitt bidrag, där jag representerade forskargruppen NeuralDynamicsLab, där jag arbetade som ledande forskningsingenjör, placerade sig på 4:e plats i den förra och 5:e i den senare uppgiften [55, 68] bland 12 grupper som ingår bland de 10 bästa bidragen i båda uppgifterna.
Artificial Neural Networks Deep Learning Speech and language technology Natural Language Processing (NLP) Deep networks Generative deep networks Convolutional neural networks (CNN) Text generation Information retrieval Diagnostic captioning Image captioning concept prediction classification image encoders transformers Encoder-Decoder architecture abstractive summarization Neurala nätverk Djup inlärning Tal-och språkteknologi naturlig språkbehandling djup neurala nätverk generativa djupa nätverk konvolutionella neurala nätverk Textgenerering Informationssökning Diagnostisk textning Bildtextning konceptförutsägelse klassificering bildkodare transformatorer kodaravkodararkitektur abstrakt sammanfattning Computer and Information Sciences Data- och informationsvetenskap
182	Analysis of Accuracy for Engine and Gearbox Sensors Dogantimur, Erkan, Johnsson, Daniel January 2019 This thesis provides a standardized method to measure accuracy for engine and gearbox sensors. Accuracy is defined by ISO 5725, which states that trueness and precision need to be known to provide a metric for accuracy. However, obtaining and processing the data required for this is not straightforward. In this thesis, a method is presented that consists of two main parts: data acquisition and data analysis. The data acquisition part shows how to connect all of the equipment used and how to sample and store all the raw data from the sensors. The data analysis part shows how to process that raw data into statistical data, such as trueness, repeatability and reproducibility for the sensors. Once repeatability and reproducibility are known, the total precision can be determined. Accuracy can then be obtained by using information from trueness and precision. In addition, this thesis shows that measurement error can be separated into error caused by the sensors and error caused by the measurand. This is useful information, because it can be used to assess which type of error is the greatest, whether or not it can be compensated for, and if it is economically viable to compensate for such error. The results are presented in graphs that give information about the sensors’ performance. Between Hall and inductive sensors, there was no clear winner, since they both have their strengths and weaknesses. The thesis ends by making recommendations on how to compensate for some of the errors, and how to improve upon the method to make it more automatic in the future.
statistical analysis engine gearbox sensor sensors rotation rotational speed differential pressure performance characteristics trueness precision accuracy Hall inductive induction gauge pressure reference hysteresis bias noise linearity non-linearity DEWESoft DAQ wheel tooth teeth absolute rotary encoder matlab XOR Gray binary LabView sampling measurement measuring statistisk analys motor växellåda sensor sensorer rotation rotationell hastighet fart differentiell tryck prestanda karaktäristik trueness precision accuracy noggrannhet onoggrannhet Hall induktiv induktion tryckskillnad referens hysteres bias avstånd brus linjäritet linearitet olinear olinjär olinearitet olinjäritet DEWESoft DAQ sampling mätning hjul tand tänder absolut rotationsgivare givare matlab XOR Gray binär LabView Elektroteknik och elektronik
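The abstract of record 182 names trueness, repeatability and reproducibility as the statistics derived from the raw sensor data. As a rough illustration only, the sketch below computes them for a balanced set of repeated readings using the usual one-way variance decomposition. The function name `iso5725_summary`, the grouping of readings by measurement condition, and the sample numbers are assumptions made for this example; they are not taken from the thesis.

```python
from math import sqrt
from statistics import mean, variance


def iso5725_summary(groups, reference):
    """Summarize trueness and precision for a balanced measurement design.

    groups    -- list of equally sized lists of readings, one list per
                 measurement condition (hypothetical grouping, e.g. per run)
    reference -- accepted reference value of the measurand
    """
    n = len(groups[0])
    if len(groups) < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least 2 groups of equal size >= 2")

    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)

    # Repeatability variance: pooled within-group variance (balanced design).
    s_r2 = mean([variance(g) for g in groups])
    # Between-group variance, clipped at zero when the estimate is negative.
    s_l2 = max(0.0, variance(group_means) - s_r2 / n)
    # Reproducibility variance: within-group plus between-group components.
    s_big_r2 = s_r2 + s_l2

    return {
        "bias": grand_mean - reference,        # trueness estimate
        "repeatability_sd": sqrt(s_r2),
        "reproducibility_sd": sqrt(s_big_r2),
    }


# Made-up readings: two conditions, two repeats each, reference value 10.0.
summary = iso5725_summary([[10.0, 10.2], [10.4, 10.6]], reference=10.0)
print(summary)
```

The bias is the trueness estimate and the two standard deviations describe precision; how the thesis combines them into a single accuracy figure is not stated in the abstract, so that step is left out here.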
183	Settling-Time Improvements in Positioning Machines Subject to Nonlinear Friction Using Adaptive Impulse Control Hakala, Tim 31 January 2006 A new method of adaptive impulse control is developed to precisely and quickly control the position of machine components subject to friction. Friction dominates the forces affecting fine positioning dynamics. Friction can depend on payload, velocity, step size, path, initial position, temperature, and other variables. Control problems such as steady-state error and limit cycles often arise when applying conventional control techniques to the position control problem. Studies in the last few decades have shown that impulsive control can produce repeatable displacements as small as ten nanometers without limit cycles or steady-state error in machines subject to dry sliding friction. These displacements are achieved through the application of short duration, high intensity pulses. The relationship between pulse duration and displacement is seldom a simple function. The most dependable practical methods for control are self-tuning; they learn from online experience by adapting an internal control parameter until precise position control is achieved. To date, the best known adaptive pulse control methods adapt a single control parameter. While effective, the single parameter methods suffer from sub-optimal settling times and poor parameter convergence. To improve performance while maintaining the capacity for ultimate precision, a new control method referred to as Adaptive Impulse Control (AIC) has been developed. To better fit the nonlinear relationship between pulses and displacements, AIC adaptively tunes a set of parameters. Each parameter affects a different range of displacements. Online updates depend on the residual control error following each pulse, an estimate of pulse sensitivity, and a learning gain.
After an update is calculated, it is distributed among the parameters that were used to calculate the most recent pulse. As the stored relationship converges to the actual relationship of the machine, pulses become more accurate and fewer pulses are needed to reach each desired destination. When fewer pulses are needed, settling time improves and efficiency increases. AIC is experimentally compared to conventional PID control and other adaptive pulse control methods on a rotary system with a position measurement resolution of 16000 encoder counts per revolution of the load wheel. The friction in the test system is nonlinear and irregular with a position dependent break-away torque that varies by a factor of more than 1.8 to 1. AIC is shown to improve settling times by as much as a factor of two when compared to other adaptive pulse control methods while maintaining precise control tolerances.
control position adaptive impulsive settling-time nonlinear friction pulses displacements precise tolerances log-spaced update distributed learning Coulomb Stribeck Tomizuka Yang AIC PID MRAC STR RTAI Linux FreeBSD kernel modules microcontroller convergence practical self-tuning methods techniques limit-cycles steady-state error zero stable stability bound envelope partitioned scheme lookup-table multi-point adaptation repeatable mean servo motor exponential square-law rise-time real-time log-log interpolation pro-forma curve-fit sensitivity compliance variable static dynamic response torque acceleration velocity optical encoder parameters evolution fixed-law enhanced split weighting initialization trajectory layered processes Mechanical Engineering
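The abstract of record 183 describes an update driven by the residual error, a pulse-sensitivity estimate and a learning gain, which is then distributed among the parameters used for the most recent pulse. The sketch below is one possible reading of that idea, not the thesis's algorithm. The class name `PulseTable`, the log-spaced lookup table, the linear interpolation in log-displacement, and the weighting of the update by interpolation weights are all assumptions chosen for illustration.

```python
import bisect
import math


class PulseTable:
    """Hypothetical lookup table mapping desired displacement to pulse duration."""

    def __init__(self, displacements, durations, gain=0.5):
        # Table nodes sit at log-spaced displacements (must be ascending, > 0).
        self.log_d = [math.log(d) for d in displacements]
        self.dur = list(durations)
        self.gain = gain  # learning gain (assumed scalar)

    def _weights(self, displacement):
        # Locate the two neighbouring nodes and their interpolation weights.
        x = min(max(math.log(displacement), self.log_d[0]), self.log_d[-1])
        hi = bisect.bisect_left(self.log_d, x)
        hi = min(max(hi, 1), len(self.log_d) - 1)
        lo = hi - 1
        w_hi = (x - self.log_d[lo]) / (self.log_d[hi] - self.log_d[lo])
        return lo, hi, 1.0 - w_hi, w_hi

    def pulse(self, displacement):
        # Interpolate the pulse duration for the requested displacement.
        lo, hi, w_lo, w_hi = self._weights(displacement)
        return w_lo * self.dur[lo] + w_hi * self.dur[hi]

    def update(self, desired, achieved, sensitivity):
        # Residual error scaled by the gain and by the sensitivity estimate
        # (assumed here to mean displacement per unit of pulse duration).
        delta = self.gain * (desired - achieved) / sensitivity
        # Distribute the correction over the nodes that produced the pulse.
        lo, hi, w_lo, w_hi = self._weights(desired)
        self.dur[lo] += w_lo * delta
        self.dur[hi] += w_hi * delta


# Made-up table and one learning step after an undershoot.
table = PulseTable([1.0, 10.0, 100.0], [1.0, 2.0, 3.0], gain=0.5)
table.update(desired=10.0, achieved=8.0, sensitivity=4.0)
print(table.pulse(10.0))
```

Because each node only receives the share of the correction that matches its contribution to the last pulse, nodes covering other displacement ranges are left untouched, which is consistent with the abstract's statement that each parameter affects a different range of displacements.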
184	Towards meaningful and data-efficient learning : exploring GAN losses, improving few-shot benchmarks, and multimodal video captioning Huang, Gabriel 09 1900 Ces dernières années, le domaine de l’apprentissage profond a connu des progrès énormes dans des applications allant de la génération d’images, détection d’objets, modélisation du langage à la réponse aux questions visuelles. Les approches classiques telles que l’apprentissage supervisé nécessitent de grandes quantités de données étiquetées et spécifiques à la tâche. Cependant, celles-ci sont parfois coûteuses, peu pratiques, ou trop longues à collecter. La modélisation efficace en données, qui comprend des techniques comme l’apprentissage few-shot (à partir de peu d’exemples) et l’apprentissage self-supervised (auto-supervisé), tente de remédier au manque de données spécifiques à la tâche en exploitant de grandes quantités de données plus “générales”. Les progrès de l’apprentissage profond, et en particulier de l’apprentissage few-shot, s’appuient sur les benchmarks (suites d’évaluation), les métriques d’évaluation et les jeux de données, car ceux-ci sont utilisés pour tester et départager différentes méthodes sur des tâches précises, et identifier l’état de l’art. Cependant, du fait qu’il s’agit de versions idéalisées de la tâche à résoudre, les benchmarks sont rarement équivalents à la tâche originelle, et peuvent avoir plusieurs limitations qui entravent leur rôle de sélection des directions de recherche les plus prometteuses. De plus, la définition de métriques d’évaluation pertinentes peut être difficile, en particulier dans le cas de sorties structurées et en haute dimension, telles que des images, de l’audio, de la parole ou encore du texte.
Cette thèse discute des limites et des perspectives des benchmarks existants, des fonctions de coût (training losses) et des métriques d’évaluation (evaluation metrics), en mettant l’accent sur la modélisation générative - les Réseaux Antagonistes Génératifs (GANs) en particulier - et la modélisation efficace des données, qui comprend l’apprentissage few-shot et self-supervised. La première contribution est une discussion de la tâche de modélisation générative, suivie d’une exploration des propriétés théoriques et empiriques des fonctions de coût des GANs. La deuxième contribution est une discussion sur la limitation des few-shot classification benchmarks, certains ne nécessitant pas de généralisation à de nouvelles sémantiques de classe pour être résolus, et la proposition d’une méthode de base pour les résoudre sans étiquettes en phase de testing. La troisième contribution est une revue sur les méthodes few-shot et self-supervised de détection d’objets, qui souligne les limites et directions de recherche prometteuses. Enfin, la quatrième contribution est une méthode efficace en données pour la description de vidéo qui exploite des jeux de données texte et vidéo non supervisés. / In recent years, the field of deep learning has seen tremendous progress for applications ranging from image generation, object detection, language modeling, to visual question answering. Classic approaches such as supervised learning require large amounts of task-specific and labeled data, which may be too expensive, time-consuming, or impractical to collect. Data-efficient methods, such as few-shot and self-supervised learning, attempt to deal with the limited availability of task-specific data by leveraging large amounts of general data. Progress in deep learning, and in particular, few-shot learning, is largely driven by the relevant benchmarks, evaluation metrics, and datasets. They are used to test and compare different methods on a given task, and determine the state-of-the-art.
However, due to being idealized versions of the task to solve, benchmarks are rarely equivalent to the original task, and can have several limitations which hinder their role of identifying the most promising research directions. Moreover, defining meaningful evaluation metrics can be challenging, especially in the case of high-dimensional and structured outputs, such as images, audio, speech, or text. This thesis discusses the limitations and perspectives of existing benchmarks, training losses, and evaluation metrics, with a focus on generative modeling—Generative Adversarial Networks (GANs) in particular—and data-efficient modeling, which includes few-shot and self-supervised learning. The first contribution is a discussion of the generative modeling task, followed by an exploration of theoretical and empirical properties of the GAN loss. The second contribution is a discussion of a limitation of few-shot classification benchmarks, which is that they may not require class semantic generalization to be solved, and the proposal of a baseline method for solving them without test-time labels. The third contribution is a survey of few-shot and self-supervised object detection, which points out the limitations and promising future research for the field. Finally, the fourth contribution is a data-efficient method for video captioning, which leverages unsupervised text and video datasets, and explores several multimodal pretraining strategies. 
self-supervised learning few-shot classification few-shot object detection low-data learning object detection instance segmentation representation learning residual network visual transformer Faster R-CNN DETR parametric adversarial divergence generative adversarial network variational auto-encoder maximum-likelihood structured prediction optimal discriminator mutual information implicit generative model multimodal pretraining dense video captioning cross-attention YouCook2 HowTo-100M Youtube-8M Recipe-1M Pascal VOC MSCOCO LVIS mutual information neural estimation apprentissage auto-supervisé classification few-shot détection d'objets few-shot apprentissage efficace en données segmentation en instances apprentissage de représentation réseau résiduel transformer visual divergences antagonistes paramétriques auto-encodeur variationnel maximum de vraisemblance prédiction structurée discriminateur optimal information mutuelle modèle génératif implicite pré-apprentissage multi-modal description dense de vidéo attention croisée ResNet ViT GAN VAE MINE
