391 |
Predicting Digital Porous Media Properties Using Machine Learning Methods
Elmorsy, Mohamed January 2023
Subsurface porous media, such as aquifers, petroleum reservoirs, and geothermal systems, are vital for natural resources and environmental management. Extensive research has been conducted to understand flow and transport in these media, addressing challenges in hydrocarbon extraction, carbon storage, and waste management. Classifying the type of porous media (e.g., sandstone, carbonate) is often the first step in the rock characterization process, and it provides critical information regarding the physical properties of the porous media. Therefore, we use multivariate statistical methods with discriminant analysis to categorize porous media samples. This approach proved efficient, achieving excellent classification accuracy on testing datasets, and served as a surrogate tool to study key porous media characteristics. While recent advances in three-dimensional (3D) imaging of core samples have enabled digital subsurface characterization, the exorbitant computational cost of direct numerical simulation in 3D remains a persistent challenge. In contrast, machine learning (ML) models are much more efficient, though their use in subsurface characterization is still in its infancy. Therefore, we introduce a novel 3D convolutional neural network (CNN) for end-to-end prediction of permeability. By increasing dataset size and diversity and optimizing the network architecture, our model surpasses the accuracy of existing 3D CNN models for permeability prediction. It demonstrates excellent generalizability, accurately predicting permeability in previously unseen samples. However, despite the efficiency of the developed 3D CNN model for accurate and fast permeability prediction, its utility remains limited to small subdomains of the digital rock samples. Therefore, we introduce an upscaling technique using a new analytical solution to calculate effective permeability in a 3D digital rock composed of 2 × 2 × 2 anisotropic cells.
By incorporating this solution into physics-informed neural network (PINN) models, we achieve highly accurate results. Even when upscaling previously unseen samples at multiple levels, the PINN with the physics-informed module maintains excellent accuracy. This advancement enhances the capability of ML models, such as the 3D CNN, for efficient and accurate digital rock analysis at the core scale. After successfully applying ML models to permeability prediction, we extend their application to another important parameter: effective thermal conductivity, which is key in engineering projects such as radioactive waste repositories, geothermal energy production, and underground energy storage. To address the need for large training data and processing power in ML models, we propose a novel framework based on transfer learning. This approach allows prior knowledge from previous applications to be transferred, resulting in faster and more efficient implementation of new relevant applications. We introduce CNN models trained on various porous media samples that leverage transfer learning to accurately predict the thermal conductivity of porous media samples. Our approach reduces training time, processing power, and data requirements, enabling effective prediction and analysis of porous media properties such as permeability and thermal conductivity. It also facilitates the application of ML to other properties, improving efficiency and accuracy. / Thesis / Doctor of Philosophy (PhD)
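The abstract does not reproduce the analytical solution for the 2 × 2 × 2 block, so the sketch below does not attempt it. As a rough illustration of what any upscaling result has to respect, it computes the classical harmonic and arithmetic means of the cell permeabilities, which bound the effective permeability of a block of equal-volume isotropic cells. The function name, the isotropy assumption, and the sample values are all hypothetical and are not taken from the thesis.

```python
from statistics import fmean, harmonic_mean


def permeability_bounds(cell_perms):
    """Return (lower, upper) bounds on effective permeability.

    Assumes equal-volume, isotropic cells (a simplification; the thesis
    treats anisotropic cells). The harmonic mean corresponds to cells
    arranged in series, the arithmetic mean to cells arranged in parallel.
    """
    perms = list(cell_perms)
    if not perms or any(k <= 0 for k in perms):
        raise ValueError("permeabilities must be positive")
    return harmonic_mean(perms), fmean(perms)


# Hypothetical permeabilities (e.g. in millidarcy) for a 2 x 2 x 2 block.
cells = [1.0, 4.0, 1.0, 4.0, 1.0, 4.0, 1.0, 4.0]
lower, upper = permeability_bounds(cells)
print(f"effective permeability lies between {lower:.3f} and {upper:.3f}")
```

An upscaled value that falls outside these two numbers would be a sign that something went wrong in the prediction, which makes the bounds a cheap sanity check.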
|
392 |
Exploring the Use of Attention for Generation Z Fashion Style Recognition with User Annotations as Labels / Undersökande av uppmärksamhet för igenkänning av Generation Z:s klädstilar med användarannoteringar som träningsetiketter
Samakovlis, Niki January 2023
As e-commerce and online shopping have increased worldwide, interest in and research on intelligent fashion systems have expanded. Given the competitive nature of the fashion market, digital marketplaces depend on determining customer preferences. The fashion preferences of the next generation of consumers, Generation Z, are often discovered on social media, where new fashion styles have emerged. For digital marketplaces to attract Generation Z consumers, an understanding of their fashion style preferences may be crucial. However, fashion style recognition remains challenging due to the subjective nature of fashion styles. Previous research has approached the task by fine-tuning pre-trained convolutional neural networks (CNNs). The disadvantage of this approach is that a CNN used on its own fails to find subtle visual differences between clothing items. Hence, this thesis approaches clothing style recognition as a fine-grained image recognition task by incorporating into the network an attention mechanism, a component that allows the model to focus on specific parts of the input images. Specifically, a convolutional block attention module (CBAM) is added to a CNN. Based on the results, it is concluded that the fine-tuned CNN without the attention module achieves superior performance. In contrast, qualitative analysis of Grad-CAM visualizations shows that the attention mechanism helps the CNN capture discriminative features, while the network without the attention module tends to make predictions based on dataset bias. For a fair comparison, future work should extend this research by refining the dataset or using an additional dataset. / I takt med att e-handel har ökat världen över har intresset och forskningen för intelligenta modesystem ökat. Modemarknadens konkurrenskraft har gjort digitala marknadsplatser beroende av att bestämma deras kunders preferenser. 
Modepreferenserna för nästa generations konsumenter, Generation Z, upptäcks ofta på sociala medier, där nya klädstilar har skapats. För att digitala marknadsplatser ska kunna locka Generation Z kan en förståelse för deras klädstilpreferenser vara avgörande. Igenkänning av klädstilar är dock fortfarande svårt på grund av klädtilars subjektiva natur. Tidigare forskning har finjusterat faltningsnätverk. Nackdelen med detta tillvägagångssätt är att ett faltningsnätverk som utnyttjas på egen hand inte lyckas hitta dem subtila visuella skillnader mellan klädesplagg. Därför definierar denna avhandling problemet som finkornig bildigenkänning genom att addera en komponent som gör att modellen kan fokusera på specifika delar av bilderna, kallad en uppmärksamhetsmekanism, i nätverket. Specifikt läggs en convolutional block attention module (CBAM) till i arkitekturen av ett faltningsnätverk. Baserat på resultaten dras slutsatsen att det finjusterade faltningsnätverket utan uppmärksamhetsmekanismen uppnår överlägsen prestanda. Däremot visar kvalitativ analys utförd på Grad-CAMvisualiseringar att uppmärksamhetsmekanismen hjälper faltningsnätverket att fokusera på de diskriminerande egenskaperna, medan nätverket utan uppmärksamhetsmekanismen tenderar att klassificera baserat på bias i inputdatan. För en rättvis jämförelse bör framtida arbete innebära ett förfinande av datamängden eller använda en ytterligare datamängd.
|
393 |
Deep Learning-Based Approach for Fusing Satellite Imagery and Historical Data for Advanced Traffic Accident Severity
Sandaka, Gowtham Kumar, Madhamsetty, Praveen Kumar January 2023
Background. This research centers on tackling the serious global problem of traffic accidents. With more than a million deaths each year and numerous injuries, it is vital to predict and prevent these accidents. By combining satellite images and data on accidents, this study uses a mix of advanced learning methods to build a model that can foresee accidents. This model aims to improve how accurately we predict accidents and understand what causes them. Ultimately, this could lead to better road safety, smoother maintenance, and even benefits for self-driving cars and insurance.
Objective. The objective of this thesis is to create a predictive model that improves the accuracy of traffic accident severity forecasts by integrating satellite imagery and historical accident data, and to compare this model with stand-alone data models. Through this hybrid approach, the aim is to enhance prediction precision and gain deeper insights into the underlying factors contributing to accidents, thereby potentially aiding in the reduction of accidents and their resulting impact.
Method. The proposed method involves a literature review to find current image recognition models, followed by experimentation: training a Logistic Regression, Random Forest, SVM classifier, VGG19, and a hybrid model using the CNN and VGG19, and then comparing their performance using the metrics described in the thesis.
Results. The performance of the proposed method is evaluated using various metrics, including precision, recall, F1 score, and confusion matrix, on a large dataset of labeled images. The results indicate that our proposed approach achieves an accuracy of 81.7% in detecting traffic accident severity, whereas the models built on structural data alone and image data alone reach accuracies of 58.4% and 72.5%, respectively. Our proposed method could potentially be used to identify safe and dangerous locations for accidents.
Conclusion. The predictive modeling of traffic accidents is performed using three different types of datasets: structural data, satellite images, and a combination of both. The finalized architectures are an SVM classifier, VGG19, and a hybrid input model using CNN and VGG19. These models are compared in order to find the best-performing approach. The results indicate that our hybrid model has the best accuracy, at 81.7%, indicating strong performance.
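The abstract lists precision, recall, and F1 score among its evaluation metrics without defining them. A minimal sketch of how these follow from confusion-matrix counts is given below; the function name and the counts in the usage lines are hypothetical and are not results from the thesis.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Each metric falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one severity class.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Unlike overall accuracy, these per-class metrics stay informative when severe accidents are much rarer than minor ones.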
|
394 |
Détection de tableaux dans des documents : une étude de TableBank
Yockell, Eugénie 04 1900
L’extraction d’information dans des documents est une nécessité, particulièrement dans
notre ère actuelle où il est commun d’employer un téléphone portable pour photographier
des documents ou des factures. On trouve aussi une utilisation répandue de documents
PDF qui nécessite de traiter une imposante quantité de documents digitaux. Par leur
nature, les données des documents PDF sont complexes à extraire, nécessitant d’être
analysés comme des images. Dans cette recherche, on se concentre sur une information
particulière à prélever: des tableaux. En effet, les tableaux retrouvés dans les docu-
ments représentent une entité significative, car ils contiennent des informations décisives.
L’utilisation de modèles neuronaux pour performer des extractions automatiques permet
considérablement d’économiser du temps et des efforts.
Dans ce mémoire, on définit les métriques, les modèles et les ensembles de données
utilisés pour la tâche de détection de tableaux. On se concentre notamment sur l’étude
des ensembles de données TableBank et PubLayNet, en soulignant les problèmes d’an-
notations présents dans l’ensemble TableBank. On relève que différentes combinaisons
d’ensembles d’entraînement avec TableBank et PubLayNet semblent améliorer les perfor-
mances du modèle Faster R-CNN, ainsi que des méthodes d’augmentations de données.
On compare aussi le modèle de Faster R-CNN avec le modèle CascadeTabNet pour la
détection de tableaux où ce premier demeure supérieur.
D’autre part, on soulève un enjeu qui est peu discuté dans la tâche de détection
d’objets, soit qu’il existe une trop grande quantité de métriques. Cette problématique
rend la comparaison de modèles ardue. On génère ainsi les résultats de modèles selon
plusieurs métriques afin de démontrer qu’elles conduisent généralement vers différents
modèles gagnants, soit le modèle ayant les meilleures performances. On recommande
aussi les métriques les plus pertinentes à observer pour la détection de tableaux, c’est-à-
dire APmedium/APmedium, Pascal AP85 ou COCO AP85 et la métrique de TableBank. / Extracting information from documents is a necessity, especially in today’s age where
it is common to use a cell phone to photograph documents or invoices. There is also
the widespread use of PDF documents, which requires processing a large number of digital
documents. Due to their nature, the data in PDF documents are complex to retrieve,
needing to be analyzed as images. In this research, we focus on a particular type of information to
be extracted: tables. Indeed, the tables found in documents represent a significant entity,
as they contain decisive information. The use of neural networks to perform automatic
retrieval saves time and effort.
In this research, the metrics, models and datasets used for the table detection task are
defined. In particular, we focus on the study of the TableBank and PubLayNet datasets,
highlighting the problems of annotations present in the TableBank set. We point out that
different combinations of training sets using TableBank and PubLayNet appear to improve
the performance of the Faster R-CNN model, as well as data augmentation methods. We
also compare the Faster R-CNN model with the CascadeTabNet model for table detection
where the former remains superior.
In addition, we raise an issue that is not often discussed in the object detection task,
namely that there are too many metrics. This problem makes model comparison difficult.
We therefore generate results from models with several metrics in order to demonstrate
the influence of these metrics in defining the best performing model. We also recommend
the most relevant metrics to observe for table detection, APmedium/APmedium, Pascal
AP85 or COCO AP85 and the TableBank metric.
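The point that different metrics can crown different winning models comes down largely to the intersection-over-union (IoU) threshold at which a predicted table counts as correct. The sketch below is a hedged illustration of that effect; the function names, the box format `(x_min, y_min, x_max, y_max)`, and the sample boxes are assumptions and are not taken from the thesis.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, true_box, threshold):
    """A prediction counts as a detection when its IoU reaches the threshold."""
    return iou(pred_box, true_box) >= threshold


# Hypothetical table annotation and a prediction that misses its last rows.
truth = (0, 0, 100, 50)
pred = (0, 0, 100, 40)
for threshold in (0.5, 0.85):
    print(threshold, is_match(pred, truth, threshold))
```

A prediction that clips part of a table passes a lenient threshold but fails a strict one such as 0.85, so a model that tends to produce slightly loose boxes can rank first under one metric and lower under another.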
|
395 |
Convolutional Neural Network Optimization Using Genetic Algorithms
Reiling, Anthony J. January 2017
No description available.
|
396 |
Robust Speech Activity Detection and Direction of Arrival Using Convolutional Neural Networks
Näslund, Anton, Jeansson, Charlie January 2020
Social robots are becoming more and more common in our everyday lives. In the field of conversational robotics, development is moving towards socially engaging robots with humanlike conversation. This project looked into one of the technical aspects of speech recognition, namely speech activity detection (SAD). The presented solution uses a convolutional neural network (CNN) based system to detect speech in a forward azimuth area. The project used a dataset from FestVox called CMU Arctic, complemented with recorded noises. A library called Pyroomacoustics was used to simulate a real-world setup in order to create a robust system. A simplified version was built that only detected speech activity, and it reached an accuracy of 95%. The finished model reached an accuracy of 93%. Because no previously published solutions to our problem were found, it was compared with a similar project, the WebRTC voice activity detection (VAD) algorithm with beamforming. Our solution proved to be more accurate in both cases than WebRTC was on our dataset. / Sociala robotar blir vanligare och vanligare i våra vardagliga liv. Inom konversationsrobotik går utvecklingen mot socialt engagerande robotar som kan ha mänskliga konversationer. Detta projekt tittar på en av de tekniska aspekterna vid taligenkänning, nämligen talaktivitets detektion. Den presenterade lösningen använder ett convolutional neuralt nätverks(CNN) baserat system för att detektera tal i ett framåtriktat azimut område. Projektet använde sig av ett dataset från FestVox, kallat CMU Artic och kompletterades genom att lägga till ett antal inspelade störningsljud. Ett bibliotek som heter Pyroomacoustics användes för att simulera en verklig miljö för att skapa ett robust system. En förenklad modell konstruerades som endast detekterade talaktivitet och en noggrannhet på 95% uppnåddes. Den färdiga maskinen resulterade i en noggrannhet på 93%. 
Det jämfördes med liknande projekt, en röstaktivitetsdetekterings (VAD) algoritm WebRTC med strålformning, eftersom inga tidigare publicerade lösningar för vårt projekt hittades. Det visade sig att våra lösningar hade högre noggrannhet än den WebRTC uppnådde på vårt dataset. / Kandidatexjobb i elektroteknik 2020, KTH, Stockholm
|
397 |
Blockchain-based Peer-to-peer Electricity Trading Framework Through Machine Learning-based Anomaly Detection Technique
Jing, Zejia 31 August 2022
With the growing installation of home photovoltaics, traditional energy trading is evolving from a unidirectional utility-to-consumer model into a more distributed peer-to-peer paradigm. In addition, with the development of building energy management platforms and demand response-enabled smart devices, saved energy consumption, known as negawatt-hours, has emerged as another commodity that can be exchanged. Users may tune their heating, ventilation, and air conditioning (HVAC) system setpoints to adjust building hourly energy consumption and generate negawatt-hours. Photovoltaic (PV) energy and negawatt-hours are the two major resources in peer-to-peer electricity trading. Blockchain has been touted as an enabler of trustworthy and reliable peer-to-peer trading, facilitating the deployment of such distributed electricity trading through encrypted processes and records. Unfortunately, blockchain cannot fully detect anomalous participant behaviors or malicious inputs to the network. Consequently, end-user anomaly detection is imperative for enhancing trust in peer-to-peer electricity trading.
This dissertation introduces machine learning-based anomaly detection techniques for peer-to-peer PV energy and negawatt-hour trading. These techniques can help predict the PV energy and negawatt-hours available in the next hour and flag potential anomalies when bids are submitted. As the traditional energy trading market is agnostic to tangible real-world resources, developing, evaluating, and integrating machine learning forecasting-based anomaly detection methods can give users knowledge of reasonable bid quantities. A user may, intentionally or unintentionally, submit extremely high or low bids that do not match their solar panel capability or are not backed by sufficient negawatt-hour and PV energy resources. Some anomalies occur because the participant's sensor suffers from integrity errors, while other abnormal offers are submitted maliciously so that attackers can profit from market disruption. In both cases, anomalies should be detected by the algorithm and rejected by the market. Artificial Neural Networks (ANN), Recurrent Neural Networks (RNN) with Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), and Convolutional Neural Networks (CNN) are compared and studied for PV energy and negawatt-hour forecasting. The semi-supervised anomaly detection framework is explained, and its performance is demonstrated. The threshold values for anomaly detection are determined based on the model trained on historical data. Besides ambient weather information, HVAC setpoint and building occupancy are input parameters for predicting building hourly energy consumption in negawatt-hour trading. The building model is trained and managed by negawatt-hour aggregators. CO2 monitoring devices are integrated into the cloud-based smart building platform BEMOSS™ to monitor occupancy levels, further improving building load forecasting accuracy in negawatt-hour trading. The relationship between building occupancy and CO2 measurement is analyzed.
Finally, experiments based on the Hyperledger platform demonstrate blockchain-based peer-to-peer energy trading and how the platform detects anomalies. / Doctor of Philosophy / The modern power grid is transforming from a unidirectional into a transactive power system. Distributed peer-to-peer (P2P) energy trading is becoming more and more popular. Rooftop PV energy and negawatt-hours, the two main electricity assets, play important roles in peer-to-peer energy trading, which enables building owners to join the electricity market as both energy consumers and producers, also called prosumers.
However, P2P energy trading participants are usually uninformed and do not know how much energy they can generate during the next hour. Thus, a system is needed to guide participants in submitting a reasonable amount of PV energy or negawatt-hours to be supplied. This dissertation develops a machine learning-based anomaly detection model for an energy trading platform to estimate the reasonable PV energy and negawatt-hours available for the next hour's electricity trading market. The anomaly detection performance of this framework is analyzed. The building load forecasting model used in negawatt-hour trading also considers the effect of building occupancy level and HVAC setpoint adjustment. Moreover, the use of CO2 measurement devices to monitor building occupancy levels is demonstrated. Finally, a simple Hyperledger-based electricity trading platform that enables participants to sell photovoltaic solar energy or negawatt-hours to other participants is simulated to demonstrate the potential benefits of blockchain.
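The abstract states that anomaly thresholds are derived from a forecasting model trained on historical data, but it does not give the rule. A minimal sketch of one common choice is shown below, assuming the threshold is the mean plus k standard deviations of the absolute forecast error on historical data; the function names, the value of k, and the sample numbers are hypothetical and are not taken from the dissertation.

```python
from statistics import fmean, pstdev


def fit_threshold(actual, forecast, k=3.0):
    """Derive an error threshold from historical forecast performance.

    Assumed rule (not from the dissertation): mean absolute error plus
    k population standard deviations of the absolute error.
    """
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    return fmean(errors) + k * pstdev(errors)


def is_anomalous_bid(bid_kwh, forecast_kwh, threshold):
    """Flag a bid whose distance from the forecast exceeds the threshold."""
    return abs(bid_kwh - forecast_kwh) > threshold


# Hypothetical hourly PV output (kWh) and the model's forecasts for it.
history_actual = [5.0, 5.2, 4.8, 5.1]
history_forecast = [5.1, 5.0, 5.0, 5.0]
threshold = fit_threshold(history_actual, history_forecast)

next_hour_forecast = 5.0
for bid in (5.1, 12.0):
    print(bid, is_anomalous_bid(bid, next_hour_forecast, threshold))
```

Under this rule a bid close to the forecast is accepted, while a bid far above what the panel is expected to produce is flagged, regardless of whether the cause is a faulty sensor or a deliberate attack.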
|
398 |
Can technical analysis using computer vision generate alpha in the stock market?
Lian, Rasmus, Clarin, Oscar January 2024
We investigate the novel idea of using computer vision to predict future stock price movement by training a convolutional neural network (CNN) to detect patterns in images of stock graphs. Subsequently, we create a portfolio strategy based on the CNN stock price predictions to see if these predictions can generate alpha for investors. We apply this method in the Swedish stock market and evaluate the performance of CNN portfolios across two different exchanges and various stock indices segmented by market capitalisation. Our findings show that trading based on CNN predictions can outperform our benchmarks and generate positive alpha. Most of our portfolios generate positive alpha before transaction costs, while one also generates positive alpha after deducting transaction costs. Further, our results demonstrate that CNN models are capable of successfully generalising their trained knowledge, detecting information in stock graphs they have never seen before. This suggests that CNN models are not limited to the characteristics present in their training data, indicating that models trained under one set of market conditions can also be effective in a different market scenario. Our results further strengthen the overall findings of other researchers using methods similar to ours.
|
399 |
Enhancing Industrial Process Interaction Using Deep Learning, Semantic Layers, and Augmented Reality
Izquierdo Doménech, Juan Jesús 24 June 2024
Tesis por compendio / [ES] La Realidad Aumentada (Augmented Reality, AR) y su capacidad para integrar contenido sintético sobre una imagen real proporciona un valor incalculable en diversos campos; no obstante, la industria es uno de estos campos que más se puede aprovechar de ello. Como tecnología clave en la evolución hacia la Industria 4.0 y 5.0, la AR no solo complementa sino que también potencia la interacción humana con los procesos industriales. En este contexto, la AR se convierte en una herramienta esencial que no sustituye al factor humano, sino que lo enriquece, ampliando sus capacidades y facilitando una colaboración más efectiva entre humanos y tecnología. Esta integración de la AR en entornos industriales no solo mejora la eficiencia y precisión de las tareas, sino que también abre nuevas posibilidades para la expansión del potencial humano.
Existen numerosas formas en las que el ser humano interactúa con la tecnología, siendo la AR uno de los paradigmas más innovadores respecto a cómo los usuarios acceden a la información; sin embargo, es crucial reconocer que la AR, por sí misma, tiene limitaciones en cuanto a la interpretación del contenido que visualiza. Aunque en la actualidad podemos acceder a diferentes librerías que utilizan algoritmos para realizar una detección de imágenes, objetos, o incluso entornos, surge una pregunta fundamental: ¿hasta qué punto puede la AR comprender el contexto de lo que ve? Esta cuestión se vuelve especialmente relevante en entornos industriales. ¿Puede la AR discernir si una máquina está funcionando correctamente, o su rol se limita a la presentación de indicadores digitales superpuestos? La respuesta a estas cuestiones subrayan tanto el potencial como los límites de la AR, impulsando la búsqueda de innovaciones que permitan una mayor comprensión contextual y adaptabilidad a situaciones específicas dentro de la industria.
En el núcleo de esta tesis yace el objetivo de no solo dotar a la AR de una "inteligencia semántica" capaz de interpretar y adaptarse al contexto, sino también de ampliar y enriquecer las formas en que los usuarios interactúan con esta tecnología. Este enfoque se orienta particularmente a mejorar la accesibilidad y la eficiencia de las aplicaciones de AR en entornos industriales, que son por naturaleza restringidos y complejos. La intención es ir un paso más allá de los límites tradicionales de la AR, proporcionando herramientas más intuitivas y adaptativas para los operadores en dichos entornos.
La investigación se despliega a través de tres artículos de investigación, donde se ha desarrollado y evaluado una arquitectura multimodal progresiva. Esta arquitectura integra diversas modalidades de interacción usuario-tecnología, como el control por voz, la manipulación directa y el feedback visual en AR. Además, se incorporan tecnologías avanzadas basadas en modelos de aprendizaje automática (Machine Learning, ML) y aprendizaje profundo (Deep Learning, DL) para extraer y procesar información semántica del entorno. Cada artículo construye sobre el anterior, demostrando una evolución en la capacidad de la AR para interactuar de manera más inteligente y contextual con su entorno, y resaltando la aplicación práctica y los beneficios de estas innovaciones en la industria. / [CA] La Realitat Augmentada (Augmented Reality, AR) i la seua capacitat per integrar contingut sintètic sobre una imatge real ofereix un valor incalculable en diversos camps; no obstant això, la indústria és un d'aquests camps que més pot aprofitar-se'n. Com a tecnologia clau en l'evolució cap a la Indústria 4.0 i 5.0, l'AR no només complementa sinó que també potencia la interacció humana amb els processos industrials. En aquest context, l'AR es converteix en una eina essencial que no substitueix al factor humà, sinó que l'enriqueix, ampliant les seues capacitats i facilitant una col·laboració més efectiva entre humans i tecnologia. Esta integració de l'AR en entorns industrials no solament millora l'eficiència i precisió de les tasques, sinó que també obri noves possibilitats per a l'expansió del potencial humà.
Existeixen nombroses formes en què l'ésser humà interactua amb la tecnologia, sent l'AR un dels paradigmes més innovadors respecte a com els usuaris accedeixen a la informació; no obstant això, és crucial reconéixer que l'AR, per si mateixa, té limitacions quant a la interpretació del contingut que visualitza. Encara que en l'actualitat podem accedir a diferents llibreries que utilitzen algoritmes per a realitzar una detecció d'imatges, objectes, o fins i tot entorns, sorgeix una pregunta fonamental: fins a quin punt pot l'AR comprendre el context d'allò veu? Esta qüestió esdevé especialment rellevant en entorns industrials. Pot l'AR discernir si una màquina està funcionant correctament, o el seu rol es limita a la presentació d'indicadors digitals superposats? La resposta a estes qüestions subratllen tant el potencial com els límits de l'AR, impulsant la recerca d'innovacions que permeten una major comprensió contextual i adaptabilitat a situacions específiques dins de la indústria.
En el nucli d'esta tesi jau l'objectiu de no solament dotar a l'AR d'una "intel·ligència semàntica" capaç d'interpretar i adaptar-se al context, sinó també d'ampliar i enriquir les formes en què els usuaris interactuen amb esta tecnologia. Aquest enfocament s'orienta particularment a millorar l'accessibilitat i l'eficiència de les aplicacions d'AR en entorns industrials, que són de naturalesa restringida i complexos. La intenció és anar un pas més enllà dels límits tradicionals de l'AR, proporcionant eines més intuïtives i adaptatives per als operaris en els entorns esmentats.
La recerca es desplega a través de tres articles d'investigació, on s'ha desenvolupat i avaluat una arquitectura multimodal progressiva. Esta arquitectura integra diverses modalitats d'interacció usuari-tecnologia, com el control per veu, la manipulació directa i el feedback visual en AR. A més, s'incorporen tecnologies avançades basades en models d'aprenentatge automàtic (ML) i aprenentatge profund (DL) per a extreure i processar informació semàntica de l'entorn. Cada article construeix sobre l'anterior, demostrant una evolució en la capacitat de l'AR per a interactuar de manera més intel·ligent i contextual amb el seu entorn, i ressaltant l'aplicació pràctica i els beneficis d'estes innovacions en la indústria. / [EN] Augmented Reality (AR) and its ability to integrate synthetic content over a real image provides invaluable value in various fields; however, the industry is one of these fields that can benefit most from it. As a key technology in the evolution towards Industry 4.0 and 5.0, AR not only complements but also enhances human interaction with industrial processes. In this context, AR becomes an essential tool that does not replace the human factor but enriches it, expanding its capabilities and facilitating more effective collaboration between humans and technology. This integration of AR in industrial environments not only improves the efficiency and precision of tasks but also opens new possibilities for expanding human potential.
There are numerous ways in which humans interact with technology, with AR being one of the most innovative paradigms in how users access information; however, it is crucial to recognize that AR, by itself, has limitations in terms of interpreting the content it visualizes. Although today we can access different libraries that use algorithms for image, object, or even environment detection, a fundamental question arises: To what extent can AR understand the context of what it sees? This question becomes especially relevant in industrial environments. Can AR discern if a machine functions correctly, or is its role limited to presenting superimposed digital indicators? The answer to these questions underscores both the potential and the limits of AR, driving the search for innovations that allow for greater contextual understanding and adaptability to specific situations within the industry.
At the core of this thesis lies the objective of not only endowing AR with "semantic intelligence" capable of interpreting and adapting to context, but also of expanding and enriching the ways users interact with this technology. This approach mainly aims to improve the accessibility and efficiency of AR applications in industrial environments, which are by nature restricted and complex. The intention is to go beyond the traditional limits of AR, providing more intuitive and adaptive tools for operators in these environments.
The research unfolds through three articles, where a progressive multimodal architecture has been developed and evaluated. This architecture integrates various user-technology interaction modalities, such as voice control, direct manipulation, and visual feedback in AR. In addition, advanced technologies based on Machine Learning (ML) and Deep Learning (DL) models are incorporated to extract and process semantic information from the environment. Each article builds upon the previous one, demonstrating an evolution in AR's ability to interact more intelligently and contextually with its environment, and highlighting the practical application and benefits of these innovations in the industry. / Izquierdo Doménech, JJ. (2024). Enhancing Industrial Process Interaction Using Deep Learning, Semantic Layers, and Augmented Reality [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/205523 / Compendio
|
400 |
Intelligent ECG Acquisition and Processing System for Improved Sudden Cardiac Arrest (SCA) Prediction
Kota, Venkata Deepa 12 1900
The survival rate for sudden cardiac arrest (SCA) is very low, with fewer than one in ten surviving; most SCAs occur outside of a hospital setting. There is a need to develop an effective and efficient system that can sense, communicate, and remediate potential SCA situations on a near real-time basis. This research presents novel zeolite-PDMS-based, optically unobtrusive, flexible dry electrodes for biosignal acquisition from various subjects while at rest and in motion. Two zeolite crystals (4A and 13X) are used to fabricate the electrodes. Three different sizes and two different filler concentrations are compared to identify the better-performing electrode for electrocardiogram (ECG) data acquisition. A low-power, low-noise amplifier with chopper modulation is designed and implemented using the standard 180 nm CMOS process. A commercial off-the-shelf (COTS) based wireless system is designed for transmitting ECG signals. Further, this dissertation provides a framework for machine learning classification algorithms on large, open-source arrhythmia and SCA datasets. Supervised models with features as the input data and deep learning models with raw ECG as input are compared using different methods. The machine learning tool classifies the datasets within a few minutes, saving time and effort for physicians. The experimental results show promising progress towards a wireless ECG recording system combined with efficient machine learning models that can positively impact SCA outcomes.
|