
ImageSI: Interactive Deep Learning for Image Semantic Interaction

Interactive deep learning frameworks are crucial for effectively exploring and analyzing complex image datasets in visual analytics. However, existing approaches often face challenges related to inference accuracy and adaptability. To address these issues, we propose ImageSI, a framework integrating deep learning models with semantic interaction techniques for interactive image data analysis. Unlike traditional methods, ImageSI directly incorporates user feedback into the image model, updating underlying embeddings through customized loss functions, thereby enhancing the performance of dimension reduction tasks. We introduce three variants of ImageSI: ImageSI$_{\text{MDS}^{-1}}$, which prioritizes explicit pairwise relationships from user interaction, and ImageSI$_{\text{DRTriplet}}$ and ImageSI$_{\text{PHTriplet}}$, which emphasize clustering by defining groups of images based on user input. Through usage scenarios and algorithm-centered quantitative analyses, we demonstrate the superior performance of ImageSI$_{\text{DRTriplet}}$ and ImageSI$_{\text{MDS}^{-1}}$ in terms of inference accuracy and interaction efficiency. Moreover, ImageSI$_{\text{PHTriplet}}$ shows competitive results. The baseline model, WMDS$^{-1}$, generally exhibits lower performance metrics.

Master of Science
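The abstract does not spell out the customized loss functions, so the following is only a minimal sketch of the general idea behind a triplet-style objective driven by user-defined groups. It uses the standard triplet margin loss, which may differ from the thesis's actual DRTriplet and PHTriplet formulations. All names here (`triplet_margin_loss`, `triplets_from_groups`, `mean_group_loss`, the `margin` parameter) are hypothetical and chosen for illustration.

```python
import math
from itertools import permutations


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (an assumption, not the thesis's confirmed loss).

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def triplets_from_groups(groups):
    """Build (anchor, positive, negative) id triplets from user-defined groups.

    `groups` maps a hypothetical group label to the ids of the images the user
    placed together. Anchor and positive share a group; the negative comes
    from any other group.
    """
    triplets = []
    for label, members in groups.items():
        outsiders = [i for other, ids in groups.items() if other != label for i in ids]
        for a, p in permutations(members, 2):
            for n in outsiders:
                triplets.append((a, p, n))
    return triplets


def mean_group_loss(embeddings, groups, margin=1.0):
    """Average triplet loss over all triplets implied by the groups."""
    triplets = triplets_from_groups(groups)
    if not triplets:
        return 0.0
    total = sum(
        triplet_margin_loss(embeddings[a], embeddings[p], embeddings[n], margin)
        for a, p, n in triplets
    )
    return total / len(triplets)
```

In an interactive setting, a loss of this kind would typically be minimized with respect to the model parameters that produce the embeddings, so that images the user grouped together move closer and images from different groups move apart. The sketch only evaluates the loss and does not perform that update.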

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/119283
Date: 04 June 2024
Creators: Lin, Jiayue
Contributors: Computer Science & Applications, North, Christopher L., Faust, Rebecca Jane, Huang, Lifu
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Language: English
Detected Language: English
Type: Thesis
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
