1

Efficient image/video restyling and collage on GPU. / CUHK electronic theses & dissertations collection

January 2013
Image/video restyling, as an expressive way of producing user-customized appearances, has received much attention in creative media research. In interactive design, it is powerful to virtually re-render the stylized appearance of objects of interest using computer-aided design tools for retexturing, especially in image space with only a single image or video as input. Current retexturing methods mostly handle texture distortion by manipulating inter-pixel distances in image space, so the underlying texture distortion is often destroyed: existing approaches suffer either from improper distortion caused by manual mesh stretching or from unavoidable texture splitting caused by texture synthesis. Image/video collage techniques were invented to allow parallel presentation of multiple objects and events on the display canvas. With the rapid development of digital video capture devices, a related issue is how to quickly review and summarize such large visual media datasets to find the material of interest. Investigating long, monotonous surveillance videos to grasp the essential information quickly is a tedious task. By using key information and shortened video forms as vehicles for communication, video abstraction and summarization are the means to improve browsing efficiency and ease of understanding for visual media datasets.
In this thesis, we first focus our image/video restyling work on efficient retexturing and stylization. We present an interactive retexturing method that preserves similar texture distortion without knowledge of the underlying geometry and lighting environment. We utilize SIFT corner features to naturally discover the underlying texture distortion, and apply gradient depth recovery and wrinkle stress optimization to accomplish the distortion process. We facilitate interactive retexturing via real-time bilateral grids and feature-guided distortion optimization using GPU-CUDA parallelism. Video retexturing is achieved through a keyframe-based texture-transfer strategy using accurate TV-L¹ optical flow with patch motion tracking in real time. Further, we work on GPU-based abstract stylization that preserves the fine structure of the original images using gradient optimization. We propose an image structure map to naturally distill the fine structure of the original images. Gradient-based tangent generation and tangent-guided morphology are applied to build the structure map. We facilitate the final stylization via parallel bilateral grids and structure-aware stylizing in real time on GPU-CUDA. In the experiments, our proposed methods consistently demonstrate high-quality image/video abstract restyling in real time.
Currently, in video abstraction, video collages are mostly produced as static keyframe-based collage pictures, which contain limited information about dynamic videos and greatly affect the understanding of visual media datasets. We present a dynamic video collage that effectively summarizes condensed dynamic activities in parallel on the canvas for easy browsing. We propose to utilize activity cuboids to reorganize and extract dynamic objects for collaging; video stabilization is performed to generate stabilized activity cuboids, and spatio-temporal optimization is carried out to optimize the positions of the activity cuboids in the 3D collage space. We facilitate efficient dynamic collage via event-similarity and moving-relationship optimization on the GPU, allowing multi-video input. Our video collage approach with kernel-reordering CUDA processing enables dynamic summaries for easy browsing of long videos, while saving huge memory space for storing and transmitting them. Experiments and a user study have shown the efficiency and usefulness of our dynamic video collage, which can be widely applied in video briefing and summary applications. In the future, we will extend the interactive retexturing to more complicated general video applications with large motion and occluded scenes while avoiding texture flicking. We will also work on new approaches to make video retexturing more stable, drawing inspiration from the latest video processing techniques. Our future work on video collage includes investigating applications of dynamic collage in the surveillance industry, and handling moving-camera and general videos, which may contain large amounts of camera motion and different types of shot transitions.
/ Li, Ping. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2013. / Includes bibliographical references (leaves 109-121). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012]. System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts also in Chinese.
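
As a concrete illustration of the keyframe-based texture transfer this abstract describes, here is a minimal sketch: a texture composited onto a keyframe is propagated to a later frame by backward-warping it along a dense optical-flow field. The thesis pipeline uses a real-time TV-L¹ solver with patch motion tracking on the GPU; the stock OpenCV Farneback flow below is only a stand-in, and the function name is illustrative.

```python
import cv2
import numpy as np

def propagate_texture(keyframe_gray, next_gray, textured_keyframe):
    """Warp a textured keyframe toward the next frame via dense optical flow."""
    # Flow from the next frame back to the keyframe, so each pixel of the
    # new frame knows where to sample the keyframe (backward warping).
    # A TV-L1 solver (e.g. from opencv-contrib) could be swapped in here.
    flow = cv2.calcOpticalFlowFarneback(
        next_gray, keyframe_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    h, w = next_gray.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(textured_keyframe, map_x, map_y, cv2.INTER_LINEAR)
```

In the full pipeline the warped texture would be re-composited with the recovered shading and wrinkle stress rather than pasted directly.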
Abstract --- p.i
Acknowledgements --- p.v
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Background --- p.1
Chapter 1.2 --- Main Contributions --- p.5
Chapter 1.3 --- Thesis Overview --- p.7
Chapter 2 --- Efficient Image/video Retexturing --- p.8
Chapter 2.1 --- Introduction --- p.8
Chapter 2.2 --- Related Work --- p.11
Chapter 2.3 --- Image/video Retexturing on GPU --- p.16
Chapter 2.3.1 --- Wrinkle Stress Optimization --- p.19
Chapter 2.3.2 --- Efficient Video Retexturing --- p.24
Chapter 2.3.3 --- Interactive Parallel Retexturing --- p.29
Chapter 2.4 --- Results and Discussion --- p.35
Chapter 2.5 --- Chapter Summary --- p.41
Chapter 3 --- Structure-Aware Image Stylization --- p.43
Chapter 3.1 --- Introduction --- p.43
Chapter 3.2 --- Related Work --- p.46
Chapter 3.3 --- Structure-Aware Stylization --- p.50
Chapter 3.3.1 --- Approach Overview --- p.50
Chapter 3.3.2 --- Gradient-Based Tangent Generation --- p.52
Chapter 3.3.3 --- Tangent-Guided Image Morphology --- p.54
Chapter 3.3.4 --- Structure-Aware Optimization --- p.56
Chapter 3.3.5 --- GPU-Accelerated Stylization --- p.58
Chapter 3.4 --- Results and Discussion --- p.61
Chapter 3.5 --- Chapter Summary --- p.66
Chapter 4 --- Dynamic Video Collage --- p.67
Chapter 4.1 --- Introduction --- p.67
Chapter 4.2 --- Related Work --- p.70
Chapter 4.3 --- Dynamic Video Collage on GPU --- p.74
Chapter 4.3.1 --- Activity Cuboid Generation --- p.75
Chapter 4.3.2 --- Spatial-Temporal Optimization --- p.80
Chapter 4.3.3 --- GPU-Accelerated Parallel Collage --- p.86
Chapter 4.4 --- Results and Discussion --- p.90
Chapter 4.5 --- Chapter Summary --- p.100
Chapter 5 --- Conclusion --- p.101
Chapter 5.1 --- Research Summary --- p.101
Chapter 5.2 --- Future Work --- p.104
Chapter A --- Publication List --- p.107
Bibliography --- p.109
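
The dynamic-collage half of the abstract also lends itself to a small sketch. Assuming activity cuboids have already been extracted and stabilized, the placement step can be caricatured as minimizing pairwise overlap of (x, y, t) boxes in the 3D collage volume. The greedy search below is an invented stand-in; the thesis optimizes event similarity and moving relationships on the GPU.

```python
import itertools

def overlap(a, b):
    """Overlap volume of two cuboids given as (x0, y0, t0, x1, y1, t1)."""
    dx = min(a[3], b[3]) - max(a[0], b[0])
    dy = min(a[4], b[4]) - max(a[1], b[1])
    dt = min(a[5], b[5]) - max(a[2], b[2])
    return max(dx, 0) * max(dy, 0) * max(dt, 0)

def place_cuboids(cuboids, canvas_w, canvas_h, step=20):
    """Greedily place each (width, height, t0, t1) cuboid on the canvas."""
    placed = []
    for (w, h, t0, t1) in cuboids:
        best, best_cost = None, None
        for x, y in itertools.product(range(0, canvas_w - w + 1, step),
                                      range(0, canvas_h - h + 1, step)):
            cand = (x, y, t0, x + w, y + h, t1)
            cost = sum(overlap(cand, p) for p in placed)
            if best_cost is None or cost < best_cost:
                best, best_cost = cand, cost
        placed.append(best)
    return placed

print(place_cuboids([(100, 80, 0, 50), (120, 90, 10, 60)], 640, 360))
```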
2

Multi-frame information fusion for image and video enhancement

Gunturk, Bahadir K. 01 December 2003
No description available.
3

The iterative frame : algorithmic video editing, participant observation & the black box

Rapoport, Robert S. January 2016
Machine learning is increasingly involved in both our production and consumption of video. One symptom of this is the appearance of automated video editing applications. As this technology spreads rapidly to consumers, the need for substantive research about its social impact grows. To this end, this project maintains a focus on video editing as a microcosm of larger shifts in cultural objects co-authored by artificial intelligence. The window in which this research occurred (2010-2015) saw machine learning move increasingly into the public eye, and with it ethical concerns. What follows is, on the most abstract level, a discussion of why these ethical concerns are particularly urgent in the realm of the moving image. Algorithmic editing consists of software instructions that automate the creation of timelines of moving images. The criteria this software uses to query a database are variable. Algorithmic authorship already exists in other media, but I will argue that the moving image is a separate case insofar as software can generate the raw material of text and music on its own. The performance of a trained actor still cannot be generated by software. Thus, my focus is on the relationship between live embodied performance and the subsequent algorithmic editing of that footage. This is a process that can employ other software, like computer vision (to analyze the content of video) and predictive analytics (to guess what kind of automated film to make for a given user). How is performance altered when it has to communicate to human and non-human alike? The ritual of the iterative frame gives literal form to something that throughout human history has been a projection: the omniscient participant observer, more commonly known as the Divine. We experience black-boxed software (AIs, specifically neural networks, which are intrinsically opaque) as functionally omniscient and tacitly allow it to edit more and more of life (e.g. filtering articles, playlists and even potential spouses). As long as it remains disembodied, we will continue to project the Divine onto the black box, causing cultural anxiety. In other words, predictive analytics alienate us from the source code of our cultural texts. The iterative frame, then, is a space in which these forces can be inscribed on the body, and hence narrated. The algorithmic editing of content is already taken for granted. The editing of moving images, in contrast, still requires a human hand. We need to understand the social power of moving-image editing before it is delegated to automation. Practice Section: This project is practice-led, meaning that the portfolio of work was produced as it was being theorized. To underscore this, the portfolio comes at the end of the document. Video editors use artificial intelligence (AI) in a number of different applications, from deciding the sequencing of timelines to using facial and language detection to find actors in archives. This changes traditional production workflows on a number of levels. How can the single decision to cut between two frames of video speak to the larger epistemological shifts brought on by predictive analytics and Big Data (upon which they rely)? When predictive analytics begin modeling the world of moving images, how will our own understanding of the world change? In the practice-based section of this thesis, I explore how these shifts will change the way in which actors might approach performance. What does a gesture mean to AI, and how will the editor decontextualize it?
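
A minimal sketch of "algorithmic editing" as this abstract defines it, software querying a database of shots by variable criteria and assembling a timeline, might look as follows. The scoring rule is invented for illustration; a real system would substitute computer-vision tags and a per-user predictive model at the points noted in the comments.

```python
from dataclasses import dataclass

@dataclass
class Shot:
    clip_id: str
    duration: float     # seconds
    tags: frozenset     # e.g. labels from a vision model
    face_present: bool  # e.g. from a face detector

def assemble_timeline(shots, query_tags, target_length):
    """Pick the best-matching shots and order them into a timeline."""
    def score(shot):
        # Stand-in criterion: tag overlap plus a bonus for visible faces.
        # Predictive analytics would replace this with a learned model.
        return len(shot.tags & query_tags) + (0.5 if shot.face_present else 0.0)

    timeline, used = [], 0.0
    for shot in sorted(shots, key=score, reverse=True):
        if used + shot.duration > target_length:
            continue
        timeline.append(shot)
        used += shot.duration
    return timeline

shots = [
    Shot("a01", 4.0, frozenset({"beach", "sunset"}), False),
    Shot("a02", 3.0, frozenset({"beach", "crowd"}), True),
    Shot("a03", 6.0, frozenset({"office"}), True),
]
print([s.clip_id for s in assemble_timeline(shots, frozenset({"beach"}), 8.0)])
```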
The set of a video shoot that will employ an element of AI in editing represents a move towards the ritualization of production, summarized in the term the 'iterative frame'. The portfolio contains eight works that treat the set as a microcosm of larger shifts in the production of culture. There is, I argue, metaphorical significance in the changing understanding of terms like 'continuity' and 'sync' on the AI-watched set. Theory Section: In the theoretical section, the approach is broadly comparative. I contextualize the current dynamic by looking at previous shifts in technology that changed the relationship between production and post-production, notably the lightweight recording technology of the 1960s. This section also draws on debates in ethnographic filmmaking about the matching of film and ritual. In this body of literature, there is a focus on how participant observation can be formalized in film. Triangulating between event, participant observer and edit grammar in ethnographic filmmaking provides a useful analogy for understanding how AI as film editor might function in relation to contemporary production. Rituals occur in a frame that is dependent on a spatially/temporally separate observer. This dynamic also exists on sets bound for post-production involving AI. The convergence of film grammar and ritual grammar occurred in the 1960s under the banner of cinéma vérité, in which the relationship between participant observer/ethnographer and the subject became most transparent. In Rouch and Morin's Chronicle of a Summer (1961), reflexivity became ritualized in the form of on-screen feedback sessions. The edit became transparent: the black box of cinema disappeared. Today, as artificial intelligence enters the film production process, this relationship begins to reverse: feedback, while it exists, becomes less transparent. The weight of the feedback ritual gets gradually shifted from presence and production to montage and post-production. Put differently, in cinéma vérité, the participant observer was most present in the frame. As participant observation gradually becomes shared with code, it becomes more difficult to give it an embodied representation, and thus its presence is felt more in the edit of the film. The relationship between the ritual actor and the participant observer (the algorithm) is completely mediated by the edit, a reassertion of the black box, where once it had been transparent. The crucible for looking at the relationship between algorithmic editing, participant observation and the black box is the subject in trance. In ritual trance the individual is subsumed by collective codes. Long before the advent of automated editing, trance was an epistemological problem posed to film editing. In the iterative frame, for the first time, film grammar can echo ritual grammar and indeed become continuous with it. This occurs through removing the act of cutting from the causal world, and projecting this logic of post-production onto performance. Why does this occur? Ritual, and specifically ritual trance, is the moment when a culture gives embodied form to what it could not otherwise articulate. The trance of predictive analytics, the AI that increasingly choreographs our relationship to information, is the ineffable that finds form in the iterative frame. In the iterative frame, a gesture never exists in a single instance, but in a potential state.
The performers in this frame begin to understand themselves in terms of how automated indexing processes reconfigure their performance. To the extent that gestures are complicit with this mode of databasing, they can be seen as votive toward the algorithmic. The practice section focuses on the poetics of this position. Chapter One focuses on cinéma vérité as a moment in which the relationship between production and post-production shifted as a function of more agile recording technology, allowing the participant observer to enter the frame. This shift becomes a lens through which to look at changes that AI might bring. Chapter Two treats the work of Pierre Huyghe as a 'liminal phase' in which a new relationship between production and post-production is explored. Finally, Chapter Three looks at a film in which actors perform with the awareness that the footage will be processed by an algorithmic edit. / The conclusion looks at how this way of relating to AI, especially commercial AI, through embodied performance could foster a more critical relationship to the proliferating black-boxed modes of production.
4

A new adaptive trilateral filter for in-loop filtering

Kesireddy, Akitha January 2014
Indiana University-Purdue University Indianapolis (IUPUI) / HEVC has achieved significant coding-efficiency improvement beyond existing video coding standards by employing many new coding tools. The Deblocking Filter, Sample Adaptive Offset and Adaptive Loop Filter are the in-loop filters currently introduced in the HEVC standardization. However, these filters are implemented in the spatial domain, despite the temporal correlation within video sequences. To reduce artifacts and better align object boundaries in video, a new in-loop filtering algorithm is proposed and implemented in the HM-11.0 reference software. The proposed algorithm allows an average bitrate reduction of about 0.7% and improves the PSNR of the decoded frame by 0.05%, 0.30% and 0.35% in the luma and the two chroma components, respectively.
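
The abstract does not spell out the filter's form, but the "trilateral" idea it names can be sketched generically: a bilateral filter's spatial and range kernels extended with a third, temporal kernel that weights agreement with the co-located pixel of the previous reconstructed frame, pulling in the temporal correlation the abstract says existing in-loop filters ignore. Everything below (the sigmas, the temporal term, the naive per-pixel loop) is an assumption for illustration, not the thesis's algorithm.

```python
import numpy as np

def trilateral_pixel(cur, prev, y, x, radius=2,
                     sigma_s=1.5, sigma_r=10.0, sigma_t=10.0):
    """Filter one pixel of `cur` (float arrays) with spatial, range and
    temporal kernels; `prev` is the previous reconstructed frame."""
    acc, wsum = 0.0, 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ny, nx = y + dy, x + dx
            if not (0 <= ny < cur.shape[0] and 0 <= nx < cur.shape[1]):
                continue
            w = (np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))              # spatial
                 * np.exp(-(cur[ny, nx] - cur[y, x]) ** 2 / (2 * sigma_r ** 2))  # range
                 * np.exp(-(cur[ny, nx] - prev[ny, nx]) ** 2 / (2 * sigma_t ** 2)))  # temporal
            acc += w * cur[ny, nx]
            wsum += w
    return acc / wsum
```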
