21 |
Are AI-Photographers Ready for Hire? : Investigating the possibilities of AI generated images in journalism
Breuer, Andrea; Jonsson, Isac. January 2023
In today’s information era, many news outlets are competing for attention. One way to cut through the noise is to use images. Obtaining images can be both time-consuming and expensive for smaller news agencies. In collaboration with the Swedish news agency Newsworthy, we investigate the possibilities of using AI-generated images in a journalistic context. Using images generated with the text-to-image generation model Stable Diffusion, we aim to answer the research question "How do the parameters in Stable Diffusion affect the applicability of the generated images for journalistic purposes?" A total of 511 images are generated with different Stable Diffusion parameter settings and rated on a scale of 1-5 by three journalists at Newsworthy. The data is analyzed using ordinal logistic regression. The results suggest that the optimal value for the Stable Diffusion parameter classifier-free guidance is around 10-12, the default 50 iterations are sufficient, and keywords do not significantly affect the image outcome. The parameter that has the single greatest effect on the outcome is the prompt. Thus, to generate photo-realistic images that can be used in a journalistic context, most thought and effort should be put towards formulating a suitable prompt.
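As a rough illustration of the analysis method named in this abstract: the proportional-odds form of ordinal logistic regression models the cumulative probability of a rating at or below level k as a logistic function of a cut point minus a linear predictor. The sketch below is not the authors' code; the cut points and the coefficient are made-up values chosen only to show how a positive effect shifts probability mass toward higher ratings on a 1-5 scale.

```python
import math


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def rating_probabilities(linear_predictor: float, cut_points: list[float]) -> list[float]:
    """Return P(rating = k) for k = 1..len(cut_points) + 1.

    Proportional-odds model: P(rating <= k) = sigmoid(cut_point_k - linear_predictor).
    The cut points must be in increasing order.
    """
    cumulative = [sigmoid(c - linear_predictor) for c in cut_points] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical values, for illustration only (not estimates from the thesis).
CUT_POINTS = [-1.5, -0.5, 0.5, 1.5]  # four cut points give five rating levels
BETA_PROMPT = 0.8                    # assumed effect of a "well-formed prompt" indicator

baseline = rating_probabilities(0.0, CUT_POINTS)
with_prompt = rating_probabilities(BETA_PROMPT * 1.0, CUT_POINTS)

for level, (p0, p1) in enumerate(zip(baseline, with_prompt), start=1):
    print(f"rating {level}: baseline {p0:.3f}, with prompt effect {p1:.3f}")
```

In a real analysis the cut points and coefficients would be estimated from the rated images rather than fixed by hand.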
|
22 |
Fotografering med AI i bilden : En intervjustudie med svenska fotografer om det fotografiska yrket med teknologin artificiell intelligens
Avelin Belin, Adam; Geidemark, Oscar. January 2024
At the turn of the century, photographers began a transition from analogue to digital photography, a transition that took years to complete. The photographic landscape is now in a new period of development, driven by new AI programs based on machine learning. These programs fall into two groups: (I) editing programs that aim to streamline the photographic workflow or correct pictures with technical flaws, for example by reducing grain from a high ISO or enlarging pictures, and (II) so-called text-to-image programs, such as DALL-E 3, Midjourney, Stable Diffusion and Firefly 2, which generate images from prompts written by an AI artist. In this study we conducted seven semi-structured interviews with Swedish photographers. We used a postphenomenological theory based on Don Ihde’s philosophy and analyzed the interview material using narrative analysis. The results showed that photographers who worked in advertising, with organizations or clients had a higher tolerance for image manipulation with (II). These photographers valued effectiveness in their workflow and saw a greater need to adapt to new technology, while nature photographers valued authenticity and used (I) sparingly. Another result was that Swedish photographers do not consider (II) to be photography and do not think that AI-generated images should be allowed to compete in photo competitions, except in categories reserved for AI-generated images.
|
23 |
Quantitative and Qualitative Analysis of Text-to-Image models
Masrourisaadat, Nila. 30 August 2023
The field of image synthesis has seen significant progress recently, including great strides with generative models like Generative Adversarial Networks (GANs), Diffusion Models, and Transformers.
These models have shown they can create high-quality images from a variety of text prompts. However, a comprehensive analysis that examines both their performance and possible biases is often missing from existing research.
In this thesis, I undertake a thorough examination of several leading text-to-image models, namely Stable Diffusion, DALL-E Mini, Lafite, and Ernie-ViLG. I assess their performance in generating accurate images of human faces, groups, and specified numbers of objects, using both Fréchet Inception Distance (FID) scores and R-precision as my evaluation metrics. Moreover, I uncover inherent gender or social biases these models may possess.
My research reveals a noticeable bias in these models, which show a tendency towards generating images of white males, thus under-representing minorities in their output of human faces. This finding contributes to the broader dialogue on ethics in AI and sets the stage for further research aimed at developing more equitable AI systems.
Furthermore, based on the metrics I used for evaluation, the Stable Diffusion model outperforms the others in generating images from text prompts. This information could be particularly useful for researchers and practitioners trying to choose the most effective model for their future projects.
To facilitate further research in this field, I have made my findings, the related data, and the source code publicly available. / Master of Science / In my research, I explored how cutting-edge computer models, namely Stable Diffusion, DALL-E Mini, Lafite, and Ernie-ViLG, can create images from text descriptions, a process that holds exciting possibilities for the future. However, these technologies aren't without their challenges. An important finding from my study is that these models exhibit bias, e.g., they often generate images of white males more than they do of other races and genders. This suggests they're not representing our diverse society fairly. Among these models, Stable Diffusion outperforms the others at creating images from text prompts, which is valuable information for anyone choosing a model for their projects. To help others learn from my work and build upon it, I've made all my data, findings, and the code I used in this study publicly available. By sharing this work, I hope to contribute to improving this technology, making it even better and fairer for everyone in the future.
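The abstract above names R-precision as one of its two metrics without spelling out the computation. As a minimal sketch of the general retrieval definition (the fraction of the top R ranked candidates that are relevant, where R is the number of relevant items), and not the thesis's own evaluation code, it could look like this; the caption identifiers are hypothetical.

```python
def r_precision(ranked_ids: list[str], relevant_ids: set[str]) -> float:
    """Fraction of the top-R ranked candidates that are relevant.

    R is the number of relevant items. For text-to-image evaluation the
    candidates are typically captions ranked by similarity to a generated image.
    """
    r = len(relevant_ids)
    if r == 0:
        raise ValueError("at least one relevant item is required")
    top_r = ranked_ids[:r]
    hits = sum(1 for candidate in top_r if candidate in relevant_ids)
    return hits / r


# Hypothetical ranking of captions for one generated image.
ranking = ["caption_3", "caption_1", "caption_7", "caption_2"]
ground_truth = {"caption_1", "caption_2"}

score = r_precision(ranking, ground_truth)
print(score)
```

Here R is 2, the top two candidates are `caption_3` and `caption_1`, and only one of them is relevant. How the similarity ranking itself is produced (for example, with an image-text encoder) is outside this sketch.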
|
24 |
Assisted Prompt Engineering : Making Text-to-Image Models Available Through Intuitive Prompt Applications / Assisterad Prompt Engineering : Gör Text-till-Bild Modeller Tillgängliga Med Intuitiva Prompt Applikationer
Björnler, Zimone. January 2024
This thesis explores the application of prompt engineering combined with human-AI interaction (HAII) to make text-to-image (TTI) models more accessible and intuitive for non-expert users. The thesis research focuses on developing an application with an intuitive interface that enables users to generate images without extensive knowledge of prompt engineering. A pre-post study was conducted to evaluate the application, demonstrating significant improvements in user satisfaction and ease of use. The findings suggest that such tailored interfaces can make AI technologies more accessible, empowering users to engage creatively with minimal technical barriers. This study contributes to the fields of Media technology and AI by showcasing how simplifying prompt engineering can enhance the accessibility of generative AI tools. / Detta examensarbete utforskar tillämpningen av prompt engineering i kombination med human-AI interaction för att göra text-till-bild modeller mer tillgängliga och intuitiva för icke-experter. Forskningen för examensarbetet fokuseras på att utveckla en applikation med ett intuitivt gränssnitt som gör det möjligt för användare att generera bilder utan omfattande kunskaper om prompt engineering. En före-efter-studie genomfördes för att utvärdera applikationen, vilket visade på en tydlig ökning i användarnöjdhet och användarvänlighet. Utfallet från studien tyder på att skräddarsydda gränssnitt kan göra AI-tekniken mer tillgänglig, och göra det möjligt för användare att nyttja det kreativa skapandet med minimerade tekniska hinder. Den här studien bidrar till områdena av medieteknik och AI genom att demonstrera hur prompt engineering kan förenklas vilket kan förbättra tillgängligheten av AI-verktyg.
|
25 |
Ariane, vision parlante ? : l’ekphrasis illusionniste chez Catulle et les épigrammatistes hellénistiques / Ariadne, a speaking vision? : illusionist ekphraseis in Catullus and Hellenistic epigrams
Iff-Noël, Flora. 04 July 2019
Catulle, dans le poème 64, invente une ekphrasis d’un nouveau genre : au lieu de décrire une œuvre d’art dans sa matérialité pour la mettre sous les yeux des lecteurs selon la tradition rhétorique, il fait parler son personnage principal, Ariane. En quoi la figure d’Ariane a-t-elle permis à Catulle d’entériner une évolution de l’ekphrasis entamée par la littérature hellénistique, à savoir la focalisation non sur la matérialité de l’objet, mais sur son sens, une réflexion sur les liens entre vision et diction ? Il convient d’éclairer ce poème majeur de la littérature latine en le réintégrant, d’une part, aux multiples représentations figurées d’Ariane dans l’Antiquité et, d’autre part, à la lignée des ekphraseis précédentes, concept entendu au sens de « texte consacré à une œuvre d’art » pour inclure descriptions mais aussi narrations ou courts dialogues comme ceux des épigrammes ecphrastiques. En particulier, la prise de parole de l’objet d’art se révèle un topos épigrammatique hellénistique qui nécessite une étude systématique. Ce motif, baptisé topos de l’illusionnisme de l’art, mesure la qualité d’une œuvre d’art à sa capacité à sembler sur le point de parler, se mouvoir ou prendre vie. La typologie de ce topos met en évidence l’évolution de l’esthétique et de la relation entre poésie et arts figurés. Le poème 64 de Catulle se révèle alors reprendre ce topos – comme de nombreux textes après lui – pour constituer une surenchère illusionniste dans l’ekphrasis où l’œuvre d’art prend vie. La poétique de Catulle trouve un éclairage nouveau qui permet de mieux tracer la réception de l’esthétique alexandrine à Rome et l’influence de Catulle sur les poètes latins postérieurs. / This interdisciplinary dissertation uses text and image studies, intertextuality and metapoetics to analyze the relationships between vision and diction in ekphraseis understood as texts devoted to works of art, and particularly in Catullus’s canonical poem 64. 
Poem 64 has puzzled many critics with its “disobedient ekphrasis” of a coverlet: not only does it scarcely describe its subject, but it turns into a long monologue by Ariadne, the main figure woven into the coverlet. I argue that, far from disregarding the coverlet, Catullus elaborates on a topos of Hellenistic ekphrastic epigrams that measures an artwork’s value by its illusionist capacity to “seem about to speak” and “come to life”. My extensive classification of the epigrammatic variants of this topos reveals its presence in Catullus through specific keywords. Ariadne’s representation on the coverlet is so lifelike that it starts to speak. Instead of following the critical tradition that considers Ariadne’s speech another instance of epic or tragic monologue, I analyze it as a major Catullan innovation, in dialogue with the aesthetic debates of his day. Bringing together Hellenistic and Roman figurative arts and literatures sheds new light on Catullan poetics and, more generally, on the reception of Alexandrian aesthetics in Rome and on Catullus’s influence on later Latin poets.
|
26 |
"Det är bara att hälla in texten" : – En studie om att anpassa ett tryckt läromedel till digitalt format
Attersand, Åsa. January 2018
Abstract: This study examines what happens to the text and image content of a printed teaching material when it is adapted to a digital format. The purpose of my work is to find out whether readability changes in the new medium. The theoretical focus of the study is readability, the interplay of text and image, and multimodality. To find out whether the digital format affects readability, I performed two text analyses: a readability analysis of the printed material and a comparative analysis of the digital material. I also conducted interviews with teachers and experts in educational publishing. A test and a participant observation were conducted with the target group, in which my improvement proposal was compared with the existing material. My conclusions are that there must be clear links between text and image in a digital teaching material. Color markings in body text with word explanations can interfere with the reading rhythm, and too many heading levels can confuse the reader. The most important conclusion, however, is that how readable a text is depends on the audience's previous knowledge, reading goals and motivation for the subject. A factual text with difficult words does not necessarily become easier when it is split into short paragraphs; the content can instead become harder to understand. / Sammanfattning: Den här studien handlar om vad som händer med text och bildinnehåll i ett tryckt läromedel när det ska anpassas till ett digitalt format. Syftet med mitt arbete är att ta reda på om läsbarheten förändras i det nya mediet. Studiens teoretiska fokus ligger på: Läsbarhet, text och bild i samverkan, samt multimodalitet. För att ta reda på om det digitala formatet påverkar läsbarheten har jag utfört två textanalyser, en läsbarhetsanalys av det tryckta materialet och en komparativ analys av det digitala. Jag har även utfört intervjuer med lärare och experter inom läromedelsutgivning. 
En utprovning samt en deltagarobservation genomfördes med målgruppen där mitt förbättringsförslag jämfördes med det befintliga materialet. De slutsatser jag har kommit fram till är att det behöver finnas tydliga kopplingar mellan text och bild i ett digitalt läromedel. Färgmarkeringar i brödtext med ordförklaringar kan störa läsrytmen och för många rubriknivåer kan förvirra läsaren. Den viktigaste slutsatsen är dock att hur läsbar en text är beror på målgruppens förkunskaper, läsmål och motivation för ämnet. En faktatext med svåra ord behöver nödvändigtvis inte bli lättare för att man delar in texten i korta stycken, det kan snarare bli svårare att förstå innehållet.
|
27 |
"Världen är en saga! Sagan är en värld! Ja, vännen, det har du rätt i! Men varje saga har en moralisk mening och budskap" (Adam Mickiewicz) : En uppsats om genus och jämställdhet i barnlitteraturen i Sverige, Polen och Turkiet / "The world is a fairytale! The fairytale is a world! Yes, my dear, absolutely! But every fairytale has its morality and message" (Adam Mickiewicz) : An essay about gender and gender equality in children's literature in Sweden, Poland and Turkey
Baybek Mehlich, Arzu; Berezak, Marta. January 2017
The aim of this research is to study how gender is constructed and represented in children's literature through image and text analysis of four selected children's books from Sweden, Poland, and Turkey. Through our analysis we want to demonstrate the prevailing gender discourses expressed in children's books through text and image. From this we want to create an understanding of the importance of the selection of children's books and the role of adults in communicating and discussing the content. The study's theoretical framework is feminist poststructuralism, which is part of social constructivism. To achieve an in-depth analysis, we used semiotic text and image analysis as well as multimodality and critical discourse analysis. The study's findings show that, based on semiotic image and text analysis, most of the analyzed books convey and portray a conventional and traditional image of femininity and masculinity. There are some breakthrough challenges in the selected children's books that adults need to attend to in social practice with the children. Our results are only partly in line with the gender-equality goals of the preschool curriculum, and some messages conflict with those goals.
|
28 |
[pt] DO IMAGINÁRIO AO REAL: A CRIAÇÃO E A PRODUÇÃO DO LIVRO INFANTIL NA VISÃO DO ILUSTRADOR / [en] FROM IMAGINARY TO REALITY: THE ILLUSTRATOR'S VIEW ABOUT CREATION AND PRODUCTION OF CHILDREN'S BOOK
25 October 2021
[pt] O objetivo deste trabalho é estudar a ilustração enquanto elemento fundamental na construção de sentido de livros infantis. É feito um breve levantamento do percurso histórico da ilustração em livros desde a Idade Média e são examinadas as etapas da criação de imagens a partir de depoimentos e entrevistas com ilustradores contemporâneos, cujas obras representam uma linguagem visual específica em seus livros. Em seguida, as fases do processo da criação ilustrativa são descritas a fim de mostrar que a produção de imagens não é meramente uma ação intuitiva em relação ao texto verbal. Fruto de um projeto de design, trata-se de um processo estudado, concebido e referenciado na experiência do autor, bem como nos componentes da obra. Desta forma, a pesquisa estuda a relação verbo-visual e sua importância na criação das obras literárias infantis, considerando o gênero de ilustrações narrativas. A relação entre estes dois códigos de linguagem é discutida a partir da perspectiva de diferentes autores e ilustradores, identificando as possibilidades oferecidas pelo Design Gráfico na interação entre tipografia e ilustração. / [en] This work's goal is to study illustration as a fundamental element in the meaning making of children's books. A brief historical account of book illustration since the Middle Ages is presented. Contemporary illustrators are interviewed to describe their creative process. This process is brought forward in order to show that image creation is not based on intuition alone vis-à-vis the verbal text. As the result of a graphic design project, this process is conceived and based on the author's experience, as well as on the elements found in references around the written text. The present work therefore examines the verbal and visual relationship and its importance in the creation of children's literary works, with a focus on narrative illustration. The connection between these two codes is then discussed, bearing in mind different perspectives presented by authors and illustrators.
|
29 |
Chimericwear : Blending Boundaries Between Digital and Physical Expressions in Fashion Design
Paberza, Liana. January 2023
The thesis bridges the gap between physical and digital print design in search of a hybrid state, reflecting our present time. With AI-assisted creativity in print development and features such as added volume to the print elements, the physical garments replicate the creation of a 3D object existing in computer software. Through innovative garment construction techniques and the development of photographic print composition, it redefines the relationship between print, form, and body. Driven by visually stimulating imagery and the desire to communicate physical sensation through concepts constructed via digital tools, the thesis proposes hybrid wearables existing in both the digital and physical states. The research examines an innovative symbiosis between the nostalgic past and the futuristic present with joyful expression in mind while simultaneously harmonizing the saturated voices of the current zeitgeist. The project contributes to the use of AI image-generating tools such as DALL-E 2 in design practice and to the role of maintaining human-centered design rather than replacing it. The results are presented in a multitude of formats, from analog and digital ones, in the form of image stills and animation, to phygital convergences between virtual and actual reality.
|
30 |
AI as a tool and its influence on the User Experience design process : A study on the usability of human-made vs more-than-human-made prototypes
Pop, Mira; Schricker, Max. January 2023
This research paper examines the integration of artificial intelligence (AI) into the user experience (UX) design process, resulting in more-than-human-made designs. Specifically, the study focuses on the use of the text-to-image AI tool Midjourney. The paper addresses two research questions: 1) How do AI tools influence the current UX design process of a high-fidelity prototype? and 2) How do more-than-human-made high-fidelity prototypes compare with human-made high-fidelity prototypes in terms of UX? To answer these questions, a two-method study design was employed. Firstly, two focus groups with a total of 8 designers as participants were formed, with one group using Midjourney, to investigate its influence on the design process and to compare how the two groups worked. The aim was to create two comparable prototypes within a specific e-commerce setting. Secondly, a between-subjects user study with 32 participants was conducted to test the high-fidelity prototypes and to assess any potential disparities in UX quality between them. The findings regarding the first research question indicate that Midjourney primarily serves as an inspirational tool. Designers were able to use the AI tool to generate dark mode images, with the final chosen dark mode exemplifying the impact of Midjourney. Additionally, designers attempted to use the tool for creating icons. Regarding the second research question, the user study revealed that, despite similar and comparable use cases, there were only minor significant differences in terms of UX quality. The overall scores in System Usability Scale (SUS) and User Experience Questionnaire Plus (UEQ+) did not exhibit any significant disparity. 
This study suggests that while Midjourney proves to be a useful tool within the design process, its current influence on designers' UX design process and the ultimate performance of the final prototype remains relatively modest. Further research and development may be required to enhance its impact in the field of UX design and the study design should be used to test other AI tools in comparable settings.
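For readers unfamiliar with the System Usability Scale mentioned in this abstract, the standard scoring rule can be sketched as follows. This is the conventional SUS calculation, not code from the study: ten items are answered on a 1-5 scale, odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name and the sample responses are illustrative assumptions.

```python
def sus_score(responses: list[int]) -> float:
    """Compute a System Usability Scale score (0-100) from ten 1-5 responses.

    Items are numbered from 1. Odd items are positively worded and contribute
    (response - 1); even items are negatively worded and contribute
    (5 - response). The sum of contributions is scaled by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1
        else:
            total += 5 - response
    return total * 2.5


# Hypothetical participant who answers 3 ("neutral") on every item.
neutral = sus_score([3] * 10)
print(neutral)
```

A study like the one described would average such per-participant scores within each prototype group before comparing the groups.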
|