1. A Comparison Between Key Frame Extraction Methods for Clothing Recognition. Lindgren, Gabriel (January 2023)
With ever-growing video consumption, applications and services need smart approaches to improve the experience for their users. Key frames extracted from a video can capture useful information about the entire video and be used to better describe its content. At present, many key frame extraction (KFE) methods aim at selecting multiple frames from videos composed of multiple scenes and coming from various contexts. In this study, a proposed key frame extraction method that extracts a single frame for subsequent clothing recognition is implemented and compared against two other methods. The proposed method utilizes the state-of-the-art object detector YOLO (You Only Look Once) to ensure that the extracted key frames contain people, and is referred to as YKFE (YOLO-based Key Frame Extraction). YKFE is compared against a simple baseline named MFE (Middle Frame Extraction), which always extracts the middle frame of the video, and the well-known optical-flow-based method referred to as Wolf KFE, which extracts the frames with the lowest amount of optical flow. The YOLO model is pre-trained and further fine-tuned on a custom dataset. Furthermore, three versions of the YKFE method are developed and compared, each using a different measurement to select the best key frame: the first uses optical flow, the second aspect ratio, and the third a combination of both. Finally, three proposed metrics, RDO (Rate of Distinguishable Outfits), RSAR (Rate of Successful API Returns), and AET (Average Extraction Time), were used to evaluate and compare the performance of the methods on two sets of test data containing 100 videos each. The results show that YKFE yields more reliable results while taking significantly more time than both MFE and Wolf KFE.
However, neither MFE nor Wolf KFE considers whether frames contain people, meaning the context in which the methods are used is of significant importance for the rate of successful key frame extractions. Finally, as an experiment, a method named Slim YKFE was developed as a combination of MFE and YKFE, resulting in a substantially reduced extraction time while still maintaining high accuracy.
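The selection logic described above (accept only frames in which a person is detected, then prefer the candidate with the least motion) can be sketched in Python. This is a minimal illustration under stated assumptions, not the thesis implementation: `contains_person` stands in for a YOLO person detector, and mean absolute frame difference stands in for a dense optical flow estimate such as Farnebäck's method.

```python
import numpy as np

def motion_score(prev, frame):
    # Stand-in for dense optical flow magnitude: mean absolute
    # frame difference as a rough motion proxy.
    return float(np.mean(np.abs(frame.astype(np.float32) - prev.astype(np.float32))))

def select_key_frame(frames, contains_person):
    """Pick the index of the person-containing frame with the least motion.

    frames: list of HxW grayscale arrays.
    contains_person: callable standing in for a YOLO person detector.
    Returns None if no frame contains a person.
    """
    best_idx, best_score = None, float("inf")
    for i in range(1, len(frames)):
        if not contains_person(frames[i]):
            continue  # YKFE only accepts frames with detected people
        score = motion_score(frames[i - 1], frames[i])
        if score < best_score:
            best_idx, best_score = i, score
    return best_idx
```

The aspect-ratio variant described in the abstract would replace or combine `motion_score` with a score based on the detected person's bounding-box aspect ratio.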
2. Determination of Biomass in a Shrimp Farm Using Computer Vision. Tammineni, Gowtham Chowdary (30 October 2023)
Automation in aquaculture is proving more and more effective these days. The economic drain on aquaculture farmers caused by high shrimp mortality can be reduced by ensuring the welfare of the animals. The health of shrimp can decline with even the slightest change in farm conditions, because such changes increase stress. As shrimp are quite sensitive, even small changes raise their stress levels, which degrades their health and increases mortality. Human interference while feeding the shrimp likewise induces stress and thereby affects mortality, so to ensure optimum farm efficiency, feeding is automated. Underfeeding and overfeeding also affect the growth of the shrimp; to determine the right amount of food to provide, biomass is a very helpful parameter.
The primary focus of this project is the use of artificial intelligence (AI) to calculate the farm's biomass. The model uses cameras mounted above the tank at densely populated areas. These cameras monitor the farm, and the model estimates the biomass, making it possible to determine how much food should be distributed at that particular area. The biomass of the shrimp can be calculated from the number of shrimp detected and their average length. With reduced human interference in calculating the biomass, the health of the animals improves, making the process more sustainable and economical.
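The final step, turning per-shrimp length estimates into a biomass figure and a feed amount, can be sketched with the length-weight relation W = a·L^b commonly used in aquaculture. The coefficients and feed rate below are illustrative placeholders, not calibrated values from this work.

```python
def estimate_biomass_g(lengths_cm, a=0.01, b=3.0):
    """Total biomass in grams from per-shrimp length estimates (cm).

    Uses the common length-weight relation W = a * L**b; a and b are
    species-specific and must be calibrated (placeholder values here).
    """
    return sum(a * (length ** b) for length in lengths_cm)

def feed_amount_g(lengths_cm, feed_rate=0.03):
    # Daily feed as a fixed fraction of estimated biomass
    # (the 3% rate is an illustrative assumption).
    return feed_rate * estimate_biomass_g(lengths_cm)
```

In the pipeline described above, `lengths_cm` would come from the detector's bounding boxes, converted to real-world units via the known camera geometry.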
3. Automated Detection of Arctic Foxes in Camera Trap Images. Zahid, Mian Muhammad Usman (January 2024)
This study explores the application of object detection models for detecting Arctic foxes in camera trap images, a crucial step towards automating wildlife monitoring and enhancing conservation efforts. Models were trained on the You Only Look Once version 7 (YOLOv7) architecture across different locations using the k-fold cross-validation technique, and their performance was evaluated in terms of mean Average Precision (mAP), precision, and recall. The models were tested on both validation and unseen data to assess their accuracy and generalizability. The findings revealed that while certain models performed well on validation data, their effectiveness varied when applied to unseen data, with significant differences in performance across the datasets. One dataset demonstrated the highest precision (88%) and recall (94%) on validation data, while another showed superior generalizability on unseen data (precision 76%, recall 95%). The models developed in this study can aid in the efficient identification of Arctic foxes in diverse locations. However, the study also identifies limitations related to dataset diversity and environmental variability, suggesting that future research should focus on training models across different seasons and with Arctic foxes of different ages. Recommendations include expanding dataset diversity, exploring advanced object detection architectures to go a step further and detect Arctic foxes with skin diseases, and testing the models in varied field conditions.
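The per-fold evaluation reported above reduces to a few lines of arithmetic once detections have been matched to ground truth as true positives, false positives, and false negatives. A minimal sketch of per-fold precision/recall and their average over k folds (how detections are matched, e.g. by IoU threshold, is outside this sketch):

```python
def precision_recall(tp, fp, fn):
    # Precision: fraction of detections that are correct;
    # recall: fraction of ground-truth foxes that were found.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def cross_val_mean(fold_metrics):
    # Average (precision, recall) pairs over the k folds,
    # as in a k-fold cross-validation report.
    k = len(fold_metrics)
    return tuple(sum(m[i] for m in fold_metrics) / k for i in range(2))
```

mAP additionally integrates precision over recall at varying confidence thresholds; the averaging step over folds is the same.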