Future frame prediction is a difficult but useful problem in deep learning. The technology can be used to predict future occurrences in a video, anticipate anomalies, and help autonomous devices make smart decisions. Despite this potential, frame prediction still has room for progress. The farther a predicted frame is from the last input frame, the blurrier and more distorted it becomes, which indicates that the model grows more uncertain about the motion occurring in the frame. To reduce the uncertainty shown in predictions, optical flow information was extracted from each video and combined with the video frames. An optical flow-based approach is uncommon in frame prediction and has not been implemented with a fully Convolutional Neural Network (CNN) based architecture. This work analyzes the change in image quality evaluation metrics and overall image quality across four datasets, comparing a state-of-the-art frame prediction model with a modified model that incorporates optical flow information. The results demonstrate that adding optical flow information improves the model's Mean Squared Error (MSE) by 4.11% and its Structural Similarity Index Metric (SSIM) by 0.41% on the Moving MNIST dataset. Optical flow improved the SSIM on TaxiBJ, KTH, and KITTI by 0.02%, 0.011%, and 1.297%, respectively. Although performance improved consistently, the models still need further improvement in the quality of images predicted in the distant future.

Master of Science

Future frame prediction is a technology that allows computers to predict what future video frames will look like. It can be used to predict future occurrences in a video, anticipate anomalies, and help autonomous devices make smart decisions. Despite this potential, frame prediction still has room for progress. The farther a predicted frame is from the last input frame, the blurrier and more distorted it becomes, which indicates that the model grows more uncertain about the motion occurring in the frame. To reduce the uncertainty shown in predictions, optical flow information was extracted from each video and combined with the video frames. Optical flow describes the direction and magnitude of the apparent motion of objects between video frames. This information helps frame prediction because it gives the model additional evidence about how objects are moving on which to base its predictions. This work analyzes the change in image quality evaluation metrics and overall image quality across four datasets, comparing a state-of-the-art frame prediction model with a modified model that incorporates optical flow information. The results demonstrate that adding optical flow information improves the model's Mean Squared Error (MSE) by 4.11% and its Structural Similarity Index Metric (SSIM) by 0.41% on the Moving MNIST dataset. Optical flow improved the SSIM on TaxiBJ, KTH, and KITTI by 0.02%, 0.011%, and 1.297%, respectively. Although performance improved consistently, the models still need further improvement in the quality of images predicted in the distant future.
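As a rough illustration of the ideas in the abstract, the sketch below shows one way a two-channel optical flow field could be combined with a video frame, along with how MSE and a relative percentage improvement might be computed. The abstract does not say how the flow was combined with the frames or how the percentages were derived, so the channel-wise stacking, the relative-improvement formula, and the helper names `stack_flow`, `mse`, and `relative_improvement` are all assumptions made for illustration, not the thesis's method. The numbers in the usage lines are made up and are not results from the thesis.

```python
import numpy as np


def stack_flow(frame, flow):
    """Append a 2-channel flow field (dx, dy) to a frame as extra channels.

    Assumption: channel-wise concatenation; the thesis may combine them differently.
    """
    frame = np.asarray(frame, dtype=np.float32)
    if frame.ndim == 2:  # grayscale frame: add a channel axis
        frame = frame[..., None]
    flow = np.asarray(flow, dtype=np.float32)
    return np.concatenate([frame, flow], axis=-1)


def mse(pred, target):
    """Mean squared error between two arrays of equal shape."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((pred - target) ** 2))


def relative_improvement(baseline, modified, lower_is_better):
    """Percent improvement of `modified` over `baseline` (assumed definition)."""
    change = baseline - modified if lower_is_better else modified - baseline
    return 100.0 * change / baseline


# Illustrative values only (not taken from the thesis).
frame = np.zeros((64, 64), dtype=np.float32)    # one grayscale frame
flow = np.ones((64, 64, 2), dtype=np.float32)   # per-pixel (dx, dy)
model_input = stack_flow(frame, flow)           # shape (64, 64, 3)

mse_gain = relative_improvement(100.0, 95.0, lower_is_better=True)
```

MSE is an error, so a lower value is better, while SSIM is a similarity score, so a higher value is better; the `lower_is_better` flag keeps the sign of the improvement consistent for both.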
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/115671 |
Date | 06 July 2023 |
Creators | Wormack Jr, Craig Frederick Luther |
Contributors | Electrical and Computer Engineering, Jones, Creed F. III, Abbott, Amos L., Wyatt, Chris L. |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Language | English |
Detected Language | English |
Type | Thesis |
Format | ETD, application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |