Video Prediction Using Multi-Scale Deep Neural Networks

Publication year: 1401 (Solar Hijri)
Document type: Journal article
Language: English
Views: 155

The full text of this article is available as a 10-page PDF.




National scientific document ID:

JR_JADM-10-3_011

Indexing date: 9 Mehr 1401 (October 1, 2022)

Abstract:

In video prediction, the goal is to predict the next frame of a video given a sequence of input frames. Although numerous studies have tackled frame prediction, satisfactory performance has not yet been achieved, so the task remains an open problem. In this article, multiscale processing is studied for video prediction and a new network architecture for multiscale processing is presented. The architecture belongs to the broad family of autoencoders and consists of an encoder and a decoder. A pretrained VGG serves as the encoder and processes a pyramid of input frames at multiple scales simultaneously, while the decoder is built from 3D convolutional layers. The architecture is evaluated on three datasets of varying difficulty and compared against two conventional autoencoders. The results show that combining a pretrained network with multiscale processing yields a performant approach.
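
To make the architecture described above concrete, the following is a minimal PyTorch sketch based only on the abstract: a frozen ImageNet-pretrained VGG16 encoder applied to a three-level pyramid of each input frame, followed by a small 3D-convolutional decoder that collapses the time axis and reconstructs the next frame. Frame size (128x128), the number of scales, the fusion by concatenation, and the class names (MultiScaleVGGEncoder, Conv3DDecoder, VideoPredictor) are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class MultiScaleVGGEncoder(nn.Module):
    """Encode a frame at several spatial scales with a frozen pretrained VGG16."""

    def __init__(self, scales=(1.0, 0.5, 0.25), feat_size=4):
        super().__init__()
        self.scales = scales
        self.feat_size = feat_size
        # Convolutional trunk of an ImageNet-pretrained VGG16, used as a fixed
        # feature extractor (the weights API needs torchvision >= 0.13).
        self.vgg = models.vgg16(weights="IMAGENET1K_V1").features.eval()
        for p in self.vgg.parameters():
            p.requires_grad = False

    def forward(self, frame):                      # frame: (B, 3, 128, 128)
        feats = []
        for s in self.scales:
            x = F.interpolate(frame, scale_factor=s, mode="bilinear",
                              align_corners=False)
            f = self.vgg(x)                        # (B, 512, 128*s/32, 128*s/32)
            # Bring every scale to a common grid before concatenation.
            f = F.interpolate(f, size=(self.feat_size, self.feat_size),
                              mode="bilinear", align_corners=False)
            feats.append(f)
        return torch.cat(feats, dim=1)             # (B, 512 * n_scales, 4, 4)


class Conv3DDecoder(nn.Module):
    """Turn a temporal stack of per-frame features into the next RGB frame."""

    def __init__(self, in_channels, n_frames):
        super().__init__()
        self.temporal = nn.Sequential(
            nn.Conv3d(in_channels, 256, kernel_size=3, padding=1), nn.ReLU(),
            # A kernel spanning all n_frames collapses the time axis to length 1.
            nn.Conv3d(256, 128, kernel_size=(n_frames, 3, 3), padding=(0, 1, 1)),
            nn.ReLU(),
        )
        self.to_rgb = nn.Sequential(
            nn.Upsample(scale_factor=32, mode="bilinear", align_corners=False),
            nn.Conv2d(128, 3, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, feats):                      # feats: (B, C, T, 4, 4)
        x = self.temporal(feats).squeeze(2)        # (B, 128, 4, 4)
        return self.to_rgb(x)                      # (B, 3, 128, 128)


class VideoPredictor(nn.Module):
    """Predict the next frame from the preceding n_frames frames."""

    def __init__(self, n_frames=4, scales=(1.0, 0.5, 0.25)):
        super().__init__()
        self.encoder = MultiScaleVGGEncoder(scales)
        self.decoder = Conv3DDecoder(512 * len(scales), n_frames)

    def forward(self, frames):                     # frames: (B, T, 3, 128, 128)
        per_frame = [self.encoder(frames[:, t]) for t in range(frames.size(1))]
        return self.decoder(torch.stack(per_frame, dim=2))


model = VideoPredictor(n_frames=4)
clip = torch.rand(2, 4, 3, 128, 128)               # batch of two 4-frame clips
next_frame = model(clip)                           # (2, 3, 128, 128)

Concatenating per-scale VGG features on a shared grid is one straightforward way to realize "a pyramid of input frames processed at multiple scales simultaneously"; the paper may fuse the scales differently.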

Authors

N. Shayanfar

Computer Engineering Department, Yazd University, Yazd, Iran.

V. Derhami

Computer Engineering Department, Yazd University, Yazd, Iran.

M. Rezaeian

Computer Engineering Department, Yazd University, Yazd, Iran.
