Efficient Deep Learning-Based Viewport Estimation for 360-Degree Video Streaming

Volume 9, Issue 3, Page No 49-61, 2024

Authors: Nguyen Viet Hung¹,², Tran Thanh Lam¹, Tran Thanh Binh², Alan Marshall³, Truong Thu Huong¹,*


¹School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
²Faculty of Information Technology, East Asia University of Technology, Bac Ninh, Vietnam
³Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool, United Kingdom

*Author to whom correspondence should be addressed. E-mail: huong.truongthu@hust.edu.vn

Adv. Sci. Technol. Eng. Syst. J. 9(3), 49–61 (2024); DOI: 10.25046/aj090305

Keywords: Video Streaming, 360-degree Video, QoE, VR, Deep Learning

While virtual reality is becoming increasingly popular, transmitting 360-degree video over the Internet remains challenging because of its high bandwidth demand. Viewport Adaptive Streaming (VAS) was proposed to reduce the network capacity required for 360-degree video by delivering the parts of the video outside the current viewport at lower quality. Forecasting future user viewing behavior is therefore a crucial concern in VAS. This study presents a new deep learning-based method for predicting the user viewport in VAS systems, termed Head Eye Movement oriented Viewport Estimation based on Deep Learning (HEVEL). The proposed model seeks to improve the understanding of visual attention dynamics by combining information from two modalities: head and eye movements. Through rigorous experimental evaluation, we demonstrate the efficacy of our approach against existing models across a range of attention-based tasks. In particular, its viewport prediction performance is shown to outperform four reference methods in terms of precision, RMSE, and MAE.
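
To make the two-modality idea concrete, the short PyTorch sketch below encodes a head-movement sequence and an eye-gaze sequence with separate LSTMs, fuses their final states to regress the next viewport centre, and scores the prediction with RMSE and MAE. It is a minimal illustration only: the class name, feature layout, and network sizes are assumptions made for the example and are not the HEVEL architecture reported in the paper.

    import torch
    import torch.nn as nn

    class TwoModalityViewportPredictor(nn.Module):
        """Illustrative two-branch model: one LSTM per modality, fused by concatenation."""
        def __init__(self, head_dim=3, eye_dim=2, hidden=64):
            super().__init__()
            self.head_rnn = nn.LSTM(head_dim, hidden, batch_first=True)  # head orientation per frame
            self.eye_rnn = nn.LSTM(eye_dim, hidden, batch_first=True)    # gaze position per frame
            self.out = nn.Linear(2 * hidden, 2)                          # predicted (yaw, pitch) of viewport centre

        def forward(self, head_seq, eye_seq):
            # head_seq: (batch, T, head_dim); eye_seq: (batch, T, eye_dim)
            _, (h_head, _) = self.head_rnn(head_seq)
            _, (h_eye, _) = self.eye_rnn(eye_seq)
            fused = torch.cat([h_head[-1], h_eye[-1]], dim=-1)           # concatenate the two modality encodings
            return self.out(fused)

    # Toy usage: predict the next viewport centre from 30 past samples of each modality.
    model = TwoModalityViewportPredictor()
    head = torch.randn(8, 30, 3)   # e.g. yaw, pitch, roll
    eye = torch.randn(8, 30, 2)    # e.g. normalised gaze x, y
    pred = model(head, eye)        # shape (8, 2)

    # RMSE and MAE against ground-truth viewport centres, two of the reported metrics.
    target = torch.randn(8, 2)
    rmse = torch.sqrt(torch.mean((pred - target) ** 2))
    mae = torch.mean(torch.abs(pred - target))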

Received: 16 March, 2024, Revised: 14 May, 2024, Accepted: 29 May, 2024, Published Online: 12 June, 2024
