Duo Wu (吴铎)

Research Experience

MANSY: Generalizing Neural Adaptive Immersive Video Streaming With Ensemble and Representation Learning [2023.2 ~ 2023.11]

Abstract: The popularity of immersive videos has prompted extensive research into neural adaptive tile-based streaming to optimize video transmission over networks with limited bandwidth. However, the diversity of users' viewing patterns and Quality of Experience (QoE) preferences has not yet been fully addressed by existing neural adaptive approaches for viewport prediction and bitrate selection. Their performance can deteriorate significantly when users' actual viewing patterns and QoE preferences differ considerably from those observed during the training phase, resulting in poor generalization. In this paper, we propose MANSY, a novel streaming system that embraces user diversity to improve generalization. Specifically, to accommodate users' diverse viewing patterns, we design a Transformer-based viewport prediction model with an efficient multi-viewport trajectory input-output architecture based on implicit ensemble learning. In addition, we are the first to combine advanced representation learning with deep reinforcement learning to train the bitrate selection model to maximize diverse QoE objectives, enabling the model to generalize across users with diverse preferences. Extensive experiments demonstrate that MANSY outperforms state-of-the-art approaches in viewport prediction accuracy and QoE improvement on both trained and unseen viewing patterns and QoE preferences, achieving better generalization.
Key words: tile-based neural adaptive immersive video streaming, generalization, ensemble learning, representation learning
Role: First author.
Status: Submitted to IEEE Transactions on Mobile Computing (CCF-A).

ILCAS: Imitation Learning-Based Configuration-Adaptive Streaming for Live Video Analytics with Cross-Camera Collaboration [2022.8 ~ 2023.1]

Abstract: High-accuracy, resource-intensive deep neural networks (DNNs) have been widely adopted in live video analytics (VA), where camera videos are streamed over the network to resource-rich edge/cloud servers for DNN inference. Common video encoding configurations (e.g., resolution and frame rate) have been found to significantly affect the balance between bandwidth consumption and inference accuracy, and their adaption scheme has therefore been a focus of optimization. However, previous profiling-based solutions suffer from high profiling cost, while existing deep reinforcement learning (DRL) based solutions may perform poorly because they train the agent with a fixed reward function, which fails to capture the application goals in various scenarios. In this paper, we propose ILCAS, the first imitation learning (IL) based configuration-adaptive VA streaming system. Unlike DRL-based solutions, ILCAS trains the agent with demonstrations collected from an expert, which is designed as an offline optimal policy that solves the configuration adaption problem through dynamic programming. To tackle the challenge of video content dynamics, ILCAS derives motion feature maps from motion vectors, which allow ILCAS to visually “perceive” video content changes. Moreover, ILCAS incorporates a cross-camera collaboration scheme that exploits the spatio-temporal correlations of cameras for more appropriate configuration selection. Extensive experiments confirm the superiority of ILCAS over state-of-the-art solutions, with a 2-20.9% improvement in mean accuracy and a 19.9-85.3% reduction in chunk upload lag.
Key words: live video analytics, configuration adaption, imitation learning, cross-camera collaboration
Role: First author.
Status: Accepted to appear in IEEE Transactions on Mobile Computing (CCF-A) [pdf].

A Comprehensive Survey on Segment Routing Traffic Engineering [2019.11 ~ 2020.11]

Abstract: Traffic Engineering (TE) enables traffic to be managed so that network resources are utilized in an efficient and balanced manner. However, existing TE solutions face issues relating to scalability and complexity. In recent years, Segment Routing (SR) has emerged as a promising source routing paradigm. As one of the most important applications of SR, Segment Routing Traffic Engineering (SR-TE), which enables a headend to steer traffic along specific paths represented as ordered lists of instructions called segment lists, has the capability to overcome the above challenges due to its flexibility and scalability. In this paper, we conduct a comprehensive survey on SR-TE. We first provide a thorough review of the SR-TE architecture, covering its core components and implementation, such as SR Policy, Flexible Algorithm and SR-native algorithm. The strengths of SR-TE are also discussed, as well as its major challenges. Next, we examine recent SR-TE research on routing optimization with various intents, e.g., optimization of link utilization, throughput, QoE (Quality of Experience) and energy consumption. Afterwards, node management for SR-TE is investigated, including SR node deployment and candidate node selection. Finally, we discuss the existing challenges of current research activities and propose several research directions worthy of future exploration.
Key words: segment routing, traffic engineering, SR policy, routing optimization, segment list computation
Role: First author.
Status: Accepted to appear in Digital Communications and Networks (Q1), 2022. [PDF]

Internship Experience

Summer Internship at Huawei Technologies Co., Ltd. [2021.07 ~ 2021.09]

Duty: Developed a Java web backend server based on the Spring Boot framework; responsible for the design and development of the FRUD Kit in the Heavenly Pond Architecture.
Position: Software engineer.