Falei Luo, Siwei Ma, Juncheng Ma, Honggang Qi, Li Su, Wen Gao
This paper presents a multi-layer parallel motion estimation (ME) scheme for High Efficiency Video Coding (HEVC), implemented on a GPU. The scheme is organized hierarchically into four layers: coding tree unit (CTU), prediction unit (PU), motion vector (MV) selection, and instruction optimization. In the PU layer, the costs of the various PU sizes are obtained from a sum of absolute differences (SAD) look-up table instead of by progressive cost merging. During MV selection, the GPU's comparison instruction is used to avoid branches. Concurrent processing of CTUs and SIMD (Single Instruction, Multiple Data) optimization further improve performance significantly. Experimental results show that the proposed scheme takes full advantage of the GPU and achieves a speedup of more than 90 times over HM10.0 with fast ME.
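The abstract does not spell out how the SAD look-up table or the branch-free MV selection are built, so the following is only a minimal CPU-side sketch of one plausible reading, not the authors' GPU implementation. It assumes the table is a summed-area table over the SADs of small base blocks, so that the SAD of any PU aligned to the base-block grid comes from four look-ups, and it assumes the comparison result is used as a 0/1 multiplier to pick the better MV. The names `BASE`, `base_sads`, `build_table`, `pu_sad`, and `select_mv` are hypothetical.

```python
# Hypothetical sketch: names, block sizes, and table layout are assumptions,
# not details taken from the paper.

BASE = 4  # side of the smallest block whose SAD is computed directly (assumed)


def base_sads(cur, ref, size):
    """Return a grid holding the SAD of every BASE x BASE block of a size x size CU."""
    n = size // BASE
    grid = [[0] * n for _ in range(n)]
    for y in range(size):
        for x in range(size):
            grid[y // BASE][x // BASE] += abs(cur[y][x] - ref[y][x])
    return grid


def build_table(grid):
    """Build a summed-area table over the base-block SADs (extra zero row and column)."""
    n = len(grid)
    table = [[0] * (n + 1) for _ in range(n + 1)]
    for y in range(n):
        for x in range(n):
            table[y + 1][x + 1] = (grid[y][x] + table[y][x + 1]
                                   + table[y + 1][x] - table[y][x])
    return table


def pu_sad(table, x, y, w, h):
    """SAD of the PU at pixel (x, y) with size w x h, using four table look-ups.

    x, y, w, and h must be multiples of BASE.
    """
    x0, y0 = x // BASE, y // BASE
    x1, y1 = (x + w) // BASE, (y + h) // BASE
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


def select_mv(candidates):
    """Pick the lowest-cost (cost, (mvx, mvy)) pair without branching on the cost.

    Ties keep the earlier candidate.
    """
    best_cost, best_mv = candidates[0]
    for cost, mv in candidates[1:]:
        take = int(cost < best_cost)  # comparison result used as 0 or 1
        best_cost = take * cost + (1 - take) * best_cost
        best_mv = (take * mv[0] + (1 - take) * best_mv[0],
                   take * mv[1] + (1 - take) * best_mv[1])
    return best_cost, best_mv
```

With this layout, the cost of a 2N×N or N×2N partition is read directly from the table rather than accumulated from smaller partitions, and the selection loop contains no data-dependent branch. On a GPU the same idea would map to a select or min instruction rather than integer multiplication.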
Xiantao Jiang, Tian Song, Takashi Shimamoto, Li‐Sheng Wang
Augusto Gomez, Jhon Henry Bolaños Perea, María Trujillo
Stefan Radicke, Jens-Uwe Hahn, Christos Grecos, Q. Wang
Abdelrahman Abdelazim, Wassim Masri, Bassam Noaman