[libav-devel] [PATCH] hevcdsp: add x86 SIMD for MC

James Almer jamrial at gmail.com
Sat Aug 22 23:58:41 CEST 2015


On 22/08/15 1:16 PM, Anton Khirnov wrote:
>>> +%macro QPEL_8 2
>>> +%if %2
>>> +    %define postfix    v
>>> +    %define mvfrac     myq
>>
>> Same here and below the else, rename this to mvfracq and add a mvfracd.
>>
>>> +    %define pixstride  srcstrideq
>>> +    %define pixstride3 sstride3q
>>> +    %define src_m3     srcm3q
>>> +%else
>>> +    %define postfix    h
>>> +    %define mvfrac     mxq
>>> +    %define pixstride  1
>>> +    %define pixstride3 3
>>> +    %define src_m3     (srcq - 3)
>>> +%endif
>>> +
>>> +cglobal hevc_qpel_ %+ postfix %+ _ %+ %1 %+ _8, 8, 10, 7, dst, dststride, src, srcstride, height, mx, my, sstride3, srcm3, coeffsreg

This should be 7, 10, 7; otherwise you're loading sstride3 from the stack as if it
were a function argument.
Ideally, though, for vertical you'd use 5, 9, 7 and then manually load either mx or
my instead of both, saving one register. You could even use 5, 8, 7, since coeffsreg
and mvfrac are only used during init, and you can easily reuse one of those two
registers for sstride3 or srcm3.
You can also push it down to 4, 7, 7 if you manually load height before or after
the SPLATWs and reuse the regs for coeffsreg and mvfrac. As a plus, this would make
the functions work on x86_32.

For horizontal you don't even need sstride3 or srcm3, so you should definitely
declare and use fewer registers.

I didn't check the other functions, but I'm sure similar optimizations can be made.

>>> +%if %2
>>> +    and       mvfrac, 0x3
>>> +%endif
>>> +    dec       mvfrac
>>> +    shl       mvfrac, 4
>>
>> Use mvfracd on these three, it will clear the high bits for the mova below.
> 
> anding the whole register with 3/7 should also work fine, with less
> clutter.

"and mvfrac, 0x3" is only in the ff_hevc_qpel_v_* functions, not in ff_hevc_qpel_h_*.
The same goes for the "and mvfrac, 0x7" cases below. You need to use the d suffix
instead of q on the register names to make sure the high bits are cleared, since a
write to a 32-bit register zeroes the upper 32 bits of the full register.
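
To illustrate why the suffix matters, here is a small C model of the register
arithmetic. It is only a sketch, not code from the patch. It assumes the x86-64
rule that a write to a 32-bit register zero-extends into the full 64-bit
register, and that the upper half of a register holding an int argument may
contain garbage on entry. The function names and the sample value are made up
for the example.

```c
#include <stdint.h>

/* Models "dec mvfracq; shl mvfracq, 4": the 64-bit ops keep whatever was
 * in the upper 32 bits, so the table offset is wrong if they were not 0. */
static uint64_t offset_q(uint64_t reg)
{
    reg -= 1;
    reg <<= 4;
    return reg;
}

/* Models "and mvfracq, 0x3" followed by the same two ops: the mask also
 * clears the upper bits, so the q form works where the and is present. */
static uint64_t offset_q_masked(uint64_t reg)
{
    reg &= 0x3;
    reg -= 1;
    reg <<= 4;
    return reg;
}

/* Models "dec mvfracd; shl mvfracd, 4": each 32-bit write zero-extends
 * into the full register, represented here by truncating to uint32_t. */
static uint64_t offset_d(uint64_t reg)
{
    uint32_t lo = (uint32_t)reg;
    lo -= 1;
    lo <<= 4;
    return lo; /* implicit zero-extension back to 64 bits */
}
```

With reg = 0xDEADBEEF00000002 (a fraction of 2 plus garbage in the upper half),
offset_q_masked() and offset_d() both give 16, while offset_q() does not, which
is the case the functions without the and would hit.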

