[libav-devel] [PATCH] x86: dcadsp: Avoid SSE2 instructions in SSE functions

James Almer jamrial at gmail.com
Sat Aug 1 20:49:03 CEST 2015


On 01/08/15 3:24 PM, Anton Khirnov wrote:
> Quoting Henrik Gramner (2015-08-01 17:10:22)
>> ---
>>  libavcodec/x86/dcadsp.asm | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/libavcodec/x86/dcadsp.asm b/libavcodec/x86/dcadsp.asm
>> index c42ee23..c99df12 100644
>> --- a/libavcodec/x86/dcadsp.asm
>> +++ b/libavcodec/x86/dcadsp.asm
>> @@ -148,7 +148,7 @@ DECODE_HF
>>      addps       m4, va ; va1+3 vb1+3 va2+4 vb2+4
>>      movhlps     vb, m4 ; va1+3  vb1+3
>>      addps       vb, m4 ; va0..4 vb0..4
>> -    movh    [outq + count], vb
>> +    movlps  [outq + count], vb
>>  %if %1
>>      sub       cf0q, 8*NUM_COEF
>>  %endif
>> -- 
>> 1.8.3.2
> 
> movh is an x86inc define, right? So wouldn't it be better to handle this
> in INIT_XMM?

Unlike movd/ss and movq/sd, movlps requires one of its two operands to
be a memory operand, so technically it can't serve as the definition of movh.

However, I think movq/sd should be used here for SSE2 and above instead of
movlps. The latter is usually slower, and it brings no benefit over the
former when the destination is memory. It only helps when the source is
memory, because it doesn't clear the upper 64 bits of the destination
register.

