[libav-devel] [PATCH v2 1/1] x86: use emms after ff_int32_to_float_fmul_scalar_sse

Janne Grunau janne-libav at jannau.net
Wed Dec 30 13:43:13 CET 2015


Hi,

On 2015-12-29 19:56:34 +0100, Henrik Gramner wrote:
> On Tue, Dec 29, 2015 at 12:32 PM, Janne Grunau <janne-libav at jannau.net> wrote:
> > ---
> >  libavcodec/x86/fmtconvert.asm | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/libavcodec/x86/fmtconvert.asm b/libavcodec/x86/fmtconvert.asm
> > index 0383322..c2ff707 100644
> > --- a/libavcodec/x86/fmtconvert.asm
> > +++ b/libavcodec/x86/fmtconvert.asm
> > @@ -61,6 +61,13 @@ cglobal int32_to_float_fmul_scalar, 4, 4, %1, dst, src, mul, len
> >      mova  [dstq+lenq+16], m2
> >      add     lenq, 32
> >      jl .loop
> > +%if cpuflag(sse)
> > +    ;; cvtpi2ps switches to MMX even if the source is a memory location
> > +    ;; possible an error in documentation since every tested CPU disagrees with
> > +    ;; that. Use emms anyway since the vast majority of machines will use the
> > +    ;; SSE2 variant
> > +    emms
> > +%endif
> >      REP_RET
> >  %endmacro
>
> Should be notcpuflag(sse2). Also the REP_RET could be replaced with
> RET, but that's a pretty minor thing.

Err, yes, I forgot to amend the commit before sending. I have also changed the REP_RET to RET now.

Janne

---8<---
Intel's Instruction Set Reference (as of September 2015) clearly states
that cvtpi2ps switches to MMX state. Actual CPUs do not switch if the
source is a memory location. The Instruction Set Reference from 1999
(Order Number 243191) describes this behaviour, but all later versions
I've seen make no distinction between MMX registers and memory as the
source.
The documentation for the matching SSE2 instruction to convert to double
(cvtpi2pd) was fixed (see the valgrind bug
https://bugs.kde.org/show_bug.cgi?id=210264).

It will take time to get a clarification and fixes in place. In the
meantime it makes sense to change ff_int32_to_float_fmul_scalar_sse to
be correct according to the documentation. The vast majority of users
will have SSE2 so a change to the SSE version has little effect.

Fixes fate-checkasm on x86 valgrind targets.

Valgrind 'bug' reported as https://bugs.kde.org/show_bug.cgi?id=357059
---
 libavcodec/x86/fmtconvert.asm | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/libavcodec/x86/fmtconvert.asm b/libavcodec/x86/fmtconvert.asm
index 0383322..2a3e4a5 100644
--- a/libavcodec/x86/fmtconvert.asm
+++ b/libavcodec/x86/fmtconvert.asm
@@ -61,7 +61,14 @@ cglobal int32_to_float_fmul_scalar, 4, 4, %1, dst, src, mul, len
     mova  [dstq+lenq+16], m2
     add     lenq, 32
     jl .loop
-    REP_RET
+%if notcpuflag(sse2)
+    ;; cvtpi2ps switches to MMX even if the source is a memory location
+    ;; possibly an error in documentation since every tested CPU disagrees with
+    ;; that. Use emms anyway since the vast majority of machines will use the
+    ;; SSE2 variant
+    emms
+%endif
+    RET
 %endmacro
 
 INIT_XMM sse
-- 
2.6.4
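
For readers more used to intrinsics than to x86inc macros, here is a rough C
sketch of the same idea. It is not part of the patch and not the libav code:
the function name int32_to_float_fmul_scalar_sketch and the HAVE_CVTPI2PS macro
are made up for illustration, and the two-elements-per-iteration loop is an
assumption that only loosely mirrors the asm. _mm_cvt_pi2ps() maps to cvtpi2ps
and _mm_empty() maps to emms. Because the compiler will most likely load the
__m64 operand into an MMX register, the emms here is needed in any case,
whereas the asm uses a memory operand.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE__) && defined(__MMX__)
#include <xmmintrin.h>          /* also pulls in the MMX intrinsics */
#define HAVE_CVTPI2PS 1         /* hypothetical macro, for this sketch only */
#else
#define HAVE_CVTPI2PS 0
#endif

/* Hypothetical helper: dst[i] = (float)src[i] * mul for i in [0, len). */
static void int32_to_float_fmul_scalar_sketch(float *dst, const int32_t *src,
                                              float mul, int len)
{
    int i = 0;
#if HAVE_CVTPI2PS
    for (; i + 2 <= len; i += 2) {
        __m64  pair;
        __m128 v;
        float  out[4];

        /* Copy two int32 values without alignment or aliasing assumptions. */
        memcpy(&pair, src + i, sizeof(pair));
        /* cvtpi2ps: converts two packed int32 to float in the low lanes. */
        v = _mm_cvt_pi2ps(_mm_setzero_ps(), pair);
        v = _mm_mul_ps(v, _mm_set1_ps(mul));
        _mm_storeu_ps(out, v);
        dst[i]     = out[0];
        dst[i + 1] = out[1];
    }
    /* emms: leave MMX state before any later x87 code runs. */
    _mm_empty();
#endif
    /* Scalar tail; this is also the whole function on non-SSE builds. */
    for (; i < len; i++)
        dst[i] = src[i] * mul;
}
```

The single _mm_empty() after the loop corresponds to the single emms placed
before RET in the patch: it runs once per call, not once per iteration.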


