[libav-devel] [PATCH 11/13] arm64: port synth_filter_float_neon from arm

Martin Storsjö martin at martin.st
Thu Dec 10 14:54:10 CET 2015


On Wed, 9 Dec 2015, Janne Grunau wrote:

> ~25% faster dts decoding overall. The checkasm CPU cycles numbers are
> not that useful since synth_filter_float() calls FFTContext.imdct_half().
>
>                         cortex-a57   cortex-a53
> synth_filter_float_c:    1866.2       3490.9
> synth_filter_float_neon:  915.0       1531.5
>
> With fftc.imdct_half forced to imdct_half_neon:
>                         cortex-a57   cortex-a53
> synth_filter_float_c:    1718.4       3025.3
> synth_filter_float_neon:  926.2       1530.1
> ---
> libavcodec/aarch64/asm-offsets.h       |   3 +
> libavcodec/synth_filter.h              |   1 +
> libavcodec/aarch64/dcadsp_init.c       |  16 +++++
> libavcodec/synth_filter.c              |   8 ++-
> libavcodec/aarch64/synth_filter_neon.S | 119 +++++++++++++++++++++++++++++++++
> libavcodec/aarch64/Makefile            |   3 +-
> 6 files changed, 147 insertions(+), 3 deletions(-)
> create mode 100644 libavcodec/aarch64/synth_filter_neon.S
>
> diff --git a/libavcodec/aarch64/asm-offsets.h b/libavcodec/aarch64/asm-offsets.h
> index 45b5c40..9cc1a0e 100644
> --- a/libavcodec/aarch64/asm-offsets.h
> +++ b/libavcodec/aarch64/asm-offsets.h
> @@ -27,4 +27,7 @@
> #define CELT_TMP                        0x10
> #define CELT_TWIDDLE                    (CELT_TMP + 0x8)    // loaded as pair
>
> +/* FFTContex */
> +#define IMDCT_HALF               0x48
> +

Typo: this should be "FFTContext".

Should the value be aligned with the ones above?
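
A hardcoded offset like this has to stay in sync with the C struct it describes. A minimal sketch of a compile-time check follows, assuming a made-up ExampleContext struct (not the real FFTContext layout) and a made-up EXAMPLE_IMDCT_HALF constant; all names in it are hypothetical and only illustrate the idea:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context struct for illustration only.
 * This is NOT the real FFTContext layout. */
typedef struct ExampleContext {
    int32_t nbits;    /* offset 0 */
    int32_t inverse;  /* offset 4 */
    /* Function pointer that assembly code would load by offset. */
    void (*imdct_half)(struct ExampleContext *s, float *out, const float *in);
} ExampleContext;

/* Hypothetical offset constant, as it might appear in an asm-offsets header. */
#define EXAMPLE_IMDCT_HALF 0x08

/* C11 compile-time check: the build fails if the struct layout and the
 * constant used from assembly ever disagree. */
_Static_assert(offsetof(ExampleContext, imdct_half) == EXAMPLE_IMDCT_HALF,
               "EXAMPLE_IMDCT_HALF does not match ExampleContext layout");

/* Returns the offset as computed by the compiler. */
static inline size_t example_imdct_half_offset(void)
{
    return offsetof(ExampleContext, imdct_half);
}

/* Runtime variant of the same check, for toolchains without C11 support. */
static inline void example_check_offsets(void)
{
    assert(example_imdct_half_offset() == EXAMPLE_IMDCT_HALF);
}
```

With a check of this kind, inserting a field before imdct_half breaks the build at the _Static_assert instead of letting the assembly silently load the wrong pointer.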


The rest seems fine.

// Martin

