[libav-devel] [PATCH 13/13] arm: add ff_int32_to_float_fmul_array8_neon

Martin Storsjö martin at martin.st
Thu Dec 10 14:56:01 CET 2015


On Wed, 9 Dec 2015, Janne Grunau wrote:

> Quite a bit faster than int32_to_float_fmul_array8_c calling
> ff_int32_to_float_fmul_scalar_neon through FmtConvertContext.
> Number of cycles per int32_to_float_fmul_array8 call while decoding
> padded.dts on exynos5422:
>
>               before  after   change
> cortex-a7:     1270     951    -25%
> cortex-a15:     434     285    -34%
>
> checkasm --bench cycle counts:     cortex-a15   cortex-a7
> int32_to_float_fmul_array8_c:      1730.4       4384.5
> int32_to_float_fmul_array8_neon_c:  571.5       1694.3
> int32_to_float_fmul_array8_neon:    374.0       1448.8
>
> The differences between int32_to_float_fmul_array8_neon_c and
> int32_to_float_fmul_array8_neon are interesting. The former is the
> current behaviour of calling ff_int32_to_float_fmul_scalar_neon
> repeatedly from the C function. The raw numbers differ since checkasm
> uses different lengths than the dca decoder.
> ---
> libavcodec/arm/fmtconvert_init_arm.c |  4 ++++
> libavcodec/arm/fmtconvert_neon.S     | 37 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 41 insertions(+)
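
For context, the "before" behaviour described in the quoted commit message can be sketched in plain C roughly as follows. This is only an illustration and not the code from the patch: the names scalar_sketch and array8_sketch are hypothetical, the real functions take a FmtConvertContext and dispatch through a function pointer, and the assumption here is that one scale factor from mul applies to each block of 8 samples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the scalar helper: dst[i] = src[i] * mul. */
static void scalar_sketch(float *dst, const int32_t *src, float mul, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src[i] * mul;
}

/* Assumed array8 behaviour: one scale factor per block of 8 samples,
 * with one call to the scalar helper per block. */
static void array8_sketch(float *dst, const int32_t *src,
                          const float *mul, int len)
{
    assert(len % 8 == 0);   /* this sketch only handles whole blocks of 8 */
    for (int i = 0; i < len; i += 8)
        scalar_sketch(&dst[i], &src[i], *mul++, 8);
}
```

Under these assumptions, every block of 8 samples costs one helper call in the C version (an indirect call through the context in the real code), which is the per-block overhead a dedicated NEON array8 function can avoid.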

Ok

// Martin

