[libav-api] av_seek_frame() on 44.1 kHz stereo MP3 seeks to 32 kHz mono frame

Jan Schlüter jan.schlueter at ofai.at
Wed Jul 8 16:30:48 CEST 2015


> PPS: "avconv -ss 48 -t 30" works fine, and it uses av_seek_frame() with
> the stream index set to "-1". Should I do that instead? Is this expected
> to be more robust?
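
A note on units when switching the stream index: as far as the av_seek_frame() documentation goes, the timestamp is interpreted in AV_TIME_BASE units (microseconds) when the stream index is -1, and in the stream's own time_base units otherwise. Below is a minimal sketch of that conversion in Python. The helper name seek_timestamp is hypothetical, and 1/14112000 is the value printed for stream #0.0 in the attached debug output, which I assume to be its time base.

```python
from fractions import Fraction

AV_TIME_BASE = 1_000_000  # value of libavutil's AV_TIME_BASE (units per second)

def seek_timestamp(seconds, stream_time_base=None):
    """Convert a position in seconds to an av_seek_frame()-style timestamp.

    stream_time_base=None models a stream index of -1 (AV_TIME_BASE units);
    otherwise the result is expressed in the given stream time base.
    This helper is hypothetical and only illustrates the arithmetic.
    """
    if stream_time_base is None:
        return int(Fraction(seconds) * AV_TIME_BASE)
    return int(Fraction(seconds) / stream_time_base)

print(seek_timestamp(48))                         # 48000000
print(seek_timestamp(48, Fraction(1, 14112000)))  # 677376000
```

Passing a value computed for one case into a call that uses the other would seek to a very different position, so this is worth double-checking before comparing the two variants.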

It seems I did something wrong in my first attempt at using "avconv".
Setting the stream index to "-1" doesn't help, and avconv actually sees
the same spurious frames when seeking (full debug output attached):

$ avconv -y -ss 48 -i bug.mp3 -t 30 foo.wav
avconv version 9.16-6:9.16-0ubuntu0.14.04.1, Copyright (c) 2000-2014 the Libav developers
[...]
Input stream #0:0 frame changed from rate:44100 fmt:s16p ch:2 chl:stereo to rate:32000 fmt:s16 ch:1 chl:mono
[mp3 @ 0x1f63a60] Header missing
Error while decoding stream #0:0
Input stream #0:0 frame changed from rate:32000 fmt:s16 ch:1 chl:mono to rate:32000 fmt:s16 ch:2 chl:stereo
[abuffer @ 0x1f62d40] Changing frame properties on the fly is not supported.
Input stream #0:0 frame changed from rate:32000 fmt:s16 ch:2 chl:stereo to rate:44100 fmt:s16 ch:2 chl:stereo
size=    5213kB time=30.00 bitrate=1423.4kbits/s
video:0kB audio:5213kB global headers:0kB muxing overhead 0.002323%

Without seeking (when removing the "-ss" argument or moving it after the
"-i" argument), it happily decodes the full file. For my application, I
chose to work around this problem by skipping all frames whose number of
channels or sample rate does not match the initially detected values.
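
A minimal sketch of that kind of filter, written in Python for brevity. The Frame type and the function name skip_spurious_frames are hypothetical; a real implementation would compare the corresponding fields reported by the decoder for each decoded frame. The example sequence mirrors the order of format changes in the avconv output above.

```python
from collections import namedtuple

# Hypothetical stand-in for a decoded audio frame; only the fields that the
# filter compares are modelled, plus a label so the result can be inspected.
Frame = namedtuple("Frame", ["sample_rate", "channels", "label"])

def skip_spurious_frames(frames, sample_rate, channels):
    """Yield only frames that match the initially detected stream parameters."""
    for frame in frames:
        if frame.sample_rate == sample_rate and frame.channels == channels:
            yield frame

# After the seek: a 32 kHz mono frame, a 32 kHz stereo frame, then the
# expected 44.1 kHz stereo frames.
decoded = [
    Frame(32000, 1, "spurious-1"),
    Frame(32000, 2, "spurious-2"),
    Frame(44100, 2, "good-1"),
    Frame(44100, 2, "good-2"),
]

kept = list(skip_spurious_frames(decoded, sample_rate=44100, channels=2))
print([f.label for f in kept])  # ['good-1', 'good-2']
```

Note that this only hides the symptom: a spurious frame that happens to have the same sample rate and channel count as the real stream would still pass through.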

Shall I file a bug report or is this to be treated as a known limitation 
of the MP3 seeking algorithm?

Best, Jan
-------------- next part --------------
avconv version 9.16-6:9.16-0ubuntu0.14.04.1, Copyright (c) 2000-2014 the Libav developers
  built on Aug 10 2014 18:16:02 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
  configuration: --arch=amd64 --enable-pthreads --enable-runtime-cpudetect --extra-version='6:9.16-0ubuntu0.14.04.1' --libdir=/usr/lib/x86_64-linux-gnu --prefix=/usr --enable-bzlib --enable-libdc1394 --enable-libfreetype --enable-frei0r --enable-gnutls --enable-libgsm --enable-libmp3lame --enable-librtmp --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-vaapi --enable-vdpau --enable-libvorbis --enable-libvpx --enable-zlib --enable-gpl --enable-swscale --enable-libcdio --enable-x11grab --enable-libx264 --enable-libxvid --shlibdir=/usr/lib/x86_64-linux-gnu --enable-shared --disable-static
  libavutil     52.  3. 0 / 52.  3. 0
  libavcodec    54. 35. 0 / 54. 35. 0
  libavformat   54. 20. 4 / 54. 20. 4
  libavdevice   53.  2. 0 / 53.  2. 0
  libavfilter    3.  3. 0 /  3.  3. 0
  libavresample  1.  0. 1 /  1.  0. 1
  libswscale     2.  1. 1 /  2.  1. 1
Splitting the commandline.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Reading option '-v' ... matched as option 'v' (set libav* logging level) with argument 'debug'.
Reading option '-ss' ... matched as option 'ss' (set the start time offset) with argument '48'.
Reading option '-i' ... matched as input file with argument 'bug.mp3'.
Reading option '-t' ... matched as option 't' (record or transcode "duration" seconds of audio/video) with argument '30'.
Reading option 'foo.wav' ... matched as output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option y (overwrite output files) with argument 1.
Applying option v (set libav* logging level) with argument debug.
Successfully parsed a group of options.
Parsing a group of options: input file bug.mp3.
Applying option ss (set the start time offset) with argument 48.
Successfully parsed a group of options.
Opening an input file: bug.mp3.
[mp3 @ 0x968020] Probed with size=4096 and score=51
[mp3 @ 0x968020] Invalid size in frame com, skipping the rest of tag.
[mp3 @ 0x968020] max_analyze_duration reached
Guessed Channel Layout for  Input Stream #0.0 : stereo
Input #0, mp3, from 'bug.mp3':
  Metadata:
    artist          : Ja'Quay
    title           : Reasoning
    genre           : 159
  Duration: 00:06:06.68, start: 0.000000, bitrate: 192 kb/s
    Stream #0.0, 194, 1/14112000: Audio: mp3, 44100 Hz, stereo, s16p, 192 kb/s
Successfully openened the file.
Parsing a group of options: output file foo.wav.
Applying option t (record or transcode "duration" seconds of audio/video) with argument 30.
Successfully parsed a group of options.
Opening an output file: foo.wav.
Successfully openened the file.
[abuffer source @ 0x969840] Setting value '1/44100' for key 'time_base'
[abuffer source @ 0x969840] Setting value '44100' for key 'sample_rate'
[abuffer source @ 0x969840] Setting value 's16p' for key 'sample_fmt'
[abuffer source @ 0x969840] Setting value '0x3' for key 'channel_layout'
[abuffer @ 0x969760] tb:1/44100 samplefmt:s16p samplerate: 44100 ch layout:0x3
[aformat filter @ 0x969b60] Setting value 's16' for key 'sample_fmts'
[abuffersink @ 0x969f40] auto-inserting filter 'auto-inserted fifo 0' between the filter 'audio format for output stream 0:0' and the filter 'output stream 0:0'
[aformat @ 0x969a80] auto-inserting filter 'auto-inserted resampler 0' between the filter 'Parsed filter 0 anull' and the filter 'audio format for output stream 0:0'
[AVAudioResampleContext @ 0x98cea0] audio_convert: found function: s16p to s16  (C)
[AVAudioResampleContext @ 0x98cea0] audio_convert: found function: s16p to s16  (SSE2)
[AVAudioResampleContext @ 0x98cea0] audio_convert: found function: s16p to s16  (AVX)
[resample @ 0x98c3e0] fmt:s16p srate:44100 cl:stereo -> fmt:s16 srate:44100 cl:stereo
Output #0, wav, to 'foo.wav':
  Metadata:
    IART            : Ja'Quay
    INAM            : Reasoning
    IGNR            : 159
    ISFT            : Lavf54.20.4
    Stream #0.0, 0, 1/44100: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (mp3 -> pcm_s16le)
Press ctrl-c to stop encoding
Input stream #0:0 frame changed from rate:44100 fmt:s16p ch:2 chl:stereo to rate:32000 fmt:s16 ch:1 chl:mono
[abuffer source @ 0x96a560] Setting value '1/32000' for key 'time_base'
[abuffer source @ 0x96a560] Setting value '32000' for key 'sample_rate'
[abuffer source @ 0x96a560] Setting value 's16' for key 'sample_fmt'
[abuffer source @ 0x96a560] Setting value '0x4' for key 'channel_layout'
[abuffer @ 0x969b40] tb:1/32000 samplefmt:s16 samplerate: 32000 ch layout:0x4
[aformat filter @ 0x96a420] Setting value 's16' for key 'sample_fmts'
[aformat filter @ 0x96a420] Setting value '44100' for key 'sample_rates'
[aformat filter @ 0x96a420] Setting value '0x3' for key 'channel_layouts'
[abuffersink @ 0x968e00] auto-inserting filter 'auto-inserted fifo 0' between the filter 'audio format for output stream 0:0' and the filter 'output stream 0:0'
[aformat @ 0x9699c0] auto-inserting filter 'auto-inserted resampler 0' between the filter 'Parsed filter 0 anull' and the filter 'audio format for output stream 0:0'
[AVAudioResampleContext @ 0x98df20] Using s16p as internal sample format
[AVAudioResampleContext @ 0x98df20] audio_convert: found function: s16p to s16  (C)
[AVAudioResampleContext @ 0x98df20] audio_convert: found function: s16p to s16  (SSE2)
[AVAudioResampleContext @ 0x98df20] audio_convert: found function: s16p to s16  (AVX)
[AVAudioResampleContext @ 0x98df20] resample: s16p from 32000 Hz to 44100 Hz
[AVAudioResampleContext @ 0x98df20] audio_mix: mono to stereo
[AVAudioResampleContext @ 0x98df20]   0.707 
[AVAudioResampleContext @ 0x98df20]   0.707 
[AVAudioResampleContext @ 0x98df20] audio_mix: found function: [fmt=s16p] [c=flt] (C)
[AVAudioResampleContext @ 0x98df20] audio_mix: found function: [fmt=s16p] [c=flt] [1 to 2] (SSE2)
[AVAudioResampleContext @ 0x98df20] audio_mix: found function: [fmt=s16p] [c=flt] [1 to 2] (SSE4)
[AVAudioResampleContext @ 0x98df20] audio_mix: found function: [fmt=s16p] [c=flt] [1 to 2] (AVX)
[resample @ 0x98d4c0] fmt:s16 srate:32000 cl:mono -> fmt:s16 srate:44100 cl:stereo
[mp3 @ 0x96ab40] Header missing
Error while decoding stream #0:0
Input stream #0:0 frame changed from rate:32000 fmt:s16 ch:1 chl:mono to rate:32000 fmt:s16 ch:2 chl:stereo
[abuffer source @ 0x96aa80] Setting value '1/44100' for key 'time_base'
[abuffer source @ 0x96aa80] Setting value '44100' for key 'sample_rate'
[abuffer source @ 0x96aa80] Setting value 's16' for key 'sample_fmt'
[abuffer source @ 0x96aa80] Setting value '0x3' for key 'channel_layout'
[abuffer @ 0x9aa440] tb:1/44100 samplefmt:s16 samplerate: 44100 ch layout:0x3
[aformat filter @ 0x969d40] Setting value 's16' for key 'sample_fmts'
[aformat filter @ 0x969d40] Setting value '44100' for key 'sample_rates'
[aformat filter @ 0x969d40] Setting value '0x3' for key 'channel_layouts'
[abuffersink @ 0x968e00] auto-inserting filter 'auto-inserted fifo 0' between the filter 'audio format for output stream 0:0' and the filter 'output stream 0:0'
[abuffer @ 0x9aa440] Changing frame properties on the fly is not supported.
Input stream #0:0 frame changed from rate:32000 fmt:s16 ch:2 chl:stereo to rate:44100 fmt:s16 ch:2 chl:stereo
[abuffer source @ 0x96aa80] Setting value '1/44100' for key 'time_base'
[abuffer source @ 0x96aa80] Setting value '44100' for key 'sample_rate'
[abuffer source @ 0x96aa80] Setting value 's16' for key 'sample_fmt'
[abuffer source @ 0x96aa80] Setting value '0x3' for key 'channel_layout'
[abuffer @ 0x98e480] tb:1/44100 samplefmt:s16 samplerate: 44100 ch layout:0x3
[aformat filter @ 0x98a2a0] Setting value 's16' for key 'sample_fmts'
[aformat filter @ 0x98a2a0] Setting value '44100' for key 'sample_rates'
[aformat filter @ 0x98a2a0] Setting value '0x3' for key 'channel_layouts'
[abuffersink @ 0x968e00] auto-inserting filter 'auto-inserted fifo 0' between the filter 'audio format for output stream 0:0' and the filter 'output stream 0:0'
No more output streams to write to, finishing.
size=    5213kB time=30.00 bitrate=1423.4kbits/s    
video:0kB audio:5213kB global headers:0kB muxing overhead 0.002323%


