[libav-api] Synchronizing audio and video; meanings of pts

Ronald S. Bultje rsbultje at gmail.com
Mon Mar 21 20:32:45 CET 2011


Please don't cross-post. This belongs on -api, I think.

On Sat, Mar 19, 2011 at 4:33 PM, Jieyun Fu <jieyunfu at mit.edu> wrote:
> I have a potentially dumb question about synchronizing audio and video using
> libav. I saw this piece of code in output-example.c, looks like users sync
> audio and video up by comparing audio_pts and video_pts. I have a few
> questions:
> 1. what do audio_st->pts and video_st->pts stand for?

Presentation timestamp.

> 2. what do audio_pts and video_pts stand for, how is it different,  if we
> compute audio_pts and video_pts by doing

Basically the same thing, but in different time units: st->pts is
counted in the stream's timebase, while audio_pts and video_pts are in
seconds (see below).

> audio_pts = (double)audio_st->pts.val * audio_st->time_base.num /
> audio_st->time_base.den;
> video_pts = (double)video_st->pts.val * video_st->time_base.num /
> video_st->time_base.den;
> 3. I can understand video_st->time_base, which would directly relate to the
> frame rate. How about audio_st->time_base then? How is this defined?

Not always the framerate, though it can be. The timebase is the unit in
which the PTS values are expressed, in seconds per tick. This _can_ be
1/framerate, but also something else. E.g. if the source file defines
time in microseconds, the timebase is likely 1/1000000, which means a
PTS of 1 means "1 microsecond", 1*10^6 means "1 second", and so on. If
the timebase is 1/25 (i.e. 25 fps), a PTS of 25 means "1 second".
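
To make the arithmetic concrete, here is a minimal sketch of the
conversion your formula performs. The struct and function names are
made up for illustration (TimeBase is only a stand-in for AVRational),
so treat it as a sketch rather than as libav code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for libav's AVRational:
 * one tick lasts num/den seconds. */
struct TimeBase {
    int num;
    int den;
};

/* Convert a PTS counted in timebase ticks into seconds.
 * Same arithmetic as the audio_pts/video_pts formula quoted above. */
static double pts_to_seconds(int64_t pts, struct TimeBase tb)
{
    assert(tb.den != 0); /* a zero denominator is not a valid timebase */
    return (double)pts * tb.num / tb.den;
}
```

With a timebase of 1/25, pts_to_seconds(25, tb) gives 1.0; with
1/1000000, a PTS of 1000000 gives 1.0 as well. If I remember right,
libavutil's av_q2d() does the rational-to-double part of this for you.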

Typical audio timebases are 1/samplerate (1/44100, 1/48000), or the
same as the source file, e.g. 1/1000000, etc.
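
Since the audio and video streams can use different timebases, you can
only compare their PTS values after bringing them to a common unit. A
rough sketch of such a comparison without going through doubles follows.
Ratio and compare_ts are hypothetical names for this mail, and the code
assumes positive denominators and values small enough not to overflow
64 bits.

```c
#include <stdint.h>

/* Hypothetical stand-in for libav's AVRational:
 * one tick lasts num/den seconds. */
struct Ratio {
    int num;
    int den;
};

/* Compare two timestamps given in different timebases.
 * Returns -1 if a is earlier, 0 if both are equal, 1 if a is later.
 * Assumes den > 0 for both timebases and no 64-bit overflow. */
static int compare_ts(int64_t ts_a, struct Ratio tb_a,
                      int64_t ts_b, struct Ratio tb_b)
{
    /* ts_a * num_a / den_a  vs  ts_b * num_b / den_b,
     * cross-multiplied so that no division is needed. */
    int64_t a = ts_a * tb_a.num * (int64_t)tb_b.den;
    int64_t b = ts_b * tb_b.num * (int64_t)tb_a.den;
    return (a > b) - (a < b);
}
```

For example, an audio PTS of 24000 at 1/48000 (0.5 s) compares as
earlier than a video PTS of 25 at 1/25 (1 s), so the audio is behind
at that point. I believe libavutil's av_compare_ts() offers the same
kind of comparison with overflow handling, so prefer that if your
version has it.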

