As I understand it, a video file is a binary file in which the frames are not stored contiguously. Seeking to an offset and reading the file frame by frame therefore shouldn't work unless you first load the entire file into memory and then step through it.

Yet with any video reading library, e.g. OpenCV, you can read the frames one by one by calling read repeatedly in a loop.

import cv2
video = cv2.VideoCapture(video_path)  # video_path: path to the video file
grabbed, frame = video.read()  # grabbed is False when no frame could be read

My question is: how do such libraries work internally to read frames one by one without loading the entire video file into memory? If the frames are not stored contiguously, a simple "seek to offset x and read size bytes" approach would not work.

Himanshuman

1 Answer

In most video formats, the frame data is in fact stored in chronological order, for many reasons:

  • video recording to file
  • video streaming
  • avoiding slow file seeking on HDDs
  • ...

In most video formats, most frames are stored as a difference from the previous frame (so-called inter frames). Since consecutive frames are usually less than 40 milliseconds apart, the difference is typically very small and can be stored with relatively few bits of (compressed) data.
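To make the idea concrete, here is a minimal sketch of difference coding, assuming each frame is just a short list of pixel values. The helper names `encode` and `decode` are made up for illustration; real codecs use motion compensation, transforms and entropy coding rather than plain subtraction.

```python
def encode(frames):
    """Keep the first frame whole, then store only per-pixel changes."""
    encoded = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append([c - p for p, c in zip(prev, cur)])
    return encoded


def decode(encoded):
    """Rebuild every frame by adding each difference to the previous frame."""
    frames = [list(encoded[0])]
    for delta in encoded[1:]:
        frames.append([p + d for p, d in zip(frames[-1], delta)])
    return frames


# Three tiny 4-pixel "frames" that barely change from one to the next.
frames = [[10, 10, 10, 10], [10, 11, 10, 10], [10, 11, 12, 10]]
encoded = encode(frames)
print(encoded)
```

The differences are mostly zeros, which is why they compress well. Note that `decode` has to start from the first frame, which is exactly the problem discussed next.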

That makes your question very important and valid: if each image is stored as a difference from the previous image, how can we jump back and forth? We would have to rebuild each frame one after another, starting at the very beginning.

The solution is simple: most compressed videos contain a series of key frames which are compressed independently, i.e. they do not rely on any previous frame. Their timestamps and file positions are stored in a lookup table, often at the end of the video file (where it can be written after encoding has finished). To go to a given time $t$, you jump to the most recent key frame at or before $t$ and decode the inter frames from there up to the desired frame.
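The seek procedure can be sketched as follows. This is a toy model, not how any particular library is implemented: each frame is reduced to a single number, `("K", value)` stands for a key frame, `("D", delta)` for an inter frame, and the names `stream`, `keyframe_index` and `seek` are hypothetical.

```python
from bisect import bisect_right

# Toy stream: key frames carry a full value, inter frames only a difference.
stream = [("K", 100), ("D", 1), ("D", 2), ("K", 200), ("D", -5), ("D", 3)]

# Lookup table of key frame positions (a real container stores file offsets
# and timestamps; here the list index plays both roles).
keyframe_index = [i for i, (kind, _) in enumerate(stream) if kind == "K"]


def seek(stream, keyframe_index, target):
    """Return (decoded value of frame `target`, number of frames decoded)."""
    # Most recent key frame at or before the target frame.
    start = keyframe_index[bisect_right(keyframe_index, target) - 1]
    value = stream[start][1]
    decoded = 1
    # Apply the inter frames between the key frame and the target.
    for _, delta in stream[start + 1 : target + 1]:
        value += delta
        decoded += 1
    return value, decoded


print(seek(stream, keyframe_index, 5))
```

Reaching frame 5 only requires decoding the key frame at position 3 and the two inter frames after it, not the whole stream. Sequential reading works the same way, except that the decoder simply keeps the previous frame around and applies the next difference on each call.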

DirkT