

VobSubs

VobSubs are the subtitles found on DVDs. They are stored as pictures that are overlaid on the video image.

Extract the VobSubs with mencoder. This creates dvd.idx and dvd.sub:

mencoder dvd://1 -ovc copy -oac copy -vobsubout dvd -vobsuboutindex 0 -sid 0 -o /dev/null
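A DVD often carries several subtitle streams. The sketch below only prints the mencoder commands it would run, one per stream, so they can be checked first. It assumes that repeated passes with a different -vobsuboutindex append to the same dvd.idx and dvd.sub, and that the stream IDs you pass in match your disc. The function name vobsub_commands is made up for this example.

```shell
#!/bin/sh
# Print (do not run) one mencoder command per subtitle stream ID.
# Usage: vobsub_commands TITLE SID [SID...]
# Assumption: each pass writes its stream into dvd.idx/dvd.sub under its own index.
vobsub_commands() {
    title=$1
    shift
    index=0
    for sid in "$@"; do
        echo "mencoder dvd://$title -ovc copy -oac copy -vobsubout dvd -vobsuboutindex $index -sid $sid -o /dev/null"
        index=$((index + 1))
    done
}

# Example: title 1, subtitle stream IDs 0 and 1
vobsub_commands 1 0 1
```

Pipe the output to sh once the commands look right.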

Merge them all into a Matroska file:

mkvmerge -o dvd.mkv dvd.mp4 dvd.idx dvd.sub

Detecting VobSubs

Note that ffprobe (ffmpeg) and avprobe from libav 0.8.* will see the VobSubs, but later versions of avprobe (libav) will not.
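For a quick check with ffprobe, something like the following may help. It is a rough sketch: it assumes ffprobe reports VobSub streams under the codec name dvd_subtitle, and the helper names filter_vobsub and list_vobsub_streams are invented here. On raw VOB files ffprobe may need larger -probesize and -analyzeduration values before it finds any subtitle streams.

```shell
#!/bin/sh
# Keep only the stream indexes whose codec is dvd_subtitle.
# Input: "index,codec_name" lines, one per subtitle stream.
filter_vobsub() {
    awk -F, '$2 == "dvd_subtitle" { print $1 }'
}

# List the indexes of VobSub streams in a file (hypothetical helper).
list_vobsub_streams() {
    ffprobe -v error -select_streams s \
        -show_entries stream=index,codec_name \
        -of csv=p=0 "$1" | filter_vobsub
}
```

Calling list_vobsub_streams dvd.mkv prints one stream index per line, or nothing if no VobSub stream was detected.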

archives: VobSub notes

Converting VobSubs to text is really hard, because they are pictures and have to go through OCR.

First, extract them using transcode and subtitle2pgm (subtitleripper package):

tcextract -x ps1 -t vob -a 0x20 -i ../DC_Reader.vob | subtitle2pgm -o english -c 255,0,255,255

You can find the right color codes to use by playing with the options, to make OCR easier. See http://www.bunkus.org/dvdripping4linux/en/separate/subtitles.html#subtitles for more details.

I *vaguely* recall having issues with newer (>0.45) versions of gocr, but it could just have been that it didn't fare any better.

Run OCR on the image files with pgm2txt:

pgm2txt english

If you got the color conversion right, it should recognize most of the text by itself.

