VobSubs
VobSubs are subtitles on DVDs, which are pictures overlaid on the video image.
Extract VobSubs using mencoder. This will create dvd.idx and dvd.sub.
mencoder dvd://1 -ovc copy -oac copy -vobsubout dvd -vobsuboutindex 0 -sid 0 -o /dev/null
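A DVD often carries more than one subtitle stream. The sketch below only prints one mencoder command per stream, reusing the flags from the command above, so the commands can be reviewed before running them. The function name vobsub_commands and the stream count of 2 are assumptions for illustration. It also assumes that repeated runs append to the same dvd.idx and dvd.sub, so delete any old pair first.

```shell
#!/bin/sh
# Print one mencoder command per subtitle stream. Every stream is written
# into the same .idx/.sub pair under its own index.
# vobsub_commands is a hypothetical helper, not part of mencoder.
vobsub_commands() {
    title=$1   # DVD title number, e.g. 1
    out=$2     # output basename, e.g. dvd -> dvd.idx and dvd.sub
    count=$3   # number of subtitle streams to extract (assumed known)
    sid=0
    while [ "$sid" -lt "$count" ]; do
        # Stream N on the DVD is stored as index N in the output files.
        echo "mencoder dvd://$title -ovc copy -oac copy -vobsubout $out -vobsuboutindex $sid -sid $sid -o /dev/null"
        sid=$((sid + 1))
    done
}

# Dry run: show the commands for the first two streams of title 1.
vobsub_commands 1 dvd 2
```

Once the printed commands look right, they can be piped to sh to run them.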
Merge them all into a Matroska file:
mkvmerge -o dvd.mkv dvd.mp4 dvd.idx dvd.sub
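If the chosen subtitle stream does not exist on the disc, the extraction may leave empty or missing files behind. A minimal sketch of a guard around the merge, assuming the video and the subtitles share one basename as in the command above (have_vobsub and merge_with_subs are hypothetical helper names):

```shell
#!/bin/sh
# Succeeds only if both halves of the VobSub pair exist and are non-empty.
have_vobsub() {
    [ -s "$1.idx" ] && [ -s "$1.sub" ]
}

# Run the merge only when the subtitle files look usable.
merge_with_subs() {
    base=$1
    if have_vobsub "$base"; then
        mkvmerge -o "$base.mkv" "$base.mp4" "$base.idx" "$base.sub"
    else
        echo "$base.idx or $base.sub is missing or empty" >&2
        return 1
    fi
}
```

Call it as merge_with_subs dvd from the directory that holds dvd.mp4, dvd.idx and dvd.sub.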
archives: VobSub notes
Converting VobSubs to text subtitles is really hard, because the pictures have to be run through OCR.
First, extract them using transcode and subtitle2pgm (subtitleripper package):
tcextract -x ps1 -t vob -a 0x20 -i ../DC_Reader.vob | subtitle2pgm -o english -c 255,0,255,255
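The 0x20 passed to -a above is the stream ID of the first subtitle track. DVD subpicture streams use the IDs 0x20 through 0x3f, so track N is 0x20 + N. A small sketch for working out the value for another track (subtitle_stream_id is a hypothetical helper name):

```shell
#!/bin/sh
# Map a subtitle track number (0-31) to the stream ID passed to tcextract -a.
subtitle_stream_id() {
    track=$1
    if [ "$track" -lt 0 ] || [ "$track" -gt 31 ]; then
        echo "track must be between 0 and 31" >&2
        return 1
    fi
    # Subpicture stream IDs start at 0x20.
    printf '0x%02x\n' $((0x20 + track))
}

subtitle_stream_id 0   # first subtitle track
subtitle_stream_id 2   # third subtitle track
```

The result can be dropped straight into the pipeline, for example -a "$(subtitle_stream_id 2)" in place of -a 0x20.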
You can find the right color codes to use by playing with the options, to make OCR easier. See http://www.bunkus.org/dvdripping4linux/en/separate/subtitles.html#subtitles for more details.
I *vaguely* recall having issues with newer (>0.45) versions of gocr, but it could just have been that it didn't fare any better.
Run pgm2txt to OCR the image files:
pgm2txt english
If you did the color conversion right, it should recognize most of the text by itself.
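To see how well the OCR pass went, it can help to list the images that ended up without any recognized text. This is only a sketch: it assumes that the text for NAME.pgm is written to NAME.pgm.txt, which may differ between pgm2txt versions, and missing_ocr is a hypothetical helper name.

```shell
#!/bin/sh
# List image files that have no non-empty OCR result next to them.
# Assumption: the text for NAME.pgm lives in NAME.pgm.txt; adjust the
# suffix if your pgm2txt names its output differently.
missing_ocr() {
    base=$1
    for img in "$base"*.pgm; do
        [ -e "$img" ] || continue        # the glob matched nothing
        [ -s "$img.txt" ] || echo "$img" # no text file, or an empty one
    done
}

missing_ocr english
```

A long list here usually means the color codes need another try before running the OCR again.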