VobSubs
VobSubs are DVD subtitles. They are stored as pictures that are overlaid on the video image.
Extract VobSubs using mencoder. This creates subtitles.idx and subtitles.sub.
mencoder dvd://1 -ovc copy -oac copy -vobsubout subtitles -vobsuboutindex 0 -sid 0 -o /dev/null
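If a disc has several subtitle streams, the same command can be repeated with a different -sid and -vobsuboutindex for each stream so that they all end up in the same subtitles.idx/subtitles.sub pair. The following is only a sketch: the function name vobsub_cmds is made up for illustration, title 1 and the stream numbers 0, 1 and 2 are assumptions, and the commands are printed rather than executed so they can be checked first.

```shell
#!/bin/sh
# Dry run: print one mencoder command per subtitle stream.
# Assumptions: DVD title 1, output base name "subtitles",
# stream IDs (-sid) given as arguments by the caller.
vobsub_cmds() {
    index=0
    for sid in "$@"; do
        # Each stream gets its own index inside the shared .idx/.sub pair.
        echo "mencoder dvd://1 -ovc copy -oac copy -vobsubout subtitles -vobsuboutindex $index -sid $sid -o /dev/null"
        index=$((index + 1))
    done
}

vobsub_cmds 0 1 2
```

Once the printed commands look right, they can be pasted into a shell or piped to sh.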
archives: VobSub notes
Converting VobSubs to text is really hard, because the subtitles are pictures and have to be run through OCR.
First, extract them using transcode and subtitle2pgm (subtitleripper package):
tcextract -x ps1 -t vob -a 0x20 -i ../DC_Reader.vob | subtitle2pgm -o english -c 255,0,255,255
You can find the right color codes for the -c option by experimenting with the values; the goal is images that make OCR easier. See http://www.bunkus.org/dvdripping4linux/en/separate/subtitles.html#subtitles for more details.
I *vaguely* recall having issues with newer (>0.45) versions of gocr, but it could just have been that it didn't fare any better.
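The -a 0x20 in the tcextract command above selects the first subtitle stream. Assuming the usual DVD convention that subtitle streams are numbered upward from 0x20, the value for another stream can be worked out as below. The helper name subtitle_track_id is hypothetical and only serves this sketch.

```shell
#!/bin/sh
# Map a subtitle stream number (0, 1, 2, ...) to the hex track ID
# passed to tcextract -a.
# Assumption: subtitle streams start at 0x20 and count upward.
subtitle_track_id() {
    printf '0x%x\n' $((0x20 + $1))
}

subtitle_track_id 0   # first subtitle stream, prints 0x20
subtitle_track_id 2   # third subtitle stream, prints 0x22
```

The result would replace 0x20 in the tcextract call, for example -a 0x22 for the third stream.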
Run pgm2txt to OCR the image files:
pgm2txt english
If you did the color conversion right, it should recognize most of the characters itself.