VobSubs are subtitles on DVDs, which are pictures overlaid on the video image.

archives: VobSub notes

Converting VobSubs is really hard.

First, extract them using transcode and subtitle2pgm (subtitleripper package):

tcextract -x ps1 -t vob -a 0x20 -i ../DC_Reader.vob | subtitle2pgm -o english -c 

You can find the right color codes to use by playing with the options, to make OCR easier. See http://www.bunkus.org/dvdripping4linux/en/separate/subtitles.html#subtitles for more details.

I *vaguely* recall having issues with newer (>0.45) versions of gocr, but it could just have been that it didn't fare any better.

Use pgm2txt to use OCR on the image files:

pgm2txt english

If you did the color conversion right, it should find most of them itself.