VobSubs
VobSubs are subtitles on DVDs, which are pictures overlaid on the video image.
Extract VobSubs using mencoder. This will create dvd.idx and dvd.sub.
mencoder dvd://1 -ovc copy -oac copy -vobsubout dvd -vobsuboutindex 0 -sid 0 -o /dev/null
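A disc often carries more than one subtitle stream. As a rough sketch, the command above can be repeated once per stream, giving each pass its own -vobsuboutindex so that (as far as I know) the streams end up in the same dvd.idx / dvd.sub pair. The function name extract_vobsubs and the DRY_RUN switch are made up for illustration; by default the function only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: one mencoder pass per subtitle stream ID.
# Usage: extract_vobsubs TITLE BASENAME SID...
# With DRY_RUN unset or set to 1 the commands are only printed;
# DRY_RUN=0 actually runs them.
extract_vobsubs() {
    title=$1
    base=$2
    shift 2
    if [ "${DRY_RUN:-1}" = 1 ]; then run=echo; else run=; fi
    index=0
    for sid in "$@"; do
        # Same flags as the single-stream command; only index and sid change.
        $run mencoder "dvd://$title" -ovc copy -oac copy \
            -vobsubout "$base" -vobsuboutindex "$index" -sid "$sid" -o /dev/null
        index=$((index + 1))
    done
}
```

For example, extract_vobsubs 1 dvd 0 1 prints two mencoder lines, one for stream 0 and one for stream 1. Remove any stale dvd.idx and dvd.sub before running it for real with DRY_RUN=0.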
Merge the subtitle files and the encoded video (dvd.mp4 here) into a Matroska file:
mkvmerge -o dvd.mkv dvd.mp4 dvd.idx dvd.sub
Detecting VobSubs
Note that ffmpeg and libav v0.8.* will see the VobSub and closed-captioning streams with ffprobe or avprobe, but later versions of avprobe (libav) will not.
$ ffprobe dvd_track_02.vob
ffprobe version 3.3.3 Copyright (c) 2007-2017 the FFmpeg developers
  built with gcc 4.9.4 (Gentoo 4.9.4 p1.0, pie-0.6.4)
  configuration: --prefix=/usr/local/ffmpeg
  libavutil      55. 58.100 / 55. 58.100
  libavcodec     57. 89.100 / 57. 89.100
  libavformat    57. 71.100 / 57. 71.100
  libavdevice    57.  6.100 / 57.  6.100
  libavfilter     6. 82.100 /  6. 82.100
  libswscale      4.  6.100 /  4.  6.100
  libswresample   2.  7.100 /  2.  7.100
Input #0, mpeg, from 'dvd_track_02.vob':
  Duration: 00:06:29.73, start: 441.272633, bitrate: 5630 kb/s
    Stream #0:0[0x1bf]: Data: dvd_nav_packet
    Stream #0:1[0x1e0]: Video: mpeg2video (Main), yuv420p(tv, smpte170m, bottom first), 720x480 [SAR 8:9 DAR 4:3], Closed Captions, 29.97 fps, 59.94 tbr, 90k tbn, 59.94 tbc
    Stream #0:2[0x80]: Audio: ac3, 48000 Hz, mono, fltp, 192 kb/s
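Reading that listing by eye gets tedious across many VOB files. The sketch below is one possible way to summarize it: it scans ffprobe's stream listing for a subtitle stream and for the "Closed Captions" tag on the video stream. The function name report_subs is hypothetical, and the codec labels it matches (dvd_subtitle, and dvdsub for older builds) are an assumption about how your build names VobSub streams.

```shell
#!/bin/sh
# Hypothetical helper: read an ffprobe stream listing on stdin and
# report whether it mentions VobSub or closed-caption data.
report_subs() {
    listing=$(cat)
    # Assumption: VobSub streams show up as "Subtitle: dvd_subtitle"
    # (or "Subtitle: dvdsub" on older builds).
    if printf '%s\n' "$listing" | grep -Eq 'Subtitle: (dvd_subtitle|dvdsub)'; then
        echo "vobsub: yes"
    else
        echo "vobsub: no"
    fi
    # Closed captions are flagged on the video stream line itself.
    if printf '%s\n' "$listing" | grep -q 'Closed Captions'; then
        echo "closed captions: yes"
    else
        echo "closed captions: no"
    fi
}
```

ffprobe writes the listing to stderr, so a typical call would be ffprobe -hide_banner dvd_track_02.vob 2>&1 | report_subs.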
archives: VobSub notes
Converting VobSubs to text is really hard.
First, extract them using transcode and subtitle2pgm (subtitleripper package):
tcextract -x ps1 -t vob -a 0x20 -i ../DC_Reader.vob | subtitle2pgm -o english -c 255,0,255,255
To make OCR easier, experiment with the -c values until you find color codes that give clean images. See http://www.bunkus.org/dvdripping4linux/en/separate/subtitles.html#subtitles for more details.
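Trying color codes one at a time is slow, so here is a minimal sketch that repeats the extraction for several -c values and writes each attempt under its own output prefix for side-by-side comparison. The function name try_color_maps, the DRY_RUN switch and the english_... prefix scheme are invented for illustration; the tcextract and subtitle2pgm flags are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper: run the extraction once per candidate color map.
# Usage: try_color_maps VOB MAP...
# With DRY_RUN unset or set to 1 the pipelines are only printed;
# DRY_RUN=0 actually runs them.
try_color_maps() {
    vob=$1
    shift
    for map in "$@"; do
        # Turn "255,0,255,255" into "english_255-0-255-255".
        out="english_$(printf '%s' "$map" | tr ',' '-')"
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "tcextract -x ps1 -t vob -a 0x20 -i $vob | subtitle2pgm -o $out -c $map"
        else
            tcextract -x ps1 -t vob -a 0x20 -i "$vob" | subtitle2pgm -o "$out" -c "$map"
        fi
    done
}
```

For example, try_color_maps ../DC_Reader.vob 255,0,255,255 0,255,255,255 prints two pipelines. The second map is only an arbitrary value to compare against, not a recommendation.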
I *vaguely* recall having issues with newer (>0.45) versions of gocr, but it could just have been that it didn't fare any better.
Run pgm2txt to OCR the image files:
pgm2txt english
If you did the color conversion right, it should recognize most of the text by itself.