How To: Convert Hard Subs to Soft Subs

This is a companion piece to the video I have done here.

This is a brief tutorial on how to convert hard subs to soft subs. Hard subs are subtitles burned into the picture of a particular video release rather than split out into their own subtitle file. They were common on older releases but are mostly unheard of today. That said, these old hard subs may be the only translation available, and you may want to convert them to soft subs so they can be used with a higher quality release.

Tools of the Trade

  1. VideoSubFinder:
    • This will be used to extract all the subs from the hard sub source and compile a final subtitle file.
  2. gImageReader:
    • This will take the extracted images and convert them to text files.
  3. Aegisub:
    • This will allow you to edit the subtitle file and adjust its timings.

Step by Step Process

1. Load your hard-subbed video into VideoSubFinder via File -> Open Video (FFMPEG).

2. Adjust the bar so that the search area covers every place the hard subs appear in the video.

3. Start the search. This will take a few minutes.

4. When the search is done go to the RGBImages folder: “VideoSubFinder\RGBImages”, and remove every image that does not have a subtitle in it. You will get a number of false positives. You may have luck adjusting the settings VideoSubFinder uses to find subs (under the Settings tab at the bottom), but I fiddled with them a lot and got nowhere.

5. Go to the OCR tab and click “Create Cleared TXTImages.” This will convert every image in RGBImages to a cleared image file that contains just the subtitle text on a white background. This will take a few minutes.

6. Open up gImageReader and load the VideoSubFinder\TXTImages directory.

7. Under “Recognize all English”, choose “Batch mode…” to convert all the images to text files.

8. You will get a number of text files with the same names as the image files in TXTImages. Copy these over to VideoSubFinder\TXTResults (or figure out how to set the output directory in gImageReader).

9. Go back to VideoSubFinder and in the OCR tab click “Create Sub From TXTResults” and save the resulting subtitle file.

10. Edit the sub in Notepad++ or Aegisub. If you are lucky, your new source will be an exact match to the old source. If not, you will have to adjust the sync in Aegisub (via Timing -> Shift Times).
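
The copy in step 8 can get tedious with hundreds of files, so here is a minimal Python sketch that does it for you. It assumes gImageReader wrote its .txt files next to the images in TXTImages, as described in step 8. The function name copy_ocr_text is my own and is not part of either tool.

```python
import shutil
from pathlib import Path


def copy_ocr_text(src_dir, dst_dir):
    """Copy every .txt file in src_dir into dst_dir and return the copied names."""
    src = Path(src_dir)
    dst = Path(dst_dir)
    # Create the destination folder if it is not there yet.
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for txt_file in sorted(src.glob("*.txt")):
        # Keep the file name unchanged so it still matches its image.
        shutil.copy2(txt_file, dst / txt_file.name)
        copied.append(txt_file.name)
    return copied


# Example call (paths are assumptions, adjust to where VideoSubFinder lives):
# copy_ocr_text(r"VideoSubFinder\TXTImages", r"VideoSubFinder\TXTResults")
```

Only the .txt files are copied; the images stay where they are.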
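
If you would rather script the sync fix from step 10 than use Aegisub, the same kind of fixed shift can be sketched in Python. This is only a rough illustration and it assumes the subtitle file you saved is in SRT format, with timing lines like "00:00:01,500 --> 00:00:03,000". The name shift_srt is made up for this example.

```python
import re

# Matches one SRT timestamp such as 00:01:02,345
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt(text, offset_ms):
    """Shift every timing line in SRT text by offset_ms (negative = earlier)."""

    def shift(match):
        h, m, s, ms = (int(g) for g in match.groups())
        total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
        total = max(total, 0)  # do not let a timestamp go below zero
        h, rest = divmod(total, 3_600_000)
        m, rest = divmod(rest, 60_000)
        s, ms = divmod(rest, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    shifted = []
    for line in text.splitlines(keepends=True):
        # Only touch timing lines, so dialogue containing digits is left alone.
        if "-->" in line:
            line = TIMESTAMP.sub(shift, line)
        shifted.append(line)
    return "".join(shifted)


sample = "1\n00:00:01,500 --> 00:00:03,000\nHello\n"
print(shift_srt(sample, 2500))
```

A constant shift like this only helps when the two sources differ by a fixed delay. If the drift changes over the course of the video, you will still need to fix it by hand in Aegisub.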