subtitle searcher | Winfred van Kuijk

subtitle searcher

By Winfred van Kuijk • Published: January 6, 2014 • Last updated: November 26, 2014 • Filed in: Software

A script that helps automate searching for and adding subtitles to movies and TV shows. It uses subliminal to search for and download the subtitles, and MP4Box to add them to the video. The script ties the two together: it gives you a command line interface (CLI) to add subtitles for all video files in a certain folder, for all video files recursively, or for a specific file.



See requirements, usage and examples or the source code of the script.

example


$ subtitleSearcher -f "Elephants.Dream (2006).m4v"
---------------------------------------------------
processing: ./Elephants.Dream (2006).m4v
- missing: en:English nl:Dutch
INFO: Listing subtitles for
INFO: Found 2 subtitles total
INFO: Downloading subtitle with score 20
INFO: Downloading subtitle with score 20
INFO: Saving to u'./Elephants.Dream (2006).en.srt'
INFO: Saving to u'./Elephants.Dream (2006).nl.srt'
2 subtitles downloaded
- available on disk: en:English nl:Dutch
- setting comment: "subtitles: EN NL"
- adding subtitles (en:English nl:Dutch) to ./Elephants.Dream (2006).m4v
Timed Text (SRT) import - text track 1024 x 576, font Serif (size 18)
Timed Text (SRT) import - text track 1024 x 576, font Serif (size 18)
Setting up iTunes/iPod file...
Saving ./Elephants.Dream (2006).m4v: 0.500 secs Interleaving
---------------------------------------------------

summary of added subtitles:
- ./Elephants.Dream (2006).m4v: en:English nl:Dutch

requirements

The subtitle search script depends on:

  • subliminal
    • a Python script to download subtitles from various sources
    • on my Mac I install it with: sudo easy_install subliminal
  • MP4Box
    • a toolbox for (MP4) video files; in our case it is used to add subtitle files and update the iTunes comment
    • on my Mac I install it with homebrew, a super simple package installer. With homebrew installed, you install MP4Box with: brew install mp4box

Note: the subtitleSearcher script is only as good as these two tools are. It depends on subliminal to search for and download the subtitles, and on MP4Box to add them. You will still need other tools to automatically add metadata (e.g. iFlicks) or sync the subtitles (e.g. iSubtitle).
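
Before running the script it can help to confirm that both tools are installed. The script itself checks fixed paths under /usr/local/bin; the sketch below is an alternative that looks on the PATH instead. The helper name missing_tools is made up for this example, and the command names are assumed to match the defaults used in the script.

```shell
# Report which of the given commands cannot be found on the PATH.
# missing_tools is a hypothetical helper, not part of subtitleSearcher.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # strip the leading space before printing
    echo "${missing# }"
}

# Assumed command names, taken from the default locations in the script.
result=$(missing_tools subliminal mp4box)
if [ -n "$result" ]; then
    echo "not found on PATH: $result"
else
    echo "all tools found"
fi
```

If a tool is installed somewhere other than /usr/local/bin, the script will still report it as missing until the corresponding location setting is edited.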

installation

Download and unzip the script, save it where you like, e.g. in /usr/local/bin.
If the script does not run, make sure it has execution permissions (chmod +x scriptname).
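It was created on a Mac. It should work on most Linux-type systems, although the file search relies on BSD find options (find -E and -depth 1) that GNU find does not accept, so that line may need adjusting there.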
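
The installation steps can be sketched as a small helper. This is only an illustration: install_script is a made-up name, and the file name and target folder in the usage note are assumptions based on the text above.

```shell
# Copy a script into a bin folder and make it executable.
# install_script is a hypothetical helper for illustration only.
install_script() {
    src=$1
    dest_dir=$2
    mkdir -p "$dest_dir" &&
    cp "$src" "$dest_dir/" &&
    chmod 755 "$dest_dir/$(basename "$src")"
}
```

Assuming the unzipped file is called subtitleSearcher and sits in the current folder, the call would be: install_script ./subtitleSearcher /usr/local/bin (writing to /usr/local/bin may require sudo).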

usage


subtitleSearcher [-d directory] [-f file] [-r] [-c|s|m] [-b] [-k] [-u] [-h]
-d : what directory to search in (default: current directory)
-f : what video file to process
-r : recursive search (default: only single directory)
-c : check only - will only check what subtitles are missing
-s : subliminal only - will download subtitles, but not add them to the video
-m : mp4box only - will skip subliminal, only add already existing srt subtitle files
-b : create backup of original video file
-k : keep the srt files (default: move the srt files after processing)
-u : update comment only
-h : display this usage summary

configuration

Edit the script if you would like to change the following settings:

  1. subliminal location: default is /usr/local/bin/subliminal
  2. MP4Box location: default is /usr/local/bin/mp4box
  3. languages: default en:English nl:Dutch (code is needed for subliminal, name is needed for mp4box)
  4. formats: default mp4, m4v, mkv, avi
  5. srt archive folder: default /var/tmp; after processing the srt files will be moved here
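
Each entry in the languages setting pairs a language code with a language name, separated by a colon. The script slices the entry by position (the first two characters are the code, everything from the fourth character on is the name), so codes are expected to be two letters long. The sketch below shows the same split done on the colon instead, which is an alternative and not what the script does. The helper names and the de:German entry are hypothetical.

```shell
# Split a "code:Name" entry on the colon.
# lang_code and lang_name are hypothetical helpers for illustration.
lang_code() { echo "${1%%:*}"; }   # part before the colon, the code subliminal needs
lang_name() { echo "${1#*:}"; }    # part after the colon, the name matched against MP4Box output

# de:German is an example of an added entry, not one of the script's defaults.
for lang in en:English nl:Dutch de:German; do
    echo "$(lang_code "$lang") -> $(lang_name "$lang")"
done
```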

examples

  1. subtitleSearcher
    Search and add subtitles for all video files in the current folder
  2. subtitleSearcher -c
    Check the video files in the current folder, show what subtitles are in the video and already on disk.
  3. subtitleSearcher -u
    Update the comments (if applicable) for the videos in the current folder, so that each comment lists the subtitles in the video. Example: “subtitles: EN NL”
  4. subtitleSearcher -s -f Just.Do.It.2011.720p.x264-VODO.mkv
    Download the subtitle for the specific video Just Do It. Only the srt file is downloaded; it is not added to the video.
  5. subtitleSearcher -r -m
    In the current directory and subdirectories add the subtitles that have already been downloaded (with the -s option and/or manually). Afterwards the subtitle files will be archived.
  6. subtitleSearcher -r -k -b
    Recursively search for and add subtitles, make a backup of the original video file, and keep the subtitle files in the same folder as the video after processing.

source code

#!/bin/bash

# Winfred van Kuijk - winfred@vankuijk.net - http://winfred.vankuijk.net/subtitle-searcher/
version="v1.0 - Jan 6, 2014 - initial release"
#
# script to check/load/add subtitles
#
# depends on:
# - subliminal (http://subliminal.readthedocs.org/en/latest/)
#   to install: pip install subliminal
# - mp4box (http://gpac.wp.mines-telecom.fr/mp4box/)
#   to install on Mac OS X, use Homebrew: brew install mp4box

###############################################################
languages=(en:English nl:Dutch) # language code for subliminal, language name for mp4box
searchFormats=(mp4 m4v mkv avi) # what formats to search for
mp4boxFormats=(mp4 m4v) # formats supported by MP4Box for adding subtitles

SUBLIMINAL="/usr/local/bin/subliminal"
MP4BOX="/usr/local/bin/mp4box"
srtArchive="/var/tmp" # when done, move the srt files to a folder (or comment out to remove)
###############################################################

# --- check if mp4box and subliminal exist
for i in "$SUBLIMINAL" "$MP4BOX"; do
	if [ ! -f "$i" ]; then echo "$i not found. Exiting."; exit 1; fi
done

#--------------------------------------------------
# read arguments
usage() { 
cat << EOF 1>&2
script to check/load/add subtitles - using subliminal and mp4box
$version
usage: 
${0##*/} [-d directory] [-f file] [-r] [-c|s|m] [-b] [-k] [-u] [-h]
	-d : what directory  to search in (default: current directory)
	-f : what video file to process
	-r : recursive search (default: only single directory)
	-c : check only - will only check what subtitles are missing
	-s : subliminal only - will download subtitles, but not add them to the video
	-m : mp4box only - will skip subliminal, only add already existing srt subtitle files
	-b : create backup of original video file
	-k : keep the srt files (default: move the srt files after processing)
	-u : update comment only
	-h : display this usage summary
	
edit script to change settings for:
- subliminal location: "$SUBLIMINAL"
- mp4box location    : "$MP4BOX"
- which languages    : "${languages[@]}"
- which video formats: "${searchFormats[@]}"
- srt archive folder : "$srtArchive"
EOF

exit 1
}

depth="-depth 1"
dir=`pwd`
while getopts "d:f:rcsmbkuh" o; do
    case "${o}" in
        d)
            dir=${OPTARG}
            ;;
        f)
            singleFile=${OPTARG}
            ;;
        r)
            depth=""
            ;;
        c)
            checkOnly=1
            ;;
        s)
            subliminalOnly=1
            unset checkOnly
            ;;
        m)
            mp4boxOnly=1
            unset subliminalOnly
            unset checkOnly
            ;;
        b)
            createBackup=1
            ;;
        k)
            keepSrt=1
            ;;
        u)
            updateComment=1
            ;;
        h)
            usage
            ;;
        ?)
            usage
            ;;
        *)
            usage
            ;;
    esac
done
shift $((OPTIND-1))
[ $# -eq 0 ] || usage # for now: ignore regular arguments
if [ ! -d "${dir}" ]; then echo "directory not found: ${dir}. Exiting."; exit 1; fi
if [ ! -z "${singleFile}" ] && [ ! -f "${singleFile}" ]; then echo "file not found: ${singleFile}. Exiting."; exit 1; fi
if [ ! -z "${singleFile}" ]; then
	depth="-depth 1"
	#dir=$(dirname ${singleFile})
	dir="${singleFile%/*}" # extract the path from the file
	singleFile="${singleFile##*/}" # strip the path
	[ "$dir" == "$singleFile" ] && dir="." # in case there was no path
fi

srtArchive="${srtArchive/\~/${HOME}}" # manually replace ~ with ${HOME}

#--------------------------------------------------
# put the search formats in the right format for "find"
regexp=$(printf "|%s" "${searchFormats[@]}")
regexp=".*\.(${regexp:1})$"
parameters="-regex ${regexp}"
name="*"

if [ ! -z "${singleFile}" ]; then
	name="${singleFile}"
	parameters=""
fi

# go to the root directory
cd "$dir"

declare -a summary

#--------------------------------------------------
indexOf() {
	local i needle=$1 haystack
	shift
	haystack=("$@")
	index=-1;
	for i in "${!haystack[@]}"; do
		#if [[ "${haystack[$i]}" == *"${needle}"* ]]; then index=$i; fi
		if [[ "${haystack[$i]}" =~ "${needle}" ]]; then index=$i; fi
	done
}

#--------------------------------------------------
createComment() {
	local lang=("$@") langCode
	if [ -z "$lang" ]; then return; fi # nothing to list: leave the function (continue is only meaningful inside a loop)
	itunesComment="subtitles:"
	for i in "${lang[@]}"; do
		langCode=${i:0:2}
		langCode=`echo ${langCode} | tr '[:lower:]' '[:upper:]'`
		itunesComment+=" ${langCode}"
	done
}

#--------------------------------------------------
# for each video
while IFS= read -d $'\0' -r video ; do
	echo "---------------------------------------------------"
	echo "processing: $video"
	videoSubtitles=()
	missingSubtitles=()
	foundSubtitles=()
	
	# get info about the video file
	mp4boxInfo=$($MP4BOX -info "${video}")

	# check which subtitle(s) already exist
	#availableSubtitles=`echo "${mp4boxInfo}" | grep sbtl | grep Language | cut -d" " -f4 | sort -u` # more precise
	availableSubtitles=`echo "${mp4boxInfo}" | grep sbtl | grep Language` # does the job and is faster
	unset commentCanBeReplaced
	currentComment=`echo "${mp4boxInfo}" | grep "\tComment:"`
	currentComment=${currentComment#*: } # strip the prefix " Comment: "
	if [ -z "$currentComment" ] || [[ "$currentComment" =~ "courtesy of" ]] || [[ "$currentComment" =~ "subtitles" ]]; then
		commentCanBeReplaced=1
	fi
	
	filebase="${video%.*}"
	extension="${video##*.}"

	# determine what subtitles are missing
	for i in "${!languages[@]}"
	do
		lang=${languages[$i]}
		langCode=${lang:0:2}
		langName=${lang:3}
		langNameLower=`echo ${langName} | tr '[:upper:]' '[:lower:]'`
		indexOf $langName $availableSubtitles
		if [ $index -gt -1 ]; then videoSubtitles+=($lang); fi
		if [ $index -lt 0 ] && [ -e "${filebase}.${langNameLower}.srt" ]; then mv "${filebase}.${langNameLower}.srt" "${filebase}.${langCode}.srt"; fi #YIFY subtitles format uses language name
		if [ $index -lt 0 ] && [ -e "${srtArchive}/${filebase}.${langCode}.srt" ]; then mv "${srtArchive}/${filebase}.${langCode}.srt" .; fi

		if [ $index -lt 0 ] && [ ! -e "${filebase}.${langCode}.srt" ]; then missingSubtitles+=($lang); fi
		if [ $index -lt 0 ] && [ -e "${filebase}.${langCode}.srt" ]; then foundSubtitles+=($lang); fi
	done

	if [ ! -z ${videoSubtitles} ]; then
		echo "- already in video: ${videoSubtitles[@]}"		
	fi # ! -z $videoSubtitles
	
	if [ ! -z $checkOnly ] && [ ! -z "${currentComment}" ]; then
		echo "- current comment: \"${currentComment}\""
	fi
			
	# -u flag is used to only update the iTunes comment metadata
	if [ ! -z $updateComment ]; then
		if [ -z $videoSubtitles ]; then
			echo "- comment not updated: no subtitles"
			continue		
		fi # ! -z $videoSubtitles
	
		if [ ! -z $commentCanBeReplaced ]; then
			createComment "${videoSubtitles[@]}"
			if [[ "$itunesComment" != "$currentComment" ]]; then
				parameters=("-itags")
				parameters+=("comment=${itunesComment}")			
				parameters+=("${video}")
				echo "- setting comment: \"${itunesComment}\""
				$MP4BOX "${parameters[@]}"
			else
				echo "- comment not updated: no change"
			fi # $itunesComment != $currentComment
		else
			echo "- comment not updated: not safe to replace (\"$currentComment\")"
		fi # ! -z $commentCanBeReplaced
		continue		
	fi # ! -z $updateComment
	
	# for missing subtitle(s): retrieve subtitle(s)
	if [ ! -z $missingSubtitles ] && [ -z $mp4boxOnly ]; then
		[ -z $foundSubtitles ] || echo "- already on disk (before subliminal): ${foundSubtitles[@]}"
		echo "- missing: ${missingSubtitles[@]}"
		
		if [ ! -z $checkOnly ]; then continue; fi

		langCodes=""
		for i in "${missingSubtitles[@]}"; do
			langCodes+=" ${i:0:2}"
		done
	
		# let subliminal search for the missing subtitles
		$SUBLIMINAL -l ${langCodes[@]} -- "$video"
		
		# check which subtitles are now available
		for i in "${missingSubtitles[@]}"; do
			langCode=${i:0:2}
			if [ -e "${filebase}.${langCode}.srt" ]; then foundSubtitles+=($i); fi
		done
	fi

	# for available subtitles: add them to the video
	if [ ! -z $foundSubtitles ]; then
		echo "- available on disk: ${foundSubtitles[@]}"

		if [ ! -z $checkOnly ] || [ ! -z $subliminalOnly ]; then continue; fi
		indexOf $extension "${mp4boxFormats[@]}"
		if [ $index -lt 0 ]; then echo "video format ($extension) not supported by MP4Box (${mp4boxFormats[@]})"; continue; fi

		#create parameters for mp4box
		parameters=() # put all parameters in array to avoid issues with spaces in filenames
		for i in "${foundSubtitles[@]}"; do
			langCode=${i:0:2}
			srtFile="${filebase}.${langCode}.srt"
			parameters+=("-add" "${srtFile}:lang=${langCode}:hdlr=sbtl:group=2")
		done

		# add parameters for adding comment
		if [ ! -z $commentCanBeReplaced ]; then	
			createComment "${videoSubtitles[@]}" "${foundSubtitles[@]}"
			if [[ "$itunesComment" != "$currentComment" ]]; then
				parameters+=("-itags")
				parameters+=("comment=${itunesComment}")			
				echo "- setting comment: \"${itunesComment}\""
			else
				echo "- comment not updated: no change"
			fi # $itunesComment != $currentComment
		else
			echo "- comment not updated: not safe to replace (\"$currentComment\")"
		fi # ! -z $commentCanBeReplaced 

		echo "- adding subtitles (${foundSubtitles[@]}) to $video"
		if [ ! -z $createBackup ]; then
			mv "$video" "${video}.bak" # quoted so filenames with spaces survive
			parameters+=("${video}.bak" "-out" "${video}")
		else 
			parameters+=("${video}")
		fi

		$MP4BOX "${parameters[@]}"
		subtitles=$(printf " %s" "${foundSubtitles[@]}")
		summary+=("$video:$subtitles");

		# move/remove the srt files after processing
		if [ -z $keepSrt ]; then
			for i in "${foundSubtitles[@]}"; do
				langCode=${i:0:2}
				if [ ! -z "$srtArchive" ]; then
					mv "${filebase}.${langCode}.srt" "$srtArchive"
				else
					\rm -f "${filebase}.${langCode}.srt"
				fi
			done
		fi 	
	fi

done < <(find -E . -name "$name" $parameters $depth -print0)  # for each video

if [ ${#summary[@]} -eq 0 ]; then summary+=("no subtitles were added"); fi

echo "---------------------------------------------------"
echo " "
echo "summary of added subtitles:"
for i in "${summary[@]}"; do
	echo "-"  "${i}"
done
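
The file search in the script uses find -E and -depth 1, which are BSD find options. As a rough sketch of how the same search could be written so that it also runs with GNU find, the function below uses -maxdepth and -name patterns instead of an extended regular expression. The name list_videos is made up for this example, the four formats are the script's defaults, and it prints one path per line, whereas the script uses -print0 to cope with unusual filenames.

```shell
# List video files in a folder, optionally descending into subfolders.
# list_videos is a hypothetical helper, not part of subtitleSearcher.
#   $1: folder to search
#   $2: pass "recursive" to include subfolders
list_videos() {
    if [ "$2" = "recursive" ]; then
        find "$1" -type f \
            \( -name '*.mp4' -o -name '*.m4v' -o -name '*.mkv' -o -name '*.avi' \) -print
    else
        # -maxdepth 1 limits the search to the folder itself
        find "$1" -maxdepth 1 -type f \
            \( -name '*.mp4' -o -name '*.m4v' -o -name '*.mkv' -o -name '*.avi' \) -print
    fi
}
```

Calling list_videos . lists the videos in the current folder, and list_videos . recursive includes subfolders, mirroring the default behaviour and the -r option.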
