
Distributed Video Encoding

I’ve been struggling with 4K video encoding for Plex at home. It took me 16 days to encode Interstellar, only to realize that I had left Spanish subtitles enabled in Handbrake for some of the signage. I also realized that the H.265 format/container Handbrake was outputting crashed my Chromecasts. It would stream on one of my smart TVs, but not on the older one. Whatever the cause, waiting 16 days to find out that something isn’t working the way I intended isn’t exactly great. So I started looking at ways to encode video across multiple hosts and came across this project on GitHub: https://github.com/nergdron/dve.

This had promise, but it didn’t work for me out of the box. I could get it to encode video slices on a single host that I had already set up for encoding other videos, but expanding to multiple hosts resulted in a lot of failures. So I blew away my encoding VM and started fresh with CentOS 7 minimal, using static DHCP reservations to keep the hosts easy to manage. To make this work for 4K encodes, I needed to enable the epel-release, nux-dextop, and MKVToolNix repositories. nux-dextop provides the libraries that ffmpeg needs, and MKVToolNix provides an updated release of mkvmerge, which I use to slice the video. The other required tools are parallel, rsync, and ffmpeg.
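
Since every host needs the right tools on its PATH, a quick check can save a failed run. The following is a minimal sketch, not part of the original script: the function name `missing_tools` is made up for illustration, and the tool list is only the one named above.

```shell
#!/bin/bash
# Print each required command that is NOT found on PATH.
# Prints nothing when every command is available.
# (missing_tools is a hypothetical helper name, not part of dve.)
missing_tools() {
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}

# Example for the local host:
#   missing_tools ffmpeg mkvmerge parallel rsync
# Example for a remote host, assuming its login shell is bash:
#   ssh slave1 "$(declare -f missing_tools); missing_tools ffmpeg rsync"
```

An empty result means the host is ready; anything printed is a package that still has to be installed there.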


#!/bin/bash -e

set -e

# defaults for all configuration values
ENC="ffmpeg"
CRF=17
OPTS="-c:v libx265 -c:a libfdk_aac -vbr 1 -level 41 -crf ${CRF} -preset veryslow"
# This is to split out and copy attachment streams, like subtitles
# and fonts, so they only get copied once.
DATA_OPTS="-map 0 -c:s copy -c:t copy -c:d copy -vn -an"
SUFFIX="_new.mkv"
SERVERS="slave1,slave2,slave3,slave4"
LEN=300s
OUTDIR=`mktemp -d`
VERBOSE="error"
# override defaults in a ~/.dverc file
if [ -f ~/.dverc ]; then
source ~/.dverc
fi

function on_finish() {
echo "Cleaning up temporary working files"
cd "$CWD"
rm -rf "${OUTDIR}"/
echo "Finished cleaning"
}

function usage() {
cat << EOF
usage: $0 [options] filename
This script breaks a video file up into chunks and encodes them in parallel via SSH on
multiple hosts.
OPTIONS:
-h this help message.
-l comma separated list of hosts to use to encode. (default=${SERVERS})
-t rough length of individual video chunks, in seconds. (default=${LEN})
-o encoding options. (default=${OPTS})
-s output file suffix. (default=${SUFFIX})
-q video encoding quality, shortcut to use default encoding options with
a different CRF. (default=${CRF})
-v verbose job output. (default=false)
EOF
}

# check all required helper utils
function checkpaths() {
for cmd in parallel ffmpeg mkvmerge rsync; do
if ! command -v "$cmd" > /dev/null; then
echo "$cmd not found in local path."
exit 1
fi
done
}

while getopts "hl:t:o:s:q:v" OPTION; do
case $OPTION in
h)
usage
exit 1
;;
l)
SERVERS="$OPTARG"
;;
t)
LEN="$OPTARG"
;;
q)
CRF="$OPTARG"
OPTS="-c:v libx265 -c:a libfdk_aac -vbr 1 -level 41 -crf ${CRF} -preset veryslow"
;;
o)
OPTS="$OPTARG"
;;
s)
SUFFIX="$OPTARG"
;;
v)
VERBOSE="info"
;;
?)
usage
exit 1
;;
esac
done
shift $((OPTIND-1))

if [ $# -lt 1 ]; then
usage
exit 1
fi

CWD=`pwd`
trap on_finish EXIT

checkpaths

if ! mkdir -p "${OUTDIR}"; then
echo "Couldn't create temp chunk output dir ${OUTDIR}."
exit 1
fi

echo "Creating chunks to encode"
if [[ "$1" == *".AVI" || "$1" == *".avi" ]]; then
$ENC -fflags +genpts -i "$1" -map 0:a -map 0:v -codec copy -f segment -segment_time $LEN -segment_format matroska -v ${VERBOSE} "${OUTDIR}/chunk-%03d.orig"
else
mkvmerge --split $LEN -S "$1" -o "${OUTDIR}/chunk-%03d.orig"
fi

echo "Copying file metadata"
DATA_IN="-i data.enc -map 1"
${ENC} -y -v ${VERBOSE} -i "$1" ${DATA_OPTS} -f matroska "${OUTDIR}/data.enc" ||
DATA_IN=""
cd "$OUTDIR"

echo "Running parallel encoding jobs"
PAR_OPTS="--no-notice --gnu -j 1 -S ${SERVERS} --eta --retries 2 --nice 10"
PAR_OPTS="${PAR_OPTS} --workdir ... --transfer --return {.}.enc"
ENC_OPTS="-y -v ${VERBOSE} -i {} ${OPTS} -f matroska {.}.enc"

parallel ${PAR_OPTS} ${ENC} ${ENC_OPTS} ::: chunk-*.orig

echo "Combining chunks into final video file"
echo "ffconcat version 1.0" > concat.txt
for f in chunk-*.enc; do
echo "file $f" >> concat.txt
done
BASE=`basename "$1"`
OUTFILE="${CWD}"/"${BASE%.*}${SUFFIX}"
${ENC} -y -v ${VERBOSE} -f concat -i concat.txt ${DATA_IN} -map 0 -f matroska -c copy "${OUTFILE}"
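
The script sources ~/.dverc after setting its defaults, so per-user settings can live there instead of being passed on every run. Here is a sketch of what such a file might look like. All values are illustrative, and the host names are hypothetical. One thing to watch: OPTS is built from the default CRF before ~/.dverc is read, so changing CRF alone in this file has no effect unless OPTS is redefined as well.

```shell
# ~/.dverc: sourced by the script after its built-in defaults are set.
# Every value here is an example, not a recommendation.
CRF=20
# Redefine OPTS so the new CRF is actually used.
OPTS="-c:v libx265 -c:a libfdk_aac -vbr 1 -level 41 -crf ${CRF} -preset slow"
SERVERS="encoder1,encoder2"   # hypothetical host names
LEN=120s                      # shorter chunks than the 300s default
SUFFIX="_x265.mkv"
```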

This is what I’m currently using. It has no job checking and no error checking to confirm that the hosts are online. I need to look into parallel more to see how to check host status and job completion in case anything drops offline. Parallel uses rsync to copy files to and from the hosts defined in SERVERS and ssh to run commands on them. Each host runs its own locally installed copy of the executables, so ffmpeg has to be installed on every host.
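
As a starting point for the missing checks, here is a rough sketch of two helpers. I haven’t wired them into the script, and the names `reachable_hosts`, `probe_ssh`, and `failed_jobs` are made up for illustration. The first drops hosts that don’t answer over ssh before parallel ever sees them. The second reads a log written by GNU parallel’s `--joblog` option and lists the commands that exited non-zero. Parallel also has a `--filter-hosts` option that may cover the first case on its own.

```shell
#!/bin/bash
# Keep only the hosts for which the probe succeeds.
#   $1: comma-separated host list (same format as SERVERS)
#   $2: name of a probe command or function that takes a host as its argument
# Prints the surviving hosts as a comma-separated list.
reachable_hosts() {
    local list="$1" probe="$2"
    local host up=()
    local IFS=','            # split the list on commas, join the result with commas
    for host in $list; do
        if "$probe" "$host" < /dev/null > /dev/null 2>&1; then
            up+=("$host")
        fi
    done
    echo "${up[*]}"
}

# Probe a host by running a no-op command over ssh without prompting.
probe_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Print the command of every failed job in a GNU parallel --joblog file.
# The log is tab-separated; column 7 is Exitval and column 9 is Command.
failed_jobs() {
    awk -F '\t' 'NR > 1 && $7 != 0 { print $9 }' "$1"
}

# Possible use before the parallel call (jobs.log is an arbitrary file name):
#   SERVERS=$(reachable_hosts "$SERVERS" probe_ssh)
#   PAR_OPTS="${PAR_OPTS} --joblog jobs.log"
# and afterwards:
#   failed_jobs jobs.log
```

Passing the probe in by name keeps the filtering logic separate from ssh, so it can be tried out with a dummy probe before pointing it at real hosts.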

I’m pretty sure that this covers everything I’ve done to make this work. I’ll probably update this more as things go on because this was a first attempt at making it work.
