Your graphics card sucks at making video.

This post addresses a question that comes up often: can a graphics card be used to accelerate making videos? It was originally written for the forums of XSplit, a program used to broadcast yourself or your games over the internet.

As we know, XSplit is a high-performance program which, in order to do its job, has to do many things at once: capture audio from many different sources, take user input, and stream it all to the web.

The most intensive parts are capturing the video and then converting it into a usable format (encoding).

Currently XSplit can use tools such as Gamesource and DXTory to capture the video using the other powerful part of a computer, the graphics card (GPU). This helps reduce the CPU usage, bottlenecking and frame rate drops that a computer would otherwise suffer.

At this point, it is natural to wonder whether the encoding process itself can be sped up in the same way, so that XSplit has even less of an impact.

There are many options available for this. Just as there are languages for programming a CPU, there are languages and specifications for running general-purpose programs on a GPU (known as GPGPU), including OpenCL and CUDA.

Unfortunately there is something getting in the way – lies.

The encoder XSplit uses is provided by the very talented and clever people working on the open source x264 project. Over the years many people have contributed code, added features and optimized it to provide the best quality available when you create your video. If you have used VideoLAN's VLC player to convert video to H.264, it used x264 for the encoding. If you have used HandBrake to convert videos into a friendlier format for your mobile device, you can thank x264 for that as well. The configuration options and tweaks available are a labyrinth, but a well-documented one. Mastery of these settings will get you the absolute best quality at both low and high bitrates (the bitrate determines how much storage or bandwidth the video will end up taking).
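To make the link between bitrate and storage or bandwidth concrete, here is a minimal sketch of the arithmetic. The function name and the 2500 kbps figure are my own illustrative assumptions, not XSplit or x264 settings, and it assumes a constant bitrate with no container overhead.

```python
def stream_size_megabytes(bitrate_kbps: float, duration_seconds: float) -> float:
    """Approximate size, in megabytes (10^6 bytes), of a constant-bitrate stream."""
    total_bits = bitrate_kbps * 1000 * duration_seconds  # kilobits/s -> bits
    return total_bits / 8 / 1_000_000                    # bits -> bytes -> megabytes


if __name__ == "__main__":
    # Hypothetical example: one hour of video at 2500 kbps.
    print(f"{stream_size_megabytes(2500, 3600):.0f} MB")
```

Halving the bitrate halves the size, which is why an encoder's quality at a given bitrate matters so much when your upload bandwidth is fixed.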

When marketing a product, however, what matters is that your device can outperform the others, which leads us to how one vendor tries to outdo another.

The Performance war

You need to be able to give the marketing department a number, and now.

Intel Quicksync speed
Twice as fast as ...X?

What is omitted (besides how much faster it really is) is the quality of the encoding. The hardware is built for specific purposes and, in basic terms, can only look so far when "optimizing" your picture. Hardware-based encoders commonly take shortcuts to get to the end result, both because marketing wants speed above all, and because far fewer eyeballs are on the code due to its proprietary nature: closed source and non-disclosure agreements.

A more detailed look, which prompted me to write about this in the first place, has been done by the lads at ExtremeTech in an article titled "The Wretched State of GPU Transcoding", which I recommend you take a look at.

To be fair, the loss of quality can be forgiven in some cases, such as downsampling a source for a mobile or bitrate-constrained device. But it is never optimal.

Mind the bugs

The initial allure of GPU-based encoding is that a GPU can be an order of magnitude faster than a CPU at certain tasks. However, the different languages for programming a GPU muddy the waters when it comes to writing cross-platform code. Nvidia and AMD each have their own nuances that you need to cater for. OpenCL looks set to be the cross-platform option of the future, but Nvidia has a clear interest in getting game developers to use CUDA as much as possible to help its own sales.

I would not like to be on the receiving end of support when a user tries software that behaves differently from one graphics card to another, or worse, does not work at all. This is another dealbreaker at the moment for a company looking to add GPU support for making video.

If you look at the feature lists of new graphics cards, you will also see them touting the latest in GPU transcoding (converting a video from one format to another) and encoding. AMD (ATI) graphics cards have software called Avivo; Nvidia cards have Badaboom. It is not exclusive to expensive graphics cards either: even some Intel processors offer integrated graphics with a technology called Quick Sync. Other software can use this hardware encoding in places, with CyberLink and ArcSoft both being developers whose products are often bundled with graphics cards to advertise accelerated encoding.

With this information you could be eager to ask XSplit to add support for this themselves, so we can all get on with broadcasting the Battlefield 3 bug of the week and Call of Duty commentary.

Some users have suggested products designed for a dual-PC setup, where one PC plays the game while the other captures it. These products offer the option to capture straight into the H.264 format. Wow, you can skip the entire encoding process, right? Well, here is another pitfall: the software still needs to take that video, decode it, and composite it with your titles, transitions and other effects along with the audio. You would then need to encode it all over again, which is a suboptimal solution.

A small side point: capture cards or USB dongles that claim to do H.264 encoding are either lying or will impose some very specific restrictions on things such as bitrate and resolution. They also share all of the problems mentioned above of a fixed-function encoding process, meaning lower quality.

There is equipment that broadcasters and TV stations use, but it comes at a large cost.

The Darim HV401
The Darim MV401 is a mid-level broadcast unit offering high-quality H.264 support. It currently costs $8900 USD. You can't do high definition at this level.

The computer's GPU is also likely to be tied up already with other intensive work, such as playing a game. Encoding a video of the desktop, with plenty of free overhead, is a very different environment from a dynamically changing, frame-rate-chewing game. The GPU does not have a hidden, unused section waiting to be used for your encoding; by design, GPUs excel at being reconfigured to use as much of themselves as possible for the current task.

The last and most important problem is specific to XSplit: compared to the more common uses of these apps, the conversion needs to be as close to real time as possible. While a GPU can be quicker at encoding, in a time-sensitive application you now need to send your video data back and forth between the GPU and the CPU for further compositing. Time is everything, and your viewers will notice, because the varying load on your GPU affects how well your video stays in sync with the audio.
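The real-time constraint can be sketched as a per-frame time budget. Every stage name and every millisecond figure below is a made-up placeholder for illustration, not a measurement of XSplit or of any GPU, and the model naively assumes the stages run one after another with no overlap.

```python
from typing import Mapping


def frame_budget_ms(fps: float) -> float:
    """Time available to process one frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def fits_realtime(stage_ms: Mapping[str, float], fps: float) -> bool:
    """True if the summed per-frame stage times fit inside the frame budget."""
    return sum(stage_ms.values()) <= frame_budget_ms(fps)


if __name__ == "__main__":
    # Hypothetical timings: the GPU encode stage is quicker on its own,
    # but the path also pays for copying frames to the GPU and back.
    cpu_path = {"capture": 4.0, "composite": 5.0, "encode": 20.0}
    gpu_path = {
        "capture": 4.0,
        "upload_to_gpu": 7.0,
        "encode": 8.0,
        "download_to_cpu": 7.0,
        "composite": 5.0,
    }
    for name, path in (("cpu", cpu_path), ("gpu", gpu_path)):
        total = sum(path.values())
        print(name, f"{total:.1f} ms per frame,",
              "fits 30 fps" if fits_realtime(path, 30) else "misses 30 fps")
```

The point is only the accounting: a faster encode stage does not help if the transfers around it use up the time it saved, and if those transfer times vary with GPU load, the delay between video and audio varies too.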

After all, no one likes a mime. (CC Licence: Natalie Lazo)

In Summary

Hoping for your graphics card to encode your video perfectly is still a pipe dream. In theory it may be possible, but executing it in the real world would eat away at resources that could be used elsewhere (making XSplit better!), and many quirks would crop up due to the immature state of GPU programming.

As the ExtremeTech article also concludes, CPU-based software encoding gives the best quality and is the only option for making your videos.