360 Video Pipeline

by Valeriy Novytskyy at ^H Hackerspace

How to capture and process high-quality 360 video

overview

360 videography is an exciting but new field, with little information available for advanced users. This article might help other aspiring videographers avoid the trial and error when building a video processing pipeline for large projects.

equipment

I use the Insta 360 Pro I with an external SSD and battery, leaving the stock battery and a memory card inside the camera in case the external ones fail.

Since the onboard mic cannot handle the loudness and dynamic range of a real performance, I supplement the sound with an external stereo sound recorder and the audio from the venue board recorded onto a USB drive.

pre-flight

Check the equipment the day before filming:

Fully charge camera, external sound recorder, and light batteries
Format camera and sound recorder media and record a few minutes of test footage to verify integrity
Check that camera has redundant storage and power
Pack camera, sound recorder, lights, monopod, and a USB stick formatted with FAT32 (for recording audio from the venue board)

Set up at the venue:

Unpack the camera and configure settings as described in the next section, selecting external SSD as the media for recording
Set up audio capture: set the level, activate the limiter, plug in headphones, and start recording. Avoid using a leveler because it turns up the ambient noise when the performance gets quiet
Work with the artist and the venue to find an optimal position for the camera so that it looks unobtrusive on the stage yet captures a scene that makes the viewer feel like they are a part of the performance. Avoid placing the camera where it might cast a shadow on the performer(s)
Rotate the camera so that the stitch lines between lenses fall on unimportant areas that artists on stage will rarely cross or stand near

settings

The following General settings work well for shooting 2D 360 video:

Setting	Value
Mode	`Normal`
Content type	`360° Pano`
Real-time Stitching	`off`
Save origins from six lens	`on`
Single lens resolution	`8K@30F`
Flat color mode	`on`
Audio gain	`-24 dB`

The following Exposure settings are good defaults:

Setting	Value
Mode	`Manual`
ISO	`640`
Shutter	`1/40s`
WB	`Auto`
Brightness	`0`
Saturation	`64`
Sharpness	`3`
Contrast	`64`

Apply the desired final camera rotation when stitching; this saves time at the next stage, when rendering video. I use Insta360 Stitcher with the following stitch settings:

Setting	Value
Content Type	`Monoscopic`
Stitching Mode	`New Optical Flow`
Sampling Type	`Slow`
Blender Type	`Auto`
Use original offset	`off`
Use Default Circle Position	`on`
Gyroscopic Stabilization	`off`
Use Hardware Decoding	`6`
Use Hardware Encoding	`off`
Software encoding speed	`Highest Quality (Slowest)`

The following output settings ensure good quality for the next stage.

Setting	Value
Resolution	`8K`
Output Format	`MP4(H.264/H.265)`
Codec Type	`h264 codec`
Profile	`High`
Bitrate	`350 Mbps`
Frame Rate	`30 fps`
Audio Type	`Normal`
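
A 350 Mbps intermediate takes a lot of space, so it helps to estimate the size of the stitched file before starting. The sketch below is only back-of-the-envelope arithmetic: it ignores audio and container overhead, and the function name is made up for this example.

```python
def stitched_size_gb(bitrate_mbps: float, minutes: float) -> float:
    """Approximate file size in gigabytes (10^9 bytes) at a constant video bitrate."""
    bits = bitrate_mbps * 1_000_000 * minutes * 60
    return bits / 8 / 1_000_000_000


# One hour of stitched 8K footage at the 350 Mbps setting above.
print(f"{stitched_size_gb(350, 60):.1f} GB")  # 157.5 GB
```

An hour-long performance therefore needs roughly 160 GB of free space for the stitched file alone, before any PNG sequences are rendered.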

mixing audio

The audio of the performance should be as enjoyable as the video.

Sync imported audio tracks to the audio from the camera’s onboard mic by using the Premiere sync audio feature. If that doesn’t work, sync them manually; see “Synchronize clips in the Timeline panel”
Apply a multi-band compressor if necessary to repair a mix with a bad spectrum balance
Apply a loudness maximizer if necessary, and choose a setting to maximize left/right channels independently to ensure both are balanced in volume
Apply a brickwall limiter to the entire audio track at -4 dB. The AAC audio codec that will be used for the final video works best when the audio does not peak too close to 0 dB
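
To make the -4 dB ceiling concrete, the sketch below converts decibels to linear amplitude and measures the peak of a block of samples. It assumes floating-point samples normalized to the range -1.0 to 1.0; the helper names are illustrative and do not come from any plug-in.

```python
import math


def db_to_amplitude(db: float) -> float:
    """Convert a level in dB (relative to full scale) to linear amplitude."""
    return 10 ** (db / 20)


def peak_dbfs(samples) -> float:
    """Peak level of normalized samples in dBFS; silence maps to -inf."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


ceiling = db_to_amplitude(-4.0)
print(round(ceiling, 3))  # 0.631
```

A -4 dB ceiling means no sample should exceed about 63% of full scale, which leaves headroom for the AAC encoder.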

See Premiere Pro audio plug-ins reference for more information.
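
For intuition about what audio-based sync does, here is a minimal sketch of the general technique: slide one track against the other and keep the offset where the two waveforms match best (cross-correlation). I don't know how Premiere implements its sync feature, so treat this as an illustration only. The function name, the toy samples, and the 48 kHz sample rate are all assumptions.

```python
def best_lag(reference, other, max_lag):
    """Return how many samples later events appear in `other` than in `reference`.

    Brute-force cross-correlation: try every lag in [-max_lag, max_lag] and
    keep the one with the highest sum of products.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best, best_score = lag, score
    return best


# Toy example: the same transient, recorded 3 samples later on the second device.
camera_mic = [0.0, 0.0, 1.0, 0.5, -0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
recorder = [0.0] * 3 + camera_mic[:-3]

lag = best_lag(camera_mic, recorder, max_lag=5)
print(lag)           # 3
print(lag / 48_000)  # offset in seconds at an assumed 48 kHz sample rate
```

The brute-force loop is far too slow for full-length recordings, where an FFT-based correlation is the usual choice, but the idea is the same.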

transcoding

I transcode the footage twice, down-scaling each time. Down-scaling averages out the “noise” pixels, which lowers the noise floor of the image and makes it look sharper without accentuating noise as a sharpening algorithm would, or blurring detail as a de-noising algorithm would.
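
The averaging effect can be checked with a small simulation. The sketch below adds independent Gaussian noise to a flat gray image and averages each 2x2 block into one pixel. Averaging N independent noise samples reduces the noise standard deviation by about the square root of N, so a 2x2 average should roughly halve it. Real sensor noise and the resampling filters used by video software are more complicated, so this only illustrates the principle; the image size and noise level are arbitrary.

```python
import random
import statistics


def box_downscale_2x(image):
    """Average each 2x2 block of a 2D list of floats into one pixel."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) / 4
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]


def noise_std(image):
    """Standard deviation of all pixel values in the image."""
    return statistics.pstdev([v for row in image for v in row])


rng = random.Random(0)  # fixed seed so the run is repeatable
noisy = [[0.5 + rng.gauss(0, 0.05) for _ in range(128)] for _ in range(128)]
smaller = box_downscale_2x(noisy)

ratio = noise_std(smaller) / noise_std(noisy)
print(f"noise std after 2x down-scale: {ratio:.2f} of the original")
```

The printed ratio should land close to 0.5.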

I output both passes to PNG sequences. Using an uncompressed format like BMP may save up to 60% of the rendering time for each pass, including the final encode, if you have several TB of space available.
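
To see why uncompressed output needs several TB, here is a rough estimate for the two intermediate resolutions used in this article. It assumes 24-bit RGB frames (3 bytes per pixel), ignores file headers, and uses an hour-long performance as a hypothetical example.

```python
def uncompressed_sequence_tb(width, height, fps, minutes, bytes_per_pixel=3):
    """Approximate size in terabytes (10^12 bytes) of an uncompressed frame sequence."""
    frame_bytes = width * height * bytes_per_pixel
    frames = int(fps * minutes * 60)
    return frame_bytes * frames / 1e12


# One hour at 30 fps for each transcoding pass.
print(f"{uncompressed_sequence_tb(5120, 2560, 30, 60):.2f} TB")  # 4.25 TB
print(f"{uncompressed_sequence_tb(4096, 2048, 30, 60):.2f} TB")  # 2.72 TB
```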

For each transcoding or encoding operation, keep the input and output files on different drives to avoid an I/O bottleneck. Outputting each frame as a separate file lets you restart the transcoding process from where it left off if it stops.
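
When a render stops partway through, the frame to resume from can be found by scanning the output folder. This is a minimal sketch that assumes frames are written as PNG files whose names end in a frame number, for example `pass1_000123.png`. That naming scheme and the function name are assumptions, so adjust the pattern to match your export settings.

```python
import re
from pathlib import Path

# Assumed naming: any prefix followed by a frame number and ".png".
FRAME_RE = re.compile(r"(\d+)\.png$")


def first_missing_frame(folder, total_frames, start=0):
    """Return the first frame number with no usable file, or None if all exist."""
    done = set()
    for path in Path(folder).iterdir():
        match = FRAME_RE.search(path.name)
        # Zero-byte files are treated as frames that were never written.
        if match and path.stat().st_size > 0:
            done.add(int(match.group(1)))
    for number in range(start, start + total_frames):
        if number not in done:
            return number
    return None
```

For example, `first_missing_frame("D:/pass1", 54000)` returns the frame number to restart from. The last file written before a crash may be truncated, so it is safer to restart one frame earlier than the number returned.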

color grading

In the first transcoding pass I perform color grading and re-sample the result from 8K to 5.2K. When using the camera settings in the previous section, advanced color grading is not required unless the Contrast is reduced to a lower value. A slight increase in Vibrance is enough in most cases.

I shoot in flat color mode with neutral settings to retain detail in colors and shadows. If higher contrast or saturation is applied when shooting, pixel values in each channel are limited by chopping off highs and lows, which cannot be recovered. When this information is retained, applying a curve during color grading lets you re-distribute shadows and highlights and apply your own limiting to cut off values that contain more noise and retain values with detail.

I use the Lumetri Color plug-in for color grading and toggle the following filters in different combinations to determine if they are needed:

Exposure: This can be pushed to increase the richness of shadows, but that would also require more de-noising which blurs details
Contrast: This can be used along with Curves to tune the overall contrast. It adds an S-curve over the entire spectrum with the sharpness of the “S” controlled with this slider
Shadows: Controls the amplitude of shadow frequencies in the image. This adds a curve with a dip or a peak on shadows
Vibrance: Increase to enhance colored lights. Spot-check your footage for any places where this might blow out colored highlights. If used on scenes without intense colored lights it will compress colors and apply makeup gain, which manifests as highly visible, animated color noise
White balance: A proper white balance often ruins the mood established by colored lighting. Apply this if the key lights are neutral
Curve: The master color grading curve (the curves for each channel are only used to surgically adjust the white balance). When color grading flat color footage, the Master curve will resemble an “S” shape that re-maps the flat color distribution back into realistic colors. This lets you choose how much contrast you want to bring back and how you want to distribute shadows and highlights.
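
The difference between contrast baked in at capture and an S-curve applied while grading can be shown with a toy model on pixel values normalized to the range 0 to 1. Neither function reflects the camera's or Lumetri's actual processing: the linear contrast gain of 1.6 and the smoothstep-based curve are stand-ins chosen for illustration.

```python
def in_camera_contrast(v, gain=1.6):
    """Linear contrast around mid-gray, clipped to the recordable range [0, 1]."""
    return min(1.0, max(0.0, 0.5 + (v - 0.5) * gain))


def s_curve(v, strength=0.5):
    """Blend between identity and a smoothstep S-curve; strength 0 changes nothing."""
    smooth = v * v * (3 - 2 * v)
    return (1 - strength) * v + strength * smooth


shadows = [0.05, 0.10, 0.15]

# Contrast applied while shooting: all three shadow values clip to 0.0.
print([in_camera_contrast(v) for v in shadows])

# S-curve applied to flat footage: darker, but still three distinct values.
print([round(s_curve(v), 4) for v in shadows])
```

Once the three shadow values collapse to the same number, no curve can separate them again. That is why I keep the contrast decision for the grading stage.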

The result of color grading and re-sampling can then be rendered to a 5120x2560 PNG sequence. The 5.2K resolution works better than 6K because consumer cameras like GoPro Fusion, YI360, and Insta360 One X support at least 5.2K, making it possible to pipe in footage from secondary cameras at this stage and morph from one to another.

de-noising

The second transcoding pass applies de-noising and down-samples to 4K. Red Giant Denoiser III has a built-in re-sharpening pass. Neat Video Reduce Noise and Boris FX VR Sharpen can also be combined to the same effect.

When using the Denoiser III plug-in, I often turn off the re-sharpening pass and set other settings lower to reduce noise on light halos and make everything look smoother without increasing noise. If the lighting at the performance was bright enough I can usually get away with a small amount of re-sharpening.
When using the Neat Video plug-in, I select an area with the most noise and use the automatic profile analysis to let the plug-in decide what to do. The results are always great, although this plug-in often takes more time to process than Denoiser III. When possible, I apply a re-sharpening pass with Boris FX VR Sharpen.

The result of de-noising is re-sampled to a 4K (4096x2048) PNG sequence at maximum color depth and render quality.

animating titles

After Effects comes with a VR Comp Editor plug-in, which projects a portion of the video sphere onto a 1920x1080 rectangle, lets you add text and animated effects, then re-projects the result back onto the video sphere. This lets you render 2D titles over the 360 video.

Blender can be used to render 3D titles with the Cycles rendering engine which supports outputting to the same equirectangular format as VR video:

Setting	Value
Engine	`Cycles`
Camera Lens	`Panoramic`
Camera Lens Type	`Equirectangular`
Resolution X	`4096`
Resolution Y	`2048`
Frame Rate	`30 fps`
Samples	`512`
World Surface	`Emission`
World Surface Color	`Environment Texture`
World Surface Color Image	(Choose a frame from the video)

With the above World settings, any 3D objects placed in the scene will be lit by the lights in the video, and any shiny objects will reflect the scene in the video.

encoding

I create the final sequence from the second transcode, and add the previously rendered audio and titles.

Setting	Value
Format	`H.264`
Width	`4096`
Height	`2048`
Frame Rate	`30`
Encoding Performance	`Software Encoding`
Encoding Profile	`High`
Encoding Level	`5.1`
Bitrate Encoding	`VBR, 2 pass`
Target Bitrate	`45 Mbps`
Maximum Bitrate	`45 Mbps`
Render at Maximum Depth	`on`
Video is VR	`on`
Frame Layout	`Monoscopic`
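
The frame size, frame rate, and level in this table are linked. H.264 levels cap both the number of 16x16 macroblocks per frame and the number of macroblocks decoded per second. The sketch below checks a resolution and frame rate against the Level 5.1 limits as listed in Table A-1 of the H.264 specification, which are worth verifying against the spec itself. The function names are made up for this example.

```python
import math

# Level 5.1 limits as listed in Table A-1 of the H.264 specification.
LEVEL_5_1 = {"max_frame_mbs": 36_864, "max_mbs_per_sec": 983_040}


def macroblocks(width, height):
    """Number of 16x16 macroblocks needed to cover one frame."""
    return math.ceil(width / 16) * math.ceil(height / 16)


def fits_level(width, height, fps, level=LEVEL_5_1):
    """True if the frame size and macroblock rate are within the level's limits."""
    mbs = macroblocks(width, height)
    return mbs <= level["max_frame_mbs"] and mbs * fps <= level["max_mbs_per_sec"]


print(macroblocks(4096, 2048))     # 32768
print(fits_level(4096, 2048, 30))  # True: 32768 * 30 = 983040, exactly at the limit
print(fits_level(4096, 2048, 60))  # False
print(fits_level(5120, 2560, 30))  # False: 51200 macroblocks per frame
```

In other words, 4096x2048 at 30 fps is the largest frame size and rate combination of this shape that Level 5.1 allows. A larger frame or a higher frame rate would need a higher level.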

performance

Working with 8K footage and using de-noising plug-ins requires a computer that balances performance with storage capacity. Here’s what worked for me.

Spec	Choice
CPU	Intel Core i9-9980XE
GPU	NVIDIA Titan Xp
Motherboard	ASUS ROG Rampage VI Extreme Omega X299-II
Memory	2x G.SKILL Trident Z 16GB 288-Pin RGB DDR4 4266
HD1 (Transcoding)	HighPoint rSSD7101B 2TB NVMe RAID Drive (only one fits on the motherboard)
HD2 (Transcoding)	Samsung 970 EVO 2TB - NVMe PCIe M.2 2280 SSD
HD3 (Cache)	Samsung 970 EVO 2TB - NVMe PCIe M.2 2280 SSD
HD4 (System)	Samsung 860 PRO V-NAND 1TB SSD
Power Supply	EVGA SuperNOVA TITANIUM 1600W
CPU Cooler	ARCTIC Liquid Freezer 240 (fans swapped for Noctua)
Case Fans	Noctua NF-S12B Redux Quiet Case Fan
Case	Fractal Design ATX Silent Mid Tower Case

Most of the transcoding work is done on a single core. Multi-core performance accelerates only minor tasks like color grading and blending, so the CPU was chosen to favor high single-core speed over core count.

uploading

This section provides a quick reference for sharing 360 videos.

For Facebook, use the Facebook 360 video requirements:

Setting	Value
Format	`MP4`
Video Codec	`H.264`
Max Resolution	`5K (5120x2560)`
Frame Rates	`30`
Audio Codec	`AAC`
Video Bitrate (4K)	`45 Mbps`
Audio Bitrate (Stereo)	`128 kbps`
Recommended Length	Up to `30 minutes`

For YouTube, use the YouTube 360 video requirements:

Setting	Value
Format	`MP4`
Video Codec	`H.264`
Frame Rates	`24`, `25`, `30`, `48`, `50`, `60`
Audio Codec	`AAC-LC`
Video Bitrate (4K)	`35-45 Mbps`
Audio Bitrate (Stereo)	`128 kbps`, `384 kbps`
Recommended Length	Up to `30 minutes`
