Mar 04, 2021

Esri collaborates with Binomial to improve Basis Universal Supercompressed GPU Texture Codec speed

By Tamrat Belayneh

Performant 3D systems routinely use compressed textures as an essential optimization for loading and displaying massive amounts of texture data. Indexed 3D Scene Layer (I3S) supports various compressed texture formats for that purpose, targeting different platforms. With wider adoption of GPU-decodable Supercompressed Textures (GSTs) such as Basis Universal, gone are the days when 3D systems using compressed textures had to accept large compromises in texture quality, bloated texture assets, and a separate format for each platform. Basis Universal offers excellent decoding speed and high compression rates while preserving visual quality, and it works cross platform. In this blog, we are pleased to share a collaboration with Binomial that improved Basis Universal encoder speed by a factor of at least 3X! We also show a sneak preview of Basis Universal texture support as an I3S compressed texture asset.

Indexed 3D Scene Layer (I3S), a specification created by Esri for streaming textured mesh and point cloud datasets, has become one of the most widely used formats for disseminating massive geospatial content. I3S supports various layer types, including 3D Object and IntegratedMesh, which allow the streaming of millions of 3D objects and high fidelity meshes, as well as Point Cloud Scene Layer, which enables the streaming of point cloud data consisting of billions of points. I3S has also added support for Building Scene Layer, making complex BIM (Building Information Model) content accessible in a user friendly, web streamable standard. In short, I3S enables the streaming of massive geospatial content to web browsers, mobile devices, and desktop applications.

I3S has been evolving since it was publicly shared with the open source community under an Apache license in early 2015. Adopted as the first 3D streaming Community Standard by the Open Geospatial Consortium (OGC) in the fall of 2017, I3S has continued to evolve as a living, breathing specification. The latest example is OGC’s recent incorporation of the Point Cloud Scene Layer specification into an updated I3S 1.1 OGC Community Standard.

One of the key features of IntegratedMesh and 3D Object Scene Layers in I3S is support for compressed texture formats such as DXT and ETC2 as GPU native texture asset resources. Compressed textures bring a massive reduction in client application memory, since they are loaded directly onto the GPU without having to be uncompressed to RGB/A. This gives the consuming application much lower memory utilization and avoids the CPU cycles required to decompress highly compressed image formats such as JPEG and PNG.
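To make the memory saving concrete, here is a minimal sketch in Python (not part of any I3S tooling) that compares uncompressed RGBA at 8 bits per channel with BC3 (DXT5), which stores each 4x4 block of texels in 16 bytes. The function names are our own, mipmaps are ignored, and other compressed formats use different block sizes.

```python
import math


def rgba8_bytes(width: int, height: int) -> int:
    """Uncompressed RGBA, 8 bits per channel: 4 bytes per texel."""
    return width * height * 4


def bc3_bytes(width: int, height: int) -> int:
    """BC3 (DXT5): every 4x4 block of texels is stored in 16 bytes."""
    blocks_x = math.ceil(width / 4)
    blocks_y = math.ceil(height / 4)
    return blocks_x * blocks_y * 16


# The texture atlas sizes mentioned in this post: 1k, 2k and 4k.
for size in (1024, 2048, 4096):
    raw = rgba8_bytes(size, size)
    bc3 = bc3_bytes(size, size)
    print(f"{size}x{size}: RGBA8 {raw / 2**20:.0f} MiB, "
          f"BC3 {bc3 / 2**20:.0f} MiB, ratio {raw / bc3:.0f}x")
```

For a 4k atlas this works out to 64 MiB uncompressed versus 16 MiB in BC3, before any mipmaps are added.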

The advantages of compressed textures are clear. They considerably lower the memory footprint of an application (particularly important on GPUs with shared graphics memory), more compressed images fit into the processor cache, and they can lower battery usage on mobile devices by avoiding expensive decompression from highly compact formats such as JPEG/PNG. However, generating and using compressed textures comes at a cost. Compressed textures are significantly larger than JPEG/PNG files of the same image quality, and they tend to be hardware and platform specific (different platforms require different types of compressed textures), which creates challenges in storage and transmission. Lastly, creating compressed textures can be prohibitively slow, even with state-of-the-art compression software. For example, generating ETC2 (a compressed texture format native to GPUs on mobile platforms) and PVRTC (a GPU native compressed texture format supported on iOS) is on average about 100x slower than generating DXT, the GPU native compressed texture format on desktop platforms, with an Intel SSE compressor.

Until the introduction of GPU-decodable Supercompressed Textures (GSTs), GPU native compressed image generation was extremely specialized and tightly coupled to specific hardware/platforms. Before GSTs, there was little choice other than redundantly generating the same texture and distributing it in the various compressed texture formats targeting specific platforms.

GSTs make it possible to create a supercompressed texture once and transcode it to the native compressed texture format supported by the GPU on the target platform. The general idea behind GSTs is to further compress endpoint texture compression formats such as DXT, ETC2 and PVRTC, significantly reducing the payload size (which is typically 3X larger than JPEG/PNG, even after further lossless LZ77 re-compression). The reduction in compressed texture size, coupled with having to deal with only a single texture asset that works everywhere, lets asset creators create and transmit GPU native textures to any desired platform. GSTs came into widespread use with the introduction of the Basis Universal texture format by Binomial.

Basis Universal

Basis Universal is a GPU optimized texture compression format that significantly reduces image storage size while keeping high quality. A Basis Universal texture is 6-8 times smaller than JPEG on the GPU, yet has a storage size similar to JPEG. This makes it a great alternative to current GPU compression methods, which are inefficient and do not operate cross platform, and a more performant alternative to JPEG/PNG.

In The Beginning ….

We were particularly excited when Binomial, in collaboration with Google, open sourced the Basis Universal texture codec, including the CLI compression utility, under an Apache 2.0 license in May 2019. Basis Universal met most of the requirements for inclusion as a supported I3S texture compression format. One exception was the texture compression speed of the encoder, which was on par with ETC2, the slowest format in terms of compression speed.

It is in the area of encoder performance that we saw an opportunity to collaborate with Binomial. We set out to improve the compression speed of the core encoder and CLI for creating Basis Universal textures by at least a factor of 3X, paying particular attention to texture types that are common in the geospatial world. These improvements were targeted and included algorithmic updates, ISPC compiler and SSE optimizations, and other performance focused optimizations. The optimizations were designed to be agnostic to the texture content in most cases.

The Basis Universal texture format comes with encoders and transcoders in different languages (C++, C, JS/WASM) that transform the basis format directly into a target compressed texture. In particular, Basis transcoders can transcode from the basis format to DXT, specifically DXT5 (equivalent to BC3), the variant widely supported in I3S. This means that on desktop (Windows) platforms the only change for a consuming client application is the introduction of JIT transcoder code; the content that gets sent to the GPU remains the same as it is now. Basis also supports transcoding to BC7, which offers better quality, but to minimize risk for I3S consuming clients we opted to focus on the BC3 format.

But first, although many benchmarks are available, we needed to establish a baseline benchmark of our own. This baseline focused primarily on the texture compression ratio (quality), compression speed and file size that are acceptable in geospatial applications.

The benchmark data consisted of representative texture atlases sourced from various 3D Object and IntegratedMesh I3S layers. The atlases were selected based on:

– LOD ranges – including both leaf (highest LOD) and interior (intermediate LOD) nodes from 7 representative datasets
– Quality/size – from the maximum texture atlas size supported in I3S (4k) to various recommended texture sizes (1k, 2k)
– Composition – where some texture atlases are well-batched (tightly-packed) and others contained significant white-noise (wasted space)
– Content – some of the texture atlases contained repeating textures

Basis encoding Version 1.12 compared to Version 1.13: plots quality (measured as Y-PSNR in dB), encoding time (seconds) and bits per texel (basis file size in bits / (texel count × number of channels), where texel count is texture.width × texture.height) for 7 use cases, with the basis encoder quality set to 128 and the compression level set to 1. Note the difference between the Basis Version 1.12 encoding time (in seconds, orange bars) and the encoding time with the optimized Version 1.13 code base (in seconds, yellow bars).

The chart above shows encoding performance and quality using the Basis encoder before and after optimization. Texture quality is represented by the blue bars showing the Y-PSNR of the texture, where typical values are in the range of 30 – 50 dB. Total encoding time to the basis format is represented by the orange (unoptimized) and yellow (optimized) bars, using the Version 1.12 and Version 1.13 Basis codebases, respectively. Note that the total encode time also includes the time to decompress the JPEG/PNG input to RGB/A, in addition to the time to encode to the basis file format. Bits per texel (basis file size in bits / (texel count × number of channels)) is shown as a line graph to account for differences among differently sized texture atlases.
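The bits per texel metric from the chart can be sketched in a few lines of Python. This assumes the formula groups as file size in bits divided by (texel count × number of channels), which is our reading of the chart label; the function name and the file size in the example are hypothetical, not measured values.

```python
def bits_per_texel(file_size_bytes: int, width: int, height: int,
                   num_channels: int) -> float:
    """Basis file size in bits, normalized by texel count and channel count."""
    texel_count = width * height
    return (file_size_bytes * 8) / (texel_count * num_channels)


# Hypothetical example: a 2 MiB .basis file for a 4096x4096 RGBA atlas.
print(bits_per_texel(2 * 2**20, 4096, 4096, 4))
```

Normalizing this way lets a 1k atlas and a 4k atlas share one axis, since the raw file sizes would otherwise differ by roughly 16X from texel count alone.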

The basis texture files for the various use cases were compressed with a quality setting of 128 and compression level 1. Quality runs on a 1 – 255 scale, where the default is 128 (higher means better quality), and compression level runs on a 0 – 5 scale, where 0 is the fastest and 5 the slowest (the level also affects quality).
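As a rough illustration of these settings, the sketch below builds a command line for the `basisu` CLI in Python. It assumes the tool exposes the quality and compression level as `-q` and `-comp_level`, which should be checked against the help output of the installed version; the helper name and the input file name are hypothetical, and the range checks simply mirror the scales described above.

```python
def basisu_args(input_path: str, quality: int = 128, comp_level: int = 1) -> list:
    """Build an argument list for the basisu CLI (flag names are an assumption)."""
    if not 1 <= quality <= 255:
        raise ValueError("quality must be between 1 and 255")
    if not 0 <= comp_level <= 5:
        raise ValueError("compression level must be between 0 and 5")
    return ["basisu", "-q", str(quality), "-comp_level", str(comp_level), input_path]


# Settings used for the benchmark in this post: quality 128, compression level 1.
print(" ".join(basisu_args("atlas.png")))
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine where `basisu` is on the path.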

The 7 representative texture atlases used in establishing the baseline were sourced from various I3S layers (covering 3D Objects and IM layers) from arcgis.com. The test datasets were texture atlases from the following sources:

IM_Frankfurt_4K & IM_4K (texture atlases sourced from nFrames I3S IntegratedMesh (IM) service), IM_NYC_2K (texture atlases sourced from NearMap I3S IM service), 3D_Object_SanFran_4k (texture atlases sourced from Pictometry I3S 3D Object Service), IM_Chateau_1k (texture atlases sourced from Bentley I3S IM service), IM_Rancho_4K (texture atlases sourced from Pix4D/Drone2Map I3S IM service) and 3D_Object_Singapore_2k (texture atlases sourced from GeoScene I3S 3D Object Service).

The Setup

Our main goal, as stated above, was to reduce encoding time by at least a factor of 3X. To achieve this, we were prepared to accept a 5 – 10% increase in file size and/or a very minimal loss in quality (less than 3 dB).
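These acceptance criteria can be written down as a small check. The sketch below is only an illustration in Python: the `EncodeResult` type, the function name and the sample numbers are hypothetical, and the thresholds are taken from the goal stated above (at least 3X faster, at most 10% larger, less than 3 dB of Y-PSNR lost).

```python
from dataclasses import dataclass


@dataclass
class EncodeResult:
    seconds: float     # total encoding time
    file_bytes: int    # size of the resulting .basis file
    y_psnr_db: float   # quality of the encoded texture


def meets_goal(before: EncodeResult, after: EncodeResult,
               min_speedup: float = 3.0,
               max_size_growth: float = 0.10,
               max_quality_loss_db: float = 3.0) -> bool:
    """Return True if the optimized run satisfies all three criteria."""
    speedup = before.seconds / after.seconds
    size_growth = (after.file_bytes - before.file_bytes) / before.file_bytes
    quality_loss = before.y_psnr_db - after.y_psnr_db
    return (speedup >= min_speedup
            and size_growth <= max_size_growth
            and quality_loss < max_quality_loss_db)


# Hypothetical numbers, not measurements from the benchmark.
baseline = EncodeResult(seconds=12.0, file_bytes=1_000_000, y_psnr_db=38.0)
optimized = EncodeResult(seconds=3.0, file_bytes=1_050_000, y_psnr_db=37.5)
print(meets_goal(baseline, optimized))
```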

As can be seen in the images below, basis compression brings a huge reduction in in-memory size while keeping the transmission file size on par with or below JPEG (see the 3 highlighted cases below, where the in-memory size reduction ranges from 5X to 9X). This parity in file size with JPEG on average is quite appealing, especially considering the 3X larger transmission size of other compressed formats such as DXT (BC3). Note that the 3X factor assumes an additional lossless LZ77 re-compression of the DXT payload, which gets uncompressed by the browser or end client application (on the CPU). If the DXT compressed texture is not LZ77 compressed (for performance reasons), the size factor (compared to the source JPG file) is in the range of 6X – 8X.

Because the Basis file format is so compact, we were glad to have the option of trading some increase in file size for faster compression.

Similarly, after experimenting with various Basis compressor quality settings (32, 64, 128), we settled on 128. A lower quality setting in the Basis compressor results in better compression (smaller compressed file sizes) but at the expense of lower ‘quality’. We therefore ‘locked in’ the quality setting used before and after optimization at 128 (on a scale of 1 – 255). Obviously, quality is subjective; however, in the GPU texture compression field, Peak Signal-to-Noise Ratio, or PSNR (more precisely the Y-PSNR*), is accepted as a fair estimator of the quality of compressed textures. Typical PSNR values in lossy image and video compression are between 30 and 50 dB at a bit depth of 8 bits, where higher is better. We therefore planned to use the Y-PSNR values of the compressed textures/images, in addition to visual inspection, to assess the quality of the compressed images after optimizing the compressor code.

IM_Frankfurt_4k: Top left: a texture atlas from an IM I3S layer in JPG format. Top right: the same texture compressed in Basis with a compression quality of 128 and compression level 1. The bottom left and bottom right images are the red box areas in the top images, zoomed in to about 128%.

IM_NYC_2k: Top left: a texture atlas from an IM I3S layer in JPG format. Top right: the same texture compressed in Basis with a compression quality of 128 and compression level 1. The bottom left and bottom right images are the red box areas in the top images, zoomed in to about 128%.

IM_chateau_1k: Top left: a texture atlas from an IM I3S layer in JPG format. Top right: the same texture compressed in Basis with a compression quality of 128 and compression level 1. The bottom left and bottom right images are the red box areas in the top images, zoomed in to about 137%.

* The in-memory size is the size in memory of the Basis file once transcoded to the target compressed texture format. In all the above examples the target transcoded format is BC3_RGBA (DXT compatible, usable on desktop systems); the in-memory size of other compressed texture formats could be different.

Why use Y-PSNR:

Though there are a few methodologies for computing the PSNR of a color image, researchers specifically use the luma information because the human eye is most sensitive to it. The Y-PSNR for color images can therefore be computed by converting the image to a color space that separates out the intensity (luma) channel, such as YCbCr. The Y (luma) in YCbCr represents a weighted average of R, G, and B. G is given the most weight, again because the human eye perceives it most easily.
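The computation described above can be sketched in plain Python. This is a minimal illustration, not the code used to produce the chart: it assumes 8-bit RGB pixels given as (R, G, B) tuples and the ITU-R BT.601 luma weights (0.299, 0.587, 0.114), whereas the benchmark tooling may use different weights, and the function names are our own.

```python
import math


def luma(r: float, g: float, b: float) -> float:
    """Weighted average of R, G and B (BT.601 weights); G has the most weight."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def y_psnr(original, compressed, max_value: float = 255.0) -> float:
    """PSNR in dB computed on the luma channel of two equally sized images."""
    if len(original) != len(compressed) or not original:
        raise ValueError("images must be non-empty and the same size")
    squared_error = sum((luma(*a) - luma(*b)) ** 2
                        for a, b in zip(original, compressed))
    mse = squared_error / len(original)
    if mse == 0:
        return math.inf  # identical luma: no measurable error
    return 10 * math.log10(max_value ** 2 / mse)


# Two-pixel example where every channel is off by one: luma is off by one too.
source = [(100, 150, 200), (10, 20, 30)]
decoded = [(101, 151, 201), (11, 21, 31)]
print(f"{y_psnr(source, decoded):.2f} dB")  # about 48.13 dB
```

An error of one luma level per pixel lands near the top of the typical 30 – 50 dB range, which gives a feel for how small the differences are at those values.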

Results

The optimizations done under this work, now shared as part of Basis Version 1.13, included:

– Basis v1.13 now optionally supports SSE 4.1 (Streaming SIMD Extensions) on x86/x64 platforms. It does this by utilizing CppSPMD_Fast, a C++ library that implements a strong subset of Intel’s ispc SPMD programming language in C++. SSE usage reduces overall encoding time by approximately 25%.
– The compressor was profiled, and the most important hotspots were identified and rewritten for higher performance, even without SSE.
– The compressor’s “compression level” presets, which control how the encoder trades off quality vs. compression time, were carefully recalibrated. The Basis compressor now supports an additional compression level preset, level 1, that is tuned for faster encoding with minimal impact on quality; all the previous levels beginning at 1 were shifted to levels 2+. (Higher compression levels encode more slowly, but at slightly higher average quality.)
– The formulas used to convert the quality level to color endpoint palette (or “codebook”) sizes were carefully tuned. The updated encoder now outputs up to 55% larger color endpoint palettes at quality level 128, resulting in less banding/more saturation on complex textures.
– The encoder’s mipmap generator was optimized to generate each successive mipmap level from the next largest level, as opposed to always starting from the largest input texture. This has a noticeable impact on large textures.
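The mipmap change in the last item can be illustrated with a small sketch. The Python below builds a mip chain in which every level is downsampled from the level just above it, using a simple 2x2 box filter on a grayscale image stored as a list of rows. It is an assumption-laden toy, not the encoder's actual resampling code: the filter choice and function names are ours.

```python
def downsample_2x(image):
    """Halve a grayscale image (list of rows) with a 2x2 box filter."""
    height, width = len(image), len(image[0])
    out_h, out_w = max(1, height // 2), max(1, width // 2)
    result = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            y0, x0 = 2 * y, 2 * x
            # Clamp so that 1-texel-wide or 1-texel-high levels still work.
            y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
            total = image[y0][x0] + image[y0][x1] + image[y1][x0] + image[y1][x1]
            row.append(total / 4)
        result.append(row)
    return result


def build_mip_chain(base):
    """Each level is derived from the previous level, never from the base again."""
    chain = [base]
    while len(chain[-1]) > 1 or len(chain[-1][0]) > 1:
        chain.append(downsample_2x(chain[-1]))
    return chain


base = [[0, 4, 8, 12] for _ in range(4)]  # 4x4 test image
for level, mip in enumerate(build_mip_chain(base)):
    print(level, f"{len(mip[0])}x{len(mip)}", mip)
```

Because each output texel reads only four texels from the previous level, the work per level shrinks by about 4X as the chain goes on, instead of every level filtering the full-size input.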

We are pleased to report that, as can be seen from the first chart, there is about a 4X reduction in compression time across various I3S dataset types for the same quality (Y-PSNR) and file sizes. This is excellent news, especially as the optimization was achieved without any loss in quality or increase in file size (which leaves us room for further optimization!).

Comparison of the encoding optimization gained for various I3S datasets by using Compression Level 1 and the new geospatial focused compression level 0 settings.

Geospatial Use Case

We would be remiss if we didn’t showcase what this improvement means in a geospatial context, in particular as it relates to I3S!

As documented in the I3S standard and implemented by several data providers, I3S supports compressed textures (DXT, ETC2) as texture formats. We are now pleased to announce that the next version of I3S will include Basis Universal as one of the supported compression formats.

We thought we would share a sneak preview, using a live app, of the improvements gained by using Basis Universal textures in an IntegratedMesh I3S layer: memory reduction, smaller file sizes, and faster loading from not having to decompress from JPEG/PNG to GPU native formats.

Note: When using JPEG/PNG textures, the app defensively reduces quality by loading lower levels of detail to avoid using excessive amounts of texture memory. As a result, when loading some of the bookmarks, not all of the expected nodes/tiles may load in the JPEG layer view (left viewer), which might therefore show lower memory usage than the Basis layer view (right viewer).

Credits:

Esri would like to thank Stephanie Hurlburt & Richard Geldreich of Binomial for the opportunity to collaborate on an exciting optimization of the Basis Universal encoder. The optimized encoder code of Basis Universal (Version 1.13) is now publicly available under an Apache 2.0 license.

Tamrat Belayneh

With over 20 years of experience in 3D software development, Tam is the lead author of the Indexed 3D Scene Layers (I3S) OGC standard. Over his career Tam has worked on advancing 3D graphics in the geospatial industry and has contributed as a 3D developer and engineer to various 3D products in the ArcGIS family. Tam has authored and presented numerous articles and technical talks throughout the world, including at GTC, Siggraph, Foss4G, 3DGeoInfo, ISPRS and many other internationally recognized conferences. In the last few years, Tam has been focusing on using Machine Learning and Deep Learning to advance 3D object recognition and mesh segmentation within a geospatial context.
