Getting Started

Building

The CEED library, libceed, is a C99 library with no required dependencies, and with Fortran, Python, Julia, and Rust interfaces. It can be built using:

$ make

or, with optimization flags:

$ make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'

These optimization flags are used by all languages (C, C++, Fortran). The same makefile variable can also be set when running the tests and examples (below).

The library attempts to automatically detect support for the AVX instruction set using gcc-style compiler options for the host. Support may need to be manually specified via:

$ make AVX=1

or:

$ make AVX=0

if your compiler does not support gcc-style options, if you are cross compiling, etc.
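When automatic detection is not reliable, one way to choose the value by hand on a Linux host is to look for the avx flag in /proc/cpuinfo. The following is a minimal sketch under that assumption. The helper name avx_setting is hypothetical and not part of libCEED, and the check describes the build host only, so it does not help when cross compiling.

```shell
#!/bin/sh
# Hypothetical helper (not part of libCEED): print AVX=1 if the CPU flags
# text read from stdin lists "avx" as a whole word, and AVX=0 otherwise.
avx_setting() {
  if grep -qw avx; then
    echo "AVX=1"
  else
    echo "AVX=0"
  fi
}

# Assumed usage on a Linux host, where /proc/cpuinfo lists the CPU flags:
#   make "$(avx_setting < /proc/cpuinfo)"
```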

To enable CUDA support, add CUDA_DIR=/opt/cuda or an appropriate directory to your make invocation. To enable HIP support, add ROCM_DIR=/opt/rocm or an appropriate directory. To enable SYCL support, add SYCL_DIR=/opt/sycl or an appropriate directory. Note that SYCL backends require building with oneAPI compilers as well:

$ . /opt/intel/oneapi/setvars.sh
$ make SYCL_DIR=/opt/intel/oneapi/compiler/latest/linux SYCLCXX=icpx CC=icx CXX=icpx

The library can be configured for host applications that use OpenMP parallelism via:

$ make OPENMP=1

which allows operators to be created and applied from different threads inside an omp parallel region.

To store these or other arguments as defaults for future invocations of make, use:

$ make configure CUDA_DIR=/usr/local/cuda ROCM_DIR=/opt/rocm OPT='-O3 -march=znver2'

which stores these variables in config.mk.

WebAssembly

libCEED can be built for WASM using Emscripten. For example, one can build the library and run a standalone WASM executable using:

$ emmake make build/ex2-surface.wasm
$ wasmer build/ex2-surface.wasm -- -s 200000

Additional Language Interfaces

The Fortran interface is built alongside the library automatically.

Python users can install using:

$ pip install libceed

or, in a clone of the repository, using:

$ pip install .

Julia users can install using:

$ julia
julia> ]
pkg> add LibCEED

See the LibCEED.jl documentation for more information.

Rust users can include libCEED via Cargo.toml:

[dependencies]
libceed = "0.12.0"

See the Cargo documentation for details.

Testing

The test suite produces TAP output and is run by:

$ make test

or, using the prove tool distributed with Perl (recommended):

$ make prove
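Because TAP marks each test with a line beginning with ok or not ok, the results can be tallied with a short filter. The sketch below is illustrative only: tap_summary is a hypothetical helper, not something shipped with libCEED, and it assumes the TAP stream arrives on standard output.

```shell
#!/bin/sh
# Hypothetical helper (not part of libCEED): count passing and failing
# test lines in TAP output read from stdin.
tap_summary() {
  awk '/^ok/ { pass++ } /^not ok/ { fail++ }
       END { printf "passed=%d failed=%d\n", pass, fail }'
}

# Assumed usage: make test | tap_summary
```

Skipped tests are also reported by TAP as ok lines, so this simple tally counts them as passing.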

Backends

There are multiple supported backends, which can be selected at runtime in the examples:

CEED resource           Backend                                             Deterministic Capable
----------------------  --------------------------------------------------  ---------------------
CPU Native
/cpu/self/ref/serial    Serial reference implementation                     Yes
/cpu/self/ref/blocked   Blocked reference implementation                    Yes
/cpu/self/opt/serial    Serial optimized C implementation                   Yes
/cpu/self/opt/blocked   Blocked optimized C implementation                  Yes
/cpu/self/avx/serial    Serial AVX implementation                           Yes
/cpu/self/avx/blocked   Blocked AVX implementation                          Yes
CPU Valgrind
/cpu/self/memcheck/*    Memcheck backends, undefined value checks           Yes
CPU LIBXSMM
/cpu/self/xsmm/serial   Serial LIBXSMM implementation                       Yes
/cpu/self/xsmm/blocked  Blocked LIBXSMM implementation                      Yes
CUDA Native
/gpu/cuda/ref           Reference pure CUDA kernels                         Yes
/gpu/cuda/shared        Optimized pure CUDA kernels using shared memory     Yes
/gpu/cuda/gen           Optimized pure CUDA kernels using code generation   No
HIP Native
/gpu/hip/ref            Reference pure HIP kernels                          Yes
/gpu/hip/shared         Optimized pure HIP kernels using shared memory      Yes
/gpu/hip/gen            Optimized pure HIP kernels using code generation    No
SYCL Native
/gpu/sycl/ref           Reference pure SYCL kernels                         Yes
/gpu/sycl/shared        Optimized pure SYCL kernels using shared memory     Yes
MAGMA
/gpu/cuda/magma         CUDA MAGMA kernels                                  No
/gpu/cuda/magma/det     CUDA MAGMA kernels                                  Yes
/gpu/hip/magma          HIP MAGMA kernels                                   No
/gpu/hip/magma/det      HIP MAGMA kernels                                   Yes

The /cpu/self/*/serial backends process one element at a time and are intended for meshes with a small number of high-order elements. The /cpu/self/*/blocked backends process batches of eight interlaced elements and are intended for meshes with a large number of elements.
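As a rough illustration of the batch size, the arithmetic below counts how many batches of eight are needed to cover a mesh. It is a sketch only: num_batches is a hypothetical helper, and how a blocked backend treats a partially filled final batch is not described here.

```shell
#!/bin/sh
# Illustrative arithmetic only: how many batches of eight elements cover a mesh.
BLOCK_SIZE=8  # batch size stated above for the /cpu/self/*/blocked backends

# num_batches N: print ceil(N / BLOCK_SIZE) using integer arithmetic.
num_batches() {
  echo $(( ($1 + BLOCK_SIZE - 1) / BLOCK_SIZE ))
}

num_batches 20    # prints 3; the third batch holds only 4 elements
num_batches 1000  # prints 125; the elements divide evenly
```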

The /cpu/self/ref/* backends are written in pure C and provide basic functionality.

The /cpu/self/opt/* backends are written in pure C and use partial e-vectors to improve performance.

The /cpu/self/avx/* backends rely upon AVX instructions to provide vectorized CPU performance.

The /cpu/self/memcheck/* backends rely upon the Valgrind Memcheck tool to help verify that user QFunctions have no undefined values. To use them, run your code with Valgrind and a Memcheck backend, e.g. valgrind ./build/ex1 -ceed /cpu/self/memcheck/serial. A ‘development’ or ‘debugging’ version of Valgrind with headers is required to use these backends. They can run in serial or blocked mode and default to serial mode if /cpu/self/memcheck is selected at runtime.

The /cpu/self/xsmm/* backends rely upon the LIBXSMM package to provide vectorized CPU performance. If linking MKL and LIBXSMM is desired but the Makefile is not detecting MKLROOT, linking libCEED against MKL can be forced by setting the environment variable MKL=1. The LIBXSMM main development branch from 7 April 2024 or newer is required.

The /gpu/cuda/* backends provide GPU performance strictly using CUDA.

The /gpu/hip/* backends provide GPU performance strictly using HIP. They are based on the /gpu/cuda/* backends. ROCm version 4.2 or newer is required.

The /gpu/hip/* backends can also run on non-AMD GPUs (e.g., Intel) via chipStar, which implements HIP on top of SPIR-V through Level Zero or OpenCL. To build against chipStar, set HIP_DIR to the chipStar install prefix (in place of ROCM_DIR); libCEED’s Makefile detects chipStar by inspecting hipconfig and automatically enables the required code paths. At runtime, chipStar’s own environment variables (e.g., CHIP_BE=level0 or CHIP_BE=opencl, CHIP_DEVICE_TYPE, CHIP_PLATFORM) select the backend and device. See the chipStar documentation for details.

The /gpu/sycl/* backends provide GPU performance strictly using SYCL. They are based on the /gpu/cuda/* and /gpu/hip/* backends.

The /gpu/*/magma/* backends rely upon the MAGMA package. To enable the MAGMA backends, the environment variable MAGMA_DIR must point to the top-level MAGMA directory, with the MAGMA library located in $(MAGMA_DIR)/lib/. By default, MAGMA_DIR is set to ../magma; to build the MAGMA backends with a MAGMA installation located elsewhere, create a link to magma/ in libCEED’s parent directory, or set MAGMA_DIR to the proper location. MAGMA version 2.5.0 or newer is required. Currently, each MAGMA library installation is only built for either CUDA or HIP. The corresponding set of libCEED backends (/gpu/cuda/magma/* or /gpu/hip/magma/*) will automatically be built for the version of the MAGMA library found in MAGMA_DIR.
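Before building, it can help to confirm that the chosen directory has the expected layout. The check below is a minimal sketch, not part of libCEED's Makefile; check_magma_dir is a hypothetical helper that only tests for the lib/ subdirectory described above and does not verify the MAGMA version or whether it was built for CUDA or HIP.

```shell
#!/bin/sh
# Hypothetical pre-build check (not part of libCEED's Makefile): report whether
# a MAGMA directory has the lib/ subdirectory described above.
check_magma_dir() {
  dir="${1:-../magma}"   # ../magma is the default location stated above
  if [ -d "$dir/lib" ]; then
    echo "found: $dir/lib"
  else
    echo "missing: $dir/lib"
    return 1
  fi
}

# Assumed usage, with MAGMA_DIR exported in the environment:
#   check_magma_dir "$MAGMA_DIR" && make
```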

Users can specify a device for any CUDA, HIP, or MAGMA backend by appending :device_id=# to the resource name, where # is the device number. For example:

  • /gpu/cuda/gen:device_id=1
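In a script, the suffix can be attached to whichever resource is in use. The snippet below is a small sketch; with_device is a hypothetical helper, not a libCEED command, and the usage line reuses the -ceed option from the Valgrind example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of libCEED): append a device id to a resource.
with_device() {
  printf '%s:device_id=%s\n' "$1" "$2"
}

with_device /gpu/cuda/gen 1   # prints /gpu/cuda/gen:device_id=1

# Assumed usage:
#   ./build/ex1 -ceed "$(with_device /gpu/hip/shared 0)"
```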

Bit-for-bit reproducibility is important in some applications. However, some libCEED backends use non-deterministic operations, such as atomicAdd, for increased performance. The backends that can generate reproducible results, given the proper compilation options, are marked Yes under Deterministic Capable in the backend list above.
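A script that must choose a reproducible backend can encode the Deterministic Capable entries directly. The lookup below is a sketch that mirrors the entries listed above. deterministic_capable is a hypothetical helper, not a libCEED query, and it would need updating whenever the backend list changes.

```shell
#!/bin/sh
# Hypothetical lookup (not part of libCEED) mirroring the "Deterministic
# Capable" entries listed above. Prints Yes, No, or Unknown.
deterministic_capable() {
  resource="${1%%:*}"   # drop any :device_id=# suffix
  case "$resource" in
    /gpu/cuda/gen|/gpu/hip/gen) echo No ;;
    /gpu/cuda/magma|/gpu/hip/magma) echo No ;;
    /gpu/cuda/magma/det|/gpu/hip/magma/det) echo Yes ;;
    # Every CPU resource in the list above is marked Yes.
    /cpu/self/*|/gpu/cuda/ref|/gpu/cuda/shared) echo Yes ;;
    /gpu/hip/ref|/gpu/hip/shared|/gpu/sycl/ref|/gpu/sycl/shared) echo Yes ;;
    *) echo Unknown ;;
  esac
}

deterministic_capable /gpu/cuda/gen:device_id=1   # prints No
deterministic_capable /gpu/hip/magma/det          # prints Yes
```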