Portable shared-memory parallelisation strategies for high-order finite element codes