Global and Local Work Size in OpenCL

The work-items in a given work-group execute concurrently on the processing elements of a single compute unit. This is a critical point in understanding the concurrency in OpenCL. ... OpenCL only assures that the work-items within a work-group execute concurrently (and share processor resources on the device).

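  • global work offset: shifts the values returned by get_global_id() in the kernel, so that global IDs start at the given offset instead of 0.
  • global work size: the number of work-items in each dimension that will execute the kernel. The total is global_work_size[0] * ... * global_work_size[work_dim - 1]; only work-items within the same work-group are assured to run concurrently.
  • local work size: the number of work-items in each dimension to be grouped together in a work-group.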
    • The total number of work-items in a work-group is computed as local_work_size[0] * ... * local_work_size[work_dim - 1].
    • The total number of work-items in the work-group must be less than or equal to the CL_DEVICE_MAX_WORK_GROUP_SIZE value reported by clGetDeviceInfo (see the table of OpenCL Device Queries), and
    • the number of work-items specified in local_work_size[0], ..., local_work_size[work_dim - 1] must be less than or equal to the corresponding values in CL_DEVICE_MAX_WORK_ITEM_SIZES[0], ..., CL_DEVICE_MAX_WORK_ITEM_SIZES[work_dim - 1].
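
The rules above can be sketched in plain C without touching the OpenCL runtime. This is only an illustration: the helper names (work_group_size, local_size_is_valid, global_id) are hypothetical and not part of the OpenCL API, and the limits passed in stand in for values a real host program would obtain from clGetDeviceInfo. The last helper assumes uniform work-group sizes and shows how the global work offset enters the value returned by get_global_id() in one dimension.

```c
#include <stdbool.h>
#include <stddef.h>

/* Total work-items in one work-group: the product of the
   per-dimension local sizes. (Hypothetical helper.) */
size_t work_group_size(unsigned work_dim, const size_t *local_work_size)
{
    size_t total = 1;
    for (unsigned d = 0; d < work_dim; ++d)
        total *= local_work_size[d];
    return total;
}

/* Checks both limits from the list above. max_group_size and
   max_item_sizes are assumed to hold the values queried for
   CL_DEVICE_MAX_WORK_GROUP_SIZE and CL_DEVICE_MAX_WORK_ITEM_SIZES. */
bool local_size_is_valid(unsigned work_dim, const size_t *local_work_size,
                         size_t max_group_size, const size_t *max_item_sizes)
{
    for (unsigned d = 0; d < work_dim; ++d) {
        /* Per-dimension limit; a zero size is rejected as well. */
        if (local_work_size[d] == 0 || local_work_size[d] > max_item_sizes[d])
            return false;
    }
    /* Limit on the total number of work-items in the group. */
    return work_group_size(work_dim, local_work_size) <= max_group_size;
}

/* Global ID of a work-item in one dimension: the global work offset
   plus the work-item's position within the index space. */
size_t global_id(size_t global_offset, size_t group_id,
                 size_t local_size, size_t local_id)
{
    return global_offset + group_id * local_size + local_id;
}
```

For example, with an assumed device limit of 256 work-items per work-group, a 16 x 16 local size passes (256 work-items) while 32 x 16 does not (512 work-items), even if each dimension is individually within CL_DEVICE_MAX_WORK_ITEM_SIZES.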
