cpp-opencl

Please see http://dimitri-christodoulou.blogspot.com.es/2014/02/implement-data-parallelism-on-gpu.html

The cpp-opencl project aims to make GPU programming easy. It lets you implement data parallelism on a GPU directly in C++ instead of writing OpenCL. In the example below, the code in the parallel_for_each lambda is executed on the GPU, and everything else is executed on the CPU. More specifically, the “square” function is executed on both the CPU (via a call to std::transform) and the GPU (via a call to compute::parallel_for_each). Conceptually, compute::parallel_for_each is similar to std::transform, except that the former executes code on the GPU and the latter on the CPU.

#include <algorithm>  // std::transform
#include <vector>
#include <stdio.h>
#include "ParallelForEach.h"

template<class T> 
T square(T x)  
{
    return x * x;
}

void func() {
  std::vector<int> In {1,2,3,4,5,6};
  std::vector<int> OutGpu(6);
  std::vector<int> OutCpu(6);

  // Executed on the GPU.
  compute::parallel_for_each(In.begin(), In.end(), OutGpu.begin(), [](int x){
      return square(x);
  });

  // Executed on the CPU.
  std::transform(In.begin(), In.end(), OutCpu.begin(), [](int x) {
    return square(x);
  });

  // Do something with OutCpu and OutGpu ...
}

int main() {
  func();
  return 0;
}
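
To make the std::transform analogy concrete, here is a minimal CPU-only sketch of the contract that compute::parallel_for_each appears to follow in the example above: apply a unary function to each input element and write the result to the output range. The names reference_for_each, outputs_match and squares_on_cpu are hypothetical and not part of cpp-opencl. They only suggest one way to fill in the "do something with OutCpu and OutGpu" step, by checking that two result vectors agree.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical CPU stand-in with the same call shape as the
// compute::parallel_for_each call in the example above. It is NOT part of
// cpp-opencl; it only shows the element-wise contract.
template <class InputIt, class OutputIt, class UnaryOp>
void reference_for_each(InputIt first, InputIt last, OutputIt out, UnaryOp op) {
  for (; first != last; ++first, ++out) {
    *out = op(*first);
  }
}

// Hypothetical helper: true when two result vectors have the same size and
// the same elements, e.g. OutGpu versus OutCpu.
template <class T>
bool outputs_match(const std::vector<T>& a, const std::vector<T>& b) {
  return a.size() == b.size() && std::equal(a.begin(), a.end(), b.begin());
}

template <class T>
T square(T x) {
  return x * x;
}

// Runs the reference loop over an input vector and returns the squares.
std::vector<int> squares_on_cpu(const std::vector<int>& in) {
  std::vector<int> out(in.size());
  reference_for_each(in.begin(), in.end(), out.begin(),
                     [](int x) { return square(x); });
  return out;
}
```

Inside func(), a call such as outputs_match(OutGpu, OutCpu) could then be used to confirm that the GPU and CPU paths produced the same values.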

Function Overloading

Additionally, it is possible to overload functions by execution target. The “A::GetIt” member function below is overloaded: the version marked “gpu” is executed on the GPU and the version marked “cpu” on the CPU.

struct A {
  int GetIt() const __attribute__((amp_restrict("cpu"))) {
    return 2;
  }
  int GetIt() const __attribute__((amp_restrict("gpu"))) {
    return 4;
  }
};

compute::parallel_for_each(In.begin(), In.end(), OutGpu.begin(), [](int x){
    A a; 
    return a.GetIt(); // returns 4
});
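
The overload selection above depends on the patched compiler. As a rough analogy only, the following standard C++ sketch picks between two overloads using tag types. The tags cpu_tag and gpu_tag, the struct B and the function call_get_it are all hypothetical. Nothing here runs on a GPU, and this is not how cpp-opencl implements amp_restrict. It only illustrates the idea that one member name can resolve to different bodies depending on the execution context.

```cpp
// Hypothetical tag types standing in for the "cpu" and "gpu" restrictions.
struct cpu_tag {};
struct gpu_tag {};

struct B {
  // Selected when the caller passes cpu_tag.
  int GetIt(cpu_tag) const { return 2; }
  // Selected when the caller passes gpu_tag.
  int GetIt(gpu_tag) const { return 4; }
};

// Hypothetical caller that forwards its context tag to the overload set.
template <class Context>
int call_get_it(Context ctx) {
  B b;
  return b.GetIt(ctx);
}
```

With the amp_restrict attribute, the call site needs no tag argument: a.GetIt() inside the parallel_for_each lambda resolves to the “gpu” version.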

To use function overloading with the amp_restrict attribute, you need to patch your Clang compiler:

git clone https://github.com/llvm-mirror/clang.git
cd clang
git checkout 5806bb59d2d19a9b32b739589865d8bb1e2627c5
git apply PATH-TO-cpp_opencl/restrict.patch

I used this LLVM version:

git clone https://github.com/llvm-mirror/llvm.git
cd llvm
git checkout 47042bcc266285676f8ff284e5d46a2c196c367b

If you do not intend to use the amp_restrict attribute, you can use any recent Clang version already installed on your machine, without the patch.

Build the Executable

The tool uses a special compiler based on Clang/LLVM. Compile your source file with cpp_opencl:

cpp_opencl -x c++ -std=c++11 -O3 -o Input.cc.o -c Input.cc

The above command generates four files:

Input.cc.o
Input.cc.cl
Input.cc_cpu.cpp
Input.cc_gpu.cpp

Use the Clang C++ compiler directly to link:

clang++ ./Input.cc.o -o test -lOpenCL

Then just execute:

./test
