Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
190 views
in Technique[技术] by (71.8m points)

c++ - External calls are not supported - CUDA

Objective is to call a device function available in another file, when i compile the global kernel it shows the following error *External calls are not supported (found non-inlined call to _Z6GoldenSectionCUDA)*.

Problematic Code (not the full code but where the problem arises), cat norm.h

# ifndef NORM_H_
# define NORM_H_
# include<stdio.h>

__device__ double invcdf(double prob, double mean, double stddev);

#endif

cat norm.cu

# include <norm.h>

__device__ double invcdf(double prob, double mean, double stddev) {
    return (mean + stddev*normcdfinv(prob));
       }

cat test.cu

# include <norm.h>
# include <curand.h>
# include <curand_kernel.h>

__global__ void phase2Kernel(double* out_profit, struct strategyHolder* strategy) {
       curandState seedValue;
       curand_init(threadIdx.x, 0, 0, &seedValue);
       double randomD = invcdf(curand_uniform_double( &seedValue ), 300, 80);
    }

nvcc -c norm.cu -o norm.o -I"."
nvcc -c test.cu -o test.o -I"."

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You're trying to do separate compilation, which needs some special command line options. See the NVCC manual for details, but here's how to get your example to compile. I've targeted sm_20, but you can target sm_20 or later depending on what GPU you have. Separate compilation is not possible on older devices (sm_1x).

  • You don't need to declare the __device__ function as extern in your header file, but if you have any static device variables they will need to be declared as extern
  • Generate relocatable code for the device by compiling as shown below (-dc is the device equivalent of -c, see the manual for more information)

    nvcc -arch=sm_20 -dc norm.cu -o norm.o -I.
    nvcc -arch=sm_20 -dc test.cu -o test.o -I.
    
  • Link the device parts of the code by calling nvlink before the final host link

    nvlink -arch=sm_20 norm.o test.o -o final.o
    

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...