[SYCL] spgemm #550

Draft · wants to merge 70 commits into base: master

Commits (70)
c07bcfb
Add sycl configure option and memory functionality
Jul 24, 2021
61d0edb
Change names and fix initialization
Jul 27, 2021
bafa6c2
Fix cuda compilation
waynemitchell Jul 27, 2021
c16315d
Choose default exec policy for matvec
Jul 28, 2021
6d9fb5c
Merge branch 'master' into sycl
Jul 28, 2021
c58f944
Start boxloop implementation
Aug 3, 2021
25348d4
Remove nonfunctional code for fresh start
Sep 16, 2021
58b6e23
Add simple driver and remove problematic flag from configure
waynemitchell Sep 17, 2021
0c58ebe
Reproducing invalid kernel name error in simple
waynemitchell Sep 27, 2021
5695c97
boxloop1 running on frank
waynemitchell Sep 29, 2021
2ad440f
Merge branch 'master' into sycl
waynemitchell Sep 29, 2021
f4d9ba4
Resolve further merge conflicts, passes struct tests
waynemitchell Sep 29, 2021
845a433
Non-reduction boxloops done
waynemitchell Sep 30, 2021
c733ad6
Merge branch 'master' into sycl
waynemitchell Sep 30, 2021
4ed00c4
First attempt at reduction boxloops, seg faulting right now
waynemitchell Oct 1, 2021
2fb3f27
Reproducing seg fault when trying to launch trivial reduction paralle…
waynemitchell Oct 5, 2021
001fb9f
Reduction boxloops done
waynemitchell Oct 5, 2021
df301df
Cleanup
waynemitchell Oct 6, 2021
4e54d48
Added hypreLoopBegin/End
waynemitchell Oct 6, 2021
a127622
Bug fix
waynemitchell Oct 19, 2021
94a269d
Fix configuration options for non-unified memory
waynemitchell Oct 25, 2021
39fbd2d
Update oneapi reduction
waynemitchell Oct 26, 2021
193ee25
Bug fix in parallel
waynemitchell Oct 26, 2021
9166c16
Additional macro fixes and implementation of redblack relax
waynemitchell Oct 26, 2021
4fca1be
Automatic selection of block dimension
waynemitchell Oct 27, 2021
a6383e8
zboxloop
liruipeng Oct 27, 2021
4ddcc4a
Fixes for compiler update on jlse
waynemitchell Oct 29, 2021
345b0d0
Renamings
waynemitchell Oct 29, 2021
f48eec0
Try different formulation of reduction
waynemitchell Oct 29, 2021
980bee5
Autoconf clean up
waynemitchell Nov 2, 2021
2d5ee90
Cleanup boxloops, renamings, make sure tests compile
waynemitchell Nov 2, 2021
496afa7
Merge branch 'master' into sycl
waynemitchell Nov 2, 2021
4d303d3
Some placeholders and changes to allow ij interface to run on the host
waynemitchell Nov 3, 2021
99c5d9d
Add cmake compilation
waynemitchell Nov 4, 2021
ec8c5de
Some code cleanup
waynemitchell Nov 4, 2021
3254e31
[SYCL] convert sycl::device to sycl::device* for better handling (#504)
abagusetty Nov 5, 2021
68fc8be
[SYCL] add complex types for device
abagusetty Dec 7, 2021
b7ebf4e
[SYCL] kernel launch macro
abagusetty Dec 8, 2021
31d6238
Merge branch 'master' into seq_mv_sycl
abagusetty Dec 8, 2021
35fa901
[SYCL] changes to function, var names from _cuda_ to _device_ for uni…
abagusetty Dec 8, 2021
243e2b8
[SYCL] update, unify new functions for CUDA and SYCL in csr_matop_device
abagusetty Dec 9, 2021
9eb1f7f
[SYCL] enable oneDPL and some more updates
abagusetty Dec 10, 2021
f11b593
[SYCL] adding sycl::gather and few more common GPU functions
abagusetty Dec 14, 2021
ae30f74
[SYCL] fix the sycl scatter_if
abagusetty Dec 14, 2021
c73ef06
[SYCL] fix the build issues from std::exclusive_scan, lambda for scat…
abagusetty Dec 14, 2021
d3e3bf0
[SYCL] cleanup for SYCL kernel query helper functions
abagusetty Dec 16, 2021
09674eb
[SYCL] spgemm initial commits
abagusetty Dec 20, 2021
ad32b6f
[SYCL] simplify namespace for sycl::ext::oneapi::sub_group to sycl::s…
abagusetty Dec 21, 2021
9c6b6bc
[SYCL] unify code for CUDA, HIP and SYCL for easier maintenance
abagusetty Dec 21, 2021
187e69e
[SYCL] fix build issues
abagusetty Dec 21, 2021
89b8cb1
Merge branch 'seq_mv_sycl' into seq_mv_spgemm
abagusetty Dec 21, 2021
b42dc4a
[SYCL] fixes build for sycl
abagusetty Dec 22, 2021
478f5eb
[SYCL] spgemm device attempt support
abagusetty Dec 23, 2021
e5b4a8a
[SYCL] add csr_spgemm confident port
abagusetty Dec 23, 2021
0e423d8
[SYCL] update spgemm kernels
abagusetty Dec 30, 2021
9ef2ea7
[SYCL] update
abagusetty Jan 4, 2022
34be7fc
[SYCL] update warp reduce functions (plus, max, min) with SYCL's sub-…
abagusetty Jan 4, 2022
07277ed
[SYCL/CUDA] fix build for sycl and cuda
abagusetty Jan 4, 2022
737261d
1. renaming hypre_SyncDeviceComputeStream() -> hypre_SyncComputeStrea…
abagusetty Jan 6, 2022
83cae95
[SYCL] oneMKLRNG uses oneDPL for simplicity, adding std versions for …
abagusetty Jan 7, 2022
e187b2c
[SYCL] replace oneMKL rng with oneDPL::random
abagusetty Jan 7, 2022
e26ece2
[SYCL] csr_spmv_device
abagusetty Jan 7, 2022
1024877
[SYCL] replace hypre_ResetCudaDevice to hypre_ResetGpuDevice
abagusetty Feb 1, 2022
3e4c7d8
[SYCL] add SYCL flags to configure, and SYCL device setup cleanup
abagusetty Feb 1, 2022
da7b951
Merge branch 'master' into seq_mv_spgemm
abagusetty Feb 1, 2022
356240d
[SYCL] fix merge conflicts, renamed *CudaDevice to *GpuDevice
abagusetty Feb 1, 2022
2b7999e
[SYCL] style commit
abagusetty Feb 1, 2022
a1cab34
[SYCL] rename preproc directive MKLRNG to MKLRAND
abagusetty Feb 2, 2022
32179dd
[SYCL] add few more cases to SYCL kernel, SYCL device singleton class…
abagusetty Feb 12, 2022
ad96cd9
[SYCL] run formatting
abagusetty Feb 12, 2022
Files changed
21 changes: 11 additions & 10 deletions src/CMakeLists.txt
@@ -106,6 +106,9 @@ option(HYPRE_WITH_CUDA "Use CUDA. Require cuda-8.0 or higher" OFF)
option(HYPRE_WITH_SYCL "Use SYCL" OFF)
option(HYPRE_ENABLE_UNIFIED_MEMORY "Use unified memory for allocating the memory" OFF)
# CUDA options
+option(HYPRE_ENABLE_ONEMKLSPARSE "Use oneMKLSPARSE" OFF)
+option(HYPRE_ENABLE_ONEMKLBLAS "Use oneMKLBLAS" OFF)
+option(HYPRE_ENABLE_ONEMKLRAND "Use oneMKLRNG" ON)
option(HYPRE_ENABLE_CUDA_STREAMS "Use CUDA streams" ON)
option(HYPRE_ENABLE_CUSPARSE "Use cuSPARSE" ON)
option(HYPRE_ENABLE_DEVICE_POOL "Use device memory pool" OFF)

@@ -289,7 +292,7 @@ endif (HYPRE_WITH_CUDA)
# SYCL
if (HYPRE_WITH_SYCL)
enable_language(CXX)
-message(STATUS "Enabled support for CXX.")
+message(STATUS "Enabled support for SYCL.")

# Enforce C++17
if (NOT CMAKE_CXX_STANDARD OR CMAKE_CXX_STANDARD LESS 17)

@@ -299,17 +302,14 @@ if (HYPRE_WITH_SYCL)

message(STATUS "Using CXX standard: c++${CMAKE_CXX_STANDARD}")

-# Set CXX compiler to dpcpp
-set(CMAKE_CXX_COMPILER "dpcpp")
-
# Add any extra CXX compiler flags HYPRE_WITH_EXTRA_CXXFLAGS
if (NOT HYPRE_WITH_EXTRA_CXXFLAGS STREQUAL "")
string(REPLACE " " ";" HYPRE_WITH_EXTRA_CXXFLAGS "${HYPRE_WITH_EXTRA_CXXFLAGS}")
add_compile_options("$<$<COMPILE_LANGUAGE:CXX>:${HYPRE_WITH_EXTRA_CXXFLAGS}>")
endif ()

-set(HYPRE_USING_SYCL ON CACHE BOOL "" FORCE)
-set(HYPRE_USING_GPU ON CACHE BOOL "" FORCE)
+set(HYPRE_USING_SYCL ON CACHE BOOL "" FORCE)
+set(HYPRE_USING_GPU ON CACHE BOOL "" FORCE)

if (HYPRE_ENABLE_UNIFIED_MEMORY)
set(HYPRE_USING_UNIFIED_MEMORY ON CACHE BOOL "" FORCE)

@@ -324,12 +324,17 @@ if (HYPRE_WITH_SYCL)
set(HYPRE_BUILD_EXAMPLES OFF CACHE BOOL "" FORCE)
endif ()

add_compile_options("$<$<COMPILE_LANGUAGE:CXX>:-D_GLIBCXX_USE_TBB_PAR_BACKEND=0>")
add_compile_options("$<$<COMPILE_LANGUAGE:CXX>:-sycl-std=2020>")
add_compile_options("$<$<COMPILE_LANGUAGE:CXX>:-fsycl>")
add_compile_options("$<$<COMPILE_LANGUAGE:CXX>:-fsycl-unnamed-lambda>")
add_compile_options("$<$<COMPILE_LANGUAGE:CXX>:-fsycl-device-only>")
add_compile_options("$<$<COMPILE_LANGUAGE:CXX>:-fsycl-device-code-split=per_kernel>")

set(HYPRE_USING_HOST_MEMORY OFF CACHE BOOL "" FORCE)

# include(HYPRE_SetupSYCLToolkit)
# set(CMAKE_CXX_FLAGS "-I/home/abagusetty/build/oneDPL/build_sycl/include ${CMAKE_CXX_FLAGS}")
endif (HYPRE_WITH_SYCL)

# Add any extra C compiler flags HYPRE_WITH_EXTRA_CFLAGS

@@ -477,10 +482,6 @@ if (HYPRE_USING_CUDA)
set_source_files_properties(${HYPRE_GPU_SOURCES} PROPERTIES LANGUAGE CUDA)
endif ()

-if (HYPRE_USING_SYCL)
-set_source_files_properties(${HYPRE_GPU_SOURCES} PROPERTIES LANGUAGE CXX)
-endif ()
-
# Set MPI compile flags
if (NOT HYPRE_SEQUENTIAL)
find_program(MPIEXEC_EXECUTABLE NAMES mpiexec mpirun)
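The flags added above (-fsycl, -fsycl-unnamed-lambda, -sycl-std=2020, -fsycl-device-code-split=per_kernel) are what let DPC++ compile hypre's kernels as SYCL 2020 code. As a rough illustration of what -fsycl-unnamed-lambda buys, here is a minimal standalone SYCL program (not part of this PR; assumes a oneAPI/DPC++ toolchain with USM support) whose kernel lambda needs no explicit name template argument:

// Minimal SYCL 2020 sketch; build assumption: dpcpp -fsycl-unnamed-lambda example.cpp
#include <sycl/sycl.hpp>   // older dpcpp releases shipped this header as <CL/sycl.hpp>

int main()
{
   sycl::queue q;                                // select the default device
   int *data = sycl::malloc_shared<int>(8, q);   // USM allocation visible to host and device

   // With -fsycl-unnamed-lambda no kernel name such as parallel_for<class MyKernel>
   // is required; the lambda stays anonymous, which generic launch macros rely on.
   q.parallel_for(sycl::range<1>(8), [=](sycl::id<1> i) {
      data[i] = static_cast<int>(i[0]);
   }).wait();

   sycl::free(data, q);
   return 0;
}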
6 changes: 3 additions & 3 deletions src/IJ_mv/HYPRE_IJVector.c
@@ -240,7 +240,7 @@ HYPRE_IJVectorSetValues( HYPRE_IJVector vector,

if ( hypre_IJVectorObjectType(vec) == HYPRE_PARCSR )
{
-#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
+#if defined(HYPRE_USING_GPU)
HYPRE_ExecutionPolicy exec = hypre_GetExecPolicy1( hypre_IJVectorMemoryLocation(vector) );

if (exec == HYPRE_EXEC_DEVICE)

@@ -295,7 +295,7 @@ HYPRE_IJVectorAddToValues( HYPRE_IJVector vector,

if ( hypre_IJVectorObjectType(vec) == HYPRE_PARCSR )
{
-#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
+#if defined(HYPRE_USING_GPU)
HYPRE_ExecutionPolicy exec = hypre_GetExecPolicy1( hypre_IJVectorMemoryLocation(vector) );

if (exec == HYPRE_EXEC_DEVICE)

@@ -333,7 +333,7 @@ HYPRE_IJVectorAssemble( HYPRE_IJVector vector )

if ( hypre_IJVectorObjectType(vec) == HYPRE_PARCSR )
{
-#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
+#if defined(HYPRE_USING_GPU)
HYPRE_ExecutionPolicy exec = hypre_GetExecPolicy1( hypre_IJVectorMemoryLocation(vector) );

if (exec == HYPRE_EXEC_DEVICE)
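The recurring guard rewrite in this file, from "#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)" to "#if defined(HYPRE_USING_GPU)", relies on HYPRE_USING_GPU being defined whenever any device backend is enabled; the configure.in hunk near the end of this diff shows the corresponding AC_DEFINE(HYPRE_USING_GPU, 1, ...) for SYCL. A sketch of the equivalent umbrella logic (illustrative only; in hypre the macro actually comes from configure/CMake rather than a header like this):

/* Hypothetical umbrella guard, shown for illustration. */
#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP) || defined(HYPRE_USING_SYCL)
#ifndef HYPRE_USING_GPU
#define HYPRE_USING_GPU 1
#endif
#endif

/* Call sites then test one macro instead of enumerating backends: */
#if defined(HYPRE_USING_GPU)
/* device execution path */
#else
/* host execution path */
#endif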
6 changes: 3 additions & 3 deletions src/IJ_mv/IJMatrix_parcsr_device.c
@@ -164,8 +164,8 @@ hypre_IJMatrixSetAddValuesParCSRDevice( hypre_IJMatrix *matrix,
/* mark unwanted elements as -1 */
dim3 bDim = hypre_GetDefaultDeviceBlockDimension();
dim3 gDim = hypre_GetDefaultDeviceGridDimension(len1, "thread", bDim);
-HYPRE_CUDA_LAUNCH( hypreCUDAKernel_IJMatrixValues_dev1, gDim, bDim, len1, indicator,
-(HYPRE_Int *) row_indexes, ncols, indicator );
+HYPRE_GPU_LAUNCH( hypreCUDAKernel_IJMatrixValues_dev1, gDim, bDim, len1, indicator,
+(HYPRE_Int *) row_indexes, ncols, indicator );

auto new_end = HYPRE_THRUST_CALL(
copy_if,

@@ -233,7 +233,7 @@ hypre_IJMatrixAssembleSortAndReduce1(HYPRE_Int N0, HYPRE_BigInt *I0, HYPRE_Big
/*
dim3 bDim = hypre_GetDefaultDeviceBlockDimension();
dim3 gDim = hypre_GetDefaultDeviceGridDimension(N0, "thread", bDim);
-HYPRE_CUDA_LAUNCH( hypreCUDAKernel_IJMatrixAssembleSortAndReduce1, gDim, bDim, N0, I0, J0, X0, A0 );
+HYPRE_GPU_LAUNCH( hypreCUDAKernel_IJMatrixAssembleSortAndReduce1, gDim, bDim, N0, I0, J0, X0, A0 );
*/

/* output X: 0: keep, 1: zero-out */
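The HYPRE_CUDA_LAUNCH to HYPRE_GPU_LAUNCH rename keeps call sites like the ones above backend-neutral: the grid/block dimensions and argument list stay the same, and only the macro's expansion differs per backend. A sketch of how such a macro can dispatch (a hypothetical reconstruction, not hypre's actual definition; the 'queue' handle and the 1-D size arithmetic are assumptions, since hypre's dim3 arguments would need conversion):

/* Hypothetical backend-neutral launch macro, for illustration only. */
#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
#define HYPRE_GPU_LAUNCH(kernel, gridsize, blocksize, ...) \
   kernel<<<(gridsize), (blocksize)>>>(__VA_ARGS__)
#elif defined(HYPRE_USING_SYCL)
/* SYCL has no <<<...>>> launch syntax: the same call becomes a parallel_for
 * over an nd_range derived from the grid and block dimensions. 'queue' stands
 * in for however the sycl::queue* is fetched from the hypre handle. */
#define HYPRE_GPU_LAUNCH(kernel, gridsize, blocksize, ...)              \
   queue->parallel_for(                                                 \
      sycl::nd_range<1>((gridsize) * (blocksize), (blocksize)),         \
      [=](sycl::nd_item<1> item) { kernel(item, __VA_ARGS__); })
#endif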
32 changes: 16 additions & 16 deletions src/IJ_mv/IJVector_parcsr.c
@@ -124,7 +124,7 @@ HYPRE_Int
hypre_IJVectorSetMaxOffProcElmtsPar(hypre_IJVector *vector,
HYPRE_Int max_off_proc_elmts)
{
-hypre_AuxParVector *aux_vector;
+hypre_AuxParVector *aux_vector = NULL;

aux_vector = (hypre_AuxParVector*) hypre_IJVectorTranslator(vector);
if (!aux_vector)

@@ -134,7 +134,7 @@ hypre_IJVectorSetMaxOffProcElmtsPar(hypre_IJVector *vector,
}
hypre_AuxParVectorMaxOffProcElmts(aux_vector) = max_off_proc_elmts;

-#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
+#if defined(HYPRE_USING_GPU)
hypre_AuxParVectorUsrOffProcElmts(aux_vector) = max_off_proc_elmts;
#endif

@@ -156,7 +156,7 @@ hypre_IJVectorDistributePar(hypre_IJVector *vector,
const HYPRE_Int *vec_starts)
{
hypre_ParVector *old_vector = (hypre_ParVector*) hypre_IJVectorObject(vector);
-hypre_ParVector *par_vector;
+hypre_ParVector *par_vector = NULL;
HYPRE_Int print_level = hypre_IJVectorPrintLevel(vector);

if (!old_vector)

@@ -208,8 +208,8 @@ hypre_IJVectorZeroValuesPar(hypre_IJVector *vector)

hypre_ParVector *par_vector = (hypre_ParVector*) hypre_IJVectorObject(vector);
MPI_Comm comm = hypre_IJVectorComm(vector);
-HYPRE_BigInt *partitioning;
-hypre_Vector *local_vector;
+HYPRE_BigInt *partitioning = NULL;
+hypre_Vector *local_vector = NULL;
HYPRE_Int print_level = hypre_IJVectorPrintLevel(vector);

hypre_MPI_Comm_rank(comm, &my_id);

@@ -281,13 +281,13 @@ hypre_IJVectorSetValuesPar(hypre_IJVector *vector,
HYPRE_Int my_id;
HYPRE_Int j, k;
HYPRE_BigInt i, vec_start, vec_stop;
-HYPRE_Complex *data;
+HYPRE_Complex *data = NULL;
HYPRE_Int print_level = hypre_IJVectorPrintLevel(vector);

HYPRE_BigInt *IJpartitioning = hypre_IJVectorPartitioning(vector);
hypre_ParVector *par_vector = (hypre_ParVector*) hypre_IJVectorObject(vector);
MPI_Comm comm = hypre_IJVectorComm(vector);
-hypre_Vector *local_vector;
+hypre_Vector *local_vector = NULL;

/* If no components are to be set, perform no checking and return */
if (num_values < 1) { return 0; }

@@ -394,14 +394,14 @@ hypre_IJVectorAddToValuesPar(hypre_IJVector *vector,
{
HYPRE_Int my_id;
HYPRE_Int i, j, vec_start, vec_stop;
-HYPRE_Complex *data;
+HYPRE_Complex *data = NULL;
HYPRE_Int print_level = hypre_IJVectorPrintLevel(vector);

HYPRE_BigInt *IJpartitioning = hypre_IJVectorPartitioning(vector);
hypre_ParVector *par_vector = (hypre_ParVector*) hypre_IJVectorObject(vector);
hypre_AuxParVector *aux_vector = (hypre_AuxParVector*) hypre_IJVectorTranslator(vector);
MPI_Comm comm = hypre_IJVectorComm(vector);
-hypre_Vector *local_vector;
+hypre_Vector *local_vector = NULL;

/* If no components are to be retrieved, perform no checking and return */
if (num_values < 1) { return 0; }

@@ -557,8 +557,8 @@ hypre_IJVectorAssemblePar(hypre_IJVector *vector)
{
HYPRE_Int off_proc_elmts, current_num_elmts;
HYPRE_Int max_off_proc_elmts;
-HYPRE_BigInt *off_proc_i;
-HYPRE_Complex *off_proc_data;
+HYPRE_BigInt *off_proc_i = NULL;
+HYPRE_Complex *off_proc_data = NULL;
current_num_elmts = hypre_AuxParVectorCurrentOffProcElmts(aux_vector);
hypre_MPI_Allreduce(&current_num_elmts, &off_proc_elmts, 1, HYPRE_MPI_INT,
hypre_MPI_SUM, comm);

@@ -704,12 +704,12 @@ hypre_IJVectorAssembleOffProcValsPar( hypre_IJVector *vector,
HYPRE_Int first_index;

void *void_contact_buf = NULL;
-void *index_ptr;
-void *recv_data_ptr;
+void *index_ptr = NULL;
+void *recv_data_ptr = NULL;

HYPRE_Complex tmp_complex;
HYPRE_BigInt *ex_contact_buf = NULL;
-HYPRE_Complex *vector_data;
+HYPRE_Complex *vector_data = NULL;
HYPRE_Complex value;

hypre_DataExchangeResponse response_obj1, response_obj2;

@@ -718,7 +718,7 @@ hypre_IJVectorAssembleOffProcValsPar( hypre_IJVector *vector,
MPI_Comm comm = hypre_IJVectorComm(vector);
hypre_ParVector *par_vector = (hypre_ParVector*) hypre_IJVectorObject(vector);

-hypre_IJAssumedPart *apart;
+hypre_IJAssumedPart *apart = NULL;

hypre_MPI_Comm_rank(comm, &myid);

@@ -1119,7 +1119,7 @@ hypre_IJVectorAssembleOffProcValsPar( hypre_IJVector *vector,
hypre_TMemcpy(off_proc_data_recv_d, off_proc_data_recv, HYPRE_Complex, off_proc_nelm_recv_cur,
HYPRE_MEMORY_DEVICE, HYPRE_MEMORY_HOST);

-#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
+#if defined(HYPRE_USING_GPU)
hypre_IJVectorSetAddValuesParDevice(vector, off_proc_nelm_recv_cur, off_proc_i_recv_d,
off_proc_data_recv_d, "add");
#endif
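Nearly every hunk in this file makes the same one-line change: local pointers (aux_vector, par_vector, data, local_vector, ...) are now initialized to NULL at declaration, since several of these routines have early-return and error paths that may test or free a pointer before any assignment. A minimal standalone C illustration of the failure mode this removes (not hypre code):

#include <stdlib.h>

/* With 'data' left uninitialized, taking the early-exit branch would make
 * free(data) read an indeterminate pointer, which is undefined behavior.
 * Initializing to NULL makes the cleanup path well-defined: free(NULL)
 * is a no-op. */
static void example(int fail_early)
{
   double *data = NULL;

   if (fail_early)
   {
      goto cleanup;              /* 'data' is never assigned on this path */
   }

   data = (double *) malloc(16 * sizeof(*data));
   /* ... use data ... */

cleanup:
   free(data);
}

int main(void)
{
   example(1);                   /* safe only because of the NULL initialization */
   example(0);
   return 0;
}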
6 changes: 3 additions & 3 deletions src/IJ_mv/IJVector_parcsr_device.c
@@ -251,9 +251,9 @@ hypre_IJVectorAssembleParDevice(hypre_IJVector *vector)
/* set/add to local vector */
dim3 bDim = hypre_GetDefaultDeviceBlockDimension();
dim3 gDim = hypre_GetDefaultDeviceGridDimension(new_nnz, "thread", bDim);
-HYPRE_CUDA_LAUNCH( hypreCUDAKernel_IJVectorAssemblePar, gDim, bDim, new_nnz, new_data, new_i,
-vec_start, new_sora,
-hypre_VectorData(hypre_ParVectorLocalVector(par_vector)) );
+HYPRE_GPU_LAUNCH( hypreCUDAKernel_IJVectorAssemblePar, gDim, bDim, new_nnz, new_data, new_i,
+vec_start, new_sora,
+hypre_VectorData(hypre_ParVectorLocalVector(par_vector)) );

hypre_TFree(new_i, HYPRE_MEMORY_DEVICE);
hypre_TFree(new_data, HYPRE_MEMORY_DEVICE);
4 changes: 2 additions & 2 deletions src/IJ_mv/_hypre_IJ_mv.h
@@ -170,7 +170,7 @@ typedef struct

HYPRE_MemoryLocation memory_location;

-#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
+#if defined(HYPRE_USING_GPU)
HYPRE_Int max_stack_elmts; /* length of stash for SetValues and AddToValues*/
HYPRE_Int current_stack_elmts; /* current no. of elements stored in stash */
HYPRE_BigInt *stack_i; /* contains row indices */

@@ -193,7 +193,7 @@

#define hypre_AuxParVectorMemoryLocation(vector) ((vector) -> memory_location)

-#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
+#if defined(HYPRE_USING_GPU)
#define hypre_AuxParVectorMaxStackElmts(vector) ((vector) -> max_stack_elmts)
#define hypre_AuxParVectorCurrentStackElmts(vector) ((vector) -> current_stack_elmts)
#define hypre_AuxParVectorStackI(vector) ((vector) -> stack_i)
4 changes: 2 additions & 2 deletions src/IJ_mv/aux_par_vector.c
@@ -32,7 +32,7 @@ hypre_AuxParVectorCreate( hypre_AuxParVector **aux_vector)
hypre_AuxParVectorOffProcI(vector) = NULL;
hypre_AuxParVectorOffProcData(vector) = NULL;
hypre_AuxParVectorMemoryLocation(vector) = HYPRE_MEMORY_HOST;
-#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
+#if defined(HYPRE_USING_GPU)
hypre_AuxParVectorMaxStackElmts(vector) = 0;
hypre_AuxParVectorCurrentStackElmts(vector) = 0;
hypre_AuxParVectorStackI(vector) = NULL;

@@ -61,7 +61,7 @@ hypre_AuxParVectorDestroy( hypre_AuxParVector *vector )
hypre_TFree(hypre_AuxParVectorOffProcI(vector), HYPRE_MEMORY_HOST);
hypre_TFree(hypre_AuxParVectorOffProcData(vector), HYPRE_MEMORY_HOST);

-#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
+#if defined(HYPRE_USING_GPU)
hypre_TFree(hypre_AuxParVectorStackI(vector), hypre_AuxParVectorMemoryLocation(vector));
hypre_TFree(hypre_AuxParVectorStackData(vector), hypre_AuxParVectorMemoryLocation(vector));
hypre_TFree(hypre_AuxParVectorStackSorA(vector), hypre_AuxParVectorMemoryLocation(vector));
4 changes: 2 additions & 2 deletions src/IJ_mv/aux_par_vector.h
@@ -30,7 +30,7 @@ typedef struct

HYPRE_MemoryLocation memory_location;

-#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
+#if defined(HYPRE_USING_GPU)
HYPRE_Int max_stack_elmts; /* length of stash for SetValues and AddToValues*/
HYPRE_Int current_stack_elmts; /* current no. of elements stored in stash */
HYPRE_BigInt *stack_i; /* contains row indices */

@@ -53,7 +53,7 @@

#define hypre_AuxParVectorMemoryLocation(vector) ((vector) -> memory_location)

-#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_HIP)
+#if defined(HYPRE_USING_GPU)
#define hypre_AuxParVectorMaxStackElmts(vector) ((vector) -> max_stack_elmts)
#define hypre_AuxParVectorCurrentStackElmts(vector) ((vector) -> current_stack_elmts)
#define hypre_AuxParVectorStackI(vector) ((vector) -> stack_i)
2 changes: 1 addition & 1 deletion src/config/HYPRE_config.h.in
@@ -220,7 +220,7 @@
/* onemkl::BLAS being used */
#undef HYPRE_USING_ONEMKLBLAS

-/* onemkl::rng being used */
+/* onemkl::rng::device being used */
#undef HYPRE_USING_ONEMKLRAND

/* onemkl::SPARSE being used */
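The comment tweak above tracks commit e187b2c, which replaced the oneMKL host RNG with oneDPL's device-side generators behind the same HYPRE_USING_ONEMKLRAND switch. A sketch of the device-side pattern, assuming oneDPL's <oneapi/dpl/random> engines and distributions are available (illustrative only, not this PR's code):

// Sketch of device-side random generation with oneDPL; assumes a oneAPI
// toolchain with oneDPL and USM device allocations.
#include <oneapi/dpl/random>
#include <sycl/sycl.hpp>

int main()
{
   sycl::queue q;
   const size_t n = 256;
   float *x = sycl::malloc_device<float>(n, q);

   q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
      // Each work-item builds its own engine state from a seed and an offset,
      // so no generator state is shared across work-items.
      oneapi::dpl::minstd_rand engine(1234 /*seed*/, i[0] /*offset*/);
      oneapi::dpl::uniform_real_distribution<float> distr(0.0f, 1.0f);
      x[i] = distr(engine);
   }).wait();

   sycl::free(x, q);
   return 0;
}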
12 changes: 6 additions & 6 deletions src/config/configure.in
@@ -209,7 +209,7 @@ dnl *********************************************************************
hypre_using_sycl=no
hypre_using_onemklsparse=no
hypre_using_onemklblas=no
-hypre_using_onemklrand=no
+hypre_using_onemklrand=yes

hypre_found_mkl=no

@@ -2341,20 +2341,20 @@ dnl *********************************************************************
dnl * Set SYCL options
dnl *********************************************************************
AS_IF([test x"$hypre_using_sycl" == x"yes"],
-[
+[
AC_DEFINE(HYPRE_USING_GPU, 1, [Define to 1 if executing on GPU device])
AC_DEFINE(HYPRE_USING_SYCL, 1, [SYCL being used])

dnl (Ab)Using CUCC when compiling HIP
-LINK_CC=${CUCC}
+LINK_CC=${CUCC}
LINK_CXX=${CUCC}

-SYCLFLAGS="-fsycl -fsycl-unnamed-lambda -fsycl-device-code-split=per_kernel"
+SYCLFLAGS="-Wunknown-pragmas -Wunused-local-typedef -Wunused-lambda-capture -D_GLIBCXX_USE_TBB_PAR_BACKEND=0 -std=c++17 -sycl-std=2020 -fsycl -fsycl-unnamed-lambda -fsycl-device-code-split=per_kernel"
if test "$hypre_using_debug" = "yes"
then
-SYCLFLAGS="-Wall -g ${SYCLFLAGS}"
+SYCLFLAGS="-O0 -Wall -g -fno-sycl-early-optimizations ${SYCLFLAGS}"
else
-SYCLFLAGS="-g -O3 ${SYCLFLAGS}"
+SYCLFLAGS="-g -O2 ${SYCLFLAGS}"
fi

dnl (Ab)Use CUFLAGS to capture SYCL compilation flags