[DO NOT MERGE] MueLu: Refactor CoalesceDropFactory_kokkos #12861
base: develop
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information
Test Name: Trilinos_PR_gcc-8.3.0
Test Name: Trilinos_PR_gcc-8.3.0-serial
Test Name: Trilinos_PR_gcc-8.3.0-debug
Test Name: Trilinos_PR_clang-11.0.1
Test Name: Trilinos_PR_python3
Test Name: Trilinos_PR_cuda-11.4.2-uvm-off
Test Name: Trilinos_PR_intel-2021.3
Test Name: Trilinos_PR_cuda-11.4.20-uvm
Pull Request Author: cgcgcg
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED
Build Information
Test Name: Trilinos_PR_gcc-8.3.0
Test Name: Trilinos_PR_gcc-8.3.0-serial
Test Name: Trilinos_PR_gcc-8.3.0-debug
Test Name: Trilinos_PR_clang-11.0.1
Test Name: Trilinos_PR_python3
Test Name: Trilinos_PR_cuda-11.4.2-uvm-off
Test Name: Trilinos_PR_intel-2021.3
Test Name: Trilinos_PR_cuda-11.4.20-uvm
bc183ae to 5f90623
packages/muelu/src/Graph/MatrixTransformation/MueLu_DroppingCommon.hpp
auto aiiajj = ATS::magnitude(diag(rlid)) * ATS::magnitude(diag(clid));  // |a_ii|*|a_jj|
auto aij2   = ATS::magnitude(val) * ATS::magnitude(val);                // |a_ij|^2

results(offset + k) = (aij2 <= eps * eps * aiiajj) ? DROP : KEEP;
It would be useful to note somewhere that this is classical (SA) dropping, not classical (RS) dropping.
I'm not sure I understand. What's the difference?
@GrahamBenHarper means that "classical" could refer either to Ruge-Stüben (RS) dropping or to the standard smoothed aggregation (SA) dropping criterion.
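To make the distinction in this thread concrete, here is a minimal C++ sketch that contrasts the two tests. `sa_decision` mirrors the snippet under review. `rs_decision` uses the textbook Ruge-Stüben strength test, which keeps a_ij when -a_ij >= theta * max_{k != i}(-a_ik). The function names, the plain `double` scalars, and the parameter `theta` are assumptions made for illustration; none of this is MueLu API.

```cpp
#include <cassert>
#include <cmath>

enum Decision { DROP, KEEP };

// Smoothed aggregation (SA) test, as in the reviewed snippet:
// drop a_ij when |a_ij|^2 <= eps^2 * |a_ii| * |a_jj|.
inline Decision sa_decision(double aij, double aii, double ajj, double eps) {
  const double aiiajj = std::abs(aii) * std::abs(ajj);  // |a_ii|*|a_jj|
  const double aij2   = std::abs(aij) * std::abs(aij);  // |a_ij|^2
  return (aij2 <= eps * eps * aiiajj) ? DROP : KEEP;
}

// Ruge-Stueben (RS) style test (textbook form, hypothetical helper):
// keep a_ij when -a_ij >= theta * max_{k != i}(-a_ik).
// max_neg_offdiag is that row-wise maximum, computed by the caller.
inline Decision rs_decision(double aij, double max_neg_offdiag, double theta) {
  const bool strong =
      max_neg_offdiag > 0.0 && -aij >= theta * max_neg_offdiag;
  return strong ? KEEP : DROP;
}
```

The SA test scales each entry by the two diagonal entries it couples, while the RS test compares it against the largest negative off-diagonal entry in the same row, so the same entry can be kept by one test and dropped by the other.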
2a34193 to 40cfcf7
98e537d to a5dec0f
Signed-off-by: Christian Glusa <caglusa@sandia.gov>
The Dirichlet threshold detection was not used when the drop tol != 0.
Signed-off-by: Christian Glusa <caglusa@sandia.gov>
Signed-off-by: Christian Glusa <caglusa@sandia.gov>
9a28985 to 1362ac3
Signed-off-by: Christian Glusa <caglusa@sandia.gov>
Signed-off-by: Christian Glusa <caglusa@sandia.gov>
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request.
Status Flag 'Pull Request AutoTester' - GitHub reports Mergeable status = False
@trilinos/muelu
Motivation
We eventually want to merge `CoalesceDropFactory` and `CoalesceDropFactory_kokkos`. Some of the dropping schemes (e.g. cut-drop) in `CoalesceDropFactory` need access to the entire row. Moreover, the two factories are so messy because we want to apply a whole variety of different criteria.

I refactored the code to enable more composability than before and to break up the algorithms into smaller, more readable units. Criteria take the form of functors that are applied to each row. This means that we need at most 3 kernel launches, instead of launching kernels for different criteria one-by-one.
One large change is that I allocate a view for the dropping decisions. On the other hand, we no longer allocate and then shrink a copy of the original matrix.
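The design described above can be sketched in plain, serial C++. This is only an illustration of the idea, not the code in this PR: the `Csr` struct, the functor names, `decide`, and `filter` are all hypothetical, and ordinary loops stand in for Kokkos kernels. Splitting the work into "decide, count, fill" is one plausible reading of the three launches, stated here as an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

enum Decision : char { KEEP, DROP };

// Minimal CSR matrix (hypothetical stand-in for the real matrix type).
struct Csr {
  std::vector<std::size_t> rowptr, colind;
  std::vector<double> values;
  std::size_t num_rows() const { return rowptr.size() - 1; }
};

// Criterion functor: visits one whole row and marks small entries as DROP.
struct DropSmallEntries {
  const Csr& A;
  const std::vector<double>& diag;
  double eps;
  void operator()(std::size_t row, std::vector<Decision>& results) const {
    for (std::size_t k = A.rowptr[row]; k < A.rowptr[row + 1]; ++k) {
      const double aiiajj = std::abs(diag[row]) * std::abs(diag[A.colind[k]]);
      const double aij2 = A.values[k] * A.values[k];
      if (aij2 <= eps * eps * aiiajj) results[k] = DROP;
    }
  }
};

// Second criterion, applied after the first: never drop the diagonal.
struct KeepDiagonal {
  const Csr& A;
  void operator()(std::size_t row, std::vector<Decision>& results) const {
    for (std::size_t k = A.rowptr[row]; k < A.rowptr[row + 1]; ++k)
      if (A.colind[k] == row) results[k] = KEEP;
  }
};

// Pass 1: run every criterion on every row in a single sweep.
// Only the decisions array is written; the matrix is not copied.
template <class... Criteria>
std::vector<Decision> decide(const Csr& A, const Criteria&... criteria) {
  std::vector<Decision> results(A.values.size(), KEEP);
  for (std::size_t row = 0; row < A.num_rows(); ++row) {
    int expand[] = {0, (criteria(row, results), 0)...};  // call in order
    (void)expand;
  }
  return results;
}

// Passes 2 and 3: count kept entries per row, then fill the filtered matrix.
inline Csr filter(const Csr& A, const std::vector<Decision>& results) {
  Csr B;
  B.rowptr.assign(A.num_rows() + 1, 0);
  for (std::size_t row = 0; row < A.num_rows(); ++row) {
    for (std::size_t k = A.rowptr[row]; k < A.rowptr[row + 1]; ++k)
      if (results[k] == KEEP) ++B.rowptr[row + 1];
    B.rowptr[row + 1] += B.rowptr[row];  // running sum gives row offsets
  }
  B.colind.resize(B.rowptr.back());
  B.values.resize(B.rowptr.back());
  for (std::size_t row = 0; row < A.num_rows(); ++row) {
    std::size_t pos = B.rowptr[row];
    for (std::size_t k = A.rowptr[row]; k < A.rowptr[row + 1]; ++k)
      if (results[k] == KEEP) {
        B.colind[pos] = A.colind[k];
        B.values[pos] = A.values[k];
        ++pos;
      }
  }
  return B;
}
```

A call such as `filter(A, decide(A, DropSmallEntries{A, diag, 0.5}, KeepDiagonal{A}))` applies both criteria in one sweep over the rows and sizes the output exactly, so nothing has to be allocated at full size and shrunk afterwards.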
Questions/todos:
`Kokkos_ENABLE_CUDA_CONSTEXPR=ON` to allow the use of `std::get`. Is there a way around that?
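One possible way around `std::get`, sketched under the assumption that it is only needed to walk a `std::tuple` of criterion functors: store the functors in a hand-written recursive chain whose call operator invokes each one in turn. Everything below (`Chain`, `make_chain`, `SetTo`, `AddOne`) is hypothetical and written as plain host C++. In Kokkos code the call operators would additionally need `KOKKOS_INLINE_FUNCTION`, and whether this fits the actual factory is untested.

```cpp
#include <cassert>
#include <cstddef>

// Empty chain: terminates the recursion and does nothing.
template <class... Fs>
struct Chain {
  void operator()(std::size_t, char*) const {}
};

// Non-empty chain: holds the first functor by value plus the remaining chain.
template <class F, class... Rest>
struct Chain<F, Rest...> {
  F first;
  Chain<Rest...> rest;
  void operator()(std::size_t row, char* results) const {
    first(row, results);  // apply this criterion
    rest(row, results);   // then the remaining ones, in order
  }
};

inline Chain<> make_chain() { return Chain<>{}; }

template <class F, class... Rest>
Chain<F, Rest...> make_chain(F f, Rest... rest) {
  return Chain<F, Rest...>{f, make_chain(rest...)};
}

// Two toy functors, used only to show that the call order is preserved.
struct SetTo {
  char value;
  void operator()(std::size_t row, char* results) const { results[row] = value; }
};

struct AddOne {
  void operator()(std::size_t row, char* results) const { results[row] += 1; }
};
```

Because the chain only uses member access and ordinary function calls, it never calls into `<tuple>`. The trade-off is that the functors can no longer be reached by index, only applied in sequence.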