
Eigen3.3 support #225

Merged: 12 commits into master from eigen3.3 on Jun 2, 2017

Conversation

costashatz
Member

As discussed in #201, we should add support for Eigen3.3 (I also add LAPACK support if it is there). I have the following observations:

  • We cannot use the -mavx -mfma flags yet: they cause some compilation errors (minor things, we can fix them) and libcmaes crashes. We really need to solve this issue.
  • The performance of the Cholesky decomposition is not really better unless you use big matrices (i.e., over 2000 samples).
  • Let's wait to see whether the tests pass and check the results of the benchmarking (I am running it in another branch).
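For context, the flags in question are plain compiler options; a minimal sketch of how they might be passed when building an Eigen-based program by hand (the file name is hypothetical, and limbo's own build system sets its flags differently):

```shell
# Illustrative only: enable AVX and FMA instruction sets so that
# Eigen 3.3 can emit AVX/FMA kernels instead of falling back to
# SSE (or scalar) code paths.
g++ -O3 -mavx -mfma my_experiment.cpp -o my_experiment
```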

@costashatz
Member Author

So overall I have the following conclusions:

  • The -mavx -mfma flags with Eigen3.3 really improve the performance (close to 30%).
  • LAPACKE/BLAS, when combined, improve the performance of big matrix computations (more than ~1000 dimensions).
  • LAPACKE/BLAS hurt the performance for small matrices (less than ~800 dimensions).
  • We are losing performance because we are not vectorizing the kernel functionalities (see Restructure kernels to allow vectorization of kernel functions #228), but I will handle this in a separate PR.

Therefore, I fixed the flags in libcmaes and opened a pull request upstream (it seems they will accept it; until they do, you can use the following branch), so now we can use the proper flags. For LAPACKE/BLAS, on the other hand, I just added an option to enable or disable it, so depending on the application we may or may not want to use it. It seems weird to me that LAPACKE/BLAS performs badly on low-dimensional matrices, but either it is true (algorithms that work well on complicated examples often do not work as well on simpler ones) or there is a small bug in Eigen (this functionality was only added recently).
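For readers unfamiliar with the mechanism: Eigen 3.3's documented way of delegating to external BLAS/LAPACKE is a pair of preprocessor defines plus link flags; a configuration sketch (the option added in this PR presumably toggles something equivalent):

```cpp
// Configuration sketch: with these macros defined before any Eigen
// header is included, Eigen 3.3 routes matrix products through an
// external BLAS and dense decompositions (e.g. Cholesky) through
// LAPACKE. The binary must then be linked against implementations,
// e.g. with -lblas -llapacke.
#define EIGEN_USE_BLAS
#define EIGEN_USE_LAPACKE
#include <Eigen/Dense>
```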

I am waiting for the tests to pass, and I will make sure the docs do not need updating anywhere. Let me know what you think.

@jbmouret
Member

jbmouret commented Jun 2, 2017

A 30% performance boost is great! I agree that LAPACKE/BLAS should only be an option, but doesn't Eigen3 automatically switch it off for small matrices?

@costashatz
Member Author

A 30% performance boost is great!

It depends on the size of the matrix, but it is still very good.

doesn't Eigen3 automatically switch it off for small matrices?

I quote Eigen's docs:

These substitutions apply only for Dynamic or large enough objects

We always have dynamic vectors, so no, Eigen3 does not automatically switch it off; I verified that empirically.

I fixed the flags in libcmaes and opened a pull request upstream (it seems they will accept it; until they do, you can use the following branch), so now we can use the proper flags

They accepted the PR, so we can use the original repo again (I will commit the change shortly).

@costashatz
Member Author

So to recap, this PR adds/introduces:

  • Eigen3.3 support with the nice new flags
  • Option to enable LAPACKE/BLAS
  • A small API change for the EvaluationFunction: in short, instead of defining dim_in and dim_out as static constexpr members, we should use the BO_PARAM macros:
struct EvalFunc {
    BO_PARAM(size_t, dim_in, 10);
    BO_PARAM(size_t, dim_out, 1);
};

Everyone ok to merge?

@jbmouret
Member

jbmouret commented Jun 2, 2017 via email

@Aneoshun
Member

Aneoshun commented Jun 2, 2017

OK for me

@costashatz costashatz merged commit 43854c5 into master Jun 2, 2017
@costashatz costashatz deleted the eigen3.3 branch June 2, 2017 09:36