
Premature end of optimization #318

Open · kirilllzaitsev opened this issue Aug 29, 2022 · 5 comments

kirilllzaitsev commented Aug 29, 2022

Hi, my question is about the samples.dat file that is produced using limbo::stat::Samples:

#iteration sample
-1 0.178732
-1 0.0228679
-1 0.453411
-1 0.545145
-1 0.550571
-1 0.910424
-1 0.580422
-1 0.259014
-1 0.10441
-1 0.952085
-1 0.47953
-1 0.955465
0 0.5
1 0.5
2 0.5
3 0.5

The best observation comes from the first sample in the series of 12 random ones, 0.178732. However, this result is suboptimal: the ground truth is 0.2. For some reason, the optimization yields dummy 0.5 values instead of searching for the actual solution. I define the Params struct as follows:

struct Params
{
    struct bayes_opt_boptimizer : public defaults::bayes_opt_boptimizer
    {
        BO_PARAM(int, hp_period, -1);
    };

    struct bayes_opt_bobase : public defaults::bayes_opt_bobase
    {
        BO_PARAM(int, stats_enabled, true);
        BO_PARAM(bool, bounded, true);
    };

    struct stop_maxiterations
    {
        BO_DYN_PARAM(int, iterations);
    };

    struct acqui_ei : public defaults::acqui_ei
    {
        BO_PARAM(double, jitter, 0.05);
    };

    struct init_randomsampling
    {
        BO_PARAM(int, samples, 12);
    };

    struct kernel : public defaults::kernel
    {
        BO_DYN_PARAM(double, noise);
    };

    struct kernel_maternfivehalves : public defaults::kernel_maternfivehalves
    {
    };

    struct opt_rprop : public defaults::opt_rprop
    {
    };
    struct opt_nloptnograd : public defaults::opt_nloptnograd
    {
    };
};

Thank you.

costashatz (Member) commented

@kirilllzaitsev can you share the code with us?

The samples.dat file will not give the best observation per iteration, but the one returned by your objective function. I have a feeling that you have defined the objective function incorrectly.

kirilllzaitsev (Author) commented

The objective function correctly prefers 0.2 over 0.178 when evaluated separately, and the scores from 0.178 to 0.2 are non-decreasing. The question is why the optimizer produces 0.5 values that give a mediocre score, instead of something in the vicinity of 0.178732. This may be something on the NLopt side.

costashatz (Member) commented

Without code, it is difficult to say what happens. I am talking about a programming/C++ error, not about the semantics of what your objective function does. If you give me more, such as the code or at least a glimpse of how you define the objective function structure, I can help more.

kirilllzaitsev (Author) commented

@costashatz, I attach the code and the contents of observations.dat and samples.dat from a different run that shows a similar problem.

struct Params
{
    struct bayes_opt_boptimizer : public defaults::bayes_opt_boptimizer
    {
        BO_PARAM(int, hp_period, -1);
    };

    struct bayes_opt_bobase : public defaults::bayes_opt_bobase
    {
        BO_PARAM(int, stats_enabled, true);
        BO_PARAM(bool, bounded, true);
    };

    struct stop_maxiterations
    {
        BO_PARAM(int, iterations, 5);
    };

    struct acqui_ei : public defaults::acqui_ei
    {
        BO_PARAM(double, jitter, 0.01);
    };

    struct init_randomsampling
    {
        BO_PARAM(int, samples, 15);
    };

    struct kernel : public defaults::kernel
    {
        BO_DYN_PARAM(double, noise);
    };

    struct kernel_maternfivehalves : public defaults::kernel_maternfivehalves
    {
    };

    struct opt_rprop : public defaults::opt_rprop
    {
    };
    struct opt_nloptnograd : public defaults::opt_nloptnograd
    {
    };
};

double my_eval_func(double b, C c)
{
    double score = c.calcScore(b);
    return score;
}

template <typename Params>
struct eval_func
{
    BO_PARAM(size_t, dim_in, 1);
    BO_PARAM(size_t, dim_out, 1);

    C c;

    explicit eval_func(C c)
    {
        this->c = c;
    }

    Eigen::VectorXd operator()(const Eigen::VectorXd &x) const
    {
        double b = x(0);
        double score = my_eval_func(b, c);
        return tools::make_vector(score);
    }
};

void func(C c)
{
    using kernel_t = kernel::MaternFiveHalves<Params>;
    using gp_t = model::GP<Params, kernel_t>;
    using acqui_t = acqui::EI<Params, gp_t>;
    using acqui_opt_t = opt::NLOptNoGrad<Params>;
    using init_t = init::RandomSampling<Params>;

    using stat_t = boost::fusion::vector<limbo::stat::ConsoleSummary<Params>,
                                            limbo::stat::AggregatedObservations<Params>,
                                            limbo::stat::BestAggregatedObservations<Params>,
                                            limbo::stat::BestObservations<Params>,
                                            limbo::stat::Samples<Params>,
                                            limbo::stat::BestSamples<Params>,
                                            limbo::stat::GPLikelihood<Params>,
                                            limbo::stat::GPMeanHParams<Params>,
                                            limbo::stat::GPPredictionDifferences<Params>,
                                            limbo::stat::Observations<Params>,
                                            limbo::stat::GPKernelHParams<Params>>;

    bayes_opt::BOptimizer<Params, modelfun<gp_t>, acquifun<acqui_t>,
                        acquiopt<acqui_opt_t>, initfun<init_t>, statsfun<stat_t>>
        boptimizer;
    boptimizer.optimize(eval_func<Params>(c));
}
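
For completeness, the dynamic parameters declared with BO_DYN_PARAM are defined and set elsewhere in my code, roughly like this (the noise value here is illustrative):

BO_DECLARE_DYN_PARAM(double, Params::kernel, noise);

// ... before calling func():
Params::kernel::set_noise(1e-10);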

Concatenated observations and samples (the first 15 come from random sampling):

#	observation	sample
0	25251.1	0.395505
1	20854.8	0.914420
2	12992.6	0.025444
3	20414.1	0.957259
4	13426.5	0.031459
5	25831.6	0.730435
6	26717.7	0.710624
7	33853.2	0.579910
8	22018.8	0.077399
9	22336.2	0.094469
10	22734.2	0.820494
11	24977.3	0.245046
12	24506.2	0.156391
13	25197.6	0.385747
14	34350.9	0.601550
--- end of random sampling
15	30278.0	0.542875
16	30227.3	0.542241
17	30192.9	0.541807
18	30168.0	0.541493
19	30149.3	0.541255
20	30134.7	0.541070

This random sample gives the largest observation:

14	34350.9	0.601550

But the optimization gets stuck around 0.54, moving in the wrong search direction.

costashatz (Member) commented

I am pretty confident that the issue comes from the fact that your objective values are too big (on the order of 10^4), while the default parameters for the kernel are sigma=1 and l=1. This basically removes the exploration part of the algorithm, and it fails to explore the space.
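
For reference, the defaults I am referring to look roughly like this (paraphrased from limbo's defaults; the exact values may differ between versions):

struct kernel_maternfivehalves
{
    BO_PARAM(double, sigma_sq, 1); // signal variance of the kernel
    BO_PARAM(double, l, 1);        // length scale
};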

I think if you scale your objective function into the range [-1, 1], or at least to something on the scale of 10^1, the algorithm should work better. The other solution would be to change the default initial parameters for the GP kernel (e.g., sigma_sq and l for the Matern kernel). Of the two solutions, I would vote for the first one: scale your objective function. Speaking of scaling, note that the line BO_PARAM(bool, bounded, true); tells the optimizer to search only in [0, 1] in the parameter space (so all the xs that the algorithm tries will be in [0, 1]).
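
A minimal sketch of the first option, replacing the operator() in your eval_func (the 3.5e4 divisor is just an illustrative bound taken from the data you posted, not a library constant):

Eigen::VectorXd operator()(const Eigen::VectorXd &x) const
{
    double score = my_eval_func(x(0), c);
    // Rescale the raw score (order 10^4) to roughly [0, 1] so that it
    // matches the default kernel scale (sigma_sq = 1).
    return tools::make_vector(score / 3.5e4);
}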

Hope this helps :)
