Investigate if ES Storage is passed a proper isArchive flag #6065

yurishkuro · 2024-10-06T20:41:28Z

mahadzaryab1 · 2024-10-17T02:34:01Z

@yurishkuro dug into this a little bit while redoing the configurations and here's what I found

The factory creates a primary and archive namespace for each provided configuration:

Lines 88 to 95 in f411b3c

	defaultConfig := DefaultConfig()
	cfg.ApplyDefaults(&defaultConfig)
	archive := make(map[string]*namespaceConfig)
	archive[archiveNamespace] = &namespaceConfig{
		Configuration: cfg,
		namespace:     archiveNamespace,
	}

CreateArchiveSpanReader and CreateArchiveSpanWriter are called by InitArchiveStorage in v1's query service (https://github.com/jaegertracing/jaeger/blob/main/cmd/query/app/querysvc/query_service.go#L130)
The InitArchiveStorage method is called by the v2 jaeger query extension (https://github.com/jaegertracing/jaeger/blob/main/cmd/jaeger/internal/extension/jaegerquery/server.go#L137)

So I believe that this is working as expected. Let me know if there's anything I'm missing here.

One other thing to note is that in v1, the additional namespaces need to be explicitly enabled (https://github.com/jaegertracing/jaeger/blob/main/plugin/storage/es/options.go#L110) but the archive namespace is enabled by default in v2 (https://github.com/jaegertracing/jaeger/blob/main/plugin/storage/es/options.go#L437).

yurishkuro · 2024-10-17T14:16:12Z

@mahadzaryab1 In v1 the workflow is:

the binary creates a Factory
asks factory to initialize itself from CLI flags
creates main storage implementations via CreateSpanWriter
optionally calls CreateArchiveSpanWriter

The key observation here is that the distinction between primary and archive is handled inside the Factory because the caller clearly indicates the intent by calling InitArchiveStorage and CreateArchiveSpanWriter. In other words, the caller delegates the specific initialization required for archive storage to the Factory - a single Factory that manages both primary and archive.
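A minimal sketch of the v1 pattern described above. All types here are simplified, hypothetical stand-ins and not Jaeger's actual interfaces: a single factory exposes the main API plus an optional archive sub-API, and the caller signals archive intent by asking for that sub-API.

```go
package main

import "fmt"

// SpanWriter is a simplified stand-in for a span writer.
type SpanWriter interface {
	Target() string
}

type namedWriter struct{ target string }

func (w namedWriter) Target() string { return w.target }

// Factory is the main API that every storage backend implements.
type Factory interface {
	CreateSpanWriter() (SpanWriter, error)
}

// ArchiveFactory is the optional archive sub-API (hypothetical shape).
type ArchiveFactory interface {
	CreateArchiveSpanWriter() (SpanWriter, error)
}

// dualFactory handles both primary and archive storage inside one object.
type dualFactory struct{}

func (dualFactory) CreateSpanWriter() (SpanWriter, error) {
	return namedWriter{target: "primary"}, nil
}

func (dualFactory) CreateArchiveSpanWriter() (SpanWriter, error) {
	return namedWriter{target: "archive"}, nil
}

// primaryOnlyFactory does not implement the archive sub-API.
type primaryOnlyFactory struct{}

func (primaryOnlyFactory) CreateSpanWriter() (SpanWriter, error) {
	return namedWriter{target: "primary"}, nil
}

// initArchive plays the role of the caller: it asks for the archive
// sub-API and reports false when the factory does not provide it.
func initArchive(f Factory) (SpanWriter, bool) {
	af, ok := f.(ArchiveFactory)
	if !ok {
		return nil, false
	}
	w, err := af.CreateArchiveSpanWriter()
	if err != nil {
		return nil, false
	}
	return w, true
}

func main() {
	for _, f := range []Factory{dualFactory{}, primaryOnlyFactory{}} {
		if w, ok := initArchive(f); ok {
			fmt.Printf("%T: archive writer targets %s\n", f, w.Target())
		} else {
			fmt.Printf("%T: archive storage not initialized\n", f)
		}
	}
}
```

In this shape a backend that never implements the sub-API simply cannot serve as archive storage, regardless of how it is configured.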

In contrast, in v2 the workflow is supposed to be different:

the user designates a configuration as primary or archive, but the configuration otherwise is the same and each configuration is represented by a unique Factory object
when initializing query service the v2 extension is supposed to call CreateSpanWriter on the primary factory, and also call CreateSpanWriter on the archive factory (calling the same API)
instead, v2 extension still falls back onto calling two different APIs due to invoking InitArchiveStorage

So v2 is serendipitously working right now, but not as expected. We can see it in how Memory storage is handled - when I configured all-in-one with primary and archive storage, originally it did not work for me because Memory storage factory never implemented the Archive sub-API, so I had to add it (and the side effect of it is that archive for memory cannot be turned off now). But that was exactly the opposite of what I wanted to happen - I wanted the config to be able to configure two storages and designate each as primary / archive, so that the v2 extension would only interact with the factories via CreateSpanWriter.
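The intended v2 flow could be sketched roughly as follows. This is an illustration under assumptions, not the extension's actual code: `storageConfig`, `createWriters`, and the `memoryFactory` type are hypothetical, and the `TracesPrimary` field name is assumed by analogy with `TracesArchive`. The point is that both roles go through the same `CreateSpanWriter` call, so a backend needs no archive sub-API to be used as archive storage.

```go
package main

import "fmt"

// SpanWriter and Factory are simplified stand-ins for the storage API.
type SpanWriter interface {
	Describe() string
}

type Factory interface {
	CreateSpanWriter() (SpanWriter, error)
}

type memoryWriter struct{ storageName string }

func (w memoryWriter) Describe() string { return "memory/" + w.storageName }

// memoryFactory implements only the main API; it has no archive sub-API.
type memoryFactory struct{ name string }

func (f memoryFactory) CreateSpanWriter() (SpanWriter, error) {
	return memoryWriter{storageName: f.name}, nil
}

// storageConfig designates which named storage plays which role.
type storageConfig struct {
	TracesPrimary string
	TracesArchive string // empty means archive storage is turned off
}

// createWriters calls the same API on the primary and archive factories.
func createWriters(cfg storageConfig, factories map[string]Factory) (SpanWriter, SpanWriter, error) {
	pf, ok := factories[cfg.TracesPrimary]
	if !ok {
		return nil, nil, fmt.Errorf("cannot find primary storage factory %q", cfg.TracesPrimary)
	}
	primary, err := pf.CreateSpanWriter()
	if err != nil {
		return nil, nil, fmt.Errorf("cannot create primary span writer: %w", err)
	}
	if cfg.TracesArchive == "" {
		return primary, nil, nil
	}
	af, ok := factories[cfg.TracesArchive]
	if !ok {
		return nil, nil, fmt.Errorf("cannot find archive storage factory %q", cfg.TracesArchive)
	}
	archive, err := af.CreateSpanWriter()
	if err != nil {
		return nil, nil, fmt.Errorf("cannot create archive span writer: %w", err)
	}
	return primary, archive, nil
}

func main() {
	factories := map[string]Factory{
		"hot":  memoryFactory{name: "hot"},
		"cold": memoryFactory{name: "cold"},
	}
	cfg := storageConfig{TracesPrimary: "hot", TracesArchive: "cold"}
	primary, archive, err := createWriters(cfg, factories)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("primary:", primary.Describe())
	fmt.Println("archive:", archive.Describe())
}
```

Leaving `TracesArchive` empty is how this sketch turns archive storage off, which the sub-API approach cannot express for memory storage.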

mahadzaryab1 · 2024-10-19T02:25:06Z

@yurishkuro Thanks so much for the helpful context. Do you have any thoughts on how we should proceed here? Do we want to make changes to the query extension to not use InitArchiveStorage?

yurishkuro · 2024-10-19T16:51:04Z

Yes, that would be ideal. It may need refactoring of the query service, e.g. perhaps it should take a separate archive factory instead of using a different interface on the main factory.

mahadzaryab1 · 2024-10-19T21:47:50Z

@yurishkuro just looking for a bit of clarification

so today, we're doing the following:

	f, err := jaegerstorage.GetStorageFactory(s.config.Storage.TracesArchive, host)
	if err != nil {
		return fmt.Errorf("cannot find archive storage factory: %w", err)
	}

	if !opts.InitArchiveStorage(f, s.telset.Logger) {
		s.telset.Logger.Info("Archive storage not initialized")
	}

Is the goal to be able to just do the following? Why do we need to introduce a new archive factory?

	f, err := jaegerstorage.GetStorageFactory(s.config.Storage.TracesArchive, host)
	if err != nil {
		return fmt.Errorf("cannot find archive storage factory: %w", err)
	}

	spanReader, err := f.CreateSpanReader()
	if err != nil {
		return fmt.Errorf("cannot create span reader: %w", err)
	}
	opts.ArchiveSpanReader = spanReader

yurishkuro · 2024-10-19T23:33:37Z

yes

mahadzaryab1 · 2024-10-19T23:54:09Z

@yurishkuro do we still need a separate interface to do the above? One other question I had was, will simply calling CreateSpanReader behave the same way as CreateArchiveSpanReader does? How would there be a differentiation between normal storage logic and archive storage logic?

yurishkuro · 2024-10-20T17:18:18Z

That is a good question, but if you look at all the implementations of GetArchiveSomething, they are no different from the regular methods. I think the only difference is in the ES implementation, which needs the isArchive flag. We can expose that flag as part of the ES config. That is not ideal because the user has to remember to set it, otherwise they might get a limited lookback. But I think in ES the archive storage requires separate settings anyway, e.g. you may not want to roll over the index every day.
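To make the "limited lookback" concern concrete, here is a small sketch of how a config-level flag might influence the read window. The `Config` struct, the `IsArchive` field, the `archiveLookback` constant, and `effectiveLookback` are all hypothetical names chosen for illustration; the assumption is that archive reads should search a much longer window than primary reads.

```go
package main

import (
	"fmt"
	"time"
)

// Config is a hypothetical, trimmed-down ES storage config.
type Config struct {
	MaxSpanAge time.Duration // lookback window used for primary storage
	IsArchive  bool          // hypothetical flag the user must remember to set
}

// archiveLookback is an assumed "effectively unlimited" window (about 50 years).
const archiveLookback = 50 * 365 * 24 * time.Hour

// effectiveLookback returns how far back a reader should search.
// If IsArchive is left unset on an archive storage, reads are limited
// to MaxSpanAge and older archived traces are not found.
func effectiveLookback(cfg Config) time.Duration {
	if cfg.IsArchive {
		return archiveLookback
	}
	return cfg.MaxSpanAge
}

func main() {
	primary := Config{MaxSpanAge: 72 * time.Hour}
	archive := Config{MaxSpanAge: 72 * time.Hour, IsArchive: true}
	fmt.Println("primary lookback:", effectiveLookback(primary))
	fmt.Println("archive lookback:", effectiveLookback(archive))
}
```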

mahadzaryab1 · 2024-10-20T17:52:08Z

@yurishkuro How would it work if we add the is_archive flag to the ES config? We're currently storing separate configs in the ES factory as follows so we have no way of knowing if CreateSpanSomething is being called for the regular config or the archive config.

	primaryConfig *config.Configuration
	archiveConfig *config.Configuration

mahadzaryab1 · 2024-10-20T18:02:33Z

Do we want to refactor the ES factory to only hold one config/client and then propagate the isArchive flag to it?

yurishkuro · 2024-10-20T18:48:12Z

In v2 we have two independent factories. I'd say yes, we want to refactor the factories to represent just one kind of storage. It will require changes to query service. I think only Cassandra and ES factories are storing two different namespaces.
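One possible shape for such a refactoring, as a sketch only: each factory holds exactly one configuration, and the primary/archive distinction lives in that configuration rather than in a second namespace inside the factory. `Configuration`, its fields, `NewFactory`, and `SpanReader` are hypothetical stand-ins, not the real ES types.

```go
package main

import "fmt"

// Configuration is a hypothetical stand-in for the ES configuration type.
type Configuration struct {
	IndexPrefix string
	IsArchive   bool
}

// Factory represents just one kind of storage: it holds a single config
// instead of separate primaryConfig and archiveConfig fields.
type Factory struct {
	cfg Configuration
}

func NewFactory(cfg Configuration) *Factory {
	return &Factory{cfg: cfg}
}

// SpanReader is a simplified reader that remembers how it was configured.
type SpanReader struct {
	indexPrefix string
	archive     bool
}

// CreateSpanReader is the only reader constructor; the archive behavior
// comes from the factory's own config, not from a separate method.
func (f *Factory) CreateSpanReader() (*SpanReader, error) {
	return &SpanReader{indexPrefix: f.cfg.IndexPrefix, archive: f.cfg.IsArchive}, nil
}

func main() {
	// Two independent factories, one per designated storage.
	primary := NewFactory(Configuration{IndexPrefix: "jaeger"})
	archive := NewFactory(Configuration{IndexPrefix: "jaeger-archive", IsArchive: true})

	for _, f := range []*Factory{primary, archive} {
		r, err := f.CreateSpanReader()
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("reader prefix=%s archive=%v\n", r.indexPrefix, r.archive)
	}
}
```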

mahadzaryab1 · 2024-10-20T18:53:28Z

@yurishkuro Got it! I'll get started on that. Thanks!

yurishkuro added the "help wanted" label on Oct 6, 2024

yurishkuro mentioned this issue on Oct 6, 2024 in: [jaeger-v2] Refactor ElasticSearch/OpenSearch Storage Configurations #6060 (Merged)

dosubot bot added the area/storage and storage/elasticsearch labels on Oct 6, 2024
