Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IQSS/10379 Fix Metrics API Bugs #10865

Merged
merged 4 commits into from
Oct 24, 2024

Conversation

qqmyers
Copy link
Member

@qqmyers qqmyers commented Sep 19, 2024

What this PR does / why we need it: This PR addresses a bug reported in #10379 where a cached value did not include the choice of whether to report local, remote (harvested), or all datasets for the /datasets and /datasets/monthly endpoints, resulting in incorrect values when different values were used and a cached value was available. The PR uses the existing mechanism to cache the results with the local/remote/all dataLocation choices to avoid this.

The PR also addresses the issue raised in #10710 (comment) where the old queries to find the latest relevant version were broken if/when the minor version was greater than 9. The new queries rely on ordering the results by both the major and minor version numbers and using DISTINCT ON to pick the first entry (the one with the highest major version, and among those, the highest minor version.

Which issue(s) this PR closes:

Closes #10379

Special notes for your reviewer: I did some checking of the results from the old/new queries in sql: e.g. in simplest form, comparing
select DISTINCT ON (datasetversion.dataset_id) datasetversion.id from datasetversion join dataset on dataset.id = datasetversion.dataset_id where versionstate='RELEASED' order by datasetversion.dataset_id, datasetversion.versionnumber desc, datasetversion.minorversionnumber desc;

with
select datasetversion.dataset_id || ':' || max(datasetversion.versionnumber || datasetversion.minorversionnumber) from datasetversion join dataset on dataset.id = datasetversion.dataset_id where versionstate='RELEASED' group by dataset_id;

and they appear to give the same results. In going through the queries for different metrics, I realized that the string value being returned was never used, so I've simplified the selects to just return just the dataset id (when all we're doing is count(*)) or the datasetversion id (when we're matching against the version to, for example, find filemetadatas in that version).

Suggestions on how to test this:
Compare the results of any/all of the metrics APIs with the changed code. A potential shortcut that covers most endpoints would be to install/run the https://github.com/gdcc/dv-metrics app and view it's webpage and then compare the graphs produced before and after the PR. Whether testing manually or via the app, you need to truncate metric; to remove the cached values or call the /clearMetricsCache api call.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?: included.

Additional documentation:

@qqmyers qqmyers added the Size: 10 A percentage of a sprint. 7 hours. label Sep 19, 2024
@coveralls
Copy link

coveralls commented Sep 19, 2024

Coverage Status

coverage: 20.868% (-0.001%) from 20.869%
when pulling d383802 on QualitativeDataRepository:IQSS/10379
into a0cb73d on IQSS:develop.

@cmbz cmbz added the FY25 Sprint 8 FY25 Sprint 8 (2024-10-09 - 2024-10-23) label Oct 9, 2024
@pdurbin pdurbin added the Type: Bug a defect label Oct 9, 2024
@cmbz cmbz added the FY25 Sprint 9 FY25 Sprint 9 (2024-10-23 - 2024-11-06) label Oct 23, 2024
@ofahimIQSS ofahimIQSS self-assigned this Oct 24, 2024
@ofahimIQSS
Copy link
Contributor

Tested in internal environment. Fix looks good.

Testing of 10865.docx

@ofahimIQSS ofahimIQSS merged commit aebf31d into IQSS:develop Oct 24, 2024
11 checks passed
@ofahimIQSS ofahimIQSS removed their assignment Oct 24, 2024
@pdurbin pdurbin added this to the 6.5 milestone Oct 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FY25 Sprint 6 FY25 Sprint 6 FY25 Sprint 7 FY25 Sprint 7 (2024-09-25 - 2024-10-09) FY25 Sprint 8 FY25 Sprint 8 (2024-10-09 - 2024-10-23) FY25 Sprint 9 FY25 Sprint 9 (2024-10-23 - 2024-11-06) Size: 10 A percentage of a sprint. 7 hours. Type: Bug a defect
Projects
Status: Merged 🚀
Development

Successfully merging this pull request may close these issues.

Metrics caching doesn't handle dataLocation parameter
6 participants