Update queries #163

stuartmcalpine · 2024-10-29T19:40:14Z

Add more options for querying

There is now a ~= query operator that uses the .ilike filter to allow case-insensitive filtering with wildcards (i.e., the % character).
dregs ls can now filter on the dataset name, including % wildcards, using the --name option.
dregs ls can return arbitrary columns using the --return_cols option.

JoanneBogart

I left one inline comment about mixing case-insensitivity with wildcard parsing.
A second issue has to do with escaping wildcards. Postgres treats both % and _ as potential wildcards. I believe sqlalchemy just passes the string as-is to Postgres. One can specify an escape character (by default \). I would say we should go with the default escape character - it's already something we exclude from names. Then before using .like or .ilike any underscore characters should be escaped (otherwise an underscore matches any single character; I think we can do without that capability). Maybe we also need to look for backslashes and, if found, escape them as well, but it's unlikely to come up except possibly in a field like description.
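To illustrate the underscore point, here is a minimal sketch using Python's built-in sqlite3 module rather than Postgres or sqlalchemy. The table name, the sample names and the helper names_like are made up for the example. SQLite has no default escape character, so the ESCAPE clause is spelled out; in Postgres backslash is already the default.

```python
import sqlite3

# Throwaway in-memory table with hypothetical dataset names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (name TEXT)")
conn.executemany(
    "INSERT INTO dataset (name) VALUES (?)",
    [("run_1",), ("runA1",), ("run%1",)],
)


def names_like(pattern):
    """Return the sorted names matching a LIKE pattern, with backslash as escape."""
    rows = conn.execute(
        "SELECT name FROM dataset WHERE name LIKE ? ESCAPE '\\'", (pattern,)
    )
    return sorted(row[0] for row in rows)


# Unescaped: the underscore matches any single character.
unescaped = names_like("run_1")    # ['run%1', 'runA1', 'run_1']

# Escaped: the underscore only matches a literal underscore.
escaped = names_like("run\\_1")    # ['run_1']
```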
I'm also wondering whether we should use * as the wildcard character rather than %. One advantage is it's already excluded from names (but not from all string fields). The other is that people are used to it as a wildcard character. If we use it, the procedure would be

1. escape all _ and %
2. replace * with %
3. invoke .like or .ilike as appropriate
There still is an issue with either * or % existing in the string when it's not intended to be a matching character. Maybe we just have to disallow "like" comparisons for all but a carefully-selected set of string columns.
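The escape-then-replace procedure proposed in this comment could be sketched roughly as follows. The function name to_like_pattern and its signature are hypothetical, not taken from the dataregistry code. It also escapes the backslash itself, which the comment raises only as a possibility.

```python
def to_like_pattern(user_pattern, escape="\\"):
    """Translate a user pattern that uses * as wildcard into a LIKE pattern.

    Hypothetical helper: escapes the escape character first, then the two
    native LIKE wildcards, and finally maps * onto %.
    """
    # Escape the escape character first so later escapes are not doubled.
    escaped = user_pattern.replace(escape, escape + escape)
    # Neutralise the native wildcards so they match literally.
    for ch in ("%", "_"):
        escaped = escaped.replace(ch, escape + ch)
    # Only * is exposed to the user as a wildcard.
    return escaped.replace("*", "%")


pattern = to_like_pattern("my_data*")   # 'my\\_data%' as a Python literal
```

The result would then be passed to .like or .ilike together with the same escape character. A literal * in the stored value still cannot be matched this way, which is the remaining issue noted above.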

JoanneBogart · 2024-10-31T16:45:48Z

src/dataregistry/query.py

-        return stmt.where(column_ref[0].__getattribute__(the_op)(value))
+        # Special case where we are partially matching with a wildcard
+        if f[1] == "~=":
+            return stmt.where(column_ref[0].ilike(value))


I don't think we can assume that wildcard searches should also be case-insensitive. Unfortunately this probably means we need two new operators: 1. wildcard+case-insensitive (current definition of ~=), using sqlalchemy .ilike and 2. wildcard+case-sensitive, using sqlalchemy .like. If someone just wants case-insensitivity without wildcard searching they could use 1.
Another issue with using either .like or .ilike is escaping the special characters used in pattern-matching. I'll say more about that in a separate comment.
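As a rough model of the two-operator semantics, the sketch below emulates the matching in pure Python with the re module instead of going through sqlalchemy. The operator spellings ~= and ~== follow the ones used in this thread, and * is used as the wildcard as suggested in the other review comment; the table WILDCARD_OPS and the function wildcard_match are hypothetical names for illustration only.

```python
import re

# Hypothetical table: operator -> name of the sqlalchemy column method
# that the review suggests using for it.
WILDCARD_OPS = {
    "~=": "ilike",   # wildcard + case-insensitive
    "~==": "like",   # wildcard + case-sensitive
}


def wildcard_match(op, pattern, value):
    """Emulate the intended match: * is the only wildcard, the rest is literal."""
    ignore_case = WILDCARD_OPS[op] == "ilike"
    # Escape every literal piece, then join the pieces with "match anything".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.fullmatch(regex, value, flags) is not None


insensitive = wildcard_match("~=", "DESC*", "desc_y1")    # True
sensitive = wildcard_match("~==", "DESC*", "desc_y1")     # False
```

With no * in the pattern, the first operator reduces to a plain case-insensitive comparison, which is the use mentioned at the end of the comment.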

stuartmcalpine · 2024-11-01T10:38:29Z

Added the ~== operator for case-sensitive wildcard searches
Changed the wildcard character from % to * (and _ is also escaped)

There still is an issue with either * or % existing in the string when it's not intended to be a matching character. Maybe we just have to disallow "like" comparisons for all but a carefully-selected set of string columns.

We can either leave it as is or limit the columns. There is currently no limit on the columns; I don't know how ilike behaves on non-string columns...

I'd imagine this will primarily be used on the name column.
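If the columns were limited, one possible shape is an allow-list checked before the filter is built. This is only a sketch: the column spelling "dataset.name", the set WILDCARD_COLUMNS and the function check_filter are assumptions made for the example, not existing dataregistry code.

```python
# Hypothetical allow-list of columns that accept the wildcard operators.
WILDCARD_COLUMNS = frozenset({"dataset.name"})
WILDCARD_OPS = frozenset({"~=", "~=="})


def check_filter(column, op):
    """Raise ValueError if a wildcard operator is used on a column not allowed."""
    if op in WILDCARD_OPS and column not in WILDCARD_COLUMNS:
        raise ValueError(
            f"Operator {op!r} is not supported on column {column!r}"
        )


check_filter("dataset.name", "~=")          # accepted
check_filter("dataset.description", "==")   # non-wildcard operators are unaffected
```

Rejecting early like this would also sidestep the open question of how ilike behaves on non-string columns.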

stuartmcalpine added 6 commits October 29, 2024 19:41

Add ~= operator to search using wildcards

ae72103

Update dregs ls to allow for wildcard querying on the dataset name

a0c047b

Update version to 1.0.4

bed26bc

Remove % from help string

3e03e68

Update docs

6638120

Add support for searching all owners

b81e915

stuartmcalpine requested a review from JoanneBogart October 29, 2024 19:57

Get rid of all from cli

2421509

JoanneBogart requested changes Oct 31, 2024


stuartmcalpine added 2 commits November 1, 2024 11:11

Add case-sensitive wildcard operator ~==

7627aa5

Change query wildcard to * over %

5012a1f

Fix bug

47ef94e

stuartmcalpine requested a review from JoanneBogart November 1, 2024 10:39

Fix query test for sqlite

a99b3ba
