Hi,

We have a few million vertices in our graph, plus a few composite indexes and Elasticsearch indexes on common properties shared by all vertices (though most queries are probably not using the indexes). Around 16 worker processes are constantly writing data (every second or so), and some clients are making queries.

We noticed that every once in a while some queries get really slow and time out; when we retry the same query, it works. We guessed that this has to do with entries being evicted from the cache, so we disabled the cache to see the performance without it. With the cache disabled, everything is extremely slow, even the simplest queries. Something like g.V(269840512).values('_entity_type').next() (we have a composite index on the _entity_type field) takes more than 5 seconds to complete (and sometimes 2 minutes) from a Python script.

When we run the same query (and also more complicated queries) from within the Gremlin console, it is quite fast. It is also fast when we stop all workers and clients and run just this query alone from a Python script.

Is there some connection pool limit that is limiting the performance? Is there something we can do to make this setup (with a single instance) work, or, given the amount of data, connections, reads and writes that we have, do we need to move to a multi-node cluster?

Thanks,
Hi Roi,
Have you checked the following references?
https://tinkerpop.apache.org/docs/current/reference/#_tuning
https://www.experoinc.com/post/janusgraph-nuts-and-bolts-part-1-write-performance
From what you describe, the Gremlin/JanusGraph server does indeed seem to be the limiting part. Caching in JanusGraph can save you the retrieval of vertices from BigTable, but that is a speed-up from around 10 ms to sub-ms levels, not the multi-second waiting times you describe.
Marc
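
One way to check whether the slowdown comes from contention on the server rather than from storage or the cache is to time the same query at increasing client concurrency. The following is a minimal sketch using only the Python standard library. `measure_latencies`, `summarize` and `fake_query` are hypothetical names, and `fake_query` is a stand-in that only sleeps; in practice it would be replaced by a function that runs the query from the question, g.V(269840512).values('_entity_type').next(), against the real server.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def measure_latencies(run_query, concurrency, requests_per_worker=5):
    """Call run_query from `concurrency` threads; return all per-call latencies in seconds."""

    def worker(_worker_id):
        samples = []
        for _ in range(requests_per_worker):
            start = time.perf_counter()
            run_query()
            samples.append(time.perf_counter() - start)
        return samples

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        per_worker = list(pool.map(worker, range(concurrency)))
    # Flatten the per-thread lists into one list of latencies.
    return [sample for samples in per_worker for sample in samples]


def summarize(latencies):
    """Return count, median, nearest-rank 95th percentile and max of the latencies."""
    ordered = sorted(latencies)
    p95_index = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


def fake_query():
    # Hypothetical stand-in for a real query call; it only sleeps for 10 ms.
    time.sleep(0.01)


if __name__ == "__main__":
    # Compare latency at 1, 4 and 16 concurrent callers (16 matches the worker count in the question).
    for concurrency in (1, 4, 16):
        stats = summarize(measure_latencies(fake_query, concurrency))
        print(concurrency, stats)
```

If the median stays flat while the p95 and max grow sharply with concurrency, that would point toward queuing in the server or the client connection pool rather than slow storage reads, which would be consistent with the query being fast when it runs alone.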