Prefer use of resourceUuid index over resourceType when selecting from resourceentity table
#2739
After switching from the resourceType multi-column index to resourceUuid, the order of results in include/revInclude is no longer predictable: the resourceUuids are randomly generated and are saved in the db as blobs, so the results are ordered by the byte representation of the resourceUuid.
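To illustrate why the order becomes unpredictable, here is a minimal sketch. It assumes the blob is the 16-byte big-endian encoding of the UUID (the actual storage converter may differ) and relies on the fact that SQLite compares BLOB values bytewise, as `memcmp()` does. `uuidToBytes` and `compareBlobs` are hypothetical helper names, not part of the FHIR Engine.

```kotlin
import java.nio.ByteBuffer
import java.util.UUID

/** Encodes a UUID as 16 big-endian bytes (assumed to resemble the stored blob). */
fun uuidToBytes(uuid: UUID): ByteArray =
  ByteBuffer.allocate(16)
    .putLong(uuid.mostSignificantBits)
    .putLong(uuid.leastSignificantBits)
    .array()

/** Compares two blobs like memcmp: unsigned, byte by byte; on a shared prefix the shorter one sorts first. */
fun compareBlobs(a: ByteArray, b: ByteArray): Int {
  for (i in 0 until minOf(a.size, b.size)) {
    val diff = (a[i].toInt() and 0xFF) - (b[i].toInt() and 0xFF)
    if (diff != 0) return diff
  }
  return a.size - b.size
}

/** Returns the given UUIDs in the order a bytewise blob index would yield them. */
fun blobOrder(uuids: List<UUID>): List<UUID> =
  uuids.sortedWith(Comparator { x, y -> compareBlobs(uuidToBytes(x), uuidToBytes(y)) })
```

Calling `blobOrder(List(3) { UUID.randomUUID() })` will generally return the UUIDs in an order unrelated to the order in which they were created, so callers that need a stable order for included resources would have to sort explicitly.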
can you pls add some stats from testing.
@aditya-07 please can you add comments.
engine/src/main/java/com/google/android/fhir/search/SearchDsl.kt
engine/src/androidTest/java/com/google/android/fhir/db/impl/DatabaseImplTest.kt
engine/src/main/java/com/google/android/fhir/search/MoreSearch.kt
Given

    fhirEngine.search<Task> {
      filter(Task.IDENTIFIER, { value = of("routine_screening") })
      filter(Task.STATUS, { value = of("ready") })
    }

Previously this would generate the query

    SELECT a.resourceUuid, a.serializedResource
    FROM ResourceEntity a
    WHERE a.resourceType = 'Task'
      AND a.resourceUuid IN (SELECT resourceUuid
                             FROM TokenIndexEntity
                             WHERE resourceType = 'Task'
                               AND index_name = 'identifier'
                               AND index_value = 'routine_screening')
      AND a.resourceUuid IN (SELECT resourceUuid
                             FROM TokenIndexEntity
                             WHERE resourceType = 'Task'
                               AND index_name = 'status'
                               AND index_value = 'ready')

with the query plan
Testing on db with 125206 resources, and 109654 Tasks, the query takes around
The changes in this PR would generate the query

    SELECT a.resourceUuid, a.serializedResource
    FROM ResourceEntity a
    WHERE a.resourceUuid IN (SELECT resourceUuid
                             FROM TokenIndexEntity
                             WHERE resourceType = 'Task'
                               AND index_name = 'identifier'
                               AND index_value = 'routine_screening')
      AND a.resourceUuid IN (SELECT resourceUuid
                             FROM TokenIndexEntity
                             WHERE resourceType = 'Task'
                               AND index_name = 'status'
                               AND index_value = 'ready');

resulting in the query plan
Testing on db with 125206 resources, and 109654 Tasks, the query takes around
Also note: the db returned 0 rows since it didn't have any Task with identifier
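The difference between the two generated queries can be sketched as a small query builder. This is not the FHIR Engine's actual implementation: `TokenFilter`, `SearchQuery`, and `buildSearchQuery` are hypothetical names. The sketch assumes the `a.resourceType` predicate is kept only when there are no index filters to narrow the result by resourceUuid, and it uses `?` bind placeholders rather than inlined literals.

```kotlin
/** Hypothetical token filter; not a FHIR Engine type. */
data class TokenFilter(val indexName: String, val indexValue: String)

/** SQL text plus its positional bind arguments. */
data class SearchQuery(val sql: String, val args: List<String>)

fun buildSearchQuery(resourceType: String, filters: List<TokenFilter>): SearchQuery {
  val conditions = mutableListOf<String>()
  val args = mutableListOf<String>()

  // Assumption: without any filter, resourceType is the only way to narrow the rows.
  if (filters.isEmpty()) {
    conditions += "a.resourceType = ?"
    args += resourceType
  }

  // Each filter narrows by resourceUuid, so the outer resourceType predicate is redundant:
  // the subquery already restricts the index rows to the requested resource type.
  for (filter in filters) {
    conditions += """
      a.resourceUuid IN (
        SELECT resourceUuid FROM TokenIndexEntity
        WHERE resourceType = ? AND index_name = ? AND index_value = ?
      )
    """.trimIndent()
    args += listOf(resourceType, filter.indexName, filter.indexValue)
  }

  val sql =
    "SELECT a.resourceUuid, a.serializedResource\nFROM ResourceEntity a\nWHERE " +
      conditions.joinToString("\nAND ")
  return SearchQuery(sql, args)
}
```

For the Task search above, `buildSearchQuery("Task", listOf(TokenFilter("identifier", "routine_screening"), TokenFilter("status", "ready")))` yields a statement with two `IN` subqueries and no outer `a.resourceType` predicate, which leaves the resourceUuid index as the only candidate for the outer lookup.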
For a database with 166293 resources and 137517 Encounters, the query

    SELECT a.resourceUuid, a.serializedResource
    FROM ResourceEntity a
    WHERE a.resourceType = 'Encounter'
      AND a.resourceUuid IN (SELECT resourceUuid
                             FROM TokenIndexEntity
                             WHERE resourceType = 'Encounter'
                               AND index_name = '_id'
                               AND ((index_value = 'cf5f865f-602b-4c8c-9ece-acd4761f6eca' OR
                                     index_value = '16fa13cb-a977-44ab-9db5-ccc62a38a414') OR
                                    (index_value = 'b228ff4d-ee37-4752-ba96-1d198adb9951' OR
                                     index_value = '89138401-c9b5-4b57-9871-edac67a6fbde')));

takes
while the query that omits the a.resourceType filter (so that the resourceUuid index is preferred)

    SELECT a.resourceUuid, a.serializedResource
    FROM ResourceEntity a
    WHERE a.resourceUuid IN (SELECT resourceUuid
                             FROM TokenIndexEntity
                             WHERE resourceType = 'Encounter'
                               AND index_name = '_id'
                               AND ((index_value = 'cf5f865f-602b-4c8c-9ece-acd4761f6eca' OR
                                     index_value = '16fa13cb-a977-44ab-9db5-ccc62a38a414') OR
                                    (index_value = 'b228ff4d-ee37-4752-ba96-1d198adb9951' OR
                                     index_value = '89138401-c9b5-4b57-9871-edac67a6fbde')));

takes
This change looks good from the performance metrics. But I am also surprised that the earlier query was taking 2-3 seconds. What are the real, user, and sys times, and what is the difference between them?
IMPORTANT: All PRs must be linked to an issue (except for extremely trivial and straightforward changes).
Fixes #[issue number]
Description
For queries that select from the `resourceentity` table and have filters on `resourceentity.resourceUuid`, prefer to use `resourceentity.resourceUuid` over `resourceentity.resourceType`. Both are indexed, but `resourceentity.resourceUuid` is unique and is faster.
Alternative(s) considered
Have you considered any alternatives? And if so, why have you chosen the approach in this PR?
Type
Choose one: (Bug fix | Feature | Documentation | Testing | Code health | Builds | Releases | Other)
Screenshots (if applicable)
Checklist
`./gradlew spotlessApply` and `./gradlew spotlessCheck` to check my code follows the style guide of this project.
`./gradlew check` and `./gradlew connectedCheck` to test my changes locally.