Perf: Operator Dashboard performance #535

Closed
bmtcril opened this issue Dec 4, 2023 · 4 comments
Labels
epic Large unit of work, consisting of multiple tasks

Comments

@bmtcril
Contributor

bmtcril commented Dec 4, 2023

Effectively no work has been done to optimize performance on the Operator dashboard, since it has been a lower priority than the Instructor dashboard. However, under large-scale load testing it's obvious that we need to make some improvements. Here are the charts exhibiting the worst behavior:

  • Active users over time
    • (worse with course key filter)
  • Total unique users
    • (worse with course key filter)
  • Total courses
    • (worse with course key filter)
  • Last received event
    • (worse with course key filter)
  • Total organizations
    • (ok without course key filter)
  • Enrollments over time
    • Runs out of memory
  • Enrollments by type
    • Runs out of memory
  • Filters on Enrollments tab
    • Runs out of memory
  • Most active courses per day
    • Timeout
  • Active users per organization
    • Runs out of memory

Many of these can be overcome by moving from the xapi_all_parsed table to materialized views or, even better, by creating new aggregated MVs that specifically roll up these values over time. A first pass on this ticket should determine which course of action to take for each chart and create a separate ticket for each, with this as the epic.
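To illustrate the roll-up idea, here is a minimal Python sketch of what an aggregated MV would precompute. It is not the actual ClickHouse schema: the field names (`emission_time`, `org`, `course_key`, `actor_id`) and both function names are assumptions made for this example. The point is that the chart reads from a small per-day summary instead of scanning every raw event.

```python
from collections import defaultdict
from datetime import datetime


def rollup_active_users(events):
    """Collapse raw events into per-(day, org, course) sets of distinct actors.

    The event field names are hypothetical, chosen only for this sketch.
    """
    rollup = defaultdict(set)
    for event in events:
        day = datetime.fromisoformat(event["emission_time"]).date()
        rollup[(day, event["org"], event["course_key"])].add(event["actor_id"])
    return rollup


def active_users_per_day(rollup, course_key=None):
    """Answer the chart query from the rollup, optionally filtered by course."""
    per_day = defaultdict(set)
    for (day, _org, course), actors in rollup.items():
        if course_key is None or course == course_key:
            # Union the sets so a user active in two courses counts once per day.
            per_day[day] |= actors
    return {day: len(actors) for day, actors in sorted(per_day.items())}
```

In this sketch the course key filter only touches rollup rows, so its cost scales with the number of (day, org, course) groups, not with the number of raw events.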

@bmtcril bmtcril added aspects v1 epic Large unit of work, consisting of multiple tasks labels Dec 4, 2023
@SoryRawyer
Contributor

@bmtcril for active users per organization and most active courses, would it make sense to re-create the navigation_events MV in dbt and use that as the basis for measuring activity?

@bmtcril
Contributor Author

bmtcril commented Jan 11, 2024

I was initially going for "any activity", but this seems more reasonable, and I think we'll want that for some of the v1 "page" level events anyway.

@bmtcril bmtcril changed the title Operator Dashboard performance Perf: Operator Dashboard performance Jan 12, 2024
@SoryRawyer
Contributor

The enrollment filters now pull from the course_names dictionary-backed table, so we shouldn't see any more memory errors loading the filters. There are also pull requests open to:

  • Use the same course_names table for the "Total courses" and "Total organizations" charts
  • Add navigation events to dbt, which would enable us to use that dataset for the "Total unique users", "Active users per organization", "Active learners over time", and "Most active courses per day" charts

Of the charts I wasn't able to update, here are my thoughts:

  • Given the lack of improvement to the fact_enrollments_by_day dataset, I don't think there are ways to improve the performance of the enrollment charts without subtly changing the questions those charts answer. Requiring an org or course-level filter would help here.
  • The "last received event" chart could possibly benefit from the work on "at risk" learners. This work could include an MV that tracks the most recent event per some set of dimensions (e.g. learner, verb, course, org). Such a dataset could help performance for showing recent events.
  • If navigation events are an insufficient proxy for learner activity, aggregating materialized views could provide high-level summaries suitable for these charts.
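
As a rough sketch of the "most recent event per dimension" idea, the Python below keeps one timestamp per (learner, verb, course, org) key and answers "last received event" from that small table. It only illustrates the shape of the data, not a ClickHouse implementation; the field names and function names are hypothetical.

```python
from datetime import datetime


def update_latest(latest, event):
    """Keep only the newest timestamp per (learner, verb, course, org) key.

    Field names are assumptions for this sketch, not the real schema.
    """
    key = (event["actor_id"], event["verb"], event["course_key"], event["org"])
    seen = datetime.fromisoformat(event["emission_time"])
    if key not in latest or seen > latest[key]:
        latest[key] = seen
    return latest


def last_received_event(latest, org=None):
    """Return the newest timestamp overall, or for one org; None if no match."""
    times = [t for (_actor, _verb, _course, o), t in latest.items()
             if org is None or o == org]
    return max(times, default=None)
```

Out-of-order events are handled by the comparison in `update_latest`, so a late-arriving older event never overwrites a newer timestamp.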

@bmtcril
Contributor Author

bmtcril commented Jan 12, 2024

Thanks, I think we'll need to rethink those a little bit. We may want another AggregatingMergeTree MV just for these enrollments if it comes to it, but the rest of this can wait until we have the new datasets to look at.

@bmtcril bmtcril closed this as completed Jan 12, 2024