Make Wise Decisions for Your DBMSs: Workload Forecasting and Performance Prediction Before Execution

Zhengtong Yan, Jiaheng Lu, Qingsong Guo, Gongsheng Yuan, Calvin Sun, Steven Yang

Research output: Chapter in book/report/conference proceeding › Conference contribution › Scientific › Peer-reviewed


The performance of a Database Management System (DBMS) is determined by its configuration and the workloads it needs to process. To achieve instance optimality, database administrators and end-users need to choose the optimal configuration and allocate the most appropriate resources for the workload of each database instance. However, the high complexity of time-varying workloads makes it extremely challenging to find the optimal configuration, especially for a cloud DBMS that may have millions of database instances with diverse workloads. There is no one-size-fits-all configuration that works for all workloads, since each workload has its own patterns of configuration and resource requirements. If a configuration cannot adapt to dynamic changes in the workload, the overall performance of the DBMS can degrade significantly unless a skilled administrator continuously re-configures it.

An ideal solution to these challenges is an autonomous, or self-driving, DBMS (e.g., Oracle Autonomous Database, Peloton, NoisePage, and openGauss), which is expected to automatically and constantly configure, tune, and optimize itself in accordance with workload changes, without any intervention from human experts. Since the optimal configuration depends heavily on the workload characteristics, the first and key step for an autonomous DBMS is to predict the future workload from historical data. Firstly, the DBMS should be able to forecast when the workload will change significantly (i.e., workload shift), how many queries will arrive (i.e., arrival rate), and which query a user will execute next (i.e., next query). This predicted workload information enables an autonomous DBMS to decide when and how to re-configure itself proactively, before the workload changes occur. Secondly, an autonomous DBMS also needs to predict query performance by estimating essential runtime metrics before execution, such as how long a query will take to complete (i.e., execution time) and how many resources it will consume (i.e., resource utilization). Predicting the execution time and resource demand prior to execution is useful in many tasks, including admission control, query scheduling, progress monitoring, system sizing, and resource management.
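
As a rough illustration of arrival rate prediction and workload shift detection, the sketch below applies simple exponential smoothing to a series of per-interval query counts and flags intervals whose count deviates strongly from the running forecast. It is not a method taken from the tutorial; the function name `forecast_arrival_rates`, the parameters `alpha` and `shift_ratio`, and the sample counts are all assumptions made for illustration.

```python
from typing import List, Tuple


def forecast_arrival_rates(
    history: List[float], alpha: float = 0.5, shift_ratio: float = 0.5
) -> Tuple[float, List[int]]:
    """Return (forecast for the next interval, indices flagged as shifts).

    history: queries observed per time interval (hypothetical input).
    alpha: smoothing factor in (0, 1]; larger values favour recent intervals.
    shift_ratio: relative forecast error above which an interval is flagged.
    """
    if not history:
        raise ValueError("history must contain at least one interval")
    forecast = history[0]  # initialise with the first observation
    shifts: List[int] = []
    for i, observed in enumerate(history[1:], start=1):
        # Flag a shift when the observation deviates strongly from the forecast.
        if forecast > 0 and abs(observed - forecast) / forecast > shift_ratio:
            shifts.append(i)
        # Simple exponential smoothing update.
        forecast = alpha * observed + (1 - alpha) * forecast
    return forecast, shifts


# Hypothetical per-interval query counts with a jump at index 3.
next_rate, shift_points = forecast_arrival_rates([100, 100, 100, 300, 300])
print(next_rate, shift_points)
```

A fixed relative threshold is the simplest possible shift test; a real system would more likely use a statistical change-point detector and a forecasting model that captures seasonality.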

In this tutorial, we will focus on 1) how to forecast future workloads (e.g., workload shift detection, arrival rate prediction, and next query prediction), and 2) how to analyze the behavior of workloads (e.g., execution time prediction and resource usage estimation). We will provide a comprehensive overview and a detailed introduction to both topics, from state-of-the-art methods and real-world applications to open problems and future directions. Specifically, we will not only discuss traditional methods, such as time-series analysis, Markov modeling, analytical modeling, and experiment-driven methods, but also cover state-of-the-art AI techniques, including machine learning, deep learning, reinforcement learning, and graph embedding.
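
To make the Markov modeling approach to next query prediction concrete, here is a minimal sketch of a first-order Markov model over query templates. It assumes that queries have already been mapped to template identifiers and grouped into user sessions; the names `fit_transitions` and `predict_next` and the sample sessions are hypothetical and not drawn from the tutorial.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def fit_transitions(sessions: List[List[str]]) -> Dict[str, Counter]:
    """Count how often each query template is followed by each other template."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts: Dict[str, Counter], template: str) -> Optional[str]:
    """Return the most frequent successor of `template`, or None if unseen."""
    followers = counts.get(template)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical sessions, each an ordered list of query-template identifiers.
sessions = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"]]
model = fit_transitions(sessions)
print(predict_next(model, "B"))
```

A first-order model conditions only on the most recent query; higher-order chains or the learned sequence models mentioned above can capture longer dependencies at the cost of needing more training data.
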
Title of host publication: 27th International Conference on Database Systems for Advanced Applications (DASFAA-2022)
Number of pages: 4
Publication date: 11 Apr 2022
Status: Published - 11 Apr 2022
MoE publication type: A4 Article in conference proceedings
Event: International Conference on Database Systems for Advanced Applications
Duration: 11 Apr 2022 to 14 Apr 2022
Conference number: 27


Name: Lecture Notes in Computer Science
ISSN (electronic): 1611-3349


  • 113 Computer and information sciences
