AI-Powered Query Optimization

Nagireddy Karri

doi:10.63282/3050-9262.IJAIDSML-V2I1P108

Authors

Nagireddy Karri, Senior IT Administrator (Database), Sherwin-Williams, USA.

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V2I1P108

Keywords:

Query optimization, cost-based optimizer, execution feedback, memoization, tail latency, plan stability

Abstract

This paper presents a framework for AI-powered query optimization that augments, rather than replaces, a classical cost-based optimizer. The design incorporates three learned components: (i) a learned cardinality estimator that captures correlations across joins and predicates; (ii) a neural residual cost corrector that learns cost error at the operator level; and (iii) a reinforcement-learning (RL) planner that focuses on high-leverage plan transformations under latency and resource-cost constraints. The system works in two stages: first, offline training on past workloads and schema-aware synthetic queries; then, cautious online adaptation driven by execution feedback (observed row counts, operator run times, spill events). Safety is enforced through uncertainty gating, timeout-sandboxed trials, and immediate fallback to baseline heuristics. We detail an integration path that keeps optimizer modularity intact (Volcano/Cascades memoization, rule rewrites) while exposing pluggable inference hooks. Compared with a robust cost-based baseline, evaluation on TPC-H/DS and JOB indicates consistent reductions in p95/p99 latency, improved plan stability, fewer re-optimizations, and lower CPU and memory consumption during peak loads. We examine failure modes (out-of-distribution predicates, opaque UDFs, and drift) and demonstrate how risk can be mitigated using drift detection and canaried fine-tuning. The findings show that AI assistance produces measurable, trustworthy gains when combined with strong guardrails and observability.
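As an illustration only, the uncertainty gating and residual cost correction described in the abstract might be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the names `GatedEstimator`, `max_uncertainty`, and `corrected_cost`, the 0.5 threshold, and the additive form of the residual correction are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class GatedEstimator:
    """Hypothetical wrapper: use the learned estimate only when it is trusted."""

    # Classical optimizer estimate (row count) for a predicate.
    baseline: Callable[[str], float]
    # Learned model returning (row estimate, uncertainty score); assumed interface.
    learned: Callable[[str], Tuple[float, float]]
    # Assumed gating threshold; a real system would tune this value.
    max_uncertainty: float = 0.5

    def estimate(self, predicate: str) -> Tuple[float, str]:
        """Return (row estimate, source), where source is 'learned' or 'baseline'."""
        try:
            rows, uncertainty = self.learned(predicate)
        except Exception:
            # Any model failure triggers immediate fallback to the baseline.
            return self.baseline(predicate), "baseline"
        if uncertainty > self.max_uncertainty or rows < 0:
            # Low confidence or an implausible value: fall back.
            return self.baseline(predicate), "baseline"
        return rows, "learned"


def corrected_cost(baseline_cost: float, predicted_residual: float) -> float:
    """Assumed additive correction: classical operator cost plus a learned
    error term, clamped so the result is never negative."""
    return max(0.0, baseline_cost + predicted_residual)


if __name__ == "__main__":
    confident = GatedEstimator(lambda p: 1000.0, lambda p: (120.0, 0.1))
    unsure = GatedEstimator(lambda p: 1000.0, lambda p: (120.0, 0.9))
    print(confident.estimate("a = 1"))  # learned estimate is used
    print(unsure.estimate("a = 1"))     # gate falls back to the baseline
    print(corrected_cost(50.0, -12.5))
```

The gate keeps the classical estimate as the default, so a mispredicting or failing model can only affect plans where its reported uncertainty is low.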

