Maksim Kita
I develop database management systems and specialize in Performance Engineering, Query Analysis and Planning, JIT Compilation, System Programming, and Distributed Systems.
Currently, I work at ClickHouse and focus on query analysis, planning, and execution.
Specialties: Database Development, Performance Engineering, Software Design, JIT Compilation, Distributed Systems, Scalability, System Programming, C++, C, Assembly Language.
Work History
Tinybird
- Principal Software Engineer
January 2024 - Present
Development of ClickHouse DBMS. Lead the ClickHouse team.
Main achievements:
- Improved performance for many short-running concurrent queries: decreased latency by 2-6 times and increased throughput by 3 times. Blog post about this improvement: “Lock contention in ClickHouse”.
- Increased performance of queries with FINAL for MergeTree engines by up to 60 times. Link to the pull request.
- Added support for recursive CTEs. Added support for the QUALIFY clause. Release blog post about these features: “ClickHouse Release 24.4”.
- Improved JOIN filter push down using equivalent sets. Added an optimization that automatically converts OUTER JOIN to INNER JOIN. Blog post about these improvements: “ClickHouse JOINs... 100x faster”.
- Increased performance by 10-60% for queries that spend a lot of execution time on string deserialization. Blog post about this improvement: “Power of small optimizations”.
- Increased performance of external aggregation by 8 times. Link to the pull request.
- Made many improvements in query execution, planning, JOINs, sorting, and aggregation. Fixed critical issues and memory leaks.
Yandex School of Data Analysis
- Lecturer
September 2023 - January 2024
- Worked as a lecturer for the DBMS course. Link to the course repository.
Yandex
- Principal Software Engineer
April 2023 - January 2024
Development of YDB distributed SQL DBMS. Worked on query execution and performance.
Main achievements:
- Optimized YDB for ARM. Increased overall performance of YDB for ARM by 30%. The results were presented at HighLoad 2023.
- Designed and developed an asynchronous JIT query compilation service. Enabled JIT query compilation by default. As a result, performance of heavy queries that process a lot of data increased by 2-29 times.
- Created a tool for measuring the latency and throughput of a single query (similar to clickhouse-benchmark) and used it to find many places to improve performance.
- Increased performance of row storage for queries that process a lot of data by 15-35%.
- Increased performance of data import into YDB by 3-7 times.
- Increased performance of serialization and deserialization of low-level row representation by 2-2.5 times.
- Developed the first version of the YDB interactive CLI and set the direction for its further development.
Side activities:
- Gave a talk about YDB ARM Performance Optimizations at HighLoad 2023. In the talk, I covered ARM optimization basics and different benchmarks (ClickBench, YCSB, TPC-C) with many examples (video, slides).
- Gave a talk about database development at Yandex. We discussed DBMS development at Yandex and DBMS development trends, and shared useful resources (video).
ClickHouse
- Senior Software Engineer
August 2021 - March 2023
Development of ClickHouse DBMS. Worked on query analysis, planning, and execution. Led the migration to the new query analysis and planning infrastructure.
Main achievements:
- Designed and implemented new infrastructure for query analysis and planning. That infrastructure opens up many opportunities: improved JOINs, a full-featured optimizer, better SQL support, and many other things.
- Improved sorting using low-level specializations and batch data processing, resulting in a 2-10 times performance improvement.
- Improved insertion into the MergeTree storage engine, resulting in a 2-3 times performance improvement.
- Designed and implemented JIT compilation for ORDER BY.
- Significantly improved and enhanced the dynamic dispatch infrastructure. This enhancement significantly improved the performance of unary functions, some aggregate functions, and logical functions.
- Maintained Dictionaries. Made a lot of usability and performance improvements.
Side activities:
- Gave a talk about “ClickHouse performance optimization practices” at C++ Russia 2022. The talk covers the ClickHouse CI/CD pipeline, performance tests, high-level architecture decisions for writing high-performance products, JIT compilation, and dynamic dispatch, with a lot of examples (slides).
- Gave a talk about “JIT compilation of queries in ClickHouse” at HighLoad 2022. The talk covers query execution and some recent JIT compilation improvements in ClickHouse (slides).
Higher School of Economics
- Lecturer
January 2022 - January 2024
- Worked as a DBMS lecturer at the Computer Science Faculty. Link to course overview. Link to course repository.
Yandex
- Senior Software Engineer
January 2021 - March 2022
Development of ClickHouse DBMS. Worked on Dictionaries, JIT compilation, and low-level optimizations.
Main achievements:
- Was responsible for Dictionaries. Made significant improvements, refactored and redesigned major parts of Dictionaries. After that redesign, implemented a lot of features and resolved almost all issues from the Dictionaries backlog.
- Designed and implemented infrastructure for JIT compilation.
- Finished implementation of JIT compilation for expression evaluation.
- Designed and implemented JIT compilation for GROUP BY.
- Designed and implemented Executable UDF (user-defined functions).
- Finished implementation of SQL UDF (user-defined functions).
- Made a lot of performance and usability improvements, mostly in the query execution area.
Side activities:
- Reviewed and merged more than 600 external pull requests from contributors (around 14 percent of all external pull requests).
EPAM Systems
- Senior Software Engineer
November 2017 - January 2021
Main achievements:
- Worked on several projects in different domains, including financial, telecommunication, and video streaming. On each project, played a core developer role and was deeply involved in all major technical design decisions. Designed and implemented a lot of reusable components.
- Refactored legacy codebases and introduced new features without regressions.
- Developed a framework for UI automation that was reused on different projects across the company.
- Contributed to an internal tool integrated with JIRA to evaluate overall project performance.
Side activities:
- Contributed to open-source projects: the Apple Swift compiler and the Poco libraries.
- Gave a podcast talk about “How to start contributing to a big Open Source project” (video).
- Participated in hackathons, company talks, and conferences.
Yanka Kupala State University of Grodno
- Lecturer
January 2022 - June 2022
- Worked as a DBMS teacher.
- Developed DBMS course labs.
Education
Yandex School of Data Analysis
- Completed the Big Data Infrastructure specialization: 2021 - 2023.
Yanka Kupala State University of Grodno
- M.S. Computer Science: 2019 - 2021.
- B.S. Computer Science: 2015 - 2019.