Author | Wolfram Wingerath |
Supervisors | Norbert Ritter, Daniela Nicklas, Erhard Rahm |
Title | Scalable Push-Based Real-Time Queries on Top of Pull-Based Databases |
Submitted on | 15.10.2018 |
Abstract | Many of today's web applications notify users of status updates and other events in real time. But even though more and more usage scenarios revolve around the interaction between users, detecting and publishing changes remains notoriously hard even with state-of-the-art data management systems. While traditional database systems excel at complex queries over historical data, they are inherently pull-based and therefore ill-equipped to push new information to clients. Systems for data stream management and processing, on the other hand, are natively push-oriented and thus facilitate reactive behavior. However, they do not retain data indefinitely and are therefore not able to answer historical queries. The separation between these two system classes gives rise to both high complexity and high maintenance costs for applications that require persistence and real-time change notifications at the same time. How can push-based access be enabled for database queries over historical data collections in a simple and efficient manner? In this thesis, we explore the system space between pull-oriented database systems and push-oriented stream management systems. Specifically, we focus on the novel system class of real-time databases that bridge the gap between both paradigms by providing collection-based semantics for pull-based and push-based queries alike. Through an in-depth system survey, we uncover deficiencies in existing implementations and scale-prohibitive limitations in their respective designs. In order to address these issues, we propose the system design InvaliDB which makes push-based real-time queries available as an opt-in feature for existing pull-based database systems. InvaliDB exhibits several substantial benefits over current real-time database architectures. First, it avoids the scalability bottlenecks that other systems are constrained by through a novel two-dimensional workload partitioning scheme. Second, our design supports more expressive queries than its peers, including sorted filter queries with limit and offset clauses, aggregations, and joins. Third, InvaliDB is database-agnostic through a pluggable query engine and can therefore be applied to existing (pull-based) application stacks in order to enable push-based data access. We provide an experimental evaluation to demonstrate that sustainable query matching throughput scales linearly with the number of servers employed for query matching, while end-to-end notification latency remains consistently low across all InvaliDB configurations. A detailed case study of our InvaliDB prototype in a production deployment further illustrates that our approach is feasible to implement, enables easy-to-use query interfaces, and is practically useful for data-intensive industry applications. |