Expressive And Scalable Event Stream Processing

by Mingsheng Hong




Institution: Cornell University
Department:
Year: 2009
Record ID: 1854308
Full text PDF: http://hdl.handle.net/1813/12803


Abstract

Rapid technical advances have made it possible to instrument even massive computing systems. However, the technology for processing high-speed data streams from physical sensors and software systems has lagged behind the capability to produce such streams. The goal of stream processing research is to develop algorithms and software infrastructures capable of processing streaming data with high throughput and low latency. A large class of streaming applications requires that stream processing systems be both expressive and scalable. That is, a stream processing system should be able to process a large number of reasonably sophisticated queries over high-speed input streams. There are, however, general tensions between expressiveness and scalability in stream processing. In this dissertation, we present techniques and prototype systems that address both expressiveness and scalability. First, we present Cayuga, a general-purpose event monitoring system with an expressive event algebra and a set of novel query optimization techniques. Second, we describe a rule-based Multi-Query Optimization framework, which generalizes Cayuga, unifies large-scale stream processing and event processing, and provides a platform for integrating future optimization techniques based on query rewriting. Finally, we describe an approach to large-scale XML stream join processing. To our knowledge, this is the first scalable solution to the problem of processing a large number of XML stream join queries. As the adoption of stream processing technology gains momentum, the ideas and techniques developed in this dissertation provide a foundation for building expressive and scalable stream processing systems with affordable cost and high reliability.