CLASH

CLASH is a research project dedicated to query processing over semi-relational streams.

Welcome to CLASH

Clash is a system for query processing.

The principal idea is as follows:

Figure: Ideal course of events

We take a query (for now, select-from-where) and fixed data characteristics (stream rates and join selectivities) and feed both into an optimizer. The optimizer generates a physical graph, which models a streaming system in an abstract way. This physical graph can then be translated into a Storm topology.
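The pipeline described above can be illustrated with a small sketch. It is not the Clash optimizer: the cost model (sum of estimated intermediate result rates over a left-deep join order), the exhaustive enumeration, and all names (`intermediate_rates`, `optimize`, `physical_graph`, the `spout:`/`join`/`sink` labels) are assumptions made for illustration only.

```python
from itertools import permutations


def intermediate_rates(order, rates, selectivities):
    """Estimate the output rate after each join step of a left-deep order."""
    result = []
    joined = [order[0]]
    rate = rates[order[0]]
    for stream in order[1:]:
        rate *= rates[stream]
        for other in joined:
            # Pairs without a join predicate count as cross products (1.0).
            rate *= selectivities.get(frozenset((stream, other)), 1.0)
        joined.append(stream)
        result.append(rate)
    return result


def optimize(streams, rates, selectivities):
    """Pick the left-deep order with the lowest summed intermediate rate."""
    return min(
        permutations(streams),
        key=lambda order: sum(intermediate_rates(order, rates, selectivities)),
    )


def physical_graph(order):
    """Describe the plan as (source, target) edges between abstract operators."""
    edges = []
    upstream = f"spout:{order[0]}"
    for i, stream in enumerate(order[1:], start=1):
        join = f"join{i}"
        edges.append((upstream, join))
        edges.append((f"spout:{stream}", join))
        upstream = join
    edges.append((upstream, "sink"))
    return edges


if __name__ == "__main__":
    # Hypothetical data characteristics for a query joining R, S, and T.
    rates = {"R": 100.0, "S": 10.0, "T": 1000.0}
    selectivities = {frozenset(("R", "S")): 0.1, frozenset(("S", "T")): 0.001}
    best = optimize(["R", "S", "T"], rates, selectivities)
    for edge in physical_graph(best):
        print(edge)
```

With these made-up numbers, the sketch joins S and T first, because that pair has a low selectivity, and adds R last. The resulting edge list stands in for the abstract physical graph; translating it into an actual Storm topology is outside the scope of the sketch.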

Contents

For consistent wording, refer to this document: Glossary

All pages of this documentation in no particular order:

Future Work

Because this is meant to be a streaming system, the data characteristics assumed initially may change over time. When they do, a different strategy may become optimal.

Figure: Ideal course of events

In this setting, two inputs are given: a physical graph (the one currently used for answering the query) and the currently observed data. Both are passed to a (re)optimizer, which then reports how the graph should be changed, e.g., by changing the join order.
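The re-optimization idea could look roughly like the following sketch. Since this is future work, everything here is an assumption: the cost model (sum of estimated intermediate result rates), the representation of the running plan as a join order, the `min_gain` threshold that avoids switching plans for marginal improvements, and the names `plan_cost` and `reoptimize`.

```python
from itertools import permutations


def plan_cost(order, rates, selectivities):
    """Sum of estimated intermediate result rates for a left-deep join order."""
    cost = 0.0
    rate = rates[order[0]]
    joined = [order[0]]
    for stream in order[1:]:
        rate *= rates[stream]
        for other in joined:
            # Pairs without a join predicate count as cross products (1.0).
            rate *= selectivities.get(frozenset((stream, other)), 1.0)
        joined.append(stream)
        cost += rate
    return cost


def reoptimize(current_order, observed_rates, observed_selectivities, min_gain=0.2):
    """Return a cheaper join order, or None if the running plan is kept.

    min_gain is a hypothetical threshold: the plan is only changed if the
    estimated cost drops by more than this fraction.
    """
    current_cost = plan_cost(current_order, observed_rates, observed_selectivities)
    best = min(
        permutations(current_order),
        key=lambda order: plan_cost(order, observed_rates, observed_selectivities),
    )
    best_cost = plan_cost(best, observed_rates, observed_selectivities)
    if best_cost < (1.0 - min_gain) * current_cost:
        return best
    return None


if __name__ == "__main__":
    # Hypothetical statistics observed while the order (R, S, T) is running.
    observed_rates = {"R": 100.0, "S": 10.0, "T": 1000.0}
    observed_selectivities = {
        frozenset(("R", "S")): 0.1,
        frozenset(("S", "T")): 0.001,
    }
    print(reoptimize(("R", "S", "T"), observed_rates, observed_selectivities))
```

Returning `None` when the gain is small reflects one possible design choice: changing a running graph is itself costly, so a re-optimizer would likely only suggest a change when the expected benefit is substantial.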

Productive Code

The code lives in the Clash Gitlab Project. How the project is set up is explained under Code Structure.

There are further helpful projects, e.g., for running Clash: