Zubair Nabi edited this page Mar 24, 2014 · 12 revisions

Cluster resource managers enable multiple data-intensive computing and storage frameworks to co-exist within the same cluster and share its physical resources. They also let practitioners mix programming paradigms to collate disparate datasets, for instance by analysing data in motion with a stream processing system and data at rest with a batch processing system. The predominant model for cluster management is two-level scheduling: at the first level, the cluster manager allocates resources to each individual framework, and at the second level, each framework distributes its allocation across the jobs that it needs to execute. Examples of this type of cluster manager include Apache YARN and Apache Mesos. The purpose of this project is to integrate IBM InfoSphere Streams with these cluster managers. As a first step towards this goal, the project currently supports Apache YARN.
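
To make the two-level model concrete, here is a minimal Python sketch. It is a toy illustration only and does not reflect how YARN, Mesos, or InfoSphere Streams actually schedule work: the `Framework` class, the `allocate_cluster` function, the equal-share policy, and the use of "cores" as the resource unit are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Framework:
    """Second level: a framework that splits its own allocation across its jobs."""
    name: str
    jobs: dict[str, int]  # job name -> cores requested (hypothetical unit)
    allocation: int = 0   # cores granted by the cluster manager

    def demand(self) -> int:
        return sum(self.jobs.values())

    def schedule_jobs(self) -> dict[str, int]:
        # Grant requests in insertion order until the allocation runs out.
        remaining = self.allocation
        grants = {}
        for job, wanted in self.jobs.items():
            grants[job] = min(wanted, remaining)
            remaining -= grants[job]
        return grants


def allocate_cluster(total_cores: int, frameworks: list[Framework]) -> None:
    """First level: the cluster manager divides cores among frameworks."""
    # Assumed toy policy: equal share, capped by each framework's demand.
    # Unused share is not redistributed in this sketch.
    share = total_cores // len(frameworks)
    for fw in frameworks:
        fw.allocation = min(share, fw.demand())


streaming = Framework("streaming", {"ingest": 4, "filter": 3})
batch = Framework("batch", {"etl": 2})
allocate_cluster(10, [streaming, batch])
result = {fw.name: fw.schedule_jobs() for fw in (streaming, batch)}
print(result)
```

The point of the split is that the cluster manager only needs to know each framework's aggregate demand, while the per-job decisions stay inside the framework.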

Wiki contents: