State Safety with ZooKeeper: All of the system’s running state is stored in Apache ZooKeeper. ZooKeeper’s atomic Read() and Write() APIs provide a safe mechanism for reading and writing that state, which is needed to recover from crashes and other failures.
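
To illustrate why atomic, versioned reads and writes make stored state safe to recover from, here is a minimal sketch. It does not use a real ZooKeeper client: `StateStore`, `VersionMismatch`, and the path `/job/state` are hypothetical names, and the store is an in-memory stand-in that only mimics the idea of a conditional write that fails when the state changed since it was read.

```python
import threading


class VersionMismatch(Exception):
    """Raised when a conditional write sees a stale version."""


class StateStore:
    """Hypothetical in-memory stand-in for a versioned state store."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # path -> (value, version)

    def read(self, path):
        # Return the value together with the version it was read at.
        with self._lock:
            return self._data[path]

    def write(self, path, value, expected_version=None):
        # Atomically replace the value; fail if another writer got in first.
        with self._lock:
            _, current = self._data.get(path, (None, -1))
            if expected_version is not None and expected_version != current:
                raise VersionMismatch(path)
            self._data[path] = (value, current + 1)
            return current + 1


store = StateStore()
store.write("/job/state", b"RUNNING")
value, version = store.read("/job/state")
# Only succeeds if nobody changed the state since we read it.
store.write("/job/state", b"CHECKPOINTED", expected_version=version)
```

A writer that holds a stale version gets `VersionMismatch` instead of silently overwriting newer state, which is the property a recovery procedure relies on.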

ZooKeeper also provides an efficient way to purge logs.

Write Ahead Logs: To ensure that no data is lost even on driver failure, Spark introduced Write Ahead Logs (WAL). When enabled, the WAL saves received data to log files on a fault-tolerant file system such as HDFS.
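
The write-ahead idea can be sketched in a few lines: persist each received record durably before processing it, then replay the log after a crash. This is not Spark's implementation. It is a minimal local-file sketch, and `wal_append`, `wal_replay`, and the file name `received.wal` are hypothetical names chosen for illustration.

```python
import os
import struct
import tempfile


def wal_append(log_file, payload):
    # Length-prefix each record so replay can find record boundaries.
    log_file.write(struct.pack(">I", len(payload)) + payload)
    log_file.flush()
    os.fsync(log_file.fileno())  # ask the OS to push the bytes to disk


def wal_replay(path):
    # Read back every complete record; stop at a truncated tail.
    records = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                break
            (length,) = struct.unpack(">I", header)
            payload = f.read(length)
            if len(payload) < length:
                break
            records.append(payload)
    return records


log_path = os.path.join(tempfile.mkdtemp(), "received.wal")
with open(log_path, "ab") as log:
    for block in (b"block-1", b"block-2"):
        wal_append(log, block)  # log first, process afterwards

recovered = wal_replay(log_path)
```

Because every record is synced before it is acted on, a process that restarts can call `wal_replay` and recover everything it had acknowledged, ignoring a half-written final record.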

This command will use the default directories for storing ledgers and the write ahead log, and will look for a ZooKeeper server on localhost. See the Admin Guide for more details. To see the default values of these configuration variables, run.

The controllers must operate on disks with low latency. The cluster requires the disk storage system for each node to have a peak write latency of less than ms, and a mean write latency of less than ms.

> >[myid:1] - WARN [[email protected]] -
> fsync-ing the write ahead log in SyncThread:1 took ms which will
> adversely effect operation latency. See the ZooKeeper troubleshooting guide
> > I am running ZK cluster of size 3 in a VM.
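
One way to check whether a disk is likely to trigger this kind of fsync warning is to measure peak and mean fsync latency directly. The sketch below is a rough probe, not a ZooKeeper tool. `timed_fsync` and `measure_fsync` are hypothetical helper names, the 4 KiB payload and 20 rounds are arbitrary choices, and it tests the system temp directory rather than the actual transaction log disk.

```python
import os
import tempfile
import time


def timed_fsync(fd):
    # Return how long one fsync call took, in milliseconds.
    start = time.monotonic()
    os.fsync(fd)
    return (time.monotonic() - start) * 1000.0


def measure_fsync(directory, rounds=20, payload=b"x" * 4096):
    # Write a small block and fsync it repeatedly, recording each latency.
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(rounds):
            os.write(fd, payload)
            latencies.append(timed_fsync(fd))
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


samples = measure_fsync(tempfile.gettempdir())
peak_ms = max(samples)
mean_ms = sum(samples) / len(samples)
print(f"fsync peak={peak_ms:.2f} ms mean={mean_ms:.2f} ms")
```

To get a meaningful number, pass the directory that actually holds the write ahead log instead of the temp directory, and run the probe while the VM is under its normal load.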

