Protecting Your Tables Against Application Errors


“Do applications need to back up data in Windows Azure Storage if Windows Azure Storage already stores multiple replicas of the data?” For business continuity, it can be important to protect your data against errors in the application itself, which may erroneously modify it.


If there is a problem at the application layer, the erroneous changes are committed to the replicas that Windows Azure Storage maintains. So, to get back to the correct data, you will need to maintain a backup. Many application developers today have implemented their own backup strategy. The purpose of this post is to cover backup strategies for Tables.


To back up tables, we iterate through the list of tables and scan each table to copy its entities into blobs or into a different destination table. Entity group transactions can then be used to speed up restoring entities from these blobs. Note that the example in this post performs a full backup of the tables, not a differential backup.


Table Backup


We will go over a simple full backup solution here. The strategy is to take as input a list of tables and, for each table, a list of keys used to partition the table scan. The list of keys is converted into ranges such that backing up each range separately produces a backup of the entire table. Breaking the table scan into ranges this way allows the table to be backed up in parallel. The TableKeysInfo class encapsulates the logic of splitting the keys into ranges, as shown below:

public class TableKeysInfo
{
    private List<PartitionKeyRange> keyList = new List<PartitionKeyRange>();

    /// <summary>
    /// The table to backup
    /// </summary>
    public string TableName { get; set; }

    public TableKeysInfo(string tableName, string[] keys)
    {
        if (tableName == null)
        {
            throw new ArgumentNullException("tableName");
        }

        if (keys == null)
        {
            throw new ArgumentNullException("keys");
        }

        this.TableName = tableName;

        // sort the keys
        Array.Sort<string>(keys, StringComparer.InvariantCulture);

        // split key list {A, M, X} into {[null-A), [A-M), [M-X), [X-null)}
        this.keyList.Add(new PartitionKeyRange(null, keys.Length > 0 ? keys[0] : null));
        for (int i = 1; i < keys.Length; i++)
        {
            this.keyList.Add(new PartitionKeyRange(keys[i - 1], keys[i]));
        }

        if (keys.Length > 0)
        {
            this.keyList.Add(new PartitionKeyRange(keys[keys.Length - 1], null));
        }
    }

    /// <summary>
    /// The ranges of keys that will cover the entire table
    /// </summary>
    internal IEnumerable<PartitionKeyRange> KeyRangeList
    {
        get
        {
            return this.keyList.AsEnumerable<PartitionKeyRange>();
        }
    }
}
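
The PartitionKeyRange type referenced above is not defined in the post. A minimal sketch consistent with how it is used here (a pair of optional string bounds, where null means unbounded) might look like this:

// A minimal sketch of the PartitionKeyRange type referenced above (not shown in the
// original post): it carries the lower and upper bounds of a partition key range,
// where null means the range is unbounded on that side.
public class PartitionKeyRange
{
    public PartitionKeyRange(string min, string max)
    {
        this.Min = min;
        this.Max = max;
    }

    // inclusive lower bound of the range; null means unbounded
    public string Min { get; private set; }

    // exclusive upper bound of the range; null means unbounded
    public string Max { get; private set; }
}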


BackupTables is the entry method into the backup process. The client provides the TableKeysInfo for each table. For example, if partition keys are GUIDs, here is how we can invoke the BackupTables method:

CloudTableClient tableClient = new CloudTableClient(account.TableEndpoint.AbsoluteUri, account.Credentials);
CloudBlobClient blobClient = new CloudBlobClient(account.BlobEndpoint.AbsoluteUri, account.Credentials);

// The more keys listed here, the better for scale
string[] rangeQueries = new string[] { "3", "8", "a", "f" };
string[] tableNames = new string[] { "Customers", "Orders" };

List<TableKeysInfo> keyInfo = new List<TableKeysInfo>();
foreach (string tableName in tableNames)
{
    keyInfo.Add(new TableKeysInfo(tableName, rangeQueries));
}

BackupTables(tableClient, blobClient, keyInfo);


BackupTables iterates through each table and, for each range in the table, invokes BackupTableRange, which is responsible for saving the result set for the assigned range into a blob. For simplicity, the example does this sequentially, but for faster backup you would want to parallelize the code below to process the ranges and tables in parallel (see the sketch after the BackupTables method).

/// <summary>
/// Backup each table to blobs. Each table will be stored under a container with the same name as the table.
/// </summary>
/// <param name="tableClient"></param>
/// <param name="blobClient"></param>
/// <param name="tablesToBackup">
/// The tablesToBackup maps each table name to a list of keys that will be used to partition the query
/// </param>
public static void BackupTables(CloudTableClient tableClient,
    CloudBlobClient blobClient, List<TableKeysInfo> tablesToBackup)
{
    if (tableClient == null)
    {
        throw new ArgumentNullException("tableClient");
    }

    try
    {
        // we will use this id as the folder name. The blobs will be stored under:
        // <lower cased table name>/<backupid>/
        string backupId = DateTime.UtcNow.ToString("yy-MM-dd-HH-mm-ss");

        // list each range in each table and back up each range
        foreach (TableKeysInfo tableKeysInfo in tablesToBackup)
        {
            CloudBlobContainer container = blobClient.GetContainerReference(tableKeysInfo.TableName.ToLower());
            container.CreateIfNotExist();

            foreach (PartitionKeyRange range in tableKeysInfo.KeyRangeList)
            {
                BackupTableRange(tableClient, container, tableKeysInfo.TableName, range, backupId);
            }
        }
    }
    catch (Exception e)
    {
        // TODO: log the exception for debugging purposes and then rethrow
        throw;
    }
}
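
As a rough illustration of that parallelization, here is a minimal sketch (not part of the original solution) that backs up all of the ranges of each table in parallel using Parallel.ForEach from the .NET 4 Task Parallel Library. It assumes the BackupTableRange method shown next, and the degree of parallelism is an arbitrary choice:

// A minimal sketch (not from the original post): back up all ranges of each table in
// parallel using Parallel.ForEach (System.Threading.Tasks, .NET 4). It assumes the
// BackupTableRange method shown next; 4 is an arbitrary degree of parallelism.
public static void BackupTablesInParallel(CloudTableClient tableClient,
    CloudBlobClient blobClient, List<TableKeysInfo> tablesToBackup)
{
    string backupId = DateTime.UtcNow.ToString("yy-MM-dd-HH-mm-ss");

    foreach (TableKeysInfo tableKeysInfo in tablesToBackup)
    {
        CloudBlobContainer container = blobClient.GetContainerReference(tableKeysInfo.TableName.ToLower());
        container.CreateIfNotExist();

        // each call to BackupTableRange creates its own context and memory stream,
        // so the ranges can be processed independently
        Parallel.ForEach(
            tableKeysInfo.KeyRangeList,
            new ParallelOptions() { MaxDegreeOfParallelism = 4 },
            range => BackupTableRange(tableClient, container, tableKeysInfo.TableName, range, backupId));
    }
}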


BackupTableRange builds a query that scans the assigned key range and then invokes BackupToContainer, as shown below. We use the BackupEntity class to read the result. BackupEntity stores an internal XElement called EntryElement that holds the raw OData XML for the entity as received in the query response. To get hold of this raw data, we use the ReadingEntity event on the context, as shown in the code. The ResolveType delegate is used to provide the type that the WCF Data Services client should use (see this forum post for details).

/// <summary>
/// Create a query that will scan the assigned range and save it to a blob in the given container
/// </summary>
/// <param name="tableClient"></param>
/// <param name="container"></param>
/// <param name="tableName"></param>
/// <param name="range"></param>
/// <param name="backupId"></param>
private static void BackupTableRange(
    CloudTableClient tableClient,
    CloudBlobContainer container,
    string tableName,
    PartitionKeyRange range,
    string backupId)
{
    TableServiceContext context = tableClient.GetDataServiceContext();
    context.MergeOption = MergeOption.NoTracking;
    context.ResolveType = TableBackup.ResolveType;
    context.ReadingEntity += new EventHandler<ReadingWritingEntityEventArgs>(TableBackup.OnReadingEntity);
    context.RetryPolicy = RetryPolicies.RetryExponential(5, RetryPolicies.DefaultClientBackoff);

    var query = from entity in context.CreateQuery<BackupEntity>(tableName) select entity;
    if (range.Min != null)
    {
        query = query.Where(entity => entity.PartitionKey.CompareTo(range.Min) >= 0);
    }

    if (range.Max != null)
    {
        query = query.Where(entity => entity.PartitionKey.CompareTo(range.Max) < 0);
    }

    CloudTableQuery<BackupEntity> cloudQuery = new CloudTableQuery<BackupEntity>((DataServiceQuery<BackupEntity>)query);

    BackupToContainer(container, cloudQuery, backupId, range);
}

/// <summary>
/// Entities used for backup/restore
/// </summary>
[DataServiceKey("PartitionKey", "RowKey")]
public class BackupEntity
{
    public string PartitionKey { get; set; }

    public string RowKey { get; set; }

    /// <summary>
    /// Used during restore to store the entry element.
    /// </summary>
    internal XElement EntryElement { get; set; }
}

static void OnReadingEntity(object sender, ReadingWritingEntityEventArgs args)
{
    BackupEntity entity = args.Entity as BackupEntity;
    entity.EntryElement = args.Data;
}

static Type ResolveType(string entityName)
{
    return typeof(BackupEntity);
}


BackupToContainer creates a block blob to save the result set. Each block in the blob contains a collection of batches, and each batch contains a collection of entities that can be part of a single entity group transaction, i.e. a batch command. This means all of the entities in the same batch must have the same PartitionKey value. Each entity is stored as an entry element, which is the raw format that the OData protocol uses to send the entity over the wire. The XML in a blob will look like the following, with each Block element stored in a single block:

<?xml version="1.0" encoding="utf-8"?>
<Block>
  <Batch>
    <entry>...</entry>
    <entry>...</entry>
    <entry>...</entry>
  </Batch>
  <Batch>
    <entry>...</entry>
    <entry>...</entry>
  </Batch>
</Block>

<?xml version="1.0" encoding="utf-8"?>
<Block>
  <Batch>
    <entry>...</entry>
    <entry>...</entry>
  </Batch>
  <Batch>
    <entry>...</entry>
    <entry>...</entry>
  </Batch>
</Block>

Since each block is a well-formed XML document, we can read one block at a time and execute the group of transactions it contains during the restore process.


To build this structure, we need a few helper classes, which we will go over first.



  • State class – maintains a global state that is used while iterating through the entities in the query. It keeps track of the size that has been serialized into a memory stream. It also controls the entity iteration over the query.

  • Batch class – is a group of entities that can be part of a single batch transaction. It reads an entity from the query and as long as the number of entities is less than 100, the entities have the same partition key, and we do not exceed the size limit, we keep returning entities to batch up.

  • Block class – is a single block element that stores the group of batch commands. A block has a size limit to maintain. A block can be at most 4MB but we limit it to less than that because when the batch request is sent, more xml tags for the request are added as required by the OData protocol and there is a per entity overhead to consider.

  • Blob class – is a list of blocks that will be stored in a single blob. We set an arbitrary limit of 20 blocks per backup blob to allow easy parallelization of the backup and restore at the blob level. This can easily be changed to a larger value, but must be at most 50,000 blocks, since that is the per-blob limit set by the storage system. The name of the blob will be <Backup Id>/<guid>_<Min range>_<Max range> and it will be placed in a container which has the same name as the table, but lower cased.

  • So the complete Uri will be: <Container name>/<Backup Id>/<guid>_<Min range>_<Max range>

    • Container name – is the lower cased name of the table

    • Backup Id – is the unique id formed from the timestamp when the backup started. The format used is: “yy-MM-dd-HH-mm-ss”

    • Min range – the min key used for the query. “null” if the min key is unbounded in the query range.

    • Max range – the max key used for the query. “null” if the max key is unbounded in the query range.

Given the above classes, serializing into a blob is simple. As long as there is an entity we have not processed in the iterator, we group entities into batches and batches into blocks. Each Block is written to an Azure blob block, and we then reset the MemoryStream, since the next entity will be written to a new batch that belongs to a new block. When we hit the limit for blocks in a blob, we invoke PutBlockList with all the block ids written and create a new blob for future blocks. The BackupBlock method writes the data accumulated in the memory stream to a block only if at least one entity was seen.

static void BackupToContainer(CloudBlobContainer containerToSave, CloudTableQuery<BackupEntity> query,
    string backupId, PartitionKeyRange range)
{
    // A block can be at most 4 MB in Azure Storage. Though we will be using much less,
    // we allocate 4MB for the edge case where an entity may be 1MB
    MemoryStream stream = new MemoryStream(4 * 1024 * 1024);
    State state = new State(query, stream);

    BlobRequestOptions requestOptions = new BlobRequestOptions()
    {
        RetryPolicy = RetryPolicies.RetryExponential(5, RetryPolicies.DefaultClientBackoff)
    };

    while (!state.HasCompleted)
    {
        // Store the result set to a blob in the container. We use a naming scheme, but the scheme
        // does not have any consequences on the strategy itself
        string backupFileName = string.Format("{0}/{1}_{2}_{3}.xml",
            backupId,
            Guid.NewGuid(),
            range.Min == null ? "null" : range.Min.GetHashCode().ToString(),
            range.Max == null ? "null" : range.Max.GetHashCode().ToString());
        CloudBlockBlob backupBlob = containerToSave.GetBlockBlobReference(backupFileName);

        Blob blob = new Blob(state);

        List<string> blockIdList = new List<string>();

        foreach (Block block in blob.Blocks)
        {
            string blockId = BackupBlock(stream, requestOptions, backupBlob, block);
            if (!string.IsNullOrEmpty(blockId))
            {
                blockIdList.Add(blockId);
            }
        }

        if (blockIdList.Count > 0)
        {
            // commit the block list
            backupBlob.PutBlockList(blockIdList, requestOptions);
        }
    }
}

private static string BackupBlock(MemoryStream stream, BlobRequestOptions requestOptions, CloudBlockBlob backupBlob, Block block)
{
    int entityCount = 0;

    // reset the memory stream as we begin a new block
    stream.Seek(0, SeekOrigin.Begin);
    stream.SetLength(0);
    XmlWriter writer = XmlWriter.Create(stream);

    writer.WriteStartElement("Block");
    foreach (Batch batch in block.Batches)
    {
        // write the begin batch element
        writer.WriteStartElement("Batch");
        foreach (BackupEntity entity in batch.Entities)
        {
            entityCount++;
            entity.EntryElement.WriteTo(writer);
        }
        writer.WriteEndElement();
    }
    writer.WriteEndElement();
    writer.Close();
    stream.SetLength(stream.Position);
    stream.Seek(0, SeekOrigin.Begin);

    // if we have written > 0 entities, store the data to a block; else we can reject this block
    if (entityCount > 0)
    {
        backupBlob.PutBlock(block.BlockId, stream, null, requestOptions);
        return block.BlockId;
    }

    return null;
}

/// <summary>
/// The class that maintains the global state for the iteration
/// </summary>
internal class State
{
    protected MemoryStream stream;
    IEnumerator<BackupEntity> queryIterator;

    internal State(CloudTableQuery<BackupEntity> query, MemoryStream stream)
    {
        this.queryIterator = query.GetEnumerator();
        this.stream = stream;
    }

    /// <summary>
    /// This entity is one we may have retrieved but that does not belong to the current batch,
    /// so we store it here so that it can be returned on the next iteration
    /// </summary>
    internal BackupEntity LookAheadEntity { private get; set; }

    /// <summary>
    /// We have completed if the look ahead entity is null and the iterator is completed too.
    /// </summary>
    internal bool HasCompleted
    {
        get
        {
            return this.queryIterator == null && this.LookAheadEntity == null;
        }
    }

    /// <summary>
    /// Get the amount of data we have saved in the stream
    /// </summary>
    internal long CurrentBlockSize
    {
        get
        {
            stream.Flush();
            return stream.Position;
        }
    }

    /// <summary>
    /// Return the next entity - which can be either the
    /// look ahead entity or a new one from the iterator.
    /// We return null if there are no more entities
    /// </summary>
    /// <returns></returns>
    internal BackupEntity GetNextEntity()
    {
        BackupEntity entityToReturn = null;
        if (this.LookAheadEntity != null)
        {
            entityToReturn = this.LookAheadEntity;
            this.LookAheadEntity = null;
        }
        else if (this.queryIterator != null)
        {
            if (this.queryIterator.MoveNext())
            {
                entityToReturn = this.queryIterator.Current;
            }
            else
            {
                this.queryIterator = null;
            }
        }

        return entityToReturn;
    }
}

/// <summary>
/// Represents a collection of entities in a single batch
/// </summary>
internal class Batch
{
    static int MaxEntityCount = 100;

    // Save at most 3.5MB in a batch so that we have enough room for
    // the xml tags that WCF Data Services adds in the OData protocol
    static int MaxBatchSize = (int)(3.5 * 1024 * 1024);

    State state;

    internal Batch(State state)
    {
        this.state = state;
    }

    /// <summary>
    /// Yield entities until we hit a condition that should terminate a batch.
    /// The conditions to terminate on are:
    /// 1. 100 entities in a batch
    /// 2. 3.5MB of data in the batch
    /// 3. 3.8MB of block size
    /// 4. We see a new partition key
    /// </summary>
    internal IEnumerable<BackupEntity> Entities
    {
        get
        {
            BackupEntity entity;
            long currentSize = this.state.CurrentBlockSize;

            string lastPartitionKeySeen = null;
            int entityCount = 0;

            while ((entity = state.GetNextEntity()) != null)
            {
                if (lastPartitionKeySeen == null)
                {
                    lastPartitionKeySeen = entity.PartitionKey;
                }

                int approxEntitySize = entity.EntryElement.ToString().Length * 2;
                long batchSize = this.state.CurrentBlockSize - currentSize;
                if (entityCount >= Batch.MaxEntityCount
                    || !string.Equals(entity.PartitionKey, lastPartitionKeySeen)
                    || batchSize + approxEntitySize > Batch.MaxBatchSize
                    || this.state.CurrentBlockSize + approxEntitySize > Block.MaxBlockSize)
                {
                    // set this current entity as the look ahead since it needs to be part of the next batch
                    state.LookAheadEntity = entity;
                    yield break;
                }

                entityCount++;
                yield return entity;
            }
        }
    }
}

/// <summary>
/// Represents all batches in a block
/// </summary>
internal class Block
{
    // Though a block can be up to 4MB, we stop before that to leave a buffer
    static int MaxBlockSize = (int)(3.8 * 1024 * 1024);

    State state;

    internal string BlockId { get; private set; }

    internal Block(State state)
    {
        this.state = state;
        this.BlockId = Convert.ToBase64String(Guid.NewGuid().ToByteArray());
    }

    /// <summary>
    /// The list of batches in the block.
    /// </summary>
    internal IEnumerable<Batch> Batches
    {
        get
        {
            while (!state.HasCompleted && state.CurrentBlockSize < Block.MaxBlockSize)
            {
                yield return new Batch(state);
            }
        }
    }
}

/// <summary>
/// Represents all blocks in a blob
/// </summary>
internal class Blob
{
    /// <summary>
    /// We will allow storing at most 20 blocks in a blob
    /// </summary>
    static int MaxBlocksInBlobs = 20;

    State state;

    internal CloudBlob blob { get; private set; }

    internal Blob(State state)
    {
        this.state = state;
    }

    /// <summary>
    /// The blocks that form the blob
    /// </summary>
    internal IEnumerable<Block> Blocks
    {
        get
        {
            int blockCount = 0;

            while (!state.HasCompleted && blockCount < Blob.MaxBlocksInBlobs)
            {
                blockCount++;
                yield return new Block(state);
            }
        }
    }
}


This allows us to store a single table across multiple blobs, and allows the restore to be parallelized across these blobs.


Table Restore


To restore tables from blobs, we get the list of blobs in the container named after the table, and then for each blob we retrieve the list of blocks. For each block in the blob, we load the XML, retrieve the entry elements, and create a BackupEntity instance for each entry, which we add to the context. Once we have added all entities in a batch element, we call SaveChanges with the Batch option to execute the transaction. The restore process assumes that the destination is a new table with no existing entities, so any “conflict” errors during the add are ignored.

CloudBlobContainer container = blobClient.GetContainerReference(tableName.ToLower());
CloudBlobDirectory dir = container.GetDirectoryReference(backupIdToRestore);

IEnumerable<IListBlobItem> blobs = dir.ListBlobs();
foreach (IListBlobItem blob in blobs)
{
    CloudBlockBlob blockBlob = container.GetBlockBlobReference(blob.Uri.AbsoluteUri);
    TableRestore.RestoreTo(tableClient, blockBlob, tableToRestoreTo);
}


The RestoreTo method gets the block list for the blob and then downloads the data using a range GET for each block. It then retrieves the list of batch elements in each block, and for each batch element it invokes the ExecuteBatch method.

static void RestoreTo(CloudTableClient tableClient, CloudBlockBlob blob, string tableName)
{
    tableClient.CreateTableIfNotExist(tableName);

    BlobRequestOptions options = new BlobRequestOptions()
    {
        RetryPolicy = RetryPolicies.RetryExponential(5, RetryPolicies.DefaultClientBackoff)
    };

    // get all blocks and read each block separately as it is an xml doc by itself
    IEnumerable<ListBlockItem> blocks = (IEnumerable<ListBlockItem>)blob.DownloadBlockList(BlockListingFilter.Committed, options);

    long currentOffset = 0;
    foreach (ListBlockItem block in blocks)
    {
        // read each block using a range GET
        HttpWebRequest request = BlobRequest.Get(blob.Uri, 120, null, currentOffset, block.Size, null);
        blob.ServiceClient.Credentials.SignRequest(request);

        XDocument doc = null;

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            using (Stream stream = response.GetResponseStream())
            {
                doc = XDocument.Load(new XmlTextReader(stream));
            }
        }

        // each block has a root <Block> element containing <Batch> elements
        IEnumerable<XElement> batchNodes = doc.Element("Block").Elements("Batch");
        foreach (XElement batchNode in batchNodes)
        {
            ExecuteBatch(tableClient, tableName, batchNode);
        }

        currentOffset += block.Size;
    }
}


The ExecuteBatch method retrieves all entry elements in the batch, creates a BackupEntity instance for each, and adds it to the context. It hooks the WritingEntity event to control the XML sent over the wire. WritingEntity is called after WCF Data Services forms the XML element, but before it is serialized over the wire. We use the WritingEntity event to replace the properties element that WCF Data Services has written with the properties element retrieved from the entry element saved in the blob.

static void ExecuteBatch(CloudTableClient tableClient, string tableName, XElement batchNode)
{
    TableServiceContext context = tableClient.GetDataServiceContext();
    context.WritingEntity += new EventHandler<ReadingWritingEntityEventArgs>(OnWritingEntity);

    // for each entry create a backup entity
    IEnumerable<XElement> entries = batchNode.Elements(AtomNamespace + "entry");
    foreach (XElement entryNode in entries)
    {
        XElement propertiesElem = entryNode.Elements(AtomNamespace + "content")
            .Elements(AstoriaMetadataNamespace + "properties")
            .FirstOrDefault();
        XElement pkElement = propertiesElem.Element(AstoriaDataNamespace + "PartitionKey");
        XElement rkElement = propertiesElem.Element(AstoriaDataNamespace + "RowKey");

        BackupEntity entity = new BackupEntity()
        {
            PartitionKey = pkElement.Value,
            RowKey = rkElement.Value,
            EntryElement = entryNode
        };

        context.AddObject(tableName, entity);
    }

    context.BatchWithRetries(TableExtensions.RetryExponential());
}

static void OnWritingEntity(object sender, ReadingWritingEntityEventArgs args)
{
    BackupEntity entity = args.Entity as BackupEntity;
    XElement content = args.Data.Element(AtomNamespace + "content");
    XElement propertiesElem = content.Element(AstoriaMetadataNamespace + "properties");

    propertiesElem.Remove();

    XElement propertiesElemToUse = entity.EntryElement.Elements(AtomNamespace + "content")
        .Elements(AstoriaMetadataNamespace + "properties")
        .FirstOrDefault();

    content.Add(propertiesElemToUse);
}
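
The AtomNamespace, AstoriaDataNamespace and AstoriaMetadataNamespace constants used above are not defined in the post. They are the standard Atom and ADO.NET Data Services namespace URIs and could be declared as follows:

// XNamespace constants assumed by ExecuteBatch and OnWritingEntity above. These are the
// standard Atom and ADO.NET Data Services (Astoria) namespace URIs.
static readonly XNamespace AtomNamespace = "http://www.w3.org/2005/Atom";
static readonly XNamespace AstoriaDataNamespace = "http://schemas.microsoft.com/ado/2007/08/dataservices";
static readonly XNamespace AstoriaMetadataNamespace = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata";

The BatchWithRetries extension used in ExecuteBatch is also not shown in this post. As a rough sketch (an assumption, not the author's helper), the batch could instead be executed with SaveChangesWithRetries and any 409 Conflict swallowed, which matches the assumption above that the destination table starts out empty. Note that if any one entity in a batch conflicts, the whole entity group transaction is rejected:

// A minimal sketch (not from the original post) of executing the batch while treating
// 409 Conflict as benign, since the restore assumes a fresh destination table.
static void SaveBatchIgnoringConflicts(TableServiceContext context)
{
    try
    {
        context.SaveChangesWithRetries(SaveChangesOptions.Batch);
    }
    catch (DataServiceRequestException e)
    {
        // rethrow anything that is not a conflict
        DataServiceClientException inner = e.InnerException as DataServiceClientException;
        if (inner == null || inner.StatusCode != 409)
        {
            throw;
        }
    }
}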


The following are some improvements that can be applied to the above code:



  1. Parallel processing of backup and restore for faster backup and restore.

    • For backups, we can parallelize the query execution over the different ranges as well as over each table in the storage account. When we process the ranges in parallel, it can be good to randomly select the query ranges to better spread out the load of the backup over your tables in production.

    • For restore we can parallelize by processing either different blobs or blocks in parallel and if we have multiple tables to restore, we can also parallelize across the different backup blob containers (representing different tables).

  2. Rather than taking the list of keys as input from the user, we could remember the key ranges we see during each run of the backup and use them to guide the next backup. Each time we perform the backup, we would record the key ranges encountered and use that information to form better ranges for the next backup to be performed in parallel. This can be continuously refined as the dataset changes. In fact, with the above approach, one could list the blobs from the prior backup, get the key ranges from the blob names, and use this information to form ranges of the desired size for the next parallel backup.

  3. You may want to store a version or other metadata in the xml and with the blob so that it can be used during restore.

  4. You may also wish to store the blocks in a compressed format (minimal sketches for this and the previous item follow this list).
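
For items 3 and 4, here are minimal sketches (assumptions, not part of the original solution). The first gzip-compresses a serialized block before PutBlock; the restore path would then need a matching decompression step when reading each block. The second stamps a committed backup blob with format/version metadata that a restore tool can check:

// Sketch for item 4 (not from the original post): gzip-compress the serialized block
// before uploading it. Restore would need to wrap each block's response stream in a
// GZipStream with CompressionMode.Decompress.
private static void PutCompressedBlock(CloudBlockBlob backupBlob, string blockId,
    MemoryStream uncompressedBlock, BlobRequestOptions requestOptions)
{
    using (MemoryStream compressed = new MemoryStream())
    {
        // 'true' leaves the target stream open so it can be rewound and uploaded
        using (GZipStream gzip = new GZipStream(compressed, CompressionMode.Compress, true))
        {
            uncompressedBlock.WriteTo(gzip);
        }

        compressed.Seek(0, SeekOrigin.Begin);
        backupBlob.PutBlock(blockId, compressed, null, requestOptions);
    }
}

// Sketch for item 3 (not from the original post): after PutBlockList commits the blob,
// record the backup format so the restore tool knows how to interpret the blocks.
private static void TagBackupBlob(CloudBlockBlob backupBlob)
{
    backupBlob.Metadata["BackupFormatVersion"] = "1";
    backupBlob.Metadata["BlockContentEncoding"] = "gzip";
    backupBlob.SetMetadata();
}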

The code provided in this post is meant to be used as building blocks and not a full-fledged backup and restore tool. The idea was to go over some challenges and the options available to solve them. Please test the code if you intend to use it in your application.


Libraries that support backup


In this section we list the backup tools that we are aware of. We should point out that we have not verified the functionality claimed by these utilities, and their listing does not imply an endorsement by Microsoft. Since these applications have not been verified, it is possible that they could exhibit undesirable behavior.

Windows Azure Storage Backup Utilities           | Blob | Table | Free
-------------------------------------------------|------|-------|-----
Table Storage Backup & Restore for Windows Azure | No   | Yes   | Yes


Also, please email us (using the link on the left of this site) if you have any ideas on the above code, or if we have missed any library that provides data backup for storage accounts, and we will add it to this list.


Jai Haridas

Comments (3)

  1. Anthony Super says:

    Great article thanks. There are a few typos in the code that caught us out:

    – ExecuteBatch function: RowKey = rkElement.Value, not RowKey = pkElement.Value

    – RestoreTo function: IEnumerable<XElement> batchNodes = doc.Element("Block").Elements("Batch"), not IEnumerable<XElement> batchNodes = doc.Element("Batches").Elements("Batch")

    Cheers,

    Anthony.

  2. Mikel says:

    Thanks for this article. I am fairly new to Azure, and there is something I would like to know. Could you give tips on how to integrate this backup solution (and also the earler solution for blobs) with an existing Asp.net/Silverlight application that is running within Azure? For instance, what role should the backup code run under, and what is the recommended way to trigger a backup within the application? We need to automate this as much as possible.

    Thanks, Mikel

  3. Jai Haridas (MSFT) says:

    @Mikel, I would run this in a worker role that gets the schedule for backup from an Azure Table. You can now modify the "schedule" row to change the frequency/time of backup. The table itself can be modified either via an ASP.NET/Silverlight app or using some app which allows you to modify rows in Azure Table. The worker role can get this schedule on startup and from there on, either poll every hour for a change or just sleep until it is time to perform the backup job (if applying the change to schedule is not very important). If you update the schedule via an app, you could trigger the change by updating the config for the worker role using service management APIs.

    Thanks,

    jai