We are pleased to announce an updated release of the Windows Azure Storage Client for Java. This release includes several notable features such as logging support, new API overloads, and full support for the 2013-08-15 REST storage service version (see here for details). As usual, all of the source code is available via GitHub (note the updated location). You can download the latest binaries via Maven:
<dependency>
  <groupId>com.microsoft.windowsazure.storage</groupId>
  <artifactId>microsoft-windowsazure-storage-sdk</artifactId>
  <version>0.5.0</version>
</dependency>
Emulator Guidance
Please note that the 2013-08-15 REST version is currently unsupported by the storage emulator. An updated Windows Azure Storage Emulator is expected to ship with full support for these new features in the next couple of months. In the interim, users attempting to develop against the current version of the Storage Emulator will receive Bad Request errors. Until then, users wanting to use the new features will need to develop and test against a Windows Azure Storage account to leverage the 2013-08-15 REST version.
Samples
We have provided a series of samples on github to help clients get up and running with each storage abstraction and to illustrate some additional key scenarios. To run a given sample using Eclipse, simply load the samples project and update the following line in Utility.java to provide your storage credentials.
public static final String storageConnectionString = "DefaultEndpointsProtocol=http;AccountName=[ACCOUNT_NAME];AccountKey=[ACCOUNT_KEY]";
If you wish to use Fiddler to inspect traffic while running the samples, please uncomment the following two lines in Utility.java.
// System.setProperty("http.proxyHost", "localhost");
// System.setProperty("http.proxyPort", "8888");
After updating Utility.java, right-click on the specific project you want to run and click Run As > Java Application.
A Note about Packaging and Versioning
We have migrated the Storage package out of the larger Windows Azure SDK for Java for this release. Developers who are currently leveraging the existing SDK will need to update their dependencies accordingly. Furthermore, the package names have been changed to reflect this new structure:
- com.microsoft.windowsazure.storage – RetryPolicies, LocationMode, StorageException, StorageCredentials, etc.; all public classes that are common across services
- com.microsoft.windowsazure.storage.blob – Blob convenience implementation; applications utilizing Windows Azure Blobs should include this package in their import statements
- com.microsoft.windowsazure.storage.queue – Queue convenience implementation; applications utilizing Windows Azure Queues should include this package in their import statements
- com.microsoft.windowsazure.storage.table – Table convenience implementation; applications utilizing Windows Azure Tables should include this package in their import statements
For a more detailed list of changes in this release, please see the Change Log & Breaking Changes section below.
We are also adopting the SemVer specification for all of the storage client SDK components we provide. This will help provide consistent and predictable versioning guidance to developers who leverage the SDK.
What's New
The 0.5.0 version of the Java client library provides full support for the 2013-08-15 REST service version (you can read more about the supported features here), as well as key client improvements listed below.
Support for Read Access Geo Redundant Storage
This release has full support for read access to the storage account data in the secondary region. This functionality must be enabled via the portal for a given storage account. You can read more about RA-GRS here. As mentioned in the blog, there is a getServiceStats API on Cloud[Blob|Table|Queue]Client that allows applications to easily retrieve the replication status and LastSyncTime for each service. Setting the LocationMode on the client object and invoking getServiceStats is shown in the example below. The LocationMode can also be configured on a per-request basis by setting it on the RequestOptions object.
CloudStorageAccount httpAcc = CloudStorageAccount.parse(connectionString);
CloudTableClient tClient = httpAcc.createCloudTableClient();

// Set the LocationMode to SECONDARY_ONLY since getServiceStats is supported only on the secondary endpoints.
tClient.setLocationMode(LocationMode.SECONDARY_ONLY);

ServiceStats stats = tClient.getServiceStats();
Date lastSyncTime = stats.getGeoReplication().getLastSyncTime();
System.out.println(String.format("Replication status = %s and LastSyncTime = %s",
    stats.getGeoReplication().getStatus().toString(),
    lastSyncTime != null ? lastSyncTime.toString() : "empty"));
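As a sketch, the per-request alternative mentioned above looks like the following (the table reference and keys are illustrative; this assumes an existing CloudTable named table):

```java
// Sketch: configure LocationMode for a single request, leaving the client default untouched.
TableRequestOptions options = new TableRequestOptions();
options.setLocationMode(LocationMode.PRIMARY_THEN_SECONDARY);

// Only this retrieve will fall back to the secondary endpoint; other requests use the client setting.
TableResult result = table.execute(
    TableOperation.retrieve("partKey", "rowKey", DynamicTableEntity.class),
    options, null /* operationContext */);
```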
Expanded Table Protocol Support (JSON)
In the previous release, all table traffic was sent using the AtomPub protocol. With the current release, the default protocol is now JSON minimal metadata. (You can read more details regarding these protocols, as well as view sample payloads, here.) This improvement allows the client to dramatically reduce the payload size of the request as well as the CPU required to process it. These improvements allow client applications to scale higher and realize lower overall latencies for table operations. An example of setting the tablePayloadFormat on the client object is shown below. The tablePayloadFormat can also be configured on a per-request basis by setting it on the TableRequestOptions object.
CloudStorageAccount httpAcc = CloudStorageAccount.parse(connectionString);
CloudTableClient tClient = httpAcc.createCloudTableClient();

// Set the payload format to JsonNoMetadata.
tClient.setTablePayloadFormat(TablePayloadFormat.JsonNoMetadata);
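The payload format can likewise be set for an individual request. A minimal sketch (assuming an existing CloudTable reference named table; the keys are illustrative):

```java
// Sketch: request full metadata for one query while the client default remains JsonNoMetadata.
TableRequestOptions options = new TableRequestOptions();
options.setTablePayloadFormat(TablePayloadFormat.JsonFullMetadata);

TableResult result = table.execute(
    TableOperation.retrieve("partKey", "rowKey", DynamicTableEntity.class),
    options, null /* operationContext */);
```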
When using JsonNoMetadata, the client library will infer the property types by inspecting the type information on the POJO entity type provided by the client. Additionally, in some scenarios clients may wish to provide the property type information at runtime, such as when querying with DynamicTableEntity or performing complex queries that may return heterogeneous entities. To support this scenario, the user should implement PropertyResolver, which allows users to return an EdmType for each property based on the data received from the service. The sample below illustrates a PropertyResolver implementation.
public static class Class1 extends TableServiceEntity implements PropertyResolver {
    private String A;
    private byte[] B;

    public String getA() {
        return this.A;
    }

    public byte[] getB() {
        return this.B;
    }

    public void setA(final String a) {
        this.A = a;
    }

    public void setB(final byte[] b) {
        this.B = b;
    }

    @Override
    public EdmType propertyResolver(String pk, String rk, String key, String value) {
        if (key.equals("A")) {
            return EdmType.STRING;
        }
        else if (key.equals("B")) {
            return EdmType.BINARY;
        }
        return null;
    }
}
This PropertyResolver is set on the TableRequestOptions as shown below.
Class1 ref = new Class1();
ref.setA("myPropVal");
ref.setB(new byte[] { 0, 1, 2 });
ref.setPartitionKey("testKey");
ref.setRowKey(UUID.randomUUID().toString());

TableRequestOptions options = new TableRequestOptions();
options.setPropertyResolver(ref);
Table Insert Optimizations
In previous versions of the REST API, the Prefer header was not supported for Insert operations. As a result, the service would echo back the entity content in the response body. With this release, all Table Insert operations, including those executed as part of a batch operation, send the Prefer: return-no-content header to avoid this behavior. This optimization can dramatically reduce latencies for insert operations. Please note that this causes the resulting HTTP status code on the TableResult for successful inserts to be 204 (No Content) rather than 201 (Created). The echo content behavior can be re-enabled by using the insert(TableEntity, boolean) method and specifying true.
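A minimal sketch of both behaviors (assuming an existing CloudTable reference named table and the Class1 entity type from the PropertyResolver sample):

```java
Class1 entity = new Class1();
entity.setPartitionKey("testKey");
entity.setRowKey(UUID.randomUUID().toString());

// Default: the Prefer: return-no-content header is sent; a successful insert returns HTTP 204.
TableResult noEcho = table.execute(TableOperation.insert(entity));

// Re-enable echo content; a successful insert returns HTTP 201 with the entity in the response body.
entity.setRowKey(UUID.randomUUID().toString());
TableResult withEcho = table.execute(TableOperation.insert(entity, true));
```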
Table Reflection Optimizations
When clients persist POJO objects to the Table service, the client can now cache the type and property information to avoid repeated reflection calls. This optimization can dramatically reduce CPU usage during queries and other table operations. Note that clients can disable this cache by calling TableServiceEntity.setReflectedEntityCacheDisabled(true).
New APIs and overloads
In response to customer feedback, we have expanded the API surface to add additional conveniences, including:
- CloudBlob.downloadRange
- CloudBlob.downloadToByteArray
- CloudBlob.downloadRangeToByteArray
- CloudBlob.uploadFromByteArray
- CloudBlob.downloadToFile
- CloudBlob.uploadFromFile
- CloudBlockBlob.uploadText
- CloudBlockBlob.downloadText
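As a sketch of the new file and text conveniences (the container reference, blob names, and file paths are illustrative):

```java
// Sketch: round-trip a blob through the new file overloads, avoiding manual stream handling.
CloudBlockBlob blob = container.getBlockBlobReference("data.bin");
blob.uploadFromFile("/tmp/data.bin");
blob.downloadToFile("/tmp/data-copy.bin");

// The text conveniences are handy for small string payloads.
CloudBlockBlob textBlob = container.getBlockBlobReference("greeting.txt");
textBlob.uploadText("Hello, Azure Storage!");
String roundTripped = textBlob.downloadText();
```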
Logging
The 0.5.0 release supports logging via the SLF4J logging facade. This allows users to utilize various logging frameworks in conjunction with the storage API in order to log information regarding request execution (see below for a table of what information is logged). We plan to provide additional rich logging information in subsequent releases to further assist clients in debugging their applications.
Logged Data
Each log line will include the following data:
- Client Request ID: Per request ID that is specified by the user in OperationContext
- Event: Free-form text
- Any other information, as determined by the user-chosen underlying logging framework
For example, if the underlying logging framework chosen is Simple, the log line might look something like the following:
[main] INFO ROOT - {c88f2fe5-7150-467b-8571-7aa40a9a3e27}: {Starting operation.}
Trace Levels
| Level | Events |
| --- | --- |
| ERROR | If an exception cannot or will not be handled internally and will be thrown to the user, it is logged as an error. |
| WARN | If an exception is caught and might be handled internally, it is logged as a warning. The primary use case is the retry scenario, where an exception is not thrown back to the user so that the operation can be retried. This can also happen in operations such as CreateIfNotExists, where we handle the 404 error silently. |
| INFO | The following info will be logged: |
| DEBUG | Nothing is logged at this level currently. |
| TRACE | Nothing is logged at this level currently. |
*Please take care when enabling logging while using SAS, as the SAS tokens themselves will be logged. Clients using SAS with logging enabled should take care to protect the logging output of their application.
Enabling Logging
A key concept is the opt-in / opt-out model that the client provides for tracing. In typical applications, it is customary to enable tracing at a given verbosity for a specific class. This works fine for many client applications; however, for cloud applications executing at scale, this approach may generate much more data than the user requires. As such, we have provided an opt-in model for logging which allows clients to configure listeners at a given verbosity but only log specific requests if and when they choose. Essentially, this design gives users the ability to perform "vertical" logging across layers of the stack targeted at specific requests, rather than "horizontal" logging, which would record all traffic seen by a specific class or layer.
Another key concept is the facade logging model. SLF4J is a facade and does not provide a logging framework. Instead, the user may choose the underlying logging system, whether it be the built-in JDK logger or one of the many open-source alternatives (see the SLF4J site for a list of compatible loggers). Once a logger is selected, clients can add the corresponding jar which binds the facade to the chosen framework, and the application will log using the settings specific to that framework. As a result, if the user has already chosen a logging framework for their application, the storage SDK will work with that framework rather than requiring a separate one. The facade model allows users to change the logging framework easily throughout the development process and avoid any framework lock-in.
Choose a Logging Framework
To enable logging, first choose a logging framework. If you already have one, simply add the corresponding SLF4J binding jar to your classpath or add a dependency on it in Maven. If you do not already have a logging framework, choose one and add it to your classpath or add a Maven dependency on it; if it does not natively implement SLF4J, you will also need to add the corresponding SLF4J binding jar. SLF4J comes with its own logger implementation called Simple, so we will use that in the example below. Either download slf4j-simple-1.7.5.jar from SLF4J and add it to your classpath, or include the following Maven dependency:
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-simple</artifactId>
  <version>1.7.5</version>
</dependency>
This will enable logging to the console with a default log level of info. To change the logging settings, follow the directions for Simple.
Turn Logging on in the SDK
By default, the Azure Storage Java SDK will not produce any logs, even if it finds a logging framework and corresponding SLF4J binding. This way, the user is not forced to edit any log framework settings if they do not want Azure Storage logs. If logging is turned on, the root logger settings will be used by default. A different logger may be specified on a per request basis.
Example 1: Enable logging on for every request:
OperationContext.setLoggingEnabledByDefault(true);
Example 2: Enable logging for a single request:
OperationContext ctx = new OperationContext();
ctx.setLoggingEnabled(true);
blockBlobRef.upload(srcStream, -1, null /* accessCondition */, null /* requestOptions */, ctx);
Example 3: To use a specific logger for a request:
OperationContext ctx = new OperationContext();
// turn logging on for that operation context
ctx.setLoggingEnabled(true);
// the SLF4J logger factory will get the logger with this name and use
// this logger's settings, including where to log and the log level
ctx.setLogger(LoggerFactory.getLogger("MyLogger"));
blockBlobRef.upload(srcStream, -1, null /* accessCondition */, null /* requestOptions */, ctx);
With client-side logging used in conjunction with storage service logging, clients can now get a complete view of their application from both the client and server perspectives.
Change Log & Breaking Changes
As mentioned above, this release supports the 2013-08-15 REST service version; details of the features and changes in that version can be found on MSDN and are also blogged here. In addition to the REST changes, this release includes several client-side changes and features. Key changes to note for this release are highlighted below. You can view the complete ChangeLog and BreakingChanges log on GitHub.
Common
- Package Restructure
- RetryResult has been replaced by RetryInfo which provides additional functionality
- Event operations (including event firing) that occur during a request are no longer synchronized; thread safety is now guaranteed by a CopyOnWriteArrayList of the event listeners
- OperationContext.sendingRequest event is now fired prior to the connection being established, allowing users to alter headers
Blob
- Blob downloadRange now downloads to a Stream. The previous downloadRange has been renamed to downloadRangeToByteArray.
- Removed sparse page blob feature
- CloudBlobContainer.createIfNotExist was renamed to CloudBlobContainer.createIfNotExists
- CloudBlobClient.streamMinimumReadSizeInBytes has been removed. This functionality is now provided by CloudBlob.streamMinimumReadSizeInBytes (settable per-blob, not per-client.)
- CloudBlobClient.pageBlobStreamWriteSizeInBytes and CloudBlobClient.writeBlockSizeInBytes have been removed. This functionality is now provided by CloudBlob.streamWriteSizeInBytes.
Table
- Removed id field (along with getId, setId) from TableResult
- CloudTable.createIfNotExist was renamed to CloudTable.createIfNotExists
- Inserts in operations no longer echo content. Echo content can be re-enabled by using the insert(TableEntity, boolean) method and specifying true. This will cause the resulting HTTP status code on the TableResult for successful inserts to be 204 (no-content) rather than 201 (Created).
- JsonMinimalMetadata is now the default payload format (rather than AtomPub). Payload format can be specified for all table requests by using CloudTableClient.setTablePayloadFormat or for an individual table request by using TableRequestOptions.setTablePayloadFormat.
Queue
- CloudQueue.createIfNotExist was renamed to CloudQueue.createIfNotExists
Summary
We are continuously making improvements to the developer experience for Windows Azure Storage and very much value your feedback in the comments section below, the forums, or GitHub. If you hit any issues, filing them on GitHub will also allow you to track the resolution.
Joe Giardino, Veena Udayabhanu, Emily Gerner, and Adam Sorrin
Resources
Windows Azure Storage Release - Introducing CORS, JSON, Minute Metrics, and More
Windows Azure Tables: Introducing JSON
Windows Azure Storage Redundancy Options and Read Access Geo Redundant Storage
"CloudBlob.uploadFromByteArray" is duplicated.