Posts Tagged 'NoSQL'

Implementing a Data Storage Tier in the Cloud Using Amazon DynamoDB

One of the most important aspects of application design is the storage tier. For many developers a relational database has traditionally been the engine of choice. While that is often a good solution, the cloud has certainly brought other options to light. NoSQL is increasingly gaining popularity as an alternative that offers internet-class scalability when the benefits and overhead associated with a relational database are neither needed nor desired.

Of course, good system design would incorporate a logical model of the data anyway, abstracted from any particular physical storage technology. That logical model could be implemented using a variety of products, which would mitigate the risk of vendor lock-in. That is not, however, a subject I am going to cover in this post. Here I am just going to move ahead with how a simple data model can be implemented using Amazon DynamoDB.

It is very easy to get started using DynamoDB. After you subscribe to the service you can use the AWS Console to provision resources. At some point, though, you will have to write some code. Amazon provides an SDK for Java, .NET and PHP. If you use the Eclipse or Visual Studio IDE there is also a plug-in toolkit which will enable you to easily create projects that use DynamoDB and will allow you to interactively create tables right from within your development environment. You can also create tables in your code.

Figure 1. Installing the AWS toolkit for Visual Studio allows access to AWS Services from within the IDE

NoSQL databases such as DynamoDB are schema-less. That means tables are defined simply in terms of a hash key and an optional range key. Tables contain items and each item has one or more attributes. Attributes are simple name-value pairs. Items in a table do not necessarily all have to have the same attributes. Tables in DynamoDB are also provisioned for a “throughput capacity” (read and write) and may be configured for alarms using CloudWatch. Relationship semantics between the tables are handled in the code. While this may seem a little strange to someone who is used to working only with relational databases it does allow for a lot of flexibility.
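To make the item model concrete, here is a minimal sketch in plain Python of a table keyed by a hash key and a range key, where items need not share the same attributes. It is an in-memory stand-in, not the DynamoDB API, and the key values and attribute names (`Course`, `Players`, `Weather`, `Name`) are hypothetical, chosen only for illustration.

```python
# In-memory stand-in for a schema-less table; this is NOT the DynamoDB API.
# Items are keyed by (hash key, range key); all attribute names are hypothetical.
rounds = {}


def put_item(table, hash_key, range_key, **attributes):
    """Store an item as a plain dict of name-value pairs."""
    table[(hash_key, range_key)] = dict(attributes)


def get_item(table, hash_key, range_key):
    """Return the item stored under the given key, or None if absent."""
    return table.get((hash_key, range_key))


# Two items in the same table with different attribute sets.
put_item(rounds, "round-1", "2012-03-04", Course="Pine Hills", Players=["p1", "p2"])
put_item(rounds, "round-2", "2012-03-11", Course="Oak Glen", Weather="windy")

# Relationships between tables are resolved in application code.
players = {"p1": {"Name": "Alice"}, "p2": {"Name": "Bob"}}
first_round = get_item(rounds, "round-1", "2012-03-04")
names = [players[pid]["Name"] for pid in first_round["Players"]]
```

The second item carries a `Weather` attribute that the first lacks, and the lookup of player names is an ordinary loop in the application rather than a join performed by the database.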

Figure 2. Creating a table in DynamoDB

For example the application I am building is going to be used to track golf betting games that might occur in a regular weekend choose-up. On any given Sunday’s foursome there may be one or more of these games in play. Each game has a slightly different set of rules, point count, stake and payout logic. The goal of this application is to minimize confusion when it comes time to settle up on the 19th hole!

Now, this application could become pretty complex because of all of the possible sub-games that could occur in a round. Also, golf betting is usually done based on net strokes per hole, which may be taken as they lie or played off the low handicap. We will leave most of these complications aside for now and start by building something simple which can be refined later. This, to me anyway, is one of the beauties of a NoSQL approach to storage.

So, to get started, let’s say at a minimum I will need to store data for each Round, the Players involved and the Games in play. I will create three tables in code as follows after downloading the AWS SDK and re-using some sample code:

Figure 3. Creating Rounds table in code. Players and Games are similar but do not require a Range Key.
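As a rough illustration of what creating these three tables in code involves, the sketch below builds request parameters in the shape of the DynamoDB CreateTable operation. It uses Python rather than the SDK sample code the post refers to. The key attribute names (`RoundId`, `RoundDate`, `PlayerId`, `GameId`), the string key type, and the capacity values are all assumptions made for the example.

```python
def create_table_params(name, hash_key, range_key=None, reads=5, writes=5):
    """Build keyword arguments in the shape of a DynamoDB CreateTable request."""
    key_schema = [{"AttributeName": hash_key, "KeyType": "HASH"}]
    # Only key attributes are declared up front; "S" means a string attribute.
    attribute_definitions = [{"AttributeName": hash_key, "AttributeType": "S"}]
    if range_key is not None:
        key_schema.append({"AttributeName": range_key, "KeyType": "RANGE"})
        attribute_definitions.append(
            {"AttributeName": range_key, "AttributeType": "S"}
        )
    return {
        "TableName": name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": reads,
            "WriteCapacityUnits": writes,
        },
    }


# Rounds has a hash key and a range key; Players and Games have a hash key only.
tables = [
    create_table_params("Rounds", "RoundId", range_key="RoundDate"),
    create_table_params("Players", "PlayerId"),
    create_table_params("Games", "GameId"),
]

# With boto3 installed and AWS credentials configured, each could be sent as:
#   import boto3
#   boto3.client("dynamodb").create_table(**params)
```

Keeping the parameters as plain dictionaries makes the table definitions easy to inspect before any AWS call is made.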

Next week we will populate these tables and start using them for storage in the application.

To learn more about cloud computing with Amazon, check out Learning Tree’s course, Cloud Computing with Amazon Web Services.

Kevin Kell

What is NoSQL?

Today saw the release of Spring Hadoop, the Spring Framework's support for working with Hadoop. This is an addition to the Spring Data project, which provides support for working with the many non-relational data storage facilities now available to application developers. These storage solutions are often termed NoSQL, Big Data, Big Table and Cloud Storage, amongst many others. Probably the most common term is NoSQL storage, which distinguishes them from relational databases. To help clarify the types of storage solutions available, I have listed them below with example implementations of each.

Column Stores enable data to be stored in a large grid structure. Data is accessed based on column values using bespoke query syntax. Examples include Cassandra, Google's Bigtable, Amazon Table Storage and Microsoft Azure Table Storage.

Blob Stores enable the storage of binary objects that are assigned a unique URL in store that can be used to access the data. Examples include Amazon’s Simple Storage Service (S3) and Microsoft Azure Blob storage.

Graph Storage enables data to be stored in a graph of related objects, such as people and their friends on Facebook. An example of graph storage is Neo4j.

Document Storage stores data in document form rather than individual values. The most popular example is MongoDB.

Key Value Storage is typically used for caches and extremely fast data lookup. An example is Redis.

Hadoop is a large distributed file store that facilitates processing large-scale data in an efficient manner.

So to summarise, with NoSQL there are a variety of different data stores available. These have evolved rapidly because the data in today's applications varies along three dimensions: volume, velocity and variety. The velocity is the rate at which the data grows or changes. Based on your data requirements there is a suitable solution available, and it may well be a NoSQL solution. Many of these NoSQL data stores are discussed in detail in Learning Tree's Cloud Computing course. If you are interested, why not consider attending?

Chris Czarnecki

Amazon DynamoDB Ups the Ante for NoSQL Database Service

This past week I watched with great interest as Amazon CTO Werner Vogels announced the launch of Amazon’s DynamoDB service. I feel that rather than trying to say something pithy I will just recommend that you check it out for yourselves.

DynamoDB is a NoSQL database service that is, in my opinion, head and shoulders above what Amazon previously offered with SimpleDB.

DynamoDB removes almost all of the administrative burden associated with provisioning a database for an application. Developers can simply create a database and assume it will be available to store and retrieve any amount of data and serve any level of traffic that may materialize. DynamoDB handles all the load balancing for you transparently behind the scenes.

Unlike some NoSQL databases, DynamoDB gives the developer the choice between strong consistency and eventual consistency on every read request. This allows for great control over what happens when data is read. Also, DynamoDB has built-in fault tolerance: it automatically and synchronously replicates data across multiple Availability Zones.
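The consistency choice is expressed per read request. As a minimal sketch, the Python below builds parameters in the shape of the DynamoDB GetItem operation, where the `ConsistentRead` flag selects a strongly consistent read. The `Players` table name, the `PlayerId` key, and the helper function itself are hypothetical.

```python
def get_item_params(table, key, strongly_consistent=False):
    """Build keyword arguments in the shape of a DynamoDB GetItem request."""
    return {
        "TableName": table,
        "Key": key,
        # False requests an eventually consistent read; True a strongly consistent one.
        "ConsistentRead": strongly_consistent,
    }


# Hypothetical key: a string ("S") hash key named PlayerId.
key = {"PlayerId": {"S": "p1"}}
eventual = get_item_params("Players", key)
strong = get_item_params("Players", key, strongly_consistent=True)

# With boto3 installed and AWS credentials configured, a read could be issued as:
#   import boto3
#   boto3.client("dynamodb").get_item(**strong)
```

Strongly consistent reads consume more provisioned read capacity than eventually consistent ones, so it is reasonable to request them only where stale data would actually matter.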

DynamoDB also integrates with Amazon Elastic MapReduce (EMR). For example, it is pretty straightforward to use EMR to analyze data stored in DynamoDB and to archive results in Amazon S3.

DynamoDB is an example of another storage option offered in the cloud. Developers should consider this option for any future development projects they may have.

Kevin Kell

