
Getting Started With Windows Azure Blob Storage Using Java

The following is a quick start guide to getting up and running with Windows Azure Blob storage using Java.  The first thing you’ll need to do is set up a Windows Azure account, which you can do here: https://www.windowsazure.com

Next you’ll need to download the Azure SDK for Java, which is available here: https://www.windowsazure.com/en-us/develop/java/

Azure SDK Download

I found it easier to simply download the libraries and include the Azure JAR file (microsoft-windowsazure-api-0.3.2.jar) directly.

Windows Azure Libraries

Next you’ll need to capture your Account Name and Account Key.

Windows Azure Dashboard

Windows Azure Keys

Now for the fun part: the code.  I’ve already done a lot of the heavy lifting for you (you’re welcome), so all you’ll really need to do is modify a few values and implement as needed.  The first thing you’ll need to do is take the key values from above and create an Azure connection string like so:

// Replace your-account-name and your-account-key with your account values
public static final String storageConnectionString =
    "DefaultEndpointsProtocol=http;" +
    "AccountName=your-account-name;" +
    "AccountKey=your-account-key";

Now let’s create a client which we’ll use to communicate with Azure Blob storage.
// Parse the connection string
CloudStorageAccount storageAccount = CloudStorageAccount.parse(storageConnectionString);
 
// Create the client
CloudBlobClient blobClient = storageAccount.createCloudBlobClient();

Now that we have a client, we can begin performing actions like uploading, listing, and retrieving files.  Let’s start with listing files.  You’ll need a root container name; this is essentially the name of the bucket that you’re going to be sticking your files in.  Now let’s list some files:
// Change this to your container name
String containerName = "your-container-name";
 
// Create the client
CloudBlobClient blobClient = storageAccount.createCloudBlobClient();
 
// Get a reference to the container
CloudBlobContainer container = blobClient.getContainerReference(containerName);
 
// Create the container if it does not exist
container.createIfNotExist();
 
// List them
for (ListBlobItem blobItem : container.listBlobs())
   System.out.println(blobItem.getUri());

So here’s what’s going on: the blob client we created looks at the container (or root bucket) and generates a list of the blobs there; if the container doesn’t exist, we create it first. Next we iterate the blobs and display the URI for each.  Now let’s say we had a folder structure on that container, something to the effect of an “images” directory with image files inside it.  Here’s how you’d list those files:

// Change this to your container name
String containerName = "your-container-name";
 
// Create the client
CloudBlobClient blobClient = storageAccount.createCloudBlobClient();
 
// Get a reference to the container
CloudBlobContainer container = blobClient.getContainerReference(containerName);
 
// Create the container if it does not exist
container.createIfNotExist();
 
// Get a reference to the images directory
CloudBlobDirectory directory = container.getDirectoryReference("images/");
 
// List them
for (ListBlobItem blobItem : directory.listBlobs())
   System.out.println(blobItem.getUri());

The only real difference between this example and the previous one is whether we list the container or the directory. In this case we get a reference to the cloud directory, then list its contents.  Now that we can list container and directory contents, let’s upload some files:

// Change to your file to upload
File source = new File("path/to/your/file");

// Change this to your container name
String containerName = "your-container-name";

// Create the client
CloudBlobClient blobClient = storageAccount.createCloudBlobClient();

// Get a reference to the container
CloudBlobContainer container = blobClient.getContainerReference(containerName);

// Create the container if it does not exist
container.createIfNotExist();

// Upload the file into the images directory by prefixing the blob name
CloudBlockBlob blob = container.getBlockBlobReference("images/" + source.getName());
blob.upload(new FileInputStream(source), source.length());

The above example will upload the source file into the images directory of your container. The only thing left to do now is download a file, and the pattern is complete.

// Where are we going to save the downloaded file?
File destination = new File("path/to/your/file");

// Change this to your container name
String containerName = "your-container-name";

// Create the client
CloudBlobClient blobClient = storageAccount.createCloudBlobClient();

// Get a reference to the container
CloudBlobContainer container = blobClient.getContainerReference(containerName);

// Get a reference to the blob inside the images directory
CloudBlockBlob blob = container.getBlockBlobReference("images/" + destination.getName());

// Download it
blob.download(new FileOutputStream(destination));
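
To tie it all together, here’s a minimal end-to-end sketch (create the container, upload a file, then list the container contents).  It assumes microsoft-windowsazure-api-0.3.2.jar is on your classpath; the import paths are my best recollection of the 0.3.x SDK layout, so double-check them against the jar, and the account, container, and file values are placeholders you’ll need to swap in:

import java.io.File;
import java.io.FileInputStream;

import com.microsoft.windowsazure.services.blob.client.CloudBlobClient;
import com.microsoft.windowsazure.services.blob.client.CloudBlobContainer;
import com.microsoft.windowsazure.services.blob.client.CloudBlockBlob;
import com.microsoft.windowsazure.services.blob.client.ListBlobItem;
import com.microsoft.windowsazure.services.core.storage.CloudStorageAccount;

public class BlobQuickStart {

    // Replace your-account-name and your-account-key with your account values
    public static final String storageConnectionString =
        "DefaultEndpointsProtocol=http;" +
        "AccountName=your-account-name;" +
        "AccountKey=your-account-key";

    public static void main(String[] args) throws Exception {
        // Parse the connection string and create the blob client
        CloudStorageAccount storageAccount = CloudStorageAccount.parse(storageConnectionString);
        CloudBlobClient blobClient = storageAccount.createCloudBlobClient();

        // Get the container, creating it if it does not exist
        CloudBlobContainer container = blobClient.getContainerReference("your-container-name");
        container.createIfNotExist();

        // Upload a file into the images directory
        File source = new File("path/to/your/file");
        CloudBlockBlob blob = container.getBlockBlobReference("images/" + source.getName());
        blob.upload(new FileInputStream(source), source.length());

        // List the top-level contents of the container
        for (ListBlobItem blobItem : container.listBlobs())
            System.out.println(blobItem.getUri());
    }
}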

And that’s the whole enchilada. Hopefully this helped you save a little bit of time and got you up to speed quickly…and I’m out.

Web Developer Essential Tools

Foreword

The following highlights what I believe to be the best tools for web development and best of all, they’re all free.  This is a living reference which will be updated as new and better tools are found.  Feedback and suggestions are always welcome.

The List

Google Chrome

Light, extremely fast, jam-packed with features, and a debugger that makes short work of CSS and JavaScript.  The ability to right-click and inspect an element, then change CSS values live in the DOM using the up/down arrows or by entering values directly, makes this an essential tool for web development.

Get Google Chrome

ColorPic

A fantastic color-picking tool that lets you grab the color value of anything on the screen.  It supports creating and loading palettes, and values can be adjusted with sliders to easily lighten or darken tones.  Shortcuts make color selection a cinch.  A must-have tool.

Get ColorPic

Tape: Google Chrome Extension

You know that problem in CSS where you’re trying to line up elements horizontally or vertically, and you end up holding paper against the screen to see if the alignment is correct?  Well, you can forget about doing that ever again.  The Tape extension allows you to easily place lines on the screen, then adjust your CSS to match.

Get Tape Extension

XML Tree: Google Chrome Extension

Displays XML, RSS, or any form of node tree in a collapsible, human-readable format.

Get XML Tree Extension

Awesome Screenshot: Google Chrome Extension

Quickly screen-grab what’s being displayed in the browser, then draw, blur, annotate, write, etc., all from within the browser. Save when you’re done, either locally or in the cloud.

Get Awesome Screenshot Extension

LightTable

LightTable is one of the most robust and unique editors I have ever used.  It’s extremely powerful, easily customized, and has more features than I can list here.

Get LightTable

GIMP

If you’re used to Photoshop, there will be a bit of a learning curve in adjusting to GIMP’s shortcuts and approach to image manipulation. However, once you get used to them, you won’t want to go back. Select an area, resize the selection by mouse or by shrink/grow amounts, copy, then shortcut-paste into a new image frame. Fast. No more clicking save, selecting an extension type from a drop-down, then confirm, confirm, confirm. Instead: shortcut for save, enter the filename with its extension (.png, .jpg, .gif, etc.), hit enter, enter, done. Made for rapid manipulation of images, with tons of plugins and filters. A great tool.

Get GIMP

Using Hadoop And PHP

Getting Started

So first things first.  If you haven’t used Hadoop before, you’ll first need to download a Hadoop release and make sure you have Java and PHP installed.  To download Hadoop, head over to:

http://hadoop.apache.org/common/releases.html

Click on download a release and choose a mirror.  I suggest choosing the most recent stable release.  Once you’ve downloaded Hadoop, unzip it.

user@computer:$ tar xpf hadoop-0.20.2.tar.gz

I like to create a symlink to the hadoop-<release> directory to make things easier to manage.

user@computer:$ ln -s hadoop-0.20.2 hadoop

Now you should have everything you need to start creating a Hadoop PHP job.

Creating The Job

For this example I’m going to create a simple Map/Reduce job for Hadoop.  Let’s start by understanding what we want to happen.

  1. We want to read from an input system – this is our mapper
  2. We want to do something with what we’ve mapped – this is our reducer

At the root of your development directory, let’s create another directory called script.  This is where we’ll store our PHP mapper and reducer files.

user@computer:$ ls -a

.
..
hadoop-0.20.2
hadoop-0.20.2.tar.gz
hadoop
user@computer:$ mkdir script

Now let’s begin creating our mapper script in PHP.  Go ahead and create a PHP file called mapper.php under the script directory.

user@computer:$ touch script/mapper.php

Now let’s look at the basic structure of a PHP mapper.

#!/usr/bin/php
<?php
//this can be anything from reading input from files, to retrieving database content, soap calls, etc.
//for this example I'm going to create a simple php associative array.
$a = array(
'first_name' => 'Hello',
'last_name' => 'World'
);
//it's important to note that anything you write to STDOUT becomes the mapper's output, which is what the reducer will receive.
//it's also important to note: do not forget to end all output to STDOUT with a PHP_EOL, this will save you a lot of pain.
echo serialize($a), PHP_EOL;
?>

So this example is extremely simple.  Create a simple associative array and serialize it.  Now onto the reducer.  Create a PHP file in the script directory called reducer.php.

user@computer:$ touch script/reducer.php

Now let’s take a look at the layout of a reducer.

#!/usr/bin/php
<?php
 
//Remember when I said anything written to STDOUT in our mapper would go to the reducer?
//Well, now we read from STDIN to get the result of our mapper.
//iterate all lines of output from our mapper
while (($line = fgets(STDIN)) !== false) {
    //remove leading and trailing whitespace, just in case 🙂
    $line = trim($line);
    //now recreate the array we serialized in our mapper
    $a = unserialize($line);
    //Now, we do whatever we need to with the data.  Write it out again so another process can pick it up,
    //send it to the database, soap call, whatever.  In this example, just change it a little and
    //write it back out.
    $a['middle_name'] = 'Jason';
    //do not forget the PHP_EOL
    echo serialize($a), PHP_EOL;
}//end while
?>

So now we have a very simple mapper and reducer ready to go.
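
Before bringing Hadoop into the picture, you can sanity-check the pair by piping the mapper straight into the reducer.  This just runs the two scripts locally, no cluster required:

user@computer:$ php script/mapper.php | php script/reducer.php

a:3:{s:10:"first_name";s:5:"Hello";s:9:"last_name";s:5:"World";s:11:"middle_name";s:5:"Jason";}

If the serialized array comes back with the middle_name added, both scripts are behaving.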

Execution

So now let’s run it and see what happens.  But first, a little prep work.  We need to specify the input directory that will be used when the job runs.

user@computer:$ mkdir input
user@computer:$ touch input/conf

Ok, that was difficult.  We have an input directory and we’ve created an empty conf file.  The empty conf file is just something that the mapper will use to get started; for now, don’t worry about it.  Now let’s run this bad boy.  Make sure you have your JAVA_HOME set; it usually points at the /usr directory.  You can set it like so:

user@computer:$ export JAVA_HOME=/usr

user@computer:$ hadoop/bin/hadoop jar hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar -mapper script/mapper.php -reducer script/reducer.php -input input/* -output output

So here’s what the command does.  The first part executes the Hadoop execute script.  The “jar” argument tells Hadoop to use a jar, in this case “hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar”.  Next we pass the mapper and reducer arguments to the job and specify the input and output directories.  If we wanted to, we could pass configuration information (or files, etc.) to the mapper; we would just use the same line-read structure that we used in the reducer to get the information, and that is what would go in the input directory if we needed it (see the sketch after the run output below).  For this example, we pass nothing.  The output directory will contain the output of our reducer; in this case, if everything works out correctly, it will contain the PHP-serialized form of our modified $a array.  If all goes well you should see something like this:

user@computer:$ hadoop/bin/hadoop jar hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar -mapper script/mapper.php -reducer script/reducer.php -input input/* -output output

10/12/10 12:53:56 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
10/12/10 12:53:56 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
10/12/10 12:53:56 INFO mapred.FileInputFormat: Total input paths to process : 1
10/12/10 12:53:56 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop-root/mapred/local]
10/12/10 12:53:56 INFO streaming.StreamJob: Running job: job_local_0001
10/12/10 12:53:56 INFO streaming.StreamJob: Job running in-process (local Hadoop)
10/12/10 12:53:56 INFO mapred.FileInputFormat: Total input paths to process : 1
10/12/10 12:53:56 INFO mapred.MapTask: numReduceTasks: 1
10/12/10 12:53:56 INFO mapred.MapTask: io.sort.mb = 100
10/12/10 12:53:57 INFO mapred.MapTask: data buffer = 79691776/99614720
10/12/10 12:53:57 INFO mapred.MapTask: record buffer = 262144/327680
10/12/10 12:53:57 INFO streaming.PipeMapRed: PipeMapRed exec [/root/./script/mapper.php]
10/12/10 12:53:57 INFO streaming.PipeMapRed: MRErrorThread done
10/12/10 12:53:57 INFO streaming.PipeMapRed: Records R/W=0/1
10/12/10 12:53:57 INFO streaming.PipeMapRed: MROutputThread done
10/12/10 12:53:57 INFO streaming.PipeMapRed: mapRedFinished
10/12/10 12:53:57 INFO mapred.MapTask: Starting flush of map output
10/12/10 12:53:57 INFO mapred.MapTask: Finished spill 0
10/12/10 12:53:57 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
10/12/10 12:53:57 INFO mapred.LocalJobRunner: Records R/W=0/1
10/12/10 12:53:57 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
10/12/10 12:53:57 INFO mapred.LocalJobRunner:
10/12/10 12:53:57 INFO mapred.Merger: Merging 1 sorted segments
10/12/10 12:53:57 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 70 bytes
10/12/10 12:53:57 INFO mapred.LocalJobRunner:
10/12/10 12:53:57 INFO streaming.PipeMapRed: PipeMapRed exec [/root/./script/reducer.php]
10/12/10 12:53:57 INFO streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
10/12/10 12:53:57 INFO streaming.PipeMapRed: Records R/W=1/1
10/12/10 12:53:57 INFO streaming.PipeMapRed: MROutputThread done
10/12/10 12:53:57 INFO streaming.PipeMapRed: MRErrorThread done
10/12/10 12:53:57 INFO streaming.PipeMapRed: mapRedFinished
10/12/10 12:53:57 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
10/12/10 12:53:57 INFO mapred.LocalJobRunner:
10/12/10 12:53:57 INFO mapred.TaskRunner: Task attempt_local_0001_r_000000_0 is allowed to commit now
10/12/10 12:53:57 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to file:/root/output
10/12/10 12:53:57 INFO mapred.LocalJobRunner: Records R/W=1/1 > reduce
10/12/10 12:53:57 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
10/12/10 12:53:57 INFO streaming.StreamJob:  map 100%  reduce 100%
10/12/10 12:53:57 INFO streaming.StreamJob: Job complete: job_local_0001
10/12/10 12:53:57 INFO streaming.StreamJob: Output: output

If you get errors where it’s complaining about the output directory, just remove the output directory and try again.
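
As mentioned above, the input directory is how you would feed configuration to the mapper: Hadoop streaming hands each line of the input files to the mapper’s STDIN.  Here’s a minimal sketch of what that might look like, assuming a hypothetical key=value format in input/conf (the format is purely for illustration):

#!/usr/bin/php
<?php
//hadoop streaming feeds each line of the input files (e.g. input/conf) to STDIN
$config = array();
while (($line = fgets(STDIN)) !== false) {
    $line = trim($line);
    //skip blank or malformed lines
    if ($line === '' || strpos($line, '=') === false) {
        continue;
    }
    //hypothetical key=value configuration format
    list($key, $value) = explode('=', $line, 2);
    $config[$key] = $value;
}
//use $config to drive the real mapping work; here we just emit it
echo serialize($config), PHP_EOL;
?>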

Result

Once you’ve got something similar to the above and no errors, we can check out the result.

user@computer:$ cat output/*

a:3:{s:10:"first_name";s:5:"Hello";s:9:"last_name";s:5:"World";s:11:"middle_name";s:5:"Jason";}

There we go, a serialized form of our modified PHP array $a.  That’s all there is to it.  Now, go forth and Hadoop.