Tag Archives: php

PHP Best Practices: Models and Data Mining

Most PHP developers using MVC frameworks would argue that the best approach for models is a class with a set of getters and setters.  After a long time working with PHP, and especially with MVC frameworks and data mining/retrieval, I am of a completely different opinion, and I will discuss why.

First, let’s start with the traditional approach, which I will refer to as a “heavy model”.  Consider the following:

class Person {
    private $id = 0;
    private $first_name = null;
    private $last_name = null;
 
    public function setId($id){
        $this->id = $id;
    }
 
    public function getId(){
        return $this->id;
    }
 
    public function setFirstName($first_name){
        $this->first_name = $first_name;
    }
 
    public function getFirstName(){
        return $this->first_name;
    }
 
    public function setLastName($last_name){
        $this->last_name = $last_name;
    }
 
    public function getLastName(){
        return $this->last_name;
    }
}

Admittedly, this is a very rudimentary data model; however, it does follow the standard approach of most MVC frameworks with regard to model layout: getters and setters for private properties.  Its usage might then look something like this:

$p = new Person();
$p->setId(1);
$p->setFirstName('Bob');
$p->setLastName('Dobalina');

Usability-wise, the above approach is very clean and makes perfect sense from an object-oriented standpoint, even if it is rather remedial. However, having lived in PHP for quite some time now, I have two problems with this approach. The first concerns data retrieval, or to be more precise, retrieving data from a database and populating models for use. The other is flexibility and code maintenance. For example, let’s say we’re using PDO to connect to a MySQL database:

$conn = /* get a database connection */;
$sql = 'SELECT * FROM person';
$stmt = $conn->prepare($sql);
$stmt->execute();
 
$people = array();
while($result = $stmt->fetch(PDO::FETCH_ASSOC)){
    $p = new Person();
    $p->setId($result['id']);
    $p->setFirstName($result['first_name']);
    $p->setLastName($result['last_name']);
    $people[] = $p;
}

Examining the above, we see an array of Person objects being populated from a database result set. Ignoring the semantics, this is a pretty common way to retrieve data and populate models. Sure, there are variations, such as putting this logic in methods of the model, but in essence they are all performing this kind of loop somewhere. What if I told you there was a more efficient way to do the above, one that executes faster, is more flexible, and requires less code?

Now consider this example, a modification of the above model, which I will refer to as the “light model”.

class Person {
    public $id = 0;
    public $first_name = null;
    public $last_name = null;
}

Now, I know a lot of developers who see this are currently cringing; just stay with me for a minute. The above acts more like a struct than a traditional model, but it has quite a few advantages. Let me demonstrate with the following data mining code:

$conn = /* get a database connection */;
$sql = 'SELECT * FROM person';
$stmt = $conn->prepare($sql);
$stmt->execute();
 
$people = array();
while($result = $stmt->fetch(PDO::FETCH_ASSOC)){
    $p = new Person();
    foreach($result as $field_name=>$field_value)
        $p->{$field_name} = $field_value;
    $people[] = $p;
}

If you’re unfamiliar with the foreach notation within the while loop, all it is doing is using each result set column name and value to dynamically populate the model’s matching property. Here’s why I find the light model a much better practice, especially when combined with the while/foreach mining pattern above. First, the light model populates faster because no functions are invoked; each method call in the heavy model takes an additional performance hit, which can be validated quite easily using timestamps and loops. Second, the mining pattern paints the model with whatever values come out of the database, which means that if the table changes going forward, only the properties of the model change; the data mining code still works with no changes. With the heavy model, a database change means updating both the model and every mining procedure with the new result field names and their respective set methods, or at the very least updating some model method.
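
As a rough illustration of the timestamp-and-loop check mentioned above, here is a minimal benchmark sketch. The class names and iteration count are placeholders, and actual timings will vary by machine and PHP version:

```php
<?php
//heavy model: population goes through a setter
class HeavyPerson {
    private $first_name = null;
    public function setFirstName($v){ $this->first_name = $v; }
}

//light model: population assigns the public property directly
class LightPerson {
    public $first_name = null;
}

$row = array('first_name' => 'Bob');
$iterations = 100000;

$start = microtime(true);
for($i = 0; $i < $iterations; $i++){
    $p = new HeavyPerson();
    $p->setFirstName($row['first_name']);
}
$heavy = microtime(true) - $start;

$start = microtime(true);
for($i = 0; $i < $iterations; $i++){
    $p = new LightPerson();
    foreach($row as $field_name => $field_value)
        $p->{$field_name} = $field_value;
}
$light = microtime(true) - $start;

printf("heavy: %.4fs light: %.4fs\n", $heavy, $light);
```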

Finally, to come full circle, what about the CRUD methods usually attached to models, such as save() or get()? Instead of instance methods on the models, which carry the overhead of object instantiation, how about static methods on companion classes, which I will term “business objects”? For example:

class PersonBO{
 
    public static function save($person){
        /* do save here */
    }
 
    public static function get($person){
        /* do get here */
    }
 
}

This example provides the same functionality that is usually attached to the model, but it uses static methods, which execute faster and with less overhead than their heavy model counterparts. It also adds a layer of abstraction between the model and its business object counterpart, which lends itself to clean, easily maintained code.
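
Putting the two ideas together, a business object method can reuse the same mining pattern internally. The following is only a sketch; the connection handling, table name, and method signature are my assumptions, not a fixed API:

```php
<?php
class Person {
    public $id = 0;
    public $first_name = null;
    public $last_name = null;
}

class PersonBO {
    //fetch a single Person by id using the light model mining pattern
    //(assumes $conn is an open PDO connection and a "person" table exists)
    public static function get($conn, $id){
        $stmt = $conn->prepare('SELECT * FROM person WHERE id = ?');
        $stmt->execute(array($id));
        $result = $stmt->fetch(PDO::FETCH_ASSOC);
        if($result === false)
            return null;
        $p = new Person();
        foreach($result as $field_name => $field_value)
            $p->{$field_name} = $field_value;
        return $p;
    }
}
```

Because the mapping is dynamic, adding a column to the person table only requires adding a matching public property to Person; PersonBO::get() needs no change.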

In summary, the light models, used in conjunction with the data mining pattern demonstrated above, reduce quite a bit of retrieval code and add to the codebase’s overall flexibility. This pattern is currently in use within several very large enterprise codebases and has been nothing but a pleasure to work with.

Special thanks to Robert McFrazier for his assistance in developing this pattern.

Web Developer Essential Tools

Foreword

The following highlights what I believe to be the best tools for web development and best of all, they’re all free.  This is a living reference which will be updated as new and better tools are found.  Feedback and suggestions are always welcome.

The List

Google Chrome

Light, extremely fast, jam-packed with features, and equipped with a debugger that makes short work of CSS and JavaScript.  The ability to right-click to inspect an element and change CSS values live in the DOM, using the up/down arrows or by entering values directly, makes this an essential tool for web development.

Get Google Chrome

ColorPic

A fantastic color-picking tool that can obtain the color value of anything on the screen.  It supports creating and loading palettes, and values can be slider-adjusted for easy lightening or darkening of tones.  Shortcuts make color selection a cinch.  A must-have tool.

Get ColorPic

Tape: Google Chrome Extension

You know that problem in CSS where you’re trying to line up elements horizontally or vertically, and you end up holding paper against the screen to check the alignment?  Well, you can forget about ever doing that again.  The Tape extension lets you easily place guide lines on the screen so you can adjust your CSS to match.

Get Tape Extension

XML Tree: Google Chrome Extension

Displays XML, RSS, or any form of node tree in a collapsible, human-readable format.

Get XML Tree Extension

Awesome Screenshot: Google Chrome Extension

Quickly screen grab what’s being displayed in the browser, draw, blur, annotate, write, etc. all from within the browser. Save when you’re done either locally or in the cloud.

Get Awesome Screenshot Extension

LightTable

LightTable is one of the most robust and unique editors I have ever used.  Extremely powerful and easily customized, with more features than I can list here.

Get LightTable

GIMP

If you’re used to Photoshop, there is a bit of a learning curve in adjusting to GIMP’s shortcuts and approach to image manipulation. Once you do get used to them, though, you won’t want to go back. Select an area, resize the selection by mouse or by shrink/grow amounts, copy, then shortcut-paste into a new image frame: fast. No more click save, select an extension type from a drop-down, confirm, confirm, confirm. Instead: shortcut for save, enter a filename with the extension (.png, .jpg, .gif, etc.), hit enter, enter, done. Made for rapid image manipulation, with tons of plugins and filters. A great tool.

Get GIMP

How To Use FQL With The Facebook PHP SDK

Foreword

So the other day, while building a Facebook application, I needed some information that I just couldn’t find any way to get other than through FQL.  Needless to say, examples of FQL are plentiful.  However, examples that use FQL with the Facebook PHP SDK are not, so I thought I’d put one together.

Getting Started

If you haven’t already done so, make sure you’ve installed the latest Facebook SDK and registered your Facebook application.

FQL And The Facebook SDK

Here’s an example call to get all the photos that belong to the logged in user:

$appInfo = array(
    'appId'     => 'XXXXXXXXXXXXX',
    'appSecret' => 'XXXXXXXXXXXXXXXXXXXXX'
);
 
$facebook = new Facebook($appInfo);
 
$result = $facebook->api(array(
    'method' => 'fql.query',
    'query'  => 'SELECT src, caption FROM photo WHERE owner=me()'
));
foreach($result as $photo){
   ...
}

And there you have it. Hopefully this will save someone the amount of time it took me to figure it out.

Using Hadoop And PHP

Getting Started

So first things first.  If you haven’t used Hadoop before, you’ll need to download a Hadoop release and make sure you have Java and PHP installed.  To download Hadoop, head over to:

http://hadoop.apache.org/common/releases.html

Click on download a release and choose a mirror.  I suggest choosing the most recent stable release.  Once you’ve downloaded Hadoop, unzip it.

user@computer:$ tar xpf hadoop-0.20.2.tar.gz

I like to create a symlink to the hadoop-<release> directory to make things easier to manage.

user@computer:$ ln -s hadoop-0.20.2 hadoop

Now you should have everything you need to start creating a Hadoop PHP job.

Creating The Job

For this example I’m going to create a simple Map/Reduce job for Hadoop.  Let’s start by understanding what we want to happen.

  1. We want to read from an input system – this is our mapper
  2. We want to do something with what we’ve mapped – this is our reducer

At the root of your development directory, let’s create another directory called script.  This is where we’ll store our PHP mapper and reducer files.

user@computer:$ ls

.
..
hadoop-0.20.2
hadoop-0.20.2.tar.gz
hadoop
user@computer:$ mkdir script

Now let’s begin creating our mapper script in PHP.  Go ahead and create a PHP file called mapper.php under the script directory.

user@computer:$ touch script/mapper.php

Now let’s look at the basic structure of a PHP mapper.

#!/usr/bin/php
<?php
//this can be anything from reading input from files, to retrieving database content, soap calls, etc.
//for this example I'm going to create a simple php associative array.
$a = array(
'first_name' => 'Hello',
'last_name' => 'World'
);
//it's important to note that anything you write to STDOUT becomes the mapper's output and is fed to the reducer.
//it's also important to end every line of output to STDOUT with a PHP_EOL, this will save you a lot of pain.
echo serialize($a), PHP_EOL;
?>

So this example is extremely simple: create a simple associative array and serialize it.  Now on to the reducer.  Create a PHP file in the script directory called reducer.php.

user@computer:$ touch script/reducer.php

Now let’s take a look at the layout of a reducer.

#!/usr/bin/php
<?php
 
//Remember when I said anything put out through STDOUT in our mapper would go to the reducer.
//Well, now we read from the STDIN to get the result of our mapper.
//iterate all lines of output from our mapper
while (($line = fgets(STDIN)) !== false) {
    //remove leading and trailing whitespace, just in case 🙂
    $line = trim($line);
    //now recreate the array we serialized in our mapper
    $a = unserialize($line);
    //Now, we do whatever we need to with the data.  Write it out again so another process can pick it up,
    //send it to the database, soap call, whatever.  In this example, just change it a little and
    //write it back out.
    $a['middle_name'] = 'Jason';
    //do not forget the PHP_EOL
    echo serialize($a), PHP_EOL;
}//end while
?>

So now we have a very simple mapper and reducer ready to go.
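
Before involving Hadoop at all, you can sanity-check the mapper/reducer contract by emulating the streaming pipeline in plain PHP. This sketch inlines what mapper.php and reducer.php above do, minus the framework and STDIN/STDOUT plumbing:

```php
<?php
//what the mapper emits: one serialized array per line
$mapperOutput = serialize(array('first_name' => 'Hello', 'last_name' => 'World')) . PHP_EOL;

//what the reducer does with each line it would read from STDIN
foreach(explode(PHP_EOL, trim($mapperOutput)) as $line){
    $a = unserialize(trim($line));
    $a['middle_name'] = 'Jason';
    echo serialize($a), PHP_EOL;
}
```

The single line printed should match the serialized array shown in the Result section at the end of this post.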

Execution

So now let’s run it and see what happens.  But first, a little prep work.  We need to specify the input directory that will be used when the job runs.

user@computer:$ mkdir input
user@computer:$ touch input/conf

Ok, that was difficult.  We have an input directory and an empty conf file.  The empty conf file is just something for the mapper to get started with; for now, don’t worry about it.  Now let’s run this bad boy.  Make sure you have JAVA_HOME set; this is usually the /usr directory.  You can set it by running export JAVA_HOME=/usr.

user@computer:$ hadoop/bin/hadoop jar hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar -mapper script/mapper.php -reducer script/reducer.php -input input/* -output output

So here’s what the command does.  The first part executes the hadoop launcher script.  The “jar” argument tells Hadoop to run a jar, in this case “hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar”.  Next, we pass the mapper and reducer arguments to the job and specify the input and output directories.  If we wanted to, we could pass configuration information or files to the mapper; we would just use the same line-read structure that we used in the reducer, and that is what would go in the input directory.  For this example, though, we pass nothing.  Finally, the output directory will contain the output of our reducer; if everything works out correctly, it will contain the PHP-serialized form of our modified $a array.  If all goes well, you should see something like this:

user@computer:$ hadoop/bin/hadoop jar hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar -mapper script/mapper.php -reducer script/reducer.php -input input/* -output output

10/12/10 12:53:56 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
10/12/10 12:53:56 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
10/12/10 12:53:56 INFO mapred.FileInputFormat: Total input paths to process : 1
10/12/10 12:53:56 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop-root/mapred/local]
10/12/10 12:53:56 INFO streaming.StreamJob: Running job: job_local_0001
10/12/10 12:53:56 INFO streaming.StreamJob: Job running in-process (local Hadoop)
10/12/10 12:53:56 INFO mapred.FileInputFormat: Total input paths to process : 1
10/12/10 12:53:56 INFO mapred.MapTask: numReduceTasks: 1
10/12/10 12:53:56 INFO mapred.MapTask: io.sort.mb = 100
10/12/10 12:53:57 INFO mapred.MapTask: data buffer = 79691776/99614720
10/12/10 12:53:57 INFO mapred.MapTask: record buffer = 262144/327680
10/12/10 12:53:57 INFO streaming.PipeMapRed: PipeMapRed exec [/root/./script/mapper.php]
10/12/10 12:53:57 INFO streaming.PipeMapRed: MRErrorThread done
10/12/10 12:53:57 INFO streaming.PipeMapRed: Records R/W=0/1
10/12/10 12:53:57 INFO streaming.PipeMapRed: MROutputThread done
10/12/10 12:53:57 INFO streaming.PipeMapRed: mapRedFinished
10/12/10 12:53:57 INFO mapred.MapTask: Starting flush of map output
10/12/10 12:53:57 INFO mapred.MapTask: Finished spill 0
10/12/10 12:53:57 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
10/12/10 12:53:57 INFO mapred.LocalJobRunner: Records R/W=0/1
10/12/10 12:53:57 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
10/12/10 12:53:57 INFO mapred.LocalJobRunner:
10/12/10 12:53:57 INFO mapred.Merger: Merging 1 sorted segments
10/12/10 12:53:57 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 70 bytes
10/12/10 12:53:57 INFO mapred.LocalJobRunner:
10/12/10 12:53:57 INFO streaming.PipeMapRed: PipeMapRed exec [/root/./script/reducer.php]
10/12/10 12:53:57 INFO streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
10/12/10 12:53:57 INFO streaming.PipeMapRed: Records R/W=1/1
10/12/10 12:53:57 INFO streaming.PipeMapRed: MROutputThread done
10/12/10 12:53:57 INFO streaming.PipeMapRed: MRErrorThread done
10/12/10 12:53:57 INFO streaming.PipeMapRed: mapRedFinished
10/12/10 12:53:57 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
10/12/10 12:53:57 INFO mapred.LocalJobRunner:
10/12/10 12:53:57 INFO mapred.TaskRunner: Task attempt_local_0001_r_000000_0 is allowed to commit now
10/12/10 12:53:57 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to file:/root/output
10/12/10 12:53:57 INFO mapred.LocalJobRunner: Records R/W=1/1 > reduce
10/12/10 12:53:57 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
10/12/10 12:53:57 INFO streaming.StreamJob:  map 100%  reduce 100%
10/12/10 12:53:57 INFO streaming.StreamJob: Job complete: job_local_0001
10/12/10 12:53:57 INFO streaming.StreamJob: Output: output

If you get errors where it’s complaining about the output directory, just remove the output directory and try again.

Result

Once you’ve got something similar to the above and no errors, we can check out the result.

user@computer:$ cat output/*

a:3:{s:10:"first_name";s:5:"Hello";s:9:"last_name";s:5:"World";s:11:"middle_name";s:5:"Jason";}

There we go, a serialized form of our modified PHP array $a.  That’s all there is to it.  Now, go forth and Hadoop.

Encryption and Decryption Between .NET and PHP

I recently worked on a project that required encryption and decryption by and between .NET and PHP. By default, the two technologies don’t mesh very well. Since the data was originally being encrypted and decrypted by .NET, I had to write PHP code that worked with the encryption scheme already in use. One of the main problems I ran into was padding, in my case PKCS#7, which .NET uses by default. The first thing to do was to make sure the encryption parameters matched on both sides; for example, .NET’s default mode for DES corresponds to MCRYPT_MODE_CBC. Once that was settled, I could initialize the mcrypt library.

$module = mcrypt_module_open(MCRYPT_DES, '', MCRYPT_MODE_CBC, '');
 
if($module === false)
    die("DES module could not be opened");
 
$blockSize = mcrypt_get_block_size(MCRYPT_DES, MCRYPT_MODE_CBC);

The $blockSize variable is used later for padding and padding removal using pkcs7. Next to encrypt data I had to implement the following:

//encryption
$key = substr($key, 0, 8);
 
$iv = $key;
$rc = mcrypt_generic_init($module, $key, $iv);
 
//apply pkcs7 padding
$value_length = strlen($value);
$padding = $blockSize - ($value_length % $blockSize);
$value .= str_repeat(chr($padding), $padding);
 
$value = mcrypt_generic($module, $value);
$value = base64_encode($value);
mcrypt_generic_deinit($module);
 
//value now encrypted

Basically, the encryption scheme the .NET side used was: set the IV to the key, pad the data, encrypt it, then base64 encode it. So here I’ve done the same thing in PHP. Next, I needed to do the exact same thing in reverse for decryption:

//Decryption
$key = substr($key, 0, 8);
$iv = $key;
$rc = mcrypt_generic_init($module, $key, $iv); 
 
$value = base64_decode($value);
$value = mdecrypt_generic($module, $value); 
 
//apply pkcs7 padding removal
$packing = ord($value[strlen($value) - 1]);
if($packing && $packing <= $blockSize){
    for($P = strlen($value) - 1; $P >= strlen($value) - $packing; $P--){
        if(ord($value[$P]) != $packing){
            $packing = 0;
        }//end if
    }//end for
}//end if
 
$value = substr($value, 0, strlen($value) - $packing);
 
mcrypt_generic_deinit($module); 
 
//value now decrypted

This is basically the encryption in reverse; the only real difference is the PKCS#7 padding removal. Hopefully this tidbit helps a few others out there who run into encryption and decryption issues between .NET and PHP.
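
The padding logic can be exercised on its own, independent of mcrypt. Here is a minimal round trip sketch, assuming the DES block size of 8 and a sample value:

```php
<?php
$blockSize = 8; //DES block size
$value = 'Hello';

//apply pkcs7 padding: append N bytes, each with value N
$padding = $blockSize - (strlen($value) % $blockSize);
$padded = $value . str_repeat(chr($padding), $padding);

//remove pkcs7 padding: read the last byte, strip that many bytes
$packing = ord($padded[strlen($padded) - 1]);
$unpadded = substr($padded, 0, strlen($padded) - $packing);

echo strlen($padded), ' ', $unpadded, PHP_EOL; //padded length is a full block
```

Note that when the value length is already a multiple of the block size, PKCS#7 appends a whole extra block of padding, which is why the removal check allows a padding value equal to the block size.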