Amazon S3 PHP helpers

The Amazon documentation for using S3 with PHP refers to an elusive function called “setAuthorizationHeader”. It’s apparently supposed to magically set the correct value for the Authorization header on a Pear HTTP_Request object. As far as I could tell, it didn’t actually exist — but I wanted it, so I wrote it:

Source

<?php

require_once ‘Crypt/HMAC.php’;
require_once ‘HTTP/Request.php’;

define("S3URL", ‘http://s3.amazonaws.com’);
define("AWSACCESSKEYID", ‘XXXXXXXXXXXXXXXXXXXX’);
define("AWSSECRETKEYID", ‘XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX’);

$s3_hasher =& new Crypt_HMAC(AWSSECRETKEYID, "sha1");

function s3_sign($StringToSign)
{
    global $s3_hasher;
    return hex2b64($s3_hasher->hash($StringToSign));
}

function hex2b64($str)
{
    $raw = ;
    for ($i = 0; $i < strlen($str); $i += 2) {
        $raw .= chr(hexdec(substr($str, $i, 2)));
    }
    return base64_encode($raw);
}

function setAuthorizationHeader($request)
{
    $headers = $request->_requestHeaders;
    
    $HTTP_Verb      = $request->_method;
    $Content_MD5    = $headers[‘content-md5’];
    $Content_Type   = $headers[‘content-type’];

    // Get the date, or set it if not already:
    if (!isset($headers[‘date’])) {
        $Date = gmdate("D, d M Y H:i:s T");
        $request->addHeader("date", $Date);
    }
    else {
        $Date = $headers[‘date’];
    }
    
    // Canonicalize the Amazon headers:
    $CanonicalizedAmzHeaders = ;
    $amz_headers = array();
    foreach ($headers as $key => $value) {
        if (substr($key, 0, 6) == ‘x-amz-‘) {
            if (isset($amz_headers[$key]))
                $amz_headers[$key] .= ‘,’ . $value;
            else
                $amz_headers[$key] = $value;
        }
    }
    ksort($amz_headers);
    foreach ($amz_headers as $key => $value)
        $CanonicalizedAmzHeaders .= $key . ‘:’ . $value . "\n";
    
    // Canonicalize the resource string
    $CanonicalizedResource    = ;
    $host = $request->_generateHostHeader();
    if ($host != ‘s3.amazonaws.com’) {
        $pos = strpos($host, ‘s3.amazonaws.com’);
        $CanonicalizedResource .= ‘/’ . ($pos === false) ? $host : substr($host, 0, $pos);
    }
    $CanonicalizedResource .= $request->_url->path;
    // TODO: sub-resources "?acl", "?location", "?logging", or "?torrent"

    // Build the string to sign:
    $StringToSign = $HTTP_Verb . "\n" .
                    $Content_MD5 . "\n" .
                    $Content_Type . "\n" .
                    $Date . "\n" .
                    $CanonicalizedAmzHeaders .
                    $CanonicalizedResource;
    
    $Signature = s3_sign($StringToSign);
    
    $Authorization = "AWS" . " " . AWSACCESSKEYID . ":" . $Signature;
    
    // Set the Authorization header:
    $request->addHeader("Authorization", $Authorization);
}

function s3AuthURL($Resource, $HTTP_Verb = ‘GET’, $seconds = 120)
{
    // Calculate expiration time:
    $Expires = time() + $seconds;

    // Build the string to sign:
    $StringToSign = $HTTP_Verb . "\n" .
                    "\n" .
                    "\n" .
                    $Expires . "\n" .
                    $Resource;

    $Signature = s3_sign($StringToSign);

    // Build the authorized URL:
    return  S3URL . $Resource .
            ‘?AWSAccessKeyId=’  . AWSACCESSKEYID .
            ‘&Expires=’         . $Expires .
            ‘&Signature=’       . urlencode($Signature);
}

?>

Note: this hasn’t been tested extensively, so use it at your own risk. Post a comment or contact me at if you find any bugs. Also, IANAPHPE.

Just replace the XX’s with your keys and it should work with this sample code.

There’s also a function for creating query string authorized URLs.

Stealing LAPD's crime data

This post explains how to get data from LAPD’s [crime maps](http://www.lapdcrimemaps.org/) website. See my [previous post](http://tlrobinson.net/blog/?p=6) on scraping DPS’s incident logs for background of TOOBS.

After completing the TOOBS project for the UPE P.24 programming contest I was checking out LAPD’s [crime maps](http://www.lapdcrimemaps.org/) website, which is similar to TOOBS (but not as cool!), and I realized I could integrate their data with DPS’s data for the ultimate Los Angeles / USC crime map. There very little overlap between the LAPD and DPS data since the two are separate entities. Murders and some other incidents may show up in both, but hopefully these are rare…

The LAPD system also uses JavaScript and XMLHttpRequest to fetch the data from a server side script. Additionally, there is no security to check that the requests are coming from the LAPD web app. This means we can easily, and (as far as i know) legally, access their data.

Due to the same origin policy that restricts JavaScript to only making requests to the originating server, you cannot simply use their PHP script from your own JavaScript, you must use sort of a proxy. While this policy can be annoying, it is necessary to limit what malicious JavaScript could do.

To obtain the crime data from LAPD’s servers, we begin by forming the request URL which contains parameters such as the start date, the interval length, lat/lon coordinates, radius, and crime types. A HTTP request is made to their server, and the response is stored.

We notice the response is simply JavaScript that gets eval’d on the client:

searchPoints = new Array ();

searchPoints[0] = new searchPoint (‘0’, ‘#070307306’, ‘lightblue’, ’17’, ‘-118.301638’, ‘34.022812’, ’14XX W 36th St’, ‘0.74’, ‘6’, ’02-04-2007 10:45:00 PM’, ‘Southwest Division: 213-485-6571’);
searchPoints[1] = new searchPoint (‘1’, ‘#070307280’, ‘violet’, ’17’, ‘-118.284008’, ‘34.033212’, ’25XX S Hoover St’, ‘0.52’, ‘3’, ’02-04-2007 10:00:00 PM’, ‘Southwest Division: 213-485-6571’);
searchPoints[2] = new searchPoint (‘2’, ‘#070307224’, ‘cyan’, ’17’, ‘-118.304108’, ‘34.032481’, ’26XX Dalton Av’, ‘0.83’, ‘4’, ’02-04-2007 12:15:00 AM’, ‘Southwest Division: 213-485-6571’);
searchPoints[3] = new searchPoint (‘3’, ‘#070307222’, ‘blue’, ’17’, ‘-118.2903’, ‘34.0284’, ‘Menlo Av and 29th Av’, ‘0.02’, ‘2’, ’02-03-2007 11:00:00 PM’, ‘Southwest Division: 213-485-6571’);

We could simply redirect this code to our own app and do the processing on the client side with JavaScript, but we also notice that JavaScript syntax is very similar to PHP syntax. By creating a compatible PHP object called searchPoint and prepending a “$” to each variable name, we have valid PHP code that we can simply eval. The result is an array of searchPoint objects that we can easily add to our response, or insert into a database, or whatever we want!

Note that this is extremely insecure since we’re eval’ing text that we got from somewhere else. By changing the response, the provider of the data could execute any PHP they wanted on my server.

A more secure method would be to actually parse the data rather than letting PHP’s eval do the work.