Upstart

Thoughts on building software and startups.

Signature errors when uploading files to S3

Chris Hickman

I have a Python application that uses the Boto library to call the AWS S3 API in support of a file upload feature. Since being deployed, the code has handled a great many file uploads flawlessly. Recently, however, I encountered a bizarre bug where a single file simply would not upload to S3.

When trying to upload this file using the S3 put_object operation, AWS returned the following error:

An error occurred (SignatureDoesNotMatch) when calling the PutObject operation: The request signature we calculated does not match the signature you provided. Check your key and signing method.

When uploading to S3, my code sets request header values for both Content-Type and Content-Disposition. The value set for Content-Disposition contains, among other information, the original filename for the file being uploaded.

# filename holds the original name of the file being uploaded
headers = {
    'ContentDisposition': 'attachment;filename="{}"'.format(filename)
}

The clue as to why this upload was failing: this particular file had multiple consecutive space characters in its filename.

When making a request to S3, the client first signs the request. Among other things, the signature is calculated from the request header data.

When S3 receives the request, it calculates its own signature from the request data and compares the result to the signature sent with the request. If the two values do not match, we get a SignatureDoesNotMatch error.

It turns out that AWS and Boto calculate the signature differently. Boto makes its calculation on the request header data exactly as supplied, without any manipulation. AWS, however, folds consecutive spaces into a single space before making its calculation. For our file with multiple consecutive spaces in the filename, the two sides therefore sign different strings, and the result is a SignatureDoesNotMatch error.
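To make the mismatch concrete, here is a minimal sketch using a toy HMAC scheme. It is not the real AWS signing algorithm, and the function name, key, and filenames are all made up for illustration. It only shows that folding spaces on one side changes the string being signed, and therefore the signature.

```python
import hashlib
import hmac
import re


def toy_signature(secret_key: bytes, headers: dict, fold_spaces: bool) -> str:
    """Sign a canonical header string with HMAC-SHA256 (toy scheme, not AWS's)."""
    lines = []
    for name in sorted(headers):
        value = headers[name].strip()
        if fold_spaces:
            # Mimic the server-side behaviour described above:
            # collapse runs of spaces into a single space.
            value = re.sub(r" {2,}", " ", value)
        lines.append(f"{name.lower()}:{value}")
    canonical = "\n".join(lines).encode("utf-8")
    return hmac.new(secret_key, canonical, hashlib.sha256).hexdigest()


key = b"hypothetical-secret-key"  # assumption: any shared secret works here
single = {"Content-Disposition": 'attachment;filename="my report.pdf"'}
double = {"Content-Disposition": 'attachment;filename="my  report.pdf"'}

# Single spaces: folding changes nothing, so both sides agree.
single_matches = toy_signature(key, single, False) == toy_signature(key, single, True)

# Consecutive spaces: the two sides sign different strings.
double_matches = toy_signature(key, double, False) == toy_signature(key, double, True)
```

In this sketch only the filename with consecutive spaces produces a mismatch, which is consistent with every other upload having worked.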

To fix this problem, we just need to make sure that Boto and AWS calculate the signature over the same header data. Before setting the Content-Disposition request header, I use a simple regular expression to fold runs of whitespace in the filename into a single space character.

import re

# Replace runs of consecutive whitespace with a single space, then trim the ends
filename = re.sub(r'\s+', ' ', filename).strip()

Once this change was made, the S3 put_object operation succeeded for this troublesome file.
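The two snippets above can be combined into one small helper. This is only a sketch: the function name content_disposition and the sample filename are hypothetical, not taken from my application code.

```python
import re


def content_disposition(filename: str) -> str:
    """Build an attachment Content-Disposition value with whitespace folded."""
    # Fold runs of whitespace into one space and trim the ends so the
    # client and the server sign the same header value.
    safe_name = re.sub(r"\s+", " ", filename).strip()
    return 'attachment;filename="{}"'.format(safe_name)


header_value = content_disposition("  quarterly   report  final.pdf ")
```

The returned string can then be used as the ContentDisposition value when calling put_object. One trade-off to be aware of: the filename stored in the header no longer matches the original exactly when the original contained consecutive spaces.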

Chris Hickman

Entrepreneur, technologist and startup junkie. Founded two companies, one with $24M in VC, the other bootstrapped with SBA loan.
