Wednesday, June 13, 2012

recursive s3 uploading from python

I recently needed to upload a directory with many files to S3, so I wrote a small function to do this for me. It is not optimized at all: it only sends one file at a time and it doesn't take into account the 5 GB limit on single S3 uploads, but it is the simplest thing that could get the job done.
The function traverses a folder and uploads each file with the correct key, so the files show up inside their respective 'folders' in the S3 management panel. I say 'folders' because there is no such concept on S3, only keys that can optionally contain slashes, so '/home/static/files/lol.py' is the key of the 'lol.py' file, and that's how you request it.
The function also checks whether a key already exists, uploading only the missing ones.
So here it is, in all its slow glory.
Also, sometimes Amazon fails to find a key that is supposed to be there, so I had to add a little try/except to handle those cases. I had to upload ~4000 files and only had problems with 10, so I guess this is the 99.9% availability Amazon claims to have for S3 files.


import os
import boto
from boto.s3.key import Key

# keys that could not be made public get logged here
failed = open('failures', 'w')

def uploadResultToS3(awsid, awskey, bucket, source_folder):
    c = boto.connect_s3(awsid, awskey)
    b = c.get_bucket(bucket)
    k = Key(b)
    for path, dirs, files in os.walk(source_folder):
        for fname in files:
            # the path relative to the current directory becomes the s3 key
            relpath = os.path.relpath(os.path.join(path, fname))
            # only upload keys that are not already in the bucket
            if not b.get_key(relpath):
                print 'sending...', relpath
                k.key = relpath
                k.set_contents_from_filename(relpath)
                try:
                    # make the file publicly readable; every now and then
                    # s3 claims the key isn't there yet, so log it and move on
                    k.set_acl('public-read')
                except:
                    failed.write(relpath + ', ')
    failed.close()
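
If you want to try it, a call looks more or less like this (the key, secret, bucket name and folder below are just placeholders):

uploadResultToS3('MY_AWS_ID', 'MY_AWS_SECRET', 'my-bucket', 'static')

Keep in mind that the keys are built with os.path.relpath against the current working directory, so run the script from the directory you want the key prefix to start at.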

You obviously need the boto library; get it first:
pip install boto
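
And if you want to convince yourself that the 'folders' really are just keys, you can ask boto for a file back using its full path as the key (again, the credentials, bucket and key here are made up):

import boto

c = boto.connect_s3('MY_AWS_ID', 'MY_AWS_SECRET')
b = c.get_bucket('my-bucket')
k = b.get_key('static/files/lol.py')  # the whole 'path' is the key
if k:
    print 'found', k.name, k.size, 'bytes'
else:
    print 'no such key'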
