How to store files gracefully
When building a web application, one thing you may need to think about is how you plan to store user files. If your application requires users to upload or download files (images, documents, PDFs, etc.), file storage can be an important part of your application architecture.
Where Should I Store Files?
When building web applications, you have a few choices for where to store your files:
- Store user files in your database in a text column, or something similar
- Store user files directly on your web server
- Store user files in a file storage service like Amazon S3
Out of the above choices, the third is your best bet.
Storing files directly in a database is not very performant. Databases are not optimized for
storing large blobs of content. Both storing and retrieving files through a database server are
slow, and the extra load slows down all other database queries.
Storing files locally on your web server is also not normally a good idea. A given web server
only has so much disk space, which means you now have to deal with the very real
possibility of running out of disk space. Furthermore, ensuring your user files are properly backed up and easily
accessible at all times can be a difficult task for even experienced engineers.
Unlike the other two options, storing files in a file storage service like S3 is a great option:
it’s cheap, your files are replicated and backed up transparently, and you’re also able to
quickly retrieve and store files there without taxing your web servers or database servers.
It even provides fine-grained control over who can access what files, which allows you to
build complex authorization rules for your files if necessary.
For storing what can sometimes be sensitive information, a file storage service like
Amazon S3 is a great way to get the best of all worlds: availability, simplicity, and security.
To sign up for an Amazon Web Services (AWS) account, and to start using Amazon S3,
you can visit their website.
How Do I Store Files in S3?
Now that we’ve talked about where you should store your user files (a service like Amazon
S3), let’s talk about how you actually store your files there.
When storing files in S3, there are a few things you need to understand.
Firstly, you need to pick the AWS region in which you want your files to live. An AWS
region is basically a group of data-centers in a particular part of the world.
Like all big tech companies, Amazon maintains data-centers all over the world so they can
build fast services for users in different physical locations. One of the benefits to using an
Amazon service is that you can take advantage of this to help build faster web applications.
Let’s say you’re building a website for Korean users. You probably want to put all of your
web servers and content in a data-center somewhere in Korea. This way, when your users
visit your site, they only need to connect over a short physical distance to your web server,
thereby decreasing latency.
Amazon publishes a list of the regions in which you can store files in S3 on their website.
The first thing you need to do is use the list above to pick the most appropriate location for
storing your files. If you’re building a web application that needs to be fast from all over the
world, don’t worry: just pick the AWS region closest to you. You can always use a CDN
service like Amazon CloudFront to optimize this later.
Next, you need to create an S3 bucket. An S3 bucket is basically a directory in which all of
your files will be stored. I usually give my S3 buckets the same name as my application.
Let’s say I’m building an application called “The Greatest Test App”—I would probably
name my S3 bucket: “the-greatest-test-app”.
S3 allows you to create many buckets (subject to a per-account quota), but each bucket name
must be globally unique. That means that if someone else has already created a bucket with the
name you want to use, you won’t be able to use it.
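As a rough illustration of the naming convention above, here is a minimal Python sketch that turns an application name into a candidate bucket name. The helper `bucket_name_from_app` is hypothetical, and it only checks the length rule (3 to 63 characters) and the allowed character set; it cannot tell you whether the name is already taken by someone else.

```python
import re


def bucket_name_from_app(app_name: str) -> str:
    """Turn an application name into a candidate S3 bucket name."""
    # Lowercase, then collapse every run of other characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", app_name.lower()).strip("-")
    # S3 bucket names must be between 3 and 63 characters long.
    if not 3 <= len(slug) <= 63:
        raise ValueError(f"cannot derive a valid bucket name from {app_name!r}")
    return slug


print(bucket_name_from_app("The Greatest Test App"))  # the-greatest-test-app
```

Because uniqueness is global, the create request itself is the real test: if the name is taken, S3 rejects it and you need to pick another one.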
Finally, after you’ve picked your region and created your bucket, you can now start storing
files.
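To make the region and bucket steps concrete, here is a minimal sketch that assumes you are using Python with the boto3 library (the article itself does not prescribe a language or SDK). The function name `create_bucket` and the bucket and region values in the comments are illustrative; the S3 client is passed in so the snippet stays independent of how you configure credentials.

```python
def create_bucket(client, bucket: str, region: str) -> None:
    """Create `bucket` in `region` using a boto3-style S3 client."""
    if region == "us-east-1":
        # us-east-1 is the default region and must not be sent as a
        # LocationConstraint.
        client.create_bucket(Bucket=bucket)
    else:
        client.create_bucket(
            Bucket=bucket,
            CreateBucketConfiguration={"LocationConstraint": region},
        )


# Example usage (assumes boto3 is installed and AWS credentials are set up):
#
#   import boto3
#   s3 = boto3.client("s3", region_name="ap-northeast-2")  # Seoul
#   create_bucket(s3, "the-greatest-test-app", "ap-northeast-2")
```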
This brings us to the next question: how should you structure your S3 bucket when storing
user files?
The best way to do this is to partition your S3 bucket into user-specific sub-folders.
Let’s say you have three users for your web application, and each one has a unique ID.
You might then create three sub-folders in your main S3 bucket, one for each of these users. This way, when you store files for these users, those files are stored in the appropriately named sub-folders.
Here’s how this might look:
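As a minimal sketch in Python, assuming three hypothetical user IDs and filenames, the object keys could be built like this. Note that S3 has no real folders: a "sub-folder" is simply a shared key prefix ending in a slash.

```python
def user_file_key(user_id: str, filename: str) -> str:
    """Return the S3 object key for a file owned by `user_id`."""
    if not user_id or "/" in user_id:
        raise ValueError("user_id must be non-empty and must not contain '/'")
    return f"{user_id}/{filename}"


# Hypothetical user IDs and filenames, for illustration only.
uploads = [
    ("user-1001", "avatar.png"),
    ("user-1001", "resume.pdf"),
    ("user-1002", "avatar.png"),
    ("user-1003", "notes.txt"),
]
for user_id, filename in uploads:
    print(user_file_key(user_id, filename))
# user-1001/avatar.png
# user-1001/resume.pdf
# user-1002/avatar.png
# user-1003/notes.txt
```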
This is a nice structure because you can easily see the separation of files by user, which makes managing these files in a central location simple. If you have multiple processes or applications reading and writing these files, you already know which files are owned by which user.
How Do I “Link” Files to My User Accounts?
Now that you’ve seen how to store files in S3, how do you ‘link’ those files to your actual
Stormpath user accounts? The answer is custom data.
Custom Data is essentially a JSON store that Stormpath provides for every resource.
This JSON store allows you to store any arbitrary JSON data you want on your user accounts. This is the perfect place to store file metadata to make searching for user files simpler.
Let’s say you have just uploaded two files for a given user into S3, and want to store a
‘link’ to those files in your Stormpath Account for that user. To do this, you will insert
the following JSON data into your Stormpath user’s CustomData resource:
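The exact shape of this data is up to you. As one possible sketch in Python, the field names `s3_files`, `href`, and `lastModified`, the bucket name, the user ID, and the virtual-hosted-style URL form are all assumptions made for illustration. The resulting dictionary is what you would then save to the account's custom data with whichever Stormpath SDK you use (that call is not shown here).

```python
import json
from datetime import datetime, timezone
from urllib.parse import quote


def file_entry(bucket: str, key: str, modified: datetime) -> dict:
    """Build the metadata record for one stored file."""
    return {
        # Assumed URL form: virtual-hosted-style S3 address for the object.
        "href": f"https://{bucket}.s3.amazonaws.com/{quote(key)}",
        # Store timestamps in UTC, formatted as ISO 8601.
        "lastModified": modified.astimezone(timezone.utc).isoformat(),
    }


# Hypothetical bucket, user ID, filenames, and timestamp.
bucket = "the-greatest-test-app"
uploaded = datetime(2016, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
custom_data = {
    "s3_files": {
        "avatar.png": file_entry(bucket, "user-1001/avatar.png", uploaded),
        "resume.pdf": file_entry(bucket, "user-1001/resume.pdf", uploaded),
    }
}
print(json.dumps(custom_data, indent=2))
```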
This is a nice structure for storing file metadata because it means that every time you
have the user account object in your application code, you can easily know:
- What files this user owns.
- How to access each file the user owns by its public URL. NOTE: These URLs may not
actually be public, depending on how you permission these files in S3. More on this later.
- When each file was last modified.
This JSON data makes it much easier to build complex web applications, as you can
seamlessly find your user files either directly from S3, or from your user account.
Either way: finding the files you need is no longer a problem.
How Do I Secure My Files?
So far we’ve seen how you can store files, link them to your user accounts, and manage them.
But now let’s talk about how you can secure your user files.
Security is a large issue for sensitive applications. Storing medical records or personal information can be a huge risk. Ensuring you take the proper precautions when working with this type of data will save you a lot of trouble down the road.
There are several things you need to know about securely storing files in Amazon S3.
First: let’s talk about file encryption.
S3 provides two different ways to encrypt your user files:
- Client-side encryption
- Server-side encryption
If you’re building a simple web app that stores personal information of some sort,
you’ll want to use client-side encryption. This is the most “secure” form of file storage,
as it requires you (the developer) to encrypt the files on your web server BEFORE storing
them in S3. This means that no matter what happens, Amazon (as a company) cannot
decrypt and view your stored files.
On the other hand, if you’re building an application that doesn’t require the utmost
(and usually more complicated) client-side encryption functionality S3 provides, you can
instead use the provided server-side encryption technology. This technology allows
Amazon to theoretically decrypt and read your files, but still provides a decent amount
of protection against many forms of attack.
Encrypt Files on S3
If you’re building a sensitive application: use client-side encryption to encrypt your files
before storing them in S3. This will keep them really safe.
If you’re not building a sensitive application, use Amazon’s server-side encryption to help
alleviate various security concerns. It’s not as secure as client-side encryption, but is better
than nothing.
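Both options can be sketched in a few lines of Python, assuming a boto3-style S3 client is passed in. The function name `store_encrypted` and its `encrypt` parameter are hypothetical: `encrypt` stands for whatever client-side encryption routine you choose (for example, one built on a vetted cryptography library), and key management is deliberately left out. Setting `ServerSideEncryption="AES256"` asks S3 to encrypt the object at rest with S3-managed keys.

```python
from typing import Callable, Optional


def store_encrypted(
    client,
    bucket: str,
    key: str,
    data: bytes,
    encrypt: Optional[Callable[[bytes], bytes]] = None,
) -> None:
    """Upload `data`, optionally encrypting it before it leaves the server."""
    if encrypt is not None:
        # Client-side encryption: S3 only ever receives ciphertext.
        data = encrypt(data)
    client.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        # Server-side encryption with S3-managed keys.
        ServerSideEncryption="AES256",
    )
```

With client-side encryption you also have to decrypt the bytes yourself after downloading them, so the keys must be stored somewhere your application can reach but S3 cannot.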
Set Restrictive ACLs for Your Files
Finally, be sure to grant only the minimum permissions necessary for each file you
store. This way, files are not left open or accessible to people who shouldn’t see them.
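One way to apply this, sketched here under the assumption that objects are kept private and that keys follow a `<user id>/<filename>` layout, is to hand out short-lived presigned URLs only to the owner of a file. The function name `download_url_for` is hypothetical; `generate_presigned_url` is the boto3 S3 client method for creating such URLs.

```python
def download_url_for(
    client,
    bucket: str,
    user_id: str,
    key: str,
    expires_in: int = 300,
) -> str:
    """Return a time-limited download URL for a file the user owns."""
    # Refuse to sign URLs for keys outside the user's own prefix.
    if not key.startswith(f"{user_id}/"):
        raise PermissionError(f"{key!r} does not belong to user {user_id!r}")
    return client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )
```

The file itself stays private in S3; anyone holding the URL can read it, but only until the URL expires.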
And… That’s it! If you follow these rules for storing user files, you’ll do just fine.