Friday 27 July 2018

Delivering your content faster with CloudFront

About

In this post, I will show how you can connect your website to CloudFront so that it loads faster around the world. I will also show how to connect your website to CloudFront if it uses SSL.


What is CloudFront?

Amazon CloudFront is a global content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to your viewers with low latency and high transfer speeds. CloudFront is integrated with AWS – including physical locations that are directly connected to the AWS global infrastructure, as well as software that works seamlessly with services including AWS Shield for DDoS mitigation, Amazon S3, Elastic Load Balancing or Amazon EC2 as origins for your applications, and Lambda@Edge to run custom code close to your viewers.


What is a CloudFront edge location?

CloudFront delivers your content through a worldwide network of data centers called edge locations. The regional edge caches are located between your origin web server and the global edge locations that serve content directly to your viewers. This helps improve performance for your viewers while lowering the operational burden and cost of scaling your origin resources.


How does CloudFront help you deliver content faster?

CloudFront caches your content at its edge locations around the world. When a user requests a page from your website, the edge location checks its cache. If the page is in the cache, CloudFront returns it to the user directly; otherwise it forwards the request to your web server, which sends back the response. Depending on your cache behavior settings, the edge location then caches that response for later requests.


How is the request routed?

After you configure CloudFront to deliver your content, here's what happens when users request your objects:


  1. A user accesses your website or application and requests one or more objects, such as an image file and an HTML file.
  2. DNS routes the request to the CloudFront edge location that can best serve it, typically the nearest edge location in terms of latency.
  3. In the edge location, CloudFront checks its cache for the requested files. If the files are in the cache, CloudFront returns them to the user. If the files are not in the cache, it does the following:


     a. CloudFront compares the request with the specifications in your distribution and forwards the request for the files to the applicable origin server for the corresponding file type, for example to your Amazon S3 bucket for image files and to your HTTP server for the HTML files.
     b. The origin servers send the files back to the CloudFront edge location.

As soon as the first byte arrives from the origin, CloudFront begins to forward the files to the user. CloudFront also adds the files to the cache in the edge location for the next time someone requests those files.
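
The hit-or-miss flow described above can be sketched as a toy model in Python. This is only an illustration of the idea, not how CloudFront is implemented: the class name EdgeCache, the fetch_from_origin callback and the single fixed TTL are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one edge location (hypothetical, for illustration only)."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # path -> (expiry time, cached body)
        self._store: Dict[str, Tuple[float, bytes]] = {}

    def get(self, path: str) -> Tuple[str, bytes]:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            # Cache hit: answer from the edge without contacting the origin.
            return "hit", entry[1]
        # Cache miss (or expired entry): ask the origin, then remember the answer.
        body = self._fetch(path)
        self._store[path] = (now + self._ttl, body)
        return "miss", body


if __name__ == "__main__":
    cache = EdgeCache(lambda path: b"<html>hello</html>", ttl_seconds=60)
    print(cache.get("/index.html")[0])  # first request goes to the origin
    print(cache.get("/index.html")[0])  # second request is served from the cache
```

The first request for a path always reaches the origin; later requests are answered from the edge until the TTL runs out.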



Creating a distribution

Go to the CloudFront home page in the AWS console, click on "Create Distribution" and then on "Get Started" under "Web".

Origin Settings

Enter the "Origin Domain Name". This is the domain name of your website's origin server. If you are using Elastic Beanstalk, enter the URL of your environment.

Leave the Origin Path empty for now.

Enter the Origin ID. This is a name that uniquely identifies this origin within your CloudFront distribution.

For Origin SSL Protocols, leave the default.

Origin Protocol Policy -> This is how CloudFront connects to your website. If you have SSL set up, you can use HTTPS Only; if your website does not support HTTPS, select HTTP Only. If your website supports both HTTP and HTTPS, you can select Match Viewer, although in that case I would suggest using HTTPS Only.
Leave the rest of the settings at their defaults.
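
If you prefer scripting over the console, the origin settings above map onto one entry of the Origins list in the DistributionConfig that boto3's create_distribution call accepts. Here is a minimal sketch of that entry, assuming a custom (non-S3) origin. The helper name build_custom_origin and the example values my-website-origin and origin.example.com are made up for illustration.

```python
def build_custom_origin(origin_id: str, domain_name: str, origin_supports_https: bool) -> dict:
    """Build one origin entry in the shape used by the CloudFront API."""
    # Same advice as in the text: use HTTPS Only whenever the origin supports HTTPS.
    protocol_policy = "https-only" if origin_supports_https else "http-only"
    return {
        "Id": origin_id,            # Origin ID: unique within the distribution
        "DomainName": domain_name,  # Origin Domain Name
        "OriginPath": "",           # left empty, as in the walkthrough
        "CustomOriginConfig": {
            "HTTPPort": 80,
            "HTTPSPort": 443,
            "OriginProtocolPolicy": protocol_policy,
        },
    }


if __name__ == "__main__":
    # Hypothetical values, not taken from a real setup.
    print(build_custom_origin("my-website-origin", "origin.example.com", True))
```

The third allowed value for OriginProtocolPolicy is "match-viewer", which corresponds to the Match Viewer option in the console.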

Default Cache Behavior Settings

Set the Viewer Protocol Policy according to your needs. If your website is only on HTTP, keep the default (HTTP and HTTPS). If you want your website to be accessed only over HTTPS, select the second option (Redirect HTTP to HTTPS).

Cache Based on Selected Request Headers -> If you are using HTTP for your website, leave this at the default (None). If you are using HTTPS for your website, select Whitelist, click on Host in the list of headers and then click on Add>> to whitelist it. This is important because, if users connect to your website at, for example, https://xyz.com and your origin has an SSL certificate for that domain, CloudFront will forward the Host header, which tells the origin that the request is for https://xyz.com.
In my case, I had an Elastic Beanstalk environment with an SSL certificate for https://xyz.com on its load balancer. See the section on request routing above for how the connection is made to your web server.

Object Caching -> If you select Use Origin Cache Headers, the TTL is taken from the caching headers (such as Cache-Control) that your origin sends with its responses. If you select Customize, you can specify how long objects stay cached. The section on creating cache behaviors shows how to cache different kinds of objects.
Leave the rest of the settings at their defaults.
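
The same choices can be written down as the DefaultCacheBehavior part of a DistributionConfig. The sketch below uses the ForwardedValues style of configuration that matches the console options described here; it is a partial example, since the real API expects a few more fields. The function name and the TTL numbers are my assumptions.

```python
from typing import List


def build_default_cache_behavior(
    target_origin_id: str,
    https_site: bool,
    default_ttl_seconds: int = 86400,  # assumed example value (one day)
) -> dict:
    """Partial sketch of a default cache behavior; not a complete API payload."""
    # For an HTTPS site, whitelist the Host header so the origin sees the real domain.
    forwarded_headers: List[str] = ["Host"] if https_site else []
    headers: dict = {"Quantity": len(forwarded_headers)}
    if forwarded_headers:
        headers["Items"] = forwarded_headers
    return {
        "TargetOriginId": target_origin_id,
        # "allow-all" is HTTP and HTTPS; "redirect-to-https" is Redirect HTTP to HTTPS.
        "ViewerProtocolPolicy": "redirect-to-https" if https_site else "allow-all",
        "ForwardedValues": {
            "QueryString": False,
            "Cookies": {"Forward": "none"},
            "Headers": headers,
        },
        # TTLs are in seconds; these numbers are only illustrative.
        "MinTTL": 0,
        "DefaultTTL": default_ttl_seconds,
        "MaxTTL": 31536000,
    }


if __name__ == "__main__":
    print(build_default_cache_behavior("my-website-origin", https_site=True))
```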

Distribution Settings

Price Class -> Select the regions whose edge locations you want to serve and cache your objects.

Alternate Domain Names -> This is where you specify the domain name through which your website will be accessed. For example, if you want to access your website through the URL https://xyz.com, enter xyz.com here.

SSL Certificate -> If you are using Alternate Domain Names, you have to supply the SSL certificate for that domain here. If you want to get the certificate through AWS Certificate Manager (ACM), make sure you request it in the N. Virginia (us-east-1) region.
Leave the rest of the settings at their defaults.

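At the registrar or DNS provider where you manage your domain, point the domain to the domain name of your CloudFront distribution (the cloudfront.net address shown for the distribution).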
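
The ACM region requirement is easy to get wrong, so here is a small sketch that checks the region of a certificate ARN before building the alias, certificate and price class settings in the shape used by the CloudFront API. The function names and the example ARN are hypothetical; the check only reads the region field of the ARN and does not contact AWS.

```python
def acm_certificate_region(certificate_arn: str) -> str:
    """Return the region field of an ACM certificate ARN."""
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = certificate_arn.split(":")
    if len(parts) < 6 or parts[0] != "arn" or parts[2] != "acm":
        raise ValueError(f"not an ACM certificate ARN: {certificate_arn!r}")
    return parts[3]


def build_distribution_settings(
    alias: str, certificate_arn: str, price_class: str = "PriceClass_All"
) -> dict:
    """Sketch of the Aliases, ViewerCertificate and PriceClass settings."""
    if acm_certificate_region(certificate_arn) != "us-east-1":
        raise ValueError("CloudFront needs the ACM certificate to be in us-east-1 (N. Virginia)")
    return {
        "Aliases": {"Quantity": 1, "Items": [alias]},  # Alternate Domain Names
        "ViewerCertificate": {
            "ACMCertificateArn": certificate_arn,
            "SSLSupportMethod": "sni-only",
        },
        "PriceClass": price_class,
    }


if __name__ == "__main__":
    # Made-up account id and certificate id, for illustration only.
    arn = "arn:aws:acm:us-east-1:123456789012:certificate/example-id"
    print(build_distribution_settings("xyz.com", arn))
```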


Creating cache behaviors

Once you have created your distribution, click on it, go to Behaviors and then click on Create Behavior. Cache behaviors tell CloudFront what to cache, how long to cache it, and so on.

Path Pattern -> This tells CloudFront which objects the behavior applies to. You can write the path to a specific object, like images/logo.jpg, or write *.jpg to match all JPG files. The same can be done for other files such as JavaScript and CSS. If you change these files on your web server, you can invalidate the cache so that the new files get cached.

Viewer Protocol Policy -> Follow the same logic we used earlier.

Cache Based on Selected Request Headers -> Follow the same logic we used earlier. If the objects need to be loaded over HTTPS, whitelist the Host header.

Object Caching -> For static files such as images, you can set a long cache time. Select Customize and enter a high value for each TTL (these values are in seconds).
Leave the other settings at their defaults.
Click on Create.
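
To get a feel for how path patterns select a behavior, here is a rough approximation in Python. It assumes behaviors are checked in the order they are listed, with the default behavior (*) as the fallback, and it uses fnmatchcase from the standard library as a stand-in for CloudFront's own matching. The two are not identical (fnmatch also understands [...] character sets), so treat this as a sketch. The function name pick_behavior is made up.

```python
from fnmatch import fnmatchcase
from typing import List


def pick_behavior(request_path: str, ordered_patterns: List[str]) -> str:
    """Return the first pattern that matches the request path, else the default "*"."""
    path = request_path.lstrip("/")
    for pattern in ordered_patterns:
        # Matching is case-sensitive; a leading slash on the pattern is ignored here.
        if fnmatchcase(path, pattern.lstrip("/")):
            return pattern
    return "*"  # the default cache behavior


if __name__ == "__main__":
    patterns = ["images/logo.jpg", "*.jpg", "*.css"]
    for path in ["/images/logo.jpg", "/photos/cat.jpg", "/index.html"]:
        print(path, "->", pick_behavior(path, patterns))
```

Because the first matching pattern wins, more specific patterns such as images/logo.jpg should be listed before broader ones such as *.jpg.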


Error pages

You can have CloudFront return an object to the viewer (for example, an HTML file) when your Amazon S3 or custom origin returns an HTTP 4xx or 5xx status code to CloudFront. You can also specify how long an error response from your origin or a custom error page is cached in CloudFront edge caches. You can keep different error pages, in S3 or another location, for different status codes. For example, if your web server is not functioning properly and returns a 5xx status code, you can show a dedicated error page for it.
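
In API terms, these error pages are the CustomErrorResponses part of the distribution configuration. A minimal sketch of that structure follows; the helper name, the page paths and the 60 second caching time are assumptions for the example.

```python
from typing import Dict


def build_custom_error_responses(pages: Dict[int, str], error_ttl_seconds: int = 60) -> dict:
    """Map origin status codes to custom error pages (sketch, example values)."""
    items = [
        {
            "ErrorCode": code,                        # status code returned by the origin
            "ResponsePagePath": page_path,            # page that CloudFront serves instead
            "ResponseCode": str(code),                # status code sent to the viewer
            "ErrorCachingMinTTL": error_ttl_seconds,  # how long to cache the error
        }
        for code, page_path in sorted(pages.items())
    ]
    return {"Quantity": len(items), "Items": items}


if __name__ == "__main__":
    # Hypothetical error pages, for example stored in an S3 bucket.
    print(build_custom_error_responses({503: "/errors/503.html", 404: "/errors/404.html"}))
```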


Invalidating cache

Invalidating the cache is important when you update your static content. For example, if you are caching CSS and JavaScript files and you update them on your web server, the changes will not show up on the website until the TTL of the cached objects expires. To force an update, you can invalidate the cache. To do this, click on the CloudFront distribution, then on Invalidations, and then on Create Invalidation. In Object Paths, write the path of the object you need to invalidate, for example /css/* to invalidate everything under /css (paths are case-sensitive, so match the case of your file names).
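
An invalidation can also be created from a script, which is handy after a deployment. The sketch below builds the InvalidationBatch structure and passes it to boto3's create_invalidation call. The helper names are mine, the distribution id is a placeholder you would replace with your own, and the API call itself needs boto3 installed and AWS credentials configured.

```python
import time
from typing import List


def build_invalidation_batch(paths: List[str], caller_reference: str = "") -> dict:
    """Build the InvalidationBatch payload for a list of object paths."""
    # Write paths with a leading slash, as in the documented examples.
    normalized = [p if p.startswith("/") else "/" + p for p in paths]
    return {
        "Paths": {"Quantity": len(normalized), "Items": normalized},
        # Any unique string; a timestamp is a common choice.
        "CallerReference": caller_reference or str(time.time()),
    }


def create_invalidation(distribution_id: str, paths: List[str]) -> dict:
    """Send the invalidation to CloudFront (requires boto3 and AWS credentials)."""
    import boto3  # third-party package, imported here so the helper above works without it

    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )


if __name__ == "__main__":
    print(build_invalidation_batch(["/css/*", "index.html"], caller_reference="deploy-1"))
```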

Geo Restriction 

If you need to prevent users in selected countries from accessing your content, you can specify either a whitelist (countries where they can access your content) or a blacklist (countries where they cannot). For more information, see Restricting the Geographic Distribution of Your Content in the Amazon CloudFront Developer Guide.
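
As a sketch, the whitelist and blacklist options correspond to the Restrictions part of the distribution configuration, shown below together with a small function that mimics the resulting allow or deny decision. The helper names are hypothetical and the country codes are only examples.

```python
from typing import Iterable


def build_geo_restriction(restriction_type: str, country_codes: Iterable[str] = ()) -> dict:
    """Build the Restrictions structure; the type is whitelist, blacklist or none."""
    if restriction_type not in ("whitelist", "blacklist", "none"):
        raise ValueError(f"unknown restriction type: {restriction_type!r}")
    codes = sorted({code.upper() for code in country_codes})  # two-letter country codes
    restriction: dict = {"RestrictionType": restriction_type, "Quantity": len(codes)}
    if codes:
        restriction["Items"] = codes
    return {"GeoRestriction": restriction}


def is_country_allowed(country_code: str, restrictions: dict) -> bool:
    """Mimic the decision: whitelist allows only listed countries, blacklist blocks them."""
    geo = restrictions["GeoRestriction"]
    listed = country_code.upper() in geo.get("Items", [])
    if geo["RestrictionType"] == "whitelist":
        return listed
    if geo["RestrictionType"] == "blacklist":
        return not listed
    return True


if __name__ == "__main__":
    rules = build_geo_restriction("whitelist", ["in", "us"])
    print(rules, is_country_allowed("IN", rules), is_country_allowed("FR", rules))
```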

Troubleshooting

If you are not seeing the changes you made to your website, try invalidating the cache.

If you get a 502 error, there may be a problem with the connection between CloudFront and your origin. If you are using HTTPS, check that the SSL certificate is set up correctly. Also check that you have whitelisted the Host header under Cache Based on Selected Request Headers.

