Friday, 19 October 2018

API Gateway Authentication with Amazon Cognito


Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. With a few clicks in the AWS Management Console, you can create an API that acts as a “front door” for applications to access data, business logic, or functionality from your back-end services, such as workloads running on Amazon Elastic Compute Cloud (Amazon EC2), code running on AWS Lambda, or any web application.

Amazon API Gateway handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, authorization, and access control, monitoring, and API version management. Amazon API Gateway has no minimum fees or startup costs. You pay only for the API calls you receive and the amount of data transferred out.


Different authentication methods for API Gateway

There are multiple ways to authenticate the requests to an API in API Gateway. These are:


  1. Resource policies let you create resource-based policies to allow or deny access to your APIs and methods from specified source IP addresses or VPC endpoints.
  2. Standard AWS IAM roles and policies offer flexible and robust access controls that can be applied to an entire API or individual methods.
  3. Cross-origin resource sharing (CORS) lets you control how your API responds to cross-domain resource requests.
  4. Lambda authorizers are Lambda functions that control access to your API methods using bearer token authentication as well as the information described by headers, paths, query strings, stage variables, or context variables request parameters.
  5. Amazon Cognito user pools let you create customizable authentication and authorization solutions.
  6. Client-side SSL certificates can be used to verify that HTTP requests to your backend system are from API Gateway.
  7. Usage plans let you provide API keys to your customers — and then track and limit usage of your API stages and methods for each API key.


Cognito

What is Cognito and Cognito User Pool?

Amazon Cognito lets you add user sign-up, sign-in, and access control to your web and mobile apps quickly and easily.

Amazon Cognito User Pools provide a secure user directory that scales to hundreds of millions of users. As a fully managed service, User Pools are easy to set up without any worries about standing up server infrastructure.


Creating a Cognito User Pool

To create a User Pool go to the Cognito User Pool console and follow these steps.


1. Set the name of the user pool
2. Click on review defaults and then on App clients.
3. On the App Clients page, click on Add an app client.
4. Enter the name of the client. Uncheck Generate client secret; we won't be needing it. Check the last option - Enable username-password (non-SRP) flow for app-based authentication (USER_PASSWORD_AUTH). Click on Create Client App.



5. Click on return to pool details and then click on Create Pool.
6. Once the user pool is created, go to App client settings under App Integration.
7. Check the Cognito User Pool checkbox and enter your callback URLs. These are the URLs that Cognito will redirect to after sign-in/sign-up.
8. Under Allowed OAuth Flows check Authorization code grant and Implicit grant.
9. Under Allowed OAuth Scopes check email and openid.
10. Click on Save Changes.



11. Now go to Domain Name under App Integration.
12. In Domain prefix, enter the prefix of the domain you want for the sign-in/sign-up page. After entering a prefix that is available, click on Save changes.
13. At this point, you have a sign-in/sign-up page where users can sign in or register. The URL for this page will be of the following format:

https://your-domain-prefix.auth.us-east-1.amazoncognito.com/login?response_type=token&client_id=asd233wde232da23&redirect_uri=https://google.com

Replace your-domain-prefix with the domain prefix you set in step 12, client_id with the client ID of the app client you created in step 4, and redirect_uri with the callback URL you set in step 7.





14. Once you log in or sign up through this page, you will be redirected to your callback URL. In the redirect URL you will see that Cognito has appended some key-value pairs. The id_token is the token you will need to authenticate your requests to API Gateway. The token is in JWT format, which is explained below.
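If you want to pull the id_token out of that redirect URL programmatically, here is a minimal Python sketch. The URL and token values are placeholders; note that with the implicit grant the tokens arrive in the URL fragment (after the #), not in the query string.

from urllib.parse import urlparse, parse_qs

# Hypothetical redirect URL copied from the browser after signing in.
redirect_url = (
    "https://example.com/#id_token=eyJraWQiExample&access_token=eyJraWQiExample"
    "&expires_in=3600&token_type=Bearer"
)

# Cognito's implicit grant puts the tokens in the fragment, so parse that part.
fragment = urlparse(redirect_url).fragment
params = parse_qs(fragment)

id_token = params["id_token"][0]
print(id_token[:40], "...")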



Enabling Authentication in API Gateway

1. Go to your API in API Gateway. Make sure CORS is enabled.
2. Under your API, go to Authorizers, and click on Create New Authorizer.
3. Enter the name of your Authorizer.
4. In type select Cognito.
5. Under Cognito User Pools, select the user pool you created.
6. In Token Source, write Authorization, and click on create.



7. Now go to your API -> Resources ->  method -> Method Request.
8. Under Settings, click on the pencil next to Authorization, select the authorizer you just created from the drop-down, and save the settings. If you don't see your authorizer, try refreshing the page.
9. Deploy the API.
10. At this point, your API will only accept requests that include a valid id token. If you want to know more about the format of the token and how to store it, read the following sections about JWT.
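To give an idea of what an authenticated call looks like, here is a minimal Python sketch using the requests library. The invoke URL and token are placeholders, and the header name matches the Token Source (Authorization) configured in step 6.

import requests

# Hypothetical values -- replace with your API's invoke URL and the id_token
# captured from the Cognito redirect.
api_url = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/items"
id_token = "eyJraWQiExample"

# The id token goes in the Authorization header; without it (or with an
# expired token) API Gateway responds with 401 Unauthorized.
response = requests.get(api_url, headers={"Authorization": id_token})
print(response.status_code, response.text)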


JSON Web Tokens (JWT) 

JSON Web Tokens (JWT) and OAuth 2.0 are among the most common ways to authenticate APIs.

JSON Web Tokens are an open, industry standard RFC 7519 method for representing claims securely between two parties. They are a great authentication mechanism. They give you a structured and stateless way to declare a user and what they can access. They can be cryptographically signed and encrypted to prevent tampering on the client side.

Visualizing/decoding JWT data

The token you get from Cognito is in JWT format, so you can't directly read the data in it. To see the data it contains, go to the following website and paste your token:
https://jwt.io/
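If you would rather decode it locally, the payload is just URL-safe base64-encoded JSON. Here is a small Python sketch that decodes the claims without verifying the signature (API Gateway does the actual signature verification for you):

import base64
import json

def decode_jwt_payload(token):
    """Decode the middle (payload) part of a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWTs use URL-safe base64 without padding, so add the padding back first.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# 'id_token' is the (hypothetical) token taken from the Cognito redirect URL.
# claims = decode_jwt_payload(id_token)
# print(claims["email"], claims["exp"])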

Where to Store your JWTs – Cookies vs local storage

Once you are authenticated with Cognito, you get the token in the URL. You need to store it safely so that you can attach it to every API call.
There are two places where you can easily store the JWT: localStorage and cookies. Both have their advantages and disadvantages, so let's look at each of them.

Local Storage:

Web Storage (localStorage/sessionStorage) is accessible through JavaScript on the same domain. This means that any JavaScript running on your site will have access to web storage, and because of this it can be vulnerable to cross-site scripting (XSS) attacks. We usually import multiple JS files on a website; if even one of them is compromised, the site becomes susceptible to XSS. With local storage, we also can't apply restrictions such as HTTPS-only or prevent JavaScript from reading it.

Cookie Storage

Cookies, when used with the HttpOnly cookie flag, are not accessible through JavaScript and are immune to XSS. You can also set the Secure cookie flag to guarantee the cookie is only sent over HTTPS. This is one of the main reasons cookies have historically been used to store tokens or session data. The HttpOnly flag is set by the server, not by JavaScript on the front end. This becomes a problem for us because we need to access cookies from JavaScript to store the token we get.
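For completeness, here is a minimal Django sketch of how a backend would set such a cookie. The view and field names are hypothetical, and note that in the implicit flow described above the token lands in the browser, so this only applies if your own backend issues or relays the token.

# views.py (hypothetical)
from django.http import HttpResponse

def store_token(request):
    response = HttpResponse("token stored")
    response.set_cookie(
        "id_token",
        request.POST.get("id_token", ""),  # hypothetical form field
        httponly=True,   # not readable from JavaScript, which mitigates XSS
        secure=True,     # only sent over HTTPS
        max_age=3600,    # roughly match the token's lifetime
        samesite="Lax",  # helps against CSRF (more on CSRF below)
    )
    return response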

Cookies are vulnerable to a different type of attack: cross-site request forgery (CSRF). A CSRF attack is a type of attack that occurs when a malicious web site, email, or blog causes a user’s web browser to perform an unwanted action on a trusted site on which the user is currently authenticated. This is an exploit of how the browser handles cookies. A cookie can only be sent to the domains in which it is allowed. CSRF can be prevented easily so this should not be a big problem.

Few things to keep in mind:

Things you store in cookies are sent to the server on every call (if the server and the page setting the cookies are on the same domain). So if you want to store something that is not required by the server, save it in local storage; otherwise you would be wasting bandwidth by sending extra data with every request.

You may need to share cookies between subdomains if you are setting cookies and sending a request in different subdomains.

The id token you get from Cognito has a short lifespan. If the id token has expired, the request will fail, at which point you can ask the user to log in again. You can also set a matching expiry on the cookie.




Sunday, 26 August 2018

Debugging database latency on AWS RDS

About

I was recently debugging high latency on a website I am working on, and I would like to share how I did it in case it is helpful to someone. In my case, the database was the bottleneck, so most of this post is about debugging database latency.
The website runs on Elastic Beanstalk, and the database is Postgres on RDS.

What causes high latency?

High latency can be caused by several factors, such as:
  • Network connectivity
  • ELB configuration
  • Backend web application server issues including but not limited to:
    • Memory utilization – One of the most common causes of web application latency is when most or all available physical memory (RAM) has been consumed on the host EC2 instance.
    • CPU utilization – High CPU utilization on the host EC2 instance can significantly degrade web application performance and in some cases cause a server crash.
    • Web server configuration – If a backend web application server exhibits high latency in the absence of excessive memory or CPU utilization, the web server configuration should be reviewed for potential problems.
    • Web application dependencies – If a backend web application server exhibits high latency after ruling out memory, CPU, and web server configuration issues, then web application dependencies such as external databases or Amazon S3 buckets may be causing performance bottlenecks.

In this post, I will not cover how to solve network connectivity or ELB configuration issues. If you think one of those is the problem, please see links 1 and 2 below to troubleshoot them.


Finding the cause of latency

Start by looking at the hardware your application is running on to see if it is causing the latency. Check CPU utilization, memory utilization and disk usage in CloudWatch. The CPU metric is available by default; others, like memory, are not, but you can add them manually. You can also SSH into your instances and check the metrics there. If any of the metrics is constantly high, you might want to look into upgrading your instance type or instance class. I cover CPU and disk I/O briefly under debugging database latency below.
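If you prefer pulling these numbers from code instead of the console, here is a small boto3 sketch that fetches the last hour of CPU utilization for an EC2 instance; the region and instance id are placeholders.

from datetime import datetime, timedelta
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,                      # one data point per 5 minutes
    Statistics=["Average", "Maximum"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"], point["Maximum"])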

If you see all these metrics are within normal range, the issue might be some external dependencies like the database. 

If you think the reason for the slow load time of your website is static files like JS, CSS, images, videos, etc., you should use a CDN (content delivery network) like CloudFront.


Debugging database latency


CPU and storage

Check your CPU utilization. If it is constantly high or frequently hits full capacity, it might be a good idea to look at your instance class and upgrade it if required. While selecting an instance class, keep in mind what your application needs (I/O, CPU or memory).

Storage

If your database makes a lot of I/O requests, you might want to upgrade the storage of your DB instance from General Purpose SSD (gp2) to Provisioned IOPS SSD. gp2 volumes get a baseline of 3 IOPS per GiB of storage, with a minimum of 100 IOPS for volumes up to about 33 GiB; the baseline grows with volume size up to a maximum of 10,000 IOPS, which volumes of roughly 3,334 GiB and larger reach.


How do you know if you need more IOPS?

Look at the DiskQueueDepth metric in CloudWatch. DiskQueueDepth shows the number of I/O requests in the queue waiting to be serviced. These are I/O requests that have been submitted by the application but have not been sent to the device because the device is busy servicing other requests. Higher values mean more requests are waiting in the queue.

If your write throughput increases along with the disk queue depth, the database is making a lot of write requests (insert/update).
If your read throughput increases along with the disk queue depth, the database is making a lot of read requests (select).

CPU Utilization

If you are using T2 instances, you earn CPU credits per hour. See link [3] to find out how many CPU credits each instance size earns per hour, and check your baseline CPU utilization.
For example, a t2.large instance has two vCPUs and earns 36 credits per hour, resulting in a baseline performance of 30% (18/60 minutes) per vCPU.

CPU credits get depleted when CPU utilization runs above the baseline for a period of time. Conversely, if a burstable performance instance uses fewer CPU resources than its baseline (such as when it is idle), the unspent CPU credits accrue in the CPU credit balance. If the instance needs to burst above the baseline performance level, it spends the accrued credits. The more credits a burstable performance instance has accrued, the longer it can burst beyond its baseline when more performance is needed.

So if the CPU credit balance is low, your DB instance won't be able to burst above its baseline. If this is the problem you are having, look into moving the instance from the T class to M or another class according to your needs.




Debugging Slow queries

This is one of the main causes of database latency I have seen so far. If you have inefficient queries or haven't indexed your tables properly, you might see extremely high query execution times.

To see how much time each query is taking, you can look at the database logs. Note that logging of query execution times is turned off by default; you need to enable it in the parameter group of your database. Once enabled, go to the logs, see how much time the queries are taking, and identify the slow ones.
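On a Postgres RDS instance, one way to do this is to set log_min_duration_statement in your custom parameter group so that any statement slower than the threshold gets logged. Here is a hedged boto3 sketch; the parameter group name and threshold are placeholders.

import boto3

rds = boto3.client("rds", region_name="us-east-1")

rds.modify_db_parameter_group(
    DBParameterGroupName="my-postgres-params",   # your custom parameter group
    Parameters=[
        {
            "ParameterName": "log_min_duration_statement",
            "ParameterValue": "500",              # log statements slower than 500 ms
            "ApplyMethod": "immediate",           # dynamic parameter, no reboot needed
        }
    ],
)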

Once you have the slow queries, run them in pgAdmin with the EXPLAIN command (this may differ depending on the database you are using; I did this on a Postgres database). This will help you see which part of the query is taking time. If a sequential scan is happening on a large table, you might want to index that table on the search column. Creating indices on search columns can bring query execution time down a lot; for me, it went from a few seconds to less than 100 ms.
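You can also run EXPLAIN from code rather than pgAdmin. Here is a sketch with psycopg2 against a hypothetical orders table; the endpoint, credentials, table and column names are all placeholders.

import psycopg2

conn = psycopg2.connect(
    host="mydb.xxxxxx.us-east-1.rds.amazonaws.com",  # placeholder RDS endpoint
    dbname="mydb", user="myuser", password="secret",
)

with conn, conn.cursor() as cur:
    # Look for "Seq Scan" on a large table in the plan output.
    cur.execute("EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = %s", (42,))
    for (line,) in cur.fetchall():
        print(line)

    # If the planner is doing a sequential scan, index the search column.
    cur.execute("CREATE INDEX IF NOT EXISTS idx_orders_customer ON orders (customer_id)")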

If you are doing a lot of insert statements, do a batch insert instead of inserting rows one by one. This can also bring execution time down a lot.
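As an illustration, here is a sketch of a batch insert with psycopg2's execute_values, reusing the connection from the sketch above (the table and columns are again hypothetical):

from psycopg2.extras import execute_values

rows = [(1, "pending"), (2, "shipped"), (3, "pending")]

with conn, conn.cursor() as cur:
    # One round trip for many rows instead of one INSERT per row.
    execute_values(
        cur,
        "INSERT INTO orders (customer_id, status) VALUES %s",
        rows,
        page_size=1000,   # rows sent per generated statement
    )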

Enabling Enhanced Monitoring

CloudWatch collects statistics at 60-second intervals, so a resource spike that happens within those 60 seconds might not show up in CloudWatch. Enhanced Monitoring provides much more granular data points and gives more insight; IOPS or CPU spikes may well have gone above what the CloudWatch data points show.
Enable Enhanced Monitoring for your RDS instance, as it provides granular data (down to 5-6 second intervals) that is helpful for analyzing IOPS and CPU credits.
Please note that there is an additional cost associated with enabling Enhanced Monitoring.

View processes running during high CPU consumption

Along with Enhanced Monitoring, you can use the pg_stat_activity view to see the processes running during periods of high CPU consumption. This helps you identify queries that need optimization: if the Enhanced Monitoring process list shows the PIDs of queries with high CPU utilization, you can map those PIDs in real time to the queries shown by pg_stat_activity. The command is as follows:
 
Select * from pg_stat_activity;  

See queries execution time in Postgres

You can also use the pg_stat_statements view, which lets you list queries by total_time to see which query spends the most time in the database, and list queries with the total number of calls, total rows, rows returned, etc.

Enable PG_STAT_STATEMENTS:
                
                ○ Modify your existing custom parameter group and set the following settings:
                ---------------------------------------------
                shared_preload_libraries = pg_stat_statements
                track_activity_query_size = 2048
                pg_stat_statements.track = ALL
                pg_stat_statements.max = 10000
                ---------------------------------------------
                
                ○ Apply the parameter group and restart the RDS Instance.
                ○ Run "CREATE EXTENSION pg_stat_statements" on the database you want to monitor.
                
Once pg_stat_statements is set up, you can start querying it in the following ways:
                
                ○ List queries by total_time & see which query spends the most time in the database:
                ------------------------------------------------------
                SELECT round(total_time*1000)/1000 AS total_time,query
                FROM pg_stat_statements
                ORDER BY total_time DESC;
                ------------------------------------------------------
                
                ○ List queries with total no. of calls, total rows & rows returned etc:
                ----------------------------------------------------------------
                SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit /
                nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
                FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;
                ----------------------------------------------------------------
                
                ○ List queries on 'per execution' basis & try to sample them over time:
                -----------------------------------------------------------
                SELECT queryid, query, calls, total_time/calls, rows/calls,
                temp_blks_read/calls, temp_blks_written/calls
                FROM pg_stat_statements
                WHERE calls != 0
                ORDER BY total_time DESC LIMIT 10;
                -----------------------------------------------------------
                
Once you identify the culprit queries, run EXPLAIN or EXPLAIN ANALYZE on those statements and try to tune them by examining the execution plan, as we did above.


If you are having a hard time solving the issue, try contacting AWS Support. I contacted them and they were really helpful. Keep in mind that this can take a couple of days.

Resources:

  1. https://aws.amazon.com/premiumsupport/knowledge-center/elb-latency-troubleshooting/
  2. https://aws.amazon.com/premiumsupport/knowledge-center/elb-connectivity-troubleshooting/
  3. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/t2-credits-baseline-concepts.html
  4. https://www.postgresql.org/docs/9.2/static/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW



Friday, 27 July 2018

Delivering your content faster with CloudFront

About

In this post, I will write about how you can connect your website to CloudFront and make it load faster across the world. I will also show how to connect your website to CloudFront if it uses SSL.


What is CloudFront?

Amazon CloudFront is a global content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to your viewers with low latency and high transfer speeds. CloudFront is integrated with AWS – including physical locations that are directly connected to the AWS global infrastructure, as well as software that works seamlessly with services including AWS Shield for DDoS mitigation, Amazon S3, Elastic Load Balancing or Amazon EC2 as origins for your applications, and Lambda@Edge to run custom code close to your viewers.


What is a CloudFront edge location?

CloudFront delivers your content through a worldwide network of data centers called edge locations. Regional edge caches sit between your origin web server and the global edge locations that serve content directly to your viewers. This helps improve performance for your viewers while lowering the operational burden and cost of scaling your origin resources.


How does CloudFront help you deliver content faster?

CloudFront caches your data at its edge locations around the world. When a user requests a page from your website, the edge location checks its cache: if the object is present, it returns the response to the user; otherwise it forwards the request to your web server, which sends back the response. If the requested object was not cached, the CloudFront edge location caches it, depending on your cache behavior settings.


How does the request get routed?

After you configure CloudFront to deliver your content, here's what happens when users request your objects:


  1. A user accesses your website or application and requests one or more objects, such as an image file and an HTML file.
  2. DNS routes the request to the CloudFront edge location that can best serve it, typically the nearest edge location in terms of latency.
  3. In the edge location, CloudFront checks its cache for the requested files. If the files are in the cache, CloudFront returns them to the user. If the files are not in the cache, it does the following:


  1. CloudFront compares the request with the specifications in your distribution and forwards the request for the files to the applicable origin server for the corresponding file type—for example, to your Amazon S3 bucket for image files and to your HTTP server for the HTML files.
  2. The origin servers send the files back to the CloudFront edge location.

As soon as the first byte arrives from the origin, CloudFront begins to forward the files to the user. CloudFront also adds the files to the cache in the edge location for the next time someone requests those files.



Creating a  distribution

Go to the CloudFront home in the AWS console, click on "Create Distribution", and then "Get Started" under "Web".

Origin Settings

Enter the "Origin Domain Name". This is the domain of your website. If you are using Elastic Beanstalk, enter the URL of your Elastic Beanstalk environment.

Leave the Origin Path empty for now.

Enter the Origin ID; this is a unique identifier for this origin within your CloudFront distribution.

For Origin SSL Protocols,  leave it default.

Origin Protocol Policy -> This is how CloudFront makes the connection to your origin. If you have SSL set up, you can use HTTPS Only; if your website does not support HTTPS, select HTTP Only. If your website supports both HTTPS and HTTP, you can select Match Viewer, although if it supports HTTPS I would suggest using HTTPS Only.
Leave the rest of the settings at their defaults.

Default Cache Behavior Settings

Set the Viewer Protocol Policy according to your needs. If your website is just on HTTP, keep the default (HTTP and HTTPS). If you want your website to be accessed only over HTTPS, select the second option (Redirect HTTP to HTTPS).

Cache Based on Selected Request Headers -> If you are using HTTP for your website, leave this at the default (None). If you are using HTTPS, select Whitelist, click on Host in the list of headers, and then click on Add>> to whitelist it. This is important because if you connect to the website at, for example, https://xyz.com and your origin has an SSL certificate for that URL, CloudFront will forward the Host header, which tells the origin that the request is actually for https://xyz.com.
In my case, I had an Elastic Beanstalk environment with an SSL certificate for https://xyz.com on its load balancer. See the request routing section above to understand how the connection is made to your web server.

Object Caching -> If you select Use Origin Cache Headers, you have to specify the TTL through the cache headers of your responses. If you select Customize, you can choose how long the objects are cached. In Creating cache behaviors below you will see how to cache different kinds of objects.
Leave the rest of the settings at their defaults.

Distribution Settings

Price Class-> Select where you want your objects to be cached.

Alternate Domain Names -> This is where you specify the domain from which your website will be accessed. For example, if you want to access your website through https://xyz.com, enter xyz.com here.

SSL Certificate -> If you are using alternate domain names, you have to provide the SSL certificate for that domain here. If you want to get an SSL certificate through AWS Certificate Manager (ACM), make sure you request the certificate in the N. Virginia (us-east-1) region.
Leave the rest of the settings at their defaults.

Finally, at your domain registrar (or DNS provider), point your domain to the CloudFront distribution's domain name.


Creating cache behaviors

Once you have created your distribution, click on it, go to Behaviors, and then click on Create Behavior. Cache behaviors tell CloudFront what to cache, how long to cache it, and so on.

Path Pattern -> This tells CloudFront which objects the behavior applies to. You can write the path to a specific object, like images/logo.jpg, or write *.jpg to match all jpg files. The same can be done for other files like JavaScript and CSS. If you change these files on your web server, you can invalidate the cache and the new files will be cached.

Viewer Protocol Policy -> Do according to the same logic we used earlier.

Cache Based on Selected Request Headers ->  Do according to the same logic we used earlier. If the objects need to be loaded over HTTPS whitelist the Host header.

Object Caching -> For static files such as images you can set a long cache time. Select Customize and enter a high value for the TTLs (these values are in seconds).
Leave the other settings at their defaults.
Click on Create.


Error pages

You can have CloudFront return an object to the viewer (for example, an HTML file) when your Amazon S3 or custom origin returns an HTTP 4xx or 5xx status code to CloudFront. You can also specify how long an error response from your origin or a custom error page is cached in CloudFront edge caches. You can have different error pages, stored on S3 or elsewhere, for different status codes. For example, if your web server is not functioning properly and returns a 5xx status code, you can show a dedicated error page for it.


Invalidating cache

Invalidating the cache is important when you update your static content. For example, if you are caching CSS and JavaScript files and you update them on your web server, the changes will not show on the website until the TTL of the cached objects expires. To force an update, you can invalidate the cache. To do this, click on the CloudFront distribution, then Invalidations, and then Create Invalidation. In Object Paths, write the path of the objects you need to invalidate, for example /css/* to invalidate everything under your css folder (the * wildcard must be the last character in an invalidation path).
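The same invalidation can be created from code. Here is a small boto3 sketch; the distribution id and path are placeholders.

import time
import boto3

cloudfront = boto3.client("cloudfront")

cloudfront.create_invalidation(
    DistributionId="E1ABCDEFGHIJKL",              # placeholder distribution id
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/css/*"]},
        # Any unique string; reusing one replays the same invalidation request.
        "CallerReference": str(time.time()),
    },
)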

Geo Restriction 

If you need to prevent users in selected countries from accessing your content, you can specify either a whitelist (countries where they can access your content) or a blacklist (countries where they cannot). For more information, see Restricting the Geographic Distribution of Your Content in the Amazon CloudFront Developer Guide.

Troubleshooting

If you are not seeing the changes you made to your website, try invalidating the cache.

If you get a 502 error, there may be a problem with the connection between CloudFront and your origin. Check whether the SSL certificate is set up correctly if you are using HTTPS. Also check whether you have whitelisted the Host header under Cache Based on Selected Request Headers.







Sunday, 22 July 2018

Deploying Django project to Elastic Beanstalk part 2

What is Elastic Beanstalk? 

AWS Elastic Beanstalk is an easy-to-use service for deploying and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS.

You can simply upload your code and Elastic Beanstalk automatically handles the deployment, from capacity provisioning, load balancing, auto-scaling to application health monitoring. At the same time, you retain full control over the AWS resources powering your application and can access the underlying resources at any time.

In this post, I will show you how to deploy your code to Elastic Beanstalk through the command line.


Setting up AWS Elastic Beanstalk command line tool

AWS has a command line tool for its elastic beanstalk service which you can install by running the following command.

pip install awsebcli

But before you do this, you should install the AWS CLI.
Please see my previous post, Creating EC2 instance with AWS CLI, to see how to install and set it up.

Deploying your code

Creating the configurations and requirements file

  • Create a folder '.ebextensions' in your Django root folder.
  • Create a config file in this folder and name it Django.config (the name doesn't matter, but the extension should be .config).
  • Add the following content in this file:

option_settings:
  aws:elasticbeanstalk:container:python:
    WSGIPath: django_project_name/wsgi.py

  • Replace django_project_name with the name of your Django project.

Elastic Beanstalk uses the requirements.txt file to see which Python modules you are using. So list all of the modules you are using in requirements.txt and put this file in the root folder of your Django code. You can do this by executing the following command:

pip freeze > requirements.txt

Setting up git

The AWS EB CLI uses git to track which files it should include in the deployment package, so all the files that you want to deploy should be added to git and the changes must be committed.
Go to the root folder of your Django code and execute the following commands to initialize the folder with git:

git init
git add . 
git commit -m "your commit message"

Please note that git add . will add all the files in the current folder and its subfolders. If you don't want this, list the files you want to exclude in a .gitignore file, or add each file you want in the deployment package manually.

Once you are done with git, you need to initialize the directory with Elastic Beanstalk as well and then create an environment.

Initializing Elastic Beanstalk

Execute the following command to initialize:

eb init

When you execute this command it will ask you for options like the application name, region, etc. The first option is the default region where you want your Elastic Beanstalk application to be located. You should select a region close to where your website will be used the most, as that improves load time. If you plan to set up a content delivery network for your website, the region matters less.



 The second thing you have to set up is the application to use. This is the application on Elastic Beanstalk. If you already have an application where you want to deploy the code, select it from the list; otherwise select create new application.




Set up the language you are using. As we are setting this up for Django, it will be Python. The CLI will usually detect this automatically; you just need to confirm it and provide the Python version you are using for your Django project.


Setting up SSH: if you want to SSH into the EC2 instance that Elastic Beanstalk creates, select yes. In my experience it is really helpful to SSH into the instance to see live logs (tail -f). Enter the name of the key file that will be used to SSH into the instance. This file will be saved in your ~/.ssh folder.

Creating an Environment

Use the eb create command to create an environment. You will be asked for your environment name and the load balancer type. Set the name to whatever you want and the load balancer type to classic. You can learn more about load balancers here.
Once you do this, your code will be deployed to Elastic Beanstalk. This may take a few minutes. When it is done, you can use eb open to open the Django project in a web browser.

Updating your code 

Add the changes you made in your code to git and commit them.
Use eb deploy to deploy the latest code.

Troubleshooting

Checking the logs

There are two ways to see the logs. The first is to use the command eb logs; the logs will be displayed in the terminal. The second is to SSH (eb ssh) into the EC2 instance and view the logs with tail, nano, etc.




Wednesday, 11 July 2018

Deploying Django project to Elastic Beanstalk part 1

What is Elastic Beanstalk? 

AWS Elastic Beanstalk is an easy-to-use service for deploying and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS.

You can simply upload your code and Elastic Beanstalk automatically handles the deployment, from capacity provisioning, load balancing, auto-scaling to application health monitoring. At the same time, you retain full control over the AWS resources powering your application and can access the underlying resources at any time.

How do you deploy your code to Elastic Beanstalk?

There are two ways to deploy your Django project to Elastic Beanstalk: through the AWS console, or through the Elastic Beanstalk command line tool. In this post, I will show you how to deploy the Django project through the AWS console. I assume that you already have a working Django project.

Creating a zip of your code

   1. Create a folder '.ebextensions' in your Django root folder.
   2. Create a config file in this folder and name it Django.config (name doesn't matter, the extension should be .config).
   3. Add the following content in this file:

option_settings:
  aws:elasticbeanstalk:container:python:
    WSGIPath: django_project_name/wsgi.py

    4. Replace django_project_name with the name of your Django project.
    5. Add a requirements.txt file in your Django project root folder and list all the required libraries there. You can do this by executing the following command:

pip freeze > requirements.txt

    6. Create a zip of your Django root folder and make sure that the .ebextensions folder is in the zip.

Uploading your code to elastic beanstalk


  1. Go to the Elastic Beanstalk page in the AWS console.
  2. Click on get started
  3. Give a name to your application
  4. Set its platform to python
  5. Select upload your code and upload the zip file that you created in the previous step.
  6. Click on 'create application'.

It will take a few minutes to create an environment for your app.



There is an option to configure more settings before you create the application, but you can change these configurations anytime, so for now let's leave them alone. These configurations cover things like setting up a load balancer, attaching databases, and setting environment variables.

Redeploying previous version of your code

Whenever you deploy (upload) your code to Elastic Beanstalk, it keeps track of the different deployments. So if you ever feel the new code you deployed is not good and you want to revert to a previous version, you can just select that version from the Elastic Beanstalk console. To do this, go to your environment's home page on Elastic Beanstalk and click on the Upload and Deploy button in the middle of the screen.


After you click on Upload and Deploy, click on the Application Versions page. This shows the list of your previous deployments, from which you can select any one and deploy it again.

Troubleshooting

If you get the error 'Your WSGIPath refers to a file that does not exist', go to the Django.config file in the .ebextensions folder and make sure that the WSGI path is right.

You can also check the logs to get detailed information about what is happening on the EC2 instance that Elastic Beanstalk creates to run your code. There are two ways to see the logs. The first and easy way is to go to the environment's web page. On the left-hand side you will see a Logs option; click on it, then on Request Logs, and then select whether you want the last 100 lines or the complete logs. The second way is to SSH into the underlying EC2 instance and read the logs there. I will show how to do this in the second part, where I also write about how to deploy your Django project through the command line.



Sunday, 17 June 2018

Creating EC2 instance with AWS CLI

About

The AWS Command Line Interface (CLI) is a unified tool to manage your AWS services. With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts.

In this post, I will show you how to create an EC2 instance using AWS CLI. The process of creating EC2 with AWS CLI can be divided into the following major parts:

  1. Getting access key and secret access key
  2. Installing and configuring AWS CLI
  3. Create security group 
  4. Create a key pair and change its permissions
  5. Creating EC2 instance
  6. Terminating EC2


Getting access key and secret access key


  1. Go to AWS console home and select IAM
  2. From left navigation bar select users
  3. Click on add user button on top 
  4. Enter the name of the user and in the access type select Programmatic access
  5. Click on permissions button on the bottom right
  6. Click on create group 
  7. Enter the name of the group 
  8. Select the permission you want to give to this user. For creating EC2 give it AmazonEC2FullAccess permission
  9. Click on review and then create user
  10. The new user will be created and you will see its access key ID and secret access key. Copy both of these somewhere, as we will need them later to configure the AWS CLI.


Installing and configuring AWS CLI


  1. AWS CLI can be installed using pip. Run the following command to install it: pip3 install awscli
  2. After it gets installed we need to configure it. To do that run the following command:  aws configure
  3. Enter the AWS Access Key ID and the AWS Secret Access Key you got earlier.
  4. Enter the Default region name according to your preference. Here is the list of regions. You can leave this blank if you want, although if you do, you will have to pass the region in every command.
  5. Enter the Default output format. The AWS CLI supports three different output formats:
  • JSON (json)
  • Tab-delimited text (text)
  • ASCII-formatted table (table)

Create security group 

Execute the following commands to create a security group that allows SSH connections (port 22).

  1. aws ec2 create-security-group --group-name EC2SecurityGroup --description "Security Group for EC2 instances to allow port 22"
  2. aws ec2 authorize-security-group-ingress --group-name EC2SecurityGroup --protocol tcp --port 22 --cidr 0.0.0.0/0


Create a key pair and set its permissions

  1. To SSH into the new EC2 instance we create, we need an SSH key file. Execute the following command to create a new SSH key file.
aws ec2 create-key-pair --key-name MyKeyPair3 --query 'KeyMaterial' --output text --region us-west-2 > MyKeyPair3.pem

      2. We need to change the permissions of the key file we just created. To do that, execute the following command:
          chmod 600 MyKeyPair3.pem


Creating an EC2 instance

Use the following command to create an EC2 instance:

aws ec2 run-instances --region us-west-2 --image-id  ami-32d8124a --key-name MyKeyPair3 --security-group-ids sg-fss3e980  --instance-type t2.micro --placement AvailabilityZone=us-west-2b --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=EC2NAME}]'


To get the image ID, run the following command:
aws ec2 describe-images --region us-west-2
This gives a very large output, so you may want to pipe it through grep.

The 'ResourceType=instance,Tags=[{Key=Name,Value=EC2NAME}]' part sets the name of your new EC2 instance.

The new EC2 instance will take a few minutes to create, after which you can SSH into it using the key file you generated earlier.
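If you would rather do the same launch from Python, here is a boto3 sketch using the same placeholder AMI, key pair and security group ids as the CLI command above:

import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

response = ec2.run_instances(
    ImageId="ami-32d8124a",
    InstanceType="t2.micro",
    KeyName="MyKeyPair3",
    SecurityGroupIds=["sg-fss3e980"],
    Placement={"AvailabilityZone": "us-west-2b"},
    MinCount=1,
    MaxCount=1,
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Name", "Value": "EC2NAME"}],
    }],
)

print(response["Instances"][0]["InstanceId"])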

Terminating your EC2

Run the following command to terminate your EC2:
aws ec2 terminate-instances --instance-ids i-0xxxxxxxxxxxxxxx


Tuesday, 24 April 2018

Creating a new user and restricting him to a specific folder in AWS EC2 Linux


Creating a new website? Need to give someone access to the website folder on your EC2 instance without giving him complete access to your EC2? If the answer is yes, you may find this post helpful.
I had to work on a website with my project partner, and I thought I should use EC2 so that we could both collaborate and see the website live. So I needed to give my friend access so that he could also work on the website, but at the same time I didn't want him to have access to all the other folders I had on the EC2 instance.
The whole process can be divided into the following steps
  1. Create a new group and user
  2. Generating SSH keys to login to EC2 from SFTP
  3. Connecting to EC2 as the new user using FileZilla.

Let's dig deeper to understand how each of these steps works. 

Create a new group and user


  • Start by creating a new group. We will add the user we create in the following steps to this group. Use the following code to create a new group.

sudo addgroup exchangefiles

  • Create the root directory for the group. After creating the directory, we change its permissions to read and execute. You can set this according to your needs. All users in this group will be able to read and execute from this folder; they can write only in their specific folders.
sudo mkdir /var/www/GroupFolder/
sudo chmod g+rx /var/www/GroupFolder/

  • Now create another directory for the user and give it write permission as well. As above, you can set the permissions according to your needs. Also, you don't have to create two different directories; you can create just one directory and give it the permissions you need.
sudo mkdir -p /var/www/GroupFolder/files/
sudo chmod g+rwx /var/www/GroupFolder/files/
  • Assign both these directories to the group we created.
sudo chgrp -R exchangefiles /var/www/GroupFolder/




  • Edit /etc/ssh/sshd_config and make sure to add the following at the end of the file (the Match Group block makes these restrictions apply only to members of the exchangefiles group, so other SSH users keep their normal shell access):
  •   # Apply the following settings only to members of the exchangefiles group.
      Match Group exchangefiles
      # Force the connection to use SFTP and chroot to the required directory.
      ForceCommand internal-sftp
      ChrootDirectory /var/www/GroupFolder/
      # Disable tunneling, authentication agent, TCP and X11 forwarding.  
      PermitTunnel no  
      AllowAgentForwarding no  
      AllowTcpForwarding no  
      X11Forwarding no  





  • Now let's create a new user. 
  • sudo adduser -g exchangefiles obama 
    • If you get a 'command not found' error, it might be because your environment doesn't include the /usr/sbin directory that holds such system programs. The quick fix is to use /usr/sbin/adduser instead of just adduser.
    • Now that we have made the proper changes let's restart ssh so that it can reflect the changes.
    sudo /sbin/service sshd restart
    You are all set. You have created a new user and group, and given the group permissions on the folder. The user can connect only using the SFTP protocol. You can use FileZilla to connect as the new user. When you log in, you will be in the folder you created above and cannot go outside of it.

    Generating SSH keys to login to EC2 from SFTP

    To connect to EC2 as the new user, you first need to create a public and private SSH key pair. The public key will live in the home folder of the new user, and you will download the private key to your system. You will use this private key file in FileZilla to connect to EC2.
    • Go to the home directory of the new user and execute the following commands to create a new folder and set permissions to it.
    cd
    mkdir .ssh
    chmod 700 .ssh
    • Now create a file in .ssh and set its permissions
    touch .ssh/authorized_keys
    chmod 600 .ssh/authorized_keys
    • Now generate your public and private keys using the following command; replace username with the name of the new user that you created.
    ssh-keygen -f username
    • This will generate two files username and username.pub. username is your private key and username.pub is your public key.
    • Copy the public key, and then use the Linux cat command to paste the public key into the .ssh/authorized_keys file for the new user.
    cat username.pub > .ssh/authorized_keys
    • Download the private key file to your local system. This will be used to login using SFTP. 

    Connecting to EC2 as the new user using FileZilla.

    • Open FileZilla and go to File -> Site Manager -> New Site. Enter the details here: the host is the public DNS of your EC2 instance; leave the port empty; change the protocol to SFTP; set the logon type to "Key file"; set the user to the new user you created; and browse to where you downloaded your private key to set it as the key file.
    • Click on connect

    That's it, you have now configured your EC2 instance to give limited access to a user. I hope you liked this post. If you get an error or get stuck at some point, comment below and I will try my best to help you.

    Sunday, 22 April 2018

    Creating Database on AWS and using it on MySQL Workbench

    Relational databases in AWS live under RDS, which stands for Relational Database Service. AWS supports a number of different database engines, such as Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server. You can use the AWS Database Migration Service to easily migrate or replicate your existing databases to Amazon RDS.
    Creating and using a database on AWS can be a little tricky if you are new to it. In this post, I will show you how to create a database in RDS and how to connect to it from MySQL Workbench.

    Steps for creating database instance on AWS RDS


    • click on Launch DB instance
    • Select MySQL and click Next


    • In the next page select Dev/Test MySQL and click Next
    • In Instance specifications, check Only enable options eligible for RDS Free Usage Tier. Checking this restricts the next setting, DB instance class, to db.t2.micro. If you want a bigger database instance, you can uncheck this and select the DB instance class you want. Please keep in mind that you will be billed accordingly.

    • Leave all other options to the default value in Instance specifications.
    • In the Settings part, put in your DB instance identifier, username and password. This username and password will be required to access your database later from MySQL Workbench.

    • Click on Next
    • In the Network & Security part, set Public accessibility to Yes. This setting is required to be able to connect from MySQL Workbench.

    • Leave all other settings to default value.
    • In Database options part, Put in your database name.
    • You can leave all other options on this page to the default value.
    • Click on Launch DB Instance at the bottom of the page.
    • It will take 10-12 minutes to create your DB instance.




    Steps for connecting to AWS RDS DB instance from MySQL workbench

    • On your AWS console go to RDS and click on instances. Here you will see the DB instance you just created. 
    • Click on the DB instance to see its details.
    • On your DB instance page, scroll down to the security groups.



    • Here, in the security group's inbound rule, you will see an IP address. This is the public IP address from which you created the database instance, which means you can only connect to this database instance from that IP address. We want to be able to connect from anywhere, so we need to edit this rule.
    • Click on the security group for Inbound connection.
    • It will open a new page. On this page click on the Inbound tab of the security group.



    • Click on edit. 


    • Remove the IP address from Source and put 0.0.0.0/0. It should look like this.

    • Click on Save to save the settings. You have changed the security settings. You will now be able to connect to this database instance from any IP.


     


    • To connect to a database we need four things: host, port, username and password.
    • To see where your database instance is hosted go to your database instance page. See the Connect part. Here you will see the endpoint and the port. The endpoint is the host. Copy these values.
    • Open MySQL Workbench.
    • Click on the + next to MySQL Connections.


    • Enter any name in Connection Name 
    • Enter the host, port, username.
    • Click on test connection
    • If the host, port and username are right, it will ask for your password; enter it.
    • If the password is right it will show a success message. Now just click on OK to save it.
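    If you later want to connect from code instead of Workbench, here is a minimal Python sketch using PyMySQL; the endpoint, database name and credentials are placeholders for the values from your DB instance page.

import pymysql

connection = pymysql.connect(
    host="mydb.xxxxxx.us-east-1.rds.amazonaws.com",  # the RDS endpoint
    port=3306,
    user="admin",
    password="your-password",
    database="mydb",
)

with connection.cursor() as cursor:
    cursor.execute("SELECT VERSION()")
    print(cursor.fetchone())

connection.close()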