Introduction to AWS services

AWS intro before AWS intro after

DNS (Route 53 service)

it is where you define all the DNS configuration
- where you are going to route traffic to from the internet
- supports health checks on endpoints and any traffic shaping that you want to do
will be used frequently when defining externally facing APIs or endpoints
the DNS will point to a load balancer
DNS can also be point to the API Gateway which then points to the load balancer

Load Balancer (Elastic Load Balancer service)

2 variations
1. application load balancer
  - it operates at the L7 layer
  - for those that want to use content from http headers to route the traffic
2. network load balancer
  - it operates at the L4 layer
  - it is a bit more cost effective and supports higher throughput limits

Web Backend Layer

this is where the request from the frontend touches the backend
services available
- EC2: Elastic Compute Cloud service
  - you rent virtual machines that you pay for by the hour
  - it is convenient because with EC2 machines, you can set them up to do whatever you please
    - can be used to hold backend databases, wordpress blog, rest api, etc.
  - cons is that you have to do a lot of setup and configuration by yourself to use it
- Lambda service
  - it is a serverless compute infrastructure
    - you define, write and deploy functions
      - the functions are snippets of code that can be small or large pieces of code
    - you do not need to worry about the infrastructure compared to EC2
    - AWS will handle the deployment of the application onto a container and scaling the application whenever the number of requests to the lambda function increase
  - it is a paper invocation
    - so its cost effective for application workloads that have bursty traffic patterns
    - it also good for applications that have traffic during the day and almost nothing during the evening
- ECS: Elastic Container Service service
  - there is a variation of it called EKS: Elastic Kubernetes Service
  - it helps you manage containers, setup servers with integrated load balancing and auto scaling, facilitate your deployments to those containers

Application Layer (similar to Web Backend Layer)

it doesn't really change in terms of the compute options
you have the same kind of fundamental building blocks that are available to you regardless whether its your web backend layer or your kind of business logic heavy application layer

(API Gateway service)

helps to facilitate the creation of the rest apis, and hosting those apis
it offers additional functionality on top of just using a load balancer to distribute the traffic to different nodes
features it offers
- api throttling
- authorization on an api
  - useful if you want to build a private api that can only be accessible from users in a user pool
  - the api gateway can be setup to validate a token by integrating with the user pool service or can be defined on your own
- model validations
  - you can define what types of models your api supports, then have that validation performed before the request actually gets to the Web Backend Layer

User Pools (Cognito service)

it allows you to create user pools with features like login and registration
users will sign up for accounts directly within cognito using the hosted UI, or you can integrate with other 3rd party identity providers such as (google, facebook, amazon, etc.)

Cache (Elastic Cache service)

it has 2 flavors
- Memcache
- Radis
  - most popular
it is based on key value lookups
you have to worry about hosting that infrastructure
- you would need to own a cluster of memory optimized nodes
- you have to worry about the maintenance, and alarming of that cluster
  - such as node replacement, hardware failures

Databases

Relational Database

Aurora

most popular because it is more hands-off
it is an in-house built amazon database
- it compatible with both MySQL and PostgreSQL
makes it easier for administration monitoring, auto scaling, storage auto scaling, compute auto scaling
it offers the data rest api that you can use call your database
it also offers a serverless compute model
- you don't need to worry about provisioning any type of hardware behind the scenes
- by using this, it offers auto scaling type of database

RDS (Relational Database Service)

it allows you to select which database configuration you would like
- you can pick from common database such as MySQL, PostgreSQL, Microsoft SQL Server, Oracle, Cassandra, etc.
this is where most customers would be comfortable in
you would need to worry about provisioning any type of hardware behind the scenes

NoSQL Database

DynamoDB

most popular
it is optimized for key value lookups
it is a fully managed database service
- this means you do not need to worry about anything with regards to infrastructure or hardware
- only need to worry about the scaling configuration, while DynamoDB will handle the auto scaling for you behind the scenes

DocumentDB

it is a fully managed service
it is compatible with MongoDB

OpenSearch

it is a new name for ElasticSearch
it is good if you are looking for something that supports more flexible querying at scale
- also allows powerful grouping features, and dynamic grouping
it also comes with Kibana which is an open source dashboarding technology to take a look at your data inside your open search database
it has been used quite effectively as a replacement to RDS in some cases

Packaged Infrastructure

it makes your life easier as a developer
it offers a combination of the different services
it helps to abstract some of the complexity away from you at the sacrifice of configuration and control

Elastic Beanstalk service

it allows you to setup any kind of web application
- can be containerized
- makes it easier for you to set up your apps with different components
  - backend layer, load balancer, auto scaling and monitoring
everything is managed at one spot
you control the infrastructure with Elastic Beanstalk but its similar to an orchestrator service
- where it would go to the services and provision what it needs for the type of application that you are trying to deploy onto it
it is orchestrating the deployment of your EC2 machines, load balancers, and any other stuff that you may need
- however, you still have visibility or insight into that infrastructure
- you still need to worry about maintaining it

App Runner service (new)

all of the lower level components are abstracted away from you
- you just worry about your application configuration and deployment
- app runner will worry about deploying that onto your infrastructure and scaling it if it requires it
behind the scenes, it uses
- ECS service
- fargate service
  - it is a serverless mode for running containers where it uses provision containers that you can specify
  - but you don't need to worry about the infrastructure

Lightsail service

it is another pre-packaged services that makes your life easier
similar to GoDaddy, DigitalOcean
- where you select the type of stack that you want to set up for your application
  - you select the different pre-packaged pricing models
  - you don't have to worry about the details of which node type, where if you use EC2 directly you have to
- you pick the different pre-packaged options for compute and the costs are reasonable
can deploy wordpress blog, or other kinds of applications, or containers
you can add other components
- such as load balancing, auto scaling
it is basically a more simplified version and safer, since it is very beginner friendly

AppSync service

a fully managed graphql as a service
provides you with the graphql functionality where you can use it to integrate with other backend aws services such as DynamoDB
you can also use it to integrate with Lambda functions if you want to have some custom resolvers
it can also scale very well based on the level of traffic and is transparent to you as an administrator

Cached Content

anything that you want to cache
- image files, javascript, html, css, etc.

CloudFront service

it allows you to deploy a CloudFront distribution which replicates some of your content from the General Object Storage to regional nodes that are located all across the world close to your end users
- this helps to get better performance for the static content
great for applications that want to optimize the experience for the user

Deployment Orchestration

Code Commit service

it is for storing your source code
you can either store your source code directly inside Code Commit as a service
- or you can integrate code commit with 3rd party providers such as github
by itself it is not impressive or powerful
- its merits comes from being able to integrate with 3rd party providers

Code Build service

it allows you to take your source that's located in Code Commit or any other 3rd party connected repository and then build that up into artifacts
it also allows you to create and run tests in a test environment for your source code
when you combine that with some other components, you can build sophisticated CICD pipelines that have multiple different steps in terms of running your unit tests and integration tests
it is basically for building and testing your source code

Code Deploy service

it takes the artifacts that are built in the Code Build and then it would deploy it to the other services in the Web Backend Layer or Application Layer

Code Pipeline service

it is the deployment orchestration service
it allows you to define a workflow of the different stages that your application will run through
- e.g.: source code -> build -> test -> deploy to test env -> run another tests -> deploy to prod

Monitoring

need to know where to look to determine when things are going wrong and you need to step in
- or when things are fine

CloudWatch service

kinda like an Umbrella service because it has a lot of different features that are inside
- most important feature is the ability to evaluate your metrics on the AWS services
- i.e.: you can use CloudWatch to view different metrics on your EC2 machines to see over time in chart format
  - such as cpu utilization, memory utilization
- i.e.: for Lambda
  - what are the concurrent number of invocations or all the invocations in 1 day
another useful feature is logging
- many of the applications would be emitting application logs
  - in terms of what your application is doing
- for other services are a kind of managed services that sit on their own
  - often these services will integrate with CloudWatch to give you an Administration level events whenever things are happening on the services if you're using that piece of infrastructure
new feature is Insights
- lets you search over very large volumes of CloudWatch data using similar to a SQL style language
basically it helps you monitor the state of your applications in your AWS

CloudTrail service

it is the audit trail of the operations that are being performed on your infrastructure
- the operations that are being performed and who is performing those operations
- i.e.: a Lambda function calling a database, a user deleting all your infrastructure
it will offer you different types of events that allow you to gain insight into who is accessing different services and what they are doing on those services
- the types of events can either be at the control or administration level
  - its is for when your infrastructure gets provisioned or deleted or modified
- there is also data level events
  - the collections of events is called Trails
    - data level events give you a little more granular data
      - when configured on the DyanamoDB, it will give you log information on every single request that comes to your table
        it is not recommended because it would be chewing through a lot of bandwidth for log storage, but can be enable or disabled
- insights event
  - it is a proactive one where you can configure it to automatically monitor your account
    - AWS uses machine learning to monitor the CloudTrail events for any anomalies
  - useful for being proactive about security threats

Identity and Access Management (IAM) service

it is a security management service for AWS
you create high-level entities such as users or roles
- and associated with these users are policies
  - policies are important because by default a user will not have any kind of permissions to do anything on AWS
    - unless you define a IAM policy that gives that user to perform that action
basically AWS security management uses an implicit deny operation
- you deny access to everything unless someone says otherwise
you create IAM policies and attached these policies to users
- you can assign users to different groups that has policy permission set predefined and apply to anyone in the group
you can also use it to create accounts that users can log in directly
- a developer can have their own account

Infrastructure as code

it is the preferred way to create and manage your infrastructure
no one really goes into the console anymore to create things and manage your infrastructure
- unless it's your first time doing it and you're just experimenting
its more preferred to write your infrastructure in a code format or configuration format that can be easily picked up and deployed to a new environment

Cloud Formation service

it is one of the options that allows you to write your infrastructure in a code format or configuration format that can be easily picked up and deployed to a new environment
it is a service that allows you to write json or yaml configuration files
- then you upload these files to Cloud Formation and it will be responsible for calling other AWS services to provision your infrastructure
- i.e.: you can write a template file that has a DynamoDB in it, and a Lambda function
  - once you upload the file into Cloud Formation, it will go and create your Lambda function and DynamoDB table
- it is convenient and quick, however, it has some downfalls
  - downfalls are writing your infrastructure as yaml or json is not really good
    - CDK solves these problem

Cloud Development Kit (CDK) service

it is a method of writing your infrastructure as code that is a little more fluent for developers
it involves you writing actual code
- allowing you to be more expressive with your infrastructure definition files so that you can be a little more dynamic and structure your code in a much simpler way using functions
it is easy to use Higher Level Constructs
- these constructs can contain an entire application specifications
- you can have a construct that's an entire serverless architecture
  - ie.g.: that contains Lambda function, DynamoDB table, etc.
behind the scenes, CDK does generate the code you write into Cloud Formation, then Cloud Formation is the one that deploys that out into AWS
there are outside AWS options you can use for Infrastructure as codes such as
- i.e.: Terraform, Pulumi
- which can be used to integrate with AWS

Rapid Development

Amplify service

it is a tool kit style service that allows you to rapidly build and deploy entire applications
it is different because it is primarily a Cli tool, and it focuses on the functionality and not on the infrastructure of what you are trying to provision
i.e.: you can run a simple command to add an api
- behind the scenes, Amplify will deploy maybe a Lambda function with an Api Gateway
it allows you to add things such as user authentication and authorization
- behind the scenes it will give you a Cognito user pool
you can add things such as a relational database
- it will give you an Aurora serverless database that you can use
it is good because of its abstraction, which is also a con because of that abstraction
- the debugging of any problems that arises is a nightmare, because you need to dive into the different independent services, which is problem if you do not understand about the AWS services you are trying to debug
if you want to stay within a well-defined box then Amplify is great
- if you want to venture out of that box, writing your own CDK would be recommended, and understand the AWS services

Serverless Application Model (SAM) service

it is great in terms of providing shorthands of common infrastructure setups that would typically be written in Cloud Formation
there is SAM templates that you can use that is similar to what those Higher Order Constructs do in CDK
it can handles much of the complexity of the setup for you
it is also great for local testing of Lambda functions
- can use SAM to build and run your Lambda functions locally before you deploy them into AWS

Event Coordination

it is a pub sub service
it is the pub in the sub
it is responsible for publishing notifications to a topic
- a topic can have many different subscribers
the idea is that a domain model owner such as search query service puts an entry into its database, it wants to notify other services
- those services will use an SNS topic that they publish to to notify other micro-services that something changed
- the subscribers can be many different types of infrastructure
  - you can have other AWS services that are your subscribers such as a Lambda function, you can have a http endpoint that exists on an EC2 instance
basically if you want to tell other people about data or data changes, then use SNS

Simple Queue Service (SQS) Queue service

it is responsible for holding messages so that you can process them at a later time
you define queues which can be connected to from many different types of compute infrastructure
- you can connect queue to a Lambda function or an EC2 machine or an ECS task
- these infrastructure will pull your queue for new messages and then perform some type of action when it finds new messages in the queue
typically people set up an SNS to an SQS
- so the SNS topic being the publisher
- the SQS being the subscriber
basically if you want to be notified of when something changes in someone else's data, use SQS

EventBridge service

it is similar to what SNS offers with some distinct benefits
instead of SNS topics, EventBridge uses this concept of Event Buses
- you can integrate your EventBridge Event Bus with many different kind of application actions all across AWS
- i.e.: you want to integrate your EventBridge with whenever an EC2 machine gets terminated
  - or whenever a Lambda function gets updated
  - or whenever the configuration on your DynamoDB table gets changed
- then you can define rules that specify who to deliver these events to
  - similar to how SNS has subscribers, EventBridge also has subscribers, and you define these rules and target groups of who to deliver these messages to depending on the type of event
the benefits of EventBridge over SNS is that it has 2 important features
1. Schema Discovery
  - if you are using SNS and you are publishing to your topic and someone is getting a message in your SQS queue
    - what does the schema of that message look like
    - what does the format of the message look like
    - is it have a JSON, is it an XML file, does it have foo as a key and bar as a value, etc.
  - EventBridge allows you to define these schema definitions to help subscribers get access to the models that are going to be delivered from the EventBridge Event Bus
  - it also allows you to search through different schemas
2. 3rd party integrations
  - an example of 3rd party integrations that you can work with is Shopify
    - Shopify has native integration with EventBridge
      - what that means is that any time someone places an order on your Shopify e-commerce website, that can be directly integrated into EventBridge
      - then you can have specify rules setup to deliver that notification to a micro-service or backend service that cares about those updates, or to General Object Storage

Step Function service

it allows you to define workflows and different steps
- i.e.: a starting step, x step, y step, etc.
- you can have conditional logic in the step function workflow
- it allows you to build things like customer ordering workflow
you can have a fail safe and conditional logic in that workflow
- so if anything fails, then a different path or a different choice is taken
it offers direct integration with many other AWS services
- you can use a service like Lambda to glue different parts of the workflow together and is serverless
basically it is a kind of orchestration service that allows you to define sophisticated and large workflows that may run through many steps

General Object Storage

Simple Storage (S3) service

it is just your kitchen sink of data storage
very cheap and scalable
you can store exabytes (1k PB) or petabytes (1k TB) of data
you can move data over time into cold storage to get better price points
typically you will store your assets in the S3 and then connect it to your CloudFront distribution so that it can be replicated and serve to all your customers around the world
it can be used to store any type of files
- images, css, videos, etc.
there are file sizes limits, refer to AWS website for more details

Analytical Processing

Elastic MapReduce (EMR) service

it is a large-scale distributed data processing system
it allows you to run many different frameworks including
- Spark clusters, Hive, Presto
- you can also run it in a serverless mode
it is a service where you are going to do just massive number crunching to perform some kind of analytics

Athena service

it is a completely serverless big data processing or analytics service
how it works is that you can keep your data stored in S3
- you don't need to load the data anywhere which is the case for EMR
- Athena will directly connect to your S3 data, crawl your data, automatically detect the schema of the data, and then create tables that you can query using SQL
- so whenever you dispatch a job to Athena it uses AWS infrastructure behind the scenes to prioritize the request
you can run massive queries on data that is already stored in S3

Data Warehouse

place to store data like business intelligence, or any types of analytics that we may want to perform on it
we don't want to store those data in places like DocumentDB or DynamoDB etc. because its not meant for it

Redshift service

it is a Columnar style database that allows you to perform some very large queries concurrently
it can support many users at the same time
it is a little expensive
it offers
- serverless mode
  - pay per use type model
- provision mode
  - where you can provision the nodes in a distributed way
this where you will run your workloads for the kind of OLAP (Online Analytical Processing) style queries
there is a connection you can do with Redshift and S3
- you can just deliver data to S3 and then set up an automatic load job to load that to Redshit so that it can be available for this OLAP style querying
good for data engineers, business intelligence users, anyone that wants to interact with data at scale using SQL

Dashboarding

QuickSight service

it is a tool that is great for end users
you give users their independent logins they can access whether it be in Redshit or S3 or anything else
it can create business facing dashboards

Network Boundary

offers a service that allows you to isolate all of your resources into specific isolated network

Virtual Private Cloud (VPC) service

allows you to create your own VPCs that are basically private networking spaces for your infrastructure to exist in
it is completely separate from all other AWS accounts, and it is just your networking space
you can connect your VPC to other VPC
- so as to be able to talk to other services or different account
you can open up your VPC so that your infrastructure is callable from the public internet
you can have very large VPC that host many different micro-services or service oriented architectures
basically you can use VPC to isolate your resources from any other systems, and also allows you to find some security rules to make sure that your infrastructure is protected from any outside actor

Introduction to AWS services

DNS (Route 53 service)​

Load Balancer (Elastic Load Balancer service)​

Web Backend Layer​

Application Layer (similar to Web Backend Layer)​

(API Gateway service)​

User Pools (Cognito service)​

Cache (Elastic Cache service)​

Databases​

Relational Database​

Aurora​

RDS (Relational Database Service)​

NoSQL Database​

DynamoDB​

DocumentDB​

OpenSearch​

Packaged Infrastructure​

Elastic Beanstalk service​

App Runner service (new)​

Lightsail service​

AppSync service​

Cached Content​

CloudFront service​

Deployment Orchestration​

Code Commit service​

Code Build service​

Code Deploy service​

Code Pipeline service​

Monitoring​

CloudWatch service​

CloudTrail service​

Identity and Access Management (IAM) service​

Infrastructure as code​

Cloud Formation service​

Cloud Development Kit (CDK) service​

Rapid Development​

Amplify service​

Serverless Application Model (SAM) service​

Event Coordination​

Simple Notification Service (SNS) service​

Simple Queue Service (SQS) Queue service​

EventBridge service​

Step Function service​

General Object Storage​

Simple Storage (S3) service​

Analytical Processing​

Elastic MapReduce (EMR) service​

Athena service​

Data Warehouse​

Redshift service​

Dashboarding​

QuickSight service​

Network Boundary​

Virtual Private Cloud (VPC) service​

DNS (Route 53 service)

Load Balancer (Elastic Load Balancer service)

Web Backend Layer

Application Layer (similar to Web Backend Layer)

(API Gateway service)

User Pools (Cognito service)

Cache (Elastic Cache service)

Databases

Relational Database

Aurora

RDS (Relational Database Service)

NoSQL Database

DynamoDB

DocumentDB

OpenSearch

Packaged Infrastructure

Elastic Beanstalk service

App Runner service (new)

Lightsail service

AppSync service

Cached Content

CloudFront service

Deployment Orchestration

Code Commit service

Code Build service

Code Deploy service

Code Pipeline service

Monitoring

CloudWatch service

CloudTrail service

Identity and Access Management (IAM) service

Infrastructure as code

Cloud Formation service

Cloud Development Kit (CDK) service

Rapid Development

Amplify service

Serverless Application Model (SAM) service

Event Coordination

Simple Notification Service (SNS) service

Simple Queue Service (SQS) Queue service

EventBridge service

Step Function service

General Object Storage

Simple Storage (S3) service

Analytical Processing

Elastic MapReduce (EMR) service

Athena service

Data Warehouse

Redshift service

Dashboarding

QuickSight service

Network Boundary

Virtual Private Cloud (VPC) service