# Well-Architected Framework

* Well-architected framework is a set of principles.
* These principles are documented as 5 pillars:
  * Operational Excellence
  * Security
  * Cost Optimization
  * Reliability
  * Performance Efficiency

### General Design Principles

* **Stop guessing capacity needs** - scale up and down as required
* **Automate everything** - automated systems ensure consistency and reliability.
* **Test at scale** - test an accurate replica of production on-demand.
* **Adapt and evolve** - adapt the architecture as needed to meet new challenges.
* **Be data driven** - drive decisions through data.
* **Game days** - practice, practice, practice.&#x20;

### The Five Pillars

<https://aws.amazon.com/blogs/apn/the-5-pillars-of-the-aws-well-architected-framework/>

* **Operational Excellence**
  * Does your architecture work? Will it continue to work?
  * There are six design principles for operational excellence in the cloud:
    * Perform operations as code
    * Annotate documentation
    * Make frequent, small, reversible changes
    * Refine operations procedures frequently
    * Anticipate failure
    * Learn from all operational failures (and success)
  * Prioritize to align with business priorities
    * What is the business goal?
    * What are the critical pieces needed to meet that goal?
    * Any compliance restrictions/requirements?
    * Dependencies between services?
  * Design your architecture to support business priorities
    * Is the design observable?
    * Is the entire design code? Can it be redeployed in even of a failure?
    * Are your logs and observations actionable? Can you derive values from data you're collecting?
  * Is your workload ready to go live
    * Are your processes consistent?
    * Is operational code properly managed?
    * Are tests in place?
    * Are you anticipating failure?
  * Ensure your workloads are actually working
    * Metrics indicate health of each service
    * Metrics show overall health
    * Are you monitoring business metrics too?
  * Responding to events
    * Anticipate planned and unplanned events
    * Respond in code
    * Connect observations with 3rd party tools as needed
  * Learn from success or failure

    * Post-event, have runbooks changed?
    * Are teams evaluating their processes?
    * Test assumptions
    * Experiment early and often to find better solutions

* **Cost Optimization**
  * Spend only what you have to. Deliver business value for the lowest price point.
  * There are five design principles for cost optimization in the cloud:
    * Adopt a consumption model
    * Measure overall efficiency
    * Stop spending money on data center operations
    * Analyze and attribute expenditure
    * Use managed services to reduce cost of ownership
  * Use the appropriate resources and configurations
    * Provision for current needs with an eye to the future
    * "Right size" to lowest resource that meets the needs
    * Use data to choose purchase options
    * Optimize by geography
    * Default to managed services
    * Optimize data transfer
  * Matching supply and demand
  * Know how much you're spending and where
    * Understand your stakeholders
    * Implement a governance model
    * Attribute cost to teams/projects
    * Tag AWS resources
    * Track lifecycle of the resources
  * Continuously work to maximize value delivered

    * Align utilization with requirements
    * Report and validate findings
    * Evaluate new services for value
    * Continue push for managed services, if they're cost-effective

* **Reliability**
  * There are five design principles for reliability in the cloud:
    * Test recovery procedures
    * Automatically recover from failure
    * Scale horizontally to increase aggregate system availability
    * Stop guessing capacity, reduce idle resources
    * Manage change in automation
  * Will this system work consistently and recover quickly
    * Recover from issues automatically
    * Scale horizontally first for resiliency
    * Reduce idle resources
    * Manage change through automation
  * Understand the default and requested limits
    * Are you planning beyond current limits for a resource?
    * Will you scale past specific resource limits?
    * Can those limits be lifted?
    * Can you plan around those limits?
  * Networking
    * IP address space management (are you considering IPv6)
    * Subnets structures
    * Resilient topologies
    * Ability to handle sudden increase in traffic
    * Provide consistent performance regardless (latency)
  * Ensure your application is ready for business use

    * Can users access your application?
    * Deploy without an issue
    * Can you push issue to a planned downtime?
    * Can your application withstand partial outages?

* **Performance Efficiency**
  * There are five design principles for performance efficiency in the cloud:
    * Democratize advanced technologies
    * Go global in minutes
    * Use serverless architectures
    * Experiment more often
    * Mechanical sympathy
  * Remove bottlenecks, reduce waste
    * Let AWS do the work whenever possible
    * Reduce latency through regions and AWS Edge
    * Serverless whenever possible, then containers, only then fall down to instances
    * Experiment as new services are released
    * Think about the user, not your tech stack
  * Is this the optimal solution for this workload
    * What type of compute best suits?
    * Which data store is ideal for this workload?
    * Does your network design complement compute and data store choices?
  * Continuously ensure choices work for your workloads
    * Is infrastructure stored as code?
    * Are deployments simple and automated?
    * Can benchmarks be taken automatically?
    * Does load testing interfere with production?
  * Monitoring

    * Use active and passive monitoring where appropriate
    * Understand the 5 phases of monitoring - generation, aggregation, real-time processing, storage, analytics
    * Create actionable metrics

* **Security**
  * There are six design principles for security in the cloud:
    * Implement a strong identity foundation
    * Enable traceability
    * Apply security at all layers
    * Automate security best practices
    * Protect data in transit and at rest
    * Prepare for security events
  * Does this system work only as intended?
    * Identities have the least privileges required
    * Know who did what and when
    * Security is woven into the fabric of the system
    * Automate security tasks
    * Encrypt all data at rest and in transit
    * Prepare for the worst
  * Look for abnormal behavior in your logs
    * Capture and analyze logs
    * Regularly audit controls and configurations (AWS CloudFormation drift, AWS Config)
  * Defense in depth
    * Establish trust boundaries
    * Protect the network in/out
    * Protect all hosts
    * Configure services to meet security posture needs
    * Enforce service level protection
  * Classify and protect data
    * How sensitive is the data?
    * Who should have access to the data and when?
    * Encrypt in transit and at rest
    * Backup your data, test backups
  * Contain and recover from an unplanned event
    * Do you have a plan to tag affected resources?
    * Can you adjust permissions to allow for containment?
    * Can you redeploy to recover quickly?
    * Did you learn from the incident and adjust?


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://notebook.iuriioapps.com/cloud/aws/well-architected-framework.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
