As AI capabilities find uses in new markets, more companies are turning to the cloud for these high-performance computing workloads. And AWS is opening its arms wider with expanded support for GPU-backed instances to provide those resources, at premium prices.
The P3 Elastic Compute Cloud (EC2) instance, released into general availability last week, improves performance for advanced applications with graphics processing units (GPUs). The P3 instance comes in three sizes: p3.2xlarge, p3.8xlarge and p3.16xlarge, with 1-8 NVIDIA Tesla V100 GPUs, 16-128 GB of GPU memory, 8-64 vCPUs and 61-488 GB of instance memory. The instances also offer enhanced network performance of up to 25 Gbps and 14 Gbps of Elastic Block Store (EBS) bandwidth.
The P3 instance fits advanced workloads such as machine learning, high-performance computing and video processing. It is also one of AWS’ most expensive instances, ranging from $3.06 to $24.48 per hour for On-Demand pricing.
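To put the quoted price range in per-GPU terms, here is a minimal arithmetic sketch. It assumes the low end of the range belongs to the smallest size (one GPU) and the high end to the largest (eight GPUs), which the article implies but does not state; the 730-hour month is a hypothetical planning figure.

```python
# Hourly On-Demand prices in US cents, taken from the quoted price range.
# Assumption: the lowest price maps to the smallest size and the highest
# price to the largest; the article does not pair prices with sizes.
P3_SIZES = {
    "p3.2xlarge": {"gpus": 1, "cents_per_hour": 306},
    "p3.16xlarge": {"gpus": 8, "cents_per_hour": 2448},
}

HOURS_PER_MONTH = 730  # hypothetical planning figure (365 * 24 / 12)


def cost_per_gpu_hour(size: str) -> float:
    """Return the hourly price in dollars divided by the GPU count."""
    spec = P3_SIZES[size]
    return spec["cents_per_hour"] / spec["gpus"] / 100


def monthly_cost(size: str, hours: int = HOURS_PER_MONTH) -> float:
    """Return the dollar cost of running one instance for `hours` hours."""
    return P3_SIZES[size]["cents_per_hour"] * hours / 100


if __name__ == "__main__":
    for name in P3_SIZES:
        print(name, cost_per_gpu_hour(name), monthly_cost(name))
```

Under those assumptions the per-GPU hourly price is the same at both ends of the range, so the larger size buys more GPUs per instance rather than a volume discount.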
Amazon also unveiled new Amazon Machine Images (AMIs) for the P3 instance family. These AWS Deep Learning AMIs include frameworks designed specifically for the NVIDIA Volta V100 GPUs included with the P3 instance family. Developers can use the AMIs to build custom AI models and algorithms.
New features and support
- PostgreSQL compatibility, new features. After months of preview, AWS made PostgreSQL for Amazon Aurora generally available. AWS hopes to entice users to migrate PostgreSQL workloads to Aurora, promising a more scalable, secure and durable managed database service and lower costs. AWS claims PostgreSQL with Aurora has “three times better performance” than standard PostgreSQL databases. Aurora also added the ability to launch R4 instances with a larger cache and faster memory than the previous R3 generation, so a developer can double Aurora’s maximum throughput on MySQL databases.
- New AWS Batch functionality. AWS Batch can now trigger CloudWatch Events when a job transitions from one state to another, so a developer won’t have to poll the state of each Batch job. The event stream feature sends state updates in near real-time, which can route through CloudWatch Events to targets such as AWS Lambda or Amazon Simple Notification Service. AWS also adjusted the service to spin idle EC2 resources down faster in accordance with the cloud provider’s new per-second billing. AWS Batch previously held on to idle resources for the majority of the billing hour to prevent unnecessary instance launches.
- ElastiCache supports Redis encryption. Redis, an open source in-memory database, does not natively support encryption, but AWS now provides that capability for Amazon ElastiCache. The service now enables encryption for personally identifiable information at rest and in transit. At-rest encryption protects Amazon Simple Storage Service (S3) and disk backups, while in-transit encryption protects data communicated between Redis servers and clients.
- Apply Glue via CloudFormation. AWS has included its Glue service, which helps execute ETL jobs, as an option for AWS CloudFormation templates. This support helps IT teams automate AWS Glue functions — such as jobs, triggers and crawlers — to quickly load and prepare data for analytics.
- Address data warehouse demands. Dense compute (DC2) nodes for Amazon Redshift are a second generation of compute clusters designed to reduce latency and boost throughput for demanding data warehouse workloads. The DC2 nodes, which include Intel E5-2686 v4 (Broadwell) CPUs, DDR4 memory and NVMe-based solid state disks, are available for the same price as the previous generation DC1 nodes.
- Use Elasticsearch in a VPC. Amazon Elasticsearch Service (ES) now supports access from an Amazon Virtual Private Cloud (VPC), which removes the need to connect to the service over the public internet. IT teams can now use Elasticsearch, an open source search engine and analytics service, without configuring firewall rules and domain access policies for ES.
- Geographic application restriction. AWS Web Application Firewall now includes an option to restrict access to applications based on geographic location to fulfill licensing requirements and security needs. Geographic Match Conditions allow a business to create a whitelist that admits only visitors from specified countries, or a blacklist that blocks visitors from certain countries.
- CodePipeline takes pushes from CodeCommit. AWS CodeCommit can now send an Amazon CloudWatch Event to AWS CodePipeline to trigger a pipeline, which eliminates the need to poll periodically for code changes.
- ALBs support multiple certificates. Businesses can now host multiple secure HTTPS applications and assign each one a Secure Sockets Layer certificate behind one Application Load Balancer. AWS uses Server Name Indication to allow these apps to run on the same load balancer. This means businesses don’t have to use risky Wildcard or complicated Multi-Domain certificates to run multiple HTTPS apps on one load balancer.
- Migrate to new database sources. The AWS Database Migration Service (DMS) added Azure SQL Database and S3 as sources. S3 was previously supported as a target, but its addition as a source allows teams to freely move data to and from S3 buckets and other DMS sources. Amazon EC2 also now supports Microsoft SQL Server 2017 for extra scalability and performance.
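The AWS Batch item in the list above describes job state changes arriving as events instead of being polled. A minimal sketch of a Lambda-style handler that reacts to them follows; the event field names (`source`, `detail-type`, `detail.status`, `detail.jobName`, `detail.jobId`) are written as I recall them from the AWS event documentation and should be checked against it, and the choice to alert only on `FAILED` is an assumption for illustration.

```python
from typing import Optional

# States an operator would likely want to hear about (assumed choice).
ALERT_STATES = {"FAILED"}


def summarize_batch_event(event: dict) -> Optional[str]:
    """Return an alert message for a job state change, or None to ignore it."""
    if event.get("source") != "aws.batch":
        return None
    if event.get("detail-type") != "Batch Job State Change":
        return None
    detail = event.get("detail", {})
    status = detail.get("status")
    if status not in ALERT_STATES:
        return None
    return f"Batch job {detail.get('jobName')} ({detail.get('jobId')}) is now {status}"


def handler(event, context):
    """Lambda-style entry point: print an alert for matching events."""
    message = summarize_batch_event(event)
    if message is not None:
        print(message)
    return {"alerted": message is not None}
```

In a real deployment the message would more likely be published to a notification topic than printed; that step is left out to keep the sketch free of SDK calls.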
The cost of graphics acceleration can often make the technology prohibitive, but a new AWS GPU instance type for AppStream 2.0 makes that process more affordable.
Amazon AppStream 2.0, which enables enterprises to stream desktop apps from AWS to an HTML5-compatible web browser, delivers graphics-intensive applications for workloads such as creative design, gaming and engineering that rely on DirectX, OpenGL or OpenCL for hardware acceleration. The managed AppStream service eliminates the need for IT teams to recode applications to be browser-compatible.
The newest AWS GPU instance type for AppStream, Graphics Design, cuts the cost of streaming graphics applications by up to 50%, according to the company. AWS customers can launch Graphics Design GPU instances or create a new instance fleet with the Amazon AppStream 2.0 console or AWS software development kit. AWS’ Graphics Design GPU instances come in four sizes that range from 2 to 16 virtual CPUs and 7.5 to 61 gibibytes (GiB) of system memory, and run on AMD FirePro S7150x2 Server GPUs with AMD Multiuser GPU technology.
Developers can now also select between two types of Amazon AppStream instance fleets in a streaming environment. Always-On fleets provide instant access to apps, but charge fees for every instance in the fleet. On-Demand fleets charge fees for instances only when end users are connected, plus an hourly fee, but there is a delay when an end user accesses the first application.
New features and support
In addition to the new AWS GPU instance type, the cloud vendor rolled out several other features this month, including:
- ELB adds network balancer. AWS Network Load Balancer helps maintain low latency during spikes on a single static IP address per Availability Zone. Network Load Balancer — the second offshoot of Elastic Load Balancing features, following Application Load Balancer — routes connections to Virtual Private Cloud-based Elastic Compute Cloud (EC2) instances and containers.
- New edge locations on each coast. Additional Amazon CloudFront edge locations in Boston and Seattle improve end user speed and performance when they interact with content via CloudFront. AWS now has 95 edge locations across 50 cities in 23 countries.
- X1 instance family welcomes new member. The AWS x1e.32xlarge instance joins the X1 family of memory-optimized instances, with the most memory of any EC2 instance — 3,904 GiB of DDR4 instance memory — to help businesses reduce latency for large databases, such as SAP HANA. The instance is also AWS’ most expensive at about $16-$32 per hour, depending on the environment and payment model.
- AWS Config opens up support. The AWS Config service, which enables IT teams to manage service and resource configurations, now supports both DynamoDB tables and Auto Scaling groups. Administrators can integrate those resources to evaluate the health and scalability of their cloud deployments.
- Start and stop on the Spot. IT teams can now stop Amazon EC2 Spot Instances when an interruption occurs and then start them back up as needed. Previously, Spot Instances were terminated when prices rose above the user-defined level. AWS saves the EBS root device, attached volumes and the data within those volumes; those resources restore when capacity returns, and instances maintain their ID numbers.
- EC2 expands networking performance. The largest instances of the M4, X1, P2, R4, I3, F1 and G3 families now use Elastic Network Adapter (ENA) to reach a maximum bandwidth of 25 Gb per second. The ENA interface enables both existing and new instances to reach this capacity, which boosts workloads reliant on high-performance networking.
- New Direct Connect locations. Three new global AWS Direct Connect locations allow businesses to establish dedicated connections to the AWS cloud from an on-premises environment. New locations include: Boston, at Markley, One Summer Data Center for US-East-1; Houston, at CyrusOne West I-III data center for US-East-2; and Canberra, Australia, at NEXTDC C1 Canberra data center for AP-Southeast-2.
- Role and policy changes. Several changes to AWS Identity and Access Management (IAM) aim to better protect an enterprise’s resources in the cloud. A policy summaries feature lets admins identify errors and evaluate permissions in the IAM console to ensure each action properly matches to the resources and conditions it affects. Other updates include a wizard for admins to create the IAM roles, and the ability to delete service-linked roles through the IAM console, API or CLI — IAM ensures that no resources are attached to a role before deletion.
- Six new data streams. Amazon Kinesis Analytics, which enables businesses to process and query streaming data in an SQL format, has six new SQL functions to simplify data processing: STEP(), LAG(), TO_TIMESTAMP(), UNIX_TIMESTAMP(), REGEX_REPLACE() and SUBSTRING(). AWS also increased the service’s capacity to process higher data volume streams.
- Get DevOps notifications. Additional notifications from AWS CodePipeline for stage or action status changes enable a DevOps team to track, manage and act on changes during continuous integration and continuous delivery. CodePipeline integrates with Amazon CloudWatch to enable Amazon Simple Notification Service messages, which can trigger an AWS Lambda function in response.
- AWS boosts HIPAA eligibility. Amazon’s HIPAA Compliance Program now includes Amazon Connect, AWS Batch and two Amazon Relational Database Service (RDS) engines, RDS for SQL Server and RDS for MariaDB — all six RDS engines are HIPAA eligible. AWS customers that sign a Business Associate Agreement can use those services to build HIPAA-compliant applications.
- RDS for Oracle adds features. The Amazon RDS for Oracle engine now supports Oracle Multimedia, Oracle Spatial and Oracle Locator features, with which businesses can store, manage and retrieve multimedia and multi-dimensional data as they migrate databases from Oracle to AWS. The RDS Oracle engine also added support for multiple Oracle Application Express versions, which enables developers to build applications within a web browser.
- Assess RHEL security. Amazon Inspector expanded support for Red Hat Enterprise Linux (RHEL) 7.4 assessments, to run Vulnerabilities & Exposures, Amazon Security Best Practices and Runtime Behavior Analysis scans in that RHEL environment on EC2 instances.
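Two of the Kinesis Analytics functions named in the list above, STEP() and LAG(), are easier to picture with a small analogue. The sketch below is plain Python, not the service's SQL or its implementation, and it assumes the usual meanings: STEP rounds a timestamp down to the start of its interval, and LAG returns the value from the preceding record.

```python
from typing import Iterable, Iterator, Optional, Tuple


def step(epoch_seconds: int, interval_seconds: int) -> int:
    """Round a timestamp down to the start of its interval."""
    return epoch_seconds - (epoch_seconds % interval_seconds)


def lag(values: Iterable[float]) -> Iterator[Tuple[float, Optional[float]]]:
    """Pair each value with the one before it (None for the first record)."""
    previous = None
    for value in values:
        yield value, previous
        previous = value


if __name__ == "__main__":
    # Bucket two readings into one-minute windows, then compare neighbors.
    print(step(125, 60), step(179, 60))
    print(list(lag([10.0, 12.5, 11.0])))
```

Bucketing with a step function is what lets a streaming query group records into fixed windows, and a lag function is what lets it compute a change between consecutive records.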
AWS customers can add graphics acceleration to instances, but with little flexibility. To change that, the cloud provider has finally fulfilled a promise from early last year, with Elastic GPUs that fit enterprise needs.
Developers attach Elastic GPUs to Elastic Compute Cloud (EC2) instances to boost graphics performance in applications for intermittent spikes in workloads. EC2 Elastic GPUs are network-attached compute power available in sizes ranging from 1 GB to 8 GB of GPU memory.
GPU users were previously limited to spinning up a G2 or G3 instance. But those require investment in a full physical GPU, which overshoots some business needs, resulting in costly and wasteful resource usage. Teams can use Elastic GPUs at a lower price than G2 and G3 instances, using just a portion of the physical GPU for graphics-intensive apps.
Elastic GPUs also help customers that need graphics acceleration without being restricted to a particular instance type. They choose another instance type – such as memory- or storage-optimized – and attach an Elastic GPU to it.
Busy month for AWS
August was a busy month for AWS, with updates from both the AWS Summit in New York and VMworld in Las Vegas.
AWS and VMware finally released their hybrid cloud service nine months after they unveiled the partnership. Enterprises were particularly interested in pricing and functionality details, while small businesses might not be a fit for the service.
At the AWS Summit, AWS unveiled new services for migration and security, a variety of new features for Elastic File System (EFS), Config and CloudTrail, and an upgrade to CloudHSM. And AWS Glue, a service revealed at last year’s re:Invent, is now generally available.
More new features and support
- DynamoDB adds VPC Endpoints. Amazon DynamoDB offers more secure network traffic via a free Virtual Private Cloud (VPC) Endpoints feature, which is now generally available. VPC Endpoints keeps traffic within the AWS cloud instead of exposed in the public internet, in line with businesses’ strict compliance needs.
- More HIPAA eligibility. A new AWS Quick Start helps healthcare enterprises automate a deployment based on a CloudFormation customizable template that adheres to HIPAA regulatory requirements. Additionally, Amazon Cloud Directory implemented new controls to help teams build and run apps that meet HIPAA and PCI DSS guidelines. As with all HIPAA-eligible services, an AWS user must first execute a Business Associate Agreement before building an app that achieves compliance.
- Develop serverless functions locally. A new beta command-line tool, SAM Local, part of the AWS Serverless Application Model (SAM), enables dev teams to test and debug AWS Lambda functions locally. Developers can write functions in Node.js, Java and Python, work in the integrated development environment of their choice, simulate function triggers, and invoke functions with calls via Amazon API Gateway.
- AWS Marketplace adds functionality, new region. Users can now visualize, analyze and control their AWS Marketplace spending via new integration with several existing cost management tools: AWS Cost Explorer, AWS Cost and Usage Report and AWS Budgets. In addition, AWS Marketplace is now available in the AWS GovCloud region for public sector customers.
- New capabilities for Simple Email Service. A new Reputation Dashboard helps Amazon Simple Email Service (SES) users track bounce and complaint rates for an account, and act on sending failures. Amazon SES also added dedicated IP pools so an AWS customer can send emails from a specific IP address, or organize IP addresses into configurable pools for large email sends. SES also added capabilities that enable businesses to track and optimize email recipient engagement.
- AWS adds global edge locations. AWS added three new edge locations for its Amazon CloudFront CDN service: Chicago (now home to two edge locations), Frankfurt (six locations) and Paris (three locations). In all, AWS has 93 global edge locations.
- Amazon RDS SQL Server quadruples max database size. Database instances for SQL Server on Amazon Relational Database Service (RDS) now support up to 16 TB of storage, four times the previous maximum of 4 TB. The maximum ratio of IOPS to storage also increased fivefold, from 10:1 to 50:1. With these new limits, available on Provisioned IOPS and General Purpose storage types in all regions, databases and data warehouses can support larger workloads without additional RDS instances.
- New CodeCommit features. Amazon’s code repository service, AWS CodeCommit, added several new features and integrations. The service now sends repository state changes to Amazon CloudWatch Events, which enables developers to trigger workflows based on those changes. CodeCommit users can now view, change and save preferences to customize the service’s dashboard presentation. Finally, CodeCommit added a Git tags view that eases code repository navigation.
- EFS adds more permissions. Amazon EFS added support for special permissions, enabling administrators to customize granular access permissions for directories. EFS now supports setgid, which applies ownership of new directory files to the group associated with the directory, and sticky bit special permissions, which restrict file deletion or renaming to either the file or directory owner or to the root user. EFS users can now also manage access to executable files so that end users can launch them but not read or write them.
- CloudTrail supports Lex. AWS CloudTrail now integrates with Amazon Lex to track application programming interface (API) calls to and from the conversational interface app.
- New render management tool. AWS’ new render management system, Deadline 10, is now available, allowing developers to launch and manage rendering fleets.
- Amazon Cloud Directory boosts search performance. Amazon Cloud Directory users can now optimize searches by defining facets of schema to limit queries to subsets of a directory. A schema contains multiple attributes called facets, which help create different object classes and enable multiple apps to share one directory.
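The EFS item in the list above mentions setgid, the sticky bit and execute-only access. These are standard POSIX mode bits, so a short sketch can show how they combine; the specific modes chosen here (a group-writable shared directory and a `0o711` executable) are hypothetical examples, not settings taken from the EFS announcement.

```python
import stat

# setgid (0o2000) + sticky bit (0o1000) on a group-writable directory.
SHARED_DIR_MODE = stat.S_ISGID | stat.S_ISVTX | 0o775
# Owner has full access; everyone else may execute but not read or write.
EXECUTE_ONLY_MODE = 0o711


def special_bits(mode: int) -> list:
    """List the special permission bits that are set in `mode`."""
    names = []
    if mode & stat.S_ISGID:
        names.append("setgid")
    if mode & stat.S_ISVTX:
        names.append("sticky")
    return names


if __name__ == "__main__":
    print(stat.filemode(stat.S_IFDIR | SHARED_DIR_MODE), special_bits(SHARED_DIR_MODE))
    print(stat.filemode(stat.S_IFREG | EXECUTE_ONLY_MODE))
```

On a mounted file system these modes would be applied with a call such as `os.chmod(path, SHARED_DIR_MODE)`; the sketch stops short of touching the file system so it runs anywhere.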
Amazon is upgrading its compute power to court more cloud-hosted graphics-intensive workloads, seeking to benefit from the high cost customers pay for that heavy compute power.
AWS has added a new G3 instance to its graphics-optimized Elastic Compute Cloud (EC2) instances, to power 3-D rendering or visualization, computer-aided design, video encoding and augmented/virtual reality workloads. While the hardware upgrade could entice enterprises, IT teams should be wary of high costs and processing times with the instances.
The largest of the three G3 instances contains twice the CPU processing power and eight times the memory of the previous G2 generation. The instances, which provide enhanced video encoding and networking features, run on Intel Xeon E5-2686 v4 (Broadwell) processors and are backed by NVIDIA Tesla M60 GPUs.
AWS customers can launch EC2 instances from the AWS Management Console, AWS software development kits, AWS Command Line Interface and other libraries.
New features and support
- Amazon Inspector adds triggers. The Amazon Inspector service, which assesses security vulnerabilities in AWS deployments, can launch automatic scans through integration with CloudWatch Events. With Assessment Events, a customer can create event rules in CloudWatch that notify Inspector to run an assessment on a cloud environment. Users can also schedule recurring assessments and monitor other services to look for event triggers. Inspector displays Assessment Events in its console so a user can see all the triggers assigned to an assessment.
- Visualize resource configurations. A dashboard for AWS Config summarizes account resources and makes configuration history easily accessible. The dashboard displays the number of resources in an account and resources by type, so an administrator can quickly identify resources that fail to comply with AWS Config Rules.
- CloudWatch gains speed. Amazon CloudWatch now supports high-resolution custom metrics and alarms, enabling SysOps to monitor deployments in seconds. Metrics publish in as little as one second and alarms occur in as few as 10 seconds, for more immediate and granular visibility into a cloud environment. The support also includes dashboard widgets.
- Spot Fleets improve tagging. Users can now apply up to 50 tags to EC2 instances launched in a Spot Fleet, to quickly identify specific instances and improve access control, compliance protocols and cost accounting for those compute resources. SysOps teams define the tags for a fleet, and the fleet applies those tags to the individual instances it launches. The tagging feature is available in all regions.
- New HIPAA eligibility. Two Amazon services gained HIPAA eligibility and PCI compliance. Amazon WorkSpaces is a desktop as a service that enables administrators to deploy HIPAA-compliant work environments for employees. The service also adheres to Payment Card Industry (PCI) Security Standards, which lets applications and files safely interact with data from card holders. Amazon WorkDocs, a file sharing and collaboration service, can safely handle sensitive health or cardholder information with HIPAA eligibility and PCI DSS compliance. Both updates help AWS customers, particularly in the healthcare field, conform to strict compliance standards.
- Lambda@Edge goes GA. Eight months after its unveiling, the AWS Lambda@Edge service is generally available for developers who want to run Node.js-based Lambda functions across AWS edge locations. Developers upload code to Lambda and configure it to run in response to Amazon CloudFront events. AWS then routes the request to the edge location that’s geographically closest to the customer and executes it. For example, an IT team can create custom web pages and logic at lower latencies for individual Lambda requests based on their geographic origins.
- Reduce unwanted email. An added flow rules feature in Amazon WorkMail enables an IT team to filter inbound email traffic to reduce unwanted email messages from specific senders, route email to junk folders and ensure delivery of priority email. Rules can apply to individual email addresses and entire email domains that AWS hosts.
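The high-resolution CloudWatch metrics mentioned in the list above hinge on a single request field. The following is a minimal sketch that only builds the request; it assumes the PutMetricData `StorageResolution` field (1 for one-second data, 60 for standard resolution), and the namespace and metric name are hypothetical. Sending the request through an SDK is deliberately left out.

```python
from datetime import datetime, timezone


def build_high_resolution_request(namespace: str, metric: str, value: float) -> dict:
    """Build keyword arguments for a PutMetricData call with 1-second resolution."""
    return {
        "Namespace": namespace,
        "MetricData": [
            {
                "MetricName": metric,
                "Timestamp": datetime.now(timezone.utc),
                "Value": value,
                "Unit": "Count",
                # 1 requests high-resolution storage; 60 is standard resolution.
                "StorageResolution": 1,
            }
        ],
    }


if __name__ == "__main__":
    # "Example/App" and "QueueDepth" are hypothetical names for illustration.
    print(build_high_resolution_request("Example/App", "QueueDepth", 42))
```

With an SDK client in hand, the returned dictionary would be unpacked into the metric call as keyword arguments.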
This is a guest blog post by Bob Reselman, a nationally known developer, system architect, writer and editor. You can read more of his work at DevOpsAgenda.com.
Serverless computing is all the rage among developers, and with good reason.
A serverless environment is the new vista in modern application development. AWS has Lambda; Microsoft has Azure Functions; Google has Cloud Functions. These technologies are not going away. In fact, we’ll see a lot more work take place to create, build and test code in which the function is the unit of deployment.
Serverless-based applications are easy to architect and easy to deploy. A developer decides the services he needs, wires them up in a script, hits the deploy button and runs some tests — that’s it. Developers don’t need to worry about hardware, capacity or scalability; the serverless provider takes care of all that. Just pay the bill for the resources you use.
It couldn’t be simpler, right? Well, maybe not.
The architecture of a serverless environment with a simple REST API architecture implemented in AWS is fairly straightforward. A set of RESTful endpoints uses Amazon API Gateway and wires each endpoint to some AWS Lambda functions. One Lambda function uses Simple Storage Service (S3) as a data store, and the others store data in an Amazon DynamoDB database.
The API Gateway provides a way to get data in and out of the application; the functions handle computation, while S3 and DynamoDB provide the data storage. What’s not to like? AWS will scale up your application as needed. All you need to do is pay the bill.
So, let’s talk about that bill. Let’s use Will, a systems engineer, as an example.
Will is a low-level engineer who works on content delivery networks for a major telecom. He works closely with bare metal, well below the surface of the average developer’s day-to-day dealings with the cloud. In Will’s world, memory allocation counts.
Over the years, with the growing popularity of higher-level languages such as C# and Java, the C library function malloc, which requests memory from the operating system, has become hidden in the language runtime engines, including the common language runtime for .NET and the Java VM. But memory has to be allocated no matter what, and the way you get memory is via the operating system using malloc:
str = (char *) malloc(15);
Here is where it gets interesting: the efficiency of malloc varies depending on your implementation. Standard malloc is inefficient in situations with a high degree of concurrency in multiprocessor environments, so Will won’t use it. It locks up memory, used or unused, and places extra burden on the CPU. Will prefers tcmalloc, created by Google, which exposes configuration capabilities that allow memory allocation to work more efficiently. And it avoids wasteful CPU cycling.
So, what does a memory allocation binary have to do with your AWS bill? It actually has a lot to do with it.
AWS makes money on Lambda by billing you for the time it takes to execute code, which translates into CPU utilization — though you also get billed by your request volume. Thus, every piece of code in your Lambda function that declares a variable is subject to the memory allocation executable, which is most often malloc. That means you might have created code that runs squeaky clean on your local machine or even in a private cloud. But when it gets to AWS, it kills the CPUs.
The provider’s memory allocation infrastructure might not be optimized, so wasteful cycles get spun and you get billed. It’s just like giving a package to a messenger and letting him determine the best route, which might include a lot of stop lights. You pay for the messenger’s time no matter the route efficiency.
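To see how extra execution time feeds the bill described above, here is a rough cost sketch. The per-GB-second price, the per-request price and the 100 ms billing increment are assumed list values and may not match current pricing; the free tier is ignored, and the invocation counts and durations are hypothetical.

```python
GB_SECOND_PRICE = 0.00001667      # assumed USD per GB-second of execution
REQUEST_PRICE = 0.20 / 1_000_000  # assumed USD per request
BILLING_INCREMENT_MS = 100        # assumed rounding granularity


def billed_ms(duration_ms: int) -> int:
    """Round a duration up to the next billing increment."""
    increments = -(-duration_ms // BILLING_INCREMENT_MS)  # ceiling division
    return increments * BILLING_INCREMENT_MS


def lambda_monthly_cost(invocations: int, duration_ms: int, memory_mb: int) -> float:
    """Estimate monthly cost from request volume, duration and memory size."""
    gb_seconds = invocations * (billed_ms(duration_ms) / 1000) * (memory_mb / 1024)
    return gb_seconds * GB_SECOND_PRICE + invocations * REQUEST_PRICE


if __name__ == "__main__":
    # Same hypothetical function, 10 million calls a month at 512 MB:
    # one version finishes in 80 ms, the other wastes cycles and takes 120 ms.
    print(lambda_monthly_cost(10_000_000, 80, 512))
    print(lambda_monthly_cost(10_000_000, 120, 512))
```

Under these assumptions, a function that slips from 80 ms to 120 ms crosses a billing increment, so its compute charge doubles even though it ran only 40 ms longer.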
Of course, I am not saying AWS is a nefarious agent; quite the opposite. But the serverless environment is theirs to run, and the IT shop doesn’t have a lot of say in the matter other than region selection.
Without the ability to optimize a serverless environment to accommodate computationally intensive applications, there is a real financial risk for enterprise IT teams. Hopefully, the major players realize that user optimization for cloud services offers a competitive advantage and more granular capabilities. Otherwise, engineers will fly blind without the aid of instruments on the control panel. And, as we’ve learned on the terrain, when disaster looms, you can’t fix what you can’t see.
Edge computing and IoT continue to infiltrate the enterprise, prompting AWS to release several services at re:Invent 2016. One service, AWS Greengrass, enables AWS Lambda functions on devices and ties together IoT and serverless technologies. Seven months after AWS pulled back the curtain on the service, Greengrass is generally available in the US-East and US-West regions.
As enterprises invest more heavily in IoT-connected devices, they want more connectivity and compute capabilities associated with them. Greengrass delivers limited AWS programming to groups of devices, enabling them to respond to real-world circumstances, such as a faulty internet connection.
AWS Greengrass enables a device to perform functions on data and securely transmit that data to the cloud for additional analytics and storage. Developers can combine Lambda with the AWS Greengrass Core SDK to execute serverless functions locally, establish secure connections from the core device to the cloud and support MQTT messaging on devices.
The service also opens up hybrid cloud possibilities, another recent area of emphasis for AWS. Greengrass is one of few Amazon products that can run on premises. Greengrass can run on very lightweight or more intricate computing systems, enabling IT administrators to use the AWS programming model locally, if they choose.
Developers can access Greengrass from the AWS Management Console, API or AWS Command Line Interface, and then define and manage Greengrass groups — devices connected to each other.
In addition to Greengrass, AWS added several features and support this month, including plans for a new data center region. Here’s what you might have missed.
New AWS features and support
- DAX also goes GA. Amazon DynamoDB Accelerator (DAX), a caching service for eventually consistent, read-heavy workloads on DynamoDB, is generally available. DAX reportedly improves DynamoDB performance up to 10 times, and it is both fully managed and compatible with existing DynamoDB API calls, lowering the barrier for developers to roll it into their deployments. DAX is available in the US-East-1 (Northern Virginia), US-West-1 (Northern California), US-West-2 (Oregon), EU-West-1 (Ireland) and Asia Pacific-Northeast-1 (Tokyo) regions.
- New region in Hong Kong. AWS will add a new geographic region in Hong Kong in 2018. The region, AWS’ eighth in Asia Pacific, appeals to local public and private sector clients as well as Asia-based businesses building multi-zone fault-tolerant applications. The region expands AWS’ global footprint to 20 regions; the public cloud provider will open other regions and availability zones in China, France and Sweden in 2017 and 2018.
- Rekognition adds region, feature. Amazon Rekognition, an image recognition and management service, is available in the AWS GovCloud (US) region. A new celebrity recognition feature enables the service to identify an image of a famous person by comparing it to a global list of thousands of celebrities across politics, entertainment, business, sports and media. The feature expands facial recognition capabilities for developers, who could roll the technology into mobile applications. It keeps pace with a similar tool within the Microsoft Cognitive Services portfolio.
- X-Ray expands latency monitoring. Two features in the AWS X-Ray service will analyze and debug distributed applications. The Visual Node and Edge latency distribution graphs, accessible in the Service Details sidebar, visualize and track latency among services; they also show current latency from the perspectives of clients, services and microservices. Developers can access the features via API call or the X-Ray console.
- Device authentication for Amazon WorkSpaces. AWS’ desktop as a service offering, Amazon WorkSpaces, added device authentication for users in BYOD work environments. Administrators establish policies to manage devices and client access, and digital certificates grant or block access to certain operating systems.
- AWS WAF adds more IP address control. AWS Web Application Firewall (WAF), a service that protects web-based apps from common malicious attacks, added a rate-based rules feature. Previously, security ops pros defined rules for requests with certain criteria, such as IP address or the size of the request, and chose to allow, block or count those requests. Rate-based rules expand the controlled response to include a large number of requests from a particular IP address, which could signal a DDoS attack or something more benign, such as a software integration that cannot connect to the app. SecOps teams use rules to add or remove an IP address from a blacklist, set higher request rates for technology partners and set CloudWatch metrics (including alarms that can fire off AWS Lambda functions) to monitor each rule. SecOps can also combine rate-based rules with other WAF conditions to establish more sophisticated rate-based policies.
- Additional AWS Direct Connect locations, monitoring capabilities. AWS Direct Connect, which targets hybrid clouds, establishes secure, dedicated network connections from on-premises resources to the AWS cloud — with increased bandwidth and reduced network costs compared to web-based connections. The list of available locations for AWS Direct Connect now totals 60 — with new ones across North America and Europe. This was the service’s second expansion this year. Admins also can now add Amazon CloudWatch monitoring to all locations (except China), to monitor physical connections to the cloud and set up alarms and triggers through Amazon Simple Notification Service (SNS).
- Lightsail available in nine new regions. AWS expanded Lightsail, its virtual private server service that launched at re:Invent in 2016 in just the US-East region, to nine more regions across the United States, Europe and Asia Pacific. Lightsail offers simplified servers with managed infrastructure for businesses with more basic computing needs or limited budgets.
- CloudTrail improves API tracking. Admins use AWS CloudTrail to monitor AWS API calls, and AWS recently added to those tracking capabilities. The CloudTrail console’s API Activity History page now includes API calls to CloudWatch Events, Elastic Compute Cloud (EC2), DynamoDB, Cognito, Kinesis, CloudHSM and Storage Gateway. This addition centralizes API logs and removes the need to retrieve CloudWatch Events APIs from Simple Storage Service (S3) buckets.
- EC2 Systems Manager integrates with S3. Developers can query and visualize inventory data across multiple regions and accounts with AWS’ new integration between Amazon EC2 Systems Manager and S3. Developers enable an S3 bucket to automatically collect inventory data, which eliminates the need to create custom scripts. They can then use Amazon Athena to query the data or Amazon QuickSight to visualize it.
- Convert legacy data warehouses to AWS. The AWS Schema Conversion Tool added more support for legacy data warehouses. IT teams can now export data to Amazon Redshift from Teradata (versions 13 and above) and Oracle Data Warehouse (versions 10g and above).
- AppStream added user management, web portals. Amazon AppStream 2.0 now enables admins to create and manage users without an identity federation tool. Admins grant user access with the User Pools tab in the AppStream console. Users log in via a web portal to choose which approved applications to use.
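To make the WAF rate-based rules item above more concrete, here is a minimal sketch of the underlying idea: count recent requests per IP address and switch from allow to block once a threshold is passed. It is a simplified illustration, not the AWS WAF API. The class name, the method name and the 300-second default window are all assumptions made for the example.

```python
from collections import defaultdict, deque


class RateBasedRule:
    """Toy per-IP sliding-window counter (hypothetical, not the AWS WAF API)."""

    def __init__(self, limit, window_seconds=300):
        self.limit = limit                    # max requests allowed per window
        self.window_seconds = window_seconds  # assumed window length
        self._hits = defaultdict(deque)       # ip -> timestamps of recent requests

    def action_for(self, ip, now):
        """Record one request from `ip` at time `now` and return the action."""
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        hits.append(now)
        return "BLOCK" if len(hits) > self.limit else "ALLOW"
```

Because old timestamps expire, an address blocked during a burst is allowed again once its request rate drops, which is the behavior that separates a rate-based rule from a static blacklist entry.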
In the fast-paced world of public cloud, if AWS is the hare, AWS GovCloud is the tortoise.
GovCloud launched in 2011 to meet stricter regulatory requirements for federal, state and local government. Since then, AWS has added dozens of new services and nine new private-sector regions across the globe. But AWS GovCloud was slow to incorporate new services, and it existed as only a single West Coast region – until now.
AWS will add a second GovCloud region on the East Coast in 2018. This comes on the heels of the public cloud market leader’s increased efforts to meet regulatory standards and improve feature parity among its commercial and public sector offerings.
And while AWS GovCloud might only serve as a curiosity to the private sector, its continued expansion speaks to a broader trend of the public cloud as an accepted place for workloads of all kinds.
“From a technology perspective, [GovCloud] has grown leaps and bounds, even over the last two or three years,” said Tim Israel, director of cloud engineering at Enlighten IT Consulting, a GovCloud reseller that works primarily with the Department of Defense.
AWS GovCloud has seen a 185% compound annual growth rate since it opened in 2011, according to Amazon. Some of the most important additions include new instance types already available on the general site and services such as AWS Lambda, which was added in May, more than two years after the service was first rolled out. There’s also a growing list of accreditations for various services that are often more important to regulated IT shops than the services themselves.
The real potential benefit of the new East Coast region — one that regular AWS users have had access to since 2015 — is disaster recovery across regions. Currently, AWS GovCloud users can replicate data across data centers within the region, but that’s probably not enough redundancy for mission-critical applications.
For example, users of the standard AWS public cloud, which incorporates regional failover, saw services remain uninterrupted when the US-East 1 region went down earlier this year. Those lacking cross-region replication couldn’t access applications housed in US-East 1 for up to four hours.
Still, AWS has a long way to go before there’s true parity between the two iterations of its cloud. Only 35 of its 92 services are available on GovCloud. The private cloud that AWS built specifically for the CIA is believed to have an even smaller feature set. All other U.S. regions offer at least 50 services, and across AWS’ global footprint, only the China region, which is operated by Sinnet, has fewer available services.
According to Amazon, the services available in AWS GovCloud align with the needs of government, as indicated by public sector customers.
AWS GovCloud is also generally more expensive than the commercial version. Comparable compute resources cost more in that region than they do in standard AWS regions; it’s also more expensive to transfer data out of the cloud.
Despite those limitations, AWS GovCloud does have benefits for its targeted audience. It meets certain regulatory standards that other regions do not. It’s also maintained only by U.S. citizens and provides encrypted access that meets federal guidelines.
Unlike the private sector, government agencies have to go through a competitive bidding process that puts roughly two years between a project’s conception and the actual purchase. And given that two years ago was about the time when enterprises really started to embrace the public cloud, AWS GovCloud could also be gaining steam at just the right time.
Trevor Jones is a news writer with SearchCloudComputing and SearchAWS.
Price reductions are less common for compute resources than they were in the early days of cloud computing. But AWS customers can still find value in occasional Elastic Compute Cloud (EC2) price cuts.
In May, AWS reduced prices on a slew of one-year standard and three-year convertible EC2 Reserved Instances. AWS customers can save 9% to 17% on standard Reserved Instances, depending on the region, operating system and instance type — discounts apply to C4, M4, R4, I3, P2, X1 and T2 types. Discounts for convertible instance types range up to 21%.
Convertible Reserved Instances let a user change the instance family as application needs evolve. This adds a level of workload flexibility, although the user is still locked into a contract for instance capacity.
AWS also introduced no-upfront pricing for three-year Reserved Instances, a feature previously reserved for one-year terms. It also lowered On-Demand and Reserved pricing for M4 Linux instances.
New features and support
- Cost allocation tags for Elastic Block Store snapshots. Users can assign costs to a particular project or department via cost allocation tags. Navigate the AWS Management Console’s tag editor feature to find the necessary snapshot, or backup of an Elastic Block Store (EBS) volume, and apply tags. Users can also create tags via a script command or function call. Manage and activate cost allocation tags in the billing dashboard, then monitor tagged snapshots in the Cost Explorer feature.
- Lambda support for AWS X-Ray. The AWS X-Ray service, which analyzes performance of microservices-based applications, now supports AWS Lambda. Developers can enable active function tracing within Lambda to activate X-Ray or update functions in the AWS Command Line Interface. AWS X-Ray processes traces of functions between services and generates visual graphs to ease debugging.
- New features for IAM policy summaries. Administrators can evaluate and troubleshoot AWS Identity and Access Management (IAM) permissions with three new IAM policy summary features. The new resource summaries display resource types, regions and account IDs to provide a full list of defined resources for each policy action. Admins can evaluate which services or actions a policy denies, and see which possible actions remain. They also can identify typos and other errors in policies by seeing which services and actions IAM fails to recognize.
- Support for SAP clusters. AWS unveiled extended support for larger SAP clusters at SAP’s Sapphire Now conference. In addition to expanding its X1 instance type to accommodate larger SAP applications, AWS revealed plans to expand RAM on its virtual servers to better support SAP workloads in the near future.
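As a rough illustration of what the cost allocation tags described above make possible, the sketch below groups snapshot costs by the value of one tag. It works on plain dictionaries rather than real billing data. The record layout, the `monthly_cost` field and the function name are assumptions for the example, not an AWS data format or API.

```python
from collections import defaultdict


def cost_by_tag(snapshots, tag_key, untagged_label="(untagged)"):
    """Sum snapshot costs per value of `tag_key` (hypothetical record layout)."""
    totals = defaultdict(float)
    for snap in snapshots:
        # Snapshots without the tag land in a catch-all bucket so they stay visible.
        label = snap.get("tags", {}).get(tag_key, untagged_label)
        totals[label] += snap["monthly_cost"]
    return dict(totals)


# Example records; IDs, tags and costs are made up for illustration.
snapshots = [
    {"id": "snap-a", "tags": {"project": "analytics"}, "monthly_cost": 1.5},
    {"id": "snap-b", "tags": {"project": "analytics"}, "monthly_cost": 2.5},
    {"id": "snap-c", "tags": {"project": "web"}, "monthly_cost": 3.0},
    {"id": "snap-d", "tags": {}, "monthly_cost": 0.5},
]
report = cost_by_tag(snapshots, "project")
```

Keeping an explicit bucket for untagged snapshots is a deliberate choice here: it exposes spend that no project or department has claimed yet.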
AWS has made it a priority to win over customers in the database market, specifically Oracle shops. And the public cloud provider has a new weapon in that battle — an upgraded primary database conversion tool.
The AWS Database Migration Service (DMS) now supports NoSQL databases, enabling developers to move databases from the open source MongoDB platform onto DynamoDB, Amazon’s native NoSQL database service. AWS DMS also supports migrations to and from Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, SAP ASE and SQL Server as database sources. The cloud provider could target other NoSQL database providers for support in the future.
In addition to homogeneous migrations, the AWS Schema Conversion Tool converts database schema to enable migration from a disparate database platform to a target on Amazon Relational Database Service, such as from Oracle to Amazon Aurora.
AWS also recently added support for data lake conversions from Oracle and Teradata to Amazon Redshift, a swift response to an Oracle licensing update that hiked fees for Oracle cloud users.
Despite the potential of lock-in, enterprises are interested in the ability of the DynamoDB platform to integrate database information with other AWS tools. And AWS is happy to beat its chest over winning these database customers — it passed 22,000 database migrations in late March, AWS CEO Andy Jassy claimed on Twitter.
It’s getting crowded in the AWS toolbox
Among AWS’ slate of recent service and tool updates, here are several other noteworthy tidbits:
- A Resource Tagging API. IT teams can now apply tags, remove tags, retrieve a list of tagged resources with optional filtering and retrieve lists of tag keys and values via API. The new API enables developers to code tags into resources instead of doing it from the AWS Management Console. The Resource Tagging API is available through the newest versions of AWS SDKs and the AWS Command Line Interface. The new API functions apply across dozens of resource types and services. The cloud provider also added the ability to specify tags for Elastic Compute Cloud instances and Elastic Block Store volumes within the API call that creates them.
- Support for CloudWatch Alarms on Dashboard Widgets. Added functionality of CloudWatch Alarms for Dashboard Widgets provides AWS users with at-a-glance visibility into potential performance issues. SysOps can view CloudWatch metrics and Alarms in the same widget, and view widgets that display metrics according to number (value of a metric), line graph or stacked area graphs (layering one metric over another).
- Cross-region, cross-account capabilities for Amazon Aurora. IT teams can copy automatic or manual snapshots from one region to another and create read replicas of Aurora clusters in a new region. These features can improve disaster recovery posture or expand read operations to users in geographically close regions. Additionally, users can share encrypted snapshots across AWS accounts, which enables them to copy or restore a snapshot depending on encryption configuration. AWS also expanded Aurora availability to the US-West region, and added support for t2.small instances.
- Amazon Elastic MapReduce instance fleets. This addition lets ops specify up to five instance types per fleet with weighted capacities, availability zones and a mix of on-demand or spot pricing. EMR instance fleets enable ops teams to craft a strategy for how they want to provision and geographically place capacity, and how much they want to pay for it. EMR automatically spins up the required capacity to support big data frameworks for Apache Hadoop, Spark or HBase, among others.
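The weighted capacities mentioned for EMR instance fleets can be pictured with a short sketch: each instance type counts for a number of capacity units, and instances are added until the fleet's target is covered. This is a simplified greedy illustration, not EMR's actual allocation logic. The function name, the preference ordering, the availability counts and the example weights are all assumptions.

```python
import math


def plan_fleet(target_capacity, offers):
    """Greedily cover `target_capacity` units from `offers`.

    `offers` is a list of (instance_type, weighted_capacity, available_count)
    tuples in order of preference. Returns (plan, shortfall), where `plan`
    maps instance type to instance count and `shortfall` is the number of
    capacity units left uncovered.
    """
    plan = {}
    remaining = target_capacity
    for instance_type, weight, available in offers:
        if remaining <= 0:
            break
        # Take only as many instances as needed, capped by availability.
        count = min(available, math.ceil(remaining / weight))
        if count:
            plan[instance_type] = count
            remaining -= count * weight
    return plan, max(remaining, 0)


# Hypothetical weights: one m4.xlarge counts as 4 units, one m4.large as 2.
plan, shortfall = plan_fleet(10, [("m4.xlarge", 4, 1), ("m4.large", 2, 10)])
```

In this example the single available m4.xlarge covers four units and three m4.large instances cover the remaining six, so the target is met with no shortfall.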
AWS attributed Tuesday’s extended disruption to outdated processes and human error, according to a postmortem published Thursday.
The post, which classified the incident as a “service disruption,” states that the problem started at 12:37 p.m. ET when an authorized Amazon Simple Storage Service (Amazon S3) team attempted to resolve an issue that had caused the S3 billing system to behave slower than expected. One of the team members, following AWS guidelines, attempted to execute a command that would remove some of the servers for an S3 subsystem, but incorrectly entered one of the inputs.
As a result, too many servers were taken down, including those that supported two additional S3 subsystems that manage metadata and location information, as well as the allocation of new storage. Compounding problems and creating a Catch-22 scenario, the latter subsystem required the former to be in operation. The capacity removal required a full restart, and Amazon S3 was unable to service requests.
It appears the outage was due to fat fingering under pressure with an arcane, hardly used command, said Mike Matchett, senior analyst and consultant at Taneja Group.
“It need not ever have happened, but was too easy a mistake to make,” he said. “Once made, it cascaded into a major outage.”
The impact spread to additional services in the US East-1 region that rely on Amazon S3, including the S3 console, new Elastic Compute Cloud instance launches, Elastic Block Store volumes and AWS Lambda.
AWS’ system is designed to support the removal of significant capacity and is built for occasional failure, according to the company. It has performed this particular operation since S3 was first built, but a complete restart of the affected subsystems hadn’t been done in one of the larger regions in years, the company said.
It’s surprising that a system like AWS is vulnerable at this scope to manual errors, said Carl Brooks, an analyst with 451 Research. The initial failure is understandable, but the compounding impact shows a larger structural flaw in how AWS manages uptime, he added.
“It says something for one manual process to have that much disruptive effect,” Brooks said. “They claim they’ve been working against [these types of failures] all this time, and clear that work has not been completed.”
Amazon S3 fully recovered by 4:54 p.m. ET, and related services recovered afterward, depending on backlogs.
AWS said it has since modified the tool it uses to perform the debugging task to limit how fast it can remove capacity and to put in additional safeguards to prevent a subsystem from going below its minimum capacity. It’s also reviewing other operational tasks for similar architectural problems, and plans to improve the recovery time of critical S3 subsystems by breaking down services into smaller segments to limit the blast radius of a failure, the company said in the postmortem. That work was planned for S3 later this year, but the timeline was pushed up following this incident. Company officials did not comment specifically on when this work would be completed.
Critical operations that involve shutting down key resources should be fully scripted and tested often, and it shows a level of hubris that AWS hadn’t tried this restart in years, Matchett said. He was also critical of having the health dashboard connected to S3, saying the management plane for high-availability workloads should have been completely separate from the resources it was controlling.
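The two safeguards described in the postmortem, a cap on how fast capacity can be removed and a floor on minimum capacity, can be sketched in a few lines. This is only an illustration of the idea, not AWS' internal tool. The function name, parameters and numbers are assumptions.

```python
def safe_removal(current, requested, minimum, max_per_step):
    """Return how many servers may be removed in one step.

    The result is the smallest of the requested amount, the per-step cap
    and the headroom above the minimum capacity (never below zero).
    """
    if requested < 0:
        raise ValueError("requested must be non-negative")
    headroom = max(current - minimum, 0)
    return min(requested, max_per_step, headroom)


# A mistyped request for 80 servers is throttled to the per-step cap of 10.
allowed = safe_removal(current=100, requested=80, minimum=60, max_per_step=10)
```

With checks like these, an incorrectly entered input would remove at most one step's worth of servers and could not take a subsystem below its floor.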
“It really looks like AWS has built a bigger house of cards than even they are aware of,” Matchett said.
Changes also have been made to the Service Health Dashboard to run across multiple regions.