aws

Outages, complexity, and the stronger cloud

The extended outage of Amazon Web Services' EBS storage services in one of their service "regions" the week of April 21st has triggered so much analysis--emotional and otherwise--that I chose to listen rather than speak until now. Events like this are tremendously important, not because they validate or invalidate cloud services, but because they let us see how a complex system responds to negative events.

You see, for almost four years now, I've believed that cloud computing is evolving into a complex adaptive system. Individual services and infrastructure elements within a cloud provider's portfolio are acting …

The cloud backlash

There's no doubt that the recent "partial failure" of the Amazon Web Services cloud computing platform is giving enterprises, service providers, and developers pause--and will continue to do so for months to come. Amazon called the outage "partial" and a "degradation," but it was a very big deal. A significant part of Amazon's flagship EC2 (Elastic Compute Cloud) was offline for a day, as were the related EBS (Elastic Block Store) and RDS (Relational Database Service) offerings. The failure affected only the northern Virginia data center ("US-East"), and the majority …

Questions linger about Amazon outage

Today, April 29, 2011, Amazon Web Services released a "summary" of its EC2 (Elastic Compute Cloud) and RDS (Relational Database Service) disruption in its U.S. East Region. This came approximately one week after what appears to be a classic example of a rolling disaster that occurred after someone incorrectly executed a communications network traffic shift as part of "normal AWS scaling activities." I read human error here--long known as the leading cause of large system failures.

The rolling disaster is a well-understood phenomenon in IT and can be hard to foresee in a complex system. The way to discover and fix potential failure points is to test on a regular basis, then build around them. But periodic testing can become difficult for a system of this magnitude.

What I find positive about the Amazon summary is a set of disaster recovery recommendations for users and an admission that AWS customer support during the outage was less than stellar. The disaster recovery recommendations should now be required reading for every AWS customer. In fact, I think that all cloud services users should read this statement with an eye to discovering potential holes in their own disaster recovery strategies. …

Amazon restoring AWS, but slowly for some

A serious Amazon Web Services outage has extended well into its second day, but Amazon said Friday the end is in sight for most affected customers of the cloud-computing infrastructure.

"We continue to see progress in recovering volumes, and have heard many additional customers confirm that they're recovering. Our current estimate is that the majority of volumes will be recovered over the next 5 to 6 hours," Amazon said on its AWS status dashboard at 8:49 a.m. today. Volumes are areas of Amazon's Elastic Block Store (EBS) service that store data.

But for some …

Amazon cloud outage derails Reddit, Quora

A partial failure at Amazon Web Services' cloud-computing infrastructure brought down some Internet operations today, including the Web sites of Quora and Reddit.

The outage struck the Elastic Compute Cloud (EC2) service at Amazon's northern Virginia site, which handles AWS operations for the U.S. East Coast. The problems began at 1:41 a.m. PT, according to Amazon's AWS status dashboard, with delays and errors when connecting to servers over a network.

A long list of customers has come to rely on Amazon EC2, which provides servers on a pay-as-you-go basis that lets customers ramp up or down …

Network, don't fail me now!

Everything in IT depends on the network--and not just in an abstract, "need it occasionally" sort of way. The packets must flow for virtually every operation, every job, every transaction. Whenever packets drop, or links go down, we're disconnected and isolated. Information doesn't flow; apps don't work; users don't proceed. We need the network up and running, millisecond by millisecond, every millisecond of every day.

Our utter, urgent dependency won't lessen in the coming years. It will intensify--redoubling and redoubling again. Cisco calls its vision of the future "together." HP …

Flash video gets a cloud option through Amazon

Adobe Systems' Flash Media Server software is now available as a pay-as-you-go option on the Amazon Web Services cloud-computing technology, the companies announced Wednesday.

Flash Media Server 4 lets customers send streaming video across the Net. By using it hosted on AWS' Elastic Compute Cloud (EC2) service, customers don't have to worry so much about installation and configuration details.

The service costs a flat rate of $5 to set up and $5 per month to use, with variable costs according to the video-streaming capacity needed and data transferred. For example, an extra-large server instance that can manage up to …

Amazon adds DNS service for Net addresses

It probably wouldn't have helped WikiLeaks' struggle to stay on the Web last week, but Amazon.com has launched a new service for companies whose Internet operations need Domain Name System (DNS) service.

DNS is the technology that connects the Internet address that people use, such as www.flickr.com, to its numeric address, such as 68.142.214.24. It's that numeric Internet Protocol (IP) address that computers and network gear need to route data over the Internet. DNS functions not unlike a phone book, where you can find a phone number by looking up a person's name.
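The phone-book analogy can be sketched in a few lines of Python. This is only a toy illustration, not how Amazon's service or any real resolver is built: the `PHONE_BOOK` table and the `resolve` function are hypothetical names, and the single entry is the name/address pair quoted above, stored locally rather than queried over the network.

```python
# Toy "phone book" illustrating name-to-address lookup.
# PHONE_BOOK and resolve() are hypothetical; no live DNS query is made.
PHONE_BOOK = {
    "www.flickr.com": "68.142.214.24",  # pairing quoted in the article
}


def resolve(hostname: str) -> str:
    """Return the numeric IP address recorded for hostname."""
    # DNS names are case-insensitive and may carry a trailing dot.
    key = hostname.rstrip(".").lower()
    try:
        return PHONE_BOOK[key]
    except KeyError:
        raise LookupError(f"no address record for {hostname!r}") from None


if __name__ == "__main__":
    print(resolve("WWW.Flickr.com"))
```

A real program would not keep its own table; it would ask the operating system's resolver, for example through `socket.getaddrinfo` in Python's standard library, which in turn consults DNS servers.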

Now Amazon is …

IBM floats new government clouds

This week IBM announced new cloud offerings for federal and state governments aimed at providing the scalable infrastructure and ease of deployment available from public cloud providers such as Amazon Web Services and Rackspace.

These clouds are hosted at IBM data centers and are multi-tenant offerings restricted to government entities. And while the world doesn't need yet another definition of cloud, this use case of hosted private cloud (note: I'm not sure if this fits as virtual private cloud) is one that I suspect we'll see more and more large data center providers move toward.

For clarity, these clouds are both private and hosted, leading me to wonder if this type of offering will be more appealing than behind-the-firewall private cloud solutions. After all, in this scenario you don't have to buy hardware or hire staff to manage your infrastructure.

To a large extent, these new services look just like hosting did a few years back, the only real difference being that the infrastructure is designed to be used in a multi-tenant manner as opposed to having dedicated servers. (Note: most hosting companies ran multi-tenant servers anyway, but the actual technical way of separating the tenants is different with cloud providers.)

It's not that services such as Amazon Web Services EC2 can't perform at the same level as a government-specific cloud offering, but the often challenging requirements related to government computing require a specific way of doing things.

A few weeks back I spoke with IBM CIO Pat Toole, who emphasized that IBM corporate IT has a strong focus on optimizing virtualized servers in cloud-like ways to reduce costs. With nearly 400,000 employees, IBM is as big as or bigger than many government entities and has similar challenges related to uptime, security, and storage. …

What's next for OpenStack's cloud efforts

Last month's launch of the open-source OpenStack project garnered a great deal of attention from the media and cloud followers, as it promised organizations a new option for building and launching their own internal and hosted clouds.

This week, Chief Stacker Jim Curry posted an update on the OpenStack blog, outlining which updates and features will be released in the next version, expected to land on October 21.

The official version of the OpenStack API. In addition to the functionality available in the Rackspace Cloud API, OpenStack will add functionality related to role-based access controls and networking actions. …