Uncover Hidden Performance Issues Through Continuous Testing

On-premise test tools, APMs, CEMs and server/network-based monitoring solutions may not be giving you a holistic picture of your system’s performance; cloud-based continuous testing can.

When it comes to application performance, a wide array of potential causes of performance issues and end-user dissatisfaction exists. It is helpful to view the entire environment, from the end user’s browser or mobile device all the way through to the web and application servers, as the complex system that it is.


Everything between the user’s browser or mobile device and your code can affect performance

The state of the art in application performance monitoring has evolved to include on-premise test tools, Application Performance Management (APM) solutions, Customer Experience Monitoring (CEM) solutions, and server- and network-based monitoring. All of these technologies seek to determine the root causes of performance problems, whether real or perceived by end users. Each has its own merits and costs, and each tackles the problem from a different angle. Often a multifaceted approach is required when high-value, mission-critical applications are being developed and deployed.

On-premise solutions can blast the environment with 10+ Gbit/sec of traffic in order to stress routers, switches and servers. These solutions can be quite complex and costly, and are typically used to validate new technology before it is deployed in the enterprise.

APM solutions can be very effective in determining whether network issues are causing performance problems or the root cause lies elsewhere. They typically take packet data from a switch SPAN port, a TAP (test access point) or possibly a tap-aggregation solution. APM solutions are typically “always-on” and can act as an early warning system, detecting application problems before the help desk knows about an issue. These systems can also be very complex and will require training and professional services to get the maximum value.

What all of these solutions lack is a holistic view of the system, one that takes into account edge devices (firewalls, anti-malware, IPS, etc.), network connectivity and even endpoint challenges such as the packet loss and latency of mobile connections. Cloud-based testing platforms such as Load Impact allow both developers and application owners to implement a continuous testing methodology that can shed light on performance issues other solutions might miss.

A simple way to accomplish this is to perform a long-term (1 to 24+ hour) application response test to look for anomalies that crop up at certain times of day. In this example I compressed the timescale and introduced my own anomalies to illustrate the effects of common infrastructure changes.
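For reference, the user scenario driving such a test can be very small. Here is a minimal sketch in Load Impact’s Lua load script environment (the URL is a placeholder; http.get, client.sleep and log.info follow the platform’s load script API as I understand it, and the test duration itself is set in the test configuration, not in the script):

```lua
-- Minimal sketch: each simulated user repeatedly fetches a key page,
-- so response time can be tracked over a long test run.
local response = http.get("http://webserver.example.com/")

-- Flag anything other than a successful response
if response.status_code ~= 200 then
  log.info("Unexpected status code: " .. response.status_code)
end

-- Pause between iterations so the test samples response time over time
-- instead of hammering the server
client.sleep(math.random(10, 30))
```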

The test environment is built on an ESXi platform and includes a 10 Gbit virtual network, a 1 Gbit physical LAN, an Untangle NG Firewall and a 50/5 Mbit/sec internet link. For the purposes of this test the production configuration of the Untangle NG Firewall was left intact – including firewall rules and IPS protections – however, QoS was disabled. Turnkey Linux was used for the Ubuntu-based Apache webserver, with 8 CPU cores and 2 GB of RAM.

It was surprising to me what did impact response times and what had no effect whatsoever.  Here are a few examples:

First up is the impact of bandwidth consumption on the link serving the webserver farm.  This was accomplished by saturating the download link with traffic, and as expected it had a dramatic impact on application response time:

Impact of download activity on application response times

At approximately 14:13 link saturation occurred (50 Mbit/sec) and application response times nearly tripled as a result


Snapshot of the Untangle Firewall throughput during link saturation testing

Next up is executing a VMware snapshot of the webserver. I fully expected this to impact response times significantly, but the impact was brief. If this had been a larger VM, the impact could have lasted longer:


This almost 4x spike in response time lasts only a few seconds and is the result of a VM snapshot

Last was a test to simulate network congestion on the LAN segment where the webserver is running.

This test was accomplished using Iperf to generate 6+ Gbit/sec of network traffic to the webserver VM. While I fully expected this to impact server response times, the fact that it did not is a testament to how good the 10 Gbit vmxnet3 network driver is:


Using Iperf to generate a link-saturating 15+Gbit/sec of traffic to Apache (Ubuntu on VM)

 


In this test approximately 5.5 Gbit/sec was generated to the webserver, with no impact whatsoever on response times
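For reference, this kind of sustained load can be generated with an iperf 2 client invocation along the lines of “iperf -c <webserver-ip> -P 6 -t 600” (six parallel TCP streams for ten minutes) – the exact flags and stream counts behind these screenshots may have differed.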

Taking a continuous monitoring approach to application performance benefits not only application developers and owners, but also those responsible for network, security and server infrastructure. The ability to pinpoint the moment when performance degrades and correlate that with server resources (using the Load Impact Server Metrics Agent) and other external events is very powerful.

Oftentimes, application owners do not have control over, or visibility into, the entire infrastructure, and having concrete “when and where” evidence makes conversations with other teams in the organization more productive.

———-

This post was written by Peter Cannell. Peter has been a sales and engineering professional in the IT industry for over 15 years. His experience spans multiple disciplines including networking, security, virtualization and applications. He enjoys writing about technology and offering a practical perspective on new technologies and how they can be deployed. Follow Peter on his blog or connect with him on LinkedIn.

Load Impact: Closed Vulnerability to Heartbleed Bug

As you may have heard, a serious bug in the OpenSSL library was recently found. The bug, known colloquially as “Heartbleed” (CVE-2014-0160), impacted an estimated two-thirds of sites on the internet – including Load Impact.

While Load Impact has no evidence of anyone exploiting this vulnerability, we have taken action to mitigate all risks and are no longer vulnerable. 

The vulnerability has existed in OpenSSL for the past two years and, during this time, could have been used by malicious hackers to target a specific online service or site and covertly read random traffic between the site and its users. Over time, this means an attacker could gather sensitive information such as account details, passwords and encryption keys used by the site or its users.

Many sites have unknowingly been vulnerable to this bug for the past two years, and most probably have little or no information about whether they have been targeted by hackers, as the attack would appear to be an entirely legitimate request and is unlikely to even be logged by most systems.

We advise you to be aware of this issue and ask your various online service providers for information if they haven’t provided you an update already. You should also consider changing your passwords on most systems you have been using for the past two years.

Load Impact has only been vulnerable to this bug since October 2013 – when we started using Amazon’s SSL service (through Amazon’s ELBs) – so our exposure is limited. However, since there is still a risk that someone may have stolen information from us in the past six months, we have now replaced our SSL certificates and keys. 

As an extra precaution, we advise our users to:

  • Create a new password
  • Generate new API keys

Feel free to contact us if you have any questions.

More info on the OpenSSL “Heartbleed bug” can be found here: http://heartbleed.com/

WordPress Vertical Scalability: How Performance Varies with Changes in Hardware

How does your web application respond to improvements in the underlying hardware? Well, that depends a lot on your application. Different applications are limited by different factors – RAM, CPU, bandwidth and disk speed, to name a few. In this article, I’ll show you an approach to testing your way to an understanding of how your application consumes resources.

At some point in the development cycle, preferably early, it makes good sense to narrow down which factors limit your application the most. It’s also useful to flip that statement around and ask yourself: which hardware improvements will benefit your overall performance the most? The answer to that question is probably the most important information you need for good resource planning.

To demonstrate the concept of vertical scalability testing (or hardware sensitivity testing), I’ve set up a very simple WordPress 3.8.1 installation and will examine how performance varies with changes in hardware. The tests are made using virtual machines, where hardware changes are easy to make. I’ve created a simple but somewhat credible user scenario using the Load Impact User Scenario Recorder for Chrome.

The simulated users will:

  •  Surf to the test site
  •  Use the search box to search for an article
  •  Surf to the first hit in the search results
  •  Go back to the home page
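As a rough illustration, the recorded scenario boils down to something like the following load script (a sketch only: the URLs, search term and sleep times are placeholders, and a real recording also fetches each page’s static resources):

```lua
-- Sketch of the WordPress user scenario (placeholder URLs and search term)
local base = "http://wordpress-test.example.com"

-- 1. Surf to the test site
http.get(base .. "/")
client.sleep(math.random(5, 10))

-- 2. Use the search box to search for an article
http.get(base .. "/?s=performance")
client.sleep(math.random(5, 10))

-- 3. Surf to the first hit in the search results
-- (the recording captures the concrete article URL; hard-coded here)
http.get(base .. "/?p=1")
client.sleep(math.random(5, 10))

-- 4. Go back to the home page
http.get(base .. "/")
```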

The baseline configuration is very conservative:

  • CPU: 1 core
  • RAM: 128 MB
  • Standard disks.

The test itself is a basic ramp-up test going from 0 to 50 concurrent users. Based on experience from previous tests with WordPress, a low-power server like this should not be able to handle 50 concurrent users running stock WordPress. The idea is to run the test until we start seeing failures; the longer it takes before we see failures, the better. In the graph below, the green line is the number of simulated users, the blue line is the average response time and the red line is the failure rate, measured as the number of failed requests per second. As you can see, the first failed requests are reported at 20 concurrent users.

[Graph: baseline test results]

A comment on the response times (blue line) going down: at a high enough load, nearly 100% of all responses are error messages. Typically, the error happens early in the request and no real work is carried out on the server. So don’t be fooled by falling response times as we add load; it just means the server is quick to generate an error.

 

RAM Memory sensitivity

First, I’m interested to see how performance varies with available RAM. I’ve made the point in previous articles that many PHP-based web applications are surprisingly hungry for RAM. So let’s see how our baseline changes with increased RAM:

At 256 MB RAM (2x baseline):

[Graph: test results at 256 MB RAM]

At 512 MB RAM (4x baseline):

[Graph: test results at 512 MB RAM]

 

That’s quite a nice correlation. We see that the number of simulated users that can be handled without a failure moves higher and higher. At 1024 MB RAM (8x baseline) we actually don’t get any errors at all:

[Graph: test results at 1024 MB RAM]

Also note that before the WordPress server spits out errors, there’s a clear indication in the response times. At light load, any configuration can manage a response time of about 1 s, but as the load increases and we near the point where we start seeing errors, response times have already gone up.

 

Sensitivity to CPU cores

The next angle is to look at CPU core sensitivity. With more CPU available, things should move faster, right? RAM has been reset to 128 MB, but now I’m adding CPU cores:

Using 2 CPU cores (2x baseline)

[Graph: test results with 2 CPU cores]

Oops! As you can see, this is fairly close to the baseline. The first errors start happening at 20 concurrent users, so more CPU couldn’t do anything to help the situation once we ran out of memory. For the sake of completeness, using 4 CPU cores shows a tiny improvement: the first errors appear at 23 concurrent users instead of 20.

Using 4 CPU cores (4x baseline)

[Graph: test results with 4 CPU cores]

Adding more CPU cores doesn’t seem to be my highest priority.

 

Next step, mixing and matching.

You’ve probably already figured out that 128 MB of RAM is too little memory to host a stock WordPress application. We’ve discussed WordPress specifically before, and this is not the first time we’ve found WordPress to be hungry for RAM. But that wasn’t the point of this article. Rather, I wanted to demonstrate a structured approach to resource planning.

In a more realistic scenario, you’d be looking for a balance between RAM, CPU and other resources. Rather than relying on various ‘rules of thumb’ of varying quality, performing the actual measurements is a practical way forward. Using a modern VPS host that lets you mix and match resources, it’s quite easy to perform these tests. So the next step is yours.

My next step will be to throw faster disks (SSDs) into the mix. Both Apache/PHP and MySQL benefit greatly from running on SSDs, so I’m looking forward to seeing those numbers.

Comments, questions or criticism? Let us know by posting a comment below:

——-

This article was written by Erik Torsner. Erik is based in Stockholm, Sweden, and divides his time between technical writing and managing customer projects in system development at his own company. Erik co-founded mobile startup EHAND in the early 2000s and later moved on to work as a technology advisor and partner at the investment company that seeded Load Impact. Since 2010, Erik has run Torgesta Technology. Read more about Erik on his blog at http://erik.torgesta.com or on Twitter @eriktorsner.

 

 

[NEW RELEASE] Mobile Performance Testing – Including Network and Client Emulation

Today, we introduced the first true cloud-based load testing Software as a Service for mobile apps, APIs and websites – one that can simulate traffic generated from a variety of mobile operating systems, popular browsers and mobile networks, including 3G, GSM and LTE.

Currently, only about half of companies with mobile sites or apps test their mobile code, and a recent industry study reported that when a mobile app fails, 48 percent of users are less likely to use the app again, 34 percent will switch to a competitor, and 31 percent will tell others about their poor experience. [1]

Our new testing service for mobile apps, APIs and websites allows developers to emulate client behavior when downloading content to a phone, specify the number of concurrent downloads in total and per host, and set the mix of different client applications or browsers, including Safari, Chrome, Firefox and Opera.

Developers can also use our new features to emulate mobile network characteristics including available bandwidth, network delay, packet loss, jitter and packet reordering.

So what’s REALLY changed?

What’s really new is that when we simulate a mobile client – whether it is a mobile user running a mobile web browser and accessing a standard web site, or it is a mobile user playing the Candy Crush app – we can generate the same kind of traffic for the servers to handle that real users would.

If the average mobile user has a network connection speed of, say, 384 kbit/s (old-generation 3G), we will not let our simulated client load data from the servers any faster than that.
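To put a number on it: at 384 kbit/s, just transferring a 1 MB page (8 Mbit) takes at least 8 / 0.384 ≈ 21 seconds, while an unthrottled simulated client in a well-connected data center could pull the same page in under a second. The servers see very different connection-holding behavior in those two cases.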

[Screenshot: network emulation settings in the Load Impact test configuration]

In previous versions of Load Impact, and in most other load testing tools, every simulated client/user in a load test loads things at the maximum possible speed, at all times. This of course results in a very skewed test result – one that might tell you your site/app can handle a maximum of 1,000 concurrent users while in reality you could handle a lot more (or less).

Apart from simulating network connection speed, we also simulate network latency, which is just as important for performance as connection speed. Like connection speed, latency affects how “heavy” a client is for the servers to handle.

Our network/client emulation feature is currently available at the test level only, but you will soon be able to simulate mobile traffic at the user scenario level too. We’ll be sure to let you know when the update arrives.

Test your mobile code now at loadimpact.com

Mobile Network Emulation – The Key to Realistic Mobile Performance Testing


When was the last time you looked at your website’s browser statistics? If you have lately, you’ve likely noticed a trend that’s pretty hard to ignore: your users are browsing from mobile devices more than ever before. What was once a small sub-segment of your audience is now a large and fast-growing share of your traffic. This may not be so surprising, since mobile usage today makes up about 15 percent of all Internet traffic. Basically, if you don’t already have a mobile development strategy, you may already be losing sales and users due to poor mobile performance.

Responsive design takes care of your website’s layout and interface, but performance testing for mobile devices makes sure your app can handle hundreds (even thousands) of concurrent users. A small delay in load time might seem like a minor issue, but slow mobile apps kill sales and user retention. Users expect your apps to perform at the same speed as desktop apps. It seems like a ridiculous expectation, but here are some statistics:

  • If your mobile app fails, 48% of users are less likely to ever use the app again. 34% of users will just switch to a competitor’s app, and 31% of users will tell friends about their poor experience, which eliminates those friends as potential customers. [1]
  • Mobile app development is expected to outpace PC projects by 400% in the next several years. [2]
  • By 2017, over 20,000 petabytes (that’s over 20 million gigabytes!) will be sent using mobile devices. Streaming is the expected primary driver for growth.[3]
  • 60% of mobile failures are due to performance issues and not functional errors. [4]
  • 70% of the performance of a mobile app is dependent on the network. [5]
  • A change in latency from 2ms (broadband) to 400ms (3G network) can cause a page load to go from 1 second to 30 seconds. [6]

These statistics indicate that jumping into the mobile market is not an option but a necessity for any business that plans to thrive in the digital age. You need more than just a fancy site, though – you need a fast fancy site. And the surefire way to guarantee your mobile site/app can scale and deliver great performance, regardless of the level of stress on the system, is to load test early and continuously throughout the development process.

Most developers use some kind of performance testing tool during the development process. However, mobile users are different from broadband users and therefore require a different set of testing tools to make sure they are represented realistically in the test environment. Mobile connections are less reliable; each geographic area has different speeds; latency is higher for mobile clients; and older phones won’t load newer website code. That’s why you need real-world mobile network emulation and traffic simulation.

Prior to the availability of good cloud performance testing tools, most people thought the solution to performance problems was “more bandwidth” or “more server hardware”. But those days are long over. If you are to stay competitive today, you need to know how to optimize your mobile code. Good performance testing and traffic simulations take more than just bandwidth into account. Network delays, packet loss, jitter, device hardware and browser behavior are also factors that affect your mobile website’s or app’s performance. To properly test your app or site, you need to simulate all of these conditions – simultaneously and from different geographic locations (i.e. not only is traffic more mobile, it’s also more global).

You not only want to simulate thousands of calls to your system, you also want to simulate realistic traffic behavior. In reality, your users don’t all access your site or app with the same browser, device and location. That’s why you need to simulate traffic from all over the globe, with several different browsers and devices, to identify real performance issues. For instance, it’s not unlikely that an iPhone 5 on a 4G network will run your software fine, but drop down to 3G and the software fails. Only realistic network emulation covers this type of testing scenario.

Finally, simulating real user scenarios is probably the most important testing requirement. Your platform’s user experience affects how many people will continue using your service and how many will pass on their positive experience to others. Real network emulation performs the same clicks and page views as real users. It will help find any hidden bugs that your testing team didn’t find earlier and will help you guarantee that the user experience delivered to the person sitting on a bus using a 3G network is the same as that of the individual accessing your service from a desktop connected through DSL.

Several years ago, mobile traffic was negligible, but it’s now too prominent to ignore. Simply put: don’t deploy without testing your mobile code!

Check out Load Impact’s new mobile testing functionality. We can simulate traffic generated from a variety of mobile operating systems, popular browsers, and mobile networks – including 3G, GSM and LTE. Test your mobile code now!

What to Look for in Load Test Reporting: Six Tips for Getting the Data You Need

Looking at graphs and test reports can be a befuddling and daunting task. Where should I begin? What should I be looking out for? How is this data useful or meaningful? Here are some tips to steer you in the right direction when it comes to managing load test results.

For example, the graph (above) shows how the load times (blue) increase [1] as the service reaches its maximum bandwidth (red) limit [2], and subsequently how the load time increases even more as bandwidth drops [3]. The latter phenomenon occurs due to 100% CPU usage on the app servers.

When analyzing a load test report, here are the types of data to look for:

  • What’s the user scenario design like? How much time should be allocated within the user scenario? Are the users geographically spread?

  • Test configuration settings: is it ramp-up only or are there different steps in the configuration?

  • While looking at the test results, do you get an exponentially growing (x²) curve? Or is there an initial downward trend that plateaus (a linear/straight line) before diving drastically?

  • What does the bandwidth/requests-per-second curve look like?

  • For custom reporting and post-test management, can you export your test results to CSV format for further data extraction and analysis?

Depending on the layout of your user scenarios – how much time is spent within a particular user scenario across all actions (calculated from the total amount of sleep time), and how the users are geographically spread – you will likely end up looking at different metrics. However, below are some general tips to ensure you’re getting and interpreting the data you need.

Tip #1: In cases of very long user scenarios, it is better to look at a single page or object rather than the “user load time” (i.e. the time it takes to load all pages within a user scenario, excluding sleep times).

Tip #2: Even though “User Load Time” is a good indicator for identifying problems, it is better to dig deeper by looking at individual pages or objects (URLs) to get a more precise indication of where things have gone wrong. It may also be helpful to filter by geographic location, as load times may vary depending on where the traffic is generated from.

Tip #3: If you have a test configuration with a constant ramp-up and during the test the load time suddenly shoots through the roof, this is a likely sign that the system got overloaded a bit earlier than the results show. To gain a better understanding of how your system behaves under a certain amount of load, apply different steps in the test configuration to allow the system to calm down for approximately 15 minutes at each level. By doing so, you will obtain more and higher-quality samples for your statistics.

Tip #4: If you notice load times increasing and then suddenly starting to drop, your service might be delivering errors with “200 OK” responses, which would indicate that something may have crashed in your system.

Tip #5: If you get an exponential (x²) curve, you might want to check the bandwidth or requests per second. If it’s decreasing, or not increasing as quickly as expected, this indicates issues on the server side (e.g. the front-end/app servers are overloaded). If it’s increasing to a certain point and then plateauing, you probably ran out of bandwidth.

Tip #6: To easily identify the limiting factor(s) in your system, you can add a Server Metrics Agent, which reports performance metrics from your servers. Furthermore, you can export or download the whole test data set – all the requests made during the test, including the aggregated data – and then import and query it in MySQL or whichever database you prefer.
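As a sketch of what that post-test analysis can look like even before the data reaches a database, here is a small standalone Lua snippet that computes a 95th percentile load time from an exported CSV (the file name and column layout are hypothetical – adjust the field extraction to match the actual export format):

```lua
-- Sketch: read exported results and compute the 95th percentile load time.
-- Assumes (hypothetically) each CSV row ends with a numeric load time in ms.
local times = {}
for line in io.lines("test_results.csv") do
  local load_time = line:match("([%d%.]+)%s*$")  -- last numeric field on the row
  if load_time then
    times[#times + 1] = tonumber(load_time)
  end
end

table.sort(times)
if #times > 0 then
  local p95 = times[math.ceil(#times * 0.95)]
  print(string.format("samples: %d, 95th percentile: %.1f ms", #times, p95))
end
```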

In a nutshell, the ability to extract information from load test reports allows you to understand and appreciate what is happening within your system. To reiterate, here are some key factors to bear in mind when analyzing load test results:

  • Check Bandwidth

  • Check load time for a single page rather than user load time

  • Check load times for static objects vs. dynamic objects

  • Check the failure rate

  • For Server Metrics – check CPU and Memory usage status

……………….

 

This article was written by Alex Bergvall, Performance Tester and Consultant at Load Impact. Alex is a professional tester with extensive experience in performance testing and load testing. His specialties include automated testing, technical function testing, functional testing, test case creation, accessibility testing, benchmark testing and manual testing.

Twitter: @AlexBergvall

New Load Script APIs: JSON and XML Parsing, HTML Form Handling, and more!

Load scripts are used to program the behavior of simulated users in a load test. Apart from the native functionality of the Lua language, load script programmers can also use Load Impact’s load script APIs to write advanced load scripts.

Now you can script your user scenarios in the simple but powerful Lua language, using our programmer-friendly IDE and new APIs such as JSON and XML parsing, HTML form handling, bit fiddling, and more.
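As a small taste, here is a sketch of how the JSON parsing API might be used inside a user scenario (the endpoint is a placeholder, and details such as json.parse and response.body follow my reading of the new APIs, so treat them as illustrative):

```lua
-- Sketch: fetch a JSON API, parse the body, and let the data drive the scenario
local response = http.get("http://api.example.com/products")

-- Turn the JSON response body into a Lua table
local data = json.parse(response.body)

-- Use a value from the parsed response in the next request,
-- e.g. load the detail page of the first product returned
if data ~= nil and data[1] ~= nil then
  http.get("http://api.example.com/products/" .. data[1].id)
end
```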


Automated Acceptance Testing with Load Impact and TeamCity (New Plugin)


As you know, Continuous Integration (CI) is used by software engineers to merge multiple developers’ work several times a day. And load testing is how companies make sure that code performs well under normal or heavy use.

So, naturally, we thought it wise to develop a plugin for one of the most widely used CI servers out there – TeamCity by JetBrains. TeamCity is used by developers at a diverse set of industry leaders around the world – from Apple, Twitter and Intel to Boeing, Volkswagen and Bank of America. It’s pretty awesome!

The new plugin gives TeamCity users access to multi-source load testing from up to 12 geographically distributed locations worldwide, advanced scripting, a Chrome extension for easily creating scenarios that simulate multiple typical users, and Load Impact’s Server Metrics Agent (SMA) for correlating the server-side impact of testing – like CPU, memory, disk space and network usage.

Using our plugin for TeamCity makes it incredibly easy for companies to add regular, automated load tests to their nightly test suites and, as a result, get continuous feedback on how their evolving code base is performing. Any performance degradation or improvement is detected immediately when the code that causes it is checked in, which means developers always know whether their recent changes were good or bad for performance – they’re guided toward writing code that performs well.

Download the TeamCity plugin!

Here’s how Load Impact fits in the TeamCity CI workflow:

[Diagram: Load Impact in the TeamCity CI workflow]

And the launch of our new plugin comes just in time to celebrate a pretty big milestone for us too – 1 MILLION LOAD TESTS!

Guess when we will run our ONE MILLIONTH load test and win a Pebble Watch!

Updated: We hit a million at exactly 2014-02-23 22:36:18 UTC. The winner is the person who guessed 2014-02-24 at 10:00 AM! Your Pebble watch is on the way, sir :-)

—-

In case you hadn’t noticed, we are about to hit our ONE MILLIONTH load test! Just kidding, we know nobody is watching the counter as closely as we are.

As we place bets internally about when we will actually reach this coveted number, we thought our customers might want to join in the fun too. Especially since there’s some nice swag to be won for having the closest guess.


All you have to do is leave a comment below with your guess of the DATE and TIME (to the closest hour) we will execute our one millionth load test. Or submit your answer via email to: support@loadimpact.com. 

The person with the closest guess will win a Pebble watch worth $249 USD.

The top 10 runners-up, including the winner, will receive $100 of Load Impact test credits and a nifty Load Impact t-shirt.


Submit your guess in the comment section below or email support@loadimpact.com. We will reply to you directly if you’ve won.

Hurry, the counter is ticking….

Contest rules: No purchase necessary; the contest starts 2014/02/19 and ends no later than 2014/06/30 or when Load Impact runs its one millionth load test (whichever comes first); participants must be 18 or older and have a confirmed Load Impact account; the method to enter is to leave a comment in the section below or email support@loadimpact.com; limited to one entry per person; the value of the first prize is $350 USD; the winner will be selected by comparing their submission with the actual date and time Load Impact runs its one millionth load test (in the event of a tie, the entry submitted first will win); the winner will be notified with a reply in the comment section below or via email; in order to claim the prize the winner(s) must provide Load Impact a valid email address, telephone number and mailing address.

Contest administered by Load Impact, Götgatan 14, SE-118 46, Stockholm, Sweden.

Countdown of the Seven Most Memorable Website Crashes of 2013

Let this be a lesson to all of us in 2014. 

Just like every other year, 2013 had its fair share of website crashes. While there are many reasons why a website might fail, the most likely issue is the site’s inability to handle incoming traffic (i.e. load).

Let’s look at some of the most memorable website crashes of 2013 that were caused by traffic overload.

#7. My Bloody Valentine

On February 2nd, the obviously not-so-alternative shoegaze legends My Bloody Valentine decided to release their first album since 1991, and they decided to do so online. The site crashed within 30 minutes.

In the end, most of their fans likely got hold of the new album within a day or two, and the band, which clearly has a loyal fanbase, probably didn’t end up losing any sales due to the crash.

#6. Mercedes F1 Team 

The Mercedes F1 team came up with a fairly clever plan to promote their web content. In February, they told fans on Twitter that the faster they retweeted a certain message, the sooner the team would reveal sneak preview images of their 2013 Formula One race car.

It worked a little too well. While waiting for the magic number of retweets to be reached, F1 fans all over the world kept accessing the Mercedes F1 web page in hopes of being the first to see the new car. Naturally, they brought the website down.

“You guys are LITERALLY killing our website!” Mercedes F1 said via Twitter.

#5. NatWest / Royal Bank of Scotland

Mercedes F1 and My Bloody Valentine likely benefited from the PR created by their respective crashes, but there was certainly nothing positive to come out of the NatWest/RBS bank website crash – a crash that left customers without access to their money!

In December, NatWest/RBS saw their second website crash in a week when a DDoS attack took them down.

It’s not the first DDoS attack aimed at a bank, and it’s probably not the last one either.

#4. Sachin Tendulkar crash

One of India’s most popular cricketers, Sachin Tendulkar, also known as the “God of Cricket”, retired in 2013 with a bang! He did so by crashing local ticketing site kyazoonga.com.

When tickets for his farewell game at Wankhede in Mumbai became available, kyazoonga.com saw a record-breaking 19.7 million hits in the first hour, after which the website was promptly brought down.

Fans were screaming in rage on Twitter, and the hashtag #KyaZoonga made it to the top of the Twitter trending list.

#3. UN Women – White Ribbon campaign 


It may be unfair to say that this website crash could have been avoided, but it’s definitely memorable.

On November 25th – the International Day for the Elimination of Violence against Women – Google wanted to acknowledge the occasion by linking to the UN Women website from the search giant’s own front page.

As a result, the website started to see a lot more traffic than it had been designed for and began to load slowly, even crashing entirely.

Google had given the webmasters at unwomen.org a heads-up, and the webmasters did take action to beef up their capacity, but it was just too difficult to estimate how much traffic they would actually get.

In the end, the do-no-evil web giant and unwomen.org worked together and managed through the day, partly by redirecting the link to other UN websites.

Jaya Jiwatram, the web officer for UN Women, called it a win. And frankly, that’s all that really matters when it comes to raising awareness for important matters.

#2. The 13 victims of Super Bowl XLVII

Coca-Cola, Axe, SodaStream and Calvin Klein had their hands full during Super Bowl XLVII – not so much serving online visitors as running around looking for quick fixes for their crashed websites.

As reported by Yottaa.com, no fewer than 13 of the companies that ran ads during the Super Bowl saw their websites crash just as they needed them the most.

If anything in this world is ever going to be predictable, a large spike of traffic when you show your ad to a Super Bowl audience must be one of those things.

#1. healthcare.gov

The winner of this countdown shouldn’t come as a surprise to anyone. Healthcare.gov came crashing down before it was even launched.

It did recover quite nicely in the last weeks of 2013 and is now actually serving customers – if not exactly as intended, at least well enough for a total of 2 million Americans to enroll.

But without hesitation, the technical and political debacle surrounding healthcare.gov makes it the most talked about and memorable website crash in 2013.

Our friends over at PointClick did a great summary of the Healthcare.gov crash. Download their ebook for the full recap: The Six Critical Mistakes Made with Healthcare.gov

There’s really nothing new or surprising about the website crashes of 2013. Websites have been developed this way for years – often with the same results. But there are now new methodologies and tools changing all that.

It isn’t like it used to be: performance testing isn’t hard, time-consuming or expensive anymore. One just needs to recognize that load testing should be done early and continuously throughout the development process. It’s not optional anymore. Unfortunately, it seems these sites found that out the hard way – and a few of them will likely learn the lesson again in 2014.

Our prediction for 2014 is more of the same. However, mainstream adoption of development methodologies such as Continuous Integration and Continuous Delivery, which advocate early and continuous performance testing, is quickly gaining speed.

A Google search trend report for the term “DevOps” clearly shows the trend. If the search trends are any indication of the importance major brands, app makers and SaaS companies are giving to proactive performance testing, we might see only half as many Super Bowl advertiser site crashes in 2014 as we did last year.

[Graph: Google search trend for “DevOps”]

Update following Super Bowl XLVIII: According to GeekBeat, the Maserati website crashed after their ad featuring the new Maserati Ghibli aired. And monitoring firm OMREX found that two of the advertiser websites – Coca-Cola and Dannon Oikos – had uptime performance issues during the game.

About Load Impact

Load Impact on-demand services detect, predict, and analyze performance problems – providing the information businesses need to proactively optimize their websites, apps and APIs for customers.
 
With its roots in work for NASDAQ and the European Space Agency, Load Impact has been redefining load testing since 2011 by making it cost-effective, instantly available, automated, and very easy to use.
 
Test your website, app or API at loadimpact.com

Follow us on Twitter
