All posts by artiverse-admin

The cost of retries – part 1

In the previous post I discussed how bad it is for proxies to retry. In that post I mentioned offhandedly that the proxy retrying was not only going to make your app slower but also more expensive. This is a first look at that problem.

For your convenience I have created another visualization on GitHub.

Retries in a linear system

Imagine you have a perfect system and it has constant response time no matter how heavily you load it and it will never drop anything from its queue. If we hit the system with more requests than it can handle it will keep processing requests at that same speed but the responses will be ever more delayed because the requests are stuck in a request queue.

No retries

In the no-retry case, as we slowly load up the system the rate of incoming requests eventually exceeds the rate at which they can be processed (about 1500s with default settings). At this point both the queue length and the response time start increasing. Eventually the response time exceeds the timeout on the client (about 1800s with default settings). From the client’s perspective, at this point all requests start failing. Note that this situation persists well past the point where the load has dropped below what the system can handle (about 2200s with default settings) because there are still so many – effectively dead – requests stuck in the queue. Only when the queue size drops significantly does the response time drop back below the 15s timeout (about 2350s with default settings) and requests start succeeding again.
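To make the no-retry timeline concrete, here is a minimal sketch in Python. It is not the GitHub visualization and does not use its default settings: the capacity of 100 requests/s, the triangular load peaking at 120 requests/s, and all function names are assumptions chosen for illustration. Only the 15s client timeout comes from the text above.

```python
# Hypothetical numbers: NOT the defaults of the GitHub visualization.
CAPACITY = 100.0   # requests the back end can process per second (assumed)
TIMEOUT_S = 15.0   # client timeout, as in the text


def triangle_load(duration=3600, peak=120.0):
    """Arrival rate (requests/s) that ramps up linearly to `peak`, then back down."""
    half = duration / 2
    return [peak * (1 - abs(t - half) / half) for t in range(duration)]


def simulate_no_retry(load, capacity=CAPACITY):
    """Step a queue that never drops anything, one second at a time.

    Returns the time (in seconds) a request arriving in each second would
    have to wait behind the requests already queued.
    """
    queue = 0.0
    waits = []
    for arrivals in load:
        queue = max(0.0, queue + arrivals - capacity)
        waits.append(queue / capacity)
    return waits


load = triangle_load()
waits = simulate_no_retry(load)

overloaded = [t for t, rate in enumerate(load) if rate > CAPACITY]
timed_out = [t for t, wait in enumerate(waits) if wait > TIMEOUT_S]

print("load exceeds capacity from", overloaded[0], "to", overloaded[-1])
print("clients time out from", timed_out[0], "to", timed_out[-1])
```

With these made-up numbers the timeouts start some time after the overload begins and keep going for several minutes after the load has dropped back below capacity, because the backlog still has to drain.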

All this is bad, but it is expected. One could argue that in this model there is only a problem if we are running very close to the limit of the ability of the system to handle load and that at that point some failures are expected.

With retries

Of course when the request fails it is likely the client will retry. In the simplest case we add a single retry. Everything remains the same until the first timeouts. At this point the number of requests on the system increases significantly because for every request more than 15s old a new request is added to the queue. This can be observed in the steep increase of the overall length of the request queue (bold red line). At the same time the average response time for requests also increases (orange line) because there are now even more (still dead) requests in the queue that need to be processed (and dropped by the client) before recovery can happen.

The thin red lines show the individual contributions of the original requests and the retries. As expected with a single retry these each contribute about half of the overall queued requests. If you look very carefully you will see that the contribution of the original requests initially almost follows the no-retry case but then with the increase of the retries suddenly increases to about the same level as the retries. This may initially seem counterintuitive but can easily be understood if you consider that original requests and retry requests are indistinguishable in the queue and the back end will process them as they come in. In other words, for every retry the back end will not process a first request and as a result more and more original requests will remain in the queue for longer.
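The effect of a single retry can be sketched the same way. Again this is only an illustration under assumed numbers (capacity 100 requests/s, triangular load peaking at 120 requests/s, a quiet tail so the queue has time to drain), not the model behind the visualization. The retry rule is also a simplification: a request that arrives to a wait longer than the 15s timeout produces exactly one retry 15s later, the dead original stays in the queue, and retries are never retried.

```python
# Hypothetical numbers and names: NOT the defaults of the GitHub visualization.
def triangle_load(duration=3600, peak=120.0, quiet_tail=1200):
    """Ramp the arrival rate up to `peak` and back down, then send nothing."""
    half = duration / 2
    ramp = [peak * (1 - abs(t - half) / half) for t in range(duration)]
    return ramp + [0.0] * quiet_tail


def simulate(load, capacity=100.0, timeout=15, single_retry=False):
    """Step a never-dropping queue once per second, optionally with one retry."""
    queue = 0.0
    queues, waits = [], []
    for t, arrivals in enumerate(load):
        retries = 0.0
        # Originals that arrived `timeout` seconds ago and were facing a wait
        # longer than the timeout are sent again now (assumed retry rule).
        if single_retry and t >= timeout and waits[t - timeout] > timeout:
            retries = load[t - timeout]
        queue = max(0.0, queue + arrivals + retries - capacity)
        queues.append(queue)
        waits.append(queue / capacity)
    return queues, waits


def last_timeout(waits, timeout=15):
    """Last second in which a newly arriving request would still time out."""
    return max(t for t, wait in enumerate(waits) if wait > timeout)


load = triangle_load()
plain_queues, plain_waits = simulate(load)
retry_queues, retry_waits = simulate(load, single_retry=True)

print("peak queue without retry:", round(max(plain_queues)))
print("peak queue with one retry:", round(max(retry_queues)))
print("last timeout without retry:", last_timeout(plain_waits))
print("last timeout with one retry:", last_timeout(retry_waits))
```

In this sketch the retries roughly double the arrival rate exactly when the system is already behind, so the queue peaks far higher and recovery comes much later than in the no-retry run.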

To be continued …

I will leave you to ponder the implications of all this while I go off building a couple more models – specifically (1) a model where the system is running with a perfectly fine amount of headroom but gets hit by a request spike – be it Black Friday or a network outage – and (2) a more realistic model where the response time is not constant with load to give you a more visceral sense of how much headroom a system really needs.


Proxies must not retry

You live in the cloud. Your app lives in the cloud. Mostly. You’ve decided to add access controls via a simple proxy. Your service is supposed to have “100%” uptime, so of course the proxy has to have “100%” uptime.

So far so good – except that the back end only has 99.9% uptime and your stupid ops people have set up alarms that check service uptime via your proxy. Since you don’t want to get dinged you figure you’ll retry. No alarms, no problem. Right?

Truth is you’ve just made your app slower. Probably a lot slower. And more expensive. And less stable.


Look at the data

Have a look at this picture. This is a test for a proxy that retries after 15s.


Let’s focus on the orange data. You’ll have to trust me when I say there are orange dots under the green dots. What you see is that the retry works really well: typical response time is about 2s and if that fails we get responses after about 17s (15+2) and if that fails we get responses after about 32s (2*15+2) and if that fails we get responses after about 47s (3*15+2). This is great! The proxy works!
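The ladder of response times is just the proxy timeout stacked up once per failed attempt. A tiny sketch of that arithmetic, using the 15s retry interval and the roughly 2s typical response from the test above (the function and constant names are made up for illustration):

```python
PROXY_RETRY_AFTER_S = 15   # the proxy gives up and retries after 15s
TYPICAL_RESPONSE_S = 2     # typical back-end response time in the test


def response_time_after(failed_attempts):
    """Seconds the client waits if the first `failed_attempts` tries time out."""
    return failed_attempts * PROXY_RETRY_AFTER_S + TYPICAL_RESPONSE_S


print([response_time_after(n) for n in range(4)])  # [2, 17, 32, 47]
```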

Does it though? What should the client do? Should it wait for 50s? Or should it retry 25 times after 2s in the hopes that a single call will take the expected 2s? 10 times after 5s to account for some spread? Exponential backoff?

Based on the orange lines the client should absolutely retry every 3-5s. Of course that will kill your proxy and back end because each of the “timed out” calls will still go through the full proxy/back-end retry cycle. You just DoSed yourself.

Of course the blue data is more realistic. Under load there is actual spread. Some calls really do take up to 15s. So really you want exponential backoff. But even now you are abandoning calls to the retry pattern and DoSing yourself. Not as badly but still.

In both of the above cases your client contains retry code. Now, why would you have retry code in your proxy?

I don’t believe you!

Ok. Just for you I have created this cool little toy on GitHub which allows you to walk through this step by step. Let’s say your server takes at least 2s to respond and at most 6s. Let’s model this as a Gaussian because they are pretty:

Bad retry example
The blue line shows the instantaneous probability that your request will be served at this time. The green line is the integrated probability, meaning the probability that your request has been served by this time. Basically at 6s it is all but guaranteed that you have received a reply.

So far so good. Now let’s have a look at the red line and what happens if we retry. If we retry early then we give up on any chance of the old request being fulfilled and start the wait again at the beginning. What this shows very nicely is that for any retry before you are guaranteed completion at 6s your performance will get worse.
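Here is a minimal sketch of that argument, separate from the GitHub toy. It assumes the Gaussian is centered at 4s and that 2s and 6s sit three standard deviations out (so sigma is 2/3 s); both numbers and the function names are assumptions for illustration. A retry at `retry_at` abandons the original request and restarts the clock.

```python
import math

MEAN_S = 4.0          # assumed center of the 2s..6s response window
SIGMA_S = 2.0 / 3.0   # assumed: 2s and 6s are three sigma away from the mean


def served_by(t):
    """Probability that a single request has been answered by time t (green line)."""
    return 0.5 * (1.0 + math.erf((t - MEAN_S) / (SIGMA_S * math.sqrt(2.0))))


def served_by_with_retry(t, retry_at):
    """Same probability if the request is abandoned and resent at `retry_at`."""
    if t <= retry_at:
        return served_by(t)
    # Either the original came back before the retry, or it did not and the
    # retry has had (t - retry_at) seconds to come back.
    already_served = served_by(retry_at)
    return already_served + (1.0 - already_served) * served_by(t - retry_at)


for retry_at in (3.0, 4.0, 5.0):
    print(
        "retry at", retry_at,
        "-> served by 6s:", round(served_by_with_retry(6.0, retry_at), 4),
        "vs no retry:", round(served_by(6.0), 4),
    )
```

Under these assumptions every early retry leaves you with a lower chance of having an answer by 6s than simply waiting would.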

How’s that different from the client doing the retry? Admittedly it isn’t. Except the client now has to wait until it’s guaranteed that the proxy would return!


Microcorruption uctf

A friend alerted me to one of the hackaday CTF games: I think every programmer, nay anyone using a computer, should play this!

Well, ok so this spoke to my obsessive nature and I really wish something this cool had existed when I was young, poor, and had the spare time to engage in some real hacking. Not that any of us did. Ever.

I admit it speaks to obsessive personalities and you probably won’t make it past the first 5 or so levels unless you have that obsessive streak, but even if you don’t: it’s a GAME. It has LEVELS. It TRAINS you and levels get progressively harder. And it has this cool hall of fame that shows you how much better you did than everyone else. And if you finish the tutorial and the first level you are already in the top 50% :)

And there’s lots of people playing it so you *can* get help if you ask. In fact it’s so popular that people are creating mods / tools / plugins to play this game:

The Imperial 8

As you may know I row. Well, mostly I scull, but for this year’s end-of-year event in Victoria, BC, the Head of the Gorge race, we decided to dress up and row an 8. So far that’s been a fairly good experience … we have costumes, we have airline tickets, we have hotel rooms … and even the boat works well enough for a bunch of people just thrown together.

Well, I hit a snag today when I made the suggestion to match our blades to our costumes. And then spent an inordinate amount of time today dealing with the insanity that is club politics. In the process we were discussing spinning off a separate club, just for the purpose of that race, so we could break the “club colors” rule: 2013 Rowing Canada racing rules, section 6.5. This was the proposed photo for the club president:

Of course reading rules is a dangerous thing … apparently (according to section 6.2 of the same rules) if all rowers are from the same club they can wear whatever they want as long as everyone is wearing the same thing … if rowers are from different clubs they must wear singlets. Who writes these things? More on this after races …

Keeping track of cool stuff

Today I had time in front of a real computer and so I went down the rabbit hole of . There’s amazing stuff here and like always when I see amazing stuff, I would like to keep track of it!

Like any other site that wants to sell you stuff, they have a wishlist and favorites … and that’s actually very well implemented … but it still requires me to be logged into their site and doesn’t really lend itself to sharing …

This seemed like an obvious case of Pinterest to the rescue …

Now getting that to EMBED … that’s another story entirely! Once again being at the bleeding edge of WordPress is working against me because the recipes people have shared don’t work in 3.6.1 (yet).

Probably the cutest solution to the problem was using the Flickr plugin with the Pinterest RSS feed, so if you have a Flickr plugin that works … great.

The second best suggestion I found was this: use the Pinterest widget builder and embed that into your site. This is great but I want this to work whether I am displaying just one post or many and it’s quite specific about not loading the widget multiple times. What to do …

… I ultimately decided to embed this in the theme … but the suggested “easy” execution scheme is mindboggling. It suggests putting the script tag just above the /body tag, apparently because it needs to execute onLoad. Well, it does seem to work in the footer (Appearance – Editor – Footer) … but the page rendering is … less than desirable.

Embedding the “Advanced” version in the HEAD section works much better …

Is it safe? I don’t know but I don’t keep anything high value here, right?

Mont Blanc Corvids

After all that pain of embedding images / galleries it seems anticlimactically simple to embed videos into a blog …

  • set up a youtube account … it’s as easy as logging in – google will know who you are, I am sure
  • upload video … it’s as easy as drag and drop
  • drop the youtube url directly into WordPress …

And …

But the best part of it is how smart youtube is about the video … the video above was a freehand 3gp from an ancient android phone and you can see that it’s portrait, not landscape. Youtube notices this! It also offers post processing such as de-shaking – ok, I could do better with virtual dub but not at the cost of clicking a button!

Youtube plus WordPress … so easy!

Looking for galleries (2)

Spent a little more time today trying to figure out how to make the gallery thing work. It’s weird, hoarding pictures seems like one of the most common things people do: snap a million pictures, scribble some notes on them to remember where they were taken, laugh about the most compromising ones with friends and make a couple of prints of some special ones for the dresser …

… yet there seems to be practically nothing decent out there to do this with.

Here is what I figure the deliverables for common photo hoarding should be:

  1. Easy to get pictures out of camera
  2. Easy to bulk caption pictures
  3. Easy to do the basic cleaning operations (rotate, crop, red-eye, white balance)
  4. Easy to organize pictures
  5. Easy to share pictures with everyone, ideally on the couch, yours or theirs
  6. Easy to embed them into other projects
  7. Easy to back up.
  8. Not subject to changes in web software version or at least in a data format that’s easy to bulk re-ingest somewhere else.

Photo albums – you know, paper and stuff – can do 2, 4, 5 and 8. Slides can’t actually do any of the above.

Digital age to the rescue … you’d think …

Facebook, the biggest photo sharing site out there, can do 1, 2 and 5. Picasa can do 1-3, 5, and maybe 6 and 7. I haven’t looked at flickr but I suspect it’s going to be similar.

As for offerings that a normal person can install on an amazon or digital ocean instance … gallery/gallery 2 will do 1, 2, 4, and 5 but is a nightmare of bugs. WordPress can do 1 and 5, but unless you are Steve Jobs and don’t believe anyone needs folders it’s probably no good for more than about 5 pictures, not to mention that apparently the code changes so much between versions that it’s a FEATURE of the plugin system to collect user reports on what plugin works with what version.

Why is this so hard?!? And I am not talking about building lightboxes, which is hard but basic functionality.

So where am I at now?

Sigh …

Adventure Park Anthares World

I used to watch fear factor and whenever they had the obstacle courses suspended high in the air I thought that looked like fun and I would love to do that. But I never realized I COULD do that. So naturally I really wanted to go when I saw the pictures at the entrance to the pool.

Getting tickets was interesting because the girls at the entrance spoke practically no English but since I had previously reconnoitered I managed to get a ticket for 1pm and knew to be there half an hour in advance for safety instructions and get a harness. Thankfully the guy doing the instructions spoke excellent English, and the instructions were easy enough to follow:

  • Always clip in.
  • Always use only one hand (so you never accidentally unclip both your safety lines).
  • On red marks clip in directly.
  • On yellow marks remove your roller thingie from the clip on your harness, fit it over the cable and clip in ONE line to secure it on the cable.
  • Check that your landing point is free. Then, and while I followed these instructions I can’t help but wonder if this is good practice, clip the second line into the CARABINER of the first line – and jump.

After I demonstrated my understanding and proficiency by walking around the training course clipping and unclipping he sent me off by telling me the order of the first three courses: green, blue, orange.

Essentially these are training runs that get progressively harder and higher with variations on the themes of ladders, rope bridges, high wires, and zip lines with landing nets. Probably the biggest challenges on these courses were: getting over your fear of heights if you have it, the generally low hanging safety cables (to accommodate the minimum height of 140cm I suspect) and landing in the nets at the end of the zip line, more on this later.

Fortunately I had brought water because even in the shade of the trees it was hot, especially wearing long jeans. Lots of people did this in shorts but there were a few places I am glad I had covered legs and in one place I wished for kneepads. Also in the process of grabbing the water I found out that if you wanted to go to the restaurant you would have to get out of the harness, so another win there.

With all the warmup out of the way, now for the real fun! I may not have all the obstacles in the right courses so here is my best guess:

The red course starts with a long rope ladder and because clipping in and out on a rope ladder would be suicidal, it comes with a fall arrestor line you have to pull down. This was only labeled in Italian so it took me a couple minutes to work out but fortunately there was a dad with his 8 year old ahead of me giving me plenty of time. The highlight of this course was probably a set of individual footholds like hangman’s nooses although one could have used a short zip line instead. Hard on arms and coordination and being more than 140 cm tall helps immensely.

The black course had a Tarzan rope swing. The swing was actually quite easy, especially if you put your feet on the knot at the end of the rope. However clipping into the rope is a challenge because the rope is heavy and you can’t easily support yourself while clipping in the second carabiner. The subsequent landing on a coarse rope net and clipping in are easy and natural … until you realize the way out is UP and that the net you are clipped into stops 4m above ground. This was to me the scariest point because I realized I was getting complacent. The course ends in a super long zip line and here I had my only major mishap: I hit the net backwards and didn’t manage to grab on before sliding back on the zip line, ending up stuck and needing rescue. Clearly this is common though as one of the attendants just came and threw me some rope and then pulled me to the net.

At this point I was exhausted and should probably have stopped, especially because I was told that the yellow course was the hardest, but since I wouldn’t ever get the chance to do this again I braved it anyhow. True to promise this course is hard. A zigzag horizontal ladder, a cable loop horizontal ladder, a net/cable car, a high rope walk with a support that looks like a trapeze – I tried that and it hurt – but is actually a suspended version of a balance pole – works like a charm that way, another high rope walk but with loop obstacles you need to climb through. After all this you end up on a platform that’s a dead end. After calling for help it turns out there is a descender that you can grab and attach to your harness. And then you need to step off the platform. This is extremely daunting because the anchor point is at least 1m away from the edge of the platform and it’s clear you will swing sideways. In the end it’s not as bad as it looks but you do need to push yourself away from a tree trunk.

In summary: the best 20 euro spent this trip and the best 2h spent in a long time!

Went to the pool after to cool off but that was really all that pool is good for on a Sunday afternoon as you will be lucky to find a lawn chair or any shade at all.


Looking for galleries

What is the point of having a blog? Ramblings. What makes people like ramblings? Pictures. So how do you get pictures into WordPress?


Apparently you can just use the native gallery and it’s actually fairly easy and functional:

  • in the editor click “add media”
  • upload media
  • click “create gallery” and select media you want in the gallery
  • insert gallery

Cool but boring.


The jetpack plugin promises lots of goodies. Maybe I don’t get it but the gallery seems to simply not work at all. The mobile theme is a nice addition but since you can’t customize it to match the main theme it also falls short …

… to be continued

(Re)building web presence: wordpress

The Cloud Formation Way

For the last couple of months I have wanted to try WordPress, and specifically the Amazon Cloud Formation template. I initially tried this from my iPad while on vacation but there are some … issues. It could probably be done but my, would you be in for pain!

The process is slick and fast although you will need to already have a key pair installed and probably should know how to log into your instance afterwards. Note that the default instance size is m1.small which is likely overkill for your test site.


After confirming that wordpress looked like the right tool for the job I really don’t want to have yet another instance … the monthly fees do add up … so how about installing on my existing instance. This looks easy:

  • make sure you have MySQL, PHP, and a web server
  • download the tarball to web root
  • follow the install instructions

It gets a little tricky if you happen to be using nginx … in that case you probably want to also consult these:


I would post a link to the result … but you are looking at it ;)


In the process stumbled on which seems to be trivially cheap in comparison to EC2 … likely a future post