Can anybody enlighten me about what kinds of things are in a real SLA
for a cloud? I'm primarily interested in performance and availability.
My reason for asking: A research group I'm becoming affiliated with
has a neat technique for optimally scheduling tasks to provide
realistic guarantees on completion. The nice part is that instead of
the usual average arrival rate and average task duration, they take a
general probability distribution as input, which lets you represent
things like "most of the time we expect traffic X, but every so often
it's going to run up to 100X."
Knowing how many servers you need in reserve to handle that, without
waste, seems like it could be useful.
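
To make the "servers in reserve" idea concrete, here is a toy sketch of my own (not the group's actual technique). It assumes a discrete load distribution, a fixed per-server capacity, and an availability-style target; the function name, the numbers, and the targets are all made up for illustration.

```python
import math

def servers_needed(load_dist, per_server_capacity, target):
    """Smallest server count whose total capacity covers the load with
    probability >= target.

    load_dist: list of (load, probability) pairs (hypothetical input format).
    """
    cumulative = 0.0
    for load, prob in sorted(load_dist):
        cumulative += prob
        # Small tolerance so a target equal to a cumulative step still matches.
        if cumulative >= target - 1e-12:
            return math.ceil(load / per_server_capacity)
    raise ValueError("probabilities sum to less than the target")

# Assumed numbers: usual traffic X = 1000 req/s, a 100X burst 1% of the
# time, and each server handling 500 req/s.
X = 1000.0
load_dist = [(X, 0.99), (100 * X, 0.01)]
capacity = 500.0

mean_load = sum(load * prob for load, prob in load_dist)
print(math.ceil(mean_load / capacity))             # sized from the average only
print(servers_needed(load_dist, capacity, 0.99))   # covers 99% of the time
print(servers_needed(load_dist, capacity, 0.999))  # covers 99.9% of the time
```

With these made-up numbers, sizing from the average gives a handful of servers that never actually fits the burst, while the two targets give very different answers. That gap is the part a full distribution captures and an average hides.
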
But I'd like to know whether this kind of thing has anything to do
with what people actually talk about in SLAs.