In the last few years we have seen new infrastructure technologies change the way we think about developing applications. We have evolved from bare-metal hardware provisioning, through virtualization, to Cloud IaaS and container technologies. Software developers have also changed the way they build applications, with many service-type applications moving to a microservices architecture. The most recent addition to this new way of programming could be called Cloud-based event applications, whose main representatives are the fairly recent commercial offerings such as Amazon Lambda, IBM Bluemix’s OpenWhisk, Google Cloud Functions, and Microsoft Azure Functions. All these services fall under what we call Serverless Architecture, a type of architecture that is currently being discussed in DITAS.
Serverless does not mean that applications run without servers, or even that developers no longer have to take care of server-side logic in their code. What it means is that applications run as stateless functions (the tendency is to use containers) that are triggered by events, have a very short lifespan (on the order of minutes) and are typically fully managed by a third party. These serverless architectures are also known as Function as a Service (FaaS).
Let us take a simple look at how this works, using Amazon Lambda as an example. After all, Amazon was the first to offer this kind of architecture, and the other services follow a more or less similar design:
- The developer does not need to worry about how the code is going to be delivered. The code is uploaded to a repository (Google and Microsoft even allow git repositories to be used directly) that the service provider can access.
The FaaS provider does everything else:
- The FaaS provider puts the code into operation and scales it horizontally, all automatically. This is typically done using containers: the first time a function is called, a new container is created to execute it (this takes less than one minute); after that, the container is kept in a frozen state and reused if necessary. If a function needs to be called several times at once, more containers can be created. The user only pays per function execution, so the cost of the application grows with its usage. The user can configure how many instances of a function may run in parallel, in order to cap the maximum cost.
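To make the pay-per-execution model and the parallelism cap more concrete, here is a minimal sketch of a cost estimate. It assumes a billing model with a fee per request plus a fee per GB-second of execution, which is an assumption and not something every provider necessarily uses. The function names and the two prices are hypothetical and chosen only for illustration; they are not real provider tariffs.

```python
# Hypothetical prices, for illustration only (not real provider tariffs).
PRICE_PER_REQUEST = 0.0000002    # currency units per invocation
PRICE_PER_GB_SECOND = 0.00001    # currency units per GB of memory per second


def estimate_cost(invocations, avg_duration_s, memory_gb,
                  price_per_request, price_per_gb_second):
    """Estimate the bill for a number of function executions.

    Assumes cost = per-request fee + (duration x memory) compute fee.
    """
    request_cost = invocations * price_per_request
    compute_cost = invocations * avg_duration_s * memory_gb * price_per_gb_second
    return request_cost + compute_cost


def max_hourly_compute_cost(concurrency_limit, memory_gb, price_per_gb_second):
    """Upper bound on one hour of compute cost under a parallelism cap.

    Worst case: every allowed instance is busy for the full 3600 seconds.
    Per-request fees are not included in this bound.
    """
    return concurrency_limit * 3600 * memory_gb * price_per_gb_second


if __name__ == "__main__":
    monthly = estimate_cost(1_000_000, 0.2, 0.5,
                            PRICE_PER_REQUEST, PRICE_PER_GB_SECOND)
    ceiling = max_hourly_compute_cost(10, 0.5, PRICE_PER_GB_SECOND)
    print(f"estimated cost: {monthly:.2f}")
    print(f"hourly compute ceiling with 10 parallel instances: {ceiling:.2f}")
```

Under these assumptions the cost scales linearly with the number of invocations, while the concurrency limit puts a ceiling on how much compute time can be consumed in any given hour.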
Functions have a limited execution time, and their execution can be triggered by diverse types of events: a call to an RPC API, changes in storage (for example, a new image being uploaded to Amazon S3), messages, etc.
- For RPC calls, it is necessary to set up an API gateway that redirects the call to the FaaS provider's functions API, which creates the container, if necessary, and executes the appropriate function code.
- Since a function is basically a piece of code, updating it only requires updating the code at the FaaS provider. The provider takes care of automatically using the new code for future function invocations.
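The event-driven flow described above can be sketched with a small Python function. This is a minimal illustration, not a definitive implementation: the `handler(event, context)` signature is the one Amazon Lambda uses for Python, the `Records`/`s3` fields follow the shape of S3 notification events, and the `httpMethod`/`body`/`statusCode` fields follow the proxy-style events an API gateway can forward. The greeting logic and the `name` field are hypothetical.

```python
import json


def handler(event, context):
    """Stateless entry point: everything the function needs arrives in `event`."""
    # Storage-triggered invocation: S3 notifications arrive as a list of records.
    if "Records" in event:
        uploaded = [
            (record["s3"]["bucket"]["name"], record["s3"]["object"]["key"])
            for record in event["Records"]
            if "s3" in record
        ]
        return {"source": "storage", "objects": uploaded}

    # HTTP invocation forwarded by an API gateway (proxy-style event).
    if "httpMethod" in event:
        payload = json.loads(event.get("body") or "{}")
        name = payload.get("name", "world")  # hypothetical request field
        return {
            "statusCode": 200,
            "body": json.dumps({"greeting": f"Hello, {name}"}),
        }

    # Any other trigger is rejected in this sketch.
    return {"statusCode": 400, "body": json.dumps({"error": "unsupported event"})}


if __name__ == "__main__":
    # Local usage example with hand-written events; no cloud account is needed.
    s3_event = {"Records": [{"s3": {"bucket": {"name": "photos"},
                                    "object": {"key": "cat.png"}}}]}
    http_event = {"httpMethod": "POST", "body": json.dumps({"name": "DITAS"})}
    print(handler(s3_event, None))
    print(handler(http_event, None))
```

Because the function keeps no state between calls, the provider is free to run it in any container, new or reused, and to swap in updated code between invocations.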
Although the clearest advantage is that developers do not have to worry about the underlying infrastructure, it is important to know that this new way of implementing services has several disadvantages:
- All functions need to be completely stateless. A function is called with the data it needs to perform its work, and the results are written to storage or returned to the user. This is not as much of a handicap as one might think, since many applications are already being converted into collections of microservices. Those applications are easily moved to a serverless architecture.
- The execution time of a function is limited, typically by the FaaS provider. For example, at the time of writing Amazon sets a maximum of 5 minutes for each function execution; if the function does not finish in time, its container is automatically destroyed.
- On some occasions, when a function has just been invoked for the first time or its code has just been updated, the startup time of the container with the code ready to execute can range from about 10 ms to 2 minutes. This can be a problem depending on the type of application.
- The previous point means developers need to work hard at optimizing the startup time of their functions. On the one hand, there is the response time of the application: each time the function is called, even if the container already exists, it may need to access things like common storage, and verification steps that traditional applications perform at startup should now be avoided (e.g. checking that the DB schema is the latest one). On the other hand, the less time a function spends executing, the less money we pay to our FaaS provider.
There has been an idea that this kind of architecture means there is no need for Ops, but that is not exactly true. On the one hand, you are really outsourcing the Ops part to a new provider. On the other hand, things like monitoring, security and networking, typically associated with the Ops part of a DevOps model, still need to be taken care of.
This year we are starting to see some movement to port serverless to the Edge. On the one hand, we have Amazon Greengrass, a Linux module that can run on ARM and Intel processors at the Edge. This module allows Amazon Lambda code to run while keeping a local datastore, security, and the ability to continue working and collaborating with other Edge devices even when the internet connection is lost. On the other hand, IBM's serverless solution, OpenWhisk, is being updated to be able to run at the Edge via containers.