2022-01-25 00:00:00
The combination of Kubernetes and Istio has greatly reduced the number of emergency phone calls at 2am by providing flexible ways to handle infrastructure-level and application-level failures.
The Kubernetes Cluster Autoscaler scales the number of nodes in the cluster up or down based on resource demand. K8s also continuously monitors the health of application pods and restarts them if necessary, while Istio adds granular control over ingress and egress traffic at both the cluster level and the pod level. Together they improve the user experience through fewer interruptions and higher service availability.
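As a rough illustration of the health monitoring described above, a liveness probe tells Kubernetes how to check a container and when to restart it. This is only a minimal sketch under assumptions: the workload name, the image, port 8080 and the /healthz path are hypothetical placeholders, not values from a real deployment.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend                          # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
      - name: backend
        image: example.com/backend:1.0   # placeholder image
        ports:
        - containerPort: 8080            # assumed application port
        livenessProbe:
          httpGet:
            path: /healthz               # assumed health endpoint
            port: 8080
          initialDelaySeconds: 10        # wait before the first check
          periodSeconds: 5               # check every 5 seconds
          failureThreshold: 3            # restart after 3 consecutive failures
```

With these settings the container is restarted once the probe has failed three times in a row, which is the kind of self-healing a Chaos test should confirm.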
However, even though K8s and Istio offer many useful features, it is always better to prepare for the worst before it arrives. This is where Chaos testing comes in: we explore what happens when different components in our system break. This is carried out in a controlled environment, where we devise ways to break the system, for example by reducing infrastructure capacity, creating high load on compute resources, or causing network outages and application failures. Any outage scenario you can think of, common or uncommon, can be included in your Destroyer plan.
On the other hand, we also need a Savior repair strategy to restore things once Doomsday occurs. We need to experiment with this plan and assess whether it returns our configuration to the stable state we want. In this way we build confidence that the service mesh can tolerate failing nodes and prevent localised failures from cascading to other nodes.
It is becoming popular for enterprise IT teams to hold a Game Day to rehearse their response to such situations.
Technically speaking, Envoy, an open source lightweight proxy, is the building block of Istio. Envoy runs as a sidecar alongside the application container in each Kubernetes workload pod and acts as a gateway between the workload and the rest of the service mesh. It intercepts all inbound and outbound traffic to and from the app workload, so we can use its versatile routing features to manipulate that traffic.
In the following, I will focus on using Istio to carry out Chaos testing by introducing network delays and HTTP error responses to emulate network issues in microservice-based applications.
The client request first reaches the Istio Ingress Gateway, which matches it against the Virtual Service and the Destination Rule (if any). Based on the routing configuration, the request is then dispatched to the Backend.
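The virtual services in this article bind to a gateway named ingress-gateway, which is not defined here. A minimal sketch of such a Gateway resource might look like the following. The selector label istio: ingressgateway (the label used by a default Istio ingress gateway installation), plain HTTP on port 80 and the wildcard host are all assumptions on my part, so adjust them to your own setup.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: ingress-gateway        # the name the virtual services refer to
spec:
  selector:
    istio: ingressgateway      # assumed label of the ingress gateway pods
  servers:
  - port:
      number: 80               # assumed plain HTTP listener
      name: http
      protocol: HTTP
    hosts:
    - "*"                      # accept any host; narrow this in practice
```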
Istio provides two kinds of HTTP fault injection at the Virtual Service level: delay and abort.
We can use the HTTP delay fault to introduce network latency when the request reaches the Ingress Gateway. The Envoy proxy response flag is set to DI, indicating that request processing was delayed for the period specified via fault injection. For more granular control, you can specify what percentage of traffic you want to delay. The following example YAML creates a virtual service that injects a five-second delay into ALL matched virtual service traffic.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: test-vs
spec:
  hosts:
  - backend
  http:
  - fault:
      delay:
        percentage:
          value: 100       # delay every matched request
        fixedDelay: 5s     # hold each request for five seconds
    route:
    - destination:
        host: backend
  gateways:
  - ingress-gateway        # apply the rule at the ingress gateway
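To illustrate the percentage control mentioned above, here is a variation that delays only a fraction of the traffic. It is a sketch rather than a recommendation: the 10% value and the two-second delay are arbitrary numbers I picked for the example.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: test-vs
spec:
  hosts:
  - backend
  http:
  - fault:
      delay:
        percentage:
          value: 10        # example value: delay roughly 10% of matched requests
        fixedDelay: 2s     # example value: two-second delay
    route:
    - destination:
        host: backend
  gateways:
  - ingress-gateway
```

The remaining requests pass through untouched, so you can watch how the system behaves when only some calls are slow.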
Next comes the HTTP abort fault. In the following example YAML, the HTTP response code 500 (Internal Server Error) is returned to the client for all matched traffic. The Envoy proxy response flag is set to FI, indicating that the request was aborted with the specified response code.
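apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: test-vs
spec:
  hosts:
  - backend
  http:
  - fault:
      abort:
        httpStatus: 500    # status code returned instead of calling the backend
        percentage:
          value: 100       # abort every matched request
    route:
    - destination:
        host: backend
  gateways:
  - ingress-gateway        # apply the rule at the ingress gateway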
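If you do not want to break the service for every caller, one option is to scope the fault to requests that carry a marker header, so that only test traffic is affected. The sketch below assumes a hypothetical header named x-chaos-test, and the 503 status and 50% value are example choices. Requests without the header fall through to the second, fault-free route.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: test-vs
spec:
  hosts:
  - backend
  http:
  - match:
    - headers:
        x-chaos-test:        # hypothetical header set only by test clients
          exact: "true"
    fault:
      abort:
        httpStatus: 503      # example status code
        percentage:
          value: 50          # example value: abort about half of the test requests
    route:
    - destination:
        host: backend
  - route:                   # default route: no fault for normal traffic
    - destination:
        host: backend
  gateways:
  - ingress-gateway
```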
You can use an Istio Virtual Service to do Chaos testing at the application layer transparently, by injecting delays or HTTP errors into your services without updating your app code. Testing the system under distress to ensure its resilience is extremely important for modern microservice applications with little tolerance for downtime.
For a more orchestrated Chaos Engineering platform, Chaos Mesh is an option. It covers not only Network Chaos but also Pod Chaos, DNS Chaos, IO Chaos and more, and it visualises the operations.
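To give a feel for what a Pod Chaos experiment looks like, here is a minimal sketch of a PodChaos resource that kills one pod. It assumes Chaos Mesh 2.x is installed in the cluster. The experiment name, the default namespace and the app: backend label are hypothetical placeholders, so check the field names against the Chaos Mesh documentation for your version before using it.

```yaml
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: backend-pod-kill     # hypothetical experiment name
spec:
  action: pod-kill           # kill the selected pod
  mode: one                  # pick one matching pod at random
  selector:
    namespaces:
    - default                # assumed namespace of the workload
    labelSelectors:
      app: backend           # assumed label on the target pods
```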