A year ago, most IT leaders were chiefly concerned about the risk of an unwanted patch breaking the business. While some basic firmware or Windows patches can seem innocuous, the risk of deploying a patch has, in the past, significantly outweighed the risk of being targeted by an attack. WannaCry may have changed this, but the increase in threat levels will not stop there.
The media coverage might have led to a change in risk appetite during this crisis. But even WannaCry did not cause the disruption advertised, because early-generation vulnerability attacks of this kind did not actually damage the critical information held in most data centres. This will no longer be the case as the latest exploitation threats evolve and mature.
Virus protection and basic patching of firmware or operating systems are not enough to secure your network and the information held on it. As a result, our clients face a seemingly impossible choice: when to risk a critical system by deploying a patch with unintended consequences, and when to live with the risk of an open vulnerability. Even those who have outsourced services cannot rest on their laurels. If their providers are making these critical decisions without direct input, a bad judgement call can lead to significant outages or increased exposure to vulnerabilities.
Three steps to risk-free patching
Our experience has shown that trying to tackle the problem is not easy, unless the organisation:
- Understands the estate, the standard operating environments, and how change is managed and controlled.
- Establishes a robust process that prioritises control above performance targets and allows difficult decisions to be made about recurring problems.
- Embeds testing as a critical capability. Feeding the outputs of testing into an impact assessment decision before deployment is invaluable; however, there is more than one way to test.
Here we look at each of these three elements in turn:
1. Understanding your asset landscape means making sure your organisation has the robust processes needed to maintain a single source of truth for asset lists and the software components that sit on them. This needs:
- Good controls around asset and configuration management, including management of asset churn. These capabilities are not specific to patching, but a good patching process is so reliant on asset and configuration management that it is typically impossible to establish a robust process without them being in place.
- Deep understanding of how each software component, which may be patched and may need to be upgraded, supports the services. By understanding the likely impact of a patch, the decision on what to patch and where an exemption is needed is often made before the patch is even published. Ideally, solutions are designed and implemented with patching in mind.
- Clear rules on patching that are in line with your customers’ expectations and certification requirements. Specific problems can arise around when assets can and should be seen on the network, and what to do with assets that are connected to the network infrequently. As virtual, server and end-user assets are all often dynamic in nature, it is important that there are well-established rules on how assets are treated if they are not seen on the network.
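The rule for assets that are not seen on the network can be expressed very simply. The following is a minimal Python sketch, not a definitive implementation: the `Asset` record, the `classify_asset` function and the 30-day and 90-day thresholds are all hypothetical, and the real values should come from your own patching policy and certification requirements.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical thresholds for illustration only; set these from your own policy.
STALE_AFTER = timedelta(days=30)
QUARANTINE_AFTER = timedelta(days=90)


@dataclass
class Asset:
    name: str
    last_seen: datetime  # last time the asset was observed on the network


def classify_asset(asset: Asset, now: datetime) -> str:
    """Return how the patching process should treat an asset,
    based on how long ago it was last seen on the network."""
    age = now - asset.last_seen
    if age >= QUARANTINE_AFTER:
        # Not seen for a long time: keep it off the network until it is patched.
        return "quarantine"
    if age >= STALE_AFTER:
        # Seen infrequently: flag it so that it is patched on its next connection.
        return "stale"
    return "active"
```

Writing the rule down in this form forces the organisation to agree the thresholds explicitly, rather than leaving the treatment of infrequently connected assets to individual judgement.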
2. Establishing a robust process ensures you have visibility of which successes and failures have resulted from patching. This needs:
- An understanding of the vendors involved: how they publish and expect you to deploy their latest updates, how they rate their security patches, and how frequently they release them.
- Open and transparent reporting, focussed on control and resolution of issues above management of performance. This is critical to the success of patching. In our experience, the quality of reporting has to be treated as more important than the success rate of deployment, in order to be able to focus on identifying and resolving issues.
- Management of obsolescence, which is an issue in most environments. Obsolete environments will not be given the latest patch updates, so the patching process needs to take account of how assets are kept up to date in order to remain eligible for patches. Management of obsolescence and network segmentation is therefore a key part of a robust patching process.
3. Managing the impact of a patch deployment will make a material difference to minimising the risk of patching:
- Testing patches before they are deployed to the estate is critical. Testing against your standard operating environment(s) needs to feed into the impact assessment of each patch as it comes out, typically within 12 hours of notification. That assessment needs to balance the risk of deploying the patch against the security risk of not deploying it. The initial testing therefore has to be completed within 12-15 hours of a patch being published.
- Testing in its wider sense should not be limited to testing against a standard operating environment. Typical estates can be divided into groups of assets (server or end user), by risk. Test, development and pre-production servers are typically ones to target first in a hosting service. In the case of end user devices, it is often possible to create test groups that allow experienced/superuser groups to receive the patch first.
- Establishing good testing mechanisms and a robust process is only useful if the business is prepared to stop patching. The decision to stop deployment should be made at regular points during the patch cycle, on the following basis:
1. If the known components of the system cannot be upgraded or patched without significant risk to the service.
2. If the machine is obsolete and cannot be upgraded without a significant risk to the service.
3. If the patch has broken the service during testing or early stages of deployment and has had to be rolled back.
4. If the act of deploying and restarting within a specific availability window could pose a significant risk to the service at that time.
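The staged deployment and the four stop criteria above can be sketched as a simple gate that is re-checked before each group of assets is patched. This is a minimal Python sketch under stated assumptions: the `PatchCheckpoint` record, the `stop_reasons` and `rollout` functions and the ring names are hypothetical, and in practice each flag would be set by your own testing and impact assessment.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class PatchCheckpoint:
    """State observed at one decision point in the patch cycle (hypothetical)."""
    component_unpatchable: bool = False  # criterion 1
    machine_obsolete: bool = False       # criterion 2
    rolled_back: bool = False            # criterion 3
    risky_window: bool = False           # criterion 4


def stop_reasons(cp: PatchCheckpoint) -> list[str]:
    """Return every stop criterion that currently applies (empty means continue)."""
    reasons = []
    if cp.component_unpatchable:
        reasons.append("component cannot be patched without significant risk")
    if cp.machine_obsolete:
        reasons.append("machine is obsolete and cannot be upgraded safely")
    if cp.rolled_back:
        reasons.append("patch was rolled back after breaking the service")
    if cp.risky_window:
        reasons.append("deploying in this availability window is too risky")
    return reasons


def rollout(
    rings: list[str],
    checkpoint_for: Callable[[str], PatchCheckpoint],
) -> tuple[list[str], list[str]]:
    """Deploy ring by ring, re-checking the stop criteria before each ring.

    Returns the rings that were deployed and the reasons for stopping, if any.
    """
    deployed: list[str] = []
    for ring in rings:
        reasons = stop_reasons(checkpoint_for(ring))
        if reasons:
            return deployed, reasons  # stop here; later rings are not patched
        deployed.append(ring)
    return deployed, []
```

For a hosting service, the rings might be ordered as test, development, pre-production and then production, so that a patch which has to be rolled back in an early ring never reaches the live service.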
In summary, our experience in running Transformation programmes can bring your patching and vulnerability management processes under control. We have the know-how to embed these new ways of working to ensure the balance of risk between the impact on the service and the security threat can be proactively managed.