Defensive Design

Defensive design means designing systems on the basic assumption that everything that can fail eventually will. In practice, it means implementing features that cope with common types of failure at every operational level.


Bulkhead

Bulkhead is a design pattern that limits the scope of a failure to a particular component, so that errors, failures, or damage do not spread to other parts of the system or to other systems. For example, preventing fire from spreading from one building to another.
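
In software, one common way to build a bulkhead is to cap how many calls a single component may have in flight, so that a slow dependency cannot absorb every worker. The following is a minimal sketch under that assumption; `Bulkhead` and `BulkheadFullError` are illustrative names, not part of any library.

```python
import threading


class BulkheadFullError(RuntimeError):
    """Raised when the compartment has no free slots (hypothetical name)."""


class Bulkhead:
    """Caps concurrent calls into one component (illustrative sketch)."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn, *args, **kwargs):
        # Reject immediately instead of queueing, so that a stalled
        # component cannot tie up callers that belong to other components.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("compartment is full")
        try:
            return fn(*args, **kwargs)
        finally:
            self._slots.release()
```

Giving each external dependency its own `Bulkhead` instance keeps a failure in one of them from exhausting capacity that the others need.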

Edge Cases

Designing for edge cases means covering all rare but possible states or conditions in which the system may be placed or may have to operate. For example, designing a system to operate at critically low temperatures.

Mistake proofing

Mistake proofing means designing a system to guard against human error, ideally so that operator mistakes are impossible to make in the first place. For example, preventing users from uploading data in the wrong format.
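
The upload example can be sketched as a check that runs before any data is accepted. This is only an illustration: the function name `check_upload` and the set of allowed formats are assumptions made for the example.

```python
from pathlib import Path

# Hypothetical whitelist; a real system would define its own formats.
ALLOWED_SUFFIXES = {".csv", ".json"}


def check_upload(filename):
    """Return the normalized file suffix, or raise ValueError if not allowed."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        allowed = ", ".join(sorted(ALLOWED_SUFFIXES))
        raise ValueError(f"{filename!r} rejected: expected one of {allowed}")
    return suffix
```

Rejecting the file at the boundary means the rest of the system never has to deal with data in an unexpected format.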


Decoupling

Decoupling means designing the parts of a system to be independent of each other, so that each can easily be replaced by a different implementation.
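
One way to picture this is a caller that depends only on an interface, with interchangeable implementations behind it. The sketch below assumes a hypothetical `Storage` interface; every class and function name in it is invented for illustration.

```python
from pathlib import Path
from typing import Protocol


class Storage(Protocol):
    """Hypothetical interface: the only thing callers depend on."""

    def save(self, key: str, value: str) -> None: ...
    def load(self, key: str) -> str: ...


class InMemoryStorage:
    def __init__(self):
        self._data = {}

    def save(self, key, value):
        self._data[key] = value

    def load(self, key):
        return self._data[key]


class FileStorage:
    def __init__(self, directory):
        self._dir = Path(directory)

    def save(self, key, value):
        (self._dir / key).write_text(value, encoding="utf-8")

    def load(self, key):
        return (self._dir / key).read_text(encoding="utf-8")


def remember_greeting(storage: Storage) -> str:
    # Works with any object that provides save() and load().
    storage.save("greeting", "hello")
    return storage.load("greeting")
```

Because `remember_greeting` never names a concrete class, either implementation can be swapped in without changing the caller.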


Redundancy

Redundancy is a design choice that duplicates resources or instances, so that backup resources or copies can take over the workload if a failure occurs.
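
A simple form of this in code is failover across duplicate instances. The sketch below assumes each replica is a plain callable; `call_with_failover`, `primary`, and `backup` are hypothetical names used only for the example.

```python
def call_with_failover(replicas, request):
    """Try each replica in order and return the first successful result."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} replicas failed")


def primary(request):
    # Stand-in for an instance that is currently unavailable.
    raise ConnectionError("primary is down")


def backup(request):
    return f"handled {request!r} on backup"


result = call_with_failover([primary, backup], "GET /status")
```

Here the backup instance absorbs the request that the primary could not serve, so the caller sees a normal result.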


Retry

Retry is a design concept in which the system makes multiple attempts to reach or connect to an external resource that has become temporarily unavailable, so that an intermittent error does not cause the whole system to fail.
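
A minimal sketch of a retry helper follows. It assumes that exponential backoff between attempts is acceptable and that only connection and timeout errors are worth retrying; the function name, parameters, and defaults are illustrative choices, not a prescribed implementation.

```python
import time


def retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep,
          retry_on=(ConnectionError, TimeoutError)):
    """Call operation() until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Wait longer after each failure: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so that the waiting behavior can be replaced in tests. Errors outside `retry_on` propagate immediately, since retrying a permanent failure only delays the report.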


Undo

Undo is a design concept that allows a system to be reverted to a previous state, making it possible to correct human mistakes or to prevent data corruption.
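
One way to support undo is to snapshot the state before every change. The sketch below assumes the state is small enough to copy in full; `UndoableDocument` is a hypothetical class written for this example.

```python
import copy


class UndoableDocument:
    """Keeps a snapshot before each change so it can be reverted (sketch)."""

    def __init__(self, state):
        self.state = state
        self._history = []

    def apply(self, change):
        snapshot = copy.deepcopy(self.state)
        try:
            change(self.state)
        except Exception:
            # A change that fails halfway is rolled back, so the state
            # is never left partially modified.
            self.state = snapshot
            raise
        self._history.append(snapshot)

    def undo(self):
        """Restore the previous state; return False if there is none."""
        if not self._history:
            return False
        self.state = self._history.pop()
        return True
```

The same snapshot serves both purposes named above: `undo()` corrects a human mistake, and the rollback inside `apply()` guards against a change that fails partway through.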

Cold standby

Cold standby means providing spare resources that are ready to be started when needed, typically as backups for their primary resources.


Derating

Derating is a design concept in which a system that detects a problem changes its mode of operation, for example by running at reduced capacity, to prevent things from getting worse.

Fault Tolerance

Fault tolerance is the ability of a system to continue operating when an error is detected, so that the whole system does not halt at the first error, or at the first exception raised to prevent unpredictable results.
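
As an illustration, a batch job can record a failing item and move on instead of stopping. This sketch assumes that items are independent of each other; `process_all` is a hypothetical helper.

```python
def process_all(items, handler):
    """Apply handler to every item, collecting failures instead of halting."""
    results, failures = [], []
    for item in items:
        try:
            results.append(handler(item))
        except Exception as exc:  # recorded for later inspection, not re-raised
            failures.append((item, exc))
    return results, failures


# The malformed entry is set aside and the remaining items are still processed.
results, failures = process_all(["1", "2", "oops", "4"], int)
```

Returning the failures alongside the results keeps the errors visible, so tolerating a fault does not mean silently ignoring it.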

Graceful degradation

Graceful degradation allows a system to keep operating partially when a failure occurs, so that some functions or areas of the system continue to work even if other parts have failed.
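
A small sketch of the idea: a page that still renders its core content when an optional feature is unavailable. The function and field names here are invented for the example, and it assumes that recommendations are optional while articles are essential.

```python
def load_homepage(fetch_articles, fetch_recommendations):
    """Build a page; optional content is dropped if its source fails."""
    page = {
        "articles": fetch_articles(),  # essential: errors propagate
        "recommendations": [],
        "degraded": False,
    }
    try:
        page["recommendations"] = fetch_recommendations()
    except Exception:
        # Optional feature is down: serve the page without it.
        page["degraded"] = True
    return page
```

The `degraded` flag lets the caller show a notice or log the partial failure while the rest of the page keeps working.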


Monitoring

Monitoring means observing a system during operation to detect anomalies, so that parts showing abnormal or suspicious behavior can be worked around, taken offline, and diagnosed.


Durability

Durability means designing a system so that it can handle a wide variety of stress conditions and different levels of workload.


Resilience

Resilience means designing a system that handles different levels of stress and load by virtue of its intrinsic properties.
