Benchmarking: Nobody Wants to be Average!

Where a large outsourcing contract has a term of greater than four or five years, it is usual to find a clause in the contract allowing the customer to perform a benchmarking exercise. Whilst a supplier would rarely include such a provision willingly in its standard terms, it is also rare for a supplier to challenge the existence of such a right in a large outsourcing contract. However, a number of benchmarking provisions proposed by prospective customers (or their advisors) seek to extend the purpose of the benchmark beyond merely protecting the customer against material shifts in the market to a more extensive right to maintain their perceived cost advantage in the market.

In this article I shall endeavour to explain why such an extension, in my opinion, not only breaks the underlying purpose of benchmarking but also, in many cases, means the provisions are unworkable and as such unenforceable. Please note these are my views and not necessarily those of my employer.

What is benchmarking?

Benchmarking is a service provided by a number of companies, for example Gartner, Equaterra and Compass, who gather data from their customers and other available data sources and, using such data, attempt to calculate the benchmark for a particular service.

When determining a benchmark, the benchmarker will source a number of comparators and ‘normalise’ the data to match, as closely as possible, the contractual and commercial terms and associated factors that do, or could, influence the price for the contracted services being benchmarked. In order to do this, it is necessary that the benchmarker has sufficient data of a suitably comparable nature (ie ‘like for like’) to enable a valid comparison to be made between the source data and the contracted service.

Once the benchmark has been determined, the contract between the customer and supplier will often specify the implications of this benchmark, whether it is ‘price match’, a trigger for an escalation meeting or termination of the contract itself.

The scope of a benchmark

A benchmark is useful only for ascertaining whether the price charged for the contracted services represents ‘Good Value’ (as defined in the contract).

It cannot, and should not, be used to determine whether the Service Levels and/or services themselves are in line with the market norms. The reasons for this are:

1- the Service Levels and services themselves are defined to meet the customer’s particular needs and wants and the limitations of the technology in place at the date of the contract and, as such, a comparison sample for such comparators merely details what a set of unrelated third parties contracted for and not what a market benchmark for a particular type of service is; and

2- there is little benefit in determining what the market standard service levels are for a particular service without knowing what the cost of such a service level is. If a benchmark were to say that the benchmark Service Level is 98% and the contract requires 95%, what has the customer found out? This is particularly so if, under a subsequent benchmark, it finds out it is paying the benchmark charge for the contracted services and service levels, and so, one assumes, would be required to pay more for the higher benchmarked service level.

Therefore, a benchmark should ask only one question: do the charges payable for the contracted services and Service Levels constitute ‘Good Value’?

How do you set the benchmark?

‘Good Value’, ‘competitive’ or whatever term is used in the contract to define the benchmark, should be set at the mean average of the comparison sample. The reasons for this are twofold:

First, as stated above, I believe that the overriding principle of benchmarking is to protect the customer from a material shift in the market. To expect a benchmark to be set at top quartile, or top decile would appear to go beyond this remit and can leave a supplier at a significant disadvantage with respect to the benchmark. For example, the supplier will risk a potential loss if it has invested in equipment, has amortised the cost associated with such equipment over the five-year term of the contract and is seeking to recover such amortised cost via a service charge.

Secondly, there is the mathematics associated with determining the benchmark itself. To calculate a ‘mean’ that is reasonably representative of the market, one would usually expect the benchmarker to use between six and eight ‘normalised’ comparison data samples. To calculate a ‘quartile’ that is reasonably representative of the market, one requires many tens of comparison data samples (TPI in their paper ‘Formal Benchmarking in Outsourcing Contracts: TPI’s Position’ in April 2007 stated that ‘many statisticians state that they need 60 or more data points to produce accurate quartile readouts, and they generally agree that a quartile positioning with fewer than 20 data points has limited value’). The larger the number of comparison data samples required, the greater the variability between the comparators and the benchmarked services, and so the larger the volume of normalisations required. In such circumstances a supplier should require a significant ‘margin of accuracy’ for the benchmark (ie the supplier’s charges must be within 20% of the benchmark), thereby negating a large amount of the perceived benefit, particularly where the distribution of the comparators is small. Furthermore, a requirement for large volumes of normalisations of the comparison sample will lead to dispute between the customer and supplier.
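The arithmetic can be illustrated with a minimal Python sketch. The eight price figures below are invented purely for illustration and do not come from any real benchmark, and the function names are hypothetical. The sketch also assumes that ‘top quartile’ means the cheapest quarter of the sample. It computes the mean of a small normalised sample and then the lower quartile boundary under the two interpolation conventions offered by the standard library’s statistics.quantiles. With so few data points the two conventions need not agree on where the quartile lies, whereas the mean is unambiguous.

```python
from statistics import mean, quantiles

# Hypothetical normalised annual charges (in thousands) for eight comparators.
# These figures are invented purely for illustration.
comparators = [92, 95, 98, 100, 103, 105, 110, 117]


def benchmark_mean(prices):
    """Return the mean of the normalised comparator prices."""
    return mean(prices)


def lower_quartile(prices, method):
    """Return the boundary of the cheapest quarter of the sample.

    `method` is 'exclusive' or 'inclusive', the two interpolation
    conventions supported by statistics.quantiles.
    """
    return quantiles(prices, n=4, method=method)[0]


if __name__ == "__main__":
    print("mean:", benchmark_mean(comparators))
    print("lower quartile (exclusive):", lower_quartile(comparators, "exclusive"))
    print("lower quartile (inclusive):", lower_quartile(comparators, "inclusive"))
```

For this invented sample the two quartile figures differ from one another, which is one small illustration of why a quartile drawn from a handful of comparators is a fragile target for a contractual price.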

It may be that cloud and ‘as a service’ offerings are more easily benchmarked because the service is standardised. However, this does not mean that a quartile or decile is a fair comparator or mathematically achievable, particularly given the variety of offerings labelled ‘as a service’ currently in the market.

Who is the benchmark for?

Many outsourcing contracts include the provision that the outcome of the benchmark shall not result in an increase in the charges. Therefore, the instigation of a benchmark is not usually in the interests of a supplier, hence most benchmarking provisions include limitations on the frequency and scope of benchmarking as it is seen as a unilateral right of the customer. Clearly, where a contract includes a guaranteed base profit margin for the supplier, a benchmarking provision may require an increase in charges.

However, notwithstanding that a supplier is unlikely to instigate a benchmark itself, this does not mean that a benchmark is without benefit to a supplier. In many instances the conclusion of a benchmark strengthens the relationship between customer and supplier because it validates that the customer is receiving ‘good value’. Therefore, the benchmark is provided for both parties and hence a duty of care (to ensure the benchmark methodology is correctly followed) is owed to both the customer and supplier by the benchmarker. I therefore usually seek to obtain a contractual provision that, whilst the customer may instigate a benchmark, a contract for the benchmarking service shall be entered into as a tripartite agreement between the benchmarker, the customer and the supplier. This arrangement should also include the process for agreeing terms of reference for the benchmarker and the methodology by which the benchmark will be performed. This helps to ensure transparency and gives all parties the confidence that the process will operate fairly.

This avenue of discussion will often lead a customer (or its advisors) to the question of who pays for the benchmarker’s fees. Assuming the supplier has accepted that the customer may unilaterally instigate a benchmark, it is not in the supplier’s power to determine how many benchmarks are performed during the term of the contract (noting though that the parties may have included limits to the frequency of a benchmark in the contract). Therefore, to ask a supplier to include the ‘cost of benchmarking’ in its service charges will either result in the supplier pricing for the total number of benchmarks the customer could instigate under the terms of the contract or taking a ‘gamble’ that its relationship with the customer and/or the customer’s ability to verify the current market price for the services independent of a contractual benchmark are such that it will not use its full entitlement. Either way, this is unlikely to result in value for money and as such it is prudent to accept that a supplier should include in its charges 50% of the expected benchmark fees for one benchmark during the term of the contract. All subsequent benchmarks are paid for by the customer. A supplier may be willing to accept the ‘risk’ of its own internal costs but, depending on price sensitivity, it may also ask for some protection in relation to these costs where the customer instigates multiple benchmarks during the term.
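As a rough sketch of the fee allocation suggested above, the following Python function splits the benchmarker’s fees across a series of benchmarks. It assumes that the supplier bears 50% of the first benchmark’s fee (recovered through its charges) and that the customer bears the other 50% and the whole of every later fee. The function name and the fee figures are hypothetical, and the supplier’s own internal costs are left out.

```python
def split_benchmark_fees(fees_per_benchmark):
    """Split benchmarker fees between supplier and customer.

    Assumption: the supplier bears 50% of the first benchmark's fee and
    the customer bears the remainder, including the full fee of every
    later benchmark. Returns (supplier_total, customer_total).
    """
    supplier_total = 0.0
    customer_total = 0.0
    for index, fee in enumerate(fees_per_benchmark):
        if index == 0:
            # First benchmark: shared equally between the parties.
            supplier_total += 0.5 * fee
            customer_total += 0.5 * fee
        else:
            # Subsequent benchmarks: paid for by the customer alone.
            customer_total += fee
    return supplier_total, customer_total


if __name__ == "__main__":
    # Three benchmarks at a hypothetical fee of 40,000 each.
    print(split_benchmark_fees([40_000, 40_000, 40_000]))
```

On these invented figures the supplier’s exposure is fixed at half of one fee however many benchmarks the customer instigates, which is the point of the allocation.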

Results of a benchmark

Given the limitations of the benchmarking process (most notably the limitation on sample size and the need for extensive normalisations), a standard benchmark cannot, in my opinion, determine what the actual market price for the services is but rather what an ‘acceptable’ market price is or ‘good value’ comprises. Therefore, claims that the benchmark should determine a statistically valid ‘market price’ should be rebutted because this is not the goal of a benchmark. The process of benchmarking is an art and not a science. As such, the raw data is subject to varying degrees of manipulation by the benchmarker to ensure the comparator reflects the key cost drivers of the actual service being benchmarked; this is known as normalisation. Clearly, whilst the benchmarker is experienced in this process, and therefore one can assume such normalisations are fair and reasonable, these changes to the raw data do not necessarily reflect what the variant would have been had the customer and supplier entered into a contract akin to the relevant comparator contract as opposed to the one the parties actually executed. It is therefore possible that this normalisation process will give rise to a high level of dispute between the customer and supplier. To avoid this, and ensure a fair benchmarking process, the parties should consider including a ‘band’ of between 5 and 10% from the benchmark price. If the contract charges are within this ‘band’, the charges will be deemed to have met the benchmark.
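The ‘band’ test can be sketched in a few lines of Python. This is an illustration only: the function name and figures are hypothetical, and it assumes the band is one-sided, so that charges at or below the benchmark price always meet it and charges above it meet it only if they exceed it by no more than the agreed percentage.

```python
def meets_benchmark(contract_charge, benchmark_price, band=0.10):
    """Return True if the contract charge is deemed to meet the benchmark.

    Assumption: the band is one-sided. A charge passes if it does not
    exceed the benchmark price by more than `band` (0.10 means 10%).
    """
    if benchmark_price <= 0:
        raise ValueError("benchmark_price must be positive")
    return contract_charge <= benchmark_price * (1 + band)


if __name__ == "__main__":
    # Hypothetical benchmark price of 100 with a charge of 108.
    print(meets_benchmark(108, 100, band=0.10))  # within a 10% band
    print(meets_benchmark(108, 100, band=0.05))  # outside a 5% band
```

The same charge passes under a 10% band and fails under a 5% band, so the width of the band is itself a commercially significant term to negotiate.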

The consequences of a failure to achieve the benchmark are very much determined by the strength of the bargaining position of each party at the time of execution of the contract. However, the supplier should be given an opportunity to meet the ‘benchmark’. In the event that the supplier is unable or unwilling to meet the benchmark, any right of termination should take account of the fact that such a termination is neither a termination ‘for convenience’ nor a termination ‘for breach’ and hence any rights or remedies for either party should reflect this.

Conclusions

Benchmarking should not be used to bridge the gap between what the customer wants and what they think they should have because a competitor has it. Notwithstanding provisions in a contract that seek to facilitate organic change, the ability to vary the method of delivery of IT services, thus achieving cost savings which were not envisaged (and therefore ‘baked’ into the price) at the time of contract signature, is often very much restricted. Ensuring that a customer receives the full benefit of any changes in technology will usually require a change to the contract, managed via an agreed change control procedure.

Benchmarking should therefore be used only to confirm that the customer receives an acceptable market price throughout the term of the contract. The restricted number of sufficiently similar data sources from which to calculate a benchmark means that using the mean of a sample group of at least six comparators results in a ‘fair’ benchmark. Furthermore, benchmarking should not be used to determine what the scope or value of service levels is in the market, as the benchmarking process cannot adequately distinguish between a ‘need’, a ‘want’ and a technical capability or limitation.

Finally, benchmarking is an art not a science and therefore should be treated as such. It is by no means a perfect process and ultimately cannot determine what the actual market price is but rather determines what a fair market price could be. If used for what it was designed for, it is the best formal mechanism available to objectively determine a fair price for a long-term contract. However, if misused in the manner highlighted above, it will only lead to dispute and/or a sense of injustice for one or other party.

Michael Harvey is a Senior Legal Advisor at Fujitsu Services Limited
