# Batting Medians

I admit that I’ve never really been a fan of batting averages, at least in terms of what they say about what we should expect from a batsman each time he walks to the crease. I understand the premise, i.e. the batting average is the average between dismissals rather than the average of all innings, and of course by now we have a feel for what the batting average tells us about a player’s ability.

However there are many batsmen for whom the average is undoubtedly misleading, and a prime example is the first player ever to take guard in a Test match, Charles Bannerman. Bannerman’s average is a shade under sixty at 59.75, but this is of course grossly inflated by his first innings, that deservedly famous 165 retired not out. But is an average of 60 a reasonable guide to what should have been expected of Bannerman when he batted? If we look at Bannerman’s Test scores they look like this:-

165*, 4, 10, 30, 15, 15*

Arranged ordinally they look like this:-

4, 10, 15, 15, 30, 165

Now we can see that the 165 knock is a dramatic outlier and I think it’s fair to say an average of almost 60 is misleading as a guide to expected score in Bannerman’s case – in six innings he exceeded that figure only once. It was more likely that he would score somewhere between 4 and 30.

Admittedly this is a small sample, but it does highlight the potential problems with averages as currently utilised in ranking batsmen. Take as a further example Graham Gooch and VVS Laxman; they both played 215 innings, with Gooch outscoring Laxman by more than 300 runs, yet Laxman’s average is fully five runs higher than Gooch’s due to the latter’s low number of not outs (six, as against 34 by Laxman). But what else can be done to give a fairer approximation of a batsman’s ability?

There are a number of ways to summarise central tendency in data, and for a string of numbers these are the average (or mean), the median and the mode. For those of you who aren’t familiar with these terms, here is a quick definition – the average or mean is the sum of all of the numbers divided by the sample size; the median is the midpoint of the data set when ranked in order, i.e. there will be as many numbers above the median as below it; and the mode is the value which occurs most often in the data set.
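As a rough illustration of these definitions, the sketch below applies them to Bannerman's six Test scores quoted earlier, alongside the conventional batting average (runs divided by dismissals). The names `BANNERMAN`, `batting_average` and `central_tendency` are made up for this example, and it uses only Python's standard `statistics` module.

```python
from statistics import mean, median, multimode

# Bannerman's six Test innings as (runs, dismissed?) pairs.
# The 165* and 15* in the article are the two not outs.
BANNERMAN = [(165, False), (4, True), (10, True), (30, True), (15, True), (15, False)]


def batting_average(innings):
    """Total runs divided by the number of dismissals."""
    runs = sum(r for r, _ in innings)
    dismissals = sum(1 for _, out in innings if out)
    if dismissals == 0:
        raise ValueError("batting average is undefined without a dismissal")
    return runs / dismissals


def central_tendency(innings):
    """Mean, median and mode(s) of the raw scores, ignoring not-out status."""
    scores = [r for r, _ in innings]
    return {
        "batting_average": batting_average(innings),
        "mean": mean(scores),
        "median": median(scores),
        "modes": multimode(scores),
    }


summary = central_tendency(BANNERMAN)
print(summary)
```

For Bannerman this gives a batting average of 59.75 but a median of 15, with the plain mean of all six innings (about 39.8) sitting in between because it counts the two not-out innings in the divisor.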

Below is the curve of all Test scores:-

It should be noted that this curve shows all Test innings, not just those by recognised batsmen. The various values of central tendency for this curve are as follows:-

Batting average: 30.17
Mean: 26.2
Median: 13.0
Mode: 0

Why is the mean lower than the batting average? As discussed earlier, the batting average is the average between dismissals, whereas the mean here is the average of all innings including not outs. The median as mentioned is the midpoint, i.e. as many scores above 13 as below, and as can be seen this is much lower than the mean. The mode is the score most often made, which is zero.

The importance of selecting the most appropriate method of determining the central tendency of data is of course affected by the distribution of the sample.

In a normal distribution, the mean, median and mode coincide because the peak of the sample data is central – however if we look at the distribution of batting scores they typically show a positive skew as in the example above, that is the peak is towards the left-hand side of the plot, i.e. towards the low end of the scores, with a long tail of high scores to the right. For this reason, the average or mean misrepresents the central tendency of the batsman’s scores due to the likelihood of a small number of high-value outliers, as seen with the example of Bannerman above.

I would propose then that a better representation of the central tendency of batting performances, given their skewness, is the median. I don’t believe there’s been a detailed study of batting medians, at least I’m not aware of it, and so I’ve spent some time looking at all batsmen as far as the median of their batting performances is concerned.

Going back to the skew, the larger the skew value the longer the tail stretching to the right of the graph; the average skew is around 1.5, whereas for example RE Foster’s is 3.5, indicating a much longer tail due to his one very large innings of 287 – the highest I found is Wasim Akram, who comes in at 4.8. David Steele by comparison has a skew value of less than 0.5 indicating (because it’s closer to zero) that Steele very rarely failed, meaning that the distribution of his scores was more “normal”. In statistics, a distribution is usually taken as highly skewed if the skew value is larger than 1, moderately skewed between 0.5 and 1 and approximately symmetrical (e.g. as a normal distribution would be) if lower than 0.5. The data for all Test innings is a highly skewed +2.55.

There are very few batsmen who have a negatively-skewed distribution of scores, i.e. the median is much higher than the mean because the median score is at the high end. Consider Desmond Lewis as an example; his scores in ascending order were

4, 14, 72, 81, 88

With two not outs his average was 86.33, though the mean of the scores is 51.8. However the scores above the median of 72 (81, 88) are much closer to it in value than those below (4, 14), which results in a negative skew of -0.55.
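The article does not say which skewness formula was used. As a sketch, the adjusted Fisher-Pearson sample coefficient (the form most spreadsheet `SKEW` functions compute) reproduces the -0.55 quoted for Lewis, so the code below assumes that one. The function names are illustrative, and applying the rule-of-thumb bands to the magnitude of the skew, so that they cover negative values too, is also an assumption.

```python
from math import sqrt


def sample_skewness(scores):
    """Adjusted Fisher-Pearson skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(scores)
    if n < 3:
        raise ValueError("need at least three scores")
    m = sum(scores) / n
    # Sample standard deviation (divides by n - 1).
    s = sqrt(sum((x - m) ** 2 for x in scores) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all scores are equal")
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in scores)


def describe_skew(value):
    """Apply the article's rule of thumb to the magnitude of the skew."""
    size = abs(value)
    if size > 1:
        return "highly skewed"
    if size >= 0.5:
        return "moderately skewed"
    return "approximately symmetrical"


lewis = [4, 14, 72, 81, 88]
lewis_skew = sample_skewness(lewis)
print(round(lewis_skew, 2), describe_skew(lewis_skew))  # -0.55 moderately skewed
```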

Looking at all of the batsmen, it appears that a median of 30 is a measure of greatness. Here are some notables with their associated medians to give you a feel for how the medians vary:-

40.0 Hobbs
38.0 Sutcliffe
33.0 Nourse
32.0 Hutton
28.0 Compton
34.5 Walcott
36.0 Weekes
28.0 Worrell
33.5 Sobers
46.0 Barrington
34.0 Pollock, RG
31.0 Chappell, GS
32.5 Richards, IVA
31.0 Border
25.5 Waugh, SR
34.0 Tendulkar
33.5 Lara
30.0 Ponting
34.5 Kallis
33.0 Dravid
31.0 Sehwag
32.5 Sangakkara

The above values, representing as they do the midpoint of all scores, give a more realistic approximation of the actual score we can expect from a batsman, which is much lower than the batting average. So as can be seen, 30 represents a reasonable level of greatness, with some of the all-time greats achieving medians over 40. It can also be seen that there is a significant difference between the batting average and the median score, and in some cases this difference is very large. Another point to make about the median is that it will either be an integer or halfway between two integers, e.g. 34.5 – this reduced granularity may also be a better way to rank batsmen than comparing averages to two decimal places.

If we look at the differences between batting average and median score, they rank like this:-

Avg less Median
44.75 Bannerman
38.85 Valentine
38.72 Dempster
37.66 Weekes, KH
35.20 Kambli
34.50 Wood
32.33 Kuruppu

The aforementioned Bannerman fares worst here, and the others in the list apart from Bradman can be seen to have their averages somewhat padded. Bradman is an exception here (as he usually is) – the high difference is of course largely a result of his already high batting average. The way to cope with that is to look at the median as a percentage of the average:-

25.10% Bannerman
25.49% Aamer Malik
27.57% Lewis, AR
28.18% North, MJ
29.31% Bonnor
29.63% Watkins
30.43% Raina
31.58% Fender
31.85% Twose
32.98% Robinson, RT

No sign of Bradman there. Turning to those batsmen who had a very small difference between their average and median, only twelve in the history of Test cricket have a median value higher than their average, the most significant of whom was David Steele. Steele made an immediate impact upon his introduction to Test cricket, with five of his first seven innings being over 50 and none below 39, his median at that point being a whopping 66.0! While it dropped off somewhat after that he still ended up with a median of 43.0 (higher than his average of 42.06) and a very low skew value of 0.43, indicating he very rarely failed – 15 of his 20 innings were scores of 20 or more.
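The two comparisons used above, average less median and median as a percentage of average, are simple enough to sketch in a few lines. The function names are hypothetical. The figures for Steele are those quoted in the article, while Bannerman's median of 15 is worked out here from his six scores listed earlier.

```python
def average_less_median(average, median):
    """Runs by which the batting average exceeds the median score."""
    return average - median


def median_share(average, median):
    """Median expressed as a percentage of the batting average."""
    if average <= 0:
        raise ValueError("average must be positive")
    return 100 * median / average


# (batting average, median) pairs for two batsmen discussed in the article
players = {"Bannerman": (59.75, 15.0), "Steele": (42.06, 43.0)}

for name, (avg, med) in players.items():
    gap = average_less_median(avg, med)
    share = median_share(avg, med)
    print(f"{name}: {gap:+.2f} runs, {share:.2f}%")
```

Bannerman comes out at +44.75 runs and 25.10%, matching both lists, while Steele's share is a little over 100% because his median is higher than his average.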

It should be noted that the skew, while it is typically a reasonable indicator of consistency, will unfairly penalise those with long careers who have a number of high scores. For example, Lara has a skew value of 2.32, which would tend to suggest he wasn’t very consistent; however if we cap all of his high scores at 150 his skew value drops to a respectable 1.0.
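The capping idea can be sketched as follows. Lara's innings are not listed in the article, so this example falls back on Bannerman's six scores with a lower, purely illustrative cap of 50. The skewness formula is the adjusted Fisher-Pearson sample coefficient, which is an assumption about the method used, and `cap_scores` is a made-up helper name.

```python
from math import sqrt


def sample_skewness(scores):
    """Adjusted Fisher-Pearson sample skewness (assumed formula)."""
    n = len(scores)
    m = sum(scores) / n
    s = sqrt(sum((x - m) ** 2 for x in scores) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in scores)


def cap_scores(scores, cap=150):
    """Replace every score above `cap` with `cap`, leaving the rest unchanged."""
    return [min(x, cap) for x in scores]


bannerman = [165, 4, 10, 30, 15, 15]
capped = cap_scores(bannerman, cap=50)  # illustrative cap, not the article's 150

print(capped)
print(round(sample_skewness(bannerman), 2), round(sample_skewness(capped), 2))
```

Capping shortens the right-hand tail, so the second skew figure printed is lower than the first. How much it drops depends heavily on where the cap is placed.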

Going back to the median, the highest medians recorded to date are shown below:-

72.0 Lewis
65.0 Richards, BA
51.0 Hill, AJL
50.0 Gregory, RG
48.0 Duleepsinhji
46.0 Barrington
46.0 Walters, CF
43.0 Steele
42.0 Jaques
41.0 Barnes, SG
40.5 Ramaswami
40.5 Tyldesley, GE
40.0 Hobbs

Some of these batsmen did not feature in many Tests, Bradman and Barrington being the notable exceptions – nonetheless it’s fair to say these batsmen demonstrated consistent performances. This can be confirmed by looking at the skew measure of scores – Lewis was discussed above, but Cyril Walters, who played 18 Test innings, has a very low skew value of +0.1; in only four of those innings did he fail to score at least 20.

But to me the significant indicator of the median as far as measuring the central tendency of individual batsmen can be seen if we rank the batsmen on median and exclude those with fewer than twenty innings, in which case we have this list:-

Best medians all-time
46.0 Barrington
42.0 Jaques
40.5 Tyldesley, GE
40.0 Hobbs
38.0 Sutcliffe
36.0 Weekes
35.0 Gunn, G
35.0 Barlow, EJ
35.0 Katich

I have always felt that it’s unreasonable to consider that Bradman, as some have said, is 50-60% better than everyone else based on the difference between his average and the next best (depending where you place the cut off from an innings played point of view) – here, Bradman enjoys a lead of just over 20%, which is still very significant but may be a better measure of his superiority over the rest.

To conclude, below is a list of the top batsmen, ranked on their medians, showing also skew and batting average (cut-offs 20+ innings, average 35+)

Player		Median	Skew	Avg
Barrington	46.0	1.19	58.67
Jaques		42.0	0.89	47.47
Tyldesley	40.5	0.39	55.00
Hobbs		40.0	1.30	56.94
Sutcliffe	38.0	1.05	60.73
Weekes		36.0	1.22	58.61
Barlow		35.0	1.46	45.74
Katich		35.0	0.82	45.03
Gunn		35.0	0.88	40.00
Kallis		34.5	1.16	57.43
Walcott		34.5	1.18	56.68
Pollock		34.0	1.83	60.97
Trott		34.0	1.62	57.79
Tendulkar	34.0	1.40	56.25
Hassett		34.0	1.52	46.56
Sobers		33.5	2.14	57.78
Lara		33.5	2.32	53.18
Paynter		33.0	2.00	59.23
Nourse		33.0	1.80	53.81
Dravid		33.0	1.67	53.00
Khan		33.0	2.10	50.93
Pietersen	33.0	1.51	50.48
Bland		33.0	1.31	49.08
Jardine		33.0	0.81	48.00
Watson		33.0	0.93	39.23
Taylor		33.0	0.72	35.60
Sangakkara	32.5	1.85	56.42
Richards	32.5	1.79	50.23
Hammond		32.0	2.18	58.45
Hutton		32.0	2.32	56.67
Hussey		32.0	1.17	53.26
Smith		32.0	1.93	50.30
Walters		32.0	1.82	48.26
Hendren		32.0	1.49	47.63
Kanhai		32.0	1.78	47.53
Brown		32.0	1.64	46.82
Richardson	32.0	0.91	44.77
Ryder		31.5	1.63	51.62
Chappell	31.0	1.67	53.86
Sehwag		31.0	2.19	52.41
Border		31.0	1.43	50.56
Chanderpaul	31.0	1.20	49.18
Jackson		31.0	0.89	48.79
Amla		31.0	1.74	46.95
Subba Row	31.0	1.02	46.85
Sharpe		31.0	0.85	46.23
Anwar		31.0	1.27	45.53
Hunte		31.0	2.19	45.06
Redpath		31.0	1.25	43.45
Fingleton	31.0	0.99	42.46
Hayden		30.5	2.37	50.22
McCabe		30.5	1.91	48.21
Stollmeyer	30.5	1.66	42.33
Iqbal		30.5	1.25	40.21
Ponting		30.0	1.56	53.13
Jayawardene	30.0	2.30	51.53
Cook		30.0	2.03	49.72
Boycott		30.0	1.47	47.72
Simpson		30.0	2.48	46.81
Martyn		30.0	1.17	46.38
Collins		30.0	1.94	45.06
Richardson	30.0	1.56	44.39
Edrich		30.0	2.48	43.54
Amarnath	30.0	0.88	42.50
Fredericks	30.0	1.34	42.49
Ramesh		30.0	1.11	37.97
Sheppard	30.0	1.14	37.80
Richardson	30.0	0.97	37.47
Inzamam		29.5	1.97	49.60
Sidhu		29.5	1.27	42.13
Kamal		29.5	0.52	37.73
Laird		29.5	0.59	35.29
Davis		29.0	1.66	54.20
Flower		29.0	1.75	51.55
Dexter		29.0	1.64	47.89
May		29.0	1.97	46.77
Morris		29.0	1.59	46.48
Leyland		29.0	1.43	46.06
Greig		29.0	1.26	40.43
Ahmed		29.0	1.74	40.41
Youhana		28.5	1.52	52.29
Compton		28.0	1.89	50.06
Worrell		28.0	2.09	49.48
Bell		28.0	1.54	49.28
Mitchell	28.0	1.18	48.88
Lawry		28.0	1.57	47.15
Robertson	28.0	1.25	46.36
Woodfull	28.0	1.16	46.00
Pullar		28.0	1.69	43.86
Rowan		28.0	2.31	43.66
Butcher		28.0	1.85	43.11
Washbrook	28.0	1.70	42.81
Viswanath	28.0	1.54	41.93
Gayle		28.0	3.08	41.65
McDonald	28.0	1.64	39.32
Gregory		28.0	0.99	36.96
Ganguly		27.5	1.70	42.18
Harvey		27.0	1.47	48.41
Graveney	27.0	2.14	44.38
Gower		27.0	1.77	44.25
Taylor		27.0	2.49	43.50
Gooch		27.0	2.34	42.58
Strauss		27.0	1.39	41.98
Taylor		27.0	1.15	40.77
Umar		27.0	2.15	40.08
D'Oliveira	27.0	1.28	40.06
Fleming		27.0	2.55	40.06
Stewart		27.0	1.60	39.55
Armstrong	27.0	1.93	38.68
Ranatunga	27.0	1.11	35.70
Cowper		26.5	2.78	46.84
Jones		26.5	1.77	44.27
Ransford	26.5	1.87	37.84
Gambhir		26.0	1.45	48.34
Gilchrist	26.0	1.33	47.61
Nurse		26.0	2.04	47.60
de Villiers	26.0	2.14	47.41
Laxman		26.0	2.09	46.26
O'Neill		26.0	1.51	45.55
Misbah		26.0	1.32	44.97
Turner		26.0	2.45	44.64
Kal'charran	26.0	1.23	44.43
Chappell	26.0	1.62	42.42
Taylor		26.0	1.36	41.76
Wessels		26.0	1.59	41.00
Symonds		26.0	1.95	40.61
Denness		26.0	2.24	39.69
Kippax		26.0	1.42	36.12
Waugh		25.5	1.37	51.06
Hazare		25.0	1.40	47.65
Clarke		25.0	1.24	46.31
Azharuddin	25.0	1.58	45.04
Greenidge	25.0	1.86	44.72
Cowdrey		25.0	1.28	44.06
Malik		25.0	1.82	43.70
Smith		25.0	1.30	43.67
Boon		25.0	1.51	43.66
Rowe		25.0	2.82	43.55
Slater		25.0	1.58	42.84
Umrigar		25.0	1.80	42.44
McGlew		25.0	2.47	42.06
Gibbs		25.0	1.93	41.95
Macartney	25.0	1.52	41.78
Dilshan		25.0	1.70	41.69
McCosker	25.0	0.97	39.56
Gurusinha	25.0	1.70	38.92
McKenzie	25.0	2.29	37.39
Duff		25.0	1.68	35.59
Samaraweera	24.5	1.62	52.61
Rae		24.5	0.75	46.18
Kirsten		24.5	1.91	45.27
Parfitt		24.5	1.18	40.91
Sarwan		24.5	2.52	40.01
Manjrekar	24.5	1.86	39.12
Khan		24.5	1.32	38.92
Wright		24.5	1.51	37.82
Jones		24.0	1.83	46.55
Crowe		24.0	2.18	45.37
Ryder		24.0	1.99	44.85
Haynes		24.0	1.44	42.29
Vengsarkar	24.0	1.43	42.13
Vaughan		24.0	1.84	41.44
Gomes		24.0	1.39	39.63
Ponsford	23.5	2.10	48.22
Lehmann		23.5	1.44	44.95
Dhoni		23.5	1.27	38.14
Nazar		23.5	2.15	38.09
Abel		23.5	1.65	37.20
McCullum	23.5	2.18	36.70
Thorpe		23.0	1.30	44.66
Sutcliffe	23.0	2.36	40.10
Iqbal		23.0	1.68	38.85
Burge		23.0	2.05	38.16
Khan		23.0	1.54	37.69
Manjrekar	23.0	2.48	37.15
Hooper		23.0	2.10	36.47
Barber		23.0	2.49	35.59
Atherton	22.5	1.29	37.70
Kelleway	22.5	1.64	37.42
Stackpole	22.5	1.97	37.42
Hughes		22.5	1.66	37.41
Raja		22.5	1.25	36.16
Woolley		22.5	1.39	36.07
Prince		22.0	1.51	43.36
de Silva	22.0	1.88	42.98
Yallop		22.0	2.31	41.13
Bardsley	22.0	1.73	40.47
Sardesai	22.0	2.22	39.23
McMillan	22.0	1.34	38.46
Miller		22.0	1.62	36.97
Holt		22.0	1.83	36.75
Umar Akmal	21.5	1.34	35.82
Edrich		21.0	2.14	40.00
Hafeez		21.0	1.38	35.17
Jayasuriya	20.5	2.89	40.07
Logie		20.5	1.10	35.80
Warnapura	20.5	0.98	35.70
Stoddart	20.5	2.35	35.57
Shrewsbury	20.5	1.90	35.47
Hardstaff	20.0	1.46	46.74
Til'karatne	20.0	1.66	42.88
Goodwin		20.0	1.46	42.84
Matthews	20.0	1.24	41.09
Trumper		20.0	2.08	39.04
Lindsay		20.0	1.84	37.66
Khan		20.0	1.98	37.10
Astle		20.0	1.92	37.02
Patil		20.0	1.85	36.93
Dias		20.0	0.94	36.71
Borde		20.0	1.58	35.59
Ritchie		20.0	1.63	35.21
Kambli		19.0	1.77	54.20
Reid		19.0	1.37	46.29
Abbas		19.0	2.27	44.79
Prior		19.0	0.97	44.71
Hill		19.0	1.58	39.21
Coney		19.0	1.60	37.57
Hussain		19.0	1.63	37.19
Lamb		19.0	1.39	36.09
Luckhurst	19.0	1.36	36.05
Fowler		19.0	2.10	35.32
Gatting		18.5	2.13	35.55
Booth		18.0	1.29	42.21
Fletcher	18.0	2.03	39.90
Iredale		18.0	1.35	36.68
Oram		18.0	1.61	36.33
Styris		18.0	1.64	36.04
Steel		17.5	2.49	35.29
Ranji		17.0	1.77	44.95
Faulkner	17.0	1.82	40.79
Ames		17.0	1.56	40.56
Sandham		17.0	3.46	38.21
Umar		17.0	2.56	36.63
Afridi		17.0	1.45	36.51
Shastri		17.0	2.07	35.79
Atapattu	16.5	2.36	39.02
Lloyd		16.0	2.23	46.67
Amiss		16.0	1.99	46.30
Edwards		16.0	1.49	40.37
Tres'thick	15.0	2.38	43.80

How about comparing each batsman's score against the median score (z-score) for each, say, 3-5 year period and averaging it, so as to get an idea of how each batsman is rated according to his contemporaries

Very interesting article, but i doubt many lower order batsman would like it 😀

Comment by Leandro | 12:00am GMT 24 November 2011

Great work. I value consistency (median) as more important than averages. I would also like to see statistics on dot balls for one day match averages.

It should also be remembered that cricket is a team game and not all batsmen played for the team; rather, they played for themselves, which is more readily apparent in the one day format.

Comment by Steven Delit | 12:00am GMT 3 January 2012

Great article.

A question and a request 🙂

Question – are you taking the median of runs scored in an innings, or runs scored between dismissals?

Request – could you please let me know where you got the data from to do this analysis? I’d like to play around with it as well.

Thank you!

Comment by Masum | 12:00am BST 28 August 2013

I think this would also be very relevant for bowlers as an average of say 28 doesn’t translate to an actual analysis like a median of say 4/120 would and would allow for a better comparison of bowlers

Comment by Daniel | 5:39am GMT 8 February 2015