Dressage Scoring: Getting the Balance Right

Often we hear that Thursday morning (session 1) is a handicap and that Friday afternoon (session 4) is a bonus. This has been an opinion for years and it came up again after Burghley, and other 4* records, tumbled in the final hour of dressage. So can EquiRatings provide any factual insight on the matter?

Judges are human, so we can’t expect them to be devoid of bias, and time and again our analysis shows that the draw can have a material impact on a competitor’s chance of winning.

At Burghley, based on the competitors’ 2016 dressage averages, the four sessions should have produced the following averages:

  • Thursday morning: 52.8
  • Thursday afternoon: 53.2
  • Friday morning: 50.3
  • Friday afternoon: 50.2

So, according to these recent trends we did expect a slightly higher standard of dressage on Friday. But how did the sessions then perform based on actual averages?

  • Thursday morning: 57.1 (4.2 marks higher than trend)
  • Thursday afternoon: 53.4 (0.2 marks higher than trend)
  • Friday morning: 51.5 (1.2 marks higher than trend)
  • Friday afternoon: 46.6 (3.6 marks lower than trend)

It follows a distinct pattern. Thursday morning is a session to avoid, as competitors will usually score significantly higher (that is, worse, since these are penalty scores) than their trending average, while Friday afternoon provides that crucial bonus in the opposite direction. The difference between these two sessions was 7.8 penalties at Burghley, the equivalent of two show jumps or 20 seconds on the cross country.
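The 7.8-penalty gap and its equivalents can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the two deviations quoted above and assumes conversion rates of 4 penalties per show jump knocked down and 0.4 penalties per second over the cross-country time. The variable names are our own.

```python
# Deviations from trend quoted in the article (penalties; positive = worse than trend).
thursday_am_deviation = 4.2
friday_pm_deviation = -3.6

# Assumed conversion rates between phases.
PENALTIES_PER_SHOW_JUMP = 4.0
PENALTIES_PER_XC_SECOND = 0.4

# Gap between the hardest and easiest sessions.
session_gap = round(thursday_am_deviation - friday_pm_deviation, 1)

# Express the same gap in show jumps and in cross-country seconds.
show_jumps = round(session_gap / PENALTIES_PER_SHOW_JUMP, 2)
xc_seconds = round(session_gap / PENALTIES_PER_XC_SECOND, 1)

print(f"Gap between sessions: {session_gap} penalties")
print(f"Equivalent show jumps: {show_jumps}")
print(f"Equivalent cross-country seconds: {xc_seconds}")
```

On these assumptions the gap works out at just under two show jumps and just under 20 seconds, which matches the rounded figures in the text.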

The intention here is not to criticise but to inform. The above figures can be used to help identify potential shifts in judges’ scoring patterns. The ground jury at Rio 2016 managed to avoid any session bias, so it can be controlled, but frequently it is significant. That significance is the biggest issue. The 7.8 penalty difference between Thursday morning and Friday afternoon represents just half a mark per movement. Or, as it actually happens, a quarter of a mark too hard on Thursday and a quarter of a mark too easy on Friday.

If one judge felt a combination was working at all 7s, it results in a score of 45. If the other judge thought it was all 7.5s, it brings a score of 37.5, a difference of almost two show jumps. As with all analysis, a range of questions begins to open up:

Why is our current scoring system placing so much emphasis on dressage? Having a difference of opinion of one mark shouldn’t be such an issue, particularly with judges sat at different positions around the arena, but one mark per movement is currently 15 penalties, almost four show jumps and over 37 seconds on the cross country.
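The relationship between a judge's marks and the final penalty score can be sketched as follows. This is a simplified model, not the official scoring procedure: it assumes every movement carries equal weight, treats the test as a plain average of marks out of 10, and applies the 1.5 coefficient used at international level. The function name and constants are our own.

```python
DRESSAGE_COEFFICIENT = 1.5       # multiplier applied at international level
PENALTIES_PER_SHOW_JUMP = 4.0    # assumed penalty for one fence down
PENALTIES_PER_XC_SECOND = 0.4    # assumed penalty per second over the time


def penalties_from_average_mark(average_mark, coefficient=DRESSAGE_COEFFICIENT):
    """Convert an average mark out of 10 into a dressage penalty score."""
    percentage = average_mark * 10           # e.g. all 7s -> 70%
    return (100 - percentage) * coefficient  # e.g. (100 - 70) * 1.5 = 45


all_sevens = penalties_from_average_mark(7.0)
all_seven_and_a_halfs = penalties_from_average_mark(7.5)

# Cost of one full mark per movement (e.g. all 6s versus all 7s).
one_mark_cost = penalties_from_average_mark(6.0) - penalties_from_average_mark(7.0)

print(f"All 7s: {all_sevens}")
print(f"All 7.5s: {all_seven_and_a_halfs}")
print(f"One mark per movement: {one_mark_cost} penalties")
print(f"  in show jumps: {one_mark_cost / PENALTIES_PER_SHOW_JUMP}")
print(f"  in cross-country seconds: {round(one_mark_cost / PENALTIES_PER_XC_SECOND, 1)}")
```

Under this model, all 7s gives 45, all 7.5s gives 37.5, and one mark per movement costs 15 penalties, in line with the figures quoted above.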

Jonelle Price and CLASSIC MOET scored 48.5 on Thursday afternoon. (photo Nico Morgan)

Comparing Jonelle Price and Christopher Burton at Burghley asks the question of balance between the phases. Burto was 18.3 marks ahead after dressage (an average of just 1.2 marks per movement). Jonelle was four seconds faster on the cross country and had one fence down versus the four of Burto. Chris won and still had 4.7 marks in hand over Jonelle (3rd) so could have had a fifth fence down and stayed ahead of her.
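The 4.7-mark margin can be reconstructed from the figures in the paragraph above. The sketch assumes that the four-second difference on the cross country counted in full as time penalties at 0.4 per second, that each show jump down costs 4 penalties, and that no other penalties separated the two riders. The variable names are our own.

```python
PENALTIES_PER_SHOW_JUMP = 4.0   # assumed penalty for one fence down
PENALTIES_PER_XC_SECOND = 0.4   # assumed penalty per second over the time

dressage_lead = 18.3            # Burton's lead over Price after dressage
xc_seconds_slower = 4           # Burton was four seconds slower across country
extra_fences_down = 4 - 1       # four fences down for Burton versus one for Price

# Penalties Price clawed back in the two jumping phases.
xc_recovered = xc_seconds_slower * PENALTIES_PER_XC_SECOND
sj_recovered = extra_fences_down * PENALTIES_PER_SHOW_JUMP

final_margin = round(dressage_lead - xc_recovered - sj_recovered, 1)
survives_fifth_fence = final_margin > PENALTIES_PER_SHOW_JUMP

print(f"Final margin over Price: {final_margin}")
print(f"Still ahead with a fifth fence down: {survives_fifth_fence}")
```

On these assumptions the dressage lead absorbs both the time difference and three extra fences, and would still have covered a fifth fence down.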

Chris Burton was 4.3 marks ahead of Bettina after his test on NOBILIS 18 (photo – Nico Morgan)

Is that balance right for eventing? Are we promoting the right type of horse for the sport or is that one mark per movement placing too much emphasis on the first phase? How many hours do people spend training in the cross country phase versus training in the dressage phase, and is that in the best interest of keeping the sport safe?

Comments 28

  1. Interesting and fair analysis. Supports early work I supervised for final year BSc student dissertations looking at thousands of BD test results. There was a statistically significant difference between first session before coffee, and late morning (lower good marks), with those immediately after coffee, immediately after lunch and for the final few competitors in the class. These of course are shorter than 1 day, and mostly have only one judge, but suggests similar trends

  2. This is fascinating and backs up a couple of things I have been discussing with others. However, it is important to also bear in mind that one judge in particular was regularly out of line with the other two last week. That might change the analysis somewhat. She was often two or three marks per movement down on the other two.
    I’m particularly interested in your last few paragraphs too. I was there, watching Chris’s SJ round, thinking, “that should be it now, he’s lost it” but it wasn’t. He was still in the lead. There is too much emphasis on dressage, still. An unlucky run-out on cross-country should not cost 20 penalties, or five poles. A good clear round SJ on a course like that on Sunday should be massively rewarded too. They were VERY few and far between.

  3. The very best dressage judges avoid dressage bias. They keep notes of their previous marks given in the competition to other competitors, and use their collectives if necessary to ensure that the overall finishing order after the dressage is as they intend. (I have written for a top judge who did this.) The best judges are not swayed by who is in the saddle, either.
    Perhaps we should go back to 1 second over time on xc = 1 penalty (rather than 0.4), the system that was trialled in 1999, or otherwise do away with the dressage coefficient. The dressage is surely too influential at CICs and CCIs.

  4. The reason the Olympics avoided the bias was that there is no draw, and no multiple riders. The ‘draw’ is nearly always arranged to have the star riders later on on cross country day unless they are multiple riders. Many of those prefer to ride the least able horse first (admittedly others not) but that affects the figures. There are other factors, not least of which is the judges getting tired or having too many in a section, so the judge can’t remember the tests on the first morning session clearly and effectively judges two competitions. However, your figures may well provoke some changes in the system, which would be a good thing.

  5. Pingback: Putting the science into dressage scoring - NZ Horse & Pony

  6. In fact, your numbers may not reflect the actual judges scores since the penalty points at the FEI level include the 1.5 coefficient. If you are using actual dressage scores and not just penalty points, the weight of dressage is even greater.

    I’ve been doing 4* event spreadsheets with and without the FEI dressage coefficient for about five years. The dressage coefficient rarely changes the winner, but it did for Burghley 2016. With the dressage coefficient, the top three placings were Chris Burton, Andrew Nicholson and Jonelle Price. Without the dressage coefficient, the top three were Jonelle Price, Chris Burton, and Andrew Nicholson. The dressage coefficient also determined the winner at Badminton in 2011. With it Mark Todd won; without it Sam Griffiths and Happy Times were top ranked.

    At Burghley this year, seven of the top ten places were determined by the extra FEI dressage coefficient penalty points. The pairs would still have been in the top ten, but their placings were determined by the coefficient.

    The dressage coefficient is a relic of long format. Since so many other things have changed in eventing, with the loss of roads and tracks, the advent of the CIC, and the shortening of the CCI XC course, is there any excuse for this extra dressage weight? It is possible, and I have the spreadsheets to prove it, that perfect performance in both XC and dressage cannot overcome the extra penalty points (which increase geometrically as the dressage score declines) imposed by the FEI.

  7. Very very interesting article. There is no doubt that four down in the sj should not result in winning Burghley. The influence of the xc was huge as it should be, and I am sure also affected the showjumping, but it is those dressage statistics that need scrutiny

  8. The dressage coefficient was a relic of long format eventing and should have been removed when roads and tracks were removed. It is not found in national competitions and results in too much dressage influence in the final outcome. It also magnifies the variability of the dressage marking and the perceived bias which can occur when judges favour the big name riders. To win with 4 show jumps down is wrong as that is a very bad show jumping round.

  9. V interesting and it’s especially hard for riders who have two so will always likely be going on Thursday am. There were some nice tests at Burghley in this slot and general consensus was if they had been Friday pm they would have been a few marks more at least. Please can someone set out an example of how the coefficient works? Agree dressage has too much influence now.

    1. If you score a 30 in FEI dressage, your actual penalty points will be 45 (30 x 1.5 = 45). If you score a 40, your actual penalty points will be 60 (40 x 1.5 = 60). If you score a 20, your actual penalty points will be 30 (20 x 1.5 = 30).

  10. What nobody is taking into account is that many horses vary in how they go from test to test. Sometimes horse or rider will make a mistake – miss a change, stick in the pirouette, break, etc. – all of which will bring the mark for the movement(s) down to below 5 and affect the submission and rider marks. You cannot say that the horses will always finish in order of their average mark – there are so many factors that contribute – weather, going, outside influences, preparation, how they feel on the day, etc.
    I don’t agree with Kerry that a good judge will compare one test to another and manipulate the order with the collectives. For me, that produces false results, as comparison is difficult unless the tests are consecutive, because memory can distort. I prefer to mark each movement from a perfect image of what I want to see, as this is more consistent with what actually happens. The collectives should reflect the test, not be used to modify the score. It may then be that a consistent test without mistakes comes out higher than one that says wow and everyone says oh that’s the best, but maybe lack of straightness or similar made the mark come out slightly lower. Of course, the quality can be so high that it overrides the problems; then you get people saying didn’t the judges see it doing x or y!!
    Possibly playing devil’s advocate here, but while I don’t disagree that a horse that has 4 show jumps down shouldn’t win, should a horse that shows a lot of tension in the dressage, effectively producing a fairly moderate performance however skilfully ridden, also be the winner? I wonder whose was the best all round performance?!

    1. I certainly agree that the dressage judges’ scores should determine the overall outcome. What I object to is the dressage coefficient giving those scores vastly more weight than the scores in the other phases.

    2. Post

      We have to back you up here Annabel – Careful and knowledgeable assessment of the actual performance is important. What’s interesting about Burghley is that the first two horses did actually beat their averages, so the judges were clearly happy to reward where it was warranted. Two of the last six horses on Friday scored higher (worse) than their average, one of them by 9 penalties, so again it shows that the judges were reacting to what was in front of them.

      What we see a lot, and what seems to go beyond coincidence (although we must stress that this is ‘often’ and not ‘always’), is that most first session combinations score worse than their usual mark and most final session combinations score better than their average. The figure at Burghley was that 3 of the first 17 horses scored better than their trend line, while the same figure for the end of the competition was 13 from 17. That’s the interesting trend that we see creep in a lot.

      Do Thursday mornings have the buzzier types that tend to blow up on the big stage, thus making them more susceptible to producing a below par test? There could be genuine reasons as to why this phenomenon exists. And it does get defied, Andrew (Nicholson) and Nereo led Badminton 2015 on 37.8 from an early draw!

      As we are always keen to stress, these pieces are never intended as a criticism of judging or of the current rules and formats. There are always a huge number of factors which need to be taken into account when analysing performance – the numbers just being one. We appreciate the discussion and we’ll have more content on its way, so please stay tuned!

      1. The FEI dressage scores do not accurately reflect dressage performance differences among the horses. The judge’s actual scores do.

  11. DR % of 70% is a pen score of 30 at national level and 45 at international. Hence the 1.5 multiplier effect.

    DR % of 60% is a pen score of 40 at national level and 60 at international. So the difference at national level between these two examples is 10 marks, while at international it is 15 marks.

  12. How horses performed compared get m to there average AND their starting positions I is much more interesting!
    I think there are many factors as to why there is an assumed bias. When a competition starts, it can be noisier than later on – last minute setting up, posts being banged in etc, fresh, still early morning air especially in spring and autumn makes for atmosphere and noise carrying more loudly and often the better horses are drawn late even in team competitions as the strongest one normally goes last. As you say, it is more than possible to lead from an early draw. The answer is to just go in a way that they give you the marks!!

  13. My first sentence doesn’t actually make sense!! I blame the lorry I was travelling in!!☺️☺️ What it was meant to say is that how horses performed compared against their average AND their starting position is much more interesting!

  14. Great that dressage scores are being discussed openly. Being a dressage judge must be difficult and I couldn’t do it. However, you mention bias as not something you can do anything about. I agree to some extent but you can lessen its impact. There have been numerous occasions when competitions have been won or lost because of one judge giving a much higher or lower score than the others. I know this is not the main thread of discussion in your article, but shouldn’t the way the scores are made up be looked at as well? For example, 4 judges (3* and upwards), top and bottom scores discarded and then an average taken of the remaining 2? Thanks for the article.

  15. Firstly I totally agree with Annabel’s observations about the dressage phase. To change the subject: if one judge is consistently marking a little lower than the others that should theoretically make no difference to the final result. It only becomes a problem if that judge suddenly judges one or two horses very high and changes the balance. I can never understand why in international competitions the penalty marks for dressage are multiplied by 1.5. Also one has to remember that the judge on the side will see a very different picture to the judge at C & M or H. Annabel has explained very succinctly about straightness, pirouettes etc. However, these arguments and explanations are going to rumble on and on and, as has been said, judges are human beings and not computers. So therefore, in my opinion, the whole scoring system for the dressage phase of international horse trials should be looked at, reconsidered and overhauled to bring the marks into symmetry with the jumping phases so that they reflect the obedience of the horse and the quality of its flat work etc but play a less important part in the overall competition which is essentially a test of jumping, bravery and stamina.

  16. Pingback: FEI Dressage and Safe Cross Country %. – e-Venting

  17. I’ve just done my Pau 2016 spreadsheet. With the coefficient in place, the results were Nicholas, Jung, and Jung. Without the coefficient, Nicholas still would have won by multiple points (5, IIRC). But Camilla Speirs would have come second, Jung on Fischerrocana 3rd, Laghouag 4th, Annie Clover 5th, Fischertakinou 6th, Billy The Red 7th, One Two Many 8th, TS Jamaimo 9th, Zabreb 10th, and Clifton Signature 11th.

    After the top ten, the rankings don’t change much except that Karin Donckers would have been 18th, Cedric Lyard 15th. From 20th down, the coefficient changed only two places, and they were by only one place.

    This seems to me to be consistent with my previous spreadsheets that show that the top ten would remain the top ten, but the actual placings would be radically changed.

    The dressage coefficient PPs cost Camilla Speirs just at 18,000 euros.

  18. Pingback: The Importance of the FOD! EquiRatings’ Review of 2016 | Equiratings
