I think Martin equalized the ratings independent from weightclass -- Now if only he would incorporate my changes above :)
You all must understand, these ratings were made by looking at the current scene, and they only please you all for a snapshot in time. As time goes on, they may look grossly distorted, and then let the complaining start. I believe that began to happen last time, and thus the predictive system was incorporated.
JCS83MD wrote:I think Martin equalized the ratings independent from weightclass -- Now if only he would incorporate my changes above :)
You all must understand, these ratings were made by looking at the current scene, and they only please you all for a snapshot in time. As time goes on, they may look grossly distorted, and then let the complaining start. I believe that began to happen last time, and thus the predictive system was incorporated.
:)
Yes, and nothing looked grossly distorted in the last system :)
What does "equalized the ratings independent from weightclass" mean?
Like I said 100 times, it's all a matter of perception. If you were looking for an IBO/Ring-like ranking from the last system, that wasn't it - OBVIOUSLY.
It means that rather than update the database w/ a divisional rating, he updated the database w/ more of a "career" rating. This makes the P4P look better, but one could argue that it does not look as "traditional" in the division lists.
Keep in mind that a lot of older pre-1940s bouts are missing and that the ratings are optimized for today's world. We should really be handicapping these older fighters.
I don't want to see the current ratings messed with too much for the sake of improving the all-time ratings. While the all-time ratings are fun, they are a totally speculative exercise in comparing fighters from very different historical time periods. Since the all-time measure itself (best total over 5 years) is just an arbitrary choice point - that's what should be fiddled with to make the all-time ratings look better. Change the time period, or create a new measure. Maybe something like the adjusted stats for baseball - reward fighters who were the most dominant in their era rather than try to use a single metric to compare all of them together.
The current ratings, on the other hand, are not a toy. They are actually used for matchmaking, sanctioning, and promotion purposes. I'm not saying that this latest adjustment is bad, because I don't know how it was derived - but I would like to see some stability in the current ratings for a while.
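For what it's worth, the "most dominant in their era" idea could be sketched roughly like this. It is only an illustration, not how the site computes anything: the function name `era_adjusted`, the input layout and the choice of a z-score (distance from the era average, measured in standard deviations) are all assumptions made for the example.

```python
from statistics import mean, pstdev


def era_adjusted(scores_by_era):
    """Return {fighter: z-score within his own era}.

    scores_by_era is a hypothetical layout: {era: {fighter: best_5yr_total}}.
    A z-score says how far a fighter sits above or below the average of
    his own era, so eras with inflated or deflated raw totals become
    comparable without touching the current ratings.
    """
    adjusted = {}
    for era, scores in scores_by_era.items():
        mu = mean(scores.values())
        sigma = pstdev(scores.values())
        for fighter, score in scores.items():
            # If everyone in the era has the same total, nobody stands out.
            adjusted[fighter] = 0.0 if sigma == 0 else (score - mu) / sigma
    return adjusted
```

The point of the design is that the raw "best total over 5 years" stays as it is; only the all-time list is rescaled, which leaves the current ratings alone.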
emile wrote:I don't want to see the current ratings messed with too much for the sake of improving the all-time ratings. While the all-time ratings are fun, they are a totally speculative exercise in comparing fighters from very different historical time periods. Since the all-time measure itself (best total over 5 years) is just an arbitrary choice point - that's what should be fiddled with to make the all-time ratings look better. Change the time period, or create a new measure. Maybe something like the adjusted stats for baseball - reward fighters who were the most dominant in their era rather than try to use a single metric to compare all of them together.
The current ratings, on the other hand, are not a toy. They are actually used for matchmaking, sanctioning, and promotion purposes. I'm not saying that this latest adjustment is bad, because I don't know how it was derived - but I would like to see some stability in the current ratings for a while.
Maybe those who are seriously using the system should use IBO as their system is a mystical secret as are its changes.
Cobwebcat wrote:I can see the sense in that but I liked the all-time and they need drastically altering to make sense. I'm sure that can be done without upsetting "current"
Those all-time figs are so bad I wouldn't publish until they are adjusted
Remember the All-Time ratings from the predictive system? If memory serves, they weren't so bad at all. Maybe this is a sign of something...
Those all-time ratings, I believe, are from the current system, but I don't imagine my testing improved them all that much.
If you give me time, I can produce the all-time ratings from the previously used system + latest enhancements that never made it.
I don't think you should get carried away with the prediction rate again and I figured out why.
It's because the prediction rate is like a form guide, i.e. who is going to win next. At the end of the last football season, Liverpool were the hottest team in England. If there had been predictive ratings for football, then at the end of the season Liverpool would've been ahead of Chelsea, because that would've helped in the task of obtaining the best percentage in predicting the results of the next matchups. However, the 'new' career performance ratings reflect performance over a longer period of time, e.g. Chelsea ahead of Liverpool in the premiership table. As we've found out, most people want to see the league table, not the "who's in form" list.
Because the predictive ratings were being used by boxrec, in-form boxers like Calvin Brock and Sechew Powell were being pushed ahead in the rankings. However, most people were looking for career achievements in the rankings, e.g. Wlad and Mosley at number 1. Occasionally the in-form fighters were the same fighters that had the best career achievements, e.g. Calzaghe. This was the equivalent of the middle of last football season, when Chelsea were not only the league leaders but also the most in-form team in the premiership.
If you start optimizing the new rankings for predictability again, then exactly the same thing will happen: the in-form fighters will start climbing the rankings again, at the expense of boxers with more career achievements.
2)
Regarding your point above about dropping the DQ to 20% instead of 40% so that the predictability gets better:
Most DQs that I have seen have been desperation moves on the part of the losing boxer. It is a rare case where I think, gee, that person got unfairly DQ'd out of a fight. I consider the DQ logically equivalent to a KO or TKO, i.e. the loser cannot make it to the end of the fight.
In general, if you are intending to improve the career performance ratings, then you must try to match the public's perception of a win or loss. I think most people would rank a KO or TKO as higher dominance than a UD. Martin has already implemented this line of thinking with different points differences for a SD and a UD.
It is of course legitimate to tinker with the percentages, but only in the context of what matches the public's perception.
3)
I think the problem that most people had with Tarver's ranking was the fact that he got his arse handed to him by Hopkins. There is neutral information available on this in boxrec, such as a) Hopkins jumped two weight classes, b) it was his first fight at this higher weight in 10 years and c) Tarver got his arse absolutely handed to him on the cards.
If you can work such factors into the calculation (whilst retaining clarity) then you will more accurately match the public's perception of the rankings. This is more important than the predictability rate for the 'career performance' ratings.
4)
A question. Which can you read more into: a) a wide UD (average 118-110) on neutral ground or b) a victory via TKO? Perhaps counter-intuitively, a wide UD tells me more about the relative skills of the two fighters. Anyone can bring it back with a KO in the 12th, but if you got your arse handed to you round after round, then it tells me that one fighter was really substantially dominant. I think it would be a good idea to try to integrate wide UDs into the equation. This would help match the public's perception of the fighters, which is the goal of the 'career performance' ratings.
5)
Do you get too many points for defeating an elite fighter (rating over 1200)? Everyone has to lose sometime. Every star is human and just one punch away from a loss. The underdog in this fight has the right to assume a very high position in the weight division; the spoils go to the victor. The question remains, though: is the man who beats an elite fighter really an elite fighter himself? What I'm basically saying is that a victory against an elite fighter carries too much weight in comparison to victories against non-elite fighters.
Depending on the algorithm, Tarver got around 700 pts for his victory against Jones. Against any other fighter in the division, he would've got around 150. This vaulted Tarver into the elite fighter category. The question is, did he deserve to be there, or was this just 'his night'.
I think it is possible to use the 'neutral' information available for reflecting on this victory, though it might cost a lot in clarity. Firstly Tarver had already lost to Jones, so he was not substantially better. Secondly Tarver had already lost to another fighter in his previous five fights, so it was quite questionable whether Tarver was truly an elite fighter (losses to fighters at the beginning of your career can be forgiven). Also, the victory was much higher than any other victory he had had. This was not a case of an established p4p top 10 like DLH beating p4p 1 Pernell Whitaker and getting a heap of points for it. Perhaps these sorts of factors of the win against a p4p elite fighter can be taken into account in adjusting the points for the win.
I could even imagine a simpler solution where part of the mega point bonus is held back until some criterion is met, perhaps a win against a +400 ranked fighter or even just a win against any fighter in his next fight. In Tarver's case, he would not have got the points, as he suffered a defeat in his next outing against Glen Johnson. On the other hand, a victory against Johnson would perhaps have confirmed his eliteness.
This sort of holding-back-pts logic (only for wins against elite fighters) would probably help with stories like Buster Douglas beating Mike Tyson or Julio Gonzalez beating Michalczewski. It reduces the overall impact of the dramatic upset of an aging champion.
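To make the holding-back idea concrete, here is a rough sketch of one way it could work. Everything in it is an assumption for illustration: the `Fighter` class, the function names, the rule that the held-back part is released by a win in the very next fight, and the use of 150 as the amount credited immediately (borrowed from the "around 150" figure above). It is not a description of the actual algorithm.

```python
from dataclasses import dataclass

ELITE_RATING = 1200      # "elite" cutoff as used in this post
IMMEDIATE_POINTS = 150   # hypothetical: what a normal win in the division pays


@dataclass
class Fighter:
    rating: float
    pending: float = 0.0  # bonus held back until the next result


def record_win(fighter, opponent_rating, raw_points):
    """Credit a win; hold back the excess if the opponent was elite."""
    # A win confirms any bonus still pending from the previous fight.
    fighter.rating += fighter.pending
    fighter.pending = 0.0
    if opponent_rating > ELITE_RATING and raw_points > IMMEDIATE_POINTS:
        fighter.rating += IMMEDIATE_POINTS
        fighter.pending = raw_points - IMMEDIATE_POINTS
    else:
        fighter.rating += raw_points


def record_loss(fighter, points_lost):
    """A loss forfeits any held-back bonus, then costs the usual points."""
    fighter.pending = 0.0
    fighter.rating -= points_lost
```

With made-up numbers: a 500-rated fighter who earns 700 raw points against an elite opponent sits at 650 with 550 pending. Lose the next fight and the 550 is gone; win it and the 550 is released on top of the points for that win.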
I follow your argument on DQ. That is why, as you'll see in my latest attempts, I made DQ a 0.3, and split fight factors into 3 categories. Again, this is just testing that I am not sure will ever be honored, but when the stats improve, one has to assume the change produced better results than the programming that was once there.
I was planning on separating blowout decisions into their own category, even higher than stoppages; however, I assume this would negatively affect prediction rates and lose some clarity in the grand scheme of things. Also, it can be assumed that a fighter who went 12 rounds was not beaten as badly as one who was knocked out, regardless of the round or the situation before the knockout. This argument has 2 valid sides to it.
What's unfortunate is that there is no real metric to gauge progress that is efficient and easy to interpret. Perhaps you, as a statistician, should provide one. :)
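For anyone trying to picture what "fight factors in 3 categories" might look like, a minimal sketch follows. Only the DQ value of 0.3 comes from this post. The category names, the mapping of result codes to categories and the other two factor values are placeholders I made up for the example, as are the function names.

```python
# Only "dq": 0.3 is taken from the post; the other values are placeholders.
FIGHT_FACTORS = {
    "stoppage": 1.0,   # hypothetical
    "decision": 0.7,   # hypothetical
    "dq": 0.3,
}

# Hypothetical mapping from result codes to the three categories.
RESULT_CATEGORY = {
    "KO": "stoppage",
    "TKO": "stoppage",
    "UD": "decision",
    "MD": "decision",
    "SD": "decision",
    "DQ": "dq",
}


def fight_factor(result):
    """Look up the multiplier for a result code such as 'TKO' or 'DQ'."""
    return FIGHT_FACTORS[RESULT_CATEGORY[result]]


def points_for_win(base_points, result):
    """Scale the points at stake by how the fight ended."""
    return base_points * fight_factor(result)
```

Keeping the factors in one table means a change like moving DQ from 0.4 to 0.3 is a one-line edit, which makes it easier to re-run the stats and compare.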
Just two quick suggestions, one on DQs and another on "blowouts". If you do something with DQs, is there any way you can separate fights where a guy was winning when his opponent was DQ'd (same credit as now) from fights where he was losing when his opponent was DQ'd (fighter receives less credit)? And vice versa: take off less when a guy is DQ'd while winning, and I guess the same as now when a fighter is DQ'd while losing.
As far as "blowouts" go, I would only consider a complete shutout on all three cards a "blowout"; that would remove some of the subjectivity of the judges. Of course, if you want to lower the threshold, I guess you could say a fighter winning more than 3/4 of the rounds on all judges' cards.
just some suggestions I am sure you have thought of these before. Dunno if they'd even have any effect on the ratings you come up with.
The DQ separation is a "no go" because there are quite a few DQs in the database where the scorecards aren't listed. I am sure if all DQs had them listed, it would be a possibility.
In the past, I found success testing 12-round blowouts where fighters had won by an average of 8 or more points per card.
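The blowout definitions floated in this thread (shutout on all three cards, more than 3/4 of the rounds on every card, or an average margin of 8+ points per card) can be compared side by side with a small sketch like the one below. The function names and the `(winner_points, loser_points)` card format are assumptions for the example. The rounds-based checks also assume every round was scored 10-9, with no knockdowns, deductions or even rounds, which real scorecards do not guarantee.

```python
def average_margin(cards):
    """Mean points margin over cards given as (winner_pts, loser_pts)."""
    return sum(w - l for w, l in cards) / len(cards)


def is_blowout_by_margin(cards, min_avg_margin=8):
    """Average of 8 or more points per card, as tested in the past."""
    return average_margin(cards) >= min_avg_margin


def rounds_won(card, rounds=12):
    """Rounds won by the winner, ASSUMING every round was scored 10-9."""
    w, l = card
    return (rounds + (w - l)) / 2


def is_blowout_by_rounds(cards, rounds=12, min_share=0.75):
    """Winner took more than min_share of the rounds on every card."""
    return all(rounds_won(c, rounds) > rounds * min_share for c in cards)


def is_shutout(cards, rounds=12):
    """Winner took every round on every card (same 10-9 assumption)."""
    return all(rounds_won(c, rounds) == rounds for c in cards)
```

Under the 10-9 assumption, a 118-110 card is 10 rounds to 2, so cards of 118-110, 119-109 and 117-111 pass the average-margin test (exactly 8) but fail the 3/4 test, because the 117-111 card is only 9 rounds to 3. The two rules are close but not interchangeable.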