JimmyFantastic



Joined: Feb 06, 2007

Posted: Dec 15, 2011 - 12:47

Sure RO is in the sweet spot but so are a lot of Chaos teams. How many have his win%?
Also he hasn't been there forever, he has got knocked down quite a few times as it's unavoidable in the box.

_________________
Pull down the veil - actively bad for the hobby!
koadah



Joined: Mar 30, 2005

Posted: Dec 15, 2011 - 12:53

JimmyFantastic wrote:
Sure RO is in the sweet spot but so are a lot of Chaos teams. How many have his win%?
Also he hasn't been there forever, he has got knocked down quite a few times as it's unavoidable in the box.


That's the point though. Without counting TV he's being compared against a lot of low-TV Chaos teams with no skills and crap results, rather than like for like.

Edit:
It would be good to have the link in the OP too.

_________________
O[L]C 2016 Swiss! - April ---- All Star Bowl - Teams of Stars - 2 more teams needed
JimmyFantastic



Joined: Feb 06, 2007

Posted: Dec 15, 2011 - 13:03

I understand your point, Koadah. However, my point is that RandomOracle's results far exceed those of other high-TV Chaos teams, not just low ones, so his rating isn't too distorted, although obviously it is a bit.
Will be interesting to see Carnis's new chaos team, 12/3/1 while they suck!

_________________
Pull down the veil - actively bad for the hobby!
VoodooMike



Joined: Nov 07, 2010

Posted: Dec 15, 2011 - 16:37

the_sage wrote:
Now if I win with dwarves, I gain .5 point. I win with goblins, I gain 4.5 points.
Of course this doesn't have to be the whole system, this can just be the weighting you apply to any other ladder/ranking system.

And if you're a dwarf coach, you have a reason to avoid ever playing goblins, because the rewards are almost negligible and the potential penalties are massive.

I should point out that the gain/loss thing is a bit of a misstatement for this system. You're not guaranteed a gain or loss in any situation, what happens is that each match is assigned a value, and that value is averaged across all your matches. Thus, if you're normally a very highly rated coach, and you play a match that is just slightly above average, it can make your rating go down... albeit probably not by much, since it has no heavier a weight than any other match.

That's another reason RandomOracle's rating is so high - he's quite consistent in his performance, and he has a lot of games under his belt. If he has a crap match, it probably won't do much to his rating.
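
To make the averaging concrete, here is a minimal Python sketch. It assumes the coach rating is a plain unweighted mean of per-match values and that 1000 marks an average match; the match values themselves (1300, 1050, 600) and the function name are invented for illustration and do not come from the actual script.

```python
from statistics import mean


def coach_rating(match_values):
    """Rating as the unweighted average of per-match values (assumed model)."""
    return mean(match_values)


# Hypothetical long-running coach: 200 matches, each valued at 1300.
history = [1300.0] * 200
before = coach_rating(history)

# A match that beats the overall average (1000) but not the coach's own average:
# the rating still goes down, though only slightly.
after_ok_match = coach_rating(history + [1050.0])

# A genuinely poor match barely dents a rating built on many games...
after_bad_match = coach_rating(history + [600.0])

# ...but hits much harder when the coach has only 10 games on record.
short_history = [1300.0] * 10
short_after_bad = coach_rating(short_history + [600.0])

print(before, after_ok_match, after_bad_match, short_after_bad)
```

Under these assumptions the same poor match costs the 200-game coach a few points and the 10-game coach more than sixty, which is the consistency-plus-volume effect described above.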

the_sage wrote:
Me, personally? No. The 'best' coaches? Hell yes. That's stalling, and it turns what would be draws into wins, and what would be losses into draws.

I'm not sure that, if you're that much better than the other coach, stalling is all that important to convert draws into wins and such.

Again, the data used to calculate is based on the average of all matches played by all types of coaches on FUMBBL in B.

koadah wrote:
I've even heard that in some cultures running up the score may be considered unsporting.

Sportsmanship, in that case, is playing beneath your capacity. Whatever your reason for doing so, you're opting not to do as well as you could and, by definition, are mechanically lowering your skill level. I don't think rating systems need to compensate for your choice to do that.

JanMattys wrote:
And even if 99% of the coaches decide to score and I am the only one who doesn't, my 2-0 doesn't lose its strategic value. In your system, it would result in my victory being rated "below average", though.

No, but it does mean your performance was below average. If the majority of coaches in your position can score 3 TDs then I'm not sure how scoring less is strategic - the strategy only comes into play when you manage to prevent the other side from scoring, for example, and ultimately that's no different in this system as it involves maintaining the relative TD count.

A coach that manages to prevent the other side scoring, and score more times him or herself, is performing better than you are if you can't manage both at the same time, no?

koadah wrote:
But really TV has to be in.

I suspect TV consideration would change things a bit, though not as much as you seem to think. It would reduce the rating of teams that play most of their games in their TV sweet spot, and raise the rating of successful teams that play outside of it.

JimmyFantastic wrote:
Will be interesting to see Carnis's new chaos team, 12/3/1 while they suck!

Might be worthwhile to have the script able to display rating calculations for specific teams, or specific matches, rather than the aggregate coach rating only, too.
happygrue



Joined: Oct 15, 2010

Posted: Dec 15, 2011 - 16:50

Good work on the script, it's fun to see. I agree that more transparency would be good.

VoodooMike wrote:

That's another reason RandomOracle's rating is so high - he's quite consistent in his performance, and he has a lot of games under his belt. If he has a crap match, it probably won't do much to his rating.


I think this is a good thing and one of the real positives for a method like this. For all my complaints about CR in the other thread, I think that the current CR formula does a great job of measuring recent (very short-term) performance in terms of straight wins and losses. That in itself is heavily influenced by luck in the short term, though, so it is just not a really useful metric for doing much with. A number like this that is built up from many data points is, IMHO, a more interesting measurement of coach skill (though no method is going to be "perfect", whatever that means).

Anyway, good work!
Corvidius



Joined: Feb 15, 2011

Posted: Dec 15, 2011 - 17:04

VoodooMike wrote:
Corvidius wrote:
Seems ok, although my rating is very low.

Lower than you expected? How do you feel you perform in B, keeping in mind it is only rating your relative performance with B teams based on racial matchups.


I can't say that it's inaccurate; I have a majority non-win ratio in Box, but I'll keep an eye on how much it changes in comparison to CR.
dode74



Joined: Aug 14, 2009

Posted: Dec 15, 2011 - 17:07

JanMattys wrote:
To sum it up in a pretty straightforward example:
- One coach is 2-0 up second half, turn 16. He decides to score, and wins 3-0. He gets to suffer the last three LoS blocks in the T16 of his opponent before the game is over.
- Another coach, in the same situation, doesn't score. He wins 2-0 and blocks the hell out of his opponent. He trades his last TD for a better chance to avoid casualties, because he doesn't suffer the last three blocks on LoS.

I would find EXTREMELY debatable that the first coach is a better coach than the second one. They both have their reasons to choose one approach or the other, but stating that one choice is better than the other is false.

That may show up over time in the performance of the next matches. The next match is, after all, the reason to avoid those casualties.
JanMattys



Joined: Feb 29, 2004

Posted: Dec 15, 2011 - 17:08

VoodooMike wrote:
JanMattys wrote:
And even if 99% of the coaches decide to score and I am the only one who doesn't, my 2-0 doesn't lose its strategic value. In your system, it would result in my victory being rated "below average", though.

No, but it does mean your performance was below average. If the majority of coaches in your position can score 3 TDs then I'm not sure how scoring less is strategic - the strategy only comes into play when you manage to prevent the other side from scoring, for example, and ultimately that's no different in this system as it involves maintaining the relative TD count.

A coach that manages to prevent the other side scoring, and score more times him or herself, is performing better than you are if you can't manage both at the same time, no?


No.
That's precisely the point I was trying to present.

In my example, both strategies make sense. Both are equally intelligent and you have something to gain (and something to lose) from both. Which one is the "better" one is a matter of taste and personal preferences and priorities of the coach (skilling up players vs playing safer).
Both strategies have merits and both strategies trade something for something else.
Yet, one rates higher than the other, and it shouldn't. Winning 3-0 and risking LoS blocks is NOT objectively better than winning 2-0 and playing safe.

JanMattys



Joined: Feb 29, 2004

Posted: Dec 15, 2011 - 17:09

dode74 wrote:
JanMattys wrote:
To sum it up in a pretty straightforward example:
- One coach is 2-0 up second half, turn 16. He decides to score, and wins 3-0. He gets to suffer the last three LoS blocks in the T16 of his opponent before the game is over.
- Another coach, in the same situation, doesn't score. He wins 2-0 and blocks the hell out of his opponent. He trades his last TD for a better chance to avoid casualties, because he doesn't suffer the last three blocks on LoS.

I would find EXTREMELY debatable that the first coach is a better coach than the second one. They both have their reasons to choose one approach or the other, but stating that one choice is better than the other is false.
That may show up over time in the performance of the next matches. The next match is, after all, the reason to avoid those casualties.


Uhm.
This would be a fair counter-argument in a closed league. In an open format, not so much, I think.

VoodooMike



Joined: Nov 07, 2010

Posted: Dec 15, 2011 - 17:15

Since there's concern over the "win but still be considered below average" issue, I figured I'd pull out the match-ups where it's even possible, and post them so people can get an idea of when that might actually happen:

Code:
Amazon       Goblin        1.79
Amazon       Ogre          1.54
Chaos Dwarf  Goblin        1.24
Chaos Dwarf  Halfling      1.46
Chaos Dwarf  Ogre          1.12
Chaos Pact   Halfling      1.11
Chaos Pact   Ogre          1.16
Dark Elf     Goblin        1.78
Dark Elf     Halfling      1.47
Dark Elf     Ogre          1.95
Dwarf        Goblin        1.73
Dwarf        Halfling      1.40
Dwarf        Ogre          1.01
Elf          Goblin        1.52
Elf          Ogre          2.20
High Elf     Goblin        1.57
High Elf     Halfling      1.48
High Elf     Ogre          1.76
Human        Goblin        1.51
Human        Halfling      1.28
Human        Ogre          1.53
Lizardman    Halfling      1.21
Lizardman    Ogre          1.22
Necromantic  Halfling      1.13
Necromantic  Ogre          1.15
Norse        Goblin        1.46
Norse        Halfling      1.15
Skaven       Goblin        2.16
Skaven       Halfling      1.19
Skaven       Ogre          1.61
Skaven       Vampire       1.08
Slann        Goblin        1.52
Slann        Halfling      1.64
Slann        Ogre          1.65
Undead       Ogre          1.33
Underworld   Ogre          1.10
Vampire      Halfling      1.17
Wood Elf     Goblin        1.40
Wood Elf     Halfling      1.76
Wood Elf     Ogre          1.98


This is based on the average number of TDs scored by the first team above and beyond the number scored by the second team. Any race matches that don't show up on that list will always be considered above 1000 if they win, even by one TD.

Teams from the second column can, in the listed matchups, be considered above average by having the TD difference between them and the other team be less than the listed number, even if they end up losing the match.
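
The comparison described above can be sketched in a few lines of Python. This is only an illustration of the threshold check, not the script's actual calculation: the dictionary copies four rows from the table, the names `EXPECTED_MARGIN` and `above_average` are hypothetical, and returning `None` for unlisted pairings is an assumption made to keep the sketch small.

```python
# Expected TD margin (first-column race minus second-column race),
# copied from four rows of the table above.
EXPECTED_MARGIN = {
    ("Dwarf", "Goblin"): 1.73,
    ("Skaven", "Goblin"): 2.16,
    ("Elf", "Ogre"): 2.20,
    ("Dwarf", "Ogre"): 1.01,
}


def above_average(favoured, underdog, favoured_tds, underdog_tds):
    """Return (favoured_above_avg, underdog_above_avg) for a listed pairing.

    Returns None when the pairing is not in the sample dictionary.
    """
    expected = EXPECTED_MARGIN.get((favoured, underdog))
    if expected is None:
        return None
    margin = favoured_tds - underdog_tds
    # The favoured side must beat the expected margin; the underdog
    # only has to keep the margin below it, even in a loss.
    return margin > expected, margin < expected


# Dwarf beats Goblin 1-0: a win, but below the 1.73 average margin.
print(above_average("Dwarf", "Goblin", 1, 0))  # (False, True)
# Dwarf beats Goblin 2-0: above the average margin.
print(above_average("Dwarf", "Goblin", 2, 0))  # (True, False)
```

Since TD differences are whole numbers, a listed margin between 1 and 2 means a one-TD win falls short of the average while a two-TD win clears it.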
dode74



Joined: Aug 14, 2009

Posted: Dec 15, 2011 - 17:17

JanMattys wrote:
Uhm.
This would be a fair counter-argument in a closed league. In an open format, not so much, I think.

Why not? I don't follow the logic... :?
JanMattys



Joined: Feb 29, 2004

Posted: Dec 15, 2011 - 17:19

VoodooMike, important things first:
1- I appreciate your dedication and effort.
2- I present counter-arguments in the most positive spirit of making your formula better, not just because or to piss you off.
3- I am presenting a totally theoretical example with probably negligible impact in the long run (as I stated), just to point out the flawed reasoning beneath your formula, not because I actually think that numbers would change significantly.

As I wrote, I think that an indicator mixing:
- Td ratios
- Blocks ratios
- Ball possessions
- Turnovers

would be better. Sadly, on Fumbbl we currently have access (if I am not mistaken) to only two of those four pieces of data.

ps: just for the fun of it, I will add that since one of the races in question is Ogres (from your data), it makes DOUBLE sense to avoid LoS blocks :lol: j/k



Last edited by JanMattys on Dec 15, 2011; edited 1 time in total
JanMattys



Joined: Feb 29, 2004

Posted: Dec 15, 2011 - 17:20

dode74 wrote:
JanMattys wrote:
Uhm.
This would be a fair counter-argument in a closed league. In an open format, not so much, I think.
Why not? I don't follow the logic... :?


Because I am an old dude and I still think in [R] terms. :D

This is [B] we are talking about, and your point makes sense. :oops:

ps: to further explain my previous (wrong) point: in an environment where you can choose your opponent, your starting roster makes no difference.
In an environment where you get what you get, like B, and where it's entirely up to the (flawed) mechanics of inducements to fill the gap between you and your opponent, it does.

VoodooMike



Joined: Nov 07, 2010

Posted: Dec 15, 2011 - 17:30

JanMattys wrote:
In my example, both strategies make sense. Both are equally intelligent and you have something to gain (and something to lose) from both. Which one is the "better" one is a matter of taste and personal preferences and priorities of the coach (skilling up players vs playing safer).

It is a very specific example. There's no reason to assume that the coach that opts to score was in serious danger of having his or her players injured - if their strategy was good enough they could ensure that the risk was minimal to none by the time they ran for the TD.

JanMattys wrote:
This would be a fair counter-argument in a closed league. In an open format, not so much, I think.

I think he means that the team that took injuries would perform less well in the next match regardless of who that match happened to be against, and thus would be more likely to get a lower rating for that match to compensate for the higher rating they got when they took the risk.

Indeed, if that is NOT the case then it vindicates their decision to take the risk to score and improve their players, since the only reason to play defensively to prevent potential injuries is you worry they will negatively impact future games.

JanMattys wrote:
Sadly, on Fumbbl we currently have access (if I am not mistaken) to only two of those four pieces of data.

Also EEG readings! The coach that has the lowest intensity of brainwaves but still wins, is obviously the better natural coach... but yeah, we don't even have easy access to most of the other stuff, and no real access to large amounts of past data to create averages from, sadly, or I'd be all over looking at what would happen if they were included.

Of the ones you mention, I'd agree it would be interesting to look into ball possession and turnovers. That said, ball possession isn't necessarily skill related. If you have the ball and can keep the opponent away from you, then sure, you've managed to successfully stall, but maybe you just couldn't figure out how to move forward with it, which is skill failure. Turnovers would be a good number to work with, since they represent failed risks, but I'm betting that value is heavily correlated with ball possession, too.
Hitonagashi



Joined: Apr 09, 2006

Posted: Dec 15, 2011 - 17:32

VM:

To be honest, my problem is that you are using results from a different incentive set.

I play in a league (White Isle League), where TDs are used as a tiebreaker. In those games, there are higher scores.

The problem for me is you are taking the results from a division where nobody is incentivised to win big (your standard as a coach doesn't matter, and long term development is more important than meaningless TDs), and applying a dataset which ranks people on those games in which they won big.

The idea behind this dataset is useful, but the result is inherently meaningless, because in a division where winning big was important, your playstyle would change entirely.

I know I personally have turned down several chances to get a 5-0 in order to guarantee a 2-0 (and that's not as silly as it sounds, if you turn over turn 2 of the second half and then stall for the remainder). If my worth as a coach is purely measured by how many I can beat someone by, my playstyle (and skill selection!) changes drastically.

_________________
http://www.calculateyour.tv - an easy way to work out specific team builds.