Bardazur



Joined: Mar 29, 2007

Posted: Sep 19, 2011 - 12:22

The main issue is that, unlike chess or go, where each player has almost the same chance of winning, a bloodbowl game may be completely unfair: its outcome depends on the race played as much as (if not more than) on the coach's skill or pure luck.

So the best way to measure a coach's skill is to pick a race and a TV range, and compare this coach's win percentage to the average win percentage for that race and range.

For example, if teams of a given race in the 1000-1250 TV zone win 60% of their games, a coach who wins 80% of the games he plays with such a team is good, and one who wins only 40% is not.

In fact, a ranking for one team is easier to establish than a ranking for a coach, since a coach may do well with one race and poorly with another. For a team, here is what I would do:
1) In each TV range, take the difference between this team's win percentage and the average win percentage of that race in this TV range. That gives you a number Ri between -1 (the team lost all its games while the race wins every game) and 1 (the team wins everything and the race never wins)
2) Weight each TV range by a weight Wi equal to (number of games the team played in this TV range / total number of games the team played)
3) Sum up all the (Ri * Wi); that gives you a ranking R between -1 and 1 for the team
4) If you think that this ranking does not speak for itself, you can apply a linear transformation (e.g. 50 + 50*R, to get a ranking between 0 and 100) or an exponential one (e.g. 50 * exp(R))
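
Steps 1 to 4 could be sketched roughly like this. It is only an illustration: the function names, the dictionary layout and the sample numbers are made up for the example, and win percentages are written as fractions between 0 and 1.

```python
import math


def team_ranking(team_stats, race_avg):
    """Weighted ranking R in [-1, 1] for one team.

    team_stats: {tv_range: (wins, games)} for this team (hypothetical layout).
    race_avg:   {tv_range: average win fraction of the team's race in that range}.
    """
    total_games = sum(games for _, games in team_stats.values())
    if total_games == 0:
        raise ValueError("team has played no games")
    ranking = 0.0
    for tv_range, (wins, games) in team_stats.items():
        if games == 0:
            continue  # a range with no games carries no weight
        r_i = wins / games - race_avg[tv_range]  # step 1: Ri
        w_i = games / total_games                # step 2: Wi
        ranking += r_i * w_i                     # step 3: sum of Ri * Wi
    return ranking


def linear_scale(r):
    """Step 4, linear variant: maps R from [-1, 1] to [0, 100]."""
    return 50 + 50 * r


def exp_scale(r):
    """Step 4, exponential variant."""
    return 50 * math.exp(r)


# Made-up example: 8 wins in 10 games at 1000-1250 TV, 5 in 10 at 1250-1500.
stats = {"1000-1250": (8, 10), "1250-1500": (5, 10)}
averages = {"1000-1250": 0.6, "1250-1500": 0.5}
print(linear_scale(team_ranking(stats, averages)))
```

Because the weights Wi sum to 1, R stays between -1 and 1 whatever the split of games across TV ranges.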
gandresch



Joined: Aug 02, 2003

Posted: Sep 19, 2011 - 12:22

Hi,

good work!
I've done something for groups, which isn't completely ready yet (the layout is about 6 years old, I think, from when I first made it). My work depends on single groups. I recently added the official tournaments, as they include the season, which most of the leagues leave at 0 for each tournament. The performance is something to work on, but for now it's OK. Perhaps you can get some more inspiration from it:
http://ariadne.kicks-ass.org/fumbblStats
(due to the old layout, do not use IE ;) )

gan

PS: The CR works only for teams right now and I don't use a racial factor at all. But if you have enough games to base the stats on, the racial factor can be added very easily.
VoodooMike



Joined: Nov 07, 2010

Posted: Sep 19, 2011 - 12:49

Hitonagashi wrote:
My hypothesis is that clawmbpo can take a poorer coach and give them better results than they would expect with a different race, but in the hands of a stronger coach, it experiences the same success rate.

I don't think you can properly test your hypothesis using the data you have available to you. Since you have no way to determine if any players on a given team have that particular skill combo, much less how many, you'd really only be examining any relationship between certain races and wins at specific TVs.

It has never been denied that certain teams are better at certain TVs - in fact, I believe, the bashy teams have always been better at high TVs, and this is compounded in an open setting where bashy teams suffer less attrition (on average.. mostly because they tend to have higher armor).

Basically what I'm saying is... assuming you find a relationship between the races in question and their victories at higher TVs, will you really be producing any new information? Certainly it won't say much about a given skill combination's contribution since there's no way to examine the skill distribution on teams.

I'm all for statistical analysis... hell, I consider SPSS a mandatory application on every machine I own.. but in this case you need more info.
Hitonagashi



Joined: Apr 09, 2006

Posted: Sep 19, 2011 - 13:31

VoodooMike wrote:
Hitonagashi wrote:
My hypothesis is that clawmbpo can take a poorer coach and give them better results than they would expect with a different race, but in the hands of a stronger coach, it experiences the same success rate.

I don't think you can properly test your hypothesis using the data you have available to you. Since you have no way to determine if any players on a given team have that particular skill combo, much less how many, you'd really only be examining any relationship between certain races and wins at specific TVs.

It has never been denied that certain teams are better at certain TVs - in fact, I believe, the bashy teams have always been better at high TVs, and this is compounded in an open setting where bashy teams suffer less attrition (on average.. mostly because they tend to have higher armor).

Basically what I'm saying is... assuming you find a relationship between the races in question and their victories at higher TVs, will you really be producing any new information? Certainly it won't say much about a given skill combination's contribution since there's no way to examine the skill distribution on teams.

I'm all for statistical analysis... hell, I consider SPSS a mandatory application on every machine I own.. but in this case you need more info.


I see your point...but this is why I didn't mention the combination specifically. I'm aware of how little this proves. What I want to do is to find out first if there *is* a problem at all. If every single test I can devise shows that there is no effective difference between playing as chaos and playing as, say, orcs, then it implies that M access, and hence the skills inside it, are not massively overpowered.

It's impossible with the dataset to prove that a skill combination is overpowered...but it is possible to demonstrate that it cannot be.

If I need to, I can make a reasonable stab. I haven't done this yet, as I want to run preliminary experiments, but I can get past players, and their skills for a team, so while I cannot say "in this game, this team had these skills", I can say "over the lifespan of this team, 30% of their players had clawmbpo and this is the difference in results from this team where 5% of them did".
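
As a rough sketch of that lifespan comparison (the data layout is an assumption, not what the API actually returns; the combo is taken to be Claw, Mighty Blow and Piling On):

```python
# The three skills that make up the combo discussed in this thread.
COMBO = frozenset({"Claw", "Mighty Blow", "Piling On"})


def combo_share(players):
    """Fraction of a team's past and present players who ended up with the full combo.

    players: list of skill collections, one per player (hypothetical layout).
    """
    if not players:
        return 0.0
    with_combo = sum(1 for skills in players if COMBO <= set(skills))
    return with_combo / len(players)


def win_rate(wins, draws, losses):
    """Plain win fraction over a team's lifetime record."""
    games = wins + draws + losses
    return wins / games if games else 0.0


# Made-up team: two of four players finished with all three skills.
roster = [
    {"Claw", "Mighty Blow", "Piling On", "Block"},
    {"Block", "Mighty Blow"},
    {"Claw", "Piling On", "Mighty Blow"},
    set(),
]
print(combo_share(roster), win_rate(12, 3, 5))
```

This only says how many players ever had the combo, not whether they had it in any particular game, which is exactly the limitation mentioned above.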

As far as I'm aware, there's been no work on showing the balance of races in LRB 6, and that's primarily what I want to check. This is also a "fumbbl only" dataset. There are enough differences to "mainline" bloodbowl that the result set won't mean as much as it could (no mercenaries, no goblin weapons, TV matched environment).
garyt1



Joined: Mar 12, 2011

Posted: Sep 19, 2011 - 13:38

If the % win is not affected significantly by CLAWPOMB I still don't think it would show that it is not overpowered. The point is that it spoils games as people don't want their teams trashed so easily. Still it is interesting stuff and may show what you are aiming for.

_________________
“A wise man can learn more from a foolish question than a fool can learn from a wise answer.”
VoodooMike



Joined: Nov 07, 2010

Posted: Sep 19, 2011 - 13:38

Hitonagashi wrote:
I see your point...but this is why I didn't mention the combination specifically.

Your stated hypothesis mentions it specifically.

Hitonagashi wrote:
It's impossible with the dataset to prove that a skill combination is overpowered...but it is possible to demonstrate that it cannot be.

Actually no, it isn't. Because you can't examine the skills involved in the games, you can't draw any conclusion about skill combinations, not even that they're NOT improving win chances, because you can't see if those skills are present when they lose or draw, either.
garyt1



Joined: Mar 12, 2011

Posted: Sep 19, 2011 - 13:41

Hitonagashi wrote:
but I can get past players, and their skills for a team, so while I cannot say "in this game, this team had these skills", I can say "over the lifespan of this team, 30% of their players had clawmbpo

Can you do this automatically? If so it would be great.
Fela



Joined: Dec 27, 2004

Posted: Sep 19, 2011 - 13:46

Hitonagashi wrote:

If I need to, I can make a reasonable stab. I haven't done this yet, as I want to run preliminary experiments, but I can get past players, and their skills for a team, so while I cannot say "in this game, this team had these skills", I can say "over the lifespan of this team, 30% of their players had clawmbpo and this is the difference in results from this team where 5% of them did".


Actually the data is available, it would just be hard to gather.

Match reports contain play activity. The player's bio contains SPP gain, which you could use to deduce IF the skill combo was present in that match.

Writing those scripts to dig into match reports and player bios would be a hell of a lot of work though.
Hitonagashi



Joined: Apr 09, 2006

Posted: Sep 19, 2011 - 14:27

VoodooMike wrote:
Hitonagashi wrote:
I see your point...but this is why I didn't mention the combination specifically.

Your stated hypothesis mentions it specifically.

Hitonagashi wrote:
It's impossible with the dataset to prove that a skill combination is overpowered...but it is possible to demonstrate that it cannot be.

Actually no, it isn't. Because you can't examine the skills involved in the games, you can't draw any conclusion about skill combinations, not even that they're NOT improving win chances, because you can't see if those skills are present when they lose or draw, either.


Touché.

Correction.

In my first post, that's why I didn't mention the combination explicitly :P. I was hoping to avoid bringing it up and instead study theories suggested by the community.

With regard to the skills...I believe the reason we can make meaningful data is that the combination is so prevalent.

If it was a secret sauce that nobody took, sure, you can't make that conclusion. If you can tell me that a particular skill combination is overpowered, and yet that race does not perform well with it, then I'd like to see your definition of overpowered. If the only reason is that the only players who can take it tend to end up mng a lot...that's a balancing factor in itself. I would be willing to bet that 50% of the chaos teams in the box over 175 TR use clawmbpo. And that's being very conservative.

Regardless, your tone and mine are becoming rather combative. I'm not interested in a flame war, I'm interested in finding ways of getting better data.

How can I improve the experiment to satisfy you? What would you like to see done?
Hitonagashi



Joined: Apr 09, 2006

Posted: Sep 19, 2011 - 14:32

Fela wrote:
Hitonagashi wrote:

If I need to, I can make a reasonable stab. I haven't done this yet, as I want to run preliminary experiments, but I can get past players, and their skills for a team, so while I cannot say "in this game, this team had these skills", I can say "over the lifespan of this team, 30% of their players had clawmbpo and this is the difference in results from this team where 5% of them did".


Actually the data is available, it would just be hard to gather.

Match reports contain play activity. The player's bio contains SPP gain, which you could use to deduce IF the skill combo was present in that match.

Writing those scripts to dig into match reports and player bios would be a hell of a lot of work though.


Hmm.

I can get details on each specific match, and details on the past players of a team through the API. I'm not currently storing a date for the matches, so that could prove inconvenient. I could go back and get it though, which would allow me to arrange the games in match order.

The only problem then comes from "grandfathered" teams. I can't tell whether player x was created before the Box turned LRB 6, in which case, all bets are off.

Hmm.
garyt1



Joined: Mar 12, 2011

Posted: Sep 19, 2011 - 15:03

I guess those teams are a rarity?

_________________
“A wise man can learn more from a foolish question than a fool can learn from a wise answer.”
VoodooMike



Joined: Nov 07, 2010

Posted: Sep 19, 2011 - 15:08

Hitonagashi wrote:
With regard to the skills...I believe the reason we can make meaningful data is that the combination is so prevalent.

Do we know it to be that prevalent? That's certainly something that can be looked at using the data you have available to you. Remember that we want to run with actual data, not from-the-hip beliefs. I'm not saying you're wrong at all, mind you, just asking if we're sure that's true.

Hitonagashi wrote:
If you can tell me that a particular skill combination is overpowered, and yet that race does not perform well with it, then I'd like to see your definition of overpowered.

This is a pretty important point - what does "overpowered" mean? Are people saying that if a player has those three specific skills, the team they're on is likely to win against anyone? As far as I can see, all three of those skills are related to whether or not a specific player is likely to cause a casualty. Are we sure we're looking at the right data in any of the proposed models (or pieces of a model)?

Hitonagashi wrote:
Regardless, your tone and mine are becoming rather combative. I'm not interested in a flame war, I'm interested in finding ways of getting better data.

I'm not being combative, I'm simply pointing out problems in the proposed hypothesis and model. At this point I don't think you have the data necessary to examine what you want to examine. That doesn't mean it isn't possible, of course - it means I don't think the data you say we have is usable to test your hypothesis. I don't think you can build a model to do that testing at present.

Hitonagashi wrote:
How can I improve the experiment to satisfy you? What would you like to see done?

Well, nobody has proposed a model yet, so there's nothing to be improved on. What I'm saying is I don't see how you can build a model, using the data available, to test what you want to test.

I think that to use statistics to see if that skill combination has an important effect on the game, we need to build several different models, and we have to decide what an "important effect" really means. What is it that the skill combination is supposedly doing, that we want to test for?

So, my suggestion is to start simple:

1) Examine the relationship between casualties caused by a team during a game, and the team's outcome for the game. This could be done as a relative thing, to see if there's a relationship between the number of casualties caused in excess of the number of casualties received during a game, and the number of touchdowns scored in excess of the number of touchdowns the opponent scored.

2) Examine the relationship between the presence and prevalence of the skill combination on a team (with the potential to have them) and the number of casualties caused during a game.

3) Examine the relative casualty-causing frequency during a match of teams that can have the skill combination, and those that cannot. This may require further examination across team values to see if the teams in question are just bashier in general, combos notwithstanding.
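
For the first of these, a minimal sketch of the "relative" version is a Pearson correlation between casualty differential and touchdown differential per game. The match record keys (`cas_for`, `cas_against`, `td_for`, `td_against`) are hypothetical, and the numbers are invented; a real analysis would also want a significance test, which this does not do.

```python
import math


def differentials(matches):
    """Turn per-game records into (casualty differential, touchdown differential) lists."""
    cas = [m["cas_for"] - m["cas_against"] for m in matches]
    td = [m["td_for"] - m["td_against"] for m in matches]
    return cas, td


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Invented games, one record per team per game.
games = [
    {"cas_for": 3, "cas_against": 1, "td_for": 2, "td_against": 1},
    {"cas_for": 0, "cas_against": 2, "td_for": 1, "td_against": 2},
    {"cas_for": 4, "cas_against": 0, "td_for": 1, "td_against": 1},
    {"cas_for": 1, "cas_against": 1, "td_for": 0, "td_against": 3},
]
cas_diff, td_diff = differentials(games)
print(pearson(cas_diff, td_diff))
```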

Now, if we can't actually dig out the data from each match to apply it reliably, we could maybe estimate it - not with the percentage thing you suggested, but with regression applied to all existing teams of a given race and those teams' current TV. If the result was significant, then we could apply that as an estimated number of players with the combo to build that second model.
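
A bare-bones version of that estimate, assuming a simple straight-line fit of "number of combo players" against current TV for the teams of one race (the function names and the sample numbers are invented, and checking whether the fit is significant is left out):

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_combo_players(tv, slope, intercept):
    """Estimated number of combo players at a given TV, floored at zero."""
    return max(0.0, slope * tv + intercept)


# Invented current TVs and combo-player counts for teams of one race.
tvs = [100, 150, 200]
combo_counts = [0, 1, 2]
slope, intercept = fit_line(tvs, combo_counts)
print(estimate_combo_players(175, slope, intercept))
```

The estimate for a past game would then come from the TV the team had in that game, which is only as good as the assumption that today's teams look like yesterday's teams at the same TV.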

As always, the question is "what relationships can you examine, using the data you presently have".
Fela



Joined: Dec 27, 2004

Posted: Sep 19, 2011 - 15:29

Hitonagashi wrote:
Fela wrote:
Hitonagashi wrote:

If I need to, I can make a reasonable stab. I haven't done this yet, as I want to run preliminary experiments, but I can get past players, and their skills for a team, so while I cannot say "in this game, this team had these skills", I can say "over the lifespan of this team, 30% of their players had clawmbpo and this is the difference in results from this team where 5% of them did".


Actually the data is available, it would just be hard to gather.

Match reports contain play activity. The player's bio contains SPP gain, which you could use to deduce IF the skill combo was present in that match.

Writing those scripts to dig into match reports and player bios would be a hell of a lot of work though.


Hmm.

I can get details on each specific match, and details on the past players of a team through the API. I'm not currently storing a date for the matches, so that could prove inconvenient. I could go back and get it though, which would allow me to arrange the games in match order.

The only problem then comes from "grandfathered" teams. I can't tell whether player x was created before the Box turned LRB 6, in which case, all bets are off.

Hmm.


I was referring to the player's history, where it specifically states in which match he got how many SPP (including the datetime of said match).

Admittedly I don't know if there is an API for that, so worst case you'd have to browse via a script, which would make things a bit tedious.
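
Just to illustrate the idea, here is a sketch of how the per-match SPP history could be turned into "from which match on did this player have the combo". It leans on several assumptions that would need checking: that improvements come at the LRB 6 SPP levels (6, 16, 31, 51, 76, 176), that the player's improvements are known in the order they were taken, and that an improvement earned in one match is usable from the next one. Grandfathered players are not handled.

```python
# Assumed SPP levels at which a player earns an improvement (LRB 6).
LEVEL_THRESHOLDS = (6, 16, 31, 51, 76, 176)
COMBO = frozenset({"Claw", "Mighty Blow", "Piling On"})


def first_match_with_combo(spp_per_match, skills_in_order, starting_skills=()):
    """Index of the first match the player started with the full combo, or None.

    spp_per_match:   SPP earned in each match, in chronological order.
    skills_in_order: improvements in the order they were taken (an assumption).
    starting_skills: skills the player had when hired.
    """
    owned = set(starting_skills)
    taken = 0       # how many improvements have been applied so far
    total_spp = 0
    for index, spp in enumerate(spp_per_match):
        if COMBO <= owned:
            return index
        total_spp += spp
        # Improvements earned by the end of this match apply from the next one.
        levels = sum(1 for threshold in LEVEL_THRESHOLDS if total_spp >= threshold)
        while taken < levels and taken < len(skills_in_order):
            owned.add(skills_in_order[taken])
            taken += 1
    return None


# Invented history: the combo is complete after the third match.
print(first_match_with_combo([6, 10, 15, 2], ["Mighty Blow", "Claw", "Piling On"]))
```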
Hitonagashi



Joined: Apr 09, 2006

Posted: Sep 19, 2011 - 15:40

Now we are getting somewhere!

Without quoting all of that, as there's rather a lot there: I think Bloodbowl is balanced. This makes me a singularly bad person to come up with a definition for OP. Thinking about it more, what I would like to do is to attempt to set up a framework such that, given hypothesis x (say, clawmbpo is too powerful), it's possible to prove or disprove it, or at least have some evidence to back our theories up. The repeated "no I'M right" on various forums, here and on TFF, is what bothers me the most when there's this much easily accessible data :D.

Onto your suggestion, I really like it, but I'm a touch concerned about whether we need to race limit it. As Bardazur said earlier, the outcome of a game depends on the race matchup as well as the casualty matchup. We can't analyse over all teams, because there are around 10x as many chaos/chaos games as there are chaos/elf games, which would skew the data heavily. I'm not a statistician, but it seems to me that output over all teams and games isn't as useful if the racial balance of those teams is skewed. We could take a representative sample of the teams to apply it to? (so say, a random sample of 30 teams with more than 15 games for each race)
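
Something like this is what I have in mind for the sampling, as a sketch only (the `race` and `games` keys are a made-up layout for the team records, and races with fewer than 30 eligible teams simply contribute all of them):

```python
import random


def sample_teams_per_race(teams, per_race=30, min_games=15, seed=None):
    """Random sample of up to per_race teams for each race.

    Only teams with more than min_games games are eligible.
    teams: iterable of dicts with 'race' and 'games' keys (hypothetical layout).
    """
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    eligible = {}
    for team in teams:
        if team["games"] > min_games:
            eligible.setdefault(team["race"], []).append(team)
    return {
        race: rng.sample(pool, min(per_race, len(pool)))
        for race, pool in eligible.items()
    }


# Made-up population: many chaos teams, few elf teams, inexperienced orc teams.
population = (
    [{"race": "Chaos", "games": 20} for _ in range(40)]
    + [{"race": "Elf", "games": 20} for _ in range(5)]
    + [{"race": "Orc", "games": 10} for _ in range(10)]
)
sample = sample_teams_per_race(population, seed=1)
print({race: len(chosen) for race, chosen in sample.items()})
```

Sampling the same number of teams per race evens out the team counts, but the games those teams played are still mostly against the popular races, so the matchup skew would need separate handling.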

_________________
http://www.calculateyour.tv - an easy way to work out specific team builds.
koadah



Joined: Mar 30, 2005

Posted: Sep 19, 2011 - 16:06

Hitonagashi wrote:

How can I improve the experiment to satisfy you? What would you like to see done?


You need to convince Christer to store/output the skills a player had for the match. ;)


Edit:

Hitonagashi wrote:

With regard to the skills...I believe the reason we can make meaningful data is that the combination is so prevalent.


I admit that I do not play many games but I've still never actually played against one of these C-POMB monsters.

_________________
O[L]C 2016 Swiss! - 19th June! ---- All Star Bowl XII - Teams of Stars - Sign up NOW!