Notes on my performance:
Well, I feel pretty dumb (which is the feeling of becoming smarter). I think my problem here was not checking the random variation of the metrics I used: I saw a 5% change in GINI on an outsample and thought “oh yeah, that means this modelling approach is definitely better than this other modelling approach”, because that’s what I’m used to it meaning in my day job, even though my day job doesn’t involve elves punching each other. (Or, at least, that’s my best post hoc explanation for how I kept failing to notice simon’s better model was indeed better; it could also have been down to an unsquished bug in my code, and/or LightGBM not living up to the hype.)
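(For anyone who wants to dodge the same trap, here is a minimal sketch of the check I skipped, assuming a standard Python setup with scikit-learn alongside LightGBM. The function and variable names are illustrative rather than my actual pipeline; the point is just to bootstrap the out-of-sample rows and see whether the Gini gap between two models is bigger than its own noise.)

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def gini(y_true, y_score):
    # Gini as commonly used for binary classifiers: 2 * AUC - 1.
    return 2 * roc_auc_score(y_true, y_score) - 1


def bootstrap_gini_gap(y_true, preds_a, preds_b, n_boot=1000, seed=0):
    """Resample the hold-out rows to see how much of the observed Gini gap
    between model A and model B survives sampling noise."""
    y_true = np.asarray(y_true)
    preds_a = np.asarray(preds_a)
    preds_b = np.asarray(preds_b)
    rng = np.random.default_rng(seed)
    gaps = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), size=len(y_true))
        if len(np.unique(y_true[idx])) < 2:
            continue  # degenerate resample: only one class present
        gaps.append(gini(y_true[idx], preds_a[idx]) - gini(y_true[idx], preds_b[idx]))
    gaps = np.array(gaps)
    return gaps.mean(), np.percentile(gaps, [2.5, 97.5])
```

If the resulting interval on the gap straddles zero, a “5% better GINI” on a single hold-out split isn’t telling you much.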
ETA: I have finally tracked down the trivial coding error that ended up distorting my model: I accidentally used kRace in a few places where I should have used kClass while calculating simon’s values for Speed and Strength.
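(To make the failure mode concrete: the snippet below is a purely hypothetical reconstruction of this kind of slip. The stat tables, column names, and helper are invented for illustration and are not simon’s actual model or my actual code; the nasty part is that a class-keyed lookup applied to the race column still runs fine, it just fills the column with quietly wrong values.)

```python
import pandas as pd

# Invented stat tables purely for illustration; the real values are not these.
CLASS_STRENGTH = {"Warrior": 3, "Ranger": 2, "Monk": 1}
RACE_SPEED = {"Elf": 3, "Human": 2, "Dwarf": 1}

kRace, kClass = "Race", "Class"  # column-name constants (hypothetical)


def add_stats(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # The bug pattern: a class-keyed lookup applied to the race column.
    # It runs without complaint and just fills the column with mismatches/NaNs:
    # out["Strength"] = out[kRace].map(CLASS_STRENGTH)
    out["Strength"] = out[kClass].map(CLASS_STRENGTH)  # what was intended
    out["Speed"] = out[kRace].map(RACE_SPEED)
    return out
```

pandas maps missing keys to NaN and LightGBM will happily train on whatever it’s given, so nothing ever crashes; the model is just silently worse.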
Notes on the scenario:
I thought the bonus objective was executed very well: you told us there was Something Else To Look Out For, and provided just enough information that players could feel confident in their answers after figuring things out. I also really liked the writing. Regarding the actual challenge part of the challenge . . . I’m recusing myself from having an opinion until I figure out how I could have gotten it right; all I can tell you for sure is this wasn’t below 4⁄5 Difficulty. (Making all features’ effects conditional on all other features’ effects tends to make both Analytic and ML solutions much trickier.)
ETA: I now have an opinion, and my opinion is that it’s good. The simple-in-hindsight underlying mechanics were converted seamlessly into complex and hard-but-fair-to-detangle feature effects; the flavortext managed to stay relevant without dominating the data. This scenario also fits in neatly alongside earlier entries with superficially similar premises: we’ve had “counters matter” games, “archetypes matter” games, and now a “feature engineering matters” game.
I have exactly one criticism, which is that it’s a bit puzzlier than I’d have liked. Players get best results by psychoanalyzing the GM and exploiting symmetries in the dataset, even though these aren’t skills which transfer to most real-world problems, and the real-world problems they do transfer to don’t look like “who would win a fight?”; this could have been addressed by having class and race effects be slightly more arbitrary and less consistent, instead of having uniform +Strength / -Speed gaps for each step. However, my complaint is moderated by the facts that:
- This is an isekai-world; simplified mechanics and uncannily well-balanced class systems come with the territory. (I thought the lack of magic-users was a tell for “this one will be realistic-ish”, but that’s on me tbh.)
- Making the generation function any more complicated would have made it (marginally but nontrivially) less elegant and harder to explain.
- I might just be being a sore loser (well, only-barely-winner) here.
- Puzzles are fun!
> ETA: I have finally tracked down the trivial coding error that ended up distorting my model: I accidentally used kRace in a few places where I should have used kClass while calculating simon’s values for Speed and Strength.
Thanks for looking into that: I spent most of the week being very confused about what was happening there but not able to say anything.