Pattern comments on [S] D&D.Sci: All the D8a. Allllllll of it.

Pattern Feb 16, 2023, 9:38 PM
2 points
0
What are people using to load and analyze the data?
- abstractapplic Feb 17, 2023, 2:37 AM
  4 points
  0
  I used the python package Pandas.
  (I also tried Excel, but the dataset was too large to load everything in. In retrospect, I realize I could have just loaded in the first million rows, about ²⁄₃ of the dataset and more than enough to get statistically significant results from, and analyzed that, possibly keeping the remaining ~400k rows as a testing set.)
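The head/holdout split described above can be sketched in Pandas roughly as follows. This is only an illustration, not the code anyone in the thread actually ran: the function name `split_head_tail` and the file name in the usage note are hypothetical, and it assumes the file is a plain CSV with a single header line.

```python
import pandas as pd


def split_head_tail(csv_path, n_head=1_000_000):
    """Load the first n_head data rows for analysis and the rest as a holdout set.

    Assumes a CSV with exactly one header line (an assumption, not something
    stated in the thread).
    """
    # nrows counts data rows only; the header line is not included in the count.
    head = pd.read_csv(csv_path, nrows=n_head)
    # skiprows takes 0-based line numbers. Line 0 is the header, so skipping
    # lines 1..n_head keeps the header and drops the rows already in `head`.
    tail = pd.read_csv(csv_path, skiprows=range(1, n_head + 1))
    return head, tail
```

Usage would look something like `train, test = split_head_tail("dataset.csv")`, with the file name being a placeholder. Splitting by position like this is only a fair train/test split if the row order carries no information; if the rows might be ordered in some meaningful way, a random split would be the safer choice.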
- simon Feb 17, 2023, 4:25 AM
  1 point
  0
  I started out with Excel, but it could only load, as abstractapplic noted, about ²⁄₃ of the dataset. I considered using just that, or splitting the data, but then decided that since I had been thinking of trying out data analysis in Haskell, I would abandon Excel and try out Haskell.
  After various hangups, including most perniciously the stubborn refusal of the Parsec library to modify its operation to conform to my mental model of how it works, I still haven’t actually loaded the data in my program in a usable form. But I’m hoping I’ll manage it soon. And then I have to figure out how to actually process and get data out in a usable form...
  - aphyer Feb 17, 2023, 2:13 PM
    2 points
    0
    Hm. I thought the large dataset would help analysis, and would be pretty easy to truncate if desired, but it seems that wasn’t as obvious as I hoped. If people are having trouble with it, I’ve added a smaller version by generating the first 200k rows of the large dataset, see here.
    This cuts down to a bit under ¹⁄₆ the size of the main dataset, which will make it harder to identify effects, but if you can’t use the main dataset I assume this is better than nothing.
    If anyone wants extra time, let me know.
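For anyone who would rather truncate the main dataset locally than download the smaller file, a minimal sketch using only the Python standard library might look like this. The function name `truncate_csv` and the file names in the usage note are made up for illustration, and the sketch assumes a CSV with one header line.

```python
import csv
from itertools import islice


def truncate_csv(src_path, dst_path, n_rows=200_000):
    """Copy the header plus the first n_rows data rows of src_path to dst_path.

    Streams row by row, so the full file never has to fit in memory.
    Returns the number of data rows written.
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:
            return 0  # empty input: nothing to copy
        writer.writerow(header)
        written = 0
        # islice stops after n_rows rows, or earlier if the file is shorter.
        for row in islice(reader, n_rows):
            writer.writerow(row)
            written += 1
        return written
```

Something like `truncate_csv("full.csv", "small.csv")` (placeholder file names) would then produce a file small enough for a spreadsheet program to open.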
    - simon Feb 20, 2023, 11:25 AM
      1 point
      0
      I am now requesting extra time. I’ve loaded the data using my Haskell program (fixed the parsing late last night), and used it to check:
      basic stats for number of players, aspects, and classes, including individual player aspect/class combos. These all seem pretty evenly distributed, and none of them seem to affect winrate very much except number of players, and even that effect is not that big.
      But I still need to:
      check for interactions, and especially look for symmetries in those interactions. Relatively constant winrates overall suggest symmetric interactions unless the effects are weak.
      edited to add:
      No wait, the variations in winrate for individual player aspect/class combos don’t look that small at all. I noticed this shortly after making the above comment, but didn’t want to make the edit until I had got the Haskell program to calculate the p-values, even though the variations were obviously too big to be random if the variations in total numbers for each combo were assumed to be random. The variations in winrates of classes and aspects, while much smaller, are still strongly statistically significant in some cases (if I got the program to do the right math).
      Since I was busy with that I haven’t gotten around to looking at correlations between different players in the same team yet. There definitely do seem to be patterns in which classes go with which aspects for individual player aspect/class combos, though.
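One way to do the kind of per-combo significance check described above is a two-sided z-test of each group's winrate against the overall winrate, using the normal approximation to the binomial. The sketch below is in Python rather than Haskell and is not necessarily the math simon's program does; the function name `winrate_significance` and the `(group, won)` input format are assumptions made for illustration.

```python
from collections import defaultdict
from math import erfc, sqrt


def winrate_significance(games):
    """Per-group winrate with a two-sided p-value against the overall winrate.

    games: iterable of (group, won) pairs, e.g. group = (aspect, class_name)
    and won = True/False. This input format is hypothetical.
    Returns {group: (games_played, winrate, p_value)}.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [games played, games won]
    for group, won in games:
        counts[group][0] += 1
        counts[group][1] += int(bool(won))

    total_n = sum(n for n, _ in counts.values())
    total_k = sum(k for _, k in counts.values())
    if total_n == 0:
        return {}
    p0 = total_k / total_n  # overall winrate, used as the null hypothesis

    result = {}
    for group, (n, k) in counts.items():
        sd = sqrt(n * p0 * (1 - p0))  # std. dev. of the win count under the null
        z = (k - n * p0) / sd if sd > 0 else 0.0
        # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
        result[group] = (n, k / n, erfc(abs(z) / sqrt(2)))
    return result
```

Two caveats on this approach: each group is itself part of the overall winrate it is compared against, and checking many combos at once means some small p-values will show up by chance, so a correction for multiple comparisons would be needed before reading much into borderline cases.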
      - aphyer Feb 20, 2023, 12:49 PM
        3 points
        0
        Understood, no worries! I’ll aim to post the solution on Friday unless I hear further—if you want another weekend I could instead do next Monday.
        - simon Feb 24, 2023, 9:51 AM
          1 point
          0
          And looks like I could use the weekend as well, if that’s OK. Though, if other players object, I do feel like I am abusing this a bit, since the ratio of time spent on “data analysis” to time spent on “learning Haskell” has been low.
          - aphyer Feb 24, 2023, 12:52 PM
            3 points
            0
            Fine with me
          - abstractapplic Feb 24, 2023, 11:17 AM
            3 points
            0
            If it helps, I for one am completely okay with you taking the weekend.
        - simon Feb 20, 2023, 7:16 PM
          1 point
          0
          Thanks!
    - simon Feb 17, 2023, 6:13 PM
      1 point
      0
      Thanks aphyer, I might end up requesting extra time, though I don’t need the 200k row dataset; if I wanted to I could just accept what Excel (or LibreOffice/OpenOffice Calc) truncates it to.