...bearing this out, it looks like Lorxus managed to get a perfect score with relatively little actual Data Science just by thinking about what it might mean that including lots of ingredients led to Magical Explosions and including few ingredients led to Inert Glop.
Not quite true! That’s where I started to break through, but after that I noticed the Mutagenic Ooze issue as well. It also took a lot of very careful, graceful use of pivot tables. Gods beyond, that table chugged. (And if I can pull the same Truth from the void with less powerful tools, should that not mark me as more powerous in the Art? :P)
I guess I’m not clear on what “actual Data Science” would involve, if not making hypotheses and then conducting observational-experiments? I figured out the MO mechanic specifically by looking at brews that coded for pairs of potions, for the major example. The only thing that would have changed if I’d known SQL would be speed, I suspect.
...and documented his thought process very well, thank you Lorxus!
Always a pleasure! I had a lot of fun with this one. I was a little annoyed by the undeclared bonus objective—I would have wanted any indication at all in the problem statement that anything was not as it appeared. I did notice the correspondence in (i.a.) the Farsight Potion but in the absence of any reason to suspect that the names were anything but fluff, I abstracted away anything past the ingredients being a set of names. Maybe be minimally more obvious? At any rate I’d be happy to be just as detailed in future, if that’s something you want.
I liked the bonus objective myself, but maybe I’m biased about that...
As someone who is also not a “data scientist” (but just plays one on LessWrong), I also don’t know exactly what actual “data science” is, but I guess it’s likely intended to mean using more advanced techniques?
(And if I can pull the same Truth from the void with less powerful tools, should that not mark me as more powerous in the Art? :P)
Perhaps, but don’t make a virtue of not using the more powerful tools, the objective is to find the truth, not to find it with handicaps...
Speaking of which, one thing that can make things easier is aggregating the data and eliminating information you think is irrelevant. For example, in this case I assumed early on (without actually checking) that timing would likely be irrelevant, so I aggregated the data by ingredient combination. That is, each tried ingredient combination gets only one row, with the counts of the different outcomes listed. You can do this by assigning a unique identifier to each ingredient combination (in this case you can just concatenate over the ingredient list), then counting the results for each identifier. COUNTIFS performs poorly on large data sets, but you can sort by the identifier, then make a column that counts the rows (or the rows with a particular outcome) since the last change in the identifier, and then filter for the last row before each change in the identifier (be wary of off-by-one errors). Then copy the result (values only) to a new sheet.
This also reduces the number of rows, though not enormously in this case.
Of course, in this case, it turns out that timing was relevant, not for outcomes but only for the ingredient selection (so I would have had to reconsider this assumption to figure out the ingredient selection).
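For anyone who would rather do the aggregation outside a spreadsheet, here is a minimal sketch of the same idea in Python. It is not what I actually did (that was all spreadsheet formulas), and the data layout is an assumption: each brew is taken to be a list of ingredient names plus an outcome, and the ingredient names "A", "B", "C" are placeholders rather than anything from the real dataset. Sorting the names before concatenating is also my own addition, so that the same combination listed in a different order still gets the same identifier.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (ingredients used, outcome). "A", "B", "C" are
# placeholder ingredient names, not taken from the actual dataset.
brews = [
    (["A", "B"], "Inert Glop"),
    (["B", "A"], "Inert Glop"),
    (["A", "B"], "Farsight Potion"),
    (["A", "B", "C"], "Mutagenic Ooze"),
]


def combo_key(ingredients):
    """Build an order-independent identifier: sort the names, then concatenate."""
    return "|".join(sorted(ingredients))


def aggregate(rows):
    """Collapse rows to one entry per ingredient combination.

    Returns a dict mapping each identifier to a Counter of outcomes.
    """
    table = defaultdict(Counter)
    for ingredients, outcome in rows:
        table[combo_key(ingredients)][outcome] += 1
    return dict(table)


summary = aggregate(brews)
for key, outcomes in sorted(summary.items()):
    print(key, dict(outcomes))
```

Because the counting happens in a single pass over the rows, there is no need for the sort-and-running-count workaround, and no off-by-one risk at the identifier boundaries. Note that this discards timing entirely, so it carries the same caveat as above.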
Perhaps, but don’t make a virtue of not using the more powerful tools, the objective is to find the truth, not to find it with handicaps...
I’m obviously seeking out more powerful tools, too—I just haven’t got them yet. I don’t think it’s intrinsically good to stick to less powerful tools, but I do think that it’s intrinsically good to be able to fall back to those tools if you can still win.
And when I need to go out and find truth for real, I don’t deny myself tools, and I rarely go it alone. But this is not that.
You don’t need to justify—hail fellow D&Dsci player, I appreciate your competition and detailed writeup of your results, and I hope to see you in the next d&dsci!
I have struck through part of the previous comment, given the edit. I need no longer stand by it as a complaint.