This seems like a natural fit for D&D.Sci games. All the ones I made are public domain, so you can use them freely (and I bet the other people who made some would give you permission if you asked them nicely), they’ve been publicly played by clever humans with a variety of skill levels and associated outcomes, and they’re obscure enough that I doubt an LLM would have memorized the solutions (and if not you could tweak the names and data-generation hyperparameters to flatfoot them).
. . . I happen to have a completed-but-unreleased D&D.Sci game, which I was planning to put on LW early next month, after everyone got back from their holidays. Would it be helpful if I sent it to you and delayed the release until Feb, so you and yours could let LLMs try it first?
Interesting!
How much would we have to pay you to (a) put it into the task format and document it etc as described above, and (b) not publish it anywhere it might make it into training data?
For the unreleased challenge, b) isn’t for sale: making something intended to (eventually) be played by humans on LW and then using it solely as LLM-fodder would just be too sad. And I’m guessing you wouldn’t want a) without b); if so, so much for that.
. . . if the “it must never be released to the public internet” constraint really is that stringent, I might be better advised to make D&D.Sci-style puzzles specifically for your purposes. The following questions then become relevant:
- How closely am I allowed to copy existing work? (This gets easier the more I can base it on something I’ve already done.)
- How many challenges are you likely to want, and how similar can they be to each other? (Half the difficulty on my end would be getting used to the requirements, format etc; I’d be more inclined to try this if I knew I could get paid for many challenges built along similar lines.)
- Is there a deadline? (When are you likely to no longer need challenges like this?) (Conversely, would I get anything extra for delivering a challenge within the next week or so?)
Even if it has already been published, we’re still interested, especially in ones that were only published fairly recently, and/or only have the description of the puzzle rather than the walkthrough online, and/or have only a few copies of the solutions online rather than e.g. 20 public repos with different people’s solutions.
I think we’d be super interested in you making custom ones! In terms of similarity level, I think it would be something like “it’s not way easier for a human to solve it given solutions to similar things they can find online”.
I imagine we’d be interested in at least 10, as long as they don’t all have the same trick or something, and maybe more like 50 if they’re pretty diverse? (But I think we’d be at more like $1000 per marginal task at those sorts of numbers.)
I don’t expect there to be a hard deadline; I expect we’ll still want more of these for the next year or two at least. Sooner is better; next week or so would be awesome.
To be clear, with (b) you could still have humans play it; you’d just have to put it up in a way where it won’t get scraped (e.g. you email it to people after they fill in an interest form, or something like that).