Obviously, you did it somewhere, and now that I have browsed a bit more through the mass of very good documents you produced, I can point to:
http://electology.github.io/vse-sim/VSE/ for the code repository, with a long explanation, of a statistical test of many methods. That means we have the actual code to understand:
the model used (for voters' preferences and strategies, extremely interesting!)
the actual algorithmic definition of all the methods
a nice VSE graph for each method :)
http://electology.github.io/vse-sim/VSEbasic/ which is a nice summary of the main methods.
Also, I wanted to point out that there was a big real-life test of a variant of score voting (with 6 scores, from “reject” to “very good”) before the French presidential election in 2017: https://articles.laprimaire.org/l%C3%A9lection-pr%C3%A9sidentielle-au-jugement-majoritaire-les-r%C3%A9sultats-373e089315a4
(sorry, it is in French).
The vote was a little complex, with a first round of 16 candidates and more than 10k voters. Each voter was presented with a random set of 5 candidates and asked to evaluate each candidate's project on ~10 aspects (your clusters), each time on the 6-level scale (see https://articles.laprimaire.org/r%C3%A9sultats-du-1er-tour-de-laprimaire-org-c8fe612b64cb). Smart contracts on Ethereum were used to register the votes.
Then a second round took the 5 best candidates, and more than 30k voters ranked them (https://articles.laprimaire.org/r%C3%A9sultats-du-2nd-tour-de-laprimaire-org-2d61b2ad1394).
It was a real pleasure to participate in it, because you really try to assess each candidate's project, and you don’t try to be strategic.
But it does not seem possible in real-life major elections (like the French or US presidential elections), because it requires quite a lot of time and willingness from voters.
By the way, just to be sure I understood correctly: 3-2-1 is a summable voting method and so is not subject to that risk?
If so, it definitely seems to be the best voting method available.
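To check my own understanding of "summable", here is a minimal Python sketch, not taken from the vse-sim code. It assumes the basic 3-2-1 rules as I understand them (three semifinalists with the most "good" ratings, two finalists with the fewest "bad" ratings among those, winner preferred on more ballots) and ignores any extra conditions the full method may have. The names `precinct_tally`, `merge` and `winner_321`, the tie-breaking by list order, and the ballots are all hypothetical. The point is only that each precinct can publish fixed-size counts that add up, without sharing individual ballots.

```python
from itertools import permutations

GOOD, OK, BAD = 2, 1, 0  # hypothetical numeric encoding of the three ratings


def precinct_tally(ballots, candidates):
    """Reduce one precinct's ballots to counts whose size depends only on
    the number of candidates, not on the number of voters."""
    good = {c: 0 for c in candidates}
    bad = {c: 0 for c in candidates}
    prefers = {pair: 0 for pair in permutations(candidates, 2)}
    for ballot in ballots:
        for c in candidates:
            if ballot[c] == GOOD:
                good[c] += 1
            elif ballot[c] == BAD:
                bad[c] += 1
        for a, b in permutations(candidates, 2):
            if ballot[a] > ballot[b]:
                prefers[(a, b)] += 1
    return good, bad, prefers


def merge(t1, t2):
    """Add two tallies component by component."""
    return tuple({k: d1[k] + d2[k] for k in d1} for d1, d2 in zip(t1, t2))


def winner_321(tally, candidates):
    """Assumed basic 3-2-1 steps; ties fall back to list order (an assumption)."""
    good, bad, prefers = tally
    semifinalists = sorted(candidates, key=lambda c: -good[c])[:3]
    finalists = sorted(semifinalists, key=lambda c: bad[c])[:2]
    a, b = finalists
    return a if prefers[(a, b)] >= prefers[(b, a)] else b


candidates = ["A", "B", "C", "D"]
precinct_1 = [
    {"A": GOOD, "B": OK, "C": BAD, "D": BAD},
    {"A": GOOD, "B": OK, "C": BAD, "D": OK},
    {"A": BAD, "B": GOOD, "C": OK, "D": BAD},
]
precinct_2 = [
    {"A": BAD, "B": OK, "C": GOOD, "D": BAD},
    {"A": GOOD, "B": BAD, "C": OK, "D": BAD},
]

merged = merge(precinct_tally(precinct_1, candidates),
               precinct_tally(precinct_2, candidates))
print(winner_321(merged, candidates))
```

If this reading is right, the merged tally is identical to the tally of all ballots counted in one place, and its size grows with the square of the number of candidates (because of the pairwise counts) rather than with the number of voters.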