To assess the applicability and quality of the entropy-based tuning scheme, a detailed test was carried out in spring 2015 at the University of Music Würzburg in cooperation with Prof. Andreas C. Lehmann, master piano builder Burkard Olbrich, and Michael Kohl. To this end, two structurally identical Steinway C grands were compared, one professionally tuned by ear and the other tuned according to the EPT. Twenty-eight pianists played and compared the two instruments in a double-blind test, evaluating them in a questionnaire.
The participants fall roughly into two groups. The first group of 20 represents the semi-professional sector, including piano students at the University of Music and serious amateurs. The second group of eight consists of professional pianists with many years of experience, including professors of piano at the University of Music.
Because of the small number of participants, statistical conclusions are limited. Nevertheless, the test led to a clear overall picture that can be summarized as follows:
Pianists belonging to the semi-professional group do not show a clear preference for either of the two grands.
Pianists with a long professional experience show a statistically significant preference for the aurally tuned grand. Moreover, that grand is perceived as more harmonious and better in tune, exhibiting fewer beats than the electronically tuned instrument.
In conclusion, it seems that the current version of the EPT generates tunings that can be considered acceptable in a semi-professional context. On the other hand, the EPT cannot compete with high-quality aural tunings at a professional level. However, given that the EPT tunes by stochastically minimizing a very simple one-line formula, this is not too surprising. What is surprising, though, is that the entropy-based method seems to produce acceptable results even at a semi-professional level.
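The "one-line formula" referred to here is presumably the Shannon entropy of the combined, normalized power spectrum of all strings, which the tuner minimizes numerically; that is the method described in Hinrichsen's paper on entropy-based tuning, though the quoted passage does not spell it out. Below is a rough illustrative sketch of that idea, using a toy partial model and made-up spectral parameters rather than anything taken from the actual tuner:

```python
# Illustrative sketch of entropy-based tuning (not the EPT's actual implementation).
# Assumption: the "one-line formula" is the Shannon entropy H = -sum_k p_k log p_k
# of the normalized, combined power spectrum of all strings; the tuner then searches
# for the tuning that minimizes H, so that partials of different notes line up.

import numpy as np

def shannon_entropy(power_spectrum):
    """Shannon entropy of a power spectrum, treated as a probability distribution."""
    p = power_spectrum / power_spectrum.sum()
    p = p[p > 0]                      # avoid log(0)
    return -np.sum(p * np.log(p))

def combined_spectrum(fundamentals, n_partials=8, inharmonicity=4e-4, width=0.5):
    """Toy model: each note contributes partials f_n = n*f0*sqrt(1 + B*n^2),
    rendered as narrow Gaussian peaks on a common frequency grid."""
    bins = np.linspace(20.0, 5000.0, 20000)
    spectrum = np.zeros_like(bins)
    for f0 in fundamentals:
        for n in range(1, n_partials + 1):
            fn = n * f0 * np.sqrt(1.0 + inharmonicity * n ** 2)
            spectrum += (1.0 / n) * np.exp(-0.5 * ((bins - fn) / width) ** 2)
    return spectrum

# Equal temperament around A4 = 440 Hz as a starting point for one octave (C4..B4).
notes = 440.0 * 2.0 ** (np.arange(-9, 3) / 12.0)
H = shannon_entropy(combined_spectrum(notes))
print(f"entropy of the combined spectrum: {H:.4f}")
# The EPT then perturbs the tuning of individual notes (e.g. by a Monte Carlo
# search) and keeps changes that lower this entropy.
```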
Have you (or has anyone) ever done double-blind listening tests to determine whether in fact anyone can tell the difference in such cases?
Nisan’s comment upthread links to one such double-blind test, text reproduced here to save people the effort of clicking:
I wish this linked to a more substantive writeup of the test.
Thanks!
I agree that a link to a more substantive writeup would be very good… it’s hard to know what to make of the claim that “Pianists with a long professional experience show a statistically significant preference for the aurally tuned grand”, given that there were only 8 such pianists and 2 pianos (one tuned one way, one tuned the other way).
… also, this information comes to us from the website of this “entropy piano tuner”, which seems… well, I’d like to see another source, at least.
(Apparently, the creators of this “EPT” are themselves affiliated with the physics department of the University of Würzburg, which certainly explains how/why they got the University of Music Würzburg involved in this test.)
To reach statistical significance, they must have tested each of the 8 pianists more than once.
If there was a consensus among the 8 as to which tuning is better, that would be significant, right? Since the chance of that is 1⁄128 if they can’t tell the difference. You can even get p < 0.05 with one dissenter if you use a one-tailed test (which is maybe dubious). Of course we don’t know what the data look like, so I’m just being pedantic here.
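For what it's worth, the binomial arithmetic in that comment checks out; here is a quick sanity check of those numbers (a standard binomial calculation under the null hypothesis, using no data from the actual test):

```python
# Significance arithmetic for 8 pianists choosing between 2 pianos.
# Null hypothesis: each pianist independently prefers either piano with probability 1/2.

from math import comb

n = 8

# Probability of unanimous agreement on *either* piano under the null:
p_unanimous = 2 * (0.5 ** n)
print(p_unanimous)           # 0.0078125 = 1/128

# One-tailed p-value for "at least 7 of 8 prefer the aurally tuned piano":
p_one_dissenter = sum(comb(n, k) for k in (7, 8)) * (0.5 ** n)
print(p_one_dissenter)       # 0.03515625 < 0.05

# Two-tailed version (at least 7 agree, in either direction):
print(2 * p_one_dissenter)   # 0.0703125 > 0.05, which is why the one-tailed choice matters
```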