A few thoughts on this:
The person proposing such an experiment is usually an advocate of the change rather than a curious seeker of truth. Consequently, the lack of evaluation is a feature, not a bug.
It is very hard to get good data out of any experiment that has a sample size of one.
It is often very hard to measure the thing you actually care about, so whatever measurement does take place tends to be of something that is easy to measure rather than something that is useful to measure.
Even where it is possible to measure something useful, it is often impossible to get everyone involved to agree on which of the useful metrics to use.
To some degree these problems can be dealt with honestly by taking an A/B approach. For example:
You introduce an open office plan on the first floor, and keep the second floor as it is.
You introduce new forum rules for the month of December, and revert in January (and see whether the forum members demand the return of the new rules).
You spin off a new forum, my-forum-expanded, that admits the new users, and see whether the old users naturally migrate to the broader discussion. If it fails, the new forum will die off on its own.
However, as you say in your comment, such changes are often introduced more as the implementation of a belief system about what is “right” than as a curious investigation into which option is more effective. It often takes utterly disastrous results before reverting the change becomes likely.