merging (either pair of) the screening nodes into one node
Then the network does not encode the conditional independence between the two variables that you merged.
The task you have to do when making predictions is marginalization: in order to computer P(Rain|WetGrass), you need to compute the sum of P(Rain|WetGrass, X,Y,Z) for all possible values of the variables X, Y, Z that you didn’t observe. Here it is very helpful to have the distribution factored into a tree, since that can make it feasible to do variable elimination (or related algorithms like belief propagation). But the directions on the edges in the tree don’t matter, you can start at any leaf node and work across.
Then the network does not encode the conditional independence between the two variables that you merged.
The task you have to do when making predictions is marginalization: in order to computer P(Rain|WetGrass), you need to compute the sum of P(Rain|WetGrass, X,Y,Z) for all possible values of the variables X, Y, Z that you didn’t observe. Here it is very helpful to have the distribution factored into a tree, since that can make it feasible to do variable elimination (or related algorithms like belief propagation). But the directions on the edges in the tree don’t matter, you can start at any leaf node and work across.