I don’t see how someone could, in the abstract, determine a function or something that would tell someone the minimal set of variables to put in a DAG.
That is not the problem that this paper tries to solve. The paper assumes you already know the graph and are trying to find a sufficient set of variables to condition on to obtain d-separation.
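To make the "known graph, find a conditioning set" problem concrete, here is a minimal sketch. It is not the paper's algorithm, just a brute-force illustration under my own assumptions: the DAG is given as a dict mapping each node to its set of parents, d-separation is tested with the standard ancestral moral graph criterion, and the function names (`d_separated`, `smallest_separators`) and the toy graph are hypothetical.

```python
from itertools import combinations


def ancestors(dag, nodes):
    """Return `nodes` together with all of their ancestors.

    `dag` maps every node to the set of its parents (an assumed encoding).
    """
    seen, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(dag.get(node, ()))
    return seen


def d_separated(dag, x, y, z):
    """True if x and y are d-separated given the set z in the known DAG."""
    z = set(z)
    # 1. Restrict to x, y, z and their ancestors.
    keep = ancestors(dag, {x, y} | z)
    # 2. Moralize: link each child to its parents, "marry" co-parents,
    #    and treat every edge as undirected.
    adj = {node: set() for node in keep}
    for child in keep:
        parents = list(dag.get(child, ()))
        for p in parents:
            adj[p].add(child)
            adj[child].add(p)
        for p, q in combinations(parents, 2):
            adj[p].add(q)
            adj[q].add(p)
    # 3. Delete z and check whether y is still reachable from x.
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adj[node] - z)
    return True


def smallest_separators(dag, x, y):
    """Brute force: all conditioning sets of smallest size that d-separate x and y."""
    candidates = sorted(set(dag) - {x, y})
    for size in range(len(candidates) + 1):
        found = [set(c) for c in combinations(candidates, size)
                 if d_separated(dag, x, y, c)]
        if found:
            return found
    return []  # no subset of the observed variables separates x and y


# Toy graph (hypothetical): U is a common cause of X and Y, C is a common effect.
toy_dag = {"U": set(), "X": {"U"}, "Y": {"U"}, "C": {"X", "Y"}}
print(smallest_separators(toy_dag, "X", "Y"))
```

In this toy graph, conditioning on the common cause `U` separates `X` and `Y`, while also conditioning on the collider `C` reopens a path between them. Note that everything here takes the graph as an input; nothing in it tells you whether the graph itself is missing a variable.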
To determine the minimal set of variables to include in the graph, you generally need subject matter expertise, i.e. external causal knowledge. Essentially, you need to be able to claim that no variable outside the graph is a common cause of two variables that are in the graph. (With a faithfulness assumption, you may also be able to remove certain variables based on the data.)