For the most part, you seem to spend a lot of time trying to discover whether terms like "unknown probability" and "known probability" make sense. Yet those are language artifacts which, like everything in language, are merely the output of a classification algorithm used to communicate abstractions. Each class primarily represents its dominant modes, but becomes increasingly useless at the margins. As such, you yourself create a false dichotomy by debating whether these terms are useful through showing that they might fail at the border: they do fail there, and they are both useful and not useful at the border, depending on the exact threshold. In fact, you even start by discussing whether the border can be objectively defined: the answer is obviously no, since it is language. You then try to use other words to make the point, repeat your analysis on the new words, and discuss whether the classification systems for those words are perfect (i.e. zero, plausible, realistic, etc.). I think that in reality you are missing the point entirely.
These quotes and definitions use imagery to teach the reader the basic classification system (i.e. the words) by proposing initial but vague boundaries. Then, based on the field and their experience, the reader further refines this classification to better match the group's (i.e. the experts') definition. Reviews of a given set of risks and uncertainties are then about discussing a) whether the different experts are calibrated in terms of classification thresholds; and b) if they feel they are sufficiently calibrated, whether the probabilities and impacts have been properly and sufficiently assessed (here too, these vague words are grounded in a given group's standards).
For example, in software engineering, plans generally include a risks section, which describes various unknowns, their probability, and their impact. Each of those is quantified as (for example) High, Medium, Low, or Unknown. This is simply a double layer of subjective but group-agreed-upon classification, meant to communicate the overall probability that the project will hit the date at the expected cost. It is based on experience (i.e. the internal model of the author). During the review process, other leaders and engineers then comment, based on their experience, on whether a specific risk is properly assessed. These thresholds can be very context-specific (i.e. to a team, a company, or the industry). This is no different in public policy (i.e. the risks and impact of global warming).
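To make the "double layer of classification" concrete, here is a minimal sketch of such a risk register. The risk names, the label-to-score mapping, and the scoring scheme are all invented for illustration; any real team would substitute its own agreed-upon thresholds.

```python
from dataclasses import dataclass

# Group-agreed (and therefore subjective) mapping from labels to scores.
# Treating "Unknown" as Medium is itself a negotiable convention.
LEVELS = {"Low": 1, "Medium": 2, "High": 3, "Unknown": 2}

@dataclass
class Risk:
    description: str
    probability: str  # "Low" | "Medium" | "High" | "Unknown"
    impact: str       # same labels

    def score(self) -> int:
        # The double layer: a probability label times an impact label.
        return LEVELS[self.probability] * LEVELS[self.impact]

# Hypothetical risks for a hypothetical project plan.
risks = [
    Risk("Key dependency slips", "Medium", "High"),
    Risk("Team unfamiliar with new framework", "High", "Medium"),
    Risk("Vendor API changes mid-project", "Unknown", "Low"),
]

# Aggregate into an overall signal that reviewers can then argue about.
total = sum(r.score() for r in risks)
print(total)  # 14 with the thresholds above
```

The numbers themselves mean nothing outside the group that agreed on them; the review process described above is exactly a debate about whether these labels and thresholds are calibrated.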
In other words, I think that you are trying to analyze the problem objectively by making each assertion absolute (i.e. a probability is either known or unknown, etc.), while in fact the problem is one of pure communication rather than one of objective truth or logic. So you get caught in a rabbit hole, essentially re-discovering the limitations of language and classification systems rather than discussing the problem at hand. And the initial problem statement in your analysis (i.e. certainty as risk + uncertainty) is arbitrary; your logic could have been applied to any definition or concept.
Whether the idea of uncertainty + risk is the proper tool can essentially only be analyzed empirically: by comparing it, for example, to another method used in a given field, and evaluating whether method A or B improves the ability of planners to predict dates and costs (in software engineering, for example).
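Such an empirical comparison could be as simple as backtesting both methods against completed projects. The sketch below uses entirely invented numbers and a hypothetical pair of methods; the point is only the shape of the evaluation, not the data.

```python
# Each entry: (method A's estimate, method B's estimate, actual days).
# All figures are fabricated for illustration.
history = [
    (30, 28, 35),
    (60, 55, 50),
    (20, 25, 24),
    (45, 50, 52),
]

def mean_abs_error(estimates, actuals):
    """Average absolute gap between estimated and actual durations."""
    return sum(abs(e - a) for e, a in zip(estimates, actuals)) / len(actuals)

a_est = [h[0] for h in history]
b_est = [h[1] for h in history]
actual = [h[2] for h in history]

print(mean_abs_error(a_est, actual))  # 6.5
print(mean_abs_error(b_est, actual))  # 3.75
```

On this toy data, method B predicts dates better, which is the only kind of verdict the question admits: not "which definition is true", but "which tool predicts better in this field".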
In other words, I think it's more useful to think of those definitions as an algorithm (perhaps ML): certainty ~ f(risk, uncertainty), with the definitions provided for the driving factors as initial values. Users can then refine their thresholds over time to improve the model's predictive ability, and also as a function of the class of problems (i.e. climate vs software).
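The "definitions as an algorithm" framing can be sketched directly. The functional form and weights below are arbitrary initial values, standing in for the textbook definitions; the refinement step is where a field's experience comes in.

```python
def certainty(risk: float, uncertainty: float, weights=(0.6, 0.4)) -> float:
    """Toy model of certainty ~ f(risk, uncertainty), clamped to [0, 1].

    Higher risk or uncertainty lowers certainty. The default weights
    play the role of the 'initial values' set by the definitions.
    """
    w_r, w_u = weights
    return max(0.0, 1.0 - (w_r * risk + w_u * uncertainty))

# Start from the initial (definitional) weights...
baseline = certainty(0.5, 0.5)

# ...then refine the weights for a given class of problems
# (e.g. climate vs software), based on observed predictive performance.
refined = certainty(0.5, 0.5, weights=(0.8, 0.7))

print(baseline, refined)
```

The refinement loop, however it is done (tracking prediction error, group review, or an actual fitted model), is the calibration process the rest of this comment describes.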