A theory appears beautiful or elegant [..] when it’s simple, [..] when it can be expressed very concisely. – Murray Gell-Mann
The question of whether one should use beauty as a heuristic in physics is a difficult problem with deep philosophical roots. It is hard to find appropriate assumptions even to frame the problem. As with the classic philosophical problems it is related to, falling into infinite regress is always right around the corner. In the limit, if we make no assumptions, nothing can be said.
I sketch an argument, in the form of a series of claims, as to how one should try to think of the problem. This is then applied in two ways. First, to argue that the term “beauty” is not useful for thinking about the problem. Second, to evaluate one example of a heuristic that is normally considered an instance of beauty: naturalness.
1. Science is a process in which theories are assigned relative degrees of certainty.
1.1 We can model degrees of certainty with probabilities.
1.2 Theories must be logically consistent.
1.3 Theories gain (or lose) certainty through empirical confirmation.
2. Scientific theories allow calculations that specify expected observations.
2.1 To yield calculations, these theories must be computable, in the Church-Turing sense.
2.2 These theories can be re-cast as algorithms.
3. The routine practice of science includes embedded assumptions. These assumptions are usually taken for granted, and often invisible.
3.1 These assumptions affect the way probabilities are assigned to theories.
3.2 The way these assumptions affect probability assignments must be consistent.
We can accommodate the problem of beauty in physics in terms of point 3. I argue later that the word “beauty” is not the right word to describe the real problem at hand. This real problem can be best described as follows.
3.3 If competing theories yield (through calculation) empirical claims that are equivalent in our domain of observation, then they cannot be assigned different probabilities from observation alone.
3.3.1 The problem of assigning different probabilities to empirically equivalent theories (up to observation) is an instance of the philosophical problems of underdetermination, and its cousin, the problem of induction.
3.3.2 The problem is not vacuous as theories with as yet equivalent calculations may diverge for not yet made observations.
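The situation in 3.3 can be illustrated with a minimal Bayesian sketch (all numbers here are hypothetical): if two theories assign identical likelihoods to every observation made so far, their posterior ratio is exactly their prior ratio, so observation alone cannot separate them.

```python
# Minimal Bayesian illustration of 3.3: empirically equivalent theories
# (same likelihood for all data so far) keep their prior ratio.

def posterior(priors, likelihoods):
    """Normalized posterior probabilities from priors and data likelihoods."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

priors = [0.7, 0.3]        # extra-empirical prior (e.g. a simplicity bias)
likelihoods = [0.2, 0.2]   # empirically equivalent: identical likelihoods

post = posterior(priors, likelihoods)
# The posterior ratio equals the prior ratio: data alone changes nothing.
assert abs(post[0] / post[1] - priors[0] / priors[1]) < 1e-12
```

Any differential assignment between such theories must therefore come from the prior, which is exactly where the extra-empirical assumptions of points 4 and 5 enter.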
The problem of beauty in physics can now be described as the question of whether beauty should be used as an additional criterion to answer 3.3.1. Before going forward, we identify the main classes of assumptions present and accepted in scientific practice.
4. Science includes a class of assumptions about nature’s uniformity; this is known as Uniformitarianism.
4.1 Uniformitarianism asserts, for example, that the laws of nature are invariant in space and time: we expect the laws of physics to be the same everywhere. We also expect the laws of physics to be the same in the past as in the future.
4.2 Uniformitarianism is an a priori assumption: it is not subject to confirmation as scientific theories are.
4.3 Following 3.1, uniformitarianism affects the way probabilities are assigned to theories in a way that is complementary to empirical confirmation (extra-empirical).
5. Science includes an assumption about nature’s simplicity; this is known as Occam’s Razor.
5.1 Occam’s razor asserts that simpler theories are, ceteris paribus, more likely to be true.
5.2 Occam’s razor is an a priori assumption: it is not subject to confirmation as scientific theories are.
5.3 Following 3.1, Occam’s razor affects the way probabilities are assigned to theories in a way that is complementary to empirical confirmation (extra-empirical).
It should be apparent that the claims in 4.3 and 5.3 have the same form as that required by 3.3: probability assigning mechanisms that, given equivalently supported theories, allow assigning probabilities differentially.
6. Uniformitarianism and Occam’s razor are the only a priori assumptions accepted in science. They can be reduced/unified into a bias in favour of simpler theories.
6.1 Uniformitarianism and Occam’s razor can be formalized as a bias that assigns relatively more probability to simpler theories. In a Bayesian setting, this bias takes the form of a prior over theories.
6.2 Given that theories take the form of algorithms, one needs a measure of simplicity for an algorithm. An example of such a formalism is found in Solomonoff induction, also in a Bayesian setting.
6.3 Although Kolmogorov complexity is uncomputable, and depends (up to an additive constant) on the choice of universal Turing machine, Solomonoff induction illustrates that the complexity of a theory is amenable to formal treatment. This provides a model with which to think about the underlying philosophical issue in a principled way.
6.4 Other considerations such as “testability, falsifiability, naturalness, calculability, and diversity” are not epistemological assumptions, but rather desirable criteria for the efficient practice of science.
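The simplicity bias of 6.1–6.3 can be sketched in miniature. In the spirit of Solomonoff induction, treat each theory as a program (here just a bit string, a deliberate simplification) and give it prior mass decaying exponentially with description length. Real Solomonoff induction is uncomputable; this toy only shows the shape of the bias.

```python
# Toy simplicity prior: prior mass 2^(-length) for a binary program string.
# (A proper construction requires a prefix-free code so the masses sum to
# at most 1; that detail is omitted in this sketch.)

def simplicity_prior(program: str) -> float:
    """Unnormalized prior mass 2^(-length) for a binary program string."""
    return 2.0 ** (-len(program))

short_theory = "1011"          # 4-bit description
long_theory = "101100111000"   # 12-bit description

# The shorter (simpler) theory receives exponentially more prior mass:
# here, 2^8 times more, one factor of two per extra bit.
assert simplicity_prior(short_theory) / simplicity_prior(long_theory) == 2 ** 8
```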
Now we can say what the problem with “beauty” is: that it is ambiguous and/or inconsistent with scientific practice.
7. The word “beauty” is not a good way to think about the problem of selecting theories with equivalent empirical support. Whenever “beauty” does not refer to simplicity it is ambiguous or inconsistent.
7.1 If beauty does not refer to simplicity, there is no objective way to formalize it, and it is, in principle, not possible to construct a prior.
7.2 If beauty does not refer to simplicity, applying it is inconsistent: it is not part of the accepted a priori assumptions used in science.
7.3 If it is a problem to use “beauty” when it does not mean simplicity, then one can drop beauty altogether and use simplicity instead.
Now I turn to a specific application of this approach: naturalness as described by Hossenfelder.
Physicists use the word “natural” in general to mean that a theory’s dimensionless parameters (ie those without units) are not much larger or much smaller than 1 [..] The belief in naturalness is usually rationalized by claiming that numbers which are very large or very small are unlikely. They appear cherry-picked for no other reason than correctly describing an observation and require further explanation.
I argue that naturalness, as defined above, could be formulated as an instance of a simplicity bias:
8. If a theory is to be encoded as an algorithm, then the parameter values it uses to make calculations must be present as data in the theory.
8.1 Relative to the choice of Universal Turing Machine (UTM), the number of parameters and their values will contribute to the length of the algorithm.
8.2 Due to the invariance theorem for Kolmogorov complexity, the choice of UTM cannot arbitrarily reduce the length contributions of parameters. In particular, a UTM that encodes certain parameter values efficiently must encode others inefficiently. There is no free-lunch UTM, and a given number of parameters and their values contribute to algorithm length in a non-arbitrary way (up to a constant overhead).
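Claims 8–8.2 can be sketched under one assumed (hypothetical) encoding: writing a positive integer parameter into a theory’s program costs roughly log2 of its magnitude in bits, so hierarchy-sized values lengthen the algorithm and, under a 2^(-length) prior, lose prior mass exponentially.

```python
# Sketch of parameter encoding cost under a hypothetical binary encoding:
# a positive integer v takes about log2(v) bits to write down, plus a
# rough one-bit terminator. The exact constants depend on the UTM, but by
# invariance (8.2) the growth with magnitude does not.
import math

def encoding_cost_bits(value: int) -> float:
    """Approximate bits needed to write a positive integer parameter."""
    return math.log2(value) + 1

natural_param = 3          # O(1) dimensionless parameter
unnatural_param = 10 ** 16 # finely specified, hierarchy-sized value

# The "unnatural" value costs dozens more bits; under a 2^(-length)
# prior those extra bits become an exponentially smaller prior mass,
# making "very large numbers are unlikely" a precise statement.
assert encoding_cost_bits(unnatural_param) > encoding_cost_bits(natural_param)
```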
9. If we are using simplicity priors over theories due to 4) and 5), then it is reasonable to expect that these priors over theories have logical implications as to the number and values of theory parameters.
9.1 If 4) and 5) take the form of priors over theories, then they imply priors over parameter values.
9.2 If 9.1) is the case, then not only are simplicity favouring priors over parameter numbers and values compatible with scientific practice, they are, in fact, necessary. For these formulations of 4) and 5) it would be inconsistent not to employ a priori probability distributions over parameter numbers and values.
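Claim 9.1 can be sketched concretely: if the prior over theories is 2^(-length) and a parameter value v adds some number of bits to the theory, the induced prior over v is proportional to 2^(-bits(v)). The encoding below is hypothetical and the candidate set is finite, purely for illustration.

```python
# Sketch of 9.1: a program-length prior over theories induces a prior
# over parameter values. bits(v) here is a hypothetical integer code
# length: floor(log2(v)) + 1 bits for a positive integer v.
import math

def induced_param_prior(values):
    """Normalize 2^(-bits(v)) over a finite candidate set of values."""
    weights = {v: 2.0 ** -(math.floor(math.log2(v)) + 1) for v in values}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}

prior = induced_param_prior([1, 2, 10, 10 ** 6])
# O(1) values carry far more induced prior mass than hierarchy-sized ones.
assert prior[1] > prior[10 ** 6]
```

This is the sense in which, per 9.2, a prior over parameter values is not an optional extra but a logical consequence of the simplicity prior already in use.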
We can compare 9.1 to observations made by Hossenfelder and Wells:
The major problem with finetuning arguments both in cosmology and in particle physics is the reference to probabilities without defining a probability distribution, or, if a distribution is defined, the failure to explain where the distribution itself comes from.
If naturalness has any connection to the extra-empirical judgments of theory plausibility, which is surely what naturalness discussion is after, we have no recourse but to introduce probability measures across parameters of the theory.
The main idea is that simplicity assumptions provide these probability distributions, albeit implicitly.
The claims made in 9) are not general, but rather point to the possibility of accommodating naturalness into the framework of scientific practice in a principled way. This is a philosophical argument, because I provide no concrete specifications here for 4) and 5), or naturalness. So one cannot say whether naturalness follows in general, one can only say that it is conceivable, as in the above model, that it does.
We can imagine a scenario where we could go from the philosophical to the technical, in a way reminiscent of E.T. Jaynes’ reasoning robot. The key to this idea is the need to automate the relevant probabilistic reasoning, which forces the formalization of the method into precise mathematical instantiations. This would mean:
Step 1. Choose some Turing machine for the robot to run on
Step 2. Restrict the class of theories such that priors are computable
Step 3. Formalize the principles 4) and 5) into precise probability priors over theories
Step 4. Formalize naturalness
Step 5. Extract the logical relationship between the formalizations of Step 3 and Step 4.
This would give a concrete answer as to whether, in that scenario, naturalness follows from previous principles, and in effect, whether naturalness is consistent with and necessary for the robot’s “scientific practice”.
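The five steps can be given a skeletal shape in code. Every name and signature below is hypothetical: the essay supplies no concrete formalizations of 4), 5), or naturalness, so the formalizers are passed in as stand-in functions.

```python
# Skeleton of the robot's pipeline (Steps 1-5); purely illustrative.
# Steps 1-2 are assumed done: `theories` is a computable class of
# theories on a fixed Turing machine, so the priors below are defined.

def run_robot_assessment(theories, formalize_simplicity, formalize_naturalness):
    prior = formalize_simplicity(theories)      # Step 3: priors from 4) and 5)
    natural = formalize_naturalness(theories)   # Step 4: naturalness score
    # Step 5: expose both per theory, so their logical relationship
    # (does high naturalness track high prior mass?) can be examined.
    return {t: (prior[t], natural[t]) for t in theories}

# Toy stand-ins for the missing formalizations:
out = run_robot_assessment(
    ["T1", "T2"],
    formalize_simplicity=lambda ts: {t: 1 / len(ts) for t in ts},
    formalize_naturalness=lambda ts: {t: 1.0 for t in ts},
)
assert out["T1"] == (0.5, 1.0)
```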
Note that Step 3 is related to Wells’ comment:
The critical discussion above leads to a credible skepticism of extra-empirical theory assessments (SEETA). We do not have a meta-theory that provides statistical distributions to parameters of theories, which is a key problem naturalness faces.
Step 3 would, in effect, construct the meta-theory for the robot’s accepted scientific principles.