Here is a presentation (in Spanish) on rationality for SkeptiCamp that I've been working on recently. It still needs some polish, but the content is essentially complete. A summary of the main points:
- Optimization and the second law of thermodynamics are opposing forces.
- Intelligence is a type of optimization that evolved in certain species to counteract the second law through behavior. Intelligence functions through observation, learning and prediction.
- Prediction requires a correct representation of the environment; this defines epistemic rationality as a component of intelligence.
- Classical logic fails to model rationality because it cannot deal with uncertainty.
- Probability theory is an extension of logic to domains with uncertainty.
- Probability theory and Bayes' theorem define a standard of ideal rationality; other methods are suboptimal approximations.
- Probability theory as a formalization of rationality:
  - Provides a quantitative model of the scientific method as a special case of Bayes' theorem.
  - Provides operational, quantitative definitions of belief and evidence.
  - Naturally relates predictive power and falsifiability through the sum rule of probability.
  - Explains pathological beliefs of the vacuous kind: astrology, card reading, divination, etc.
  - Explains pathological beliefs of the floating kind: "There is a dragon in my garage."
  - Exposes fraudulent retrodiction: astrology, cold reading, ad hoc hypotheses, bad science, bad economics, etc.
  - Dissolves false disagreements, in which the parties make matching predictions but use different verbal formulations.
  - Naturally embeds empiricism, positivism and falsificationism.
- Pathological beliefs can be analyzed empirically by recasting them as physical phenomena in brains, the province of cognitive science.
- A naturalistic perspective automatically explains human deviations from rationality: evolution will always favor adaptations that increase fitness, even if they penalize rationality.
- Today, politics is an example of a rationality catastrophe: in the ancestral environment, irrationality that favored survival in a social context (tribes) was a successful adaptation. (Wright, Yudkowsky)
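The Bayesian update behind these points can be sketched in a few lines. This is a minimal illustration with hypothetical numbers, not a model from the presentation itself; the point is only that a hypothesis which assigns the same likelihood to the evidence as its negation does (a vacuous belief) gains nothing from conditioning:

```python
def bayes_update(prior, likelihood, likelihood_alt):
    """Posterior P(H|E) via Bayes' theorem.

    prior: P(H); likelihood: P(E|H); likelihood_alt: P(E|not H).
    """
    # Sum rule: P(E) = P(E|H)P(H) + P(E|not H)P(not H)
    evidence = likelihood * prior + likelihood_alt * (1 - prior)
    # Bayes' theorem: P(H|E) = P(E|H)P(H) / P(E)
    return likelihood * prior / evidence

# A sharp hypothesis concentrates probability on few outcomes:
# it predicts E strongly, so observing E raises its credence.
sharp = bayes_update(prior=0.5, likelihood=0.9, likelihood_alt=0.5)

# A vacuous hypothesis ("the cards reveal... something") makes E
# exactly as likely as its negation does: the update is a no-op.
vacuous = bayes_update(prior=0.5, likelihood=0.5, likelihood_alt=0.5)

print(round(sharp, 3))  # 0.643
print(vacuous)          # 0.5
```

The sum rule in the denominator is also where falsifiability enters: a hypothesis that spreads its likelihood evenly over all outcomes can never make the ratio differ from 1.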
Bayesian probability – Bruyninckx (2002)
Philosophy and the practice of Bayesian statistics – Gelman (2011)
Varieties of Bayesianism – Weisberg
No Free Lunch versus Occam's Razor in Supervised Learning – Lattimore, Hutter (2011)
A Material Theory of Induction – Norton (2002)
Bayesian epistemology – Hartmann, Sprenger (2010)
The Illusion of Ambiguity: from Bistable Perception to Anthropomorphism – Ghedini (2011)
Bayesian Rationality and Decision Making: A Critical Review – Albert (2003)
Why Bayesian Rationality Is Empty, Perfect Rationality Doesn't Exist, Ecological Rationality Is Too Simple, and Critical Rationality Does the Job – Albert (2009)
A Better Bayesian Convergence Theorem – Hawthorne
One of the themes in AI is the progressive replacement of hand-crafted knowledge and programming with autonomous learning. Another is the progressive shift from narrow, domain-specific abilities to generally applicable performance. These themes are closely related: domain-specific performance is usually achieved by encoding domain-specific expert knowledge, provided by humans, into the AI itself. This encoding is fixed and static, and is potentially brittle if the agent is exposed to a domain outside its narrow region of applicability.
In this talk we see hints of generality in machine learning via the replacement of hand-crafted, tuned features (used as input to a later learning stage) with a phase that autonomously learns those features. Because domain-specific features, for example for vision or speech recognition, are typically designed by subject-matter experts, the learning system is constrained by that manual design effort: one cannot reuse vision features in an audio problem, and so forth.
However, when machine learning is targeted at learning the features themselves (in an unsupervised scheme), generality becomes possible. If features can be autonomously learned for various domains, the machine learning process becomes general insofar as the learned features perform comparably to, or better than, hand-crafted ones. This is exactly what is demonstrated here.
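The idea can be sketched in a few lines. PCA stands in here for the richer unsupervised feature learners (autoencoders, sparse coding) discussed in the talk, and the data is random noise for illustration; the point is only that one identical, domain-agnostic code path extracts features from raw inputs of any kind:

```python
# A minimal sketch: the same unsupervised procedure learns features
# from unlabeled data in any domain, with no hand-crafted design.
import numpy as np

def learn_features(data, n_features):
    """Learn a linear feature basis from unlabeled data (PCA)."""
    centered = data - data.mean(axis=0)
    # Principal directions are the right singular vectors of the
    # centered data matrix (eigenvectors of its covariance).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_features]            # shape: (n_features, n_dims)

def encode(data, basis):
    """Re-represent raw inputs in the learned feature space."""
    return (data - data.mean(axis=0)) @ basis.T

rng = np.random.default_rng(0)
images = rng.normal(size=(100, 64))   # stand-in for image patches
audio = rng.normal(size=(100, 32))    # stand-in for audio frames

# The identical code path produces features for either domain.
img_basis = learn_features(images, 8)
aud_basis = learn_features(audio, 8)
print(encode(images, img_basis).shape)  # (100, 8)
print(encode(audio, aud_basis).shape)   # (100, 8)
```

Nothing in `learn_features` knows whether its input is pixels or spectra; that domain-blindness is what the hand-crafted-feature pipeline lacks.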
And, to make it even more relevant to current research avenues, this relates to findings in neuroscience that suggest the general applicability of some kinds of learning in the neocortex: for example, patients can learn to see with their auditory cortex or their somatosensory cortex (after the optic nerve is rewired). This suggests that a single learning algorithm may be at work, at least in the learning of concepts at low levels of the hierarchy close to the perceptual level, which is exactly where features reside.