Regression to the Mean and Trait Fungibility (transplanted twitter conversation)

anon

Salko Spandrell understood so much, and yet he couldn't understand regression to the mean.

Spandrell ?

Salko Hi! Big fan. I didn't realize you were back on twitter! Your old blog had a post suggesting that regression to the mean in intelligence is observed only because of “smart men marrying attractive women,” which is retarded. There's a fundamental RttM selection effect.

Spandrell Well “only” is an overstatement but the aggregate result isn't only genetic. If smart couples breed consistently over generations the average does go up. How else do you explain Indian castes?

Salko Wdym by the aggregate effect? If the heritable part of intelligence is a linear combination of the parents, you can't change the population mean by assortment. I think RttM is commonly understood as the tendency of extreme samples to come closer to the general mean when retested. In this case, the fact that children's intelligence tends to be closer to the population mean than the average of their parents.
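The claim that assortment can't move the population mean under a linear model can be checked with a small sketch. Everything here is an illustrative assumption rather than anything from the thread: heritable scores are drawn on an IQ-like scale (mean 100, SD 15), a couple's "midparent value" is the plain average of the two scores, and "assortative mating" is the extreme case of pairing highest with highest.

```python
import random
import statistics

def midparent_values(fathers, mothers):
    """Return the midparent value (simple average) for each couple."""
    return [(f + m) / 2 for f, m in zip(fathers, mothers)]

def simulate(n=10_000, seed=0):
    rng = random.Random(seed)
    # Hypothetical heritable scores on an IQ-like scale (mean 100, SD 15).
    fathers = [rng.gauss(100, 15) for _ in range(n)]
    mothers = [rng.gauss(100, 15) for _ in range(n)]

    # Random mating: couples are paired in arbitrary order.
    random_pairs = midparent_values(fathers, mothers)
    # Perfect assortative mating: highest with highest, and so on down.
    sorted_pairs = midparent_values(sorted(fathers), sorted(mothers))
    return random_pairs, sorted_pairs

random_pairs, sorted_pairs = simulate()
# The means agree; only the spread of midparent values differs.
print(statistics.fmean(random_pairs), statistics.fmean(sorted_pairs))
print(statistics.pstdev(random_pairs), statistics.pstdev(sorted_pairs))
```

Under these assumptions the pairing rule changes how spread out the couples are, not where the average sits, which is the point being made above.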

I think the fungibility of traits is probably part of this, but the basic statistical effect behind RttM is that selecting for a high combination of deterministic and stochastic components means that extreme samples will tend to have extreme stochastic components (obviously). Only the deterministic component is preserved, though, so the next tests will give less extreme results.
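The selection effect described above can be sketched as a toy test-retest simulation. The function name, the 60% share for the stable component, and the top-1% cutoff are all made-up parameters for illustration; scores are in standard-deviation units.

```python
import random
import statistics

def retest_top_scorers(n=100_000, top_fraction=0.01, stable_share=0.6, seed=1):
    """Select top scorers on a first test, then retest them.

    Each score = stable component + fresh noise. stable_share is the
    (assumed) fraction of score variance due to the stable component.
    """
    rng = random.Random(seed)
    stable_sd = stable_share ** 0.5
    noise_sd = (1 - stable_share) ** 0.5

    people = []
    for _ in range(n):
        stable = rng.gauss(0, stable_sd)
        first = stable + rng.gauss(0, noise_sd)   # the test used for selection
        second = stable + rng.gauss(0, noise_sd)  # retest with new noise
        people.append((first, stable, second))

    # Keep only the top scorers on the first test.
    people.sort(reverse=True)
    top = people[: int(n * top_fraction)]

    return {
        "first": statistics.fmean([p[0] for p in top]),
        "stable": statistics.fmean([p[1] for p in top]),
        "second": statistics.fmean([p[2] for p in top]),
    }

result = retest_top_scorers()
print(result)
```

In this sketch the selected group's retest average lands near its average stable component, well below its first-test average: the lucky noise that helped get them selected is not carried over.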

Spandrell

I don't remember exactly the context where I said it, but people think that smart couples won't have smart kids because regression; and sure there's some of that. As you say it's a basic statistical law, but think about the actual genetics.

Extremely intelligent people are likely to have, say (making this up) 1,000 genes which raise their IQ. Which is of course rare. In any instance of meiosis a lot of those genes are not going to make it to the gamete, so unless the wife also has a big bunch of high IQ genes, the baby isn't going to be as intelligent. Even if the wife is also very smart, her high IQ genes might also not make it; or if they do they overlap or whatever.

But how big is that effect compared to smart men having children with cognitively mid wives? Surely much smaller.

anon

The gene-level arrangement might matter, but that's only insofar as intelligence isn't just a linear combination of alleles. Obviously it isn't exactly, but it's the tendency in how polygenic traits behave. With that (approximately accurate) assumption, meiosis alone shouldn't cause any regression to the mean. Even beyond genes, there are ofc many heritable contributors to intelligence.

However, the overall heritability is like 60% at best. If you select two 99th-percentile parents, they'll each score at the 99th percentile (on average) in both the heritable and non-heritable components. Their kid, meanwhile, is expected to be at the 99th percentile in the heritable component, but only at the 50th percentile in the non-heritable one. Depending on the heritability, that may put them in the 98th percentile, or it may put them in the 51st.
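One conventional way to put rough numbers on this is the textbook additive approximation, in which the expected child z-score is the heritability times the midparent z-score. That formula is an assumption brought in here, not something stated in the thread, and it is not identical to the component-by-component reasoning above; it also ignores assortative mating and shared environment. A minimal sketch, with a hypothetical helper name:

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1

def expected_child_percentile(parent_percentile, heritability):
    """Percentile of the expected child score when both parents sit at
    parent_percentile.

    Assumes the simple additive model: expected child z-score equals
    heritability times the midparent z-score.
    """
    midparent_z = STANDARD_NORMAL.inv_cdf(parent_percentile / 100)
    child_z = heritability * midparent_z
    return 100 * STANDARD_NORMAL.cdf(child_z)

# Compare a few assumed heritability values for two 99th-percentile parents.
for h2 in (0.2, 0.6, 0.8):
    print(h2, round(expected_child_percentile(99, h2), 1))
```

Under this approximation the expected child lands in the low 90s of percentiles at a heritability of 0.6 and in the high 90s at 0.8, so the size of the regression depends heavily on which heritability figure you accept.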

With that in mind, I think the selection-based regression to the mean probably has quite a large effect. Fungibility of traits is very real, and it definitely smooths out extremes, but that fungibility isn't even close to total - assortative mating is a powerful force acting in the opposite direction.

anon
Replying to:
Spandrell

Regression to the mean is widely misunderstood and mostly fake. Lynn covers this argument in his books dealing with eugenics and dysgenics. Intelligence (g) is a polygenic trait coded for by thousands of genes. These genes have (mostly) additive genetic effects, while the regression to the mean occurs when dealing with traits determined by dominant and recessive genes. Studies show children are at about the IQ level you'd expect from the average IQ of their parents, with the regression towards the mean accounting for at most a few IQ points' drop in children (Terman, 1925; Oden, 1968; Scarr & Weinberg, 1978). This is a robust, replicated finding. For more info, look up pages 160-164 in Lynn's Eugenics: A Reassessment.

~davdev-hidtul
Replying to:
anon

This brings up two points that I've never understood from casually following such conversations:

1) Why should polygenetic effects be additive, rather than, for example, multiplicative? The cartoon picture of a brain is as some kind of complex network, and behaviors of complex networks depend in complicated non-linear ways on size and connectivity.

2) (Related) Should we really expect IQ to be normally distributed, a priori? Galton and Co. were not aware of this and it is not taught in Stats 101, but there are a lot of distributions that superficially look normal, but behave differently in the tail. I presume guys like Jensen aren't retards and ran the proper tests for kurtosis, and that this just doesn't come up in HBD poasting.

anon
Replying to:
anon

This is simply not true. I agree that regression to the mean in the genetic component of intelligence is dubious. However, regression to the mean in the trait itself is statistically inevitable - unless you think intelligence is purely hereditary.

anon
Replying to:
~davdev-hidtul

(1) I don't really follow what exactly you mean by multiplicative. To clarify, it might help to quote wikipedia on additive genetic effects (I know, it's wikipedia, but the following is still correct): “genetic effects are broadly divided into two categories: additive and non-additive. Additive genetic effects occur where expression of more than one gene contributes to phenotype (or where alleles of a heterozygous gene both contribute), and the phenotypic expression of these gene(s) can be said to be the sum of these contributions.” So, basically genes average out when producing a phenotype. That is, tall parents will produce tall children. Intelligent parents will produce intelligent children. That's not to say there will be no outliers in a particular family, of course there will be, and that's what some have in mind when they refer to the regression to the mean, but on the population level the statement holds.

In addition, you are quite right, the brain definitely is a complex system and as such, there is a degree of nonlinearity. Thus, only about 80% of the variance is explained by genetics. The remaining 20% relates to random biochemical effects.

As for (2), we should definitely expect intelligence, as in g, the psychological construct of general intelligence, to be normally distributed. It is a biological variable coded for by many genes (and environmental variables). As is the case with other such variables like height, it ought to be normally distributed. We also know the SNPs ascertained to be involved in determining intelligence through GWAS are normally distributed. The problem with IQ though is that we measure it on an interval scale. We don't have a ratio scale. What happens is that raw scores on IQ tests are transformed into a normal distribution. There is something called the central limit theorem that basically tells us IQ too should be normally distributed, but that's too much stats for me and you'd have to consult someone with higher-level expertise in intelligence science.

anon
Replying to:
anon

This is by far the clearest and best explanation of the matter that I've seen. Thank you

anon
Replying to:
anon

It's mostly hereditary. Surely not 60% at best. I'll quote Richard Haier, the editor-in-chief of the journal Intelligence, from his 2023 book The Neuroscience of Intelligence: “To recap this key piece of the genetic story, the heritability of general intelligence increases with age to about 80% by the end of the teenage years” (page 57).

Intelligence (g) most definitely is largely a genetic phenomenon, hence the small impact of the regression to the mean makes sense.

anon
Replying to:
Spandrell

(2) is the Central Limit Theorem, like sweatanon said. This is the defining feature of a normal distribution - as individual components get arbitrarily small relative to the whole, any linear combination of random variables converges on one.

(1) is a really excellent question. To address your confusion, sweatanon, what he's asking is why we should expect polygenic traits to be approximately linear. The assumption here is basically just that genes act in the same way and are relatively independent; of course, complex interactions exist, so this is just an approximation. But consider your “multiplicative” example. Even if genes have a multiplicative effect on some underlying trait - e.g. compute - their effect on positioning in the distribution of compute is additive. For this specific case, log preserves order relations, so you can just apply it to compute and make the effects of genes additive. Generally, with a weaker set of assumptions, you can often derive a normally-distributed variable even if what you're measuring doesn't admit one.

anon
Replying to:
anon

Yes good video. People talk about “regression to the mean” as in some sort of “regression to racial platonic form”. Michael Woodley's talk of the gene-complex and how those assemblages of intelligence genes can dilute into an average is more the point. To take it back to Spandrell's original point, smart only needs to reproduce with pretty for one generation for pretty to have decent intelligence genes. After that you've just gotta make sure your smart + pretty caste goes for other smart + pretty, instead of just finding a pretty. You'd probably need some sort of ethnogenesis for that.

Side note, is there an ethnogenesis in recent times that's produced an attractive people? Or are they all uggos?

~davdev-hidtul

Thanks for the answers, guys, but appeals to the Central Limit Theorem are begging the question, since in those terms I'm asking whether or why the hypotheses of the CLT hold for polygenetic traits. The CLT presumes you are taking the limit of arithmetic means as the number of independent random variables grows, and I'm asking whether polygenetic traits really are arithmetic means. The same question applies to height.

There are other limit theorems for different kinds of distributions, producing for example bell-shaped but non-normal distributions such as the Cauchy distribution:

https://en.wikipedia.org/wiki/Cauchy_distribution
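To make the Cauchy point concrete, here is a small sketch comparing how averages behave for normal and Cauchy draws. The helper names, sample sizes, and the use of the interquartile range as the measure of spread are choices made for this illustration (the Cauchy distribution has no finite variance, so a standard deviation would not be meaningful).

```python
import math
import random
import statistics

def iqr(values):
    """Interquartile range, a spread measure that exists even for Cauchy."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def normal(rng):
    return rng.gauss(0, 1)

def cauchy(rng):
    # Standard Cauchy draw via the inverse CDF: tan(pi * (U - 1/2)).
    return math.tan(math.pi * (rng.random() - 0.5))

def spread_of_means(draw, sample_size, n_means=2_000, seed=3):
    """IQR of n_means sample means, each computed from sample_size draws."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(n_means)
    ]
    return iqr(means)

for name, draw in (("normal", normal), ("cauchy", cauchy)):
    for size in (1, 100):
        print(name, size, round(spread_of_means(draw, size), 3))
```

Averaging 100 normal draws shrinks the spread by about a factor of ten, while the average of 100 Cauchy draws is about as spread out as a single draw, so whether the classical CLT applies really does depend on the hypotheses being asked about here.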

anon
Replying to:
anon

The point of bringing up the CLT is to show that (2) is not just related to (1) but a necessary consequence if the effects of the relevant genes are individually sufficiently small. It's not begging the question but rather refining it.

WRT polygenic traits being normally distributed, again, almost all aren't. For example, if height forms a normal distribution, height^10 doesn't. There's no fundamental distinction between “essential” and “derivative” traits. The point is that, under certain circumstances, you can derive a normally distributed variable (i.e. one where sufficiently plural factors will act additively) from a measurement. IQ is such a variable; the raw measurement is the number of correct answers on a certain test, which isn't necessarily normally distributed. This type of transformation provides insight into relative performance, nothing absolute.
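The kind of transformation described above can be sketched as a rank-based mapping onto a normal scale. This is a simplified illustration, not how any particular test is actually normed: the function name is hypothetical, ties are broken by input order rather than averaged, and the mean-100, SD-15 target is the usual IQ convention.

```python
import random
from statistics import NormalDist

def raw_scores_to_iq(raw_scores, mean=100.0, sd=15.0):
    """Map raw scores to IQ-style scores using only their rank order.

    Simplification: ties are broken by input order; real norming
    procedures would treat tied scores identically.
    """
    n = len(raw_scores)
    target = NormalDist(mean, sd)
    order = sorted(range(n), key=lambda i: raw_scores[i])
    iq = [0.0] * n
    for rank, i in enumerate(order):
        # Midpoint percentile keeps the value strictly between 0 and 1.
        iq[i] = target.inv_cdf((rank + 0.5) / n)
    return iq

rng = random.Random(4)
heights = [rng.gauss(170, 7) for _ in range(1_000)]  # roughly normal
powered = [h ** 10 for h in heights]                 # far from normal

# Both versions get the same transformed scores, because only the
# ordering of the raw values is used.
print(raw_scores_to_iq(heights) == raw_scores_to_iq(powered))
```

Since the output depends only on rank, it says where someone stands relative to the reference sample and nothing about absolute amounts, which matches the "relative performance, nothing absolute" caveat.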

~davdev-hidtul
Replying to:
anon

Thanks. But I don't quite know what you mean by “sufficiently small”. Relative to what? Isn't the strength of the interaction more important than the magnitude of each individual effect?

Anyhow, I think what I'm trying to get at is “under what conditions do you expect some invertible transformation of your data to be normally distributed?” And “why should we expect polygenetic traits to satisfy those conditions?”