Statistical Modeling, Causal Inference, and Social Science

The website statmodeling.stat.columbia.edu is the blog of Andrew Gelman, a professor of statistics and political science at Columbia University. It features posts by Gelman and several co-bloggers on statistics, data analysis, and related fields, ranging from technical discussions of statistical modeling to commentary on current events and the intersection of politics and data. Posts often respond to recent research papers and link to external sources and further reading. The design is plain and puts content ahead of visuals.

The comment section is active: readers regularly debate Gelman and one another, and the comments often add context and other perspectives on the topic at hand. The site is aimed mainly at professionals and researchers in statistics and data analysis, but it is also accessible to general readers interested in data-driven thinking.

Thread Of Notes

The New York Knicks and the martingale property of calibrated probability forecasts (with some simulation and R code)

This long post covers four topics: 1. The Knicks’ stunning series of come-from-behind victories to win the NBA title in 5 games; 2. The martingale property of probability forecasts; 3. An example of learning from simulation; 4. How we (sometimes) …
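The martingale property mentioned in this excerpt can be illustrated with a small simulation. The sketch below is not the post's R code: it assumes a toy game (hypothetical) in which the lead is a Gaussian random walk, so the exact win probability at each step is known. For a calibrated forecast, the expected next-step change should be near zero whatever the current forecast is, and the outcome should match the forecast on average.

```python
import random
from statistics import NormalDist


def simulate_forecasts(n_games=20000, n_steps=20, seed=0):
    """Simulate win-probability paths for a toy game (hypothetical setup).

    The lead is a Gaussian random walk; the forecast at each step is the
    exact conditional probability that the final lead is positive.
    """
    rng = random.Random(seed)
    phi = NormalDist().cdf
    paths, wins = [], []
    for _ in range(n_games):
        lead, probs = 0.0, [0.5]
        for t in range(1, n_steps + 1):
            lead += rng.gauss(0.0, 1.0)
            remaining = n_steps - t
            if remaining > 0:
                # The final lead is the current lead plus Normal(0, remaining) noise.
                probs.append(phi(lead / remaining ** 0.5))
            else:
                probs.append(1.0 if lead > 0 else 0.0)
        paths.append(probs)
        wins.append(1.0 if lead > 0 else 0.0)
    return paths, wins


def mean_in_bin(values, forecasts, lo, hi):
    """Average of values over games whose current forecast lies in [lo, hi)."""
    picked = [v for v, p in zip(values, forecasts) if lo <= p < hi]
    return sum(picked) / len(picked)


paths, wins = simulate_forecasts()
t = 10  # look at the forecast halfway through the game
now = [p[t] for p in paths]
step_change = [p[t + 1] - p[t] for p in paths]
miss = [w - p[t] for w, p in zip(wins, paths)]

martingale_gap = mean_in_bin(step_change, now, 0.7, 0.9)
calibration_gap = mean_in_bin(miss, now, 0.7, 0.9)
print(f"mean next-step change, forecast in [0.7, 0.9): {martingale_gap:+.4f}")
print(f"mean (outcome - forecast), forecast in [0.7, 0.9): {calibration_gap:+.4f}")
```

Both printed averages should be close to zero up to simulation noise. A forecast that drifted predictably in either direction would fail the first check.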

Ph.D. student opening in Sweden on Earth Observation, Data Science, and AI for poverty estimation

Adel Daoud writes: I’m writing to ask for your help circulating a PhD opening in my group at Chalmers, the AI and Global Development Lab (www.aidevlab.org). The position is in Earth Observation, Data Science, and AI for poverty estimation, the …

To what extent is it true that “All intelligence, human or artificial, must extract structure from correlational data”?

Someone pointed me to this article, “Does AI already have human-level intelligence?” You can click through to read the whole thing; spoiler alert: their answer is Yes. I don’t have much to say about the main argument of the article–it’s …

Jazz and quantum mechanics: Eventually Dmitri realized that they are kind of similar

Dmitri Tymoczko pointed me to this article by John Baez explaining general relativity. I replied that this seems like some very important stuff, but I’m devoting all of that part of my brain to being confused by quantum mechanics. I …

Adjusting for nonrepresentativeness in continuous norming using multilevel regression and poststratification.

Klazien de Vries, Marieke E. Timmerman, Anja F. Ernst, and Casper J. Albers write: In psychological test norming, nonrepresentativeness in background variables in the normative sample can lead to bias in the normed score estimates. Because representativeness is difficult to …

Stein’s method, learning and inference -or- how to really monitor convergence and thin chains

This post is from Bob. I’ve been thinking a lot about scores (gradients of the log density function) and how they can be used for convergence monitoring. We know that the expected value of the score is zero. Stein generalized …
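The identity stated in this excerpt, that the expected value of the score is zero, is easy to check numerically. The sketch below is only an illustration of that identity, not the convergence-monitoring method from the post. It assumes a hypothetical Normal(1, 2) target and compares draws from the target with draws from a shifted distribution, standing in for a sampler that has not converged.

```python
import random

MU, SIGMA = 1.0, 2.0  # hypothetical target: Normal(mean 1, sd 2)


def score(x):
    """Gradient of the Normal(MU, SIGMA) log density with respect to x."""
    return -(x - MU) / SIGMA ** 2


def mean_score(draws):
    """Monte Carlo estimate of the expected score under the draws."""
    return sum(score(x) for x in draws) / len(draws)


rng = random.Random(0)
n = 20000
on_target = [rng.gauss(MU, SIGMA) for _ in range(n)]         # correct draws
off_target = [rng.gauss(MU + 0.5, SIGMA) for _ in range(n)]  # shifted draws

good = mean_score(on_target)
bad = mean_score(off_target)
print(f"mean score, draws from the target: {good:+.4f}")
print(f"mean score, shifted draws:         {bad:+.4f}")
```

The first average should be near zero. The second should be near -0.5 / SIGMA**2 = -0.125, which is the kind of departure a score-based diagnostic could pick up.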

What is the relation between interactions in a regression model and correlations among the predictors?

I’ve often seen confusion between interactions in a regression model and correlations among the predictors. To keep it simple, consider the model y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + error, and assume the predictors have been signed …
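A quick simulation can make the distinction concrete. The excerpt is truncated, so this sketch does not reproduce the post's argument. It only uses the model form given above, with hypothetical coefficient values, to show that the interaction coefficient b3 and the correlation between x1 and x2 can be set independently: one scenario has uncorrelated predictors and a large interaction, the other has strongly correlated predictors and no interaction.

```python
import random


def fit_ols(rows, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [v - f * w for v, w in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(rho, b3, n=5000, seed=1):
    """Draw (x1, x2) with correlation rho, then fit y ~ 1 + x1 + x2 + x1*x2."""
    rng = random.Random(seed)
    x1s, x2s, rows, y = [], [], [], []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rho * x1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        x1s.append(x1)
        x2s.append(x2)
        rows.append([1.0, x1, x2, x1 * x2])
        # Hypothetical true values: b0 = 1, b1 = 2, b2 = -1.
        y.append(1.0 + 2.0 * x1 - 1.0 * x2 + b3 * x1 * x2 + rng.gauss(0.0, 1.0))
    m1, m2 = sum(x1s) / n, sum(x2s) / n
    cov = sum((u - m1) * (v - m2) for u, v in zip(x1s, x2s))
    v1 = sum((u - m1) ** 2 for u in x1s)
    v2 = sum((v - m2) ** 2 for v in x2s)
    return cov / (v1 * v2) ** 0.5, fit_ols(rows, y)


corr_a, coef_a = simulate(rho=0.0, b3=1.5)  # uncorrelated, strong interaction
corr_b, coef_b = simulate(rho=0.8, b3=0.0)  # correlated, no interaction
print(f"A: corr(x1, x2) = {corr_a:+.2f}, estimated b3 = {coef_a[3]:+.2f}")
print(f"B: corr(x1, x2) = {corr_b:+.2f}, estimated b3 = {coef_b[3]:+.2f}")
```

The correlation describes the joint distribution of the predictors alone, while b3 describes how the slope of y on x1 changes with x2, so knowing one says nothing about the other.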

Epidemiologist Donna Spiegelman sez: SUTVA is “mostly not necessary for valid causal estimation and inference most of the time”

Donna Spiegelman shares this presentation she gave at the recent American Causal Inference Conference. I like what she has to say. Here are the two parts of the stable treatment value assumption: 1. No interference between units. As Spiegelman says, …

Say what you want about this junk survey, at least it’s more plausible than other hyped claims like the hyperloop or the idea that UFOs are space aliens!

Palko points to this breakdown of a junk news story. The fake-survey-to-headline pipeline reminds me of a credulous Wall Street Journal story from a few years back. But, yeah, with respected news sources repeatedly falling for ridiculous scams like the …

Noem’s Razor and why I think the concept of “unintended consequences” is overrated

I was thinking more about Noem’s Razor (“Never attribute to stupidity that which is adequately explained by malice”) and it reminded me that “unintended consequences” often were actually intended, a principle that I discussed back in 2008 in the …

What if scientists really were dispassionate observers, communicating ideas without irrational commitment? Look here, says AI.

This is Jessica. We often idealize science as proceeding primarily by the scientific method, where scientists approach the objects of their investigation with a healthy dose of detachment and neutrality, and become convinced only when the evidence is there. But …

No, Bayes does not like Mayor Pete. (Pitfalls of using implied betting market odds to estimate electability.)

This one’s from 2019, but it’s worth reposting given recent interest in prediction markets. The story starts with a post from economist Greg Mankiw, who wrote: Who has the best chance of beating Donald Trump? A clue can be found …

The “humans are imperfect reporters too” defense for ascribing little thoughts to machines

This is Jessica. In my last post about the tension between the “necessity” that we ascribe human folk psychological concepts like thinking and reasoning to machines and the problems that arise when we overinterpret them, I briefly mentioned a defense …

Differences between crackdowns on dissent now and in the early Cold War period

First, the similarities: 1. Government actors are directly threatening both private citizens and government employees to suppress dissenting speech. 2. The attacks on free expression are notable but they’re still very rare. Most people in this country can still say …

Don’t cite sources you haven’t read, and don’t trust when people claim to be reporting something from the literature.

Peter Dorman writes: In case you haven’t seen it, check out this recent piece in Rolling Stone. A key paragraph toward the end: Craig Callender, a philosophy professor at the University of California San Diego and president of the Philosophy …

Full day Stan tutorial at Modern Modeling Methods (M3) this summer in New York (22 June 2026)

This post is from Bob. Mitzi Morris and Bob Carpenter, two of Stan’s developers, will be presenting a tutorial on Stan and Bayesian data analysis aimed at psychometricians this summer. Modern Modeling Methods Conference (M3), Fordham University Lincoln Center Campus, …

What do I think about that proposed Arxiv policy to ban authors of papers with AI slop?

Tim First writes: I’m curious what your thoughts are on the new arXiv policy that authors will be banned for a year if their paper includes mistakes due to the use of AI. My (uninformed) thoughts: 1. arXiv is acting …

James Heathers will fix Wiley’s problems for less than 3.7 million dollars (that is, 2,553,739 Jamaican beef patties, 47,064 whisky-sodden meals at Newark airport, or nearly 218 invites to a conference featuring Gray Davis, Grover Norquist, and a rabbi)

The data thug quotes from an April 2023 post from the EVP of Research at Wiley: In September 2022, Wiley identified and immediately alerted the industry to paper mill activity we found operating at scale. Specifically, we found fraudulent outside …

MrPlew: Locally Equivalent Weights for Multilevel Regression and Poststratification

Ryan Giordano, Alice Cima, Jared Murray, Erin Hartman, and Avi Feller write: Multilevel regression and poststratification (MrP) has become a workhorse method for estimating population quantities from non-probability surveys, and is the primary model-based alternative to traditional survey calibration weighting …

Why are there squares everywhere in statistics (e.g., normal density, variance, least squares, etc.)?

I remember asking my colleagues at Carnegie Mellon this very same question as I was first learning basic statistics in the early 1990s and they gave the same kind of answers as I found more recently in the AskStatistics subreddit. …