My friend Jane Bambauer at the University of Florida wants your input on her new misinformation project. Here’s her pitch. Enjoy!
I am running a research program at the University of Florida aimed at getting policymakers and intellectuals to accept that the "misinformation problem" is (1) a demand-side problem rather than a supply-side one, (2) much more prevalent in mainstream media than scholars and elite society tend to recognize, and (3) best tackled by making confirmatory bullshit socially embarrassing.
I have a few research projects in the works that could benefit from the input of thoughtful economists, philosophers, lawyers, and other Caplanites. You can use this form for easy input and to express interest in attending convenings or getting involved, but I’ll also pay attention to the comments section.
Using LLMs to Measure Bullshit / Grade the News
1. Grading old predictions. We are building an LLM tool that scans old news stories to identify concrete-enough-to-grade predictions, assign each a "due date," and then grade the prediction if the due date has passed (otherwise it is stored for future grading). We are also building a model that works backwards from existing prediction markets: we look at the betting markets that Polymarket or Kalshi or whatever have set up in the past, search for old news items that made an overt or implied prediction, and imagine that the speaker or author had placed an appropriate bet in that market on the date the claim was made. We have some ideas for dealing with tentative predictions (“might” language), and we also have ideas for penalizing speakers who imply predictions but use vague or weaselly language that avoids falsification. When we are happy with performance, we will use UF's supercomputer to run this at scale. (A rough sketch of the pipeline appears after the reasons below.)
Why:
(1) More trustworthy than fact-checking. Fact-checking applies to claims that are already contested. By contrast, there is usually little debate about whether a prediction has or has not been met, which is why prediction markets are able to clear without constant controversy.
(2) Reputation scores for experts. Philip Tetlock used predictions to test whether experts do a good job aggregating and synthesizing knowledge. His studies found, roughly, “meh, a bit better than chance.” But that kind of grading couldn’t be done at scale before LLMs.
(3) Expert reputation is really important! Most knowledge is built from the testimony of others rather than from direct experience and observation, so, to quote Michael Huemer: “The key question would then be why do you think that person is reliable about p [the proposition]. Perhaps you have heard this person speak about p-related matters before, you have frequently checked on her p-related assertions, and you have generally found them to be correct. Thus, you inductively inferred that this person is reliable about things related to p… Sad to say, I really haven't checked on very many people at all. I haven't taken a large, random sample of assertions by various humans and verified their truth, as would seem to be required for an inductive inference to the general veracity of people's statements.”
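For concreteness, here is a rough Python sketch of the grading pipeline. The function names and the toy prediction are placeholders, and the actual LLM and market-resolution calls are stubbed out; the point is just the control flow (extract predictions, check the due date, then grade or store for later).

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Prediction:
    source: str      # outlet or author who made the prediction
    claim: str       # the concrete, gradeable prediction
    due_date: date   # date by which the prediction can be graded
    hedged: bool     # True if "might"-style language was used

def extract_predictions(article_text: str, source: str) -> list[Prediction]:
    # Stand-in for the LLM pass that finds concrete-enough-to-grade predictions.
    # Returns a hard-coded toy example so the sketch runs end to end.
    return [Prediction(source, "Candidate X will win the November election",
                       date(2024, 11, 5), hedged=False)]

def judge_outcome(prediction: Prediction) -> bool:
    # Stand-in for grading a due prediction against the historical record, or
    # against the resolution of a matching Polymarket/Kalshi market.
    return False

def grade_article(article_text: str, source: str, today: date,
                  pending: list[Prediction]) -> dict[str, float]:
    """Grade every due prediction in one article; store the rest for later."""
    graded = correct = 0
    for p in extract_predictions(article_text, source):
        if p.due_date > today:
            pending.append(p)   # not due yet; revisit on a later run
            continue
        graded += 1
        correct += judge_outcome(p)
    return {"graded": graded,
            "accuracy": correct / graded if graded else float("nan")}

pending: list[Prediction] = []
print(grade_article("…article text…", "Example Daily", date(2025, 1, 1), pending))
# {'graded': 1, 'accuracy': 0.0}
```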
2. Grading for Cherry-picking. We are also building a similar LLM tool that grades news stories on whether misleading examples are used to support a narrative. The program scans a news story to identify explicit or implicit factual claims along with the evidence or examples offered to support them. It then does an independent assessment of available data to see whether the examples are representative given the claim. This almost always boils down to comparing two ratios, so cherry-picking can usually be detected with good data on four numbers (two numerators, two denominators). Bryan's post on the "Chinese robbers" fallacy would get picked up by this.
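As a toy illustration of the four-number check (all figures below are made up): does the rate implied for the group a story singles out actually exceed the baseline rate?

```python
def rate_ratio(group_cases: int, group_pop: int,
               baseline_cases: int, baseline_pop: int) -> float:
    """Ratio of the singled-out group's rate to the baseline rate.
    A ratio near or below 1.0 suggests the examples were cherry-picked."""
    return (group_cases / group_pop) / (baseline_cases / baseline_pop)

# A story lists many robbers from one group: say 900 cases out of a population
# of 1.4 billion, versus a baseline of 250,000 cases out of 330 million.
print(rate_ratio(900, 1_400_000_000, 250_000, 330_000_000))  # ~0.00085: cherry-picked
```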
3. Proportion of epistemically useless content. We have also just started to build LLM tools that parse content into (a) premises, (b) claims, (c) arguments, (d) evidence, and (e) something that is none of that stuff. (e) is bad. It's usually unjustified narrative control or emotional appeals. What proportion of a news story is (e)? Also, we are hoping to make a tool that looks for logical fallacies among the (c) arguments, and this might be a good alternative to the fact-checking approach because you don't need any source of ground truth.
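Here is a minimal sketch of the proportion-of-(e) metric; classify_sentence is a placeholder for the LLM classifier, with a trivial rule substituted so the snippet runs.

```python
from collections import Counter

CATEGORIES = ("premise", "claim", "argument", "evidence", "none_of_the_above")

def classify_sentence(sentence: str) -> str:
    # Stand-in for an LLM call that maps each sentence to one of CATEGORIES.
    return "none_of_the_above" if "outraged" in sentence.lower() else "claim"

def useless_proportion(sentences: list[str]) -> float:
    """Share of sentences that are neither premise, claim, argument, nor evidence."""
    counts = Counter(classify_sentence(s) for s in sentences)
    return counts["none_of_the_above"] / max(len(sentences), 1)

print(useless_proportion([
    "The bill passed 52-48.",
    "Critics were outraged and heartbroken.",
]))  # 0.5
```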
Market-Based Institutions Beyond Betting/Prediction Markets
4. News guarantees / bounties. Marshall Van Alstyne (BU), Yonathan Arbel (Alabama), and a few others have written about using contractual agreements that let people sue you if they think you are wrong, as a way to create reliable signals of quality. There are a few ways to set up such a program, but no content creator has taken this up (other than Bryan, of course). If we were able to run field experiments, how should we structure a bounty? Example: the content creator guarantees the factual claims made in a story and deposits $10k in escrow. A challenger would have to pay the costs of a challenge, and then the two sides would put on a trial for an online jury. Alternatively, the two sides could each select a judge who is blinded to who selected them, as in the system at Rootclaim. There are probably other models, too. What type of system would be administrable and widely trusted? If Meta could introduce a new revenue stream by serving as the escrow agent for content creators who agree to a bounty system, with participation boosting their content in the curation algorithm, could this provide a decentralized and credible signal of quality? (Marshall gets credit for this Meta idea.)
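To make the escrow variant concrete, here is a toy settlement rule; the dollar figures and the winner-takes-escrow payout are illustrative assumptions rather than a worked-out design (a real version would also have to fund the jury or judges).

```python
from dataclasses import dataclass

@dataclass
class Bounty:
    escrow: int         # creator's deposit guaranteeing the story's claims
    challenge_fee: int  # stake a challenger must post to force a trial

def settle(b: Bounty, creator_wins: bool) -> dict[str, int]:
    """Net payoffs after the online jury (or blinded judges) decide."""
    if creator_wins:
        # Challenger forfeits the fee; the creator's escrow is returned intact.
        return {"creator": 0, "challenger": -b.challenge_fee}
    # Successful challenge: the challenger is paid out of the escrow.
    return {"creator": -b.escrow, "challenger": b.escrow - b.challenge_fee}

print(settle(Bounty(escrow=10_000, challenge_fee=1_000), creator_wins=False))
# {'creator': -10000, 'challenger': 9000}
```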
5. Prediction Market-Funded Investigative Journalism. Why don’t news companies that put tireless work into collecting and analyzing new information place bets on the prediction markets before publishing their articles? The authorities might consider this illegal, but that would be bad policy and probably unconstitutional under the First Amendment. I suspect that’s not the main reason journalists don’t do it, anyway. What are the other reasons? Could this be a way to establish a new business model for careful journalism? (Thanks to my colleague Peter Molk for this idea.)
Again, the link to the input form is here. Thanks very much to Bryan for allowing me to reach his “beautiful bubble” of regular readers.