Sadly, there are still 994 days until the next presidential inauguration.

Welcome…

April 30th, 2026

This is a pretty interesting paper published in the April 30 edition of Science (Peter G. Brodeur et al., Performance of a large language model on the reasoning tasks of a physician. Science 392, 524-527 (2026). DOI: 10.1126/science.adz4433).

Open access available here:
https://www.science.org/doi/10.1126/science.adz4433
It discusses the performance of some earlier OpenAI models (e.g., o1-preview and GPT-4) at generating differential diagnoses, and then looks at how o1 and 4o performed on real-world ED and ICU admissions compared with two Internal Medicine physicians at Beth Israel Deaconess in Boston.

Excerpts from the article:

“The o1 model identified the exact or very close diagnosis (Bond scores of 4 to 5) in 67.1% of cases during the initial ER triage, 72.4% during the ER physician encounter, and 81.6% at admission to the medical floor or ICU—surpassing the two physicians (55.3, 61.8, and 78.9% for Physician 1; 50.0, 52.6, and 69.7% for Physician 2) at each stage.”
and
“We emphasize that our study addresses only text-based performance for both humans and machines; clinical medicine is multifaceted and awash with nontext inputs, including auditory (such as the patient’s level of distress) and visual information (for example, interpretation of medical imaging studies) that clinicians routinely use. Existing studies suggest that current foundation models are more limited in reasoning over nontext inputs (26, 27); future work is needed to assess how humans and machines may effectively collaborate (28) in use of nontext signals.”
Progress apparently continues apace; as these are now “older” models, I would agree with the authors that “Although we expect performance to be sustained or improved with newer models (27, 29), further studies should be done to elucidate how performance varies across models and to study how humans and LLMs may collaborate.”

April 21st, 2026

Every American should listen to the Our Tax System Should Make You Furious episode of Ezra Klein's eponymous podcast, with guest Ray Madoff, a Boston College law professor. It's a superb indictment of our ridiculously convoluted and unfair tax system. Transcript is here. Podcast is here.

Recommended:

• Caveat emptor. From Matt Levine’s excellent Money Stuff, on Polymarket bots:

Anyway Bloomberg’s Carolyn Silverman, Nathaniel Popper and Marie Patino report:

Over 100,000 accounts lost at least $1,000 on Polymarket, one of the largest prediction markets, according to a Bloomberg News analysis of every wallet active since the beginning of 2025. That is almost twice the number that made at least that much.

Among the winners, a majority of the profits were raked in by a tiny slice of what look to be automated bots, based on the Polymarket trade records compiled by the data firm Dune. Everyone else, in aggregate, lost $131 million. …

While prediction markets have been described as peer-to-peer, the Polymarket records suggest the role of the sportsbook is now largely being played by the sort of automated, high-frequency traders that have long dominated other financial markets. The most active accounts on the site were a small proportion of wallets, but accounted for most of the trading volume.

• Thomas Edsall has it right in his NYT guest essay, “Easily the Worst President in U.S. History”:
“The damage President Trump has inflicted on the United States and the world is so enormous and wide-ranging that it is hard to grasp.” Worth the read.

• A nice essay from Charlie Warzel, writing in The Atlantic – “An Incredibly Weird Time to Be Alive.” A snippet: “The world witnessed the best and worst of humanity in a single week.”

• Molly White riffs on Web3:

Let’s just say it’s not all rosy in Web3’s not-so-meta world; caveat emptor…