Interesting Articles I've Read in 2024

Below is a collection of interesting articles I’ve read in 2024. Three papers are on differential privacy and adjacent topics. There’s a recent method for differentially private SGD utilizing methods from private query answering, an intuitive watermarking scheme for language models, and a paper from 1986 that proposed $k$-anonymity before it was formalized as a criterion for de-identification. Several papers are on the history of ideas ranging from early twentieth century pragmatism, the synergies and antagonisms between poetry and philosophy, and the relation between periodicals and intellectual progress to the deaths of Analytical Marxism and Effective Altruism.

Read More

An Utterly Incomplete Look at Research from 1824

The French Revolution cast a long shadow over early nineteenth century thought. Many of the selections below are concerned with the aftermath of the Revolution and the Napoleonic Wars both economically and intellectually. The pseudonymous Piercy Ravenstone critiques England’s system of public finance and its handling of the massive debt resulting from the wars. John Stuart Mill reviews a debate about the effects of war-time spending on prices. His father, James Mill, offers analyses of government, jurisprudence, freedom of the press, and international law along utilitarian lines for the Encyclopedia Britannica. William Stevenson argues that the moral convulsion among the people caused by the French Revolution opened the door for intellectual progress.

Read More

Interesting Books I've Read in 2024

Below are some interesting books I’ve read in 2024. The bulk of what I’ve been reading is either wrapped up in my research on differential privacy or for my project looking back at research from 100, 150, and 200 years ago. Two of the four recommendations below are from the latter project, but all four are of interest from an historical perspective. Wisdom’s Workshop is a history of the American research university from Medieval times to the present. William Stanley Jevons’ The Principles of Science is a wide-ranging treatment of logic and philosophy of science from 1874 that’s bursting with ideas - some more developed than others. Ballyhoo! is a history of professional wrestling and combat sports from its outlaw roots in the late nineteenth century through the first half of the twentieth century. Finally, John Ramsay McCulloch’s Discourse on Political Economy from 1824 is the first history of economic thought from the era of the classical economists.

Read More

Pseudoinverse of Repeating Vertical Block Matrix

A block matrix is a structured representation of a matrix as partitioned into submatrices. For example, let $A, B, C, D$ be $m \times n$ matrices. Then \(\begin{bmatrix} A & B \\ C & D \end{bmatrix}\) is a $2m \times 2n$ block matrix. Vertical (or horizontal) block matrices are a special case in which the blocks are stacked in a single column (or row). In this post, we discuss a simple result for calculating the pseudoinverse of a repeating vertical block matrix, i.e., a vertical block matrix where each block is identical. This result has been mentioned in a few places on Wikipedia and elsewhere but not with a direct proof.
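
For orientation, here is a sketch of what such a result looks like. The excerpt does not state the identity, so treat the form below as an assumption about the post's result; the symbols $k$ (number of copies) and $M$ (the stacked matrix) are introduced only for this sketch. For an $m \times n$ matrix $A$ stacked $k$ times:

$$
M = \begin{bmatrix} A \\ \vdots \\ A \end{bmatrix}
\quad\Longrightarrow\quad
M^{+} = \frac{1}{k}\begin{bmatrix} A^{+} & \cdots & A^{+} \end{bmatrix}
$$

One way to check this is against the four Moore-Penrose conditions:

$$
\begin{aligned}
M^{+}M &= \tfrac{1}{k}\,k\,A^{+}A = A^{+}A && \text{(Hermitian)} \\
MM^{+} &= \tfrac{1}{k}\begin{bmatrix} AA^{+} & \cdots & AA^{+} \\ \vdots & & \vdots \\ AA^{+} & \cdots & AA^{+} \end{bmatrix} && \text{(Hermitian)} \\
MM^{+}M &= M\,(A^{+}A) = M && \text{since } AA^{+}A = A \\
M^{+}MM^{+} &= (A^{+}A)\,M^{+} = M^{+} && \text{since } A^{+}AA^{+} = A^{+}
\end{aligned}
$$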

Read More

New Paper - Efficient and Private Marginal Reconstruction with Local Non-Negativity

I have a new paper out with my colleagues from UMass Amherst and Penn State: Efficient and Private Marginal Reconstruction with Local Non-Negativity. Marginals are statistics that capture low-dimensional structure and correlations among sets of attributes in a dataset and are an important building block for differentially private algorithms. A marginal can be decomposed into a set of queries called residuals. Our paper studies how to decompose noisy answers to marginals into noisy answers to residuals and how to recombine noisy answers to many residuals into noisy answers to marginals.

Read More

An Utterly Incomplete Look at Research from 1924

The relation between science and society is the primary theme of the selections below. J. B. S. Haldane imagines a future where science frees humanity from the constraints of nature; Bertrand Russell sees this future as a collectivist prison of sorts. Philosophers of various stripes, whether idealist or otherwise, seek to integrate results from relativity theory into their metaphysical views about the world. Some economists seek to replicate the success of the natural sciences by overhauling classical and marginalist economics.

Read More

New Paper - Joint Selection

I have a new paper out with my colleagues from UMass Amherst: Joint Selection: Adaptively Incorporating Public Information for Private Synthetic Data. Many data sources that researchers and policy makers are interested in are updated through periodic releases, ranging from large-scale surveys such as the Current Population Survey (CPS) to governmental administrative records. Since these datasets often contain sensitive information, it may be the case that only aggregate statistics are released or, alternatively, that a synthetic dataset is constructed and released (in either case, hopefully under differential privacy). We introduce a new method, JAM-PGM, that uses public data to improve the quality of synthetic data generated under differential privacy. In the case of periodically released datasets, public data could include prior releases. In the paper, we look at cases where the public and private data do not follow the same distribution, which is what one would expect if using such techniques in the wild.

Read More

Interesting Articles I've Read in 2023

Below is a collection of interesting articles I’ve read in 2023. Several papers focus on foundations or methodology ranging from mathematical logic and history to causal inference. Two papers look like jokes or hoaxes but are insightful in the end. We consider the history of women in analytic philosophy and the first law and economics program from the early 20th century. There is a primer on synthetic data for policymakers and a short look at working with the pseudoinverse for highly structured matrices. Finally, we survey a scuffle between internet subcultures and consider income inequality globally.

Read More

An Utterly Incomplete Look at Research from 1823

For our areas of interest, research in the English-speaking world during the early 1820s was heavily focused on economics. This is largely a result of numerous public debates regarding economic policy in Britain such as the Bullionist Controversy and the various Corn Laws. Political economists carried out these debates in periodicals, pamphlets, and the occasional tome. Unsurprisingly, economics items make up half of the selections below.

Read More

Interesting Books I've Read in 2023

Below are some interesting books I’ve read in 2023. Much like last year, I’m giving six recommendations rather than the three of years past. Lost Continents looks at Atlantis and similar fictional places as rhetorical devices. Slouching Towards Utopia offers a narrative economic history of the 20th century that focuses on the interaction between policy and economic thought. The Hound of the Baskervilles is a wonderfully Gothic Sherlock Holmes novel. The Rise of Universities explores early universities in the 12th and 13th centuries, and Colonial New England on 5 Shillings a Day is an imagined travel guide for exploring pre-Revolution New England. The Book of Yōkai surveys creatures from Japanese folklore that have found their way into all sorts of popular media.

Read More

An Utterly Incomplete Look at Research from 1873

The late 1860s and early 1870s are the era of the Victorian periodical. While outlets existed for the natural sciences such as Nature and Philosophical Transactions of the Royal Society, we are a few years away from widespread English-language specialized journals in philosophy, economics, and so forth. Mind debuts in 1876, the Quarterly Journal of Economics (QJE) and Political Science Quarterly (PSQ) in 1886, and the Journal of the American Statistical Association in 1888.

Read More

An Utterly Incomplete Look at Research from 1923

The aftermath of the First World War motivated a flurry of research across the humanities and sciences. Practical problems such as how currencies should be managed in light of devaluation and indebtedness and how states should communicate with one another to resolve disputes gave impetus to new ways of thinking about the economy, the structure of society, and the international order. What should be the focus of central banks? What should be the government’s role in the economy or society in general? How can we best utilize progress in the natural and social sciences to diminish the possibility of future war and improve the quality of life? This is the single greatest thread that unites many of the selections below. Keynes’ A Tract on Monetary Reform and Zimmern’s Nationalism and Internationalism are prime examples.

Read More

New Paper - The Shape of Explanations: A Topological Account of Rule-Based Explanations in Machine Learning

I have a new paper out that will be presented at the AAAI 2023 Workshop on Representation Learning for Responsible Human-Centric AI. This paper introduces a formal model to explore rule-based explanations for classifiers. Explanations of this sort explain a classification by providing a simple sufficiency condition. For example, if an applicant’s loan is rejected by a predictive model, a rule-based explanation may be that the application belongs to the group with outstanding debt greater than $X$, missed payments greater than $Y$, etc. and every application (or close to every application) in this group is rejected. The key observation is that this rule completely describes a region in the feature space. One way to think about a topology on a space is that it provides a language for describing subsets of the space with some being more descriptively complex than others (which we can describe using known hierarchy results such as the Borel hierarchy). Using this, we prove that a classifier being explainable is equivalent to the inverse image of each label being the union of a descriptively simple set and a small set.

Read More

Interesting Books I've Read in 2022

Below are some interesting books I’ve read in 2022. In past years, I’ve offered three recommendations; however, this has been a great reading year, so I’m including six. Fading Foundations applies a novel mathematical approach to a perennial problem in epistemology. 1177 B.C. and The Isles detail the histories of the Bronze Age collapse and the British Isles, respectively. How Carrots Won the Trojan War offers some whimsical miscellany about vegetables and their kin. Dr. Jekyll and Mr. Hyde is a cornerstone of nineteenth century Gothic literature. Pythagoras and the Pythagoreans examines the influence of Pythagoreanism on the history of Western thought.

Read More

Interesting Articles I've Read in 2022

Below is a collection of interesting articles I’ve read in 2022. There are two papers in differential privacy: A Better Privacy Analysis of the Exponential Mechanism and Differentially Private Approximate Quantiles. There’s an article in the history of mathematical philosophy and a survey introduction to an area of logic: The introduction of topology into analytic philosophy and Incomplete and Utter Introduction to Modal Logic. Two papers border on the practical side of the philosophy of science: Stylized Facts in the Social Sciences for social science research and Does Academic Research Destroy Stock Return Predictability? for finance. The World Putin Wants discusses Russia’s rhetoric from the war in Ukraine. To round out the collection, we explore the influence of a wayward early 20th century archaeologist, build a simple mathematical model of tennis, ponder whether NLP models have intentional states, and consider the role of private property among early humans.

Read More

New Paper - AIM: An Adaptive and Iterative Mechanism for Differentially Private Synthetic Data

I have a new paper out on arXiv with my colleagues at UMass Amherst (Ryan McKenna, Daniel Sheldon, and Gerome Miklau). This paper proposes a new method for generating differentially private synthetic data that’s tailored to one’s use case by only capturing information from the desired marginal distributions. This approach builds on prior work on generating private synthetic data by representing a data distribution as a probabilistic graphical model. See this post for a summary of current techniques.

Read More

Interesting Books I've Read in 2021

Below are some interesting books I’ve read in 2021. Despite having a slow reading year (which we’ve probably all had), I’ve come up with a handful of suitable recommendations. The Open Society and Its Complexities argues that moral diversity can be socially beneficial. A Man of Misconceptions profiles a 17th century polymath caught between magical thinking and early scientific philosophy. Rationalizing Capitalist Democracy is a critical look at the origins of rational choice theory and its applications in the moral and social sciences.

Read More

Interesting Articles I've Read in 2021

Below is a collection of interesting articles I’ve read in 2021. The dominant theme this year is differential privacy and learning theory (which is unsurprising given my current research). Several fall into this category: A Simple and Practical Algorithm for Differentially Private Data Release, Privacy in Pharmacogenetics, A dynamic logic for learning theory, and Exploring Connections Between Active Learning and Model Extraction. Two articles look at issues in recent academic history: The Prehistory of biology preprints and Scientific community in a divided world. The Cult, the Cultic Milieu and Secularization is a foundational paper in the sociology of religion and esotericism. The remaining papers span Hellenistic historiography, evidence for axioms in mathematics, and a new typology for pseudoscience and pseudophilosophy.

Read More

New Paper - Earnings Mobility and the Great Recession

I have a new paper in Social Science Quarterly with my collaborators Dave Sjoquist and Sally Wallace at Georgia State University. This paper compares earnings mobility for low-wage workers before and after the Great Recession using linked administrative data from the state of Georgia. This work utilizes our mobilityIndexR R package to estimate transition matrices and calculate mobility indices i.e. measures of mobility. Click here for an introduction to the package.

Read More

Resources for Learning Computational Complexity Theory

Computational complexity theory studies which computational problems can feasibly be solved and the resources required to solve them. It is useful to any field that thinks about the analysis and design of algorithms (which is much broader than one may first think). While plenty of notes and lectures are available online, these are scattered across university course pages, YouTube, etc. This guide aims to bring this material together for learning computational complexity theory at the introductory graduate level, especially for those without a formal CS background.

Read More

Introducing mobilityIndexR

mobilityIndexR is an R package for calculating transition matrices and indices to measure mobility within a sample. For instance, tracking the income of a cohort over some period of time allows one to measure the economic mobility of that cohort, and tracking the grades of students in a class allows one to measure grade mobility. This post is an invitation to the package and an introduction to the ideas it implements. For a general introduction to economic mobility, see Further Reading at the bottom of the post.
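
To make the idea of a transition matrix concrete, here is a minimal Python sketch. It does not use or reproduce the mobilityIndexR API: the function names (`quantile_bins`, `transition_matrix`, `share_moving`), the rank-based binning, and the toy incomes are all assumptions made for illustration, and `share_moving` is a deliberately simple index rather than one of the package's.

```python
from typing import List, Sequence


def quantile_bins(values: Sequence[float], n_bins: int) -> List[int]:
    """Assign each value a bin 0..n_bins-1 by its rank in the sample.

    Ties are broken by position, which is an arbitrary choice for this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def transition_matrix(start: Sequence[float], end: Sequence[float],
                      n_bins: int) -> List[List[float]]:
    """Row r, column c: share of those starting in bin r who end in bin c."""
    rows = quantile_bins(start, n_bins)
    cols = quantile_bins(end, n_bins)
    counts = [[0] * n_bins for _ in range(n_bins)]
    for r, c in zip(rows, cols):
        counts[r][c] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def share_moving(start: Sequence[float], end: Sequence[float],
                 n_bins: int) -> float:
    """A simple illustrative index: fraction of the cohort that changes bin."""
    rows = quantile_bins(start, n_bins)
    cols = quantile_bins(end, n_bins)
    moved = sum(1 for r, c in zip(rows, cols) if r != c)
    return moved / len(rows)


# Hypothetical incomes for a cohort of four people in two periods.
incomes_start = [10, 20, 30, 40]
incomes_end = [15, 60, 50, 12]

matrix = transition_matrix(incomes_start, incomes_end, n_bins=2)
share = share_moving(incomes_start, incomes_end, n_bins=2)
```

Each row of `matrix` sums to one, so it reads as "given where someone started, where did they end up"; an index then compresses that matrix into a single number.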

Read More

Interesting Research Programs from the 2010s

The idea of this post is to introduce and discuss several interesting research programs from the past decade. A research program (or programme) refers to a common thread of research that shares similar assumptions, methodology, etc. The list below contains a variety of research programs: some on topics that have broad appeal e.g. explainable machine learning and mental disorder; others moved the direction of entire industries e.g. advances in computer vision and cryptocurrencies; and others still are more niche areas that I happen to be deeply interested in e.g. topological learning theory, privacy attacks on ML models, and graph-theoretic approaches to epistemology.

Read More

Top Books I've Read in 2020

Below are the top books I’ve read in 2020. This year, there seems to be no particular theme; however, each of the three books below is rather short and readable. The History of Phlogiston Theory aims to dispel myths about phlogiston theory and provide a brief history of chemistry in the 18th century as the field moved from alchemy to quantitative methods. The Abraham Dilemma develops a theory of delusion as a mental disorder, with a focus on the peculiar complications of religious delusion, informed by the experiences of clinicians and patients. Finally, Libra Shrugged dives into Facebook’s (currently unsuccessful) attempt to launch a cryptocurrency at a worldwide scale.

Read More

Top Articles I've Read in 2020

Below are the top articles I’ve read in 2020. This year’s list contains a nice mix of types of articles. A prominent theme in the list is economics and economic methodology with A Theory of Optimum Currency Areas, Economic Modelling as Robustness Analysis, and Thoughts on DSGE Macroeconomics. The Theory of Interstellar Trade is an oddball article from a young Paul Krugman. An introduction to (algorithmic) randomness is an excellent invitation to a technical area of mathematical logic, and Comments on Economic Models, Economics, and Economists is a fun and effective book review on methodology. Finally, there are four thought-provoking articles across political rhetoric, machine learning privacy, social contract theory, and international relations: The Paranoid Style in American Politics, Stealing Machine Learning Models via Prediction APIs, Self-organizing moral systems, and The End of Grand Strategy.

Read More

Economic Methodology Meets Interpretable Machine Learning - Part IV - Current State of Economic Methodology

This post looks at the current state of economic methodology with respect to the realistic assumptions debate. After briefly surveying the history of economic methodology, we’ll walk through two recent arguments in the realistic assumptions debate: one in favor of instrumentalism as a theoretical ideal and the other favoring a limited form of realism in practice. In light of these arguments, I’ll argue that practitioners can still adopt some form of limited realism in practice if such an approach is an expedient guide to creating models with desirable properties.

Read More

Economic Methodology Meets Interpretable Machine Learning - Part III - Responses to Friedman's 1953 on the Realism of Assumptions

This post discusses three responses to Friedman 1953 (which we introduced in Part II). Friedman’s contention, termed the “F-Twist” by Samuelson, is that economic theories should be evaluated only on their predictions within some specified domain. The F-Twist puts Friedman on the instrumentalism end of the realism of assumptions debate. The responses by Paul Samuelson, Stanley Wong, and Dan Hausman discussed below provide various lenses through which to view the problem of the realism of assumptions and, ultimately, in my view, render the F-Twist untenable in isolation.

Read More

Economic Methodology Meets Interpretable Machine Learning - Part II - Friedman's 1953

This post introduces Milton Friedman’s 1953 essay The Methodology of Positive Economics, which takes the position that economic theories should be evaluated only on their predictions within some specified domain. This article has been called “the most cited, the most influential, and the most controversial piece of methodological writing in 20th century economics” and plays the foil (and occasionally the bogeyman) in much of the economic methodology literature, so much so that it is often referred to as Friedman 1953 or even F53.

Read More

Shadow on Pop!_OS/Ubuntu 19.10

I recently moved to a System76 Darter Pro running Pop!_OS 19.10 as my primary laptop (review coming soon). As you might have guessed by the version number, Pop!_OS is System76’s fork of Ubuntu. With this move, I switched to Shadow as my cloud gaming service, since they have a supported Linux client - no messing with Wine, dual boots, or VMs required! The Shadow Linux client is built to be compatible with 18.04+ but didn’t work right away due to some Video Acceleration issues. Below is a guide for getting Shadow running on 19.10 based on my experience troubleshooting.

Read More

Some Fun with Knights and Knaves

I’m currently working through Raymond Smullyan’s The Gödelian Puzzle Book and came across a fun problem that serves as a good starting point for new readers of Smullyan. Smullyan is well known for (among many other things) producing several books of logic puzzles that introduce ideas from mathematical and philosophical logic in an accessible but still technical way. These books are often formatted so that each chapter has an introduction to the relevant characters, ideas, and setting, several problems for the reader to work through, and the solutions to the problems.

Read More

Economic Methodology Meets Interpretable Machine Learning - Part I - Interpretability, Explainability, and Black Boxes

This post is the first entry in Economic Methodology Meets Interpretable Machine Learning and briefly introduces the ideas of black boxes, explainability, and interpretability for machine learning models and offers arguments for and against deploying only interpretable models in the wild when interpretable models are available. The debate over interpretable models in machine learning is far from settled and has been getting much attention in recent years.

Read More

Economic Methodology Meets Interpretable Machine Learning - Introduction

In this series of posts, we will develop an analogy between the realistic assumptions debate in economic methodology and the current discussion over interpretability when using machine learning models in the wild. While this connection may seem fuzzy at first, the past seventy years or so of economic methodology offers many lessons for machine learning theorists and practitioners to avoid analysis paralysis and make progress on the interpretability issue - one way or the other. But first, what’s going on with these two debates?

Read More

Top Books I've Read in 2019

Here are the top three books I’ve read in 2019, presented below in chronological order by year published. While quite the cliché, the theme that emerged this year is to not judge a book by its cover. While Measure and Category by John Oxtoby appears to be a terse math treatise, it is a short, well-paced, lucid read (though requiring some prerequisites). Braudel’s The Structures of Everyday Life digs deeply into the minutiae of common experience in early modern Europe rather than providing overarching historical narrative. To finish, Haskel and Westlake’s Capitalism without Capital is a well-researched - if at times dull - look at intangible assets from an economic perspective whose title reminds one of a political polemic.

Read More

Top Articles I've Read in 2019

Below are the top eleven articles I’ve read in 2019. A theme of methodology runs through this set of papers, especially statistical methodology. There’s also some fun miscellany mixed in with blockchain (whose craze seems like a lifetime ago now), unicorns, and the history of the English language. To my surprise, all of these articles are from the present decade. They are presented in chronological order.

Read More

Resources for Learning Measure Theory

When approaching measure theory for the first time, the ideas can seem opaque and unmotivated. This is amplified since many students of measure theory are not coming from a strictly mathematics background and may be approaching the material on their own outside of the classroom. In addition to first-year math graduate students and advanced math undergraduates, students in stats, economics, the hard sciences, etc. will find their way into learning measure theory. This is a guide to resources for learning measure theory that tries to keep in mind that many (myself included) approach the material with an atypical background.

Read More

Parsing Nested JSON Records in Python

JSON is the typical format used by web services for message passing, and it is also relatively human-readable. Despite being more human-readable than most alternatives, JSON objects can be quite complex. For analyzing complex JSON data in Python, there aren’t clear, general methods for extracting information (see here for a tutorial on working with JSON data in Python). This post provides a solution if one knows the path through the nested JSON to the desired information.
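
As a rough illustration of the path-based idea, here is a minimal sketch using only the standard library. It is not the post's own solution: the helper name `extract_path`, its `default` parameter, and the sample record are hypothetical, chosen just to show how a known path of keys and list indices can be followed through nested JSON.

```python
import json
from typing import Any, Sequence, Union

Key = Union[str, int]


def extract_path(obj: Any, path: Sequence[Key], default: Any = None) -> Any:
    """Follow `path` through nested dicts and lists.

    Returns `default` as soon as a step is missing or has the wrong type.
    """
    current = obj
    for key in path:
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif (isinstance(current, list) and isinstance(key, int)
              and -len(current) <= key < len(current)):
            current = current[key]
        else:
            return default
    return current


# Hypothetical nested record, parsed from a JSON string.
record = json.loads(
    '{"user": {"name": "Ada", "emails": [{"address": "ada@example.com"}]}}'
)

email = extract_path(record, ["user", "emails", 0, "address"])
missing = extract_path(record, ["user", "phone", "mobile"], default="n/a")
```

Returning a default instead of raising keeps the helper convenient when records are irregular, at the cost of hiding the exact step where the path failed; raising a `KeyError` with the offending key would be the stricter alternative.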

Read More

Introduction to Structure of Epistemic Justification via the Telephone Game (part I)

In epistemology, we often think of the things we believe as discrete propositions. For instance, you may believe that there is a computer screen in front of you. But how is this belief justified? One way of justifying a belief is by offering a reason, which can itself also be a proposition. For this next proposition, we can then ask how it is justified and so on. The regress problem asks the following question: if any of the things we believe are justified, then what is the structure of that justification? Does the justification question not just keep getting passed backward forever with reasons for reasons for reasons?

Read More

Top Books I've Read in 2018

Here are the top three books I’ve read in 2018. They are presented below in chronological order. While these three books seem rather disparate, they are bound together by themes of innovation, conflict, and ideology.

Read More

Top Articles I've Read in 2018

Here are the top eleven papers I’ve come across in 2018.$^*$ These papers are mostly recent publications (within the last two years) with some older ones peppered in. They are in chronological order below.

Read More