About

Origin

Datamule grew out of an open-source document parsing project that John Friedman started during a medical leave of absence from his PhD at UCLA. The document parser needed a large supply of diverse data, and the SEC corpus was chosen to provide it. Datamule was then released as an open-source package for working with SEC data.

SEC rate limits made downloads too slow, so John set up his own SEC archive. It was released to the public behind a Stripe paywall of $1 per 100,000 filings downloaded, which serves as a resource control mechanism.

In April 2025, Datamule was incorporated as an LLC. That summer, Datamule became part of AWS Activate and Cloudflare for Startups. The credits and compute were used to set up cloud infrastructure to process and distribute SEC data at scale.

Mission

At its core, Datamule is about the efficient processing of information. This is an interdisciplinary mission that blends concepts from information theory, machine learning, compression, and distributed computing.

The SEC corpus is large, rich, and diverse, which makes it a good corpus to work on. It is also extremely valuable.

Goals

Make SEC data cheap and easy to use, especially for AI.
Advance information processing.
Profit.

Team

John Friedman

Founder & CEO

John grew up playing Nim and Chomp in Ithaca, New York. He later became interested in economics, working for Simon Johnson, a future Nobel Prize winner and John's coauthor. In 2020, John dropped out of Berkeley to work on the Covid-19 relief effort and to pursue research full-time.