Anticipating SEC Filings
There are faster ways to get SEC filings. I've previously open sourced code showing how: the fastest way to get SEC Filings. Now, I'm including a basic implementation on Datamule.

This is accessible via SEC Filing Emails, Webhooks, and will soon be made available on Websocket V3. See products. You can also use Datamule's complimentary SEC filings feed.
Here's how it works:
- Most companies and investors get notified of a new filing after it is released on the RSS feed.
- This is typically 30 seconds after the filing's metadata page is generated.
- This page can be anticipated, by predicting the url, which requires knowing its accession.
- accessions mostly follow the format {cik of company filing on behalf}- {year}-{sequential numbering of filings submitted so far this year}.
- So, I predict it.
I'm currently using a naive algorithm that should anticipate ~10% of filings. It costs me $20/year to run. For ~$80/year I could get this to ~40%.
> Note: after I published the fastest way to get SEC Filings, a few trading firms reached out. They had been working on the same problem; but had not been able to make it work. Their issue was that they had not realized that the metadata page url could be constructed without knowing the cik of the company being filed for. This made the search space large, and therefore expensive.
> Note 2: metadata is published before filing content. You can typically get filing content a bit faster, especially for larger file like 10-Ks. Generally e.g. 30s faster vs 2s faster, metadata to data.
Preliminary Results
After collecting data for about a month. We get:
- period_start_date: 2026/05/27
- total_filings_in_period: 71289
- filings_with_anticipate: 7440
- p25_seconds_faster: 1
- median_seconds_faster: 2
- p75_seconds_faster: 8
- p90_seconds_faster: 20
- p99_seconds_faster: 53
- max_seconds_faster: 199
- avg_seconds_faster: 6.93984962406015
This cost $0.002 to run.
Code
from datamulehub import databases
rows = databases.read_query(
"""
SELECT
MIN(start_date) AS period_start_date,
COUNT(*) AS total_filings_in_period,
SUM(CASE WHEN anticipate_time IS NOT NULL THEN 1 ELSE 0 END) AS filings_with_anticipate,
SUM(CASE WHEN first_sec_time IS NOT NULL THEN 1 ELSE 0 END) AS filings_with_rss_or_efts,
SUM(CASE WHEN anticipate_time IS NOT NULL AND first_sec_time IS NOT NULL THEN 1 ELSE 0 END) AS filings_with_both_sources,
SUM(CASE WHEN first_sec_time > anticipate_time THEN 1 ELSE 0 END) AS anticipate_faster_filings,
CAST(SUM(CASE WHEN anticipate_time IS NOT NULL THEN 1 ELSE 0 END) AS double) / COUNT(*) AS anticipate_coverage_rate,
CAST(SUM(CASE WHEN first_sec_time > anticipate_time THEN 1 ELSE 0 END) AS double)
/ NULLIF(SUM(CASE WHEN anticipate_time IS NOT NULL AND first_sec_time IS NOT NULL THEN 1 ELSE 0 END), 0)
AS anticipate_win_rate_when_comparable,
MIN(CASE WHEN first_sec_time > anticipate_time THEN first_sec_time - anticipate_time END) AS min_seconds_faster,
approx_percentile(CASE WHEN first_sec_time > anticipate_time THEN first_sec_time - anticipate_time END, 0.25) AS p25_seconds_faster,
approx_percentile(CASE WHEN first_sec_time > anticipate_time THEN first_sec_time - anticipate_time END, 0.50) AS median_seconds_faster,
approx_percentile(CASE WHEN first_sec_time > anticipate_time THEN first_sec_time - anticipate_time END, 0.75) AS p75_seconds_faster,
approx_percentile(CASE WHEN first_sec_time > anticipate_time THEN first_sec_time - anticipate_time END, 0.90) AS p90_seconds_faster,
approx_percentile(CASE WHEN first_sec_time > anticipate_time THEN first_sec_time - anticipate_time END, 0.99) AS p99_seconds_faster,
MAX(CASE WHEN first_sec_time > anticipate_time THEN first_sec_time - anticipate_time END) AS max_seconds_faster,
AVG(CASE WHEN first_sec_time > anticipate_time THEN first_sec_time - anticipate_time END) AS avg_seconds_faster
FROM (
SELECT
sd.accessionnumber,
sd.filingdate,
start_boundary.start_date,
m.anticipate_time,
m.first_sec_time
FROM sec_submission_details_table sd
CROSS JOIN (
SELECT date_add('day', 1, date(from_unixtime(MIN(detected_time)))) AS start_date
FROM monitor_dumps
WHERE lower(source) = 'anticipate'
) start_boundary
LEFT JOIN (
SELECT
accession,
MIN(CASE WHEN lower(source) = 'anticipate' THEN detected_time END) AS anticipate_time,
MIN(CASE WHEN lower(source) IN ('rss', 'efts') THEN detected_time END) AS first_sec_time
FROM monitor_dumps
GROUP BY accession
) m
ON CAST(sd.accessionnumber AS varchar) = CAST(m.accession AS varchar)
WHERE sd.filingdate >= CAST(start_boundary.start_date AS varchar)
) joined
"""
)
print(rows)