Filer Data Descriptive Statistics
Exploring the statistical characteristics of listed and unlisted entities in SEC filings
Summary Statistics
Key metrics comparing listed and unlisted entities in the dataset.
Listed Entities
Unlisted Entities
Top Business Cities
Distribution of entity registrations across top business cities for both listed and unlisted entities.
Top Countries
Distribution of entities across countries, excluding the United States, which dominates the dataset.
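A ranking like this can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code: the function name `top_countries`, the lowercase country labels, and the sample records are all assumptions made for the example.

```python
from collections import Counter

def top_countries(countries, exclude=("united states",), n=10):
    """Count entities per country, skipping blanks and excluded countries."""
    excluded = {c.lower() for c in exclude}
    counts = Counter(
        c.strip().lower()
        for c in countries
        if c and c.strip() and c.strip().lower() not in excluded
    )
    # most_common returns (country, count) pairs, largest count first.
    return counts.most_common(n)

# Hypothetical sample values, for illustration only.
sample = ["United States", "canada", "United States", "Canada", "israel", "", None]
print(top_countries(sample))
```

Dropping the dominant country before ranking keeps the remaining bars on a comparable scale, which is the reason the chart excludes it.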
Top ZIP Codes
Distribution of entities across top ZIP codes by registration volume.
Entity Categories
Distribution of entities across different business categories.
Entity Types
Distribution of entities across different entity types (operating, investment, other).
Fiscal Year End Distribution
Distribution of entities by fiscal year end date, using a logarithmic scale since December 31st dominates the dataset.
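To see why a logarithmic scale helps here, the sketch below compares each category's linear share with its log10 bar height. The counts, the MMDD key format, and the helper name `log_heights` are hypothetical and chosen only to illustrate the effect.

```python
import math

# Hypothetical counts per fiscal year end (MMDD), for illustration only.
fye_counts = {"1231": 500_000, "0630": 20_000, "0930": 15_000, "0331": 9_000}

def log_heights(counts):
    """Map each positive count to log10(count); zero counts are dropped
    because log10(0) is undefined."""
    return {key: math.log10(value) for key, value in counts.items() if value > 0}

total = sum(fye_counts.values())
for fye, height in log_heights(fye_counts).items():
    share = fye_counts[fye] / total
    print(f"{fye}: linear share {share:.1%}, log10 height {height:.2f}")
```

On a linear axis the smaller categories would be nearly invisible next to the dominant one; on a log axis their heights differ by only a unit or two.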
Insider Transactions
Distribution of entities by whether they have insider transactions, shown separately for issuers and owners.
Panels: Listed Entities and Unlisted Entities, shown for both issuers and owners.
Owner Organizations
Distribution of entities by owner organization types, showing the top 10 categories.
Industry Distribution (SIC)
Distribution of entities by Standard Industrial Classification (SIC) codes, showing the top categories.
Methodology
This analysis provides descriptive statistics for filer data collected from SEC filings. The visualizations compare patterns between listed and unlisted entities across various dimensions.
Key methodology details:
- Data is sourced from SEC entity filing metadata
- Listed entities are those publicly traded on stock exchanges
- Unlisted entities are private companies and individuals that still file with the SEC
- For categorical visualizations, categories representing less than 1% of entities are grouped as "Other"
- All text data has been cleaned by removing quotation marks and converting to lowercase
- ZIP codes are standardized to their 5-digit base form (any ZIP+4 extension is dropped)
- All data is updated regularly through an automated process
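The cleaning and grouping steps listed above could look roughly like the sketch below. It is an illustration under stated assumptions, not the repository's implementation: the helper names (`clean_text`, `normalize_zip`, `group_rare`) are hypothetical, "quotes" is assumed to mean both single and double quote characters, and ZIP values that do not look like a US ZIP code are returned as `None`.

```python
import re
from collections import Counter

def clean_text(value):
    """Remove quote characters, trim whitespace, and lowercase.
    Assumption: both ' and " count as quotes to strip."""
    return value.replace('"', "").replace("'", "").strip().lower()

def normalize_zip(value):
    """Return the 5-digit base of a US ZIP or ZIP+4 code, else None."""
    match = re.match(r"^\s*(\d{5})(?:-?\d{4})?\s*$", value)
    return match.group(1) if match else None

def group_rare(values, threshold=0.01, other_label="Other"):
    """Count values, folding categories whose share is below
    `threshold` into a single `other_label` bucket."""
    counts = Counter(values)
    total = sum(counts.values())
    grouped = Counter()
    for category, count in counts.items():
        key = category if count / total >= threshold else other_label
        grouped[key] += count
    return dict(grouped)

# Hypothetical inputs, for illustration only.
print(clean_text('  "ACME Corp"  '))
print(normalize_zip("10001-1234"))
values = ["operating"] * 150 + ["investment"] * 49 + ["shell"]
print(group_rare(values))
```

In this sample, "shell" accounts for 0.5% of the 200 values, so it falls under the 1% threshold and is counted under "Other".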
For more details on the data processing methodology, visit the GitHub repository.