To enable a more automated approach to gathering information about companies, company_dns was created. This release synthesizes data from the SEC EDGAR repository and Wikipedia. A Medium article entitled “A case for API based open company firmographics” discusses the process and motivation behind the creation of this service.
The embedded web interface is a modern single-page application for exploring SEC EDGAR filings and industry classifications. For detailed documentation on features, architecture, and usage, see html/README.md.
The V3.2.0 release brings comprehensive security hardening and modernizes the web framework:
/docs and ReDoc at /redoc
.venv convention
fastapi>=0.115.0: Modern ASGI framework with automatic OpenAPI docs
pydantic>=2.10.0: Data validation and serialization
slowapi>=0.1.9: Rate limiting for API protection
python-multipart>=0.0.9: Multipart form data support
✅ Malicious paths (/wp-login.php, xmlrpc.php, etc.) return 403 Forbidden
✅ SQL injection patterns blocked before reaching business logic
✅ XSS attempts filtered at middleware level
✅ All endpoints enforce rate limits per IP address
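To make the idea of screening requests before they reach business logic more concrete, here is a minimal sketch in plain Python. It is not the project's actual middleware: the function name `screen_request`, the blocklist, and the regular expressions are all hypothetical and chosen only for illustration, and the sketch assumes the path and query string have already been URL-decoded.

```python
import re

# Hypothetical blocklist and patterns for illustration only; the real
# rules used by company_dns are not shown in this README.
BLOCKED_PATH_FRAGMENTS = ("/wp-login.php", "xmlrpc.php")
SQLI_PATTERN = re.compile(
    r"\bunion\s+select\b|\bor\s+1\s*=\s*1\b|;\s*drop\s+table\b",
    re.IGNORECASE,
)
XSS_PATTERN = re.compile(r"<\s*script\b|javascript:", re.IGNORECASE)


def screen_request(path: str, query: str = "") -> int:
    """Return 403 if the request looks malicious, otherwise 200.

    Assumes `path` and `query` are already URL-decoded.
    """
    lowered_path = path.lower()
    # Reject well-known probe paths outright.
    if any(fragment in lowered_path for fragment in BLOCKED_PATH_FRAGMENTS):
        return 403
    # Scan both the path and the query string for injection patterns.
    target = f"{path}?{query}"
    if SQLI_PATTERN.search(target) or XSS_PATTERN.search(target):
        return 403
    return 200
```

In a real ASGI application a check like this would typically run inside middleware so that rejected requests never reach a route handler. Pattern lists this short will miss many attacks, so treat the sketch as a shape rather than a defense.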
For release notes prior to V3.2.0 (including V3.1.0 and V3.0.0), see the consolidated changelog: CHANGELOG.md.
Installation and setup differ for users and developers. Instructions for both are provided below.
New from V3.0.0 are automated docker builds providing a fresh image on a monthly basis. There are three reasons for this:
The image can be pulled using docker pull ghcr.io/miha42-github/company_dns/company_dns:latest. With the image pulled, you can run it in the foreground using docker run -m 1G -p 8000:8000 ghcr.io/miha42-github/company_dns/company_dns:latest, or in the background using docker run -d -m 1G -p 8000:8000 ghcr.io/miha42-github/company_dns/company_dns:latest. GitHub’s container registry is used to store the images, and more information on this package can be found at company_dns/company_dns.
Assuming you have set up access to GitHub and have a Linux or macOS system of some kind, you’ll need to get the repository.
mkdir ~/dev
cd ~/dev/
git clone git@github.com:miha42-github/company_dns.git
Since the docker build process takes care of data cache creation, Python requirements installation, and other items, getting company_dns running is relatively straightforward. To simplify the process further, the svc_ctl.sh script is provided.
svc_ctl.sh automates build/run/log tasks for company_dns. Common workflows:
~/dev/company_dns: ./svc_ctl.sh build then ./svc_ctl.sh start (background) or ./svc_ctl.sh foreground (interactive).
./svc_ctl.sh tail.
./svc_ctl.sh stop (graceful) or ./svc_ctl.sh kill (forceful).
./svc_ctl.sh rebuild.
./svc_ctl.sh status or ./svc_ctl.sh check-deps.
./svc_ctl.sh cleanup.
NAME:
./svc_ctl.sh <sub-command>
COMMANDS:
help - Display help
check-deps - Validate docker and required files
start - Start the service in the background
stop - Stop the running container gracefully
kill - Forcefully stop the running container
build - Build the Docker image
rebuild - Rebuild image and restart the service
foreground - Run the service in the foreground
tail - View logs of the running container
status - Check container status and port
cleanup - Remove stopped containers and dangling images
Depending on why you are getting the code, it could be run in a Python virtual environment or in a vanilla file system. Regardless, the steps below can be followed to get the service up and running.
Before you get started, it is important to install all prerequisites and then create the cache database.
cd ~/dev/company_dns/company_dns
pip3 install -r ./requirements.txt
python3 ./makedb.py
If everything above completed successfully, company_dns can be started via python3 ./company_dns.py, which runs the service in the foreground.
Regardless of the approach taken to run company_dns, checking that it is operating is important. A quick way to check service availability when running on localhost is to follow this link: http://localhost:8000/. If this is successful, the embedded web interface will display (see screenshot below), describing core capabilities and function, examples with curl, and some helpful links to the company_dns GitHub repository.
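The same availability check can be scripted, for example as part of a deployment or monitoring step. The sketch below uses only the Python standard library. The helper name `service_is_up` is hypothetical, and it assumes that any 2xx response from the root URL means the service is running, which is an assumption rather than documented behavior.

```python
import urllib.error
import urllib.request


def service_is_up(url: str = "http://localhost:8000/", timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers connection refused, DNS failure, timeouts, and HTTP
        # error statuses (HTTPError is a subclass of URLError).
        return False


if __name__ == "__main__":
    print("company_dns reachable:", service_is_up())
```

Run it after starting the service by any of the methods above. A result of False most often means the container or process is not listening on port 8000.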
A live system is available for Mediumroast efforts and for anyone to try out. Relevant links are below.
Oil - https://www.mediumroast.io/company_dns/V3.0/na/sic/description/oil
If you encounter a problem with company_dns, please first review existing open issues, and if you find a match, please add a comment with any detail you deem relevant. If you’re unable to find an issue that matches the behavior you’re seeing, please open a new issue.
We try to keep high level Todos and Improvements in a list contained in a section below, and as we begin to work on things we will create a corresponding issue, link to it, progress and close it. However, if there is a change in design, major improvement, and so on something may fall off the list below. If something isn’t on the list then please create a new issue and we will evaluate. We’ll let you know if we pick up your request and progress to working on it.
Here are the things that are likely to be worked but without any strict deadline:
Since this code falls under a liberal Apache-V2 license it is provided as is, without warranty or guarantee of support. Feel free to fork the code, but please provide attribution to the authors.