Proposal: Transparency in Coverage

Project Title: ETL and dashboarding for Transparency in Coverage data released by healthcare providers.

Project Description:
In 2020, the US federal government issued a rule (Transparency in Coverage) to make it easier for patients to estimate the cost of healthcare before treatment. It also requires coverage providers to make three kinds of information public: negotiated rates, out-of-network allowed amounts, and drug pricing information. After multiple delays, coverage providers finally released the first batch of data in July. The data follows a predefined JSON format prescribed by CMS. The compressed JSON files are very large, exceeding 1 TB. Overall, the rule is expected to drive down healthcare costs as providers and insurers compete for market share.

This data is a gold mine for payers and providers. A study by RAND shows that private health plans paid hospitals more than twice the amount paid by Medicare for the same services. Another study claims that consumers would save money if they paid in cash at the pharmacy instead of using their health plan. Publicly accessible data would allow consumers to plan their healthcare costs better and would help bring those costs down. The dataset is released monthly and can be aggregated for future uses such as predicting trends in health plans.


  1. Scrape and parse the data, and create ETL (extract, transform, load) pipelines backed by data integrity checks.
  2. Publish the data to the Ocean marketplace.
  3. Build a dApp for the project.
  4. Provide queries and dashboarding through compute-to-data.
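
As a rough illustration of step 1, here is a minimal sketch of the transform and integrity-check stage. The field names (in_network, negotiated_rates, negotiated_prices, billing_code and so on) are my reading of the CMS in-network schema and should be verified against the published spec; the helper names flatten_item, is_valid and extract_rows are hypothetical. It uses json.load for brevity, which only works for small samples. Files at the sizes mentioned above would need a streaming JSON parser instead.

```python
import gzip
import io
import json
from typing import Iterator

# Fields every flattened row must carry (an assumption, not the CMS rule set).
REQUIRED = ("billing_code_type", "billing_code", "negotiated_rate", "billing_class")


def flatten_item(item: dict) -> Iterator[dict]:
    """Yield one flat row per (billing code, negotiated price) pair."""
    for rate in item.get("negotiated_rates", []):
        for price in rate.get("negotiated_prices", []):
            yield {
                "billing_code_type": item.get("billing_code_type"),
                "billing_code": item.get("billing_code"),
                "negotiated_type": price.get("negotiated_type"),
                "negotiated_rate": price.get("negotiated_rate"),
                "billing_class": price.get("billing_class"),
            }


def is_valid(row: dict) -> bool:
    """Integrity check: required fields present, rate is a non-negative number."""
    if any(row.get(key) in (None, "") for key in REQUIRED):
        return False
    rate = row["negotiated_rate"]
    if isinstance(rate, bool) or not isinstance(rate, (int, float)):
        return False
    return rate >= 0


def extract_rows(raw_gz: bytes) -> tuple[list[dict], list[dict]]:
    """Decompress a small gzip-compressed sample and split rows into good/bad."""
    with gzip.open(io.BytesIO(raw_gz), "rt", encoding="utf-8") as fh:
        doc = json.load(fh)  # small samples only; real files need streaming
    good, bad = [], []
    for item in doc.get("in_network", []):
        for row in flatten_item(item):
            (good if is_valid(row) else bad).append(row)
    return good, bad
```

Rows that fail the check are kept in a separate list rather than dropped, so that rejected records can be counted and reported per file.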

Ashwini kumar Pal
Role: Data Scientist/ ML Professional
eth address : 0x8513A856a88e63374286d0116C192733444894C0


This is a very worthwhile project, and furthermore, there might be a tangential opportunity here.

Under the False Claims Act, a whistleblower who discovers that a company is defrauding the US government (for example, in healthcare) may be eligible to receive 15-30% of the recovered funds if the case succeeds.

You ought to be able to cross-reference this data with the billing data submitted to Medicare/Medicaid to detect these sorts of anomalies. You can likely obtain that billing data through a FOIA request sent to Medicaid.
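
To make the cross-referencing idea a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the record layout (npi, billing_code, negotiated_rate, billed_amount), the function name cross_reference, and the ratio threshold. A high ratio is only a prompt for a closer look, not evidence of fraud.

```python
from typing import NamedTuple


class Flag(NamedTuple):
    npi: str
    billing_code: str
    negotiated_rate: float
    billed_amount: float
    ratio: float


def cross_reference(negotiated: list[dict], billed: list[dict],
                    threshold: float = 2.0) -> list[Flag]:
    """Flag billed amounts that are at least `threshold` times the negotiated rate.

    Records are matched on (npi, billing_code); both keys are hypothetical
    column names chosen for this sketch.
    """
    # Index the negotiated rates so each billed record is a dictionary lookup.
    index = {(r["npi"], r["billing_code"]): r["negotiated_rate"] for r in negotiated}
    flags = []
    for record in billed:
        rate = index.get((record["npi"], record["billing_code"]))
        if rate is None or rate <= 0:
            continue  # no comparable negotiated rate, so nothing to compare
        ratio = record["billed_amount"] / rate
        if ratio >= threshold:
            flags.append(Flag(record["npi"], record["billing_code"],
                              rate, record["billed_amount"], ratio))
    return flags
```

If two negotiated records share the same (npi, billing_code) key, the later one overwrites the earlier one in this sketch; a real comparison would need to decide how to handle multiple rates per key.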

Thank you for the suggestions and encouragement.

This is a nice idea. I’d like to know where the data is being sourced from in the first place.
Is this publicly available data? Is it free/paid for?

The dataset is publicly available; the healthcare providers publish it every month.

Good proposal, strongly agree that it is useful!
For a few hospitals, this is a good proof-of-concept project. But to drive impact at scale, I think you’ll need a lot more work. I know some folks working on this at who also put out this data bounty, "Announcing the $10,000 chargemaster URLs bounty", and they may be a good partner for building this project out in the longer term. Let me know if I should connect you over email (Discord / DM is best). Good luck!

