CMS Open Payments Drug/Device/Company Data ReadMe

Files: openpayments_companies.csv; openpayments_drugs.csv; openpayments_devices.csv; openpayments_drug_company_lookup.csv

Source: Centers for Medicare and Medicaid Services

Date Received: September 2014

Date Covered: August 2013 to December 2013


ABOUT THIS DATA

Beginning in 2014, the federal government mandated that pharmaceutical and medical device manufacturers publicly report payments made to doctors and teaching hospitals. The first report, released in September 2014, covered the last five months of 2013 and contained nearly 4.4 million records. The data detailed general payments, research payments and payments to companies' physician investors, made to more than 545,000 physicians and 1,200 teaching hospitals.

The data was complicated. Forty percent of records (more than 1.7 million) weren’t attributable to a particular recipient because of data irregularities and inconsistencies. (CMS plans on releasing the de-anonymized data in June 2015.) Often, a single record could indicate payment for up to five drugs and/or devices, making aggregate calculations by drug or device difficult.

And the data itself was messy. Some companies misspelled the names of drugs and devices, which made it difficult to accurately tally spending. Others referred to their products by more than one name. Some companies listed their medical devices as drugs, while others reported payments to doctors or teaching hospitals but left out which products the payments were related to.

ProPublica set out to clean and standardize the data to glean information on which companies were spending the most on which products. We split each record into separate transactions for each drug or device listed (for example, if one record showed a payment related to five drugs, we split it into five records). We also standardized product names.
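
The splitting step described above can be sketched roughly as follows. This is an illustration only, not ProPublica's actual code: the field names (product_1 through product_5, company, amount) are hypothetical stand-ins for the CMS headers, and the sketch simply repeats the payment amount on each split row, which is an assumption this ReadMe does not confirm.

```python
# Hypothetical field names; the real CMS export uses different headers.
PRODUCT_FIELDS = ["product_1", "product_2", "product_3", "product_4", "product_5"]


def split_record(record):
    """Return one row per non-empty product named in a payment record."""
    rows = []
    for field in PRODUCT_FIELDS:
        name = (record.get(field) or "").strip()
        if not name:
            continue  # this product slot is unused in the record
        rows.append({
            "company": record["company"],
            "product": name,
            # Assumption: the amount is carried over unchanged to each row.
            "amount": record["amount"],
        })
    return rows


example = {
    "company": "Acme Pharma",  # made-up company
    "amount": 125.0,
    "product_1": "DrugA",
    "product_2": "DrugB",
    "product_3": "",
}
split_rows = split_record(example)
```

Here split_rows holds two rows, one for DrugA and one for DrugB. A record that names no product yields an empty list.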

We are releasing four files that powered our Open Payment Explorer app. Three of the files (one for companies, one for drugs, one for devices) include totals for spending and number of payments. A fourth, smaller file is a crosswalk that lets you see which companies made payments on which drugs and devices.
 
Here are some things to bear in mind about the data:

RECORD LAYOUT

File: openpayments_companies.csv
Rows: 1063
Columns: 40
Size: 131KB

company_id: Unique identifier made by ProPublica to identify a company.
company: Name of company submitting payment information.
general_payment_sum: Sum of all general payments made by this company.
general_payment_count: Count of all general payments made by this company.
charitable_sum: Sum of all charitable payments made by this company.
charitable_count: Count of all charitable payments made by this company.
speaking_sum: Sum of all speaking payments made by this company.
speaking_count: Count of all speaking payments made by this company.
nonaccredited_sum: Sum of all non-accredited training payments made by this company.
nonaccredited_count: Count of all non-accredited training payments made by this company.
accredited_sum: Sum of all accredited training payments made by this company.
accredited_count: Count of all accredited training payments made by this company.
consulting_sum: Sum of all consulting payments made by this company.
consulting_count: Count of all consulting payments made by this company.
investment_sum: Sum of all investment payments made by this company.
investment_count: Count of all investment payments made by this company.
education_sum: Sum of all education payments made by this company.
education_count: Count of all education payments made by this company.
entertainment_sum: Sum of all entertainment payments made by this company.
entertainment_count: Count of all entertainment payments made by this company.
meal_sum: Sum of all meal payments made by this company.
meal_count: Count of all meal payments made by this company.
gift_sum: Sum of all gift payments made by this company.
gift_count: Count of all gift payments made by this company.
grant_sum: Sum of all grant payments made by this company.
grant_count: Count of all grant payments made by this company.
honoraria_sum: Sum of all honoraria payments made by this company.
honoraria_count: Count of all honoraria payments made by this company.
royalty_sum: Sum of all royalty payments made by this company.
royalty_count: Count of all royalty payments made by this company.
facility_sum: Sum of all facility rental payments made by this company.
facility_count: Count of all facility rental payments made by this company.
travel_sum: Sum of all travel payments made by this company.
travel_count: Count of all travel payments made by this company.
ownership_payment_sum: Sum of all ownership payments made by this company.
ownership_payment_count: Count of all ownership payments made by this company.
research_sum: Sum of all research payments made by this company.
research_payment_count: Count of all research payments made by this company.
general_minus_royalty_sum: Sum of all general payments (without royalty payments included) made by this company.
general_minus_royalty_count: Count of all general payments (without royalty payments included) made by this company.
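
As a sanity check on the layout above, the field descriptions imply that general_minus_royalty_sum should equal general_payment_sum minus royalty_sum. The sketch below tests that relationship with Python's csv module. The sample rows are made up, the function name is ours, and the sketch assumes the CSV headers contain no embedded spaces; swap the in-memory sample for the real file when you run it yourself.

```python
import csv
import io

# Made-up rows standing in for openpayments_companies.csv
# (only a handful of the 40 columns are shown).
SAMPLE = """company_id,company,general_payment_sum,royalty_sum,general_minus_royalty_sum
1,Example Co,1000.50,200.25,800.25
2,Sample Inc,300.00,,300.00
3,Broken Ltd,500.00,100.00,450.00
"""


def to_number(value):
    """Treat blank cells as zero (an assumption about how blanks should be read)."""
    value = (value or "").strip()
    return float(value) if value else 0.0


def check_royalty_split(rows, tolerance=0.01):
    """Return company_ids where general_minus_royalty_sum != general - royalty."""
    mismatched = []
    for row in rows:
        expected = to_number(row["general_payment_sum"]) - to_number(row["royalty_sum"])
        if abs(expected - to_number(row["general_minus_royalty_sum"])) > tolerance:
            mismatched.append(row["company_id"])
    return mismatched


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
mismatched_ids = check_royalty_split(rows)
```

With the sample above, only company_id 3 is flagged, because 500.00 minus 100.00 is not 450.00.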

File: openpayments_drugs.csv
Rows: 1409
Columns: 36
Size: 185KB

product_id: Unique identifier made by ProPublica to identify a drug.
new_product_name: Standardized drug name.
charitable_sum: Sum of all charitable payments concerning this drug.
charitable_count: Count of all charitable payments concerning this drug.
speaking_sum: Sum of all speaking payments concerning this drug.
speaking_count: Count of all speaking payments concerning this drug.
nonaccredited_sum: Sum of all non-accredited training payments concerning this drug.
nonaccredited_count: Count of all non-accredited training payments concerning this drug.
accredited_sum: Sum of all accredited training payments concerning this drug.
accredited_count: Count of all accredited training payments concerning this drug.
consulting_sum: Sum of all consulting payments concerning this drug.
consulting_count: Count of all consulting payments concerning this drug.
investment_sum: Sum of all investment payments concerning this drug.
investment_count: Count of all investment payments concerning this drug.
education_sum: Sum of all education payments concerning this drug.
education_count: Count of all education payments concerning this drug.
entertainment_sum: Sum of all entertainment payments concerning this drug.
entertainment_count: Count of all entertainment payments concerning this drug.
meal_sum: Sum of all meal payments concerning this drug.
meal_count: Count of all meal payments concerning this drug.
gift_sum: Sum of all gift payments concerning this drug.
gift_count: Count of all gift payments concerning this drug.
grant_sum: Sum of all grant payments concerning this drug.
grant_count: Count of all grant payments concerning this drug.
honoraria_sum: Sum of all honoraria payments concerning this drug.
honoraria_count: Count of all honoraria payments concerning this drug.
royalty_sum: Sum of all royalty payments concerning this drug.
royalty_count: Count of all royalty payments concerning this drug.
facility_sum: Sum of all facility rental payments concerning this drug.
facility_count: Count of all facility rental payments concerning this drug.
travel_sum: Sum of all travel payments concerning this drug.
travel_count: Count of all travel payments concerning this drug.
research_sum: Sum of all research payments concerning this drug.
research_payment_count: Count of all research payments concerning this drug.
general_minus_royalty_sum: Sum of all general payments (without royalty payments included) concerning this drug.
general_minus_royalty_count: Count of all general payments (without royalty payments included) concerning this drug.

File: openpayments_devices.csv
Rows: 3460
Columns: 36
Size: 432KB

product_id: Unique identifier made by ProPublica to identify a product.
new_product_name: Standardized product name.
charitable_sum: Sum of all charitable payments concerning this device.
charitable_count: Count of all charitable payments concerning this device.
speaking_sum: Sum of all speaking payments concerning this device.
speaking_count: Count of all speaking payments concerning this device.
nonaccredited_sum: Sum of all non-accredited training payments concerning this device.
nonaccredited_count: Count of all non-accredited training payments concerning this device.
accredited_sum: Sum of all accredited training payments concerning this device.
accredited_count: Count of all accredited training payments concerning this device.
consulting_sum: Sum of all consulting payments concerning this device.
consulting_count: Count of all consulting payments concerning this device.
investment_sum: Sum of all investment payments concerning this device.
investment_count: Count of all investment payments concerning this device.
education_sum: Sum of all education payments concerning this device.
education_count: Count of all education payments concerning this device.
entertainment_sum: Sum of all entertainment payments concerning this device.
entertainment_count: Count of all entertainment payments concerning this device.
meal_sum: Sum of all meal payments concerning this device.
meal_count: Count of all meal payments concerning this device.
gift_sum: Sum of all gift payments concerning this device.
gift_count: Count of all gift payments concerning this device.
grant_sum: Sum of all grant payments concerning this device.
grant_count: Count of all grant payments concerning this device.
honoraria_sum: Sum of all honoraria payments concerning this device.
honoraria_count: Count of all honoraria payments concerning this device.
royalty_sum: Sum of all royalty payments concerning this device.
royalty_count: Count of all royalty payments concerning this device.
facility_sum: Sum of all facility rental payments concerning this device.
facility_count: Count of all facility rental payments concerning this device.
travel_sum: Sum of all travel payments concerning this device.
travel_count: Count of all travel payments concerning this device.
research_sum: Sum of all research payments concerning this device.
research_payment_count: Count of all research payments concerning this device.
general_minus_royalty_sum: Sum of all general payments (without royalty payments included) concerning this device.
general_minus_royalty_count: Count of all general payments (without royalty payments included) concerning this device.
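
Because the drug and device files share the same layout, one small helper can rank products in either file by any of the sum columns. The sketch below is a minimal example using made-up rows. The function name top_products is ours, and treating blank cells as zero is an assumption.

```python
import csv
import io

# Made-up rows in the shape of openpayments_drugs.csv or
# openpayments_devices.csv (only four of the 36 columns are shown).
SAMPLE = """product_id,new_product_name,speaking_sum,speaking_count
10,Product A,5000.00,12
11,Product B,12000.75,30
12,Product C,,0
"""


def top_products(rows, column, n=10):
    """Return the n (name, amount) pairs with the largest value in `column`."""
    def amount(row):
        value = (row.get(column) or "").strip()
        return float(value) if value else 0.0  # blank cell counted as zero

    ranked = sorted(rows, key=amount, reverse=True)
    return [(row["new_product_name"], amount(row)) for row in ranked[:n]]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
top_two = top_products(rows, "speaking_sum", n=2)
```

To run this against the released files, replace io.StringIO(SAMPLE) with an open file handle, for example open("openpayments_drugs.csv", newline="").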

File: openpayments_drug_company_lookup.csv
Rows: 5908
Columns: 4
Size: 312KB

This file can be used as a crosswalk when more than one company makes a payment regarding a drug or device. Note: Not every product_id in this file will match a product_id in the main files.

product_id: Unique identifier made by ProPublica to identify a product.
new_product_name: Standardized product name.
company_id: Unique identifier made by ProPublica to identify a company.
company: Name of company submitting payment information.
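
One way to use the crosswalk is to group companies by product_id and then check which ids are missing from a main file, since not every id will match. The sketch below uses made-up rows. The function and variable names are ours, and it compares product_id values as strings, which is how the csv module reads them.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the shape of the lookup file.
LOOKUP = """product_id,new_product_name,company_id,company
10,Product A,1,Example Co
10,Product A,2,Sample Inc
99,Product Z,3,Other LLC
"""

# Made-up rows in the shape of a main file (drugs or devices).
DRUGS = """product_id,new_product_name,speaking_sum
10,Product A,5000.00
"""


def companies_by_product(lookup_rows):
    """Map each product_id to the list of companies reporting payments on it."""
    grouped = defaultdict(list)
    for row in lookup_rows:
        grouped[row["product_id"]].append(row["company"])
    return dict(grouped)


lookup_rows = list(csv.DictReader(io.StringIO(LOOKUP)))
drug_ids = {row["product_id"] for row in csv.DictReader(io.StringIO(DRUGS))}

by_product = companies_by_product(lookup_rows)
# Lookup ids with no counterpart in the main file.
unmatched = sorted(pid for pid in by_product if pid not in drug_ids)
```

In this sample, product_id 10 maps to two companies, and product_id 99 is reported as unmatched.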

HOW WE USED THIS DATA

LICENSE

By using the attached data, you acknowledge that you have agreed to these terms:
You can’t charge people money to look at the data, or sell advertising specifically against it.
If you use the data for a printed table, chart, graphic or map, you must cite ProPublica.
You acknowledge that the data you use may contain errors and omissions, and you are solely responsible for checking our site for any corrections or updates.
You can't resell the data.
Before you e-mail us a question, please make sure you've read the explanation of our fields as well as our methodology.
If you have any questions, e-mail data@propublica.org.