Downloading the CMS Open Payments data

The CMS Open Payments data is available as a direct download from the CMS website. We'll download it with the Unix wget utility, but first we have to register with the CMS website to obtain our own app token (API key):

  1. Go to https://openpaymentsdata.cms.gov and click on the Sign In link at the top-right of the page:

Homepage of CMS OpenPayments

  2. Click on Sign Up:

Sign-Up Page on CMS OpenPayments

  3. Enter your information and click on the Create My Account button:

Sign-Up Form for CMS OpenPayments

  4. Sign In to your account:

Signing into CMS OpenPayments

  5. Click on Manage under Packt Developer's Applications. Note that Applications here refers to apps that you may create that will query data available on the CMS website:

Creating 'Applications'

  6. Assign a name for the application (examples are shown in the following image):

Defining an application

  7. You'll get a notification that the Application Token has been created:

Creating the Application Token

  8. The system will generate an App Token. Copy the App Token:

The Application Token

  9. Now, log in to the Packt Data Science VM as user packt and execute the following shell command after replacing the term YOURAPPTOKEN with the one that you were assigned (it will be a long string of characters/numbers). Note that for the tutorial, we will only download a few of the columns and restrict the data to only physicians (the other option is hospitals).

You can reduce the volume of the data downloaded by reducing the value of the limit specified at the end of the command to a lower number. In the command, we have used 12000000 (12 million), which would let us download the entire 2016 dataset representing physician payments. The application will still work if, for example, you were to download only one million entries instead of the approximately 11-12 million records.

Note: Two approaches are shown below, one that uses the app token and one that does not. Application tokens give users a higher throttling limit. More information can be found at https://dev.socrata.com/docs/app-tokens.html

# Replace YOURAPPTOKEN and 12000000 with your app token and desired record limit, respectively
cd /home/packt

time wget -O cms2016.csv 'https://openpaymentsdata.cms.gov/resource/vq63-hu5i.csv?$$app_token=YOURAPPTOKEN&$query=select Physician_First_Name as firstName,Physician_Last_Name as lastName,Recipient_City as city,Recipient_State as state,Submitting_Applicable_Manufacturer_or_Applicable_GPO_Name as company,Total_Amount_of_Payment_USDollars as payment,Date_of_Payment as date,Nature_of_Payment_or_Transfer_of_Value as paymentNature,Product_Category_or_Therapeutic_Area_1 as category,Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_1 as product where covered_recipient_type like "Covered Recipient Physician" limit 12000000'
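
Because the token and the record limit are the only parts of the URL that change, it can be convenient to keep them out of the long query string. The following is a minimal sketch of that idea, not part of the original procedure: build_cms_url is a hypothetical helper name, and the sketch assumes the endpoint and query shown above are unchanged. It only prints the URL; the wget line is left commented out so that nothing is downloaded until you are ready.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not from the original text):
# build the download URL from an optional app token and a record limit.
build_cms_url() {
    token="$1"   # app token; pass "" to build the URL without a token
    limit="$2"   # maximum number of records to request
    base='https://openpaymentsdata.cms.gov/resource/vq63-hu5i.csv'
    # Same column selection and physician filter as the command above.
    query='select Physician_First_Name as firstName,Physician_Last_Name as lastName,Recipient_City as city,Recipient_State as state,Submitting_Applicable_Manufacturer_or_Applicable_GPO_Name as company,Total_Amount_of_Payment_USDollars as payment,Date_of_Payment as date,Nature_of_Payment_or_Transfer_of_Value as paymentNature,Product_Category_or_Therapeutic_Area_1 as category,Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_1 as product where covered_recipient_type like "Covered Recipient Physician"'
    if [ -n "$token" ]; then
        # The format string is single-quoted, so $$app_token and $query stay literal.
        printf '%s?$$app_token=%s&$query=%s limit %s\n' "$base" "$token" "$query" "$limit"
    else
        printf '%s?$query=%s limit %s\n' "$base" "$query" "$limit"
    fi
}

# Example: request one million records instead of 12 million.
url=$(build_cms_url "YOURAPPTOKEN" 1000000)
printf '%s\n' "$url"
# wget -O cms2016.csv "$url"    # uncomment to start the download
```

Starting with a small limit such as 1000000 is a quick way to confirm that the token and query are accepted before committing to the full download.
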

Important: It is also possible to download the file without an app token. However, this method should be used sparingly, because requests made without a token are subject to a lower throttling limit. The command to download the file without an app token is as follows:

# Download without using an app token
wget -O cms2016.csv 'https://openpaymentsdata.cms.gov/resource/vq63-hu5i.csv?$query=select Physician_First_Name as firstName,Physician_Last_Name as lastName,Recipient_City as city,Recipient_State as state,Submitting_Applicable_Manufacturer_or_Applicable_GPO_Name as company,Total_Amount_of_Payment_USDollars as payment,Date_of_Payment as date,Nature_of_Payment_or_Transfer_of_Value as paymentNature,Product_Category_or_Therapeutic_Area_1 as category,Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_1 as product where covered_recipient_type like "Covered Recipient Physician" limit 12000000'
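
Once wget finishes, it is worth checking that the file contains what we expect before using it. The following is a minimal sketch under a few assumptions: summarize_csv is a hypothetical helper name, the file is assumed to start with a single header line, and the record count is only an approximation if any field contains an embedded newline.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not from the original text):
# print the header line and an approximate record count for a CSV file.
summarize_csv() {
    file="$1"
    # Fail early if the file is missing or empty.
    [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
    echo "header: $(head -n 1 "$file")"
    # Total lines minus the header line; tr strips padding that some
    # wc implementations add.
    lines=$(wc -l < "$file" | tr -d ' ')
    echo "records: $((lines - 1))"
}

# Run the check only if the download is present in the current directory.
if [ -f cms2016.csv ]; then
    summarize_csv cms2016.csv
fi
```

If the record count is far below the limit you requested, the download may have been cut short or throttled, and rerunning the command with the app token is a reasonable next step.
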