Payment and treatment in U.S. Hospital
Payment and treatment in U.S. Hospital¶
Author: Linjun Zhou
Course Project, UC Irvine, Math 10, S22
Introduction¶
The dataset “Payment_and_value_of_care_-_Hospital.csv” lists, for hospitals in each U.S. state, the payments recorded for three conditions: heart attack (PAYM_30_AMI), heart failure (PAYM_30_HF), and pneumonia (PAYM_30_PN). In this project, the payments for the three conditions are separated so that the health care situation in each state can be estimated more accurately. The K-Nearest Neighbors Regressor is then used to test whether the payments for the three conditions are associated.
Main portion of the project¶
Dataset Adjustment¶
import pandas as pd
import altair as alt
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor, KNeighborsClassifier
from sklearn.model_selection import train_test_split
df = pd.read_csv("/work/Payment_and_value_of_care_-_Hospital.csv")
# keep only the columns used in this project
df = df[['Hospital name', 'State', 'Payment measure ID', 'Payment']]
df = df.dropna()
# remove the dollar sign and convert the Payment column to numbers
df["Payment"] = df["Payment"].str.replace("$", "", regex=False)
df["Payment"] = pd.to_numeric(df["Payment"])
df
 | Hospital name | State | Payment measure ID | Payment |
---|---|---|---|---|
0 | MARSHALL MEDICAL CENTER SOUTH | AL | PAYM_30_AMI | 23171.0 |
1 | MARSHALL MEDICAL CENTER SOUTH | AL | PAYM_30_HF | 16376.0 |
2 | MARSHALL MEDICAL CENTER SOUTH | AL | PAYM_30_PN | 14384.0 |
4 | WEDOWEE HOSPITAL | AL | PAYM_30_HF | 16649.0 |
5 | WEDOWEE HOSPITAL | AL | PAYM_30_PN | 13168.0 |
... | ... | ... | ... | ... |
14446 | SETON MEDICAL CENTER HAYS | TX | PAYM_30_HF | 17189.0 |
14448 | NORTH CYPRESS MEDICAL CENTER | TX | PAYM_30_AMI | 23587.0 |
14451 | MEMORIAL MEDICAL CENTER | WI | PAYM_30_PN | 13813.0 |
14452 | STAR VALLEY MEDICAL CENTER | WY | PAYM_30_PN | 18226.0 |
14453 | LAKEWAY REGIONAL MEDICAL CENTER, LLC | TX | PAYM_30_HF | 17076.0 |
9880 rows × 4 columns
df["Payment measure ID"].unique()
array(['PAYM_30_AMI', 'PAYM_30_HF', 'PAYM_30_PN'], dtype=object)
To distinguish the payments more easily, I create a new dataframe that splits “Payment” into three columns, one per payment measure.
# split the long table into one dataframe per payment measure
for a, b in df.groupby('Payment measure ID'):
    if a == 'PAYM_30_AMI':
        df1 = b.drop(columns='Payment measure ID')
    elif a == 'PAYM_30_HF':
        df_HF = b.drop(columns='Payment measure ID')
    else:
        df_PN = b.drop(columns='Payment measure ID')
df1 = df1.rename(columns={'Payment': 'AMI_Payment'})
# renumber the rows of each dataframe as 0, 1, 2, ...
df1 = df1.reset_index(drop=True)
df_HF = df_HF.reset_index(drop=True)
df_PN = df_PN.reset_index(drop=True)
# note: these assignments pair rows by position, not by hospital name
df1['HF_Payment'] = df_HF['Payment']
df1['PN_Payment'] = df_PN['Payment']
df1.head()
 | Hospital name | State | AMI_Payment | HF_Payment | PN_Payment |
---|---|---|---|---|---|
0 | MARSHALL MEDICAL CENTER SOUTH | AL | 23171.0 | 16376.0 | 14384.0 |
1 | CRESTWOOD MEDICAL CENTER | AL | 20007.0 | 16649.0 | 13168.0 |
2 | PROVIDENCE ALASKA MEDICAL CENTER | AK | 24309.0 | 14229.0 | 13258.0 |
3 | CHI-ST VINCENT INFIRMARY | AR | 23600.0 | 15339.0 | 12303.0 |
4 | CHICOT MEMORIAL MEDICAL CENTER | AR | 23543.0 | 14558.0 | 10817.0 |
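One thing to keep in mind about the table above: because the three payment columns are combined after `reset_index`, rows are paired by position rather than by hospital. For example, the HF and PN values in row 1 (16649.0 and 13168.0) are the values listed for WEDOWEE HOSPITAL in the first table. A minimal sketch of an alternative that aligns the payments by hospital is shown here. It assumes each combination of hospital, state, and measure occurs only once; the toy values and the name `wide` are hypothetical.

```python
import pandas as pd

# toy long-format data with the same columns as df (values are made up)
toy = pd.DataFrame({
    "Hospital name": ["A", "A", "A", "B", "B", "C"],
    "State": ["AL", "AL", "AL", "AL", "AL", "AK"],
    "Payment measure ID": ["PAYM_30_AMI", "PAYM_30_HF", "PAYM_30_PN",
                           "PAYM_30_HF", "PAYM_30_PN", "PAYM_30_AMI"],
    "Payment": [300.0, 200.0, 100.0, 210.0, 110.0, 310.0],
})

# pivot so each hospital is one row and each measure is one column
wide = toy.pivot(index=["Hospital name", "State"],
                 columns="Payment measure ID",
                 values="Payment")
wide = wide.rename(columns={"PAYM_30_AMI": "AMI_Payment",
                            "PAYM_30_HF": "HF_Payment",
                            "PAYM_30_PN": "PN_Payment"})

# keep only hospitals that report all three payments
wide = wide.dropna().reset_index()
wide.columns.name = None
print(wide)
```

In this toy example only hospital A reports all three measures, so it is the only row that remains.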
numcols = ['AMI_Payment', 'HF_Payment', 'PN_Payment']
df1[numcols]
 | AMI_Payment | HF_Payment | PN_Payment |
---|---|---|---|
0 | 23171.0 | 16376.0 | 14384.0 |
1 | 20007.0 | 16649.0 | 13168.0 |
2 | 24309.0 | 14229.0 | 13258.0 |
3 | 23600.0 | 15339.0 | 12303.0 |
4 | 23543.0 | 14558.0 | 10817.0 |
... | ... | ... | ... |
2338 | 20340.0 | 15377.0 | 13525.0 |
2339 | 22608.0 | 15453.0 | 15425.0 |
2340 | 23941.0 | 17143.0 | 13979.0 |
2341 | 22231.0 | 17935.0 | 12097.0 |
2342 | 23587.0 | 16171.0 | 13932.0 |
2343 rows × 3 columns
Some trials about clustering¶
The box plot below gives an overview of the payments for the three measures. It shows that payments for heart attack patients (payment measure ID “PAYM_30_AMI”) are the highest.
alt.data_transformers.enable('default', max_rows=None)
c1 = alt.Chart(df).mark_boxplot(size=50, extent=0.5).encode(
    x="Payment measure ID",
    y=alt.Y('Payment', scale=alt.Scale(zero=False))
).properties(width=300).configure_axis(
    labelFontSize=16,
    titleFontSize=16
)
c1
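The numbers behind a box plot can also be read off directly. A minimal sketch, on a hypothetical toy dataframe that reuses the column names of `df`, computing the median, minimum, and maximum payment per measure:

```python
import pandas as pd

# toy data with the same column names as df (values are made up)
toy = pd.DataFrame({
    "Payment measure ID": ["PAYM_30_AMI"] * 3 + ["PAYM_30_HF"] * 3 + ["PAYM_30_PN"] * 3,
    "Payment": [22000.0, 23000.0, 24000.0,
                15000.0, 16000.0, 17000.0,
                13000.0, 14000.0, 15000.0],
})

# one row per payment measure, one column per statistic
summary = toy.groupby("Payment measure ID")["Payment"].agg(["median", "min", "max"])
print(summary)
```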
kmeans1 = KMeans(n_clusters=2)
kmeans1.fit(df1[numcols])
KMeans(n_clusters=2)
df1['pred'] = kmeans1.predict(df1[numcols])
df1
 | Hospital name | State | AMI_Payment | HF_Payment | PN_Payment | pred |
---|---|---|---|---|---|---|
0 | MARSHALL MEDICAL CENTER SOUTH | AL | 23171.0 | 16376.0 | 14384.0 | 0 |
1 | CRESTWOOD MEDICAL CENTER | AL | 20007.0 | 16649.0 | 13168.0 | 0 |
2 | PROVIDENCE ALASKA MEDICAL CENTER | AK | 24309.0 | 14229.0 | 13258.0 | 1 |
3 | CHI-ST VINCENT INFIRMARY | AR | 23600.0 | 15339.0 | 12303.0 | 1 |
4 | CHICOT MEMORIAL MEDICAL CENTER | AR | 23543.0 | 14558.0 | 10817.0 | 0 |
... | ... | ... | ... | ... | ... | ... |
2338 | MEMORIAL MEDICAL CENTER | WI | 20340.0 | 15377.0 | 13525.0 | 0 |
2339 | BAYLOR SCOTT AND WHITE MEDICAL CENTER SUNNYVALE | TX | 22608.0 | 15453.0 | 15425.0 | 0 |
2340 | BAYLOR SCOTT AND WHITE MEDICAL CENTER MCKINNEY | TX | 23941.0 | 17143.0 | 13979.0 | 1 |
2341 | SETON MEDICAL CENTER HARKER HEIGHTS | TX | 22231.0 | 17935.0 | 12097.0 | 0 |
2342 | NORTH CYPRESS MEDICAL CENTER | TX | 23587.0 | 16171.0 | 13932.0 | 1 |
2343 rows × 6 columns
# one scatter plot per payment column, colored by the KMeans label
c1 = []
for i in numcols:
    c1.append(alt.Chart(df1).mark_circle().encode(
        x = alt.X('State'),
        y = alt.Y(i),
        color = "pred:N"
    ))
c1[0] & c1[1] & c1[2]
Here I use KMeans to split the hospitals into two categories, one representing high payments and the other low payments. However, in the charts above only the first one (AMI_Payment) shows a reasonable and useful separation. A likely reason is that AMI payments are generally larger, so they may dominate the distances that KMeans uses. So next I will center the data with StandardScaler and try clustering again.
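KMeans minimizes squared Euclidean distances to the cluster centers, so what gives a column more weight is its spread rather than its level. A minimal sketch on synthetic data (the two columns and their parameters are hypothetical) that measures how much of the total variance each column contributes:

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical columns: one with a large spread, one with a small spread
big = rng.normal(loc=23000, scale=2000, size=500)
small = rng.normal(loc=15000, scale=500, size=500)
X = np.column_stack([big, small])

# share of the total variance contributed by each column
variances = X.var(axis=0)
share = variances / variances.sum()
print(share)
```

A column that holds most of the variance will largely decide which cluster a point falls into.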
scaler = StandardScaler(with_mean=True, with_std=False)
scaler.fit(df1[numcols])
StandardScaler(with_std=False)
df2 = df1.copy()
df2[numcols] = scaler.transform(df1[numcols])
kmeans2 = KMeans(n_clusters=2)
kmeans2.fit(df2[numcols])
KMeans(n_clusters=2)
df2['pred'] = kmeans2.predict(df2[numcols])
alt.Chart(df2).mark_circle().encode(
    x = 'PN_Payment',
    y = 'HF_Payment',
    color = "pred:N"
)
This chart is quite chaotic, which shows that my second clustering attempt does not work either.
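Note that `StandardScaler(with_mean=True, with_std=False)` only subtracts each column's mean. It leaves the spread of each column unchanged, and distances between points are unaffected by such a shift. A minimal sketch on synthetic data (the arrays are hypothetical) comparing this with full standardization, where the default `with_std=True` also divides each column by its standard deviation:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# two hypothetical payment columns with very different spreads
X = np.column_stack([
    rng.normal(23000, 2000, size=200),
    rng.normal(15000, 500, size=200),
])

centered = StandardScaler(with_mean=True, with_std=False).fit_transform(X)
standardized = StandardScaler().fit_transform(X)

# distance between the first two rows, before and after centering
d_raw = np.linalg.norm(X[0] - X[1])
d_centered = np.linalg.norm(centered[0] - centered[1])
print(np.isclose(d_raw, d_centered))

# column standard deviations after each transform
print(centered.std(axis=0))
print(standardized.std(axis=0))
```

Because centering preserves all pairwise distances, KMeans on centered data faces the same clustering problem as on the raw data. Standardizing is one possible way to give the three payments comparable weight.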
# one scatter plot per payment column, colored by the second KMeans label
c2 = []
for i in numcols:
    c2.append(alt.Chart(df2).mark_circle().encode(
        x = alt.X('State'),
        y = alt.Y(i),
        color = "pred:N"
    ))
c2[0] & c2[1] & c2[2]
This chart also makes little sense: the clustering only separates one payment. So next I will apply KMeans to each of the three payments separately.
# fit a separate two-cluster KMeans on each payment column
for i in numcols:
    kmeansi = KMeans(n_clusters=2)
    kmeansi.fit(df2[[i]])
    df2[f'pred_{i}'] = kmeansi.predict(df2[[i]])
df2
 | Hospital name | State | AMI_Payment | HF_Payment | PN_Payment | pred | pred_AMI_Payment | pred_HF_Payment | pred_PN_Payment | cluster | type_state |
---|---|---|---|---|---|---|---|---|---|---|---|
714 | BAYHEALTH - KENT GENERAL HOSPITAL | DE | -1201.681178 | 1925.911652 | 662.243278 | 1 | 0 | 1 | 1 | 3 | 2.166667 |
625 | BAYHEALTH - MILFORD MEMORIAL HOSPITAL | DE | 986.318822 | 3038.911652 | 1329.243278 | 0 | 1 | 1 | 1 | 0 | 2.166667 |
623 | CHRISTIANA CARE HEALTH SERVICES, INC. | DE | 1057.318822 | 906.911652 | -676.756722 | 0 | 1 | 1 | 0 | 1 | 2.166667 |
485 | NANTICOKE MEMORIAL HOSPITAL | DE | 1048.318822 | -1004.088348 | 374.243278 | 0 | 1 | 0 | 1 | 2 | 2.166667 |
562 | BEEBE MEDICAL CENTER | DE | 3679.318822 | -1690.088348 | 834.243278 | 0 | 1 | 0 | 1 | 2 | 2.166667 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1931 | SANFORD USD MEDICAL CENTER | SD | -510.681178 | -952.088348 | -2479.756722 | 1 | 0 | 0 | 0 | 7 | 6.181818 |
1910 | AVERA MCKENNAN HOSPITAL & UNIVERSITY HEALTH CE... | SD | -935.681178 | -2082.088348 | -1312.756722 | 1 | 0 | 0 | 0 | 7 | 6.181818 |
1922 | RAPID CITY REGIONAL HOSPITAL | SD | -109.681178 | -646.088348 | 767.243278 | 1 | 0 | 0 | 1 | 6 | 6.181818 |
1984 | BROOKINGS HEALTH SYSTEM | SD | -1803.681178 | -28.088348 | -110.756722 | 1 | 0 | 0 | 0 | 7 | 6.181818 |
150 | COMMONWEALTH HEALTH CENTER | MP | -6907.681178 | -1760.088348 | -1121.756722 | 1 | 0 | 0 | 0 | 7 | 7.000000 |
2343 rows × 11 columns
# one scatter plot per payment column, colored by that column's own label
c3 = []
for i in numcols:
    c3.append(alt.Chart(df2).mark_circle().encode(
        x = alt.X('State'),
        y = alt.Y(i),
        color = f"pred_{i}:N"
    ))
c3[0] & c3[1] & c3[2]
Each payment is now divided into two clusters. Next I combine the three labels into a single code that covers the eight possible combinations.
df2['cluster'] = 0
i = "pred_AMI_Payment"
j = "pred_HF_Payment"
k = "pred_PN_Payment"
df2.loc[(df2[i] == 0) & (df2[j] == 0) & (df2[k] == 1), 'cluster'] = 1
df2.loc[(df2[i] == 0) & (df2[j] == 1) & (df2[k] == 0), 'cluster'] = 2
df2.loc[(df2[i] == 1) & (df2[j] == 0) & (df2[k] == 0), 'cluster'] = 3
df2.loc[(df2[i] == 0) & (df2[j] == 1) & (df2[k] == 1), 'cluster'] = 4
df2.loc[(df2[i] == 1) & (df2[j] == 0) & (df2[k] == 1), 'cluster'] = 5
df2.loc[(df2[i] == 1) & (df2[j] == 1) & (df2[k] == 0), 'cluster'] = 6
df2.loc[(df2[i] == 1) & (df2[j] == 1) & (df2[k] == 1), 'cluster'] = 7
The combined code runs from 0 to 7. Hospitals labeled 0 for all three payments keep the default value 0, so that case needs no explicit assignment. In general, the larger the code, the higher the payments.
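Reading a larger code as a higher payment relies on label 1 being the high-payment cluster for each column. KMeans does not guarantee this, because its label numbers are arbitrary and can differ between runs. A minimal sketch, assuming two clusters per column, that relabels the clusters so that 1 always marks the cluster with the higher center; the helper name `high_low_labels` and the toy values are hypothetical:

```python
import numpy as np
from sklearn.cluster import KMeans

def high_low_labels(values, random_state=0):
    """Cluster a 1-D array into two groups; return 1 for the higher group."""
    values = np.asarray(values, dtype=float).reshape(-1, 1)
    km = KMeans(n_clusters=2, n_init=10, random_state=random_state).fit(values)
    # index of the cluster whose center is larger
    high = int(np.argmax(km.cluster_centers_.ravel()))
    return (km.labels_ == high).astype(int)

# toy centered payments: three low values and three high values
toy = np.array([-1800.0, -900.0, -500.0, 2000.0, 2500.0, 3700.0])
labels = high_low_labels(toy)
print(labels)
```

With labels fixed this way, the combined code has the same meaning every time the notebook is run.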
df2
 | Hospital name | State | AMI_Payment | HF_Payment | PN_Payment | pred | pred_AMI_Payment | pred_HF_Payment | pred_PN_Payment | cluster | type_state |
---|---|---|---|---|---|---|---|---|---|---|---|
714 | BAYHEALTH - KENT GENERAL HOSPITAL | DE | -1201.681178 | 1925.911652 | 662.243278 | 1 | 0 | 1 | 1 | 4 | 2.166667 |
625 | BAYHEALTH - MILFORD MEMORIAL HOSPITAL | DE | 986.318822 | 3038.911652 | 1329.243278 | 0 | 1 | 1 | 1 | 7 | 2.166667 |
623 | CHRISTIANA CARE HEALTH SERVICES, INC. | DE | 1057.318822 | 906.911652 | -676.756722 | 0 | 1 | 1 | 0 | 6 | 2.166667 |
485 | NANTICOKE MEMORIAL HOSPITAL | DE | 1048.318822 | -1004.088348 | 374.243278 | 0 | 1 | 0 | 1 | 5 | 2.166667 |
562 | BEEBE MEDICAL CENTER | DE | 3679.318822 | -1690.088348 | 834.243278 | 0 | 1 | 0 | 1 | 5 | 2.166667 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1931 | SANFORD USD MEDICAL CENTER | SD | -510.681178 | -952.088348 | -2479.756722 | 1 | 0 | 0 | 0 | 0 | 6.181818 |
1910 | AVERA MCKENNAN HOSPITAL & UNIVERSITY HEALTH CE... | SD | -935.681178 | -2082.088348 | -1312.756722 | 1 | 0 | 0 | 0 | 0 | 6.181818 |
1922 | RAPID CITY REGIONAL HOSPITAL | SD | -109.681178 | -646.088348 | 767.243278 | 1 | 0 | 0 | 1 | 1 | 6.181818 |
1984 | BROOKINGS HEALTH SYSTEM | SD | -1803.681178 | -28.088348 | -110.756722 | 1 | 0 | 0 | 0 | 0 | 6.181818 |
150 | COMMONWEALTH HEALTH CENTER | MP | -6907.681178 | -1760.088348 | -1121.756722 | 1 | 0 | 0 | 0 | 0 | 7.000000 |
2343 rows × 11 columns
alt.Chart(df2).mark_circle().encode(
    x = 'HF_Payment',
    y = 'AMI_Payment',
    color = "cluster:N"
)
This chart is still a bit messy, but it is much clearer than the charts in my previous trials. In this chart, the lower-numbered clusters lie mostly below 0 on the AMI_Payment axis, and the higher-numbered clusters lie mostly above it.
c4 = alt.Chart(df2).mark_boxplot().encode(
    x = 'State',
    y = 'mean(cluster)'
)
c4
selection = alt.selection_single()
c5 = alt.Chart(df2).mark_circle().encode(
    x = 'State',
    y = 'mean(cluster)',
    tooltip = [alt.Tooltip('mean(cluster)'), alt.Tooltip('State')]
).add_selection(selection)
c5
Here I make two more charts to visualize the average cluster code of each state. The circle chart shows that states such as MP and SD have the lowest payments, while states such as DE and NJ have the highest.
Finding overall payments of each state¶
# mean cluster code of each state, repeated on every hospital row of that state
df2['type_state'] = df2.groupby('State')['cluster'].transform('mean')
I create a new column, type_state, that holds the mean cluster code of each state.
df2['type_state'].unique()
array([4.83333333, 4.77777778, 4.73684211, 4.68421053, 4.21568627,
4.04347826, 4. , 3.96296296, 3.81632653, 3.88888889,
3.61111111, 3.57142857, 3.5 , 3.44791667, 3.40291262,
3.40462428, 3.40909091, 3.31372549, 3.25 , 3.16666667,
3.15555556, 3.11111111, 3.10869565, 3.04 , 2.96551724,
2.86666667, 2.8 , 2.78125 , 2.75862069, 2.72727273,
2.70149254, 2.69014085, 2.65625 , 2.59259259, 2.58064516,
2.57142857, 2.53333333, 2.525 , 2.5 , 2.38181818,
2.35555556, 2.34615385, 2.33333333, 2.29166667, 2.11111111,
2.07692308, 2. , 1.90909091, 1.58333333, 1.54166667,
1. , 0.81818182, 0. ])
df2 = df2.sort_values(['type_state'])
df2
 | Hospital name | State | AMI_Payment | HF_Payment | PN_Payment | pred | pred_AMI_Payment | pred_HF_Payment | pred_PN_Payment | cluster | type_state |
---|---|---|---|---|---|---|---|---|---|---|---|
150 | COMMONWEALTH HEALTH CENTER | MP | -6907.681178 | -1760.088348 | -1121.756722 | 1 | 0 | 0 | 0 | 0 | 0.000000 |
1915 | PRAIRIE LAKES HOSPITAL | SD | -221.681178 | 638.911652 | -544.756722 | 1 | 0 | 1 | 0 | 2 | 0.818182 |
1913 | AVERA ST LUKES | SD | -1586.681178 | -737.088348 | 573.243278 | 1 | 0 | 0 | 1 | 1 | 0.818182 |
1872 | AVERA SACRED HEART HOSPITAL | SD | -1060.681178 | -1325.088348 | 597.243278 | 1 | 0 | 0 | 1 | 1 | 0.818182 |
1891 | HURON REGIONAL MEDICAL CENTER | SD | -1659.681178 | -1845.088348 | -788.756722 | 1 | 0 | 0 | 0 | 0 | 0.818182 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
562 | BEEBE MEDICAL CENTER | DE | 3679.318822 | -1690.088348 | 834.243278 | 0 | 1 | 0 | 1 | 5 | 4.833333 |
485 | NANTICOKE MEMORIAL HOSPITAL | DE | 1048.318822 | -1004.088348 | 374.243278 | 0 | 1 | 0 | 1 | 5 | 4.833333 |
623 | CHRISTIANA CARE HEALTH SERVICES, INC. | DE | 1057.318822 | 906.911652 | -676.756722 | 0 | 1 | 1 | 0 | 6 | 4.833333 |
625 | BAYHEALTH - MILFORD MEMORIAL HOSPITAL | DE | 986.318822 | 3038.911652 | 1329.243278 | 0 | 1 | 1 | 1 | 7 | 4.833333 |
714 | BAYHEALTH - KENT GENERAL HOSPITAL | DE | -1201.681178 | 1925.911652 | 662.243278 | 1 | 0 | 1 | 1 | 4 | 4.833333 |
2343 rows × 11 columns
selection = alt.selection_single()
c6 = alt.Chart(df2).mark_circle().encode(
    y = 'type_state',
    x = 'type_state',
    color = 'State',
    tooltip = [alt.Tooltip('type_state'), alt.Tooltip('State')]
).add_selection(selection)
c6
Here I plot each state’s mean cluster code in order.
I treat states with type_state >= 3.5 as outstanding, which gives:
temp_lst = [df2.loc[i,'State'] for i in df2.index if df2.loc[i,'type_state'] >= 3.5]
o_state = np.array(temp_lst)
o_state = np.unique(o_state)
print(f"Outstanding states are: {o_state}")
Outstanding states are: ['CT' 'DC' 'DE' 'FL' 'IL' 'KS' 'MA' 'NE' 'NH' 'NJ' 'NV' 'RI' 'UT']
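The same list can be obtained with a boolean mask instead of a Python loop over the index. A minimal sketch on a hypothetical toy dataframe that reuses the column names of `df2`:

```python
import numpy as np
import pandas as pd

# toy data with the same column names as df2 (values are made up)
toy = pd.DataFrame({
    "State": ["DE", "DE", "SD", "NJ", "SD"],
    "type_state": [4.8, 4.8, 0.8, 4.7, 0.8],
})

# keep rows at or above the threshold, then list each state once, sorted
o_state = np.sort(toy.loc[toy["type_state"] >= 3.5, "State"].unique())
print(f"Outstanding states are: {o_state}")
```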
Section 2: Predict the AMI payment by using the K-Nearest Neighbors Regressor¶
df3 = df1.copy().drop('pred', axis=1)
df3
 | Hospital name | State | AMI_Payment | HF_Payment | PN_Payment |
---|---|---|---|---|---|
0 | MARSHALL MEDICAL CENTER SOUTH | AL | 23171.0 | 16376.0 | 14384.0 |
1 | CRESTWOOD MEDICAL CENTER | AL | 20007.0 | 16649.0 | 13168.0 |
2 | PROVIDENCE ALASKA MEDICAL CENTER | AK | 24309.0 | 14229.0 | 13258.0 |
3 | CHI-ST VINCENT INFIRMARY | AR | 23600.0 | 15339.0 | 12303.0 |
4 | CHICOT MEMORIAL MEDICAL CENTER | AR | 23543.0 | 14558.0 | 10817.0 |
... | ... | ... | ... | ... | ... |
2338 | MEMORIAL MEDICAL CENTER | WI | 20340.0 | 15377.0 | 13525.0 |
2339 | BAYLOR SCOTT AND WHITE MEDICAL CENTER SUNNYVALE | TX | 22608.0 | 15453.0 | 15425.0 |
2340 | BAYLOR SCOTT AND WHITE MEDICAL CENTER MCKINNEY | TX | 23941.0 | 17143.0 | 13979.0 |
2341 | SETON MEDICAL CENTER HARKER HEIGHTS | TX | 22231.0 | 17935.0 | 12097.0 |
2342 | NORTH CYPRESS MEDICAL CENTER | TX | 23587.0 | 16171.0 | 13932.0 |
2343 rows × 5 columns
X_train, X_test, y_train, y_test = train_test_split(
    df3[["HF_Payment", "PN_Payment"]], df3["AMI_Payment"], test_size = 0.4)
reg = KNeighborsRegressor(n_neighbors=2)
reg.fit(X_train, y_train)
KNeighborsRegressor(n_neighbors=2)
df3['pred'] = reg.predict(df3[["HF_Payment", "PN_Payment"]])
df3
 | Hospital name | State | AMI_Payment | HF_Payment | PN_Payment | pred |
---|---|---|---|---|---|---|
0 | MARSHALL MEDICAL CENTER SOUTH | AL | 23171.0 | 16376.0 | 14384.0 | 21394.5 |
1 | CRESTWOOD MEDICAL CENTER | AL | 20007.0 | 16649.0 | 13168.0 | 24125.0 |
2 | PROVIDENCE ALASKA MEDICAL CENTER | AK | 24309.0 | 14229.0 | 13258.0 | 23598.0 |
3 | CHI-ST VINCENT INFIRMARY | AR | 23600.0 | 15339.0 | 12303.0 | 22985.5 |
4 | CHICOT MEMORIAL MEDICAL CENTER | AR | 23543.0 | 14558.0 | 10817.0 | 23346.0 |
... | ... | ... | ... | ... | ... | ... |
2338 | MEMORIAL MEDICAL CENTER | WI | 20340.0 | 15377.0 | 13525.0 | 22522.5 |
2339 | BAYLOR SCOTT AND WHITE MEDICAL CENTER SUNNYVALE | TX | 22608.0 | 15453.0 | 15425.0 | 22758.5 |
2340 | BAYLOR SCOTT AND WHITE MEDICAL CENTER MCKINNEY | TX | 23941.0 | 17143.0 | 13979.0 | 22909.5 |
2341 | SETON MEDICAL CENTER HARKER HEIGHTS | TX | 22231.0 | 17935.0 | 12097.0 | 21336.5 |
2342 | NORTH CYPRESS MEDICAL CENTER | TX | 23587.0 | 16171.0 | 13932.0 | 23185.5 |
2343 rows × 6 columns
c11 = alt.Chart(df3).mark_circle().encode(
    x = alt.X('HF_Payment', scale=alt.Scale(zero=False)),
    y = alt.Y('pred', scale=alt.Scale(zero=False))
)
c12 = alt.Chart(df3).mark_circle(color='purple').encode(
    x = alt.X('HF_Payment', scale=alt.Scale(zero=False)),
    y = alt.Y('AMI_Payment', scale=alt.Scale(zero=False))
)
c11+c12
reg.score(df3[["HF_Payment", "PN_Payment"]], df3[['AMI_Payment']])
0.06560938956960727
Although the two sets of points in the chart above look similar, the R² score of the regressor is only about 0.066. With this model, HF_Payment and PN_Payment explain very little of the variation in AMI_Payment, so I find little evidence of a relation between them.
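Since the score above is computed on all rows, including those the model was trained on, it can help to look at the training and test sets separately and to try more than one value of `n_neighbors`. A minimal sketch on synthetic data (the arrays below are hypothetical and unrelated to the hospital data):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
# the target depends weakly on the first feature, plus noise
y = 0.5 * X[:, 0] + rng.normal(scale=1.0, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0)

scores = {}
for k in [2, 10, 50]:
    reg = KNeighborsRegressor(n_neighbors=k).fit(X_train, y_train)
    # R^2 on rows the model has seen versus rows it has not
    scores[k] = (reg.score(X_train, y_train), reg.score(X_test, y_test))
    print(k, scores[k])
```

A training score far above the test score for small `n_neighbors` suggests the model is fitting noise, and comparing several values of `n_neighbors` on the test set is one way to choose between them.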
Summary¶
In this project, I used KMeans to group hospital payments and to visualize overall payment levels by state. The three states with the highest payment levels, DE, NJ, and NE, are also ranked very highly for health care in the U.S., so the result broadly matches the real situation. In addition, I used two of the payments to predict the third, but the R² score was very low, so I found little evidence of a connection between them.
References¶
The dataset “Payment_and_value_of_care_-_Hospital.csv” was adapted from Hospital Payment and Value of Care