# Data Manipulation

The purpose of this assignment is to use spreadsheet capabilities to perform data manipulation and to explain the process used in the handling of the data.

For this assignment, you will use the “Claims” dataset. In the dataset, the claims data for n = 608 people are recorded. The data derive from a random sample of females diagnosed with ischemic heart disease over 24 months (see Exercise 7.27 in the textbook).

Instead of using urgent care centers, some people rely on the Emergency Room (ER) to address most, if not all, of their medical needs. In fact, someone who has three or more ER visits within 24 months is considered a high ER user. Complete the steps below to execute this assignment.

1. Using the dataset and Excel, create a new column titled “High_ER_User” that contains “Yes” if the person has three or more ER visits and “No” otherwise.
2. Duration is measured in days, but 30-day intervals are more appropriate for most reporting purposes. Using Excel, create a new column titled “Duration_Months” by converting the duration from days into 30-day intervals (that is, the duration in days divided by 30).
3. Complications and comorbidities are often rare; therefore, these two negative events are commonly summed. Using Excel, create a new column titled “Comps_Comorbs” by adding complications and comorbidities.
4. Age is often grouped in 10-year intervals. Using Excel’s VLOOKUP function, create a new column titled “Age_Group” with grouped ages of “21-30 yrs,” “31-40 yrs,” and so on in 10-year intervals. The last age group is “61-70 yrs.” Use a tab titled “Age_Groups” for the lookup table.
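
The logic behind the four new columns can be sketched outside Excel as a way to check your spreadsheet results. The Python sketch below is illustrative only and is not part of the required deliverable. The source column names (`ER_Visits`, `Duration`, `Complications`, `Comorbidities`, `Age`) are assumptions, so match them to the actual headers in the “Claims” dataset. The intermediate age groups are inferred from “and so on,” and the division by 30 is left unrounded because the assignment does not specify rounding. The age lookup mimics VLOOKUP with approximate match, which returns the row with the largest lower bound that does not exceed the lookup value.

```python
from bisect import bisect_right

# Hypothetical lookup table mirroring an "Age_Groups" tab: (lower bound, label).
# The first column must be sorted ascending, as VLOOKUP approximate match requires.
AGE_GROUPS = [
    (21, "21-30 yrs"),
    (31, "31-40 yrs"),
    (41, "41-50 yrs"),
    (51, "51-60 yrs"),
    (61, "61-70 yrs"),
]
LOWER_BOUNDS = [low for low, _ in AGE_GROUPS]


def age_group(age):
    """Approximate-match lookup: pick the largest lower bound that is <= age."""
    idx = bisect_right(LOWER_BOUNDS, age) - 1
    if idx < 0:
        raise ValueError(f"age {age} is below the first group")
    return AGE_GROUPS[idx][1]


def add_derived_columns(row):
    """Return a copy of one claims record with the four new columns added."""
    out = dict(row)
    # Step 1: three or more ER visits marks a high ER user.
    out["High_ER_User"] = "Yes" if row["ER_Visits"] >= 3 else "No"
    # Step 2: convert days to 30-day intervals (unrounded; an assumption).
    out["Duration_Months"] = row["Duration"] / 30
    # Step 3: sum the two negative events.
    out["Comps_Comorbs"] = row["Complications"] + row["Comorbidities"]
    # Step 4: grouped age via the lookup table.
    out["Age_Group"] = age_group(row["Age"])
    return out


# Made-up record for illustration only; not taken from the Claims dataset.
sample = {"Age": 47, "ER_Visits": 4, "Duration": 365,
          "Complications": 1, "Comorbidities": 2}
result = add_derived_columns(sample)
print(result)
```

Note that, as with an approximate-match VLOOKUP, any age above 70 would also fall into “61-70 yrs,” so it is worth confirming that the dataset contains no ages outside the 21-70 range.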

Next, create a pivot table from the data and complete the following steps (refer to the examples in the resource “Data Manipulation Screenshots”).

1. Use “High_ER_User” as a filter to obtain two filtered views of the pivot table.
2. Summarize the data to get the count of claims, the sums of claims and duration months, and the averages of procedures, prescribed drugs, ER visits, and complications/comorbidities.
3. Add a calculated field titled “Claims PM” to the pivot table. This calculated field is the sum of claims divided by the sum of duration months and measures the average claim amount per month (PM).
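
The pivot-table summary for one filtered view can be sketched as follows. This Python example is a rough check on the arithmetic, not a substitute for the Excel pivot table, and it shows only a subset of the summarized fields. The field names `Claims` and `ER_Visits` and all of the values are made up for illustration. The point to notice is that “Claims PM” is the sum of claims divided by the sum of duration months for the filtered view, which generally differs from the average of each person's own claims-per-month ratio.

```python
from statistics import mean

# Made-up records for illustration only; not taken from the Claims dataset.
rows = [
    {"High_ER_User": "Yes", "Claims": 9000.0, "Duration_Months": 10.0, "ER_Visits": 4},
    {"High_ER_User": "Yes", "Claims": 3000.0, "Duration_Months": 2.0, "ER_Visits": 3},
    {"High_ER_User": "No", "Claims": 1200.0, "Duration_Months": 12.0, "ER_Visits": 0},
]


def summarize(records, high_er_user):
    """Summarize one filtered view, as the High_ER_User pivot filter would."""
    view = [r for r in records if r["High_ER_User"] == high_er_user]
    if not view:
        raise ValueError(f"no records with High_ER_User == {high_er_user!r}")
    total_claims = sum(r["Claims"] for r in view)
    total_months = sum(r["Duration_Months"] for r in view)
    return {
        "Count of Claims": len(view),
        "Sum of Claims": total_claims,
        "Sum of Duration_Months": total_months,
        "Average of ER_Visits": mean(r["ER_Visits"] for r in view),
        # Calculated field: ratio of the two sums, not a mean of per-row ratios.
        "Claims PM": total_claims / total_months,
    }


yes_view = summarize(rows, "Yes")
no_view = summarize(rows, "No")
print(yes_view)
print(no_view)
```

In the “Yes” view above, the per-person ratios are 900 and 1,500 (mean 1,200), while the ratio of sums is 12,000 / 12 = 1,000, so the two approaches give different answers.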

APA format is not required, but solid academic writing is expected.

This assignment uses a grading rubric. Please review the rubric prior to beginning the assignment to become familiar with the expectations for successful completion.
