Question Description
THIS ASSIGNMENT MUST BE DONE IN EXCEL. THERE ARE 2 “NEW QUESTIONS” AT THE BOTTOM OF THIS PAGE. THOSE ANSWERS MUST BE DONE IN WORD.
The purpose of this assignment is to use spreadsheet capabilities to
perform data manipulation and to explain the process used in the
handling of the data.
For this assignment, you will use the “Claims” dataset. In the
dataset, the claims data for n = 608 people are recorded. The data
derive from a random sample of females diagnosed with ischemic heart
disease over 24 months (see Exercise 7.27 in the textbook).
Instead of using urgent care centers, some people rely on the
Emergency Room (ER) to address most, if not all, of their medical needs.
In fact, someone who has three or more ER visits within 24 months is
considered a high ER user. Complete the steps below to execute this
assignment.
- Using the dataset and Excel, create a new column titled “High_ER_User” containing “Yes” for people with three or more ER visits and “No” otherwise.
- Duration is measured in days, but 30-day intervals are more
appropriate for most reporting purposes. Using Excel, create a new
column titled “Duration_Months” by converting the duration into 30-day
intervals.
- Many times complications and comorbidities are rare; therefore,
these two negative events are summed together. Using Excel, create a new
column titled “Comps_Comorbs” by adding complications with
comorbidities.
- Many times age is grouped in 10-year intervals. Using Excel’s
VLOOKUP function, create a new column titled “Age_Group” with grouped
ages of “21-30 yrs,” “31-40 yrs,” and so on for 10-year intervals. The
last age group would be “61-70 yrs.” Use a tab titled “Age_Groups” for
this task.
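The logic behind the four new columns can be sketched outside Excel to check the expected results. The Python below is only an illustration, not the required Excel solution: the field names (er_visits, duration_days, complications, comorbidities, age), the sample record, and the layout of the lookup table are all assumptions, since the actual column headers of the “Claims” dataset are not given here. It also assumes that “converting to 30-day intervals” means dividing the duration by 30 without rounding.

```python
from bisect import bisect_right

# Hypothetical lookup table mirroring an "Age_Groups" tab: each row is
# (lower bound of the age band, label). Rows are sorted ascending, which
# is also what VLOOKUP's approximate match requires.
AGE_GROUPS = [
    (21, "21-30 yrs"),
    (31, "31-40 yrs"),
    (41, "41-50 yrs"),
    (51, "51-60 yrs"),
    (61, "61-70 yrs"),
]


def age_group(age):
    """Return the label of the last band whose lower bound is <= age."""
    lower_bounds = [low for low, _ in AGE_GROUPS]
    i = bisect_right(lower_bounds, age) - 1
    if i < 0:
        raise ValueError(f"age {age} is below the first band")
    return AGE_GROUPS[i][1]


def add_columns(row):
    """Return a copy of one claims record with the four derived columns."""
    out = dict(row)
    out["High_ER_User"] = "Yes" if row["er_visits"] >= 3 else "No"
    # Assumption: plain division by 30, no rounding.
    out["Duration_Months"] = row["duration_days"] / 30
    out["Comps_Comorbs"] = row["complications"] + row["comorbidities"]
    out["Age_Group"] = age_group(row["age"])
    return out


# Made-up record for illustration only; not taken from the Claims dataset.
example = {"age": 47, "er_visits": 3, "duration_days": 90,
           "complications": 1, "comorbidities": 2}
result = add_columns(example)
```

In Excel the same steps would typically be an IF formula such as =IF(C2>=3,"Yes","No"), a division such as =D2/30, a simple sum, and an approximate-match lookup such as =VLOOKUP(A2,Age_Groups!$A$2:$B$6,2,TRUE). The cell references are placeholders and depend on where the columns actually sit in the workbook.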
Next you will create a pivot table with the data and execute the
following (refer to the examples in the resource “Data Manipulation
Screenshots”).
- Use “High_ER_User” as a filter to obtain two filtered views of the pivot table.
- Summarize the data to get counts of claims, sum of claims and
months, and average of procedures, prescribed drugs, ER visits, and
complications/comorbidities.
- Add a calculated field titled “Claims PM” to the pivot table. This
calculated field is the sum of claims divided by the sum of duration
months and measures the average claim amount per month (PM).
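The pivot-table steps above can be mirrored in a few lines of Python to see what each summary should contain. This is a rough sketch under stated assumptions: the field names (claims, procedures, drugs, er_visits) and the three sample rows are invented for illustration and are not from the “Claims” dataset. The point it makes is that “Claims PM” is a ratio of sums within the filtered view, which generally differs from the average of each person's claims per month.

```python
from statistics import mean


def summarize(rows, high_er_user):
    """Summarize the rows matching one setting of the High_ER_User filter."""
    subset = [r for r in rows if r["High_ER_User"] == high_er_user]
    if not subset:
        raise ValueError("no rows match the filter")
    sum_claims = sum(r["claims"] for r in subset)
    sum_months = sum(r["Duration_Months"] for r in subset)
    return {
        "count_claims": len(subset),
        "sum_claims": sum_claims,
        "sum_months": sum_months,
        "avg_procedures": mean([r["procedures"] for r in subset]),
        "avg_drugs": mean([r["drugs"] for r in subset]),
        "avg_er_visits": mean([r["er_visits"] for r in subset]),
        "avg_comps_comorbs": mean([r["Comps_Comorbs"] for r in subset]),
        # Calculated field: sum of claims divided by sum of duration months.
        "claims_pm": sum_claims / sum_months,
    }


# Made-up rows for illustration only.
rows = [
    {"High_ER_User": "Yes", "claims": 1000, "Duration_Months": 4,
     "procedures": 2, "drugs": 3, "er_visits": 3, "Comps_Comorbs": 1},
    {"High_ER_User": "Yes", "claims": 500, "Duration_Months": 1,
     "procedures": 4, "drugs": 1, "er_visits": 5, "Comps_Comorbs": 3},
    {"High_ER_User": "No", "claims": 200, "Duration_Months": 2,
     "procedures": 1, "drugs": 0, "er_visits": 1, "Comps_Comorbs": 0},
]

high = summarize(rows, "Yes")  # first filtered view
low = summarize(rows, "No")    # second filtered view
```

For the two “Yes” rows, the ratio of sums is 1500 / 5 = 300 per month, whereas averaging the per-person rates (250 and 500) would give 375. The assignment's definition of “Claims PM” corresponds to the first figure.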
APA format is not required, but solid academic writing is expected.
This assignment uses a grading rubric. Please review the rubric prior
to beginning the assignment to become familiar with the expectations
for successful completion.
Excel spreadsheet column updates specified in steps 1-4 are complete and correct.
The pivot table described in steps 1-3 is complete and correct.
New Question
Suppose you had daily temperature data indicating the “high” point of
each day for 2015. If you want to show how the high differs over time,
what are some of the plot types that will allow you to do this? What are
some benefits to binning the data into one of 52 weeks and plotting the
average high for each week? Would it make sense to do something similar
for the four quarters in the year? Why or why not?
New Question
Many times, data are missing because of various
reasons. This poses some challenges when doing data analysis. For
example, suppose you wanted to do some analysis of the yearly incomes of
the faculty at GCU. When asked for their incomes, 25% of the faculty
did not participate in the survey; therefore, their incomes are missing
from the dataset. How would you summarize the income data in this case?
Is it appropriate to ignore the missing incomes and summarize the data
without them? Should you estimate the missing incomes, perhaps with the
overall average, to complete the data set?
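One way to explore the two options in the missing-income question is to compare them numerically. The sketch below uses made-up incomes (not GCU data) with 25% of the values missing, and contrasts summarizing only the observed incomes with filling the gaps using the overall average. It is meant as a starting point for the discussion, not as a recommended method.

```python
from statistics import mean, stdev

# Made-up incomes for illustration; None marks a faculty member who did
# not respond. Two of eight values (25%) are missing.
incomes = [52000, 60000, None, 75000, 58000, None, 91000, 66000]

# Option 1: ignore the missing values and summarize what was observed.
observed = [x for x in incomes if x is not None]
observed_mean = mean(observed)
observed_sd = stdev(observed)

# Option 2: replace each missing value with the mean of the observed ones.
imputed = [observed_mean if x is None else x for x in incomes]
imputed_mean = mean(imputed)
imputed_sd = stdev(imputed)

response_rate = len(observed) / len(incomes)
```

Filling with the average leaves the mean unchanged but shrinks the standard deviation, because the added values sit exactly at the center and make the data look less variable than it is. Neither option addresses the possibility that the people who declined to answer differ systematically from those who did, so a summary would normally also report the response rate.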