Tags: There are currently no tags associated with this assignment group.

Students analyze aggregate Medicare claim information from US hospitals to answer the question: where should someone live to minimize their medical costs?

Overview

The purpose of this assignment is to have students do both the data mining (using something like Pandas or R) and the analysis (in report form). The health care data it relies on is fairly clean, but different years report slightly different service codes, which students need to account for.
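
One possible way to handle the differing service codes is to restrict the comparison to codes reported in every year before averaging costs by state. The sketch below illustrates that idea in plain Python so it runs without extra libraries; the same steps map onto a Pandas filter followed by a group-by. The rows, state abbreviations, service codes, and dollar amounts are made up for illustration, and the function names are hypothetical. The real Medicare files use their own columns and codes.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, made-up rows: (year, state, service_code, average_payment_usd).
# These are NOT real Medicare figures; real column names and codes will differ.
ROWS = [
    (2015, "MA", "A01", 1000.0),
    (2015, "MA", "B02", 3000.0),  # B02 is reported only in 2015
    (2015, "TX", "A01", 800.0),
    (2015, "TX", "B02", 2600.0),
    (2016, "MA", "A01", 1100.0),
    (2016, "MA", "C03", 5000.0),  # C03 is reported only in 2016
    (2016, "TX", "A01", 900.0),
    (2016, "TX", "C03", 4000.0),
]


def codes_in_every_year(rows):
    """Return the set of service codes that appear in all years present."""
    by_year = defaultdict(set)
    for year, _state, code, _cost in rows:
        by_year[year].add(code)
    if not by_year:
        return set()
    return set.intersection(*by_year.values())


def mean_cost_by_state(rows, codes):
    """Average the payment per state, using only the given service codes."""
    costs = defaultdict(list)
    for _year, state, code, cost in rows:
        if code in codes:
            costs[state].append(cost)
    return {state: mean(values) for state, values in costs.items()}


shared = codes_in_every_year(ROWS)
# Sort states from cheapest to most expensive on the shared codes only.
ranking = sorted(mean_cost_by_state(ROWS, shared).items(), key=lambda kv: kv[1])
print(shared)
print(ranking)
```

Keeping only the shared codes is just one choice. It avoids comparing a year that happens to include an expensive service against a year that does not, at the cost of discarding data, so a report would need to state and justify that decision.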

Author: Hank Feild

Web links:

There are currently no web resources associated with this assignment group.

Versions of assignment by class:

CSC440—Data Mining & Visualization (Feild, Spring 2019)


CSC440 Data Mining & Visualization, Spring 2019

Tags:

spring 2019 (9)

Notes

This is the first run of the assignment. Much of the instruction for the software (Python) took place in class, though there were also weekly readings from a book on Python for data analysis and, for visualizations, from a data mining textbook.

Outcome summary

I was hoping students would use a number of visualizations, including geospatial maps, but the Python library we tried to use wasn't working in class. Several students included plots that could have been vastly improved, but we didn't spend much time on that in class. About half the students did a very nice deep dive into the data, as I was expecting. Almost all of them analyzed the suggested data in isolation, i.e., without looking at other sources of information. I wasn't expecting that, but I should have, given how the assignment is worded. The formatting in some of the reports was wanting. In the future, I would offer students additional data sources (e.g., a standard-of-living multiplier or something similar). I would also spend more time going over visualizations and appropriate ways of using bar graphs, pie charts, etc. Finally, I would provide models of good and poor analyses for similar research questions.

Instructor:	Hank Feild
Field of study:	Computer Science
Learning curve:	High
Hours of instruction:	12.0
Assignment duration:	4.0 weeks
Students given assignment:	8

Where should one live in the US to minimize uninsured medical costs?
