Week 6 - Linking Data


Data come from many different sources. In this class alone, you are working with multiple data sets that describe distinct aspects of the city. Part of the opportunity of “big data” is the potential to coordinate these data sets and analyze them in conjunction with each other. To do so, however, we must leverage the schemas we learned about in the last module (that is, the people, places, and things that the data sets share in common) to link them. In this module, you will leverage the geographical infrastructure of Boston to develop aggregate measures describing some unit of analysis (e.g., streets, census tracts), and then link them with other information describing those same units. We will also learn about the efforts of Actionable Intelligence for Social Policy to solve this same problem when linking information about individuals across the data generated by the many different agencies responsible for health and human services.
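The aggregate-then-link workflow described above can be sketched in a few lines. This is only an illustration, not the course's prescribed method: the tract IDs, the service-request records, the population figures, and the helper names `aggregate_counts` and `link` are all hypothetical, and it uses plain Python so that it runs without any packages. The same two steps correspond to a group-by followed by a merge in whatever tool you use for the assignment.

```python
from collections import defaultdict

# Hypothetical record-level data: one row per service request,
# each tagged with the census tract it falls in.
requests = [
    {"tract_id": "T001", "type": "pothole"},
    {"tract_id": "T001", "type": "graffiti"},
    {"tract_id": "T002", "type": "pothole"},
]

# Hypothetical tract-level data from a second source.
tract_info = {
    "T001": {"population": 4000},
    "T002": {"population": 2500},
    "T003": {"population": 3100},
}


def aggregate_counts(records, key):
    """Step 1: collapse records to one count per unit of analysis."""
    counts = defaultdict(int)
    for record in records:
        counts[record[key]] += 1
    return dict(counts)


def link(unit_table, counts, measure_name):
    """Step 2: attach the aggregate measure to every unit via the shared ID.

    Units with no matching records are kept and given a count of 0,
    which is the behavior of a left join from the unit table.
    """
    linked = {}
    for unit_id, attrs in unit_table.items():
        row = dict(attrs)  # copy so the source table is left unchanged
        row[measure_name] = counts.get(unit_id, 0)
        linked[unit_id] = row
    return linked


counts = aggregate_counts(requests, "tract_id")
linked = link(tract_info, counts, "n_requests")

# Once linked, measures from both sources can be combined,
# e.g. requests per 1,000 residents.
for tract_id, row in sorted(linked.items()):
    rate = 1000 * row["n_requests"] / row["population"]
    print(tract_id, row["n_requests"], round(rate, 2))
```

Keeping every unit from the tract table, including tracts with no requests, is a deliberate choice here: dropping them would quietly remove the places where the measure is zero.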

Learning objectives

Substantive Readings

Choose one of the two papers:

And peruse Actionable Intelligence for Social Policy.


Integrated data systems (IDSs) merge a wealth of personal data to support comprehensive scholarly and policy analysis.

Try to draw a diagram of the schema that you think enabled the study you read. What other research questions do you think the data in that schema might be able to answer?

Technical Readings

Data Assignment

Last week you identified a latent construct and the manifest variables that will help you create it. For this week’s data assignment, create your latent construct and conduct an analysis based on it.

Please submit your rendered output as well as the original Rmd document.

Assignment - Service Learning Response

Now that you’ve completed your midterm, how did the city walk and in-class workshop influence the way you apply your analysis in practice when working with the data? Going forward, what are your goals for the practical application of your work?

Due Friday at 23:59.

2-3 pages, double-spaced.