Resources for Data Science/Big Data

Tools

Atom

Atom is a nice text editor with support for multiple languages.

Because Atom is not downloaded from the App Store, macOS may block it the first time you open it. You need to allow your Mac to open an app from an unidentified developer: after the blocked attempt, the app appears in System Preferences > Security & Privacy, under the General tab. Click Open Anyway to confirm your intent to open or install the app.

To install the atom and apm commands, run “Window: Install Shell Commands” from the Command Palette (press Cmd+Shift+P to open it). You will be prompted for an administrator password.
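To confirm that the shell commands were installed, you can check whether they are visible on your PATH. The snippet below is a minimal sketch, not part of Atom itself: the helper name find_shell_commands is hypothetical, and it only assumes that the installer places atom and apm somewhere on the PATH of the shell you run it from.

```python
import shutil


def find_shell_commands(names=("atom", "apm")):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    # shutil.which mimics the shell's lookup and returns None when nothing is found.
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_shell_commands().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If either command is reported as not found, opening a new terminal window and running the check again is a reasonable first step, since a terminal that was already open may not see the change.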

Git and GitHub

Together, Git and GitHub provide a widely used workflow for version control.

  1. RStudio and GitHub

R and RStudio

I have been gradually migrating from Stata to R because R has many useful packages that facilitate both teaching and (reproducible) research in an elegant way. For example, this website was first built entirely with Radix, an R package by Yihui Xie and his coauthors that has since been replaced by Distill. Recently, I have also started using bookdown to organize and write notes and papers.

Click here for some “translations” between R and Stata

Data Visualization Tools

  1. Patchwork: Multiple Plots

R Markdown

  1. R Markdown for Medicine Workshop

  2. Stata and R Markdown

Website

  1. Tutorials on Creating Distill Website

Stata

  1. lassopack: Model Selection and Prediction with Regularized Regression in Stata (Stata article)

R and Stata Packages to Process Data Commonly Used in Economics

  1. CPS Data on IPUMS

  2. U.S. Department of Education College Scorecard

  3. PSID Tools in Stata

  4. PSID in R

  5. The Standardized World Income Inequality Database

  6. Tidycensus

  7. Other Census Bureau datasets

  8. Analyze Survey Data for Free (includes many publicly available datasets)