ECONOMICS 375
Brian Phelan
Fall 2014
This handout provides a very brief introduction to STATA, a convenient and versatile econometrics package. In the past 20 years, STATA has become one of the leading statistical programs used by economic researchers. STATA was written by economists so it is more intuitive for researchers in our field. It is fast and relatively easy to use.
STATA's speed advantage comes from the fact that all data are loaded into RAM. Consequently, the amount of available memory limits the size of the data sets you can analyze. Given the size of the data sets we will use in class and the available memory on typical machines, this will not prove to be a constraint.
All the STATA data files, sample programs, this handout, etc., will be available on D2L.
This outline demonstrates those STATA procedures necessary for the course. However, this handout only scratches the surface of STATA's capabilities. The text is written so that you should be able to follow along on a computer with STATA and gradually build up to the point where you can generate simple statistics.
Some places on the web where you can learn more about STATA include:
STATA FAQs http://www.stata.com/support/faqs/
The STATA listserv http://www.stata.com/statalist/
UCLA's resources for learning STATA http://www.ats.ucla.edu/stat/stata/
STATA Availability
STATA version 11 is available in all computer labs on campus. If you want your own copy of STATA, version 13/IC (Intercooled – very fast) is available for a one-year site license fee of $98. This product can be purchased through the STATA Grad Purchase plan at http://www.stata.com/order/new/edu/gradplans/campus-gradplan/.
This is not required for class, but, if you want to use STATA on your own laptop/desktop, this is the only available avenue.
Once you are into STATA
Click on the STATA icon and the program will open.
When you first enter STATA, the screen will look like Figure 1 below. You will notice that there are five boxes on the screen. I want to focus on four at this time.
Area A is called the command line. This is where you will type executable statements.
Area B is the variable list. Once you load a data set into STATA, all the variables available to you will be listed in the box.
Area C is the review box and it will contain a history of all the commands executed during this STATA session.
Area D is where any results will be reported.
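Figure 1: The STATA screen, with areas A (command line), B (variable list), C (review box), and D (results) labeled.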
The command line is the active area of the screen where you will be typing all your commands. The contents of the other boxes will be determined by what you type here. Once in STATA, the cursor should be blinking in the command line indicating to you that the program is waiting to accept input. Commands are executed by hitting return after you have typed the command.
Throughout this tutorial, anything written in COURIER FONT is a command that should be executed through the command line.
There are two ways to produce statistics in STATA. First, you can type executable statements one at a time at the command line and execute each as you go. Alternatively, you can write an entire program that contains a group of executable statements, then submit the program from the command line. In the text below, I use the 'line-by-line' interactive approach, but in Appendix 1, I provide a STATA program, cps87.do, that generates all the results in this tutorial. At the end of the handout, I also outline how to execute the batch program. The results from this single program are reported in Appendix 2. Please refer to these results when you want to see the output from any particular line-by-line statement in the tutorial below.
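As a rough illustration of the two approaches, the sketch below first issues a few commands one at a time and then submits a whole program. It uses auto.dta, an example data set that ships with STATA, purely as a stand-in; it is not one of the course data sets. The last line assumes that cps87.do has been saved in STATA's current working directory.

```stata
* Interactive approach: type each line at the command line and hit return.
* auto.dta is an example data set installed with STATA (illustration only).
sysuse auto, clear
describe
summarize price mpg

* Batch approach: run every statement stored in a program file.
* Assumes cps87.do is in the current working directory.
do cps87.do
```

Both approaches send the same commands to STATA. The batch version simply keeps them in one file so the whole analysis can be rerun later.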
From the command line, you can ask for help at any point. Suppose you wanted some information about how to describe the contents of data sets. From the command line, you would type
help describe
then hit return. A pop-up box appears that outlines the syntax for the 'describe' command. Notice also that