# FIN 3010 Business Finance Middle Tennessee State University


1 Getting started

1.1 What is Stata?

1.2 What does Stata look like?

1.3 Getting help

2 Data management in Stata

2.1 Variables and data types

2.2 Formats and variable labels

2.3 Data input and saving

2.4 Data description

2.5 Changing data

2.6 Generating new variables

2.7 Plots

2.8 Keeping track of your work

2.9 Saving data and results

3 Simple linear regression – estimation of an optimal hedge ratio

4 Hypothesis testing – Example 1: hedging revisited

5 Estimation and hypothesis testing – Example 2: the CAPM

6 Sample output for multiple hypothesis tests

7 Multiple regression using an APT-style model

7.1 Stepwise regression

8 Quantile regression

9 Calculating principal components

10 Diagnostic testing

10.1 Testing for heteroscedasticity

10.2 Using White’s modified standard error estimates

10.3 The Newey-West procedure for estimating standard errors

10.4 Autocorrelation and dynamic models

10.5 Testing for non-normality

10.6 Dummy variable construction and use

10.7 Multicollinearity

10.8 RESET tests

10.9 Stability tests

11 Constructing ARMA models

12 Forecasting using ARMA models

13 Estimating exponential smoothing models

15 VAR estimation

16 Testing for unit roots

17 Testing for cointegration and modelling cointegrated systems

18 Volatility modelling

18.1 Testing for ‘ARCH effects’ in exchange rate returns

18.2 Estimating GARCH models

18.3 GJR and EGARCH models

18.4 GARCH-M estimation

18.5 Forecasting from GARCH models

18.6 Estimation of multivariate GARCH models

19 Modelling seasonality in financial data

19.1 Dummy variables for seasonality

19.2 Estimating Markov switching models

20 Panel data models

20.1 Testing for unit roots and cointegration in panels

21 Limited dependent variable models

22 Simulation methods

22.1 Deriving critical values for a Dickey-Fuller test using simulation

22.2 Pricing Asian options

22.3 VaR estimation using bootstrapping

23 The Fama-MacBeth procedure

1 Getting started

1.1 What is Stata?

Stata is a statistical package for managing, analysing, and graphing data.[1] Its main strengths are handling and manipulating large data sets and its ever-growing capabilities for panel and time-series regression analysis. Besides its wealth of diagnostic tests and estimation routines, one feature that makes Stata suitable for both novices and experts in econometric analysis is that it may be used either as a point-and-click application or as a command-driven package. Stata’s graphical user interface provides an easy entry point for those new to Stata and for experienced users who wish to execute a command that they seldom use. The command language provides a fast way to communicate with Stata and to express more complex ideas. In this guide we will primarily work with the graphical user interface. However, we also provide the corresponding commands where suitable.

This guide is based on Stata 14.0. Please note that if you use an earlier version of Stata, the design of the specification windows as well as certain features of the menu structure might differ. As different statistical software packages might use different algorithms for some of their estimation techniques, the results generated by Stata might not match those generated by EViews in every instance.

A good way of familiarising yourself with Stata is to learn about its main menus and their relationships through the examples given in this guide.

This section assumes that readers have a licensed copy of Stata and have successfully installed it on an available computer. There now follows a description of the Stata package, together with instructions to achieve standard tasks and sample output. Any instructions that must be entered or icons to be clicked are illustrated by bold-faced type. Note that Stata is case-sensitive. Thus, it is important to enter commands in lower-case and to refer to variables exactly as they were originally defined, i.e. in lower-case or CAPITAL letters.
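As a minimal sketch of what case sensitivity means in practice, the lines below create two distinct variables whose names differ only in capitalisation. The variable names and values are made up for illustration and do not come from any dataset used in this guide.

```stata
* Illustrative only: variable names and values are hypothetical
clear
set obs 3
generate price = _n
* Price (capital P) is a different variable from price
generate Price = 10 * _n
list price Price
* Commands are case-sensitive too: typing LIST instead of list returns an error
```

Because both variables can coexist, a misplaced capital letter in a variable name typically leads either to a "variable not found" error or, worse, to silently using the wrong series.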

1.2 What does Stata look like?

When you open Stata you will be presented with the Stata main window, which should resemble Figure 1. You will soon realise that the main window is actually sub-divided into several smaller windows. The five most important windows are the Review, Output, Command, Variables, and Properties windows (as indicated in the screenshot below). This sub-section briefly describes the characteristics and main functions of each window. There are other, more specialized windows such as the Viewer, Data Editor, Variables Manager, and Do-file Editor, which are discussed later in this guide.[2]

The Variables window to the right shows the list of variables in the dataset, along with selected properties of the variables. By default, it shows all the variables and their labels. You can change the properties that are displayed by right-clicking on the header of any column of the Variables window.

Below the Variables window you will find the Properties window. It displays variable and dataset properties. If you select a single variable in the Variables window, this is where its properties are displayed. If multiple variables are selected in the Variables window, the Properties window will display the properties that are common across all selected variables.

Commands are submitted to Stata via the Command window. Assuming you know what command you would like to use, you just type it into the Command window and press Enter to execute it. When a command is executed, with or without error, its output (e.g. the table of summary statistics or the estimation output) appears in the Output window, which Stata itself labels the Results window. This window contains all the commands that you have entered during the Stata session and their textual results. Note that the output of a particular test or command shown there does not differ whether you use the command language or the point-and-click menu to execute it.

Figure 1: The Stata Main Windows

[1] This sub-section is based on the description provided in the Stata manual [U] User’s Guide.

[2] This section is based on the Stata manual [GS] Getting Started. The intention of this sub-section is to provide a …

Besides being able to see the output of your commands in the Output window, the command line will also appear in the Review window at the left-hand side of the main window. The Review window shows the history of all commands that have been entered during one session. Note that it displays successfully executed commands in black and unsuccessful commands, along with their error codes, in red. You may click on any command in the Review window and it will reappear in the Command window, where you can edit and/or resubmit it.

The different windows are interlinked. For instance, by double-clicking on a variable in the Variables window you can send it to the Command window, and you can adjust or re-run commands you have previously executed using the Review window.

There are two ways by which you can tell Stata what you would like it to do: you can type the command directly into the Command window or you can use the point-and-click menu. Access to the point-and-click menu can be found at the top left of the Stata main window. You will find that the menu is divided into several sub-categories based on the features they comprise: File, Edit, Data, Graphics, Statistics, User, Window, and Help.

The File menu icon comprises features to open, import, export, print or save your data. Under the Data icon you can find commands to explore and manage your data as well as functions to create or change variables or single observations, to sort data or to merge datasets. The Graphics icon is relatively self-explanatory as it covers all features related to creating and formatting graphics in Stata. Under the Statistics icon, you can find all the commands and functions to create customised summary statistics and to run estimations. You can also find postestimation options and commands to run diagnostic (misspecification) tests. Another useful icon is the Help icon, under which you can get access to the Stata pdf manual as well as other help and search features.

When accessing certain features through the Stata menu, a new dialogue window (usually) appears in which you are asked to specify the task you would like Stata to perform.

Below the menu bar is the toolbar, which contains buttons for frequently used features. To find out what a toolbar button does, hold the mouse pointer over the button for a moment, and a tool-tip will appear with a description of that button. In the following, we will focus on those toolbar buttons of particular interest to us.

The Log button begins a new log or closes, suspends, or resumes the current log. Logs are used to document the results of your session (more on this in sub-section 2.8, Keeping track of your work).

The Do-file Editor opens a new do-file or re-opens a previously stored do-file. Do-files are mainly used for programming in Stata but can also be handy when you want to store a set of commands so that you can replicate certain analyses at a later point in time. We will discuss the use of do-files and programming in Stata in more detail in later sections.
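A short do-file that also keeps a log might look like the sketch below. The file name session1.log is a hypothetical choice, and the example uses auto.dta, a small dataset shipped with Stata, rather than any data from this guide.

```stata
* Save these lines in a do-file and run them from the Do-file Editor
* Close any log that is still open; capture suppresses the error if none is
capture log close
* Open a plain-text log; replace overwrites an existing file of that name
log using "session1.log", replace text
* Load an example dataset that ships with Stata
sysuse auto, clear
summarize price mpg
* Close the log so that the file is written completely
log close
```

Running the do-file again reproduces the same output and overwrites the log, which is the main reason to prefer a do-file over retyping commands.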

There are two icons to open the Data Editor. The Data Editor gives a spreadsheet-like view of the data. The icon resembling a spreadsheet with a magnifying glass opens the Data Editor in Browse mode, while the spreadsheet-like icon with the pen opens the editor in Edit mode. The former only allows you to inspect the data. In Edit mode, however, you can make changes to the data, e.g. overwriting certain data points or dropping observations.

Finally, there are two icons at the very right of the toolbar: a green downward-facing arrow and a red sign with a cross. The former is the Clear–more–Condition icon, which tells Stata to continue when it has paused in the middle of a long output. The latter is the Break icon; pressing it while Stata executes a command stops the current task.

1.3 Getting help

There are several different ways to get help when using Stata.[3] Firstly, Stata has a very detailed set of manuals which provide several tutorials to explore some of the software’s functionalities. Especially when getting started with Stata, it might be useful to follow some of the examples provided in the Stata pdf manuals. These manuals come with the package but can also be directly accessed via the Stata software by clicking on the ‘Help’ icon in the icon menu. There are separate manuals for different subtopics, e.g. for graphics, data management, panel data etc. There is also a manual called [GS] Getting Started, which covers some of the features briefly described in this introduction in more detail.

Sometimes you might need help regarding a particular Stata function or command. For every command, Stata’s in-built support can be called by typing ‘help’ followed by the command in question in the Command window or via the Help menu icon. The information you will receive is an abbreviated version of the Stata pdf manual entry.

For a more comprehensive search, or if you do not know what command to use, type ‘search’ or ‘findit’ followed by specific keywords. Stata then also provides links to external (web) sources or user-written commands regarding your particular enquiry.
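For illustration, the three help commands mentioned above could be used as follows. The command name and keywords are arbitrary examples chosen here, not ones prescribed by the guide.

```stata
* Open the help file for a command whose name you already know
help summarize
* Keyword search when the command name is unknown
search heteroskedasticity
* Broader search that also covers web resources and user-written commands
findit quantile regression
```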

Besides the help features that come with the Stata package, there is a variety of external resources. For instance, the online Stata Forum is a great source if you have questions regarding specific functionalities or do not know how to implement particular statistical tests in Stata. Browsing of questions and answers by the Stata community is available without registration; if you would like to post questions and/or provide answers to a posted question, you need to register with the forum first (http://www.statalist.org/).

Another great resource is the Stata homepage. There you can find several video tutorials on a variety of topics (http://www.stata.com/links/video-tutorials/).

Several academics also publish their Stata codes on their institution’s website and several U.S. institutions provide Stata tutorials that can be accessed via their homepage (e.g. http://www.ats.ucla.edu/stat/stata/).

2 Data management in Stata

2.1 Variables and data types

It is useful to learn a bit about the different data types and variable types used in Stata, as several Stata commands distinguish between the type of data that you are dealing with and several commands require the data to be stored as a certain data type.

Numeric or String Data

We can broadly distinguish between two data types: numeric and string. Numeric is for storing numbers; string resembles a text variable. Variables stored as numeric can be used for computation and in estimations. Stata has different numeric formats or storage types which vary according to the number of digits they can capture. The more digits a storage type can capture the greater its precision, but also the greater the storage space needed. Stata’s default numeric option is float, which stores data as a real number with about 7 digits of accuracy. This is sufficiently accurate for most work. Stata also has the following numeric storage types available: byte (integer e.g. for dummy variables), int (integer e.g. for year variables), long (integer e.g. for population data), and double (real number with 16 digits of accuracy).

Any variable can be designated as a string variable, even numbers. However, in the latter case Stata would no longer recognise the data as numbers but would treat them as any other text input. No computation or estimation can be performed on string variables. In Stata 14, fixed-length string variables (str) can contain up to 2,045 characters; longer text is stored in the strL type. String values have to be put in quotation marks when being referred to in Stata commands. To save space, store variables in the storage type with the minimum storage requirements.[4]
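The following sketch shows how storage types can be inspected and chosen explicitly. It uses auto.dta, an example dataset shipped with Stata; the new variable names price_sq and expensive are hypothetical.

```stata
sysuse auto, clear
* Show the storage type of each variable (make is a string, price an int)
describe make price mpg
* Request a specific storage type when creating a variable
generate double price_sq = price^2
generate byte expensive = (price > 10000)
* String values must be enclosed in double quotes
list make price if make == "Volvo 260"
* Let Stata demote variables to the smallest type that loses no information
compress
```

Naming the storage type in front of the new variable is optional; without it, generate uses the default type float.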

Continuous, categorical and indicator variables

Stata has very convenient functions that facilitate working and estimating with categorical and indicator variables, as well as other convenient data manipulations such as lags and leads of variables.[5]
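As a hedged illustration of these conveniences, the sketch below uses Stata's factor-variable prefix i. and the time-series operators L., F. and D. The first part relies on auto.dta, which ships with Stata; the small time series in the second part is made up.

```stata
sysuse auto, clear
* i. treats rep78 as categorical and creates indicator variables on the fly
regress price mpg i.rep78
* Time-series operators require the data to be declared as time series
clear
set obs 10
generate t = _n
generate y = t^2
tsset t
* L. is the first lag, F. the first lead and D. the first difference
generate y_lag = L.y
generate y_lead = F.y
generate y_diff = D.y
list t y y_lag y_lead y_diff
```

The first and last observations contain missing values for the lag and lead respectively, since no neighbouring observation exists there.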

Missing values

Stata marks missing values in numeric series by a dot. Missing numeric observations are denoted by a single dot (.), while missing string observations are denoted by empty double quotes (""). Stata can define multiple different missing values, such as .a, .b, .c etc. This might be useful if you would like to distinguish between the reasons why a data point is missing, such as the data point was missing in the original dataset or the data point has been manually removed from the data set. The largest 27 numbers of each numeric format are reserved for missing values, so Stata treats a missing value as larger than any valid number. This is important to keep in mind when applying constraints in Stata. For example, typing the command ‘list age if age>=27’ includes observations for which the person’s age is missing, while the command ‘list age if age>=27 & age!=.’ excludes observations with missing age data.
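The effect can be checked on any variable that contains missing values. The sketch below uses rep78 from auto.dta (shipped with Stata) instead of the age variable from the text, which is not part of any dataset provided here.

```stata
sysuse auto, clear
* rep78 has missing values in auto.dta; they are included in this count
count if rep78 >= 4
* Either condition below excludes all missing values
count if rep78 >= 4 & rep78 < .
count if rep78 >= 4 & !missing(rep78)
* Extended missing values can record why an observation is missing
replace rep78 = .a if rep78 == .
tabulate rep78, missing
```

Note that a condition such as rep78 != . only excludes the plain missing value; once extended missing values like .a are in use, rep78 < . or !missing(rep78) is the safer choice.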

2.2 Formats and variable labels

Each variable may have its own display format. This does not alter the content or precision of the variable but only affects how it is displayed. You can change the format of a variable by clicking on Data / Variables Manager, by directly changing the format in the Variables Manager window at the bottom right of the Stata main screen, or by using the command ‘format’ followed by the new format specification and the name of the variable. There are different format types that correspond to ordinary numeric values as well as specific formats for dates and time.

[4] A very helpful way to check whether data can be stored in a data type that requires less storage space is the ‘compress’ command.

We can attach labels to variables, to an entire dataset or to specific values of a variable (e.g. in the case of categorical or indicator variables). Labelling a variable might be helpful when you want to document the content of a certain variable or how it has been constructed. You can attach a label to a variable by clicking in the label dialogue box in the Variables Manager window or by clicking on Data / Variables Manager. A value label can also be attached to a variable using the Variables Manager window.[6]
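The same tasks can be carried out from the Command window. The sketch below uses auto.dta, which ships with Stata; the label texts and the value-label name origin_lbl are hypothetical choices made for this example.

```stata
sysuse auto, clear
* Change only the display format: two decimals with comma separators
format price %9.2fc
* Attach a variable label
label variable price "Price in US dollars"
* Define a value label and attach it to the indicator variable foreign
label define origin_lbl 0 "Domestic" 1 "Foreign"
label values foreign origin_lbl
* Label the dataset as a whole
label data "Illustrative label for the auto dataset"
describe price foreign
```

None of these commands changes the stored values; they only affect how the data are displayed and documented.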

2.3 Data input and saving

One of the first steps of every statistical analysis is importing the dataset to be analysed into the software package. Depending on the format of your data there are different ways of accomplishing this task. If your dataset is already in Stata format, which is indicated by the file suffix ‘.dta’, you can click on File / Open… and simply select the dataset you would like to work with. Alternatively, you can use the ‘use’ command followed by the name and location of the dataset, e.g. ‘use "G:\Stata training\sp500.dta", clear’. The option ‘clear’ tells Stata to discard any data currently held in memory before loading the new dataset.

If your dataset is in Excel format, click on File / Import / Excel spreadsheet (*.xls;*.xlsx) and select the Excel file you would like to import into Stata. A new window appears which provides a preview of the data and in which you can specify certain options as to how you would like the dataset to be imported. If you prefer to use a command to import the data, type the command ‘import excel’ into the Command window, followed by the file name and location as well as any importing options.

Stata can also import text data, data in SAS format and other formats (see File / Import), or you can directly paste observations into the Data Editor. You save changes to your data using the command ‘save’ or by selecting the File / Save as… option in the menu.

When you read data into Stata, it is stored in RAM (memory). All changes you make are temporary and will be lost if you close the file without saving it. It is also important to keep in mind that Stata has no Undo option, so some changes cannot be undone (e.g. dropping variables). To be able to reset the data to a previous stage, you can generate a snapshot of the data in the Data Editor or with the command ‘snapshot save’.
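A minimal sketch of the snapshot workflow is given below, using auto.dta (shipped with Stata) and a hypothetical label text. The lines should be run together as one block, for example from the Do-file Editor, because the local macro snap is lost between separately executed selections.

```stata
sysuse auto, clear
* Store a snapshot of the data currently in memory
snapshot save, label("before dropping mpg")
* snapshot save returns the snapshot number in r(snapshot)
local snap = r(snapshot)
drop mpg
* Restore the earlier state, then delete the snapshot
snapshot restore `snap'
snapshot erase `snap'
describe mpg
```

After the restore, mpg is back in the dataset, which is why the final describe succeeds.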

Now let us import a dataset to see how the individual steps are performed. The dataset that we want to import into Stata is the Excel file UKHP.xls. First, we select File / Import / Excel spreadsheet (*.xls;*.xlsx) and click on the button Browse… . Now we choose the ‘UKHP.xls’ file, click Open and we should find a preview of the data at the bottom of the dialogue window, as shown in Figure 2, left panel. We could further specify the worksheet of the Excel file we would like to import by clicking on the drop-down menu next to the box Worksheet; however, since the ‘UKHP.xls’ file only has one worksheet we leave this box as it is.

The box headed Cell range allows us to specify the cells we would like to import from a specific worksheet. By clicking on the button with the three dots we can define the cell range. In our case we want the upper-left cell to be A1 and the lower-right cell to be B270 (see Figure 2, right panel). Finally, there are two boxes that can be checked. Import first row as variable names tells Stata that the first row is not to be interpreted as data points but as the names of the variables. Since this is the case for our ‘UKHP.xls’ we check this box. The second option, Import all data as string, tells Stata to store all series as string variables, independent of whether the original series might contain numbers. We want Stata to import our data as numeric values and thus we leave this box unchecked. Once you have undertaken all these specifications, the dialogue window should resemble Figure 2, left panel. The last thing we need to do to import the data into Stata is to press OK. If the task has been successful we should find the command line import excel followed by the file location and further specifications in the Output window as well as the Review window. Additionally, there should now be two variables in the Variables window: Month and AverageHousePrice. Note that Stata automatically attached a variable label to each of the two series in the workfile; these labels are identical to the variable names.

Figure 2: Importing excel data into Stata

You can save the imported dataset as a Stata workfile by clicking on File / Save as… and specifying the name of the dataset, in our case ‘ukhp.dta’. Note that the file suffix ‘.dta’ indicates that the dataset is now stored as a Stata workfile.
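The menu steps described above correspond roughly to the commands sketched below. The sketch assumes that UKHP.xls sits in the current working directory; the sheet() option is omitted because the worksheet name is not given in the text, in which case Stata reads the first worksheet.

```stata
* Assumes UKHP.xls is in the current working directory (check with pwd)
* cellrange() and firstrow mirror the choices made in the dialogue window
import excel "UKHP.xls", cellrange(A1:B270) firstrow clear
* Save as a Stata dataset; replace overwrites an existing ukhp.dta
save "ukhp.dta", replace
* Reload the saved file at a later point
use "ukhp.dta", clear
```

The exact command line that Stata echoes in the Review window may contain the full file path and a sheet() option, so it can look slightly different from this sketch.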

2.4 Data description

Once you have imported your data, you want to get an idea of what the data are like and to check that all the values have been correctly imported and that all variables are stored in the correct format.

There are several Stata functions to examine and describe your data. It is often useful to visually inspect the data first. This can be done in the Data Editor mode. You can access the Data Editor by clicking on the respective icons in the Stata icon menu as described above. Alternatively, you can directly use the commands browse or edit.

Additionally, you can let Stata describe the data for you. If you click on Data / Describe data you will find a variety of options to get information about the content and structure of the dataset. In order to describe the data structure of the ‘ukhp.dta’ file you can use the describe command, which you can find in the Stata menu under Describe data in memory or in a file. It provides the number of observations and variables, the size of the dataset as well as variable specifics like storage type and display format. If we do this for our example session, we are presented with the following output in the Output window.

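    . describe

    Contains data
      obs:           269
     vars:             2
     size:         2,690
    ------------------------------------------------------------------
                    storage   display     value
    variable name   type      format      label      variable label
    ------------------------------------------------------------------
    Month           int       %tdMon-YY              Month
    AverageHouseP~e double    %10.0g                 Average House Price
    ------------------------------------------------------------------
    Sorted by:
         Note: dataset has changed since last saved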

We see that our dataset contains two variables, one that is stored as ‘int’ and one as ‘double’. We can also see the display format and that the two variables have a variable label but no value label attached.

The command summarize provides you with summary statistics of the variables. You can find it under Data / Describe data / Summary statistics. If we click on this option, a new dialogue window appears where we can specify which variables we would like to generate summary statistics for (Figure 3).

Figure 3: Generating Summary Statistics

If you do not specify any variables, Stata assumes you would like summary statistics for all variables in memory. Again, it provides a variety of options to customise the summary statistics. Besides Standard display, we can tell Stata to Display additional statistics, or (rather as a programmer command) to merely calculate the mean without showing any output.[7] Using the tab by/if/in allows us to restrict the data to a sub-sample. The tab Weights provides options to weight the data points in your sample. However, we want to create simple summary statistics for the two variables in our dataset so we keep all the default specifications and simply press OK. Then the following output should appear in our Output window.

    . summarize

    Variable          Obs       Mean    Std. Dev.        Min        Max
    -------------------------------------------------------------------
    Month             269   15401.05    2367.985       11323      19479
    AverageHouseP~e   269   109363.5    50086.37    49601.66   186043.6

Note that the summary statistics for ‘Month’ are not intuitive to read because Stata reports them as the underlying numeric date codes rather than in the (human readable) date format.
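The dialogue options for summarize described above have command-line counterparts. The sketch below illustrates them on auto.dta, which ships with Stata, rather than on the house price data.

```stata
sysuse auto, clear
* Additional statistics such as percentiles, skewness and kurtosis
summarize price, detail
* Restrict the sample with if or in
summarize price if foreign == 1
summarize price in 1/20
* meanonly suppresses the output; the result is stored in r(mean)
summarize price, meanonly
display r(mean)
```

The meanonly option is mainly useful inside do-files, where the stored result is passed on to later commands instead of being read off the screen.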

Another useful command is ‘codebook’, which can be accessed via Data / Describe data / Describe data contents (codebook). It provides additional information on the variables, such as summary statistics on numeric variables, examples of data points for string variables, the number of missing observations and some information about the distribution of the series. The ‘codebook’ command is especially useful if the dataset is unknown and you would like to get a first overview of the characteristics of the data. In order to open the command dialogue window we follow the path mentioned above. Similar to the ‘summarize’ command, we can execute the task for all variables by leaving the Variables box blank. Again, we can select further options using the other tabs. If we simply press OK without making further specifications, we generate the output shown further below.

We can see from the output below that the variable Month is stored as a daily variable with days as units. However, the units of observation are actually months, so we will have to adjust the type of the variable to a monthly time variable (which we will do in the next section).
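One possible way to carry out such a conversion is sketched below; the approach taken in the next section of the guide may differ. The sketch assumes that ukhp.dta has been saved in the working directory as described in sub-section 2.3, and the new variable name month is a hypothetical choice.

```stata
* Assumes ukhp.dta exists in the working directory, with Month as a daily date
use "ukhp.dta", clear
* Show the calendar date behind the first date code reported by codebook
display %td 11323
* mofd() converts a daily date into a monthly date
generate month = mofd(Month)
* Display the new variable in a monthly date format
format month %tm
* Declare the data to be a monthly time series
tsset month
list Month month in 1/3
```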

There is a variety of other tools that can be used to describe the data and you can customise them to your specific needs. For instance, with the ‘by’ prefix in front of a summary command you can create summary statistics by subgroups. You can also explicitly specify the statistics that are to be reported in the table of summary statistics using the command tabstat. To generate frequency tables of specific variables, the command tabulate can be used. Another great feature of Stata is that many of these commands are also available as panel data versions.
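The commands mentioned in this paragraph could be used as in the following sketch. It relies on auto.dta, which ships with Stata, because the house price data contain no grouping variable.

```stata
sysuse auto, clear
* Summary statistics by subgroup; bysort sorts the data before applying by
bysort foreign: summarize price mpg
* Choose the reported statistics explicitly
tabstat price mpg, statistics(mean sd min max n) by(foreign)
* One-way and two-way frequency tables; missing adds a row for missing values
tabulate rep78
tabulate rep78 foreign, missing
```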

    . codebook

    ------------------------------------------------------------------
    Month                                                        Month
    ------------------------------------------------------------------
                  type:  numeric daily date (int)

                 range:  [11323,19479]              units:  1
       or equivalently:  [01jan1991,01may2013]      units:  days
         unique values:  269                    missing .:  0/269

                  mean:  15401.1 = 02mar2002 (+ 1 hour)
              std. dev:  2367.99

           percentiles:        10%        25%        50%        75%        90%
                             12113      13362      15400      17440      18687
                         01mar1993  01aug1996  01mar2002  01oct2007  01mar2011

    ------------------------------------------------------------------
    AverageHousePrice                              Average House Price
    ------------------------------------------------------------------
                  type:  numeric (double)

                 range:  [49601.664,186043.58]      units:  .0001
         unique values:  269                    missing .:  0/269

                  mean:  109364
              std. dev:  50086.4

           percentiles:        10%        25%        50%        75%        90%
                           51586.3    54541.1    96792.4     162228     168731
