read.dta               package:foreign               R Documentation

Read Stata Binary Files

Description:

     Reads a file in Stata version 5-10 binary format into a data
     frame.

Usage:

     read.dta(file, convert.dates = TRUE, convert.factors = TRUE,
              missing.type = FALSE,
              convert.underscore = FALSE, warn.missing.labels = TRUE)

Arguments:

    file: a filename or URL as a character string.

convert.dates: Convert Stata dates to 'Date' class?

convert.factors: Use Stata value labels to create factors? (version
          6.0 or later).

missing.type: For version 8 or later, store information about
          different types of missing data?

convert.underscore: Convert '"_"' in Stata variable names to '"."' in
          R names?

warn.missing.labels: Warn if a variable is specified with value
          labels and those value labels are not present in the file.

Details:

     If the filename appears to be a URL (of schemes 'http:', 'ftp:'
     or 'https:') the URL is first downloaded to a temporary file and
     then read.  ('https:' is only supported on some platforms.)

     The variables in the Stata data set become the columns of the
     data frame.  Missing values are correctly handled.  The data
     label, variable labels, and timestamp are stored as attributes of
     the data frame.  Nothing is done with variable characteristics.

     By default Stata dates (%d and %td formats) are converted to R's
     'Date' class and variables with Stata value labels are converted
     to factors.  Ordinarily, 'read.dta' will not convert a variable
     to a factor unless a label is present for every level.  Use
     'convert.factors = NA' to override this.  In any case the value
     label and format information is stored as attributes on the
     returned data frame.

     Stata 8.0 introduced a system of 27 different missing data
     values.  If 'missing.type' is 'TRUE', a separate list is created
     with the same variable names as the loaded data.  For string
     variables the list value is 'NULL'.  For other variables the
     value is 'NA' where the observation is not missing and 0-26 when
     the observation is missing.  This list is attached as the
     '"missing"' attribute of the returned value.

Value:

     A data frame with attributes.  These will include '"datalabel"',
     '"time.stamp"', '"formats"', '"types"', '"val.labels"',
     '"var.labels"' and '"version"' and may include '"label.table"'.
     Possible versions are '5, 6, 7', '-7' (Stata 7SE, 'format-111'),
     '8' (Stata 8 and 9, 'format-113') and '10' (Stata 10,
     'format-114').

     The value labels in attribute '"val.labels"' name a table for
     each variable, or are an empty string.  The tables are elements
     of the named list attribute '"label.table"': each is an integer
     vector with names.

Author(s):

     Thomas Lumley

References:

     Stata Users Manual (versions 5 & 6), Programming manual (version
     7), or online help (version 8, 9, 10) describe the format of the
     files.

See Also:

     'write.dta', 'attributes', 'Date', 'factor'

Examples:

     data(swiss)
     write.dta(swiss, swissfile <- tempfile())
     read.dta(swissfile)
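
     The attributes and the 'missing.type' behaviour described in
     Details and Value can be explored with a slightly longer round
     trip.  This is a minimal sketch rather than part of the original
     examples: the data frame 'df', its column names and the file name
     'dtafile' are hypothetical, and it assumes that 'write.dta'
     accepts 'version = 8L', so that the file is recent enough for
     'missing.type' to apply.

```r
library(foreign)

# Hypothetical data: one numeric column with a missing value, one factor.
df <- data.frame(score = c(1.5, NA, 3.0),
                 grp   = factor(c("a", "b", "a")))

# Assumption: version = 8L writes a Stata 8 file, which missing.type needs.
dtafile <- tempfile(fileext = ".dta")
write.dta(df, dtafile, version = 8L)

# Default read: value labels become factors, missing values become NA.
x <- read.dta(dtafile)
str(x)

# Keep the underlying integer codes and look up the labels by hand.
codes <- read.dta(dtafile, convert.factors = FALSE)
attr(codes, "val.labels")    # name of the label table for each variable
attr(codes, "label.table")   # the label tables themselves

# Record which kind of Stata missing value each observation had.
m <- read.dta(dtafile, missing.type = TRUE)
attr(m, "version")
attr(m, "missing")
```

     Reading the same file three times keeps each option isolated, so
     the effect of 'convert.factors' and 'missing.type' on the
     returned data frame and its attributes can be compared directly.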