
Variables are, broadly, either numeric or string (text). The terms are largely self-explanatory: numeric variables contain numbers such as integers and decimals, while string variables contain text such as names and addresses. Digits are valid text too, so a string variable can consist entirely of numbers. In our dataset, the date variable is an example of this: it contains only digits but is stored as a string. (We’ll see below how to tell Stata to store it as a numeric variable, though this is not really needed here.)

Numeric variables can be stored as byte, int, long, float, or double. We don’t need to know the differences between these now, except that float and double types can hold decimal values. We don’t have to explicitly instruct Stata which of these to use; it figures it out automatically.
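To see the storage types side by side, here is a minimal sketch that builds a tiny throwaway dataset. The variable names (n_byte, n_long, x_float, x_double) are made up for illustration and are not part of our dataset. Here the type is requested explicitly only so that each one shows up in the output.

```stata
// Throwaway dataset with three observations (this clears any data in memory)
clear
set obs 3

// Whole-number types
generate byte n_byte = _n
generate long n_long = _n * 100000

// Types that can hold decimal values
generate float  x_float  = _n / 3
generate double x_double = _n / 3

// The "storage type" column shows byte, long, float and double
describe
```

If you leave the type out, generate uses float by default. The compress command can then shrink each variable to the smallest type that holds its values without losing information.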

Thankfully, string variables have their lives sorted: their sub-types just signify the maximum string length the variable can hold. For example, if the longest name in a variable is “toolazytosetanamewow”, then this variable will be stored as “str20”, where 20 is the maximum string length.
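The str20 example can be checked with a short sketch. The variable name (name) and the second value are hypothetical, chosen only so that the longest entry has 20 characters.

```stata
// Throwaway dataset with two names of different lengths
clear
set obs 2
generate name = "toolazytosetanamewow"
replace name = "short" in 2

// The storage type is set by the longest value (20 characters here)
describe name
```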

For more details on these, you can read this and this. We haven’t discussed dates in Stata but the first page discusses them briefly (happy reading!).

If a string variable contains only numbers (“numeric characters”), you can convert it into a numeric variable using the destring command. As mentioned above, the date variable is an example. We just run the following to change its type to numeric:

destring date, replace
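If you would rather keep the original string variable, destring also accepts a generate() option that writes the result to a new variable instead. A minimal sketch on made-up data follows; the values and the name date_num are assumptions for illustration, not taken from our dataset.

```stata
// Toy data: digits stored as a string (values are made up)
clear
input str8 date
"20200315"
"20200316"
end

// Create a numeric copy and leave the string original untouched
destring date, generate(date_num)

// Compare the storage types of the two variables
describe date date_num
```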

On the other hand, we can convert a variable’s type from numeric to string using the tostring command. This can come in handy when combining different numeric IDs to create a single ID, as in the last example here. A simple example:

// assuming you converted date to numeric above
tostring date, replace
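The ID use case mentioned above might look roughly like the following sketch. The variables state_id and district_id, their values, and the "-" separator are all hypothetical.

```stata
// Toy data: two numeric ID components (names and values are made up)
clear
input byte state_id int district_id
1 101
2 205
end

// Make string copies, then join them with string addition
tostring state_id district_id, generate(state_str district_str)
generate combined_id = state_str + "-" + district_str

list
```

Using generate() rather than replace keeps the numeric originals available for later calculations.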

How do you know whether a variable is string or numeric? A few ways:

  • The codebook command reports the variable type in its output.
  • If you select a variable in the “Variables” window, then the “Properties” window tells you its type as well.
  • When browsing the data, string variable observations are displayed in red while numeric variable observations are displayed in black. You might notice some variables with observations in blue; these are numeric variables that have a “value label”, which we cover next.
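The same check can also be done from the command line. The sketch below uses the auto example dataset that ships with Stata rather than our dataset, purely so that it runs on its own; make is a string variable there, while price is numeric.

```stata
// Load an example dataset bundled with Stata (replaces data in memory)
sysuse auto, clear

// The "storage type" column shows str# for strings and byte/int/long/float/double for numerics
describe make price foreign

// codebook reports the type along with other summary information
codebook make

// List all variables of a given kind
ds, has(type string)
ds, has(type numeric)

// Test a single variable; _rc is 0 when the check passes
capture confirm string variable make
display _rc
```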