Stata level 2

    Tips and tricks for working with Stata
-------------------------------------------------------------------------------
        Stata basics

            Passing commands to Stata

            Work flow

            Navigation

            Customize your Stata

            information available in a Stata dataset

            Additional information available in a stata session

            returned results

        macros

            Macro

            what is in a macro?

            Macros containing numbers

            Scalars

            compound quotes

            = or no =

            local versus global

            tempname tempfile tempvar

            Passing local macros to another .do file

            Leaving local macros behind

                c_local

            extended macro functions

        looping

            An example loop

            different types of loops

            Try it yourself

        prefixes

            by

            statsby

        finding and manipulating groups of variables

            finding variables

        numerical precision

            binary versus decimal

            how a number is stored

            rounding errors: adding and subtracting

            storage types

            possible problems

        programming

            defining a program in a .do file

            Why would you want to do that?
 
    Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------


            Goal

            Starting

            Add some lines to our .html file to make it prettier

            add a replace option

            Let the user specify a meaningful title

            What should the title be when the title() option is not
            specified?

            Add a list of variable names to our codefile

            Adding variable labels to our variable list

            Add data label to codebook

            Add data notes to codebook

            Splitting the program up in smaller subroutines

            Add variable notes to the variable list

            Add a list of value labels to our codebook

            Add frequencies to our labels

            Numeric variables without value labels

            display format for summary statistics

            Add other summary statistics

            String variables

            Add links between the variable list and the value label
            list

            Turn this into an .ado file

            What is still left?

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- Stata basics
-------------------------------------------------------------------------------

Passing commands to Stata

There are three ways of telling Stata what you want it to do:

Click on the menu. I do this very rarely, mainly to look for files or to import files from Excel.

Type in the command window. I do this a lot, but only for experimentation.

Write your commands in a .do file. This is the main way of doing things. It allows me to keep track of what I am doing, and it leaves a paper trail in case someone wants to replicate what I have done.

A result can only appear in my article or presentation if it is the result of running a .do file.

I do experiment in the command window, but any command I am happy with has to be copied into the .do file.

I mainly work in Stata's do-file editor, which you can start by typing doedit

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- Stata basics
-------------------------------------------------------------------------------

Work flow

Real research is too long for a single .do file. Instead you need to break it up into several smaller .do files.

Create one master .do file that executes these in turn; a .do file can execute another .do file (say cceast_dta01.do) by including the command do cceast_dta01.do

Have a naming system for your .do files. I typically have a small abbreviation of a project (e.g. cceast), and add to that either _dta or _ana for data preparation or analysis files. After that I add a number.

Numbering the files prevents names like "final", "really_final", "no_seriously_I_am_done", etc.

I have at least two directories: "working" and "posted". I can change anything in the working directory, but once something is in posted I cannot change it anymore.

I will put files in posted when I present my results at a conference or submit a paper to a journal. This ensures I can always replicate those results.

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- Stata basics
-------------------------------------------------------------------------------

Navigation

Stata has a working directory and system directories.

The working directory is where Stata will look for (data and .do) files when you don't specify the directory.

If you type pwd (print working directory) in Stata, it will tell you where Stata is.

You can change the working directory using cd

You can also use relative paths. Say your working directory is working, and your data is stored in posted/data next to it; then you can type use ../posted/data/datafile.dta.

This is useful when you work on different computers. If you set the directory only once, in the master .do file, and all other .do files use only relative paths, then you have to change only that one cd command to make everything work on your other computers.

The system directories are where Stata looks for its programs, help-files, etc.

You can find out where that is by typing sysdir
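
The commands on this slide can be tried out as follows. This is only a minimal sketch: it moves one directory up and back again, and the commented-out line reuses the hypothetical file name from the example above. Because it uses a local macro, run it in one go.

```stata
* show the current working directory
pwd

* remember where we are; c(pwd) holds the current working directory
local here "`c(pwd)'"

* move one directory up, then back again
cd ..
cd "`here'"

* with relative paths a data file in a sibling directory could be opened with
* use ../posted/data/datafile.dta    // hypothetical file, hence commented out

* show the system directories
sysdir
```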

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- Stata basics
-------------------------------------------------------------------------------

Customize your Stata

You can set the scheme of your output window using Edit --> Preferences --> General Preferences

You can set the font by right-clicking on a window and choosing Font... I like Lucida Console

You can let Stata execute a couple of commands every time it starts up by creating a .do file called profile.do and storing it in your PERSONAL folder (see: sysdir).

My profile.do reads:

noi di as txt _n"Current projects:"

noi di as txt "F4" as result " SS18 Stata_L2"

global F4 cd "D:\Mijn documenten\onderwijs\konstanz\ss18\stata_l2\";

exit

As a result, pressing F4 makes Stata cd to the directory of this course. The ; at the end of the global makes Stata execute the command as soon as the key is pressed.

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- Stata basics
-------------------------------------------------------------------------------

information available in a Stata dataset

The main place where we store and access information is the dataset, and specifically the values a variable takes.

. sysuse auto, clear
(1978 Automobile Data)

. browse

. list foreign rep78 in 1/10, nolabel

     +-----------------+
     | foreign   rep78 |
     |-----------------|
  1. |       0       3 |
  2. |       0       3 |
  3. |       0       . |
  4. |       0       3 |
  5. |       0       4 |
     |-----------------|
  6. |       0       3 |
  7. |       0       . |
  8. |       0       3 |
  9. |       0       3 |
 10. |       0       3 |
     +-----------------+

. display rep78
3

. display rep78[5]
4

    The dataset can contain additional information in the form of variable,
    value, and data labels

. desc

Contains data from C:\Program Files (x86)\Stata15\ado\base/a/auto.dta
  obs:            74                          1978 Automobile Data
 vars:            12                          13 Apr 2016 17:45
 size:         3,182                          (_dta has notes)
-------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
make            str18   %-18s                 Make and Model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair Record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn Circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear Ratio
foreign         byte    %8.0g      origin     Car type
-------------------------------------------------------------------------------
Sorted by: foreign

. fre rep78

rep78 -- Repair Record 1978
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   1     |          2       2.70       2.90       2.90
        2     |          8      10.81      11.59      14.49
        3     |         30      40.54      43.48      57.97
        4     |         18      24.32      26.09      84.06
        5     |         11      14.86      15.94     100.00
        Total |         69      93.24     100.00
Missing .     |          5       6.76
Total         |         74     100.00
-----------------------------------------------------------

    desc also returns the storage type of the variable. This tells us whether
    a variable is a string or a numeric variable and, if it is numeric, the
    range of possible values.

In addition, desc provides information about the format of a variable, that is, how the values are supposed to be displayed. This can be particularly useful for finding time variables.

We can add comments to variables and datasets, in addition to the labels, using notes. A more general version of notes are characteristics. These are used to store information in the dataset when you xtset or stset the data.
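
To make notes and characteristics a bit more concrete, here is a small sketch using auto.dta. The note texts and the characteristic name unit are made up for this illustration.

```stata
sysuse auto, clear

* attach a note to a variable and to the dataset as a whole
notes mpg : fuel consumption in miles per gallon
notes : example note attached to the dataset
notes mpg

* characteristics are named pieces of text attached to a variable or to _dta
char mpg[unit] "miles per gallon"     // "unit" is a made-up characteristic name

* a characteristic can be referred to like a local macro
display "`mpg[unit]'"

* list all characteristics of mpg
char list mpg[]
```

After you xtset or stset a dataset, char list _dta[] shows what those commands stored.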

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- Stata basics
-------------------------------------------------------------------------------

Additional information available in a stata session

You can store information in macros

Stata also has the possibility to store information in scalars or matrices.

Many commands return results, which can be accessed until the next command is run.

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- Stata basics
-------------------------------------------------------------------------------

returned results

Many commands leave their results behind in memory. These are returned results.

. sum mpg

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         mpg |         74     21.2973    5.785503         12         41

. return list

scalars:
                  r(N) =  74
              r(sum_w) =  74
               r(mean) =  21.2972972972973
                r(Var) =  33.47204738985561
                 r(sd) =  5.785503209735141
                r(min) =  12
                r(max) =  41
                r(sum) =  1576

. di r(mean)
21.297297

. sum mpg, meanonly

. return list

scalars:
                  r(N) =  74
              r(sum_w) =  74
                r(sum) =  1576
               r(mean) =  21.2972972972973
                r(min) =  12
                r(max) =  41

    Because each command can return results, you should not expect these to
    persist for long. If you need them, store the desired results immediately
    after the command, e.g. in a local macro.

There are various types of returned results:

returned results (r(something), and return list), these come from general, non-estimation commands.

ereturned results (e(something) and ereturn list), these come from estimation commands, like regress

sreturned results (s(something) and sreturn list), these come from subprograms.

creturned results (c(something)) contain the values of system parameters and settings, along with certain constants such as the value of pi. A full list can be found at help creturn.
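
The advice to store results immediately can be illustrated with a short sketch. The macro name m_mpg is just a name chosen for this example; since it is a local macro, run the lines in one go.

```stata
sysuse auto, clear

* r() results: copy what you need right away
summarize mpg, meanonly
local m_mpg = r(mean)

* the next r-class command overwrites r(mean) ...
summarize price, meanonly

* ... but the copy in the local macro is still there
display "mean of mpg: " %6.3f `m_mpg'

* e() results come from estimation commands
regress price mpg
display "observations used: " e(N)

* c() results are always available
display c(pi)
```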

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------

Macro

A macro is a shorthand, it is one thing standing for another

. sysuse auto, clear
(1978 Automobile Data)

. local xvars = "mpg foreign"

. reg price `xvars'

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     14.07
       Model |   180261702         2  90130850.8   Prob > F        =    0.0000
    Residual |   454803695        71  6405685.84   R-squared       =    0.2838
-------------+----------------------------------   Adj R-squared   =    0.2637
       Total |   635065396        73  8699525.97   Root MSE        =    2530.9

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -294.1955   55.69172    -5.28   0.000    -405.2417   -183.1494
     foreign |   1767.292    700.158     2.52   0.014     371.2169    3163.368
       _cons |   11905.42   1158.634    10.28   0.000     9595.164    14215.67
------------------------------------------------------------------------------

. reg price mpg foreign

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     14.07
       Model |   180261702         2  90130850.8   Prob > F        =    0.0000
    Residual |   454803695        71  6405685.84   R-squared       =    0.2838
-------------+----------------------------------   Adj R-squared   =    0.2637
       Total |   635065396        73  8699525.97   Root MSE        =    2530.9

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -294.1955   55.69172    -5.28   0.000    -405.2417   -183.1494
     foreign |   1767.292    700.158     2.52   0.014     371.2169    3163.368
       _cons |   11905.42   1158.634    10.28   0.000     9595.164    14215.67
------------------------------------------------------------------------------

    In the second line of this example we defined a local macro called xvars,
    which stands for / contains the string "mpg foreign"

The syntax is

local macroname content

If we later want to refer to the contents of the local macro xvars we type `xvars', that is, the macro name between a left single quote (`) and a right single quote (').

When Stata sees a line it will first look for macros and replace them with their contents.

So when Stata saw line 3, the first thing it did was look up what was in the local macro called xvars and replace `xvars' with its contents; only then did it try to execute the command.

So the third and fourth lines are equivalent as far as Stata is concerned.

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------

what is in a macro?

It can contain anything, but is always a string

. local mac "foo"

. di `mac'
foo not found
r(111);

    I said that the content of a macro is always a string, then why did this
    return an error?

The double quotes are there to indicate that this is a string, but they are not part of the string.

So `mac' contains foo not "foo"

So for the second line Stata saw di foo, and since there were no double quotes Stata assumed we wanted to look at a variable (or scalar) foo, could not find that variable and returned the error message.

This will work:

. local mac "foo"

. di "`mac'"
foo

 
 
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------

Macros containing numbers

Because the quotes are stripped, macros may also contain numbers.

They are stored as strings, but as soon as Stata replaces the name of the macro with its content it will see them as numbers, as there are no quotes around them.

. local mac 1

. di `mac'
1

    However, numbers stored in macros are not quite as precise as numbers
    stored in scalars.

Scalars are stored in double precision (15-16 decimal digits), while locals have about 12 decimal digits, sometimes more, but never less than 11.

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------

Scalars

A scalar is a "container" containing one element, either a string or a number.

It is good practice to use tempnames for scalars as they share the same namespace as variables

Unlike a macro the name is not immediately replaced by its contents

. sum mpg, meanonly

. tempname m_mpg

. scalar `m_mpg' = r(mean)

. scatter mpg price, yline(`m_mpg')
invalid line argument, __000006
r(198);

    In order to replace the scalar name with its contents you can type
    `=scalarname'

. sum mpg, meanonly

. tempname m_mpg

. scalar `m_mpg' = r(mean)

. scatter mpg price, yline(`=`m_mpg'')
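
Why tempnames are good practice for scalars can be seen in a small sketch. Here the scalar is deliberately given the name price, which clashes with a variable in auto.dta; when a name could refer to either, Stata picks the variable.

```stata
sysuse auto, clear

* a scalar with the same name as an existing variable
scalar price = 1

* Stata interprets price as the variable, so this shows price in observation 1
display price

* scalar() forces Stata to look at the scalar instead
display scalar(price)
```

A tempname avoids this kind of clash, because the generated name is guaranteed not to be in use.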

 
 
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------

compound quotes

We can make the double quotes part of the string by surrounding them with compound quotes

. local mac `" "foo" "'

. di `mac'
foo

. di `"`mac'"'
 "foo"

. di `"|`mac'|"'
| "foo" |

. di "`mac'"
foo" " invalid name
r(198);

    The content of the macro `mac' is in this example <space>"foo"<space>.

So in the first di, Stata sees: di <space>"foo"<space>

In the second di, Stata sees: di `"<space>"foo"<space>"'

These spaces are more visible when we surround them with pipes: now Stata sees di `"|<space>"foo"<space>|"'

The final one gets into trouble because Stata does not know that the outer double quotes should "wrap around" the quotes in the macro.

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------

= or no =

If the content of a macro is a string, and quotes are stripped anyhow, then the quotes when defining the macro seem redundant

. local mac foo

. di "`mac'"
foo

    However, that is only true if we did not include a "="

Including an "=" means that what comes after the equal sign is an expression, and the result of evaluating that expression becomes the content of the macro.

So when Stata sees local mac = foo, it starts looking for a variable or scalar foo, whose value it can then put into the local macro mac. It cannot find one, and returns an error message

. local mac = foo
foo not found
r(111);
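
The difference is easiest to see with something that is a valid expression. A minimal sketch (the macro names a and b are arbitrary):

```stata
* without =, the text is copied as is
local a 2 + 2
display "`a'"        // shows: 2 + 2

* with =, the expression is evaluated first
local b = 2 + 2
display "`b'"        // shows: 4

* without the double quotes both show 4, because display evaluates 2 + 2
display `a'
display `b'
```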

 
 
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------

local versus global

We have thus far used local macros.

In a .do file they exist as long as that .do file is running, but disappear immediately afterwards.

So if you run only the line of your .do file that defines the local macro, and then separately run another line that uses it, that local macro will no longer exist.

That sounds awkward, but it is actually extremely useful.

The alternative is a global macro: a macro that persists after you are done running a .do file.

In a data preparation or data analysis phase you can easily work hours on end, have (lunch) breaks in between, etc. What happens when you use global macros? You defined them early in the morning, and you may or may not have changed them somewhere along the way. You can easily imagine a situation where you think your global contains one thing, but it actually contains something else.

So global macros are dangerous, and generally considered bad practice.

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------

tempname tempfile tempvar

tempname, tempfile, and tempvar are special types of local macros.

tempname is a local macro containing a name for a matrix or scalar that is guaranteed not to exist, and that will be removed when the .do file or program ends.

tempfile is a local macro containing a filename that is guaranteed not to exist, and that will be removed when the .do file or program ends.

tempvar is a local macro containing a variable name that is guaranteed not to exist, and that will be removed when the .do file or program ends.

These are good ways of storing intermediate results that you need to store for a very short time.

For example, I often use tempfile in combination with merge.

I prepare a dataset for merging, and store it as a tempfile

I open the other dataset, and merge the tempfile in.
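
A minimal sketch of that tempfile and merge pattern, using auto.dta for both steps. The names means and mprice are made up for this example, and because tempfile creates a local macro the lines have to be run in one go.

```stata
sysuse auto, clear

* prepare a small dataset with the mean price per car type
preserve
collapse (mean) mprice = price, by(foreign)
tempfile means
save `means'
restore

* back in the original data: merge the temporary file in
merge m:1 foreign using `means'
assert _merge == 3
drop _merge
```

The temporary file is erased automatically, so there is nothing to clean up afterwards.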

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------

Passing local macros to another .do file

You can run a .do file called foo.do by typing do foo.do

You can also run that .do file by typing do foo.do something else

This will make the following local macros available at the beginning of foo.do:

`0' containing: something else

`1' containing: something

`2' containing: else

foo.do could do some complicated/fiddly manipulations of a variable

You want to apply those to multiple variables, say var1 and var2

Now you can create another .do file that contains the lines:

do foo.do var1

do foo.do var2

If you find an error in foo.do, you only have to fix it once, which removes a lot of opportunities for inconsistencies and bugs.

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------

Leaving local macros behind

Say you have a .do file bar.do that calls a .do file foo.do

foo.do makes local macros that you want bar.do to have access to

Instead of the line do foo.do you can add the line include foo.do in bar.do

This will run foo.do as if it was actually part of bar.do. That way any local macros defined in foo.do will be available in bar.do

This is a good way to store settings that will be used in multiple .do files

A (dangerous) alternative is c_local.

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------

extended macro functions

Extended macro functions are functions you can use when defining a macro

. local varlab : var label mpg

. di "`varlab'"
Mileage (mpg)

    You can also use that on the fly

. di "`: var label price'"
Price

    The most useful are functions for

extracting data attributes (labels, characteristics, variable types)

file names and file paths

formatting results

manipulating lists

matrices

parsing strings
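
A few of these categories can be sketched with auto.dta. This is only a small selection; the macro names are arbitrary, and help extended_fcn has the full list.

```stata
sysuse auto, clear

* data attributes
local type : type make                  // storage type: str18
local vlab : value label foreign        // name of the value label: origin
local txt  : label origin 1             // text attached to the value 1: Foreign

* formatting results
local pretty : display %4.2f 3.14159    // 3.14

* manipulating lists
local vars mpg price weight
local n      : word count `vars'        // 3
local second : word 2 of `vars'         // price
local a "a b c"
local b "b c d"
local common : list a & b               // elements in both lists: b c

display "`type' | `vlab' | `txt' | `pretty' | `n' | `second' | `common'"
```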

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- looping
-------------------------------------------------------------------------------

An example loop

. forvalues i = 1/10 {
  2.     di `i'
  3. }
1
2
3
4
5
6
7
8
9
10

    forvalues tells Stata that we want to loop

the i after forvalues is the name of a local macro that will exist when the loop runs

= 1/10 tells Stata the values the local i should take: 1 for the first time, 2 for the second time, ..., 10 for the tenth time, and then it stops.

{ and }: whatever is between these braces is going to be repeated

di `i' displays the content of the local macro i. So the first time it evaluates to di 1, the second time to di 2, etc.

Previously we talked about a .do file foo.do that can be used to manipulate var1 and var2. What if we have var1 through var50?

forvalues i = 1/50 {
    do foo.do var`i'
}

Before Stata executes a line it first replaces macros with their content.

So the first time around it sees do foo.do var1, the second time around it sees do foo.do var2, etc.

In this case it is necessary that there is no space between var and `i'

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- looping
-------------------------------------------------------------------------------

different types of loops

forvalues i = 2(2)100

We loop over the values 2, 4, 6, ..., 100

This is the type of loop I use most

foreach var of varlist *

We loop over all variables in the dataset

This is useful for specific lists like varlists, numlists, or locals

while `diff' > 1e-6

`diff' is probably a tempname for a scalar, and we continue the loop until this scalar is no longer larger than 1e-6 (0.000001)

This is useful when you want to iteratively optimise something. Often there are better-suited commands for that, e.g. ml, nl, gmm, so I very rarely use this.
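
The three loop types side by side, as a small sketch. For the while loop I use local macros holding numbers rather than a tempname scalar to keep it short, and the iteration cap of 100 is my own safeguard; the computation (Newton's method for the square root of 2) is just a toy example.

```stata
sysuse auto, clear

* forvalues: loop over a regular sequence of numbers
forvalues i = 2(2)10 {
    display `i'
}

* foreach: loop over the variables in a varlist
foreach var of varlist price mpg weight {
    summarize `var', meanonly
    display "`var'" _col(10) %9.2f r(mean)
}

* while: repeat until the change between iterations is small enough
local x = 1
local diff = 1
local iter = 0
while `diff' > 1e-6 & `iter' < 100 {
    local xnew = (`x' + 2/`x')/2        // Newton step for x^2 = 2
    local diff = abs(`xnew' - `x')      // change since the previous iteration
    local x = `xnew'
    local iter = `iter' + 1
}
display "sqrt(2) is approximately " `x' " after " `iter' " iterations"
```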

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- looping
-------------------------------------------------------------------------------

Try it yourself

The variables weight, length, and turn all measure the size of the car. Maybe we want to combine those variables into one. In order to do so we need to make sure they are measured in the same unit.

One possibility would be the percentile score: the proportion of cars that is smaller. Here is how I would do this for weight

. egen i = rank(weight)

. count if !missing(weight)
  74

. gen p_weight = (i - .5)/r(N)

    Use a loop to create percentile scores for weight, length, and turn.

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- prefixes
-------------------------------------------------------------------------------

by

Say we want a new variable containing the mean price for every level of repair status

We could do this with a loop

. levelsof rep78
1 2 3 4 5

. local levs = r(levels)

. gen mprice = .
(74 missing values generated)

. foreach lev of local levs {
  2.     sum price if rep78 == `lev', meanonly
  3.     replace mprice = r(mean) if rep78 == `lev'
  4. }
(2 real changes made)
(8 real changes made)
(30 real changes made)
(18 real changes made)
(11 real changes made)

    Alternatively we could use the by prefix

. bysort rep78 : egen mprice2 = mean(price)

    We could use this to find the highest price within each level of repair
    status

. bys rep78 (price) : gen maxprice = price[_N]

    What would happen if price contained missing values? Missing values are
    sorted last, so price[_N] would then pick up a missing value.

. gen misprice = missing(price)

. bys rep78 misprice (price): gen maxprice2 = price[_N] if misprice == 0

    Can you find the lowest price in each level of rep78?

Can you find the second highest price in each level of rep78?

-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- prefixes
-------------------------------------------------------------------------------

statsby

The statsby prefix allows you to execute a command for each level of a variable and store returned statistics in a dataset.

This is often useful for creating graphs of summary statistics

. sysuse nlsw88, clear
(NLSW, 1988 extract)

. statsby m=r(mean) min=r(min) max=r(max) p75=r(p75) p25=r(p25) , ///
>         by(industry) clear: sum wage, d
(running summarize on estimation sample)

      command:  summarize wage, d
            m:  r(mean)
          min:  r(min)
          max:  r(max)
          p75:  r(p75)
          p25:  r(p25)
           by:  industry

Statsby groups
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
............

. list

     +---------------------------------------------------------------------+
  1. |                industry |        m |      min |      max |      p75 |
     |   Ag/Forestry/Fisheries | 5.621121 | 1.811594 | 12.38325 | 7.589398 |
     |---------------------------------------------------------------------|
     |                                 p25                                 |
     |                              3.454104                               |
     +---------------------------------------------------------------------+

     +---------------------------------------------------------------------+
  2. |                industry |        m |      min |      max |      p75 |
     |                  Mining | 15.34959 | 5.016723 | 40.19808 | 24.93801 |
     |---------------------------------------------------------------------|
     |                                 p25                                 |
     |                              5.761177                               |
     +---------------------------------------------------------------------+

     +---------------------------------------------------------------------+
  3. |                industry |        m |      min |      max |      p75 |
     |            Construction | 7.564934 | 2.801002 | 30.19324 | 8.260865 |
     |---------------------------------------------------------------------|
     |                                 p25                                 |
     |                              4.830918                               |
     +---------------------------------------------------------------------+

     +---------------------------------------------------------------------+
  4. |                industry |        m |      min |      max |      p75 |
     |           Manufacturing | 7.501578 | 1.004952 | 40.19808 | 8.872785 |
     |---------------------------------------------------------------------|
     |                                 p25                                 |
     |                              4.508855                               |
     +---------------------------------------------------------------------+

     +---------------------------------------------------------------------+
  5. |                industry |        m |      min |      max |      p75 |
     |  Transport/Comm/Utility | 11.44335 | 3.526568 | 40.19808 | 12.11755 |
     |---------------------------------------------------------------------|
     |                                 p25                                 |
     |                               8.22866                               |
     +---------------------------------------------------------------------+

     +---------------------------------------------------------------------+
  6. |                industry |        m |      min |      max |      p75 |
     |  Wholesale/Retail Trade | 6.125896 | 2.012882 | 40.19808 |  6.76328 |
     |---------------------------------------------------------------------|
     |                                 p25                                 |
     |                              3.349436                               |
     +---------------------------------------------------------------------+

     +---------------------------------------------------------------------+
  7. |                industry |        m |      min |      max |      p75 |
     | Finance/Ins/Real Estate | 9.843174 | 1.501798 | 40.19808 | 10.40257 |
     |---------------------------------------------------------------------|
     |                                 p25                                 |
     |                              5.233495                               |
     +---------------------------------------------------------------------+

     +---------------------------------------------------------------------+
  8. |                industry |        m |      min |      max |      p75 |
     |     Business/Repair Svc |  7.51579 | 1.571983 | 40.19808 | 9.462362 |
     |---------------------------------------------------------------------|
     |                                 p25                                 |
     |                              3.718949                               |
     +---------------------------------------------------------------------+

     +---------------------------------------------------------------------+
  9. |                industry |        m |      min |      max |      p75 |
     |       Personal Services | 4.401093 | 1.151368 | 22.97034 | 5.442833 |
     |---------------------------------------------------------------------|
     |                                 p25                                 |
     |                              3.001791                               |
     +---------------------------------------------------------------------+

+---------------------------------------------------------------------+ 10. | industry | m | min | max | p75 | | Entertainment/Rec Svc | 6.724409 | 1.811594 | 13.17229 | 10.06441 | |---------------------------------------------------------------------| | p25 | | 3.220612 | +---------------------------------------------------------------------+

+---------------------------------------------------------------------+ 11. | industry | m | min | max | p75 | | Professional Services | 7.871186 | 1.032247 | 40.74659 | 9.798708 | |---------------------------------------------------------------------| | p25 | | 4.64573 | +---------------------------------------------------------------------+

+---------------------------------------------------------------------+ 12. | industry | m | min | max | p75 | | Public Administration | 9.148407 | 2.093397 | 40.19808 | 10.83736 | |---------------------------------------------------------------------| | p25 | | 6.352656 | +---------------------------------------------------------------------+
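A minimal sketch of the kind of graph such a dataset makes possible. The choice of a capped-spike plot and all graph options are illustrative assumptions, not part of the original example.

```stata
sysuse nlsw88, clear
statsby m=r(mean) p25=r(p25) p75=r(p75), by(industry) clear: summarize wage, detail

* one observation per industry: spike from p25 to p75, marker at the mean
twoway (rcap p25 p75 industry)                      ///
       (scatter m industry),                        ///
       xlabel(1(1)12, valuelabel angle(45))         ///
       xtitle("") ytitle("hourly wage")             ///
       legend(order(1 "25th to 75th percentile" 2 "mean"))
```

Because statsby replaced the data in memory, the graph commands work on 12 observations instead of on the original respondents.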

 
 
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- finding and manipulating groups of
variables
-------------------------------------------------------------------------------

finding variables

Most datasets contain a great many variables. Finding the variables you are looking for can be a challenge.

The lookfor command can be useful: it searches for variables with a specific string in their name or label.

. sysuse auto, clear
(1978 Automobile Data)

. lookfor repair

              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
rep78           int     %8.0g                 Repair Record 1978

. return list

macros:
            r(varlist) : "rep78"

Say we want a list of all numeric variables.

This is something ds can do:

. ds, has(type numeric)
price         headroom      length        gear_ratio
mpg           trunk         turn          foreign
rep78         weight        displacement
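Like lookfor, ds leaves the variables it found behind in r(varlist), so the list can be fed straight into a loop. A small sketch; the local macro name numvars and the choice to show means are just for illustration.

```stata
sysuse auto, clear
ds, has(type numeric)
local numvars `r(varlist)'          // copy the result before it is overwritten

* loop over the numeric variables that ds found
foreach v of local numvars {
    quietly summarize `v'
    display "`v'" _col(15) "mean = " %9.2f r(mean)
}
```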

 
 
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- numerical precision
-------------------------------------------------------------------------------

binary versus decimal

If a computer stored numbers in decimal format, we would not be surprised that it could not store the number 1/3 exactly: it would have to stop storing 3s at some point, otherwise it would need an infinite amount of memory to store one number.

A computer, however, stores numbers in binary format. In binary, some numbers we would not consider problematic are actually like 1/3. The most common example is 0.1.

So a lot of numbers we think of as perfectly "normal" are, inside a computer, actually rounded versions of that number.
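This is easy to see in Stata itself. The sketch below only uses display; the particular numbers are chosen for illustration.

```stata
* 0.1 printed with 18 decimals is not exactly 0.1
display %20.18f 0.1

* 0.1, 0.2 and 0.3 are all rounded in binary, so the sum does not match
display 0.1 + 0.2 == 0.3        // displays 0 (false)

* 0.5, 0.25 and 0.75 are sums of powers of 2 and are stored exactly
display 0.5 + 0.25 == 0.75      // displays 1 (true)
```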

-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- numerical precision
-------------------------------------------------------------------------------

how a number is stored

So how are numbers stored?

We could say that we store a number up to 6 digits after the decimal point (ignoring that numbers are actually stored in binary).

This is problematic:

We would store the number 1,000,000 with 13 significant digits

While we would store the number 0.0001 with only 3 significant digits

Instead a number is stored in three parts: the sign and two numbers, let's call them a and b

If we stored the number in decimal format, the number stored would then be sign * a * 10^b

So if we decided on 6 significant digits we would store the number 1,000,000 as +1*1.00000*10^6 and the number 0.0001 as +1*1.00000*10^-4

In real computers both a and b are binary numbers and we don't use 10^b, but 2^b
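Stata can show these parts with the %21x display format, which prints the sign, then a in hexadecimal, then an X followed by the power of 2 (b). A short sketch; the numbers are the ones used above plus two extras for comparison.

```stata
* sign, hexadecimal a, and power-of-2 exponent b of some stored numbers
display %21x 1
display %21x 1000000
display %21x 0.0001
display %21x -0.1
```

Numbers that are exact in binary, like 1, show trailing zeros in a; numbers like 0.0001 and 0.1 fill all available digits because they had to be rounded.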

-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- numerical precision
-------------------------------------------------------------------------------

rounding errors: adding and subtracting

This way of storing numbers allows us to reliably store numbers within a very large range.

It has some quirks. For example, adding numbers that differ by many orders of magnitude can lead to quite large rounding errors.

We want to add 1,000,000 and 0.0001

Then we are adding +1*1.00000*10^6 and +1*1.00000*10^-4

In order to add them we need to make the exponents the same:

We would change +1*1.00000*10^-4 to +1*0.0000000001*10^6

However, we only store 6 digits, so 0.0000000001 gets rounded to 0, and the sum is just 1,000,000
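The same thing happens with real storage types, only with more digits. A minimal sketch using the float() function, which rounds a number to float precision; the second example is an added illustration for doubles.

```stata
* a float carries about 7 significant decimal digits, so the 0.0001 is lost
display float(1000000 + 0.0001) == 1000000      // displays 1 (true)

* a double carries about 16, so the same happens at a larger scale
display 1e17 + 1 == 1e17                        // displays 1 (true)
```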

-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- numerical precision
-------------------------------------------------------------------------------

storage types

The problem with precision mainly occurs with fractional numbers. Integers can be stored exactly as long as they are not too big.

There are three storage types for integers that differ with respect to the range of numbers they can store and the amount of memory needed to store them

byte, which can store numbers between -127 and 100 and uses only one byte per number

int, which can store numbers between -32,767 and 32,740 and uses two bytes per number

long, which can store numbers between -2,147,483,647 and 2,147,483,620 and uses four bytes per number

There are two storage types for fractional numbers

float has a precision of about 7 decimal digits and takes four bytes per number

double has a precision of about 16 decimal digits and takes 8 bytes per number. A double can also store the largest range of numbers: between -8.988*10^307 and 8.988*10^307.

The default storage type for variables is float.

This is fine for storing data. We typically don't think we measured our variables accurately up to 7 digits.

Say we ask someone's income. A respondent does not have her or his income exactly in memory, so he or she will round when answering that question. We can probably trust the first two, maybe three, digits. So storing that with about 7 digits of accuracy is more than enough.

A float is probably not optimal for storing intermediate results of computations. We want to minimise the rounding errors that happen at each step, and if we store intermediate results as floats these errors can quickly add up. So for intermediate results in computations you are better off using doubles.
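A small sketch of how the storage type is chosen in practice. The variable names are made up for this illustration; generate, compress and describe are standard commands.

```stata
clear
set obs 1

* the same value, rounded at two different precisions
generate float  x_float  = 0.1
generate double x_double = 0.1
display %21.18f x_float[1]
display %21.18f x_double[1]

* compress picks the smallest storage type that loses no information
generate double n_kids = 2
compress n_kids
describe n_kids
```

After compress, n_kids is stored as a byte, because the value 2 fits in a byte without any loss.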

-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- numerical precision
-------------------------------------------------------------------------------

possible problems

logical statements with fractional numbers

Say we want to summarize a variable x for only the observations for which var == 0.1

We type sum x if var == 0.1 and Stata will tell us that no observations meet that criterion, while if we look at the data we see several such observations.

var is probably stored as a float, but Stata does computations in double precision

So Stata is comparing the float(0.1) from var to the double(0.1) in the expression and finds that they are not equal.

The solution is to avoid equality checks on fractional numbers, or to round the constant in the same way as the variable: sum x if var == float(0.1)
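The problem and two ways around it can be reproduced with a tiny made-up dataset. The tolerance of 1e-6 in the last line is an arbitrary choice for this sketch.

```stata
clear
set obs 3
generate float var = 0.1
generate x = _n

count if var == 0.1                 // 0: float(0.1) is not equal to double(0.1)
count if var == float(0.1)          // 3: both sides rounded to float
count if abs(var - 0.1) < 1e-6      // 3: compare within a tolerance instead
```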

Storing larger numbers that need to be stored exactly

This is very common for ID variables. Say the first two digits stand for the country, the next two digits for the province, the next three digits for the city, the next four digits for the household, the next two digits for the person, and the next two digits for the wave.

Now we have 15 digits, and a float cannot store that exactly. A double can.

If we have even larger numbers we can store them as strings.
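A sketch of what goes wrong, using invented ID numbers. Even a 9-digit ID is already too long for a float.

```stata
clear
set obs 1
generate float  id_float  = 123456789
generate double id_double = 123456789012345
generate str20  id_string = "12345678901234567890"

format id_float  %12.0f
format id_double %15.0f
list

* the float was silently rounded, the double was not
display id_float[1]  == 123456789
display id_double[1] == 123456789012345
```

The first display shows 0 and the second shows 1: the float no longer contains the ID we typed.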

-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- programming
-------------------------------------------------------------------------------

defining a program in a .do file

. program drop _all

. program define hello
  1.     di "hello world"
  2. end

. hello
hello world

A program starts with program and ends with end

Whatever is in between is anything that you could also include in a regular .do file.

So now you can write Stata programs.
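As a next small step, a program can take arguments and leave results behind. A minimal sketch; the program name greet and its argument are made up for this illustration.

```stata
capture program drop greet
program define greet, rclass
    args name                               // first word typed after greet
    display "hello `name'"
    return local greeting "hello `name'"    // left behind in r(greeting)
end

greet world
return list
```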

-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- programming
-------------------------------------------------------------------------------

Why would you want to do that?

The most common application is to bootstrap some statistic.

Occasionally I also write a program to automate some extremely fiddly task that I need to do repeatedly.

Consider the following example

. sysuse nlsw88,clear
(NLSW, 1988 extract)

. reg wage c.ttl_exp##c.ttl_exp grade

      Source |       SS           df       MS      Number of obs   =     2,244
-------------+----------------------------------   F(3, 2240)      =    129.89
       Model |  11018.1157         3  3672.70523   Prob > F        =    0.0000
    Residual |  63336.2148     2,240  28.2750959   R-squared       =    0.1482
-------------+----------------------------------   Adj R-squared   =    0.1470
       Total |  74354.3305     2,243  33.1495009   Root MSE        =    5.3174

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     ttl_exp |   .3148533    .106226     2.96   0.003     .1065415     .523165
             |
   c.ttl_exp#|
   c.ttl_exp |  -.0022075   .0042817    -0.52   0.606    -.0106039     .006189
             |
       grade |   .6455095   .0457626    14.11   0.000     .5557678    .7352511
       _cons |  -4.238754   .7752557    -5.47   0.000    -5.759049   -2.718459
------------------------------------------------------------------------------

. nlcom -_b[ttl_exp]/(2*_b[c.ttl_exp#c.ttl_exp])

       _nl_1:  -_b[ttl_exp]/(2*_b[c.ttl_exp#c.ttl_exp])

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |   71.31548   115.0695     0.62   0.535    -154.2166    296.8476
------------------------------------------------------------------------------

So the maximum wage is obtained after 71 years of experience, with a huge confidence interval. (The fitted wage profile b1*ttl_exp + b2*ttl_exp^2 peaks where its slope b1 + 2*b2*ttl_exp equals 0, that is at ttl_exp = -b1/(2*b2), which is the expression we gave to nlcom.)

We may not trust that confidence interval, as nlcom assumes that the sampling distribution of our statistic is normal, and our statistic is something divided by a number that could easily be 0. So we can expect strange things to happen, and to be sure we would like to bootstrap this.

. program drop _all

. sysuse nlsw88
(NLSW, 1988 extract)

. program define toboot, rclass
  1.     version 14
  2.     syntax [if]
  3.     marksample touse
  4.     reg wage c.ttl_exp##c.ttl_exp grade if `touse'
  5.     return scalar max = -_b[ttl_exp]/(2*_b[c.ttl_exp#c.ttl_exp])
  6. end

. toboot

      Source |       SS           df       MS      Number of obs   =     2,244
-------------+----------------------------------   F(3, 2240)      =    129.89
       Model |  11018.1157         3  3672.70523   Prob > F        =    0.0000
    Residual |  63336.2148     2,240  28.2750959   R-squared       =    0.1482
-------------+----------------------------------   Adj R-squared   =    0.1470
       Total |  74354.3305     2,243  33.1495009   Root MSE        =    5.3174

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     ttl_exp |   .3148533    .106226     2.96   0.003     .1065415     .523165
             |
   c.ttl_exp#|
   c.ttl_exp |  -.0022075   .0042817    -0.52   0.606    -.0106039     .006189
             |
       grade |   .6455095   .0457626    14.11   0.000     .5557678    .7352511
       _cons |  -4.238754   .7752557    -5.47   0.000    -5.759049   -2.718459
------------------------------------------------------------------------------

. return list

scalars:
                r(max) =  71.31548299848835

. bootstrap max=r(max), reps(100) bca : toboot
(running toboot on estimation sample)

Jackknife replications (2244)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
  (output omitted)
..................................................  2200
............................................

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100

Bootstrap results                               Number of obs     =      2,244
                                                Replications      =        100

      command:  toboot
          max:  r(max)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         max |   71.31548   472.9631     0.15   0.880    -855.6752    998.3061
------------------------------------------------------------------------------

. estat bootstrap, bca

Bootstrap results                               Number of obs     =      2,244
                                                Replications      =        100

      command:  toboot
          max:  r(max)

------------------------------------------------------------------------------
             |    Observed               Bootstrap
             |       Coef.       Bias    Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
         max |   71.315483  -97.31549   472.96311    38.58707   799.4429 (BCa)
------------------------------------------------------------------------------
(BCa)  bias-corrected and accelerated confidence interval
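Bootstrap results depend on random draws, so a rerun of the commands above will give different numbers. A variation that can be reproduced is sketched below. The seed value, the number of replications, and the use of quietly, nodots and percentile intervals are choices made for this sketch, not part of the original example.

```stata
capture program drop toboot
program define toboot, rclass
    version 14
    syntax [if]
    marksample touse
    quietly regress wage c.ttl_exp##c.ttl_exp grade if `touse'
    return scalar max = -_b[ttl_exp]/(2*_b[c.ttl_exp#c.ttl_exp])
end

sysuse nlsw88, clear
set seed 12345                      // arbitrary seed, makes the draws repeatable
bootstrap max=r(max), reps(200) nodots: toboot
estat bootstrap, percentile
```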

 
 
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Goal

We will write a program htmlcodebook that creates a codebook in html format for a specified Stata dataset.

The codebook will contain:

The name of the file with data label and notes

A list of variables with label and notes

A description of each variable: a table if there are value labels or the number of distinct values is small, summary statistics for unlabeled numeric variables with more values, and some examples for string variables.

. htmlcodebook using arc06.dta, saving(test.html) replace
Output written to test.html

Such a program is not written in one go; it is built up in a long series of small steps.

I have created a large number of small exercises that simulate that process in a guided way.

These exercises will cover a large part of the material discussed before.

-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Starting

clear all

program define htmlcodebook
    version 14
    syntax , SAVing(string) 
    
    tempname book
    file open `book' using `saving', write 
    file write `book' "<!DOCTYPE html>"_n
    file write `book' "<html>"_n
    file write `book' "<body>"
    file write `book' "<h1>title</h1>"_n
    file write `book' "</body>" _n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
end

cd h:\stata_l2
htmlcodebook, saving(test.html) 
type test.html

We start working in a .do file, and we start small: a .html file with only one line, "title".

It shows us what a program looks like, and how to write a file with Stata using the file command.

It is important that we start our .do file with clear all, which (like program drop _all) removes previously defined programs from memory. We are defining a program, and Stata will return an error message if it already exists in memory.

The version command makes sure that this program will continue to work in future versions of Stata.

We specify the possible options with the syntax command.

With the file command we can write a file.

The first line says what kind of html file this is.

The second and last lines tell when the html file begins and ends.

The third and second-to-last lines tell when the body of the html file begins and ends.

The fourth line is what is displayed in the html file: it is "title" displayed as a heading 1.

After we created the html file we let Stata display a link that will open it in a browser.

We can see the file in plain text with the type command.

-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Add some lines to our .html file to make it prettier

We can make the html file look prettier by including the following lines after <html> and before <body>.

<style>
body {
width: 650px;
margin: auto;
}
</style>

It limits the width of the output and positions it in the middle of the screen.

Change the program to make it include those lines at the appropriate place.

htmlcodebook01.do 
 
 
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

add a replace option

If we run our .do file multiple times, then the file test.html has to be created multiple times, and from the second run on file open stops with an error because the file already exists.

Stata never overwrites something unless you explicitly say so, i.e. specify the replace option in the file command.

We want to copy that behavior, i.e.

add a replace option to our htmlcodebook command, and

only specify the replace option in file when the user specified the replace option in htmlcodebook.

Look in help syntax to find out how to implement on/off options (the user specified or did not specify the replace option).
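The mechanism can be tried out in isolation first. A minimal sketch, assuming a throwaway program called showreplace (a made-up name) that writes one line to a temporary file:

```stata
capture program drop showreplace
program define showreplace                  // hypothetical demo program
    syntax , SAVing(string) [replace]
    * `replace' contains "replace" if the option was typed, and is empty otherwise
    tempname book
    file open `book' using `"`saving'"', write text `replace'
    file write `book' "hello" _n
    file close `book'
end

tempfile demo
showreplace, saving("`demo'")                   // creates the file
showreplace, saving("`demo'") replace           // overwrites it
capture noisily showreplace, saving("`demo'")   // error: file already exists
```

Because the local macro `replace' is simply empty when the option is not typed, it can be passed on to file open without any if statement.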

htmlcodebook02.do 
 
 
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Let the user specify a meaningful title

"Title" is not a very meaningful title; we should add an option that allows users to specify their own title.

Such an option would ask for a string; see help syntax on how to implement such an option.

htmlcodebook03.do 
 
 
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

What should the title be when the title() option is not specified?

If the user did not specify the title option then we need to make a sensible choice. I suggest "Codebook for" followed by the file the user specified in using.

See help ifcmd on how to implement different things depending on whether or not an option was specified.
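The pattern of falling back on a default can be sketched with a small stand-alone program. The name showtitle is made up, and the program only displays the heading instead of writing it to a file.

```stata
capture program drop showtitle
program define showtitle, rclass            // hypothetical demo program
    syntax using/ [, title(string)]
    * an option that was not typed leaves an empty local macro behind
    if `"`title'"' == "" {
        local title `"Codebook for `using'"'
    }
    display as txt `"<h1>`title'</h1>"'
    return local title `"`title'"'
end

showtitle using arc06.dta
showtitle using arc06.dta, title("My survey")
```

Note that syntax only parses the file name after using; it does not check that the file exists, so this sketch runs without arc06.dta being present.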

htmlcodebook04.do 
 
 
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Add a list of variable names to our codefile

clear all

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    qui use "`using'", clear
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n    
    file write `book' "<body>"_n
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    file write `book' "<h3>Variable list</h3>"_n    //new
    file write `book' "<ul>"_n                      //new
    foreach var of varlist * {                      //new
        file write `book' "<li>`var'</li>"_n        //new
    }                                               //new
    file write `book' "</ul>"                       //new
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace

At the very least a codebook should contain a list of variable names

In the .do file we can see how we can create a list in html using the <ul> , </ul>, <li>, and </li> tags.

-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Adding variable labels to our variable list

Change this program such that

if there is a variable label then the variable list shows variable name : variable label

otherwise just the variable name

Use extended macro functions to find the variable label (see help extended_fcn)

Notice that this returns an empty string if no variable label is attached to that variable
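The lookup itself can be tried on the auto data before building it into the program. In this sketch the label of mpg is removed on purpose to create an unlabeled variable. The condition checks for both an empty result and a result equal to the variable name, so it works whichever of the two an unlabeled variable produces.

```stata
sysuse auto, clear
label variable mpg ""                   // remove one label, for illustration

foreach var of varlist make mpg {
    local lab : variable label `var'
    if `"`lab'"' != "" & `"`lab'"' != "`var'" {
        display `"`var': `lab'"'
    }
    else {
        display "`var'"
    }
}
```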

htmlcodebook06.do 
 
 
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Add data label to codebook

Use extended macro functions to find the label belonging to the dataset.

If such a label exists, add it to the codebook underneath the title with <h2> and </h2> tags.

htmlcodebook07.do 
 
 
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Add data notes to codebook

Notes are stored as characteristics, which can be accessed using the `: char' extended macro function.

The notes are named _dta[note1], _dta[note2], etc.

How many notes there are is stored in _dta[note0]

This is empty ("") if no notes were specified

Or contains the number of notes, when one or more notes exist

Add a list of data notes underneath the data label if such notes exist
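How these characteristics behave can be checked interactively. A sketch on the auto data, where the existing dataset notes are dropped first and two invented notes are added:

```stata
sysuse auto, clear
notes drop _dta                         // start without dataset notes
notes _dta: first illustrative note
notes _dta: second illustrative note

local n : char _dta[note0]              // number of notes, or empty
if "`n'" != "" {
    forvalues i = 1/`n' {
        display `"<li>`: char _dta[note`i']'</li>"'
    }
}
```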

htmlcodebook08.do 
 
 
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Splitting the program up in smaller subroutines

clear all

program define Descfile                                              //new
    syntax using/, book(string)                                      //new
                                                                     //new
    qui use "`using'", clear                                         //new
                                                                     //new
    if `"`: data label'"' != "" {                                    //new
        file write `book' `"<h2>`:data label'</h2>"' _n              //new
    }                                                                //new
    if "`: char _dta[note0]'" != "" {                                //new
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n                //new
        forvalues i = 1/`: char _dta[note0]' {                       //new
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n //new
        }                                                            //new
        file write `book' "</ul>"_n                                  //new
    }                                                                //new
end                                                                  //new

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n    
    file write `book' "<body>"_n
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')                           //new
    
    file write `book' "<h3>Variable list</h3>"_n
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' "<li>`var'"
        local lab : variable label `var'
        if `"`lab'"' != "" & `"`lab'"' != "`var'" {
            file write `book' `": `lab'"'
        }
        file write `book' "</li>"_n
    }
    file write `book' "</ul>"
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace

Our program keeps growing. To keep an overview, and to make it easier to spot errors and maintain the program, it is best to split it into subroutines, each with one specific task.

Here we create a subroutine for describing a file.

StataCorp often starts the names of subroutines with a capital letter, and I have copied that convention.

Notice that the handle for our codebook, `book', is a tempname, so it is local to the program that created it and a subroutine does not know it automatically.

To solve this we define the handle in the main program.

This means that it exists as long as the main program is being executed, even if it calls another program, like our subroutine.

That way we can pass that temporary name on to subroutines as an option.
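The same idea in isolation, as a minimal sketch: the program names Writeline and demohandle, the option text() and the file name handle_demo.txt are all made up for this illustration and are not part of the codebook program.

```stata
clear all

// Hypothetical subroutine: appends one line to a file that is already open
program define Writeline
    syntax , book(string) text(string)
    file write `book' `"`text'"' _n
end

// Hypothetical main program: creates and owns the file handle
program define demohandle
    version 14
    syntax using/ [, replace]
    tempname book
    file open `book' using `"`using'"', write text `replace'
    // the handle is passed on as an option, just like in htmlcodebook
    Writeline, book(`book') text(first line)
    Writeline, book(`book') text(second line)
    file close `book'
end

demohandle using handle_demo.txt, replace
type handle_demo.txt
```

The subroutine never opens or closes the file; it only writes to whatever handle it is given, so the main program stays in charge of the file.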

Use this trick to create another subroutine for describing the variables; call it Descvars.

htmlcodebook10.do 
 
 
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Add variable notes to the variable list

Change Descvars so that the notes belonging to a variable are displayed below that variable, if they exist.
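As a hint, here is a small sketch of how variable notes can be read back. It assumes the auto data that ships with Stata, and the two note texts are made up for the illustration.

```stata
sysuse auto, clear

// two hypothetical notes, only there to have something to read back
notes mpg: measured on a test track
notes mpg: city and highway combined

// note0 holds the number of notes; note1, note2, ... hold the texts
local n : char mpg[note0]
forvalues i = 1/`n' {
    display as txt `"`i'. `: char mpg[note`i']'"'
}
```

This is the same pattern Descfile uses for the notes of the dataset, with the variable name in place of _dta.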

htmlcodebook11.do 
 
 
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Add a list of value labels to our codebook

clear all

program define Descfile
    syntax using/, book(string)
    
    qui use "`using'", clear
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
        forvalues i = 1/`: char _dta[note0]' {
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
        }
        file write `book' "</ul>"_n
    }
end

program define Descvars
    syntax, book(string)
    
    file write `book' "<h3>Variable list</h3>"
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' "<li>`var'"
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"'
        }
        file write `book' "</li>"_n
        if "`: char `var'[note0]'" != "" {
            file write `book' "<ul>" _n
            forvalues i = 1/`: char `var'[note0]' {
                file write `book' `"<li>`: char `var'[note`i']'</li>"'
            }
            file write `book' "</ul>" _n
        }
    }
    file write `book' "</ul>"
end    

program define Descvalues                                     //new
    syntax, book(string) file(string)                         //new
                                                              //new
    foreach var of varlist * {                                //new
        local vallab : val label `var'                        //new
        if "`vallab'" != "" {                                 //new
            file write `book' `"<h4>`var'"'                   //new
            local varlab : var label `var'                    //new
            if `"`varlab'"' != "`var'" {                      //new
                file write `book' `": `varlab'"'              //new
            }                                                 //new 
            file write `book' "</h4>"_n                       //new  
            qui uselabel `vallab'                             //new  
            file write `book' "<table>"_n                     //new
            forvalues i = 1/`=_N' {                           //new
                file write `book' "<tr>"                      //new
                file write `book' `"<td>`=value[`i']'</td>"'  //new
                file write `book' `"<td>`=label[`i']'</td>"'  //new
                file write `book' "</tr>"_n                   //new
            }                                                 //new
            file write `book' "</table>"_n                    //new
            qui use `file', clear                             //new
        }                                                     //new
    }                                                         //new
                                                              //new
end                                                           //new

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')
    Descvars, book(`book')
    Descvalues, book(`book') file(`using')                      //new  
    
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace

Here we add, for each variable that has value labels, a list of those value labels to our codebook.

This uses the uselabel command, which replaces the data in memory with a dataset containing the value labels. That is why we reload the original file afterwards.
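To see what uselabel leaves in memory, here is a minimal sketch. It assumes the auto data that ships with Stata, where the variable foreign has a value label attached.

```stata
sysuse auto, clear

// name of the value label attached to foreign
local vallab : value label foreign

// the data in memory are now replaced by the label definitions
uselabel `vallab', clear
list lname value label, noobs
```

Each observation is now one value with its label, which is why Descvalues can loop over the observations to fill the table.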

-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Add frequencies to our labels

clear all

program define Descfile
    syntax using/, book(string)
    
    qui use "`using'", clear
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
        forvalues i = 1/`: char _dta[note0]' {
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
        }
        file write `book' "</ul>"_n
    }
end

program define Descvars
    syntax, book(string)
    
    file write `book' "<h3>Variable list</h3>"
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' "<li>`var'"
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"'
        }
        file write `book' "</li>"_n
        if "`: char `var'[note0]'" != "" {
            file write `book' "<ul>" _n
            forvalues i = 1/`: char `var'[note0]' {
                file write `book' `"<li>`: char `var'[note`i']'</li>"'
            }
            file write `book' "</ul>" _n
        }
    }
    file write `book' "</ul>"
end    

program define Descvalues
    syntax, book(string) file(string)
    
    foreach var of varlist * {
        local vallab : val label `var'
        if "`vallab'" != "" {
            file write `book' `"<h4>`var'"'
            local varlab : var label `var'
            if `"`varlab'"' != "`var'" {
                file write `book' `": `varlab'"'
            }
            file write `book' "</h4>"_n
            
            tempvar freq                                      //new
            tempfile freqtable                                //new
            contract `var', freq(`freq')                      //new
            rename `var' value                                //new
            qui save `freqtable'                              //new
            qui uselabel `vallab'           
            qui merge 1:1 value using `freqtable'             //new
            file write `book' "<table>"_n
            file write `book' "<tr>"                          //new
            file write `book' "<th>value</th>"                //new
            file write `book' "<th>label</th>"                //new
            file write `book' "<th>frequency</th>"            //new
            file write `book' "</tr>"_n                       //new
            forvalues i = 1/`=_N' {
                file write `book' "<tr>"
                file write `book' `"<td>`=value[`i']'</td>"'
                file write `book' `"<td>`=label[`i']'</td>"'
                file write `book' `"<td>`=`freq'[`i']'</td>"' //new
                file write `book' "</tr>"_n
            }
            file write `book' "</table>"_n
            qui use `file', clear
        }
    }

end

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')
    Descvars, book(`book')
    Descvalues, book(`book') file(`using')    
    
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace

contract reduces the data to one observation per value of `var' and adds a new variable `freq' that contains the number of observations with that value.

We merge that with the dataset created by uselabel.
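The two steps can be tried outside the program. This is a sketch that assumes the auto data that ships with Stata; the variable name n for the frequencies is my choice for the illustration, where the program uses a tempvar.

```stata
sysuse auto, clear
local vallab : value label foreign

tempfile freqtable
// one observation per value of foreign, n holds the frequency
contract foreign, freq(n)
// uselabel calls its key variable value, so we use the same name
rename foreign value
save `freqtable'

uselabel `vallab', clear
merge 1:1 value using `freqtable'
list value label n, noobs
```

Values that occur in the data but have no label, or labels that never occur in the data, both end up in the merged table, because merge keeps unmatched observations by default.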

-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Numeric variables without value labels

clear all

program define Descfile
    syntax using/, book(string)
    
    qui use "`using'", clear
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
        forvalues i = 1/`: char _dta[note0]' {
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
        }
        file write `book' "</ul>"_n
    }
end

program define Descvars
    syntax, book(string)
    
    file write `book' "<h3>Variable list</h3>"
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' "<li>`var'"
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"'
        }
        file write `book' "</li>"_n
        if "`: char `var'[note0]'" != "" {
            file write `book' "<ul>" _n
            forvalues i = 1/`: char `var'[note0]' {
                file write `book' `"<li>`: char `var'[note`i']'</li>"'
            }
            file write `book' "</ul>" _n
        }
    }
    file write `book' "</ul>"
end    

program define Descvalues
    syntax, book(string) file(string)
    
    local N = _N
    foreach var of varlist * {
        file write `book' `"<h4>`var'"'
        local varlab : var label `var'
        if `"`varlab'"' != "`var'" {
            file write `book' `": `varlab'"'
        }
        file write `book' "</h4>"_n
        tempvar freq
        contract `var', freq(`freq')
        
        local vallab : val label `var'    
        if "`vallab'" != "" {
            tempfile freqtable
            rename `var' value
            qui save `freqtable'
            qui uselabel `vallab'
            qui merge 1:1 value using `freqtable'
            file write `book' "<table>"_n
            file write `book' "<tr>"
            file write `book' "<th>value</th>"
            file write `book' "<th>label</th>"
            file write `book' "<th>frequency</th>"
            file write `book' "</tr>"_n
            forvalues i = 1/`=_N' {
                file write `book' "<tr>"
                file write `book' `"<td>`=value[`i']'</td>"'
                file write `book' `"<td>`=label[`i']'</td>"'
                file write `book' `"<td>`=`freq'[`i']'</td>"'
                file write `book' "</tr>"_n
            }
            file write `book' "</table>"_n
        }
        else {                                                       //new
            capture confirm numeric variable `var'                   //new
            if _N <= 10 {                                            //new
                file write `book' "<table>"_n                        //new
                file write `book' "<tr>"                             //new
                file write `book' "<th>value</th>"                   //new
                file write `book' "<th>frequency</th>"               //new
                file write `book' "</tr>"_n                          //new
                forvalues i = 1/`=_N' {                              //new
                    file write `book' "<tr>"                         //new
                    file write `book' `"<td>`=`var'[`i']'</td>"'     //new
                    file write `book' `"<td>`=`freq'[`i']'</td>"'    //new
                    file write `book' "</tr>"_n                      //new
                }                                                    //new
                file write `book' "</table>"_n                       //new
            }                                                        //new
            else if !_rc {                                           //new
                file write `book' "<table>" _n                       //new
                qui count if !missing(`var')                         //new
                local distinct = r(N)                                //new
                qui sum `var' [fw=`freq'], detail                    //new
                file write `book' "<tr>"_n                           //new
                file write `book' `"<th>valid/missing obs.</th>"'    //new
                file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"' //new
                file write `book' "</tr>"_n                          //new 
                file write `book' "<tr>"                             //new
                file write `book' "<th>distinct values</th>"         //new
                file write `book' "<td>`distinct'</td>"              //new
                file write `book' "</tr>"_n                          //new
                file write `book' "<tr>"                             //new
                file write `book' "<th>minimum</th>"                 //new
                file write `book' "<td>`r(min)'</td>"                //new
                file write `book' "</tr>"_n                          //new
                file write `book' "</table>" _n                      //new
            }                                                        //new
        }                                                            //new
        qui use `file', clear
    }

end

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')
    Descvars, book(`book')
    Descvalues, book(`book') file(`using')    
    
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace

For variables without value labels that have 10 or fewer distinct values we tabulate the data.

For numeric variables with more distinct values we start showing summary statistics instead.
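The decision rule can be checked on its own. This is a sketch that assumes the auto data that ships with Stata; the local macros show_rep78 and show_price only exist for this illustration.

```stata
sysuse auto, clear
foreach var of varlist rep78 price {
    preserve
    // after contract, _N is the number of distinct values of `var'
    contract `var'
    // same cut-off of 10 rows as in Descvalues
    local show_`var' = cond(_N <= 10, "table", "summary")
    display as txt "`var': `=_N' distinct values -> `show_`var''"
    restore
}
```

Note that contract keeps missing values as a separate row, so a missing value counts towards the 10 rows.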

-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

display format for summary statistics

I am not quite happy with the way the minimum is displayed.

Can't we copy the display format for that variable?

The display format of a variable can be extracted with the `: format ' extended macro function.

You can apply that format using the `: display ' extended macro function.
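A small sketch of the two extended macro functions together, assuming the auto data that ships with Stata:

```stata
sysuse auto, clear

// display format of price
local fmt : format price

quietly summarize price
// apply that format to the minimum and store the result as text
local min : display `fmt' r(min)
display as txt "raw: `r(min)'   formatted: `min'"
```

The formatted result is a string, so it can be written to the .html file directly.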

Change the program to achieve this goal

htmlcodebook15.do 
 
 
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Add other summary statistics

Expand our program to also include the 25th, 50th, and 75th percentiles and the maximum.
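As a hint: summarize with the detail option leaves all of these behind as returned results. A sketch, assuming the auto data that ships with Stata:

```stata
sysuse auto, clear
local fmt : format mpg
quietly summarize mpg, detail

// r(min), r(p25), r(p50), r(p75) and r(max) are left behind by summarize
foreach stat in min p25 p50 p75 max {
    display as txt %-5s "`stat'" as res `fmt' r(`stat')
}
```

In the program, each of these would become one row in the table instead of one line on the screen.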

htmlcodebook16.do 
 
 
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

String variables

We also want to show something for string variables.

Let's show the first 10 distinct values as an example.

Change the program to achieve that goal.
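As a hint, a sketch of how the first distinct values of a string variable can be listed. It assumes the auto data that ships with Stata; the min() guard is my addition, for variables with fewer than 10 distinct values.

```stata
sysuse auto, clear

// one observation per distinct value of make, sorted
contract make

forvalues i = 1/`=min(10, _N)' {
    display as txt `"`=make[`i']'"'
}
```

In Descvalues the data are already contracted at that point, so only the loop is needed there.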

htmlcodebook17.do 
 
 
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Add links between the variable list and the value label list

clear all

program define Descfile
    syntax using/, book(string)
    
    qui use "`using'", clear
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
        forvalues i = 1/`: char _dta[note0]' {
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
        }
        file write `book' "</ul>"_n
    }
end

program define Descvars
    syntax, book(string)
    
    file write `book' "<h3>Variable list</h3>"
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' `"<li><a href="#l`var'">`var'</a>"'   //new
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"'
        }
        file write `book' "</li>"_n
        if "`: char `var'[note0]'" != "" {
            file write `book' "<ul>" _n
            forvalues i = 1/`: char `var'[note0]' {
                file write `book' `"<li>`: char `var'[note`i']'</li>"'
            }
            file write `book' "</ul>" _n
        }
    }
    file write `book' "</ul>"
end    

program define Descvalues
    syntax, book(string) file(string)
    
    local N = _N
    foreach var of varlist * {
        file write `book' `"<h4 id="l`var'">`var'"'             //new
        local varlab : var label `var'
        if `"`varlab'"' != "`var'" {
            file write `book' `": `varlab'"'
        }
        file write `book' "</h4>"_n
        tempvar freq
        contract `var', freq(`freq')
        
        local vallab : val label `var'    
        if "`vallab'" != "" {
            tempfile freqtable
            rename `var' value
            qui save `freqtable'
            qui uselabel `vallab'
            qui merge 1:1 value using `freqtable'
            file write `book' "<table>"_n
            file write `book' "<tr>"
            file write `book' "<th>value</th>"
            file write `book' "<th>label</th>"
            file write `book' "<th>frequency</th>"
            file write `book' "</tr>"_n
            forvalues i = 1/`=_N' {
                file write `book' "<tr>"
                file write `book' `"<td>`=value[`i']'</td>"'
                file write `book' `"<td>`=label[`i']'</td>"'
                file write `book' `"<td>`=`freq'[`i']'</td>"'
                file write `book' "</tr>"_n
            }
            file write `book' "</table>"_n
        }
        else {
            capture confirm numeric variable `var'
            if _N <= 10 {
                file write `book' "<table>"_n
                file write `book' "<tr>"
                file write `book' "<th>value</th>"
                file write `book' "<th>frequency</th>"
                file write `book' "</tr>"_n
                forvalues i = 1/`=_N' {
                    file write `book' "<tr>"
                    file write `book' `"<td>`=`var'[`i']'</td>"'
                    file write `book' `"<td>`=`freq'[`i']'</td>"'
                    file write `book' "</tr>"_n
                }
                file write `book' "</table>"_n
            }
            else if !_rc {
                file write `book' "<table>" _n
                qui count if !missing(`var')
                local distinct = r(N)
                local fmt : format `var'
                qui sum `var' [fw=`freq'], detail
                file write `book' "<tr>"_n
                file write `book' `"<th>valid/missing obs.</th>"'
                file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>distinct values</th>"
                file write `book' "<td>`distinct'</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>minimum</th>"
                file write `book' "<td>`:display `fmt' `r(min)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>25th percentile</th>"
                file write `book' "<td>`:display `fmt' `r(p25)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>50th percentile</th>"
                file write `book' "<td>`:display `fmt' `r(p50)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>75th percentile</th>"
                file write `book' "<td>`:display `fmt' `r(p75)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>maximum</th>"
                file write `book' "<td>`:display `fmt' `r(max)''</td>"
                file write `book' "</tr>"_n                
                file write `book' "</table>" _n
            }
            else {
                file write `book' "<p>Variable `var' is a string variable,"
                file write `book' "example values are:</p>"_n
                file write `book' "<ul>"_n
                forvalues i = 1/10 {
                    file write `book' `"<li>`=`var'[`i']'</li>"'_n
                }
                file write `book' "</ul>"_n
            }
        
        }
        qui use `file', clear
    }

end

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')
    Descvars, book(`book')
    Descvalues, book(`book') file(`using')    
    
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace

Descvalues creates markers, and Descvars adds links to those markers.

That way users can jump from the list of variables to the list of value labels.
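Stripped down to one variable, the marker and the link look like this. It is a sketch: the file name link_demo.html and the variable name mpg are made up for the illustration.

```stata
tempname book
file open `book' using link_demo.html, write text replace

// hypothetical variable name
local var mpg

// the link, as written by Descvars: href points to the marker l<varname>
file write `book' `"<ul><li><a href="#l`var'">`var'</a></li></ul>"' _n
// the marker, as written by Descvalues: id is l<varname>
file write `book' `"<h4 id="l`var'">`var'</h4>"' _n

file close `book'
type link_demo.html
```

The prefix l in front of the variable name keeps the id of the marker distinct from any other id we may want to derive from the same variable name later.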

clear all

program define Descfile
    syntax using/, book(string)
    
    qui use "`using'", clear
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
        forvalues i = 1/`: char _dta[note0]' {
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
        }
        file write `book' "</ul>"_n
    }
end

program define Descvars
    syntax, book(string)
    
    file write `book' "<h3>Variable list</h3>"
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' `"<li id="v`var'"><a href="#l`var'">`var'</a>"' //new
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"' 
        }
        file write `book' "</li>"_n
        if "`: char `var'[note0]'" != "" {
            file write `book' "<ul>" _n
            forvalues i = 1/`: char `var'[note0]' {
                file write `book' `"<li>`: char `var'[note`i']'</li>"'
            }
            file write `book' "</ul>" _n
        }
    }
    file write `book' "</ul>"
end    

program define Descvalues
    syntax, book(string) file(string)
    
    local N = _N
    foreach var of varlist * {
        file write `book' `"<h4 id="l`var'"><a href="#v`var'">`var'</a>"' //new
        local varlab : var label `var'
        if `"`varlab'"' != "`var'" {
            file write `book' `": `varlab'"'
        }
        file write `book' "</h4>"_n
        tempvar freq
        contract `var', freq(`freq')
        
        local vallab : val label `var'    
        if "`vallab'" != "" {
            tempfile freqtable
            rename `var' value
            qui save `freqtable'
            qui uselabel `vallab'
            qui merge 1:1 value using `freqtable'
            file write `book' "<table>"_n
            file write `book' "<tr>"
            file write `book' "<th>value</th>"
            file write `book' "<th>label</th>"
            file write `book' "<th>frequency</th>"
            file write `book' "</tr>"_n
            forvalues i = 1/`=_N' {
                file write `book' "<tr>"
                file write `book' `"<td>`=value[`i']'</td>"'
                file write `book' `"<td>`=label[`i']'</td>"'
                file write `book' `"<td>`=`freq'[`i']'</td>"'
                file write `book' "</tr>"_n
            }
            file write `book' "</table>"_n
        }
        else {
            capture confirm numeric variable `var'
            if _N <= 10 {
                file write `book' "<table>"_n
                file write `book' "<tr>"
                file write `book' "<th>value</th>"
                file write `book' "<th>frequency</th>"
                file write `book' "</tr>"_n
                forvalues i = 1/`=_N' {
                    file write `book' "<tr>"
                    file write `book' `"<td>`=`var'[`i']'</td>"'
                    file write `book' `"<td>`=`freq'[`i']'</td>"'
                    file write `book' "</tr>"_n
                }
                file write `book' "</table>"_n
            }
            else if !_rc {
                file write `book' "<table>" _n
                qui count if !missing(`var')
                local distinct = r(N)
                local fmt : format `var'
                qui sum `var' [fw=`freq'], detail
                file write `book' "<tr>"_n
                file write `book' `"<th>valid/missing obs.</th>"'
                file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>distinct values</th>"
                file write `book' "<td>`distinct'</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>minimum</th>"
                file write `book' "<td>`:display `fmt' `r(min)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>25th percentile</th>"
                file write `book' "<td>`:display `fmt' `r(p25)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>50th percentile</th>"
                file write `book' "<td>`:display `fmt' `r(p50)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>75th percentile</th>"
                file write `book' "<td>`:display `fmt' `r(p75)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>maximum</th>"
                file write `book' "<td>`:display `fmt' `r(max)''</td>"
                file write `book' "</tr>"_n                
                file write `book' "</table>" _n
            }
            else {
                file write `book' "<p>Variable `var' is a string variable,"
                file write `book' "example values are:</p>"_n
                file write `book' "<ul>"_n
                forvalues i = 1/10 {
                    file write `book' `"<li>`=`var'[`i']'</li>"'_n
                }
                file write `book' "</ul>"_n
            }
        
        }
        qui use `file', clear
    }

end

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')
    Descvars, book(`book')
    Descvalues, book(`book') file(`using')    
    
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace

Descvars creates markers, and Descvalues adds links to those markers.

That way users can now jump from the list of value labels back to the variable list.

clear all

program define Descfile
    syntax using/, book(string)
    
    qui use "`using'", clear
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
        forvalues i = 1/`: char _dta[note0]' {
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
        }
        file write `book' "</ul>"_n
    }
end

program define Descvars
    syntax, book(string)
    
    file write `book' "<h3>Variable list</h3>"
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' `"<li id="v`var'"><a href="#l`var'">`var'</a>"'
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"'
        }
        file write `book' "</li>"_n
        if "`: char `var'[note0]'" != "" {
            file write `book' "<ul>" _n
            forvalues i = 1/`: char `var'[note0]'{
                file write `book' `"<li>`: char `var'[note`i']'</li>"'
            }
            file write `book' "</ul>" _n
        }
    }
    file write `book' "</ul>"
end    

program define Descvalues
    syntax, book(string) file(string)
    
    local N = _N
    foreach var of varlist * {
        file write `book' `"<h4 id="l`var'"><a href="#v`var'">`var'</a>"'
        local varlab : var label `var'
        if `"`varlab'"' != "`var'" {
            file write `book' `": `varlab'"'
        }
        file write `book' "</h4>"_n
        tempvar freq
        contract `var', freq(`freq')
        
        local vallab : val label `var'    
        if "`vallab'" != "" {
            tempfile freqtable
            rename `var' value
            qui save `freqtable'
            qui uselabel `vallab'
            qui merge 1:1 value using `freqtable'
            file write `book' "<table>"_n
            file write `book' "<tr>"
            file write `book' "<th>value</th>"
            file write `book' "<th>label</th>"
            file write `book' "<th>frequency</th>"
            file write `book' "</tr>"_n
            forvalues i = 1/`=_N' {
                file write `book' "<tr>"
                file write `book' `"<td>`=value[`i']'</td>"'
                file write `book' `"<td>`=label[`i']'</td>"'
                file write `book' `"<td>`=`freq'[`i']'</td>"'
                file write `book' "</tr>"_n
            }
            file write `book' "</table>"_n
        }
        else {
            capture confirm numeric variable `var'
            if _N <= 10 {
                file write `book' "<table>"_n
                file write `book' "<tr>"
                file write `book' "<th>value</th>"
                file write `book' "<th>frequency</th>"
                file write `book' "</tr>"_n
                forvalues i = 1/`=_N' {
                    file write `book' "<tr>"
                    file write `book' `"<td>`=`var'[`i']'</td>"'
                    file write `book' `"<td>`=`freq'[`i']'</td>"'
                    file write `book' "</tr>"_n
                }
                file write `book' "</table>"_n
            }
            else if !_rc {
                file write `book' "<table>" _n
                qui count if !missing(`var')
                local distinct = r(N)
                local fmt : format `var'
                qui sum `var' [fw=`freq'], detail
                file write `book' "<tr>"_n
                file write `book' `"<th>valid/missing obs.</th>"'
                file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>distinct values</th>"
                file write `book' "<td>`distinct'</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>minimum</th>"
                file write `book' "<td>`:display `fmt' `r(min)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>25<sup>th</sup> percentile</th>"  //new
                file write `book' "<td>`:display `fmt' `r(p25)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>50<sup>th</sup> percentile</th>"  //new
                file write `book' "<td>`:display `fmt' `r(p50)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>75<sup>th</sup> percentile</th>"  //new
                file write `book' "<td>`:display `fmt' `r(p75)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>maximum</th>"
                file write `book' "<td>`:display `fmt' `r(max)''</td>"
                file write `book' "</tr>"_n                
                file write `book' "</table>" _n
            }
            else {
                file write `book' "<p>Variable `var' is a string variable,"
                file write `book' "example values are:</p>"_n
                file write `book' "<ul>"_n
                forvalues i = 1/10 {
                    file write `book' `"<li>`=`var'[`i']'</li>"'_n
                }
                file write `book' "</ul>"_n
            }
        
        }
        qui use `file', clear
    }

end

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "    width: 650px;"_n
    file write `book' "    margin: auto;"_n
    file write `book' "}"_n
    file write `book' "table {"_n                                        //new
    file write `book' "    border-collapse: collapse;"_n                 //new
    file write `book' "}"_n                                              //new
    file write `book' "table, th, td {"_n                                //new
    file write `book' "    border: 1px solid black;"_n                   //new
    file write `book' "}"_n                                              //new
    file write `book' "th, td {"_n                                       //new
    file write `book' "    text-align: left;"_n                          //new
    file write `book' "    padding-right: 10px;"_n                       //new
    file write `book' "    padding-left: 10px;"_n                        //new
    file write `book' "}"_n                                              //new
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')
    Descvars, book(`book')
    Descvalues, book(`book') file(`using')    
    
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace

Add some additional HTML and CSS code to make the tables look pretty: borders and padding for the table cells, and superscripts in the percentile labels.

-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

Turn this into an .ado file

To turn this into an .ado file, all we have to do is:

Remove the commands outside the programs.

Move the main program htmlcodebook to the top of the file.

Add a comment *! version 0.1.0 24Apr2018 MLB; this is displayed when you type which htmlcodebook.

Store this as htmlcodebook.ado; the name of the file has to correspond with the name of the first program.

All other programs in the file are local to the main program, so programs outside this file cannot see them.
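
As a rough sketch of the resulting layout, htmlcodebook.ado could look like the outline below. The program bodies are abbreviated to comments here; the assumption is that they are copied unchanged from the final .do file, so that only the *! line is new.

```stata
*! version 0.1.0 24Apr2018 MLB
program define htmlcodebook          // main program comes first
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    // body unchanged from the .do file
end

program define Descfile              // local to htmlcodebook.ado
    syntax using/, book(string)
    // body unchanged from the .do file
end

program define Descvars              // local to htmlcodebook.ado
    syntax, book(string)
    // body unchanged from the .do file
end

program define Descvalues            // local to htmlcodebook.ado
    syntax, book(string) file(string)
    // body unchanged from the .do file
end
```

Once the file is stored somewhere along the ado-path (adopath lists the directories), which htmlcodebook should show where Stata found the file together with the *! line.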

-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------

What is still left?

Some datasets consist of multiple files. It would be nice to be able to specify a directory and let htmlcodebook create a codebook for all files in that directory.

No program is complete without a help file. We can use the example help file (help examplehelpfile) and its source (viewsource examplehelpfile.sthlp) as a template.
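
One possible way to approach the first point is a small wrapper that loops over all .dta files in a directory. This is only a sketch: the program name htmlcodebookdir and its directory() option are made up for illustration, it relies on the extended macro function dir, and it assumes that the directory and file names contain no spaces (htmlcodebook as written above does not quote its file names).

```stata
program define htmlcodebookdir       // hypothetical wrapper, not part of the course files
    version 14
    syntax , DIRectory(string) [replace]

    // collect the names of all .dta files in the directory
    local files : dir "`directory'" files "*.dta"

    foreach f of local files {
        // foo.dta -> foo
        local stub : subinstr local f ".dta" ""
        // forward slashes: a backslash directly before a macro
        // would suppress its expansion
        htmlcodebook using `directory'/`f', ///
            saving(`directory'/`stub'.html) `replace'
    }
end

htmlcodebookdir, directory(h:/ss18/stata_l2) replace
```

Each dataset gets its own .html file next to the .dta file; a single combined codebook would require changes inside htmlcodebook itself.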

-------------------------------------------------------------------------------
digression
-------------------------------------------------------------------------------

c_local

With c_local you can create a local macro one level below, that is, in the program or .do file that called the current one.

So if bar.do calls foo.do, and foo.do contains the line c_local blup 1, then that will create a local macro in bar.do called `blup' containing 1.
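
The same mechanism can be tried out with two programs instead of two .do files. This is a minimal sketch, assuming c_local behaves for programs as just described for .do files; the names foo, bar, and blup are only placeholders.

```stata
program define foo
    c_local blup 1            // creates the local macro blup in the caller of foo
end

program define bar
    foo                       // nothing on this line reveals that foo defines blup
    display "blup contains: `blup'"
end

bar
```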

If you are later reading bar.do you have no indication that foo.do will create that local macro. So it can be very hard to later figure out where that local macro `blup' came from. This is why I labeled it as dangerous.

This is why it is not documented, and not even undocumented.

To quote Nick Cox (https://www.stata.com/statalist/archive/2005-11/msg00405.html):

-c_local- is not documented; it is not even "undocumented" (-help undocumented-). So, how does anyone outside StataCorp know about it?

What happens is this: after a long period of Stata use in which you have done well, Stata will speak to you:

"Greetings! You have reached the seventh level of Stata, and I name you Statafriend.

You will now be initiated into seven Stata secrets. The first is -c_local-."

and so forth, but the rest of it is probably not of interest.

Since this is only a second level Stata course you still have a way to go...


-------------------------------------------------------------------------------

htmlcodebook01.do



clear all

program define htmlcodebook
    version 14
    syntax , SAVing(string) 
    
    tempname book
    file open `book' using `saving', write 
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n           //new
    file write `book' "body {"_n            //new  
    file write `book' "width: 650px;"_n     //new
    file write `book' "margin: auto;"_n     //new   
    file write `book' "}"_n                 //new
    file write `book' "</style>"_n          //new
    file write `book' "<body>"
    file write `book' "<h1>title</h1>"_n
    file write `book' "</body>" _n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
end

cd h:\stata_l2
htmlcodebook, saving(test.html) 
type test.html


-------------------------------------------------------------------------------

htmlcodebook02.do



clear all

program define htmlcodebook
    version 14
    syntax , SAVing(string) [replace]                 //new
    
    tempname book
    file open `book' using `saving', write `replace'  //new
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n    
    file write `book' "<body>"_n    
    file write `book' "<h1>title</h1>"_n
    file write `book' "</body>"_n    
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'
end

cd h:\stata_l2
htmlcodebook, saving(test.html) replace


-------------------------------------------------------------------------------

htmlcodebook03.do



clear all

program define htmlcodebook
    version 14
    syntax , SAVing(string) [replace title(string)]   //new
    
    tempname book
    file open `book' using `saving', write `replace'
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    if `"`title'"' != "" {                            //new
        file write `book' "<h1>`title'</h1>"_n        //new
    }                                                 //new
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
end

cd h:\stata_l2
htmlcodebook, saving(test.html) title("foo") replace


-------------------------------------------------------------------------------

htmlcodebook04.do



clear all

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book data
    file open `book' using `saving', write `replace'
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n    
    file write `book' "<body>"_n
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {                                                   //new
        file write `book' "<h1>Codebook for `using'</h1>"_n  //new
    }                                                        //new
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
end

cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace


-------------------------------------------------------------------------------

htmlcodebook06.do



clear all

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    qui use "`using'", clear
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n    
    file write `book' "<body>"_n
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    file write `book' "<h3>Variable list</h3>"
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' "<li>`var'"                    //new
        if `"`: var label `var''"' != "`var'" {          //new
            file write `book' `": `: var label `var''"'  //new
        }                                                //new
        file write `book' "</li>"_n                      //new
    }
    file write `book' "</ul>"
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace


-------------------------------------------------------------------------------

htmlcodebook07.do



clear all

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    qui use "`using'", clear
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    if `"`: data label'"' != "" {                          //new
        file write `book' `"<h2>`:data label'</h2>"' _n    //new
    }                                                      //new
    
    file write `book' "<h3>Variable list</h3>"_n
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' "<li>`var'"
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"'
        }
        file write `book' "</li>"_n
    }
    file write `book' "</ul>"
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace


-------------------------------------------------------------------------------

htmlcodebook08.do



clear all

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    qui use "`using'", clear
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n    
    file write `book' "<body>"_n
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {                                //new
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n                //new
        forvalues i = 1/`: char _dta[note0]' {                       //new
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n //new
        }                                                            //new
        file write `book' "</ul>"_n                                  //new
    }                                                                //new
    
    file write `book' "<h3>Variable list</h3>"_n
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' "<li>`var'"
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"' 
        }
        file write `book' "</li>"_n
    }
    file write `book' "</ul>"
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace



-------------------------------------------------------------------------------

htmlcodebook10.do



clear all

program define Descfile
    syntax using/, book(string)
    
    qui use "`using'", clear
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
        forvalues i = 1/`: char _dta[note0]' {
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
        }
        file write `book' "</ul>"_n
    }
end

program define Descvars                                  //new
    syntax, book(string)                                 //new
                                                         //new
    file write `book' "<h3>Variable list</h3>"_n         //new
    file write `book' "<ul>"_n                           //new
    foreach var of varlist * {                           //new
        file write `book' "<li>`var'"                    //new
        if `"`: var label `var''"' != "`var'" {          //new
            file write `book' `": `: var label `var''"'  //new
        }                                                //new
        file write `book' "</li>"_n                      //new
    }                                                    //new
    file write `book' "</ul>"                            //new
end                                                         //new

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')
    Descvars, book(`book')                               //new
    
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'        
    restore
end

cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace


-------------------------------------------------------------------------------

htmlcodebook11.do



clear all

program define Descfile
    syntax using/, book(string)
    
    qui use "`using'", clear
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
        forvalues i = 1/`: char _dta[note0]' {
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
        }
        file write `book' "</ul>"_n
    }
end

program define Descvars
    syntax, book(string)
    
    file write `book' "<h3>Variable list</h3>"
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' "<li>`var'"
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"'
        }
        file write `book' "</li>"_n        
        if "`: char `var'[note0]'" != "" {                             //new
            file write `book' "<ul>" _n                                //new
            forvalues i = 1/`: char `var'[note0]'{                     //new
                file write `book' `"<li>`: char `var'[note`i']'</li>"' //new
            }                                                          //new
            file write `book' "</ul>" _n                               //new
        }                                                              //new
    }
    file write `book' "</ul>"
end    

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')
    Descvars, book(`book')
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace


-------------------------------------------------------------------------------

htmlcodebook15.do



clear all

program define Descfile
    syntax using/, book(string)
    
    qui use "`using'", clear
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
        forvalues i = 1/`: char _dta[note0]' {
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
        }
        file write `book' "</ul>"_n
    }
end

program define Descvars
    syntax, book(string)
    
    file write `book' "<h3>Variable list</h3>"
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' "<li>`var'"
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"'
        }
        file write `book' "</li>"_n
        if "`: char `var'[note0]'" != "" {
            file write `book' "<ul>" _n
            forvalues i = 1/`: char `var'[note0]'{
                file write `book' `"<li>`: char `var'[note`i']'</li>"'
            }
            file write `book' "</ul>" _n
        }
    }
    file write `book' "</ul>"
end    

program define Descvalues
    syntax, book(string) file(string)
    
    local N = _N
    foreach var of varlist * {
        file write `book' `"<h4>`var'"'
        local varlab : var label `var'
        if `"`varlab'"' != "`var'" {
            file write `book' `": `varlab'"'
        }
        file write `book' "</h4>"_n
        tempvar freq
        contract `var', freq(`freq')
        
        local vallab : val label `var'    
        if "`vallab'" != "" {
            tempfile freqtable
            rename `var' value
            qui save `freqtable'
            qui uselabel `vallab'
            qui merge 1:1 value using `freqtable'
            file write `book' "<table>"_n
            file write `book' "<tr>"
            file write `book' "<th>value</th>"
            file write `book' "<th>label</th>"
            file write `book' "<th>frequency</th>"
            file write `book' "</tr>"_n
            forvalues i = 1/`=_N' {
                file write `book' "<tr>"
                file write `book' `"<td>`=value[`i']'</td>"'
                file write `book' `"<td>`=label[`i']'</td>"'
                file write `book' `"<td>`=`freq'[`i']'</td>"'
                file write `book' "</tr>"_n
            }
            file write `book' "</table>"_n
        }
        else {
            capture confirm numeric variable `var'
            if _N <= 10 {
                file write `book' "<table>"_n
                file write `book' "<tr>"
                file write `book' "<th>value</th>"
                file write `book' "<th>frequency</th>"
                file write `book' "</tr>"_n
                forvalues i = 1/`=_N' {
                    file write `book' "<tr>"
                    file write `book' `"<td>`=`var'[`i']'</td>"'
                    file write `book' `"<td>`=`freq'[`i']'</td>"'
                    file write `book' "</tr>"_n
                }
                file write `book' "</table>"_n
            }
            else if !_rc {
                file write `book' "<table>" _n
                qui count if !missing(`var')
                local distinct = r(N)
                local fmt : format `var'                               //new
                qui sum `var' [fw=`freq'], detail
                file write `book' "<tr>"_n
                file write `book' `"<th>valid/missing obs.</th>"'
                file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>distinct values</th>"
                file write `book' "<td>`distinct'</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>minimum</th>"
                file write `book' "<td>`:display `fmt' `r(min)''</td>" //new
                file write `book' "</tr>"_n
                file write `book' "</table>" _n
            }
        
        }
        qui use `file', clear
    }

end

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `saving', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')
    Descvars, book(`book')
    Descvalues, book(`book') file(`using')    
    
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace


-------------------------------------------------------------------------------

htmlcodebook16.do



clear all

program define Descfile
    syntax using/, book(string)
    
    qui use "`using'", clear
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
        forvalues i = 1/`: char _dta[note0]' {
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
        }
        file write `book' "</ul>"_n
    }
end

program define Descvars
    syntax, book(string)
    
    file write `book' "<h3>Variable list</h3>"
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' "<li>`var'"
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"'
        }
        file write `book' "</li>"_n
        if "`: char `var'[note0]'" != "" {
            file write `book' "<ul>" _n
            forvalues i = 1/`: char `var'[note0]' {
                file write `book' `"<li>`: char `var'[note`i']'</li>"'
            }
            file write `book' "</ul>" _n
        }
    }
    file write `book' "</ul>"
end    

program define Descvalues
    syntax, book(string) file(string)
    
    local N = _N
    foreach var of varlist * {
        file write `book' `"<h4>`var'"'
        local varlab : var label `var'
        if `"`varlab'"' != "`var'" {
            file write `book' `": `varlab'"'
        }
        file write `book' "</h4>"_n
        tempvar freq
        contract `var', freq(`freq')
        
        local vallab : val label `var'    
        if "`vallab'" != "" {
            tempfile freqtable
            rename `var' value
            qui save `freqtable'
            qui uselabel `vallab'
            qui merge 1:1 value using `freqtable'
            file write `book' "<table>"_n
            file write `book' "<tr>"
            file write `book' "<th>value</th>"
            file write `book' "<th>label</th>"
            file write `book' "<th>frequency</th>"
            file write `book' "</tr>"_n
            forvalues i = 1/`=_N' {
                file write `book' "<tr>"
                file write `book' `"<td>`=value[`i']'</td>"'
                file write `book' `"<td>`=label[`i']'</td>"'
                file write `book' `"<td>`=`freq'[`i']'</td>"'
                file write `book' "</tr>"_n
            }
            file write `book' "</table>"_n
        }
        else {
            // Check the type first so that if / else if / else
            // form one uninterrupted chain.
            capture confirm numeric variable `var'
            if _N <= 10 {
                file write `book' "<table>"_n
                file write `book' "<tr>"
                file write `book' "<th>value</th>"
                file write `book' "<th>frequency</th>"
                file write `book' "</tr>"_n
                forvalues i = 1/`=_N' {
                    file write `book' "<tr>"
                    file write `book' `"<td>`=`var'[`i']'</td>"'
                    file write `book' `"<td>`=`freq'[`i']'</td>"'
                    file write `book' "</tr>"_n
                }
                file write `book' "</table>"_n
            }
            else if !_rc {
                file write `book' "<table>" _n
                qui count if !missing(`var')
                local distinct = r(N)
                local fmt : format `var'
                qui sum `var' [fw=`freq'], detail
                file write `book' "<tr>"_n
                file write `book' `"<th>valid/missing obs.</th>"'
                file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>distinct values</th>"
                file write `book' "<td>`distinct'</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>minimum</th>"
                file write `book' "<td>`:display `fmt' `r(min)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"                                //new
                file write `book' "<th>25th percentile</th>"            //new
                file write `book' "<td>`:display `fmt' `r(p25)''</td>"  //new
                file write `book' "</tr>"_n                             //new
                file write `book' "<tr>"                                //new
                file write `book' "<th>50th percentile</th>"            //new
                file write `book' "<td>`:display `fmt' `r(p50)''</td>"  //new
                file write `book' "</tr>"_n                             //new
                file write `book' "<tr>"                                //new
                file write `book' "<th>75th percentile</th>"            //new
                file write `book' "<td>`:display `fmt' `r(p75)''</td>"  //new
                file write `book' "</tr>"_n                             //new
                file write `book' "<tr>"                                //new
                file write `book' "<th>maximum</th>"                    //new
                file write `book' "<td>`:display `fmt' `r(max)''</td>"  //new
                file write `book' "</tr>"_n                                //new
                file write `book' "</table>" _n
            }
        
        }
        qui use `"`file'"', clear
    }

end

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `"`saving'"', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')
    Descvars, book(`book')
    Descvalues, book(`book') file(`using')    
    
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace

-------------------------------------------------------------------------------


htmlcodebook17.do



clear all

program define Descfile
    syntax using/, book(string)
    
    qui use "`using'", clear
    
    if `"`: data label'"' != "" {
        file write `book' `"<h2>`:data label'</h2>"' _n
    }
    if "`: char _dta[note0]'" != "" {
        file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
        forvalues i = 1/`: char _dta[note0]' {
            file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
        }
        file write `book' "</ul>"_n
    }
end

program define Descvars
    syntax, book(string)
    
    file write `book' "<h3>Variable list</h3>"
    file write `book' "<ul>"_n
    foreach var of varlist * {
        file write `book' "<li>`var'"
        if `"`: var label `var''"' != "`var'" {
            file write `book' `": `: var label `var''"'
        }
        file write `book' "</li>"_n
        if "`: char `var'[note0]'" != "" {
            file write `book' "<ul>" _n
            forvalues i = 1/`: char `var'[note0]' {
                file write `book' `"<li>`: char `var'[note`i']'</li>"'
            }
            file write `book' "</ul>" _n
        }
    }
    file write `book' "</ul>"
end    

program define Descvalues
    syntax, book(string) file(string)
    
    local N = _N
    foreach var of varlist * {
        file write `book' `"<h4>`var'"'
        local varlab : var label `var'
        if `"`varlab'"' != "`var'" {
            file write `book' `": `varlab'"'
        }
        file write `book' "</h4>"_n
        tempvar freq
        contract `var', freq(`freq')
        
        local vallab : val label `var'    
        if "`vallab'" != "" {
            tempfile freqtable
            rename `var' value
            qui save `freqtable'
            qui uselabel `vallab'
            qui merge 1:1 value using `freqtable'
            file write `book' "<table>"_n
            file write `book' "<tr>"
            file write `book' "<th>value</th>"
            file write `book' "<th>label</th>"
            file write `book' "<th>frequency</th>"
            file write `book' "</tr>"_n
            forvalues i = 1/`=_N' {
                file write `book' "<tr>"
                file write `book' `"<td>`=value[`i']'</td>"'
                file write `book' `"<td>`=label[`i']'</td>"'
                file write `book' `"<td>`=`freq'[`i']'</td>"'
                file write `book' "</tr>"_n
            }
            file write `book' "</table>"_n
        }
        else {
            // Check the type first so that if / else if / else
            // form one uninterrupted chain.
            capture confirm numeric variable `var'
            if _N <= 10 {
                file write `book' "<table>"_n
                file write `book' "<tr>"
                file write `book' "<th>value</th>"
                file write `book' "<th>frequency</th>"
                file write `book' "</tr>"_n
                forvalues i = 1/`=_N' {
                    file write `book' "<tr>"
                    file write `book' `"<td>`=`var'[`i']'</td>"'
                    file write `book' `"<td>`=`freq'[`i']'</td>"'
                    file write `book' "</tr>"_n
                }
                file write `book' "</table>"_n
            }
            else if !_rc {
                file write `book' "<table>" _n
                qui count if !missing(`var')
                local distinct = r(N)
                local fmt : format `var'
                qui sum `var' [fw=`freq'], detail
                file write `book' "<tr>"_n
                file write `book' `"<th>valid/missing obs.</th>"'
                file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>distinct values</th>"
                file write `book' "<td>`distinct'</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>minimum</th>"
                file write `book' "<td>`:display `fmt' `r(min)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>25th percentile</th>"
                file write `book' "<td>`:display `fmt' `r(p25)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>50th percentile</th>"
                file write `book' "<td>`:display `fmt' `r(p50)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>75th percentile</th>"
                file write `book' "<td>`:display `fmt' `r(p75)''</td>"
                file write `book' "</tr>"_n
                file write `book' "<tr>"
                file write `book' "<th>maximum</th>"
                file write `book' "<td>`:display `fmt' `r(max)''</td>"
                file write `book' "</tr>"_n                
                file write `book' "</table>" _n
            }
            else {                                                          //new
                file write `book' "<p>Variable `var' is a string variable," //new
                file write `book' "example values are:</p>"_n               //new
                file write `book' "<ul>"_n                                  //new
                forvalues i = 1/10 {                                        //new
                    file write `book' `"<li>`=`var'[`i']'</li>"'_n          //new
                }                                                           //new
                file write `book' "</ul>"_n                                 //new
            }                                                               //new
        
        }
        qui use `"`file'"', clear
    }

end

program define htmlcodebook
    version 14
    syntax using/ , SAVing(string) [replace title(string)]
    
    tempname book 
    file open `book' using `"`saving'"', write `replace'
    preserve
    
    file write `book' "<!DOCTYPE html>"_n"<html>"_n
    file write `book' "<style>"_n
    file write `book' "body {"_n
    file write `book' "width: 650px;"_n
    file write `book' "margin: auto;"_n
    file write `book' "}"_n
    file write `book' "</style>"_n
    file write `book' "<body>"_n
    
    if `"`title'"' != "" {
        file write `book' "<h1>`title'</h1>"_n
    }
    else {
        file write `book' "<h1>Codebook for `using'</h1>"_n
    }
    
    Descfile using "`using'", book(`book')
    Descvars, book(`book')
    Descvalues, book(`book') file(`using')    
    
    file write `book' "</body>"_n
    file write `book' "</html>"_n
    file close `book'
    di as txt "Output written to " `"{browse "`saving'"}"'    
    restore
end

cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace

-------------------------------------------------------------------------------
