Numerical Algorithms Group

N-SEA Documentation: Stratified Sampling

If the data set can be split into strata (groups) that may affect the data that you are exploring, you can opt to preserve the proportion of results from each stratum in your sample, thus producing a Stratified sample. For example, if your data set has 20 boys and 30 girls, you may wish your sample to also include 40% boys and 60% girls. Sampling inside each stratum is then performed using random sampling, with or without replacement as specified below.
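The boys-and-girls example can be sketched in a few lines of Python. This is only an illustration of proportional stratified sampling, not N-SEA's own code: the function name `stratified_sample`, the `fraction` parameter and the use of `round` to turn a fraction into a whole number of rows are all assumptions made for the sketch.

```python
import random
from collections import defaultdict

def stratified_sample(rows, stratum_of, fraction, seed=None):
    """Draw the same fraction of rows from every stratum, without replacement.

    Hypothetical helper for illustration only; rounding each stratum's
    share with round() is an assumption, not documented N-SEA behaviour.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[stratum_of(row)].append(row)   # split the rows into strata

    sample = []
    for members in groups.values():
        k = round(len(members) * fraction)    # rows to take from this stratum
        sample.extend(rng.sample(members, k)) # simple random sample inside it
    return sample

# 20 boys and 30 girls, as in the example above.
pupils = [("boy", i) for i in range(20)] + [("girl", i) for i in range(30)]
sample = stratified_sample(pupils, lambda row: row[0], 0.20, seed=1)

counts = {label: sum(1 for row in sample if row[0] == label)
          for label in ("boy", "girl")}
print(counts)  # {'boy': 4, 'girl': 6}
```

The sample of 10 contains 4 boys and 6 girls, so the 40%/60% split of the full data set is preserved.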

To obtain a stratified sample, open the initial dialogue box by choosing 'Statistics' then 'Sampling' from the N-SEA menu, and choose the 'Stratified' option. The dialogue box then looks like the diagram below:

samplingstratefied.jpg

Select the data to be sampled by clicking on the 'Data to be sampled' box and then selecting the cells on an Excel worksheet. If you only wish to apply a filter, you need not select every column of the data; selecting the relevant columns is enough. However, if you wish to copy the sample to a new location, you must select all of the columns that comprise the data.

Choose one of the options to indicate whether the sample should be chosen with or without replacement. The default is, as indicated, to 'Sample without replacement'.
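The difference between the two options can be illustrated with Python's standard `random` module. This is a sketch of the general idea rather than of N-SEA's implementation; the helper name `draw` is hypothetical.

```python
import random

def draw(members, k, with_replacement, rng):
    """Draw k items from one stratum (hypothetical helper for illustration)."""
    if with_replacement:
        # Each draw is independent, so the same row may be picked repeatedly
        # and k may exceed the number of rows in the stratum.
        return rng.choices(members, k=k)
    # Each row can be picked at most once, so k cannot exceed len(members).
    return rng.sample(members, k)

rng = random.Random(0)
stratum = [1, 2, 3, 4, 5]

without = draw(stratum, 5, False, rng)    # a permutation of the five rows
with_repl = draw(stratum, 8, True, rng)   # eight draws, so some rows repeat

print(sorted(without))           # [1, 2, 3, 4, 5]
print(len(with_repl))            # 8
```

Sampling without replacement never returns the same row twice, whereas sampling with replacement can, which is why eight draws from five rows are possible only in the second case.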

Click on the 'Groups/Strata defined by' box and select the columns that contain the group or stratum labels. For example, these might be the words "Male" and "Female", or the numerical values "1", "2", "3", "4" and "5".
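As a rough picture of how label columns partition the rows, the sketch below counts the rows in each stratum. The column names (`sex`, `year`, `score`) and the helper `stratum_key` are invented for the example. It also assumes that, when more than one label column is selected, each distinct combination of labels forms its own stratum; this page does not state how N-SEA treats that case.

```python
from collections import Counter

def stratum_key(row, label_columns):
    """Return the tuple of labels that identifies the row's stratum."""
    return tuple(row[column] for column in label_columns)

# Hypothetical worksheet rows: two label columns and one data column.
rows = [
    {"sex": "Male", "year": 1, "score": 61},
    {"sex": "Male", "year": 2, "score": 74},
    {"sex": "Female", "year": 1, "score": 68},
    {"sex": "Female", "year": 1, "score": 80},
    {"sex": "Male", "year": 1, "score": 55},
    {"sex": "Female", "year": 2, "score": 77},
]

# One label column gives two strata; two label columns give four
# (assuming each combination of labels is a separate stratum).
by_sex = Counter(stratum_key(row, ["sex"]) for row in rows)
by_sex_and_year = Counter(stratum_key(row, ["sex", "year"]) for row in rows)

print(len(by_sex))           # 2
print(len(by_sex_and_year))  # 4
```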

In the 'Take a sample of' box, type a number to represent the sample size. The default value is 10. The number may be treated in one of two ways: as the 'percent of observations (rows) per group' or as the number of 'observations (rows), in total'. Choose the required option; the default is a percentage of each group. The choice restricts the number that may be entered: a percentage must be greater than 0 and less than or equal to 100, whereas a total number of rows must be positive. If the total exceeds the number of data items, the whole of the data is returned as the sample.
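The two interpretations of the number can be compared with a small sketch. It assumes that a total is shared among the strata in proportion to their sizes (in line with the introduction) and that fractional row counts are rounded with `round`; neither detail is stated on this page. The function name `rows_per_stratum` and the `mode` values are hypothetical.

```python
def rows_per_stratum(stratum_sizes, amount, mode="percent"):
    """Return how many rows to draw from each stratum.

    Illustrative only: proportional allocation and round() are assumptions.
    """
    total = sum(stratum_sizes.values())
    if mode == "percent":
        if not 0 < amount <= 100:
            raise ValueError("a percentage must be > 0 and <= 100")
        fraction = amount / 100
    elif mode == "total":
        if amount <= 0:
            raise ValueError("the number of rows must be positive")
        # A request larger than the data set returns the whole data set.
        fraction = min(amount, total) / total
    else:
        raise ValueError("mode must be 'percent' or 'total'")
    return {label: round(size * fraction)
            for label, size in stratum_sizes.items()}

sizes = {"boys": 20, "girls": 30}
print(rows_per_stratum(sizes, 10, "percent"))  # {'boys': 2, 'girls': 3}
print(rows_per_stratum(sizes, 10, "total"))    # {'boys': 4, 'girls': 6}
print(rows_per_stratum(sizes, 80, "total"))    # {'boys': 20, 'girls': 30}
```

With the default value of 10, the percentage option yields 5 rows here while the total option yields 10. With other stratum sizes, rounding may make the rows drawn differ slightly from the number requested.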

If you press the 'Next' button then the familiar output option dialogue box appears. The 'Finish' button accepts the default output options illustrated below.

samplingoutputoptions.jpg

This output box and the options have been described in greater detail under 'Random Sampling'.

© The Numerical Algorithms Group 2008

http://www.nag.com/n-sea/docs/stratifiedsampling.asp