NAG Schools Excel Add-In: Using Excel to Draw Statistically Correct Graphs
By Ed Carlin, Wallingford School, Oxfordshire, UK
Ever since the introduction of a statistical coursework task at GCSE level a much greater emphasis has been placed on the interpretation of data as opposed to the construction of a simple graph. A pupil must decide what diagrams will be useful and then utilise them to argue either in favour or against an original hypothesis.
Although Microsoft Excel is able to return pre-programmed statistical measures for comparing data including the mean [=AVERAGE(A1:A5)], the median [=MEDIAN(A1:A5)], the range [=MAX(A1:A5)-MIN(A1:A5)] and the standard deviation [=STDEVPA(A1:A5)] it's ability to pictorially represent the data to a standard required at Intermediate GCSE level has been seriously underdeveloped.
NAG Schools Excel Add-In (N-SEA) is an Excel Add-in, accessible through a drop down menu on the Excel toolbar, that allows the user - a pupil or the teacher - to tackle the areas of data representation that were previously unavailable. This allows the class - either individually or led from the front - to focus its energy and discussion on the important analysis without having wasted time and motivation on hand-drawing the many graphs available.
The Data Handling Task
Suppose that students are to analyse the differences between the heights of boys and girls in year 9, hypothesising that at this age boys are taller than girls. After having painstakingly produced a sample through either random number generation or 'hat picking' pupils would be expected to laboriously tally their results into two frequency tables, produce comparable frequency diagrams, calculate cumulative frequencies, draw cumulative frequency curves and then construct box-whisker plots (all of which is tested in the appropriate written papers). Only then would they begin comparison of the two groups.
Although in a real life situation we would first require a sample to be selected from a suitable sampling frame and only necessary data collected it has become a generally accepted alternative to conduct (or research) a census or large sample and use only a small sample of this more complete data set. By highlighting the data N-SEA allows you to form a stratified, random or systematic sample of the data, exporting it to a new spreadsheet as required.
The menu is intuitive in its layout and therefore very simple to use, the process only taking a fraction of the time it took previously. More importantly, we are using computers to save us time and do automated jobs that they would be expected to perform in the 'real world'.
Naturally, the sampling function could be used to pick a sample before data collection and only the necessary data be obtained by the student. With computers able to cope with data sets exceeding the suggested sample size of 30 it is counter intuitive that we sample from a complete data set and this acts solely as a hoop jumping exercise!
Excels best attempt at a frequency diagram is both relatively difficult to obtain and statistically unsatisfactory. The horizontal axis should be drawn as a scale, the continuous nature of the data meaning the bars will touch, which leads nicely onto histograms and unequal group-widths.
By highlighting the necessary column of raw data a frequency diagram can be drawn and the corresponding frequency table also displayed - previously in Excel, tallying results was not automatic. Again, an intuitive menu structure allows the user to change, amongst other things, the group sizes, the number of groups, their boundaries and the axis titles. The diagram that results is pleasingly accurate.
Histograms can also be drawn by defining the group boundaries on raw data or selecting a pre-drawn frequency table directly.
Excel's ability to cope with continuous data is poor. For a teacher, the fact that it does not draw box plots is very disappointing and limits its effect use in the classroom.
As can be seen from the diagram, the pupils are still able to see how the box plot drawn in Excel (using N-SEA) was constructed from the cumulative frequency curve.
Another Stats Package
The ability to pictorially represent data quickly and easily is not restricted in its use to coursework. It is possible to collect data from the class at the beginning of a lesson and, after producing the relevant graphs in front of them, have twenty or more comparisons of boys and girls from every pupil by the bell. The ability to use "real data" increases the interest and relevance of the techniques being discussed.
Alternatively, the functionality can be used simply to display "right answers" to allow pupils to check their work. In the discussion that follows though, data can be altered and the effect that has on the graph can be quickly seen.
Obviously there are many statistical packages on the market that offer some or all of this functionality. However, pupils have found using N-SEA is 'second nature' as it sits within Excel. Equally, and perhaps more surprisingly, so have I!
Future releases of N-SEA promise to address scatter diagrams and comparison diagrams (using frequency polygons) and much more. The time saved doing GCSE coursework, whilst retaining the understanding, will be huge.
By Ed Carlin. Ed is an Advanced Skills Teacher in Mathematics at the Wallingford School in Oxfordshire, UK. He can be contacted at firstname.lastname@example.org.