What is Data SGP?

Data SGP is a data analysis package for R, the free and open source programming environment. It runs on any operating system that supports R, provided the required software and hardware are available. Running SGP analyses requires some familiarity with R. Most of the effort goes into data preparation; once the data are in the right form, the analyses themselves are designed to be simple and straightforward. In our experience, errors in SGP calculations generally trace back to problems with data preparation, so it is very important that the data are prepared properly.

SGP is also the abbreviation for the Southern Great Plains observatory, an atmospheric research facility whose central site occupies 160 acres of cattle pasture and wheat fields southeast of Lamont, Oklahoma. The observatory hosts a number of in situ and remote-sensing instrument clusters spread over about 9,000 square miles in north-central Oklahoma and southern Kansas. Researchers use SGP data to better understand cloud and aerosol processes and to improve Earth system models.

The SGP was the first field measurement site established by the Atmospheric Radiation Measurement (ARM) user facility. As part of the mission to provide high-quality data, the site collects continuous observations of the atmosphere, which makes it a vital resource for developing better predictions of climate change.

‘Big data’ is a term used often in business today, primarily as a marketing buzzword. It refers to massive, complex, and varied datasets that exceed the capabilities of traditional database systems. SGP assembles unprecedented amounts of scientific information for the questions at hand, but compared with something like Facebook’s global interaction data, this is not big data.

SGP is accompanied by a data package called SGPdata, which, like SGP itself, is built for the free and open source R environment. The SGP package provides the classes and functions used in SGP analytics, and the SGPdata package provides example data to run them on. The classes and functions in SGP are based on the work of Dr. Damian Betebenner. They include his well-known catch-up and keep-up growth projections, which compare a student’s current growth to that of academic peers nationally who have similar achievement histories on Star assessments.

The SGPdata package includes 4 example data sets for use in SGP analyses. One of these, sgpData, holds data in the WIDE format used by lower-level SGP functions such as studentGrowthPercentiles and studentGrowthProjections. Two others, sgpData_LONG and sgptData_LONG, hold data in the LONG format used by higher-level SGP functions such as abcSGP, prepareSGP, and analyzeSGP. These data sets are updated each day so that when users select a specific date range during report customization, the growth trajectories and projections they see will be based on up-to-date information. The SGPdata package can be downloaded from the CRAN website.
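
To make the difference between the two layouts concrete, here is a minimal sketch in plain Python (not R, and not part of the SGP package). It assumes that WIDE means one row per student with a separate column for each year, and that LONG means one row per student per year. The column names (ID, GRADE_2022, SS_2022, YEAR, SCALE_SCORE) and the helper wide_to_long are hypothetical and chosen only for illustration; the actual columns in sgpData and sgpData_LONG may differ.

```python
# Hypothetical WIDE records: one row per student, one column per year.
# These column names are illustrative and are not taken from sgpData.
wide_rows = [
    {"ID": 1001, "GRADE_2022": 3, "GRADE_2023": 4, "SS_2022": 410, "SS_2023": 455},
    {"ID": 1002, "GRADE_2022": 3, "GRADE_2023": 4, "SS_2022": 388, "SS_2023": None},
]


def wide_to_long(rows, years):
    """Reshape one-row-per-student records into one row per student per year."""
    long_rows = []
    for row in rows:
        for year in years:
            score = row.get(f"SS_{year}")
            if score is None:
                # No score for this student in this year, so no LONG record.
                continue
            long_rows.append(
                {
                    "ID": row["ID"],
                    "YEAR": year,
                    "GRADE": row.get(f"GRADE_{year}"),
                    "SCALE_SCORE": score,
                }
            )
    return long_rows


long_rows = wide_to_long(wide_rows, [2022, 2023])
for record in long_rows:
    print(record)
```

In this sketch a missing score simply produces no LONG row, which is one reason the LONG layout can be easier to extend as new years of data arrive. Whether real SGP data should be handled this way depends on the data preparation rules for the analysis at hand.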