Bruce's Blog | The US Temperature Record 4: Data Flags

Each value in the data comes with flags you should understand. The definitions below are from the USHCN Status file.

Data Measurement Flag

                  blank = no measurement information applicable
                  a-i = number of days missing in calculation of monthly mean
                        temperature 
                  E = The value is estimated using values from surrounding
                      stations because a monthly value could not be computed
                      from daily data; or,
                      the pairwise homogenization algorithm removed the value
                      because of too many apparent inhomogeneities occurring
                      close together in time.
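
As a rough sketch, the measurement flag could be decoded in Python like this. The function name `days_missing` is hypothetical, and the code assumes the letters a through i stand for 1 through 9 missing days, which the definition implies but does not spell out. It also assumes a blank flag can be treated as zero days missing, and it returns None for E because that value was estimated rather than computed from daily data.

```python
def days_missing(dmflag):
    """Decode a data measurement flag (hypothetical helper, not part of USHCN).

    Assumes 'a'..'i' mean 1..9 days missing. Returns 0 for a blank flag
    and None for 'E' (estimated value, so no day count applies).
    """
    flag = dmflag.strip()
    if flag == "":
        return 0
    if flag == "E":
        return None
    if len(flag) == 1 and "a" <= flag <= "i":
        return ord(flag) - ord("a") + 1
    raise ValueError(f"unrecognized data measurement flag: {dmflag!r}")


print(days_missing("c"))
```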

Quality Control Flag

        
                  BLANK = no failure of quality control check or could not be
                          evaluated.

                  D = monthly value is part of an annual series of values that
                      are exactly the same (e.g. duplicated) within another
                      year in the station's record.

                  I = checks for internal consistency between TMAX and TMIN. 
                      Flag is set when TMIN > TMAX for a given month. 

                  L = monthly value is isolated in time within the station
                      record, and this is defined by having no immediate non-
                      missing values 18 months on either side of the value.

                  M = Manually flagged as erroneous.

                  O = monthly value that is >= 5 bi-weight standard deviations
                      from the bi-weight mean.  Bi-weight statistics are
                      calculated from a series of all non-missing values in 
                      the station's record for that particular month.

                  S = monthly value has failed spatial consistency check.
                      Any value found to be between 2.5 and 5.0 bi-weight
                      standard deviations from the bi-weight mean is more
                      closely scrutinized by examining the 5 closest neighbors
                      (not to exceed 500.0 km) and determining their associated
                      distribution of respective z-scores.  At least one of
                      the neighbor stations must have a z-score with the same
                      sign as the target and its z-score must be greater than
                      or equal to the z-score listed in column B (below),
                      where column B is expressed as a function of the target
                      z-score ranges (column A). 

                                  ---------------------------- 
                                       A       |        B
                                  ----------------------------
                                    4.0 - 5.0  |       1.9
                                  ----------------------------
                                    3.0 - 4.0  |       1.8
                                  ----------------------------
                                   2.75 - 3.0  |       1.7
                                  ----------------------------
                                   2.50 - 2.75 |       1.6
                                     


                  W = monthly value is duplicated from the previous month,
                      based upon regional and spatial criteria and is only 
                      applied from the year 2000 to the present.                   

                  Quality Controlled Adjusted (QCA) QC Flags:

                  A = alternative method of adjustment used.
 
                  M = values with a non-blank quality control flag in the "qcu"
                      dataset are set to missing in the adjusted dataset and
                      given an "M" quality control flag.
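
To make the O and S definitions above more concrete, here is a minimal Python sketch of how those two checks might work. It is not NOAA's code. The bi-weight formulas are the commonly published ones, and the tuning constant c = 7.5 is an assumption, since the Status file does not give it. The names `biweight_mean_std`, `S_THRESHOLDS` and `qc_flag` are hypothetical. The sketch also assumes each range in column A includes its lower bound, and that a value with no neighbors at all "could not be evaluated" and so stays blank.

```python
import math
from statistics import median


def biweight_mean_std(values, c=7.5):
    """Bi-weight mean and standard deviation of a list of numbers.

    c=7.5 is an assumed tuning constant; the Status file does not state it.
    """
    m = median(values)
    mad = median(abs(x - m) for x in values)
    if mad == 0:
        raise ValueError("MAD is zero; bi-weight statistics are undefined")
    # Keep (deviation, u squared); points with |u| >= 1 get zero weight.
    pts = [(x - m, ((x - m) / (c * mad)) ** 2) for x in values]
    pts = [(d, u2) for d, u2 in pts if u2 < 1.0]
    mean = m + (sum(d * (1 - u2) ** 2 for d, u2 in pts)
                / sum((1 - u2) ** 2 for _, u2 in pts))
    num = math.sqrt(len(values) * sum(d * d * (1 - u2) ** 4 for d, u2 in pts))
    den = abs(sum((1 - u2) * (1 - 5 * u2) for _, u2 in pts))
    return mean, num / den


# (lower bound of target |z| from column A, required neighbor |z| from column B)
S_THRESHOLDS = [(4.0, 1.9), (3.0, 1.8), (2.75, 1.7), (2.5, 1.6)]


def qc_flag(target_z, neighbor_zs):
    """Return 'O', 'S', or '' for one monthly value, given its z-score and
    the z-scores of its neighbors (caller picks up to 5 within 500 km)."""
    z = abs(target_z)
    if z >= 5.0:
        return "O"
    if not neighbor_zs:
        return ""  # assumption: nothing to compare against, so not evaluated
    sign = math.copysign(1.0, target_z)
    for lower, required in S_THRESHOLDS:
        if z >= lower:
            # A neighbor supports the target if it has the same sign and
            # is at least `required` bi-weight standard deviations out.
            supported = any(nz * sign >= required for nz in neighbor_zs)
            return "" if supported else "S"
    return ""


# Made-up Januaries for one station; the last value is an obvious outlier.
januaries = [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 9.9, 10.3, 9.7, 30.0]
mean, std = biweight_mean_std(januaries)
print(qc_flag((30.0 - mean) / std, []))
```

Calling `qc_flag(3.5, [1.0, -2.0])` gives S, because neither neighbor reaches 1.8 on the same side as the target, while `qc_flag(3.5, [1.9])` passes.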

Data Source Flag

                  Blank = Value was computed from daily data available in GHCN-Daily

                  Not Blank = Daily data are not available so the monthly value was
                              obtained from the USHCN version 1 dataset.  The possible
                              Version 1 DSFLAGS are as follows:

                              1 =  NCDC Tape Deck 3220, Summary of the Month Element Digital File
                              2 =  Means Book - Smithsonian Institute, C.A. Schott (1876, 1881 thru 1931)
                              3 =  Manuscript - Original Records, National Climatic Data Center
                              4 =  Climatological Data (CD), monthly NCDC publication
                              5 =  Climate Record Book, as described in History of Climatological Record
                                   Books, U.S. Department of Commerce, Weather Bureau, USGPO (1960)
                              6 =  Bulletin W - Summary of the Climatological Data for the United States (by
                                   section), F.H. Bigelow, U.S. Weather Bureau (1912); and, Bulletin W -
                                   Summary of the Climatological Data for the United States, 2nd Ed.
                              7 =  Local Climatological Data (LCD), monthly NCDC publication
                              8 =  State Climatologists, various sources
                              B =  Professor Raymond Bradley - Refer to Climatic Fluctuations of the Western
                                   United States During the Period of Instrumental Records, Bradley, et al.,
                                   Contribution No. 42, Dept. of Geography and Geology, University of
                                   Massachusetts (1982)
                              D =  Dr. Henry Diaz, a compilation of data from Bulletin W, LCD, and NCDC Tape
                                   Deck 3220 (1983)
                              G =  Professor John Griffiths - primarily from Climatological Data

I ignore most of these flags, because they either won't change the annual averages or they reflect an opinion. The only flag I care about is the I flag, which means the reported TMIN is larger than the TMAX for that month. I'll report later how many flags there are in the example dataset I'll use.
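
The I check is simple enough to reproduce without relying on the flag itself. Here is a minimal Python sketch, assuming the monthly TMAX and TMIN values have already been read into two parallel lists with None for missing months. The function name and the numbers are made up for illustration.

```python
def tmin_exceeds_tmax(tmax, tmin):
    """Return the indices of months where both values exist and TMIN > TMAX."""
    return [
        i
        for i, (hi, lo) in enumerate(zip(tmax, tmin))
        if hi is not None and lo is not None and lo > hi
    ]


# Made-up monthly means in degrees C; None marks a missing month.
tmax = [5.1, 7.3, None, 18.2, 21.0]
tmin = [-2.0, 8.1, 3.3, 9.4, 22.5]

flagged = tmin_exceeds_tmax(tmax, tmin)
print(len(flagged), flagged)
```

Months with a missing TMAX or TMIN are skipped rather than counted, since there is nothing to compare.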

Next up: Part 5: Preparing the Data
