Update : I just had to reproduce this comment found on The Air Vent, on the post where JeffId picked up the thread and did a more complete analysis of Antarctica. Incriminating? Not at all ….
From: Tom Wigley
To: Phil Jones
Date: Mon, 12 Dec 2005 15:16:28 -0700
Cc: Tim Osborn , Ben Santer
Why is there so much missing data for the South Pole? The period Jan 75 thru
Dec 90 is all missing except Dec 81, July & Dec 85, Apr 87, Apr & Sept 88,
Apr 89. Also, from and including Aug 2003 is missing.
Also — more seriously but correctable. The S Pole is just represented
by a single
box at 87.5S (N Pole ditto I suspect). This screws up area averaging. It
better to put the S Pole value in ALL boxes at 87.5S.
I have had to do this in my code — but you really should fix the ‘raw’
For area averages, the difference is between having the S Pole represent
region south of 85S, and having (as now) it represent one 72nd of this
is pretty obvious to me what is better.
This affects the impression of missing data too of course.
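To get a feel for how much the single-box representation matters, here is a minimal Python sketch. It assumes a 5-degree grid with 72 longitude boxes per latitude band, box areas proportional to the difference of the sines of the bounding latitudes, and an average taken only over boxes that have data. The temperatures (-50 at the pole, -30 in the next band) are made-up illustrative numbers, not values from any dataset, and the function names are my own.

```python
import math

N_LON = 72  # assumed: 5-degree boxes, 72 per latitude band


def box_weight(lat_south, lat_north, n_lon=N_LON):
    """Relative area of ONE grid box in the band between two latitudes."""
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return band / n_lon


def area_mean(boxes):
    """Area-weighted mean of (value, weight) pairs; missing boxes are absent."""
    total_weight = sum(w for _, w in boxes)
    return sum(v * w for v, w in boxes) / total_weight


w_pole = box_weight(-90, -85)   # boxes centred at 87.5S
w_next = box_weight(-85, -80)   # boxes centred at 82.5S

# Hypothetical values: every box at 82.5S reads -30, the pole reads -50.
next_band = [(-30.0, w_next)] * N_LON

# Pole value stored in a single box (as described in the email).
single_box = area_mean(next_band + [(-50.0, w_pole)])

# Pole value copied into ALL boxes at 87.5S (the suggested fix).
all_boxes = area_mean(next_band + [(-50.0, w_pole)] * N_LON)

print(f"single box: {single_box:.2f}, all boxes: {all_boxes:.2f}")
```

With these toy numbers the pole barely moves the average in the single-box case, while filling the whole band gives it a weight of about a quarter of the region south of 80S.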
Okay, so I can finally present some results of all the hours I’ve spent loading the GHCN Mean Temperatures dataset(s) into an SQL database. The procedure looked something like this :
- Load v2_mean and v2_mean_adj into database tables
- Create views for these that show the table, with an added column for AnnualMean, which is calculated AS (Jan + Feb + Mar + Apr + May + Jun + Jul + Aug + Sep + Oct + Nov + Dec) / 12. Since I have inserted NULL for months with missing values, the AnnualMean will also be NULL for these rows.
- Create a meta-table for each series, which contains the following : ID of the temperature station, number of rows for this series, first year, last year, linear regression slope using all the AnnualMean values from the unadjusted data and linear regression slope for all the AnnualMean values from the adjusted data
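For anyone who wants to check the steps above without a database, here is a rough Python equivalent. It is only a sketch of the same idea, not the SQL I actually ran: `annual_mean` and `ols_slope` are names invented for this example, missing months are represented as `None` to mimic NULL, and the slope is an ordinary least-squares fit of annual mean against year.

```python
def annual_mean(months):
    """Mean of 12 monthly values; None if any month is missing (like SQL NULL)."""
    if len(months) != 12 or any(m is None for m in months):
        return None
    return sum(months) / 12


def ols_slope(points):
    """Least-squares slope of (year, value) pairs, skipping missing values."""
    pts = [(x, y) for x, y in points if y is not None]
    n = len(pts)
    if n < 2:
        return None  # a single point has no trend
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        return None  # all points share the same year
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx


# Toy example: the year with a missing month drops out of the regression.
rows = {
    2000: [1.0] * 12,
    2001: [1.0] * 11 + [None],
    2002: [2.0] * 12,
}
yearly = [(year, annual_mean(months)) for year, months in sorted(rows.items())]
print(ols_slope(yearly))
```

The slope comes out in degrees per year, so a year with even one missing month contributes nothing to the fit.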
You probably see where this is going. By calculating the “slope” (the change in degrees per year) of the linear regression of all series, both for the unadjusted and the adjusted data, I started looking for series where the difference was as large as possible. Aggregating this data by country led me to start looking at Antarctica. There was a very large difference between the average “slope” of the unadjusted series and that of the adjusted series. And so, I continued.
- I added two columns to the meta-table, namely ControlPeriodAverage and AdjustedControlPeriodAverage. Into these I loaded the average temperature for each series between 1961-1990, in order to normalize all values.
- With the annual average temperature calculated for all series, I pulled out all unadjusted and adjusted series for Antarctica – and used the ControlPeriodAverage and AdjustedControlPeriodAverage columns from the metadata table to normalize all series (showing anomalies from the mean instead of absolute values)
- After averaging first the unadjusted series, then the adjusted series, and finally all the series excluded from the adjusted dataset, I found the following
(keep in mind that since this data is normalized with the mean from 1961-1990, the “slope” of the average can be interpreted as “Average anomaly from mean per year”, or in simpler terms the average amount of warming per year)
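The normalization step can be sketched in a few lines of Python. Again, this is an illustration rather than my actual queries: the function names are invented here, each series is assumed to be a dict of year to annual mean (with `None` for missing years), and a series with no data in 1961-1990 is simply dropped, which is an assumption of this sketch.

```python
def baseline_mean(series, start=1961, end=1990):
    """Average of the available annual means inside the control period."""
    vals = [v for yr, v in series.items()
            if start <= yr <= end and v is not None]
    return sum(vals) / len(vals) if vals else None


def anomalies(series, start=1961, end=1990):
    """Annual means expressed as departures from the control-period average."""
    base = baseline_mean(series, start, end)
    if base is None:
        return {}  # assumption: no baseline data, so the series is dropped
    return {yr: v - base for yr, v in series.items() if v is not None}


def per_century(slope_per_year):
    """Convert a slope in degrees per year to degrees per century."""
    return slope_per_year * 100


# Toy series (made-up values): the baseline is the mean of 1961 and 1990.
toy = {1960: -10.0, 1961: -9.0, 1990: -11.0, 1991: None}
print(anomalies(toy))
print(per_century(0.0122))
```

Because every series is shifted by its own control-period average, stations with very different absolute temperatures can be averaged together without the cold ones dominating.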
Okay, so here we have the unadjusted data, normalized for the period 1961-1990. Notice that the “slope” is 0.0122 degrees C / year, which would mean a trend of roughly +1.2 degrees Celsius per century. Sounds reasonable, and isn’t that far from all the other data we have for the 20th century warming. When I started to plot the adjusted data, there seemed to be very little of it, so I decided to query which series existed in the unadjusted data but not in the adjusted data. Here’s what I found :
Hmmm, that’s a LOT of removed data. And if you notice, the “slope” is only 0.0066 for the series that have been removed, so that must mean ….
Tadaaaaa! Of the original 110 data series, only 18 are left. The original 2700+ data points are down to around 600. And what do you know – the series shows a whopping slope of 0.0447, which would mean a trend of 4.47 degrees of warming per century! I am sorry boys and girls, but there simply is no way in HELL that you can “accidentally” remove all series that show less of an upward trend, and settle for 18 of the most upward trending series (thus raising the warming / century by more than 3 degrees!). I don’t know how they do things with the GHCN dataset, or who is responsible for this, but just like New Zealand this is pretty damning evidence that all the “adjustments” are done to deliberately corrupt data to cause specific trends.
Since this has taken all the spare time I’ve had for more than a week, and I don’t have a tip jar, you can simply give me a pat on the back and a “well done, old chap”, can’t you?
DISCLAIMER : I am not a climate scientist, nor do I work with statistics on a daily basis. Thus, to make sure that there are no mistakes, you should check the data yourself. I encourage everyone to do so. I can help with instructions, the spreadsheets I have used and possibly a database dump if I figure out how to send you one. Please check this yourself – and I welcome any corrections to my work. The reason the graphs look so crappy is that I’ve used OpenOffice Calc for drawing them, then copy-pasted them into the world’s greatest image processing software (Paint.exe), and thereafter saved them as .png files.