GHCN Antarctica : Careful selection of data

Update : I just had to reproduce this comment found on The Air Vent, on the post where JeffId picked up the thread and did a more complete analysis of Antarctica. Incriminating? Not at all ….

From: Tom Wigley
To: Phil Jones
Subject: HadCRUT2v
Date: Mon, 12 Dec 2005 15:16:28 -0700
Cc: Tim Osborn , Ben Santer

Phil,

Why is there so much missing data for the South Pole? The period Jan 75 thru
Dec 90 is all missing except Dec 81, July & Dec 85, Apr 87, Apr & Sept 88,
Apr 89. Also, from and including Aug 2003 is missing.

Also -- more seriously but correctable. The S Pole is just represented by a single box at 87.5S (N Pole ditto I suspect). This screws up area averaging. It would be better to put the S Pole value in ALL boxes at 87.5S.

I have had to do this in my code -- but you really should fix the ‘raw’ gridded data.

For area averages, the difference is between having the S Pole represent the whole region south of 85S, and having (as now) it represent one 72nd of this region. It is pretty obvious to me what is better.

This affects the impression of missing data too of course.

Tom.
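
To make the area-averaging point in the email concrete, here is a minimal Python sketch. It assumes a 5-degree grid (72 boxes around each latitude circle, which is where "one 72nd" comes from) and the common cos(latitude) approximation for box area. The function names and the temperature values are made up for illustration; this is not Wigley's or CRU's actual code.

```python
import math

N_LON = 72  # number of 5-degree boxes around one latitude circle


def box_weight(lat_center_deg):
    """Approximate relative area of a grid box by cos(latitude of its center)."""
    return math.cos(math.radians(lat_center_deg))


def area_weighted_mean(boxes):
    """boxes: list of (lat_center_deg, value) pairs for boxes that hold data."""
    total_weight = sum(box_weight(lat) for lat, _ in boxes)
    return sum(box_weight(lat) * value for lat, value in boxes) / total_weight


# Made-up anomalies: 1.0 everywhere in the 82.5S band, 0.0 at the S Pole.
band_825 = [(-82.5, 1.0)] * N_LON
pole_single = [(-87.5, 0.0)]          # S Pole stored in one box only
pole_filled = [(-87.5, 0.0)] * N_LON  # S Pole copied into all 72 boxes

mean_single = area_weighted_mean(band_825 + pole_single)
mean_filled = area_weighted_mean(band_825 + pole_filled)

# The pole value carries 72 times more weight in the second case,
# so it pulls the regional average much further toward 0.0.
print(f"single box: {mean_single:.3f}, all boxes: {mean_filled:.3f}")
```

With a single box the pole value barely moves the regional average; copied into all 72 boxes it counts for the whole band south of 85S.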

 

Okay, so I can finally present some results of all the hours I’ve spent loading the GHCN Mean Temperatures dataset(s) into an SQL database. The procedure looked something like this :

  • Load v2_mean and v2_mean_adj into database tables
  • Create views for these that show the table, with an added column for AnnualMean, which is calculated AS (Jan + Feb + Mar + Apr + May + Jun + Jul + Aug + Sep + Oct + Nov + Dec) / 12. Since I have inserted NULL for months with missing values, the AnnualMean will also be NULL for these rows.
  • Create a meta-table with one row per series, containing the following : ID of the temperature station, number of rows for this series, first year, last year, linear regression slope using all the AnnualMean values from the unadjusted data, and linear regression slope using all the AnnualMean values from the adjusted data
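
The AnnualMean view described above can be sketched with Python's built-in sqlite3 module. Only the names v2_mean and AnnualMean come from the steps above; the StationID and Year columns, the view name v2_mean_annual and the sample values are assumptions for illustration, and the real GHCN file layout is more involved than this.

```python
import sqlite3

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

conn = sqlite3.connect(":memory:")
month_columns = ", ".join(f"{m} REAL" for m in MONTHS)
conn.execute(f"CREATE TABLE v2_mean (StationID TEXT, Year INTEGER, {month_columns})")

# In SQL, NULL + anything is NULL, so one missing month makes AnnualMean NULL.
month_sum = " + ".join(MONTHS)
conn.execute(
    "CREATE VIEW v2_mean_annual AS "
    f"SELECT *, ({month_sum}) / 12.0 AS AnnualMean FROM v2_mean"
)

complete_year = [-10.0 + m for m in range(12)]  # made-up monthly values
year_with_gap = complete_year.copy()
year_with_gap[6] = None                         # missing July, stored as NULL

placeholders = ", ".join(["?"] * 14)
conn.executemany(
    f"INSERT INTO v2_mean VALUES ({placeholders})",
    [("STATION_A", 1980, *complete_year), ("STATION_A", 1981, *year_with_gap)],
)

# Map each year to its AnnualMean; the year with a gap comes back as None.
annual_means = dict(conn.execute("SELECT Year, AnnualMean FROM v2_mean_annual"))
print(annual_means)
```

Letting NULL propagate is a simple way to make sure incomplete years never contribute a misleading annual mean, at the cost of discarding years with even one missing month.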

You probably see where this is going. Having calculated the “slope” of the linear regression for every series, both unadjusted and adjusted, I started looking for series where the difference between the two was as large as possible. Aggregating this data by country led me to Antarctica, where there was a very large difference between the average “slope” of the unadjusted series and that of the adjusted series. And so, I continued.
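
For readers who want to reproduce the slope comparison, here is a minimal sketch of an ordinary least-squares slope over annual means, skipping missing years. The function name and the sample numbers are made up; the post does not say how the regression was actually computed in the database, so treat this as one plausible way to do it.

```python
def regression_slope(years, values):
    """Ordinary least-squares slope of values against years, skipping None."""
    pairs = [(x, y) for x, y in zip(years, values) if y is not None]
    if len(pairs) < 2:
        return None
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        return None  # all points share the same year, slope undefined
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sxx


years = [1960, 1961, 1962, 1963, 1964]
unadjusted = [-20.0, -19.9, None, -19.7, -19.6]  # made up: rises 0.1 per year
adjusted = [-20.4, -20.1, -19.8, -19.5, -19.2]   # made up: rises 0.3 per year

slope_unadjusted = regression_slope(years, unadjusted)
slope_adjusted = regression_slope(years, adjusted)

# A large positive difference means the adjusted series trends upward faster.
slope_difference = slope_adjusted - slope_unadjusted
print(slope_unadjusted, slope_adjusted, slope_difference)
```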

  • I added two columns to the meta-table, namely ControlPeriodAverage and AdjustedControlPeriodAverage. Into these I loaded the average temperature for each series over 1961-1990, in order to normalize all values.
  • With the annual average temperature calculated for all series, I pulled out all unadjusted and adjusted series for Antarctica – and used the ControlPeriodAverage and AdjustedControlPeriodAverage columns from the metadata table to normalize all series (showing anomalies from the mean instead of absolute values)
  • After averaging first the unadjusted series, then the adjusted series, and finally all the series excluded from the adjusted dataset, I found the following

(keep in mind that since this data is normalized with the mean from 1961-1990, the “slope” of the average can be interpreted as the average change in anomaly per year, or in simpler terms the average amount of warming per year)
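
The normalization step can be sketched as follows. This is a minimal Python illustration under stated assumptions: each series is a dict of year to annual mean, the helper names are hypothetical, and the two stations are invented. It is not the SQL actually used for the graphs.

```python
BASE_START, BASE_END = 1961, 1990


def control_period_average(series):
    """Mean of the non-missing annual values in 1961-1990, or None if there are none."""
    values = [v for year, v in series.items()
              if BASE_START <= year <= BASE_END and v is not None]
    return sum(values) / len(values) if values else None


def to_anomalies(series):
    """Subtract the series' own 1961-1990 mean from every non-missing value."""
    base = control_period_average(series)
    if base is None:
        return {}
    return {year: v - base for year, v in series.items() if v is not None}


def average_by_year(anomaly_series):
    """Average several anomaly series year by year."""
    by_year = {}
    for series in anomaly_series:
        for year, value in series.items():
            by_year.setdefault(year, []).append(value)
    return {year: sum(vals) / len(vals) for year, vals in sorted(by_year.items())}


# Two made-up stations: very different absolute temperatures, same 0.01/yr trend.
coastal = {year: -10.0 + 0.01 * (year - 1961) for year in range(1961, 2001)}
inland = {year: -50.0 + 0.01 * (year - 1961) for year in range(1961, 2001)}

mean_anomaly = average_by_year([to_anomalies(coastal), to_anomalies(inland)])
print(mean_anomaly[1961], mean_anomaly[2000])
```

Because each station is measured against its own 1961-1990 mean, a cold inland station and a milder coastal one can be averaged without the absolute temperature difference swamping the trend. A slope of s degrees per year on the averaged anomalies then corresponds to 100 × s degrees per century.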

Original data

Okay, so here we have the unadjusted data, normalized for the period 1961-1990. Notice that the “slope” is 0.0122 degrees C / year, which would mean a trend of roughly +1 degree Celsius during the last century. Sounds reasonable, and isn’t that far from the other data we have for the 20th century warming. When I started to plot the adjusted data, there seemed to be very little of it, so I decided to query which series existed in the unadjusted data but not in the adjusted data. Here’s what I found :

Removed temperature series

Hmmm, that’s a LOT of removed data. And if you notice, the “slope” is only 0.0066 for the series that have been removed, so that must mean ….

Remaining series, adjusted

Tadaaaaa! Of the original 110 dataseries, only 18 are left. The original 2700+ datapoints are down to around 600. And what do you know – the series shows a whopping slope of 0.0447 which would mean a trend of 4.47 degrees of warming per century! I am sorry boys and girls, but there simply is no way in HELL that you can “accidentally” remove all series that show less of an upward trend, and settle for 18 of the most upward trending series (thus raising the warming / century by 3 degrees!). I don’t know how they do things with the GHCN dataset, or who is responsible for this, but just like New Zealand this is pretty damning evidence that all the “adjustments” are done to deliberately corrupt data to cause specific trends.
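
For anyone checking the series counts themselves, the query for series that exist in the unadjusted data but not in the adjusted data can be sketched like this, again using Python's sqlite3. The table names v2_mean and v2_mean_adj come from the procedure above; the simplified three-column layout, the StationID column name and the rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE v2_mean (StationID TEXT, Year INTEGER, AnnualMean REAL)")
conn.execute("CREATE TABLE v2_mean_adj (StationID TEXT, Year INTEGER, AnnualMean REAL)")

# Made-up rows: stations A, B and C are unadjusted, only A survives adjustment.
conn.executemany("INSERT INTO v2_mean VALUES (?, ?, ?)", [
    ("A", 1980, -10.0), ("A", 1981, -9.9),
    ("B", 1980, -30.0), ("B", 1981, -30.1),
    ("C", 1980, -20.0),
])
conn.executemany("INSERT INTO v2_mean_adj VALUES (?, ?, ?)", [
    ("A", 1980, -10.2), ("A", 1981, -9.9),
])

# Series present in the unadjusted table but absent from the adjusted one.
removed = [row[0] for row in conn.execute(
    "SELECT DISTINCT StationID FROM v2_mean "
    "WHERE StationID NOT IN (SELECT StationID FROM v2_mean_adj) "
    "ORDER BY StationID"
)]

# Number of series and of non-NULL data points in each table.
counts = {
    table: conn.execute(
        f"SELECT COUNT(DISTINCT StationID), COUNT(AnnualMean) FROM {table}"
    ).fetchone()
    for table in ("v2_mean", "v2_mean_adj")
}
print(removed, counts)
```

Running the same kind of query against the full tables is how one would verify the series and data point counts quoted above.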

Since this has taken all the spare time I’ve had for more than a week, and I don’t have a tip jar, you can simply give me a pat on the back and a “well done, old chap”, can’t you?

DISCLAIMER : I am not a climate scientist, nor do I work with statistics on a daily basis. Thus, to guarantee that there are no mistakes you should check the data yourself. I encourage everyone to do so. I can help with instructions, spreadsheets I have used and possibly a database dump if I figure out how to send you one. Please check this yourself – and I welcome any corrections to my work. The reason the graphs look so crappy is that I’ve used OpenOffice Calc for drawing them, then copy-pasted them into the world’s greatest image processing software (Paint.exe), and thereafter saved them as .png files.

This entry was posted in Climate Watch.

37 Responses to GHCN Antarctica : Careful selection of data

  1. Bohemond says:

    Good work. I’ve given you an HT to Anthony Watts

    • hpx83 says:

      Very much appreciated. This blog is temporarily masquerading as an amateur climate hacking station, and will do so until this thing is over.

  2. another sql guy says:

    wow. great work. can you put this in context? who was using the adjusted GHCN dataset?

  3. Nicely done. I will be taking a look at this shortly. If correct it certainly raises ummm issues….

    Why does this not surprise me?

  4. Jeff Id says:

    This is a very interesting analysis, I don’t know if you’re aware but at tAV we put a lot of work into a Steig Antarctic reconstruction and hope to publish a paper on it soon.

    When you say ‘removed series’, are you saying these series contain no adjustment column from the GHCN database?

    Can you give a link to a single example so I can understand which are removed and how it looks. Your post could make a hell of a lot of trouble if it’s correct.

  5. Pingback: GHCN – Antarctic Warming Eight Times Actual « the Air Vent

  6. Pingback: The Air Vent picks up the thread « Save Capitalism

  7. Third Party says:

    I’d be interested in what the trends looked like by month of the year. Especially in the SH, one might think that the CO2 CAGW hypothesis would result in an equal trend for each of the months.

    • hpx83 says:

      I’ll see if I can fix that for you. May be some issues with missing values, but I’ll try and look into it. Check back in a week or so, and I may have something for ya.

  8. Arnold says:

    hpx83, great work. And it seems to me that you explained your work well enough to follow it. I hope that the MSM will finally take their head out of the a** and start to investigate the matter themselves.

    • hpx83 says:

      Don’t count on it. The sad thing is that most people cannot set up an SQL database and run SQL queries themselves – it’s mostly us IT nerds who know how to do that. I am planning to make the little software I use to import all the text files into the SQL database available, but currently I just add code every time I find a new type of file I need to import, so I won’t do that until there is some sort of workable interface for it.

      • Arnold says:

        I don’t think that the journalists are going to set up their own database. But I would like to see some journalists looking further than what is said on, for example, RC.org and start to look elsewhere for their info. And force the scientists to answer. My opinion when this whole thing started was that it had to be a conspiracy, but it seems that the journalists simply don’t know what to do. Mostly right- or left-oriented media (at least in the Netherlands) are commenting on it. The more moderate media seem to be in shock and seem to be waiting for an outcome of some sort. The funny thing for me is that there seems to have been a big revolt on the internet though, because a couple of months ago most environment-oriented pieces were commented on by left-wing people. Now it seems to be the other way around :) And I think the media will follow (in their own time, of course).

  9. John F. Pittman says:

    Interesting work. Thanks.

  10. NormD says:

    In the Disclaimer do you mean to say you are not a “climate” scientist?

    • hpx83 says:

      No, I mean to say that I am not a scientist at all. I have a degree in Information Technology and work with programming. If that disqualifies me from running numbers I guess these numbers are not to be trusted.

      • NormD says:

        Sorry if I am being obtuse, but in your disclaimer you say that you are not a “client scientist”. I have never heard of a client scientist. I think your work is very interesting.

  11. Pingback: Frigid Folly: UHI, siting issues, and adjustments in Antarctic GHCN data « Watts Up With That?

  12. Sinan Unur says:

    This is a great example of data mining used for the right purpose (which is rarely what happens). Just a minor correction:

    > DISCLAIMER : I am not a client scientist

    ITYM “climate” rather than “client”.

  13. Creation Man says:

    Well done indeed.

    You’ll be very interested to know what Watts and Id found out about where those data have come from. See here:

    http://www.examiner.com/x-28973-Essex-County-Conservative-Examiner~y2009m12d13-Antarctic-weather-station-data-found-flawed

  14. E.M.Smith says:

    Wonderfully done. Yes, it is all in the ‘adjustments’, of various sorts.

    If you would like ideas on other places to look for the adjustments cooking the books, I’ve found a decent set of ‘footprints in the snow’ pointing to thermometer deletions from cold places. For example they are being systematically removed from the Andes in S. America.

    I would love to see a similar set of articles, like this one, for the “other places” that are being “cooked” via thermometer change.

    I’ve added a link to here from my page that covers what GHCN unadjusted plus SCAR stations in GIStemp cover:

    http://chiefio.wordpress.com/2009/11/02/ghcn-antarctica-ice-on-the-rocks/

    For a look at other “thermometer deletion patterns” in other countries, you can see:

    http://chiefio.wordpress.com/2009/11/03/ghcn-the-global-analysis/

    It looks like they run from cold mountains and higher latitudes. Mexico has them move to the tropical thermal belt. I would encourage you to see if the GHCN Adjusted data you have loaded up makes these trends more extreme (as it clearly does for Antarctica).

    I suspect you have about a dozen articles ahead of you just looking at that. FWIW, Morocco has the thermometers run away from the beach (they have a cold current off shore) and into the Atlas mountains in the Sahara… Now if the adjusted drops some more of the cold history…

  15. I understand the general thrust of this work, but I don’t need to understand it at ALL in order to say thank you, thank you, thank you for caring enough to put your time into it. I expect your work to be recognized as correct. I’m at least savvy enough to know that the implications of this study of yours are huge.

    I love your attitude. Unlike the alarmists (anti-capitalist gentlemen and ladies that many of them are), you’re more than willing to be proved wrong, openly sharing your methods for others to attempt to replicate. I just admire that. So here goes — Well done, old chap.

  16. Steve Adams says:

    I think you ‘misspoke’ when you said you are not a climate scientist. It is not unreasonable to say, based on your work here and the open approach to sharing the data, that you are more of a scientist than a great number of people that are employed in this field.

    A great result of your work and of the efforts of others is to build in society a little more skepticism for ‘scientists’ and at the same time more respect for those who undertake real science – open science – science not based on preconceived notions of the folks that are paying for it. Hats off to the Ben Franklins of the era !!!!!

  17. Simeon Higgs says:

    can I have your openoffice file?
    I wanna see all the conclusions that could be made with the data depending on how you manipulate it

    • hpx83 says:

      Sure, don’t know if I can post it to the blog, but I’ll send it to you over email. You should be aware though that the OpenOffice file doesn’t contain the entire GHCN database, only data which I copied from my SQL, so the data is very condensed (I let the SQL do the aggregate SUM/AVERAGE/COUNT etc. for me). Haven’t figured out a good way to make the SQL db public yet.

  18. Thomas Gough says:

    Very simple (i.e. the idea), easy to understand and therefore elegant. True science is getting rid of the ‘frosted glass’ so that the facts can be seen clearly. If I may say so “The foot soldiers also matter.” (This is intended as praise, not disrespect)
    I would like to believe that there is more to come. This is ‘only’ A(ntarctica).
    TTG

    • hpx83 says:

      Thomas : Thank you for the kind words. There will be more coming, count on it.

      And who wants to be one of the king’s knights, when one can be a foot soldier of the resistance? :)

  19. Pingback: Newspaper hack’n’slash! « Save Capitalism

  20. erik says:

    Read this quickly and don’t fully understand it (bad at English, got here via DN), but:

    1. So you conclude that there _has_ been warming anyway, even if it is much smaller than what the “researchers” say? (1 degree in 100 years?)

    2. So there is some article where the “researchers” used this trimmed data and then claim that the warming in Antarctica over the last 100 years is 3-4 degrees? Which one?

    • hpx83 says:

      Hi Erik,

      This is not criticism of a specific article; for example, I have not yet had time to look at the study mentioned in DN (since it apparently wasn’t released to the Copenhagen spectacle until today, if I understood DN correctly). My criticism is aimed at the largest dataset in use, namely GHCNv2, whose data forms the basis for parts of e.g. HADCRUT and others. My point is that the so-called “homogenized” version of this data shows a completely different picture from the original measurements, which points to serious flaws in the selection methods used for the homogenized dataset. This is quite worrying since, for example, researchers at CRU have admitted that they have “lost” the original data for some measurements, i.e. only adjusted values remain. And if this is the type of adjustment that has been used, the credibility of the data used today should be seriously questioned.

      Quote from Phil Jones (via TheAirVent, which did a much more thorough investigation based on what I found):

      “Almost all the data we have in the CRU archive is exactly the same as in the Global Historical Climatology Network (GHCN) archive used by the NOAA National Climatic Data Center”

      Worrying is the word.

  21. Pingback: Climate Data Keeps Snowballing | Constant Conservative

  22. Pingback: U.S. Govt. Database Complicit in Climate Data Deception : Odd Citizen

  23. Pingback: Jungla di Ghiaccio | Climate Monitor

  24. john says:

    This is quite impressive. I can’t believe that there are not documented and available procedures for processing the data in these nationalized/globalized databases in existence somewhere.

  25. A C Osborn says:

    This is brilliant analysis.
    Regarding the SQL database, how large is it?
    If it is less than 2Gb it will fit in an Access database which of course is emailable when zipped or can be uploaded to a suitable website.
    Tony

  26. KC says:

    Hi, just wanna tell you, that large files can be shared through services like filemail.com
