Monday, September 15, 2014

Trying out the extRemes R package: Part 1 - Block Maxima Approach


I used http://www.ral.ucar.edu/~ericg/extRemes/extRemes2.pdf to try out the "extRemes" R package.
Below are my notes and thoughts.
On my desk was a copy of
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. London; New York: Springer.


Fitting the Generalized Extreme Value distribution function to the Port Jervis, New York annual maximum winter temperature data using the extRemes R package

library(extRemes)
data("PORTw") 
plot(PORTw$TMX1, type = "l", xlab = "Year", ylab = "Maximum winter temperature", col = "darkblue")

Default Run

fit1 <- fevd(TMX1, PORTw, units = "deg C") 
fit1

Output

fevd(x = TMX1, data = PORTw, units = "deg C")
[1] "Estimation Method used: MLE"

 Negative Log-Likelihood Value:  172.7426 

 Estimated parameters:
  location      scale      shape 
15.1406132  2.9724952 -0.2171486 

 Standard Error Estimates:
  location      scale      shape 
0.39745119 0.27521741 0.07438302 

 Estimated parameter covariance matrix.
            location       scale        shape
location  0.15796745  0.01028664 -0.010869596
scale     0.01028664  0.07574462 -0.010234809
shape    -0.01086960 -0.01023481  0.005532834

 AIC = 351.4853  

 BIC = 358.1438 

The GEV model has 3 parameters:

  • location
  • scale
  • shape
The simplest distribution function is the Gumbel, which has only 2 parameters (the shape is fixed at 0).
When the Gumbel df is a good fit, the estimate of the shape parameter should be approximately 0.

Here the shape estimate (-0.217) is not close to 0, so the Gumbel is probably not a good fit.

Three measures of goodness of fit are provided:

  • Negative Log-Likelihood Value 
  • AIC
  • BIC

Gumbel Run

fit0 <- fevd(TMX1, PORTw, type = "Gumbel", units = "deg C")
fit0
fevd(x = TMX1, data = PORTw, type = "Gumbel", units = "deg C")
[1] "Estimation Method used: MLE"

 Negative Log-Likelihood Value:  175.7782 

 Estimated parameters:
 location     scale
14.799982  2.886128 
 
 Standard Error Estimates:
 location     scale
0.3709054 0.2585040 
 Estimated parameter covariance matrix.
           location      scale
location 0.13757080 0.03173887
scale    0.03173887 0.06682429
 AIC = 355.5563  
 BIC = 359.9954 
All the goodness-of-fit measures are larger, as expected.

lr.test Likelihood-Ratio Test

Use the likelihood ratio test to confirm this

lr.test(fit1,fit0)
Likelihood-ratio Test
data:  TMX1TMX1
Likelihood-ratio = 6.0711, chi-square critical value = 3.841, alpha = 0.050,
Degrees of Freedom = 1.000, p-value = 0.01374
          alternative hypothesis: greater 
I look at the p-value because it makes sense to me: at 0.014 it is below 0.05, so the extra GEV shape parameter is worth keeping.
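As a quick sanity check, the likelihood-ratio statistic can be reproduced from the two negative log-likelihood values printed above (a minimal sketch; the numbers are simply copied from the fevd output):

# LR statistic = 2 * (negative log-likelihood of Gumbel - negative log-likelihood of GEV)
nllh_gumbel <- 175.7782                      # fit0 (Gumbel), from the output above
nllh_gev    <- 172.7426                      # fit1 (GEV), from the output above
lr_stat <- 2 * (nllh_gumbel - nllh_gev)      # ~6.071, matching lr.test
pchisq(lr_stat, df = 1, lower.tail = FALSE)  # ~0.0137 with 1 degree of freedom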


Bayesian estimation 

> fB <- fevd(TMX1, PORTw, method = "Bayesian")
> fB
 
fevd(x = TMX1, data = PORTw, method = "Bayesian")
[1] "Estimation Method used: Bayesian"

 Acceptance Rates:
log.scale     shape
0.5379076 0.4971994 
fevd(x = TMX1, data = PORTw, method = "Bayesian")
[1] "Quantiles of MCMC Sample from Posterior Distribution"
               2.5% Posterior Mean         97.5%
location 14.2452034     15.1496558 16.0945492194
scale     2.5276179      3.0935399  3.8803429287
shape    -0.3654228     -0.1975376 -0.0004970114

 Estimated parameter covariance matrix.
              location    log.scale        shape
location   0.240496184  0.009813165 -0.015884315
log.scale  0.009813165  0.124166098 -0.013129196
shape     -0.015884315 -0.013129196  0.008814744
DIC =  1045.476 
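To summarise the posterior a little further, extRemes' ci() method can be applied to the Bayesian fit (a sketch I have not run against this exact fit; it should give intervals comparable to the quantile table above):

ci(fB, type = "parameter")   # 95% credible intervals for location, scale and shape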

L-moments 

fitLM <- fevd(TMX1, PORTw, method = "Lmoments")
fitLM

fevd(x = TMX1, data = PORTw, method = "Lmoments")
[1] "GEV  Fitted to  TMX1  of  PORTw data frame, using L-moments estimation."
  location      scale      shape
15.1775146  3.0286294 -0.2480594 

Significance of Covariate Information

plot(PORTw$TMX1, PORTw$AOindex)


> fit2 <- fevd(TMX1, PORTw, location.fun = ~AOindex, units = "deg C")
> fit2

fevd(x = TMX1, data = PORTw, location.fun = ~AOindex, units = "deg C")
[1] "Estimation Method used: MLE"

 Negative Log-Likelihood Value:  166.7992 

 Estimated parameters:
       mu0        mu1      scale      shape
15.2538412  1.1518782  2.6809613 -0.1812824 
 Standard Error Estimates:
       mu0        mu1      scale      shape
0.35592663 0.31800904 0.24186870 0.06725912 

 Estimated parameter covariance matrix.
               mu0          mu1        scale        shape
mu0    0.126683767  0.002230374  0.010009100 -0.008065698
mu1    0.002230374  0.101129752 -0.002538585  0.002075487
scale  0.010009100 -0.002538585  0.058500466 -0.007020374
shape -0.008065698  0.002075487 -0.007020374  0.004523789
 AIC = 341.5984  
 BIC = 350.4764 
What is going on here?
The location parameterisation now has two parts, mu0 and mu1:

mu(x) = mu0 + mu1 * x,   where x = AO index
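To see what this parameterisation means in practice, the fitted location can be evaluated at a few AO index values using the MLE estimates printed above (a minimal sketch; the numbers are copied from the fit2 output):

mu0 <- 15.2538412                 # location intercept (deg C)
mu1 <-  1.1518782                 # change in location per unit of AO index
aoi <- c(-2, 0, 2)                # example AO index values
mu0 + mu1 * aoi                   # fitted GEV location: roughly 12.95, 15.25, 17.56 deg C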
lr-test
All the goodness-of-fit measures are lower when the AO index is considered. As expected, using the AO index information improves the fit of the model.
lr.test(fit1,fit2)
Likelihood-ratio Test
data:  TMX1TMX1
Likelihood-ratio = 11.8869, chi-square critical value = 3.841, alpha = 0.050,
Degrees of Freedom = 1.000, p-value = 0.0005653

The p-value is very small, so the result is as expected: the AO index covariate is significant for the location parameter.

Is the covariate information also significant for the scale parameter?

fit3 <- fevd(TMX1, PORTw, location.fun= ~AOindex, scale.fun = ~AOindex, units = "deg C")
> fit3
fevd(x = TMX1, data = PORTw, location.fun = ~AOindex, scale.fun = ~AOindex,
    units = "deg C")
[1] "Estimation Method used: MLE"

 Negative Log-Likelihood Value:  166.6593 

 Estimated parameters:
       mu0        mu1     sigma0     sigma1      shape
15.2616373  1.1802783  2.6782395 -0.1464682 -0.1860603 
 Standard Error Estimates:
       mu0        mu1     sigma0     sigma1      shape
0.35610192 0.31724472 0.24242220 0.27629143 0.06862845 

 Estimated parameter covariance matrix.
                mu0           mu1        sigma0       sigma1         shape
mu0     0.126808580 -0.0085301401  0.0098703920 -0.001720545 -0.0084519671
mu1    -0.008530140  0.1006442141 -0.0001225007 -0.014111326  0.0004891586
sigma0  0.009870392 -0.0001225007  0.0587685234 -0.005405442 -0.0073208829
sigma1 -0.001720545 -0.0141113262 -0.0054054420  0.076336953  0.0015286996
shape  -0.008451967  0.0004891586 -0.0073208829  0.001528700  0.0047098637
 AIC = 343.3185  
 BIC = 354.4161 

lr-test
All the goodness-of-fit measures are HIGHER when the AO index is considered for both location and scale than for location only.
lr.test(fit2,fit3)
Likelihood-ratio Test
data:  TMX1TMX1
 
Likelihood-ratio = 0.2798, chi-square critical value = 3.841, alpha = 0.050,
Degrees of Freedom = 1.000, p-value = 0.5968
alternative hypothesis: greater
The p-value is large, so adding the AO index to the scale parameter does not significantly improve the fit.

Thursday, September 11, 2014

Basic Time Series with R



Currently playing with the Port Jervis, New York annual maximum and minimum winter temperature data, provided with the extRemes R package.

Extreme value analysis deals with rare events, which are unlikely to follow the patterns and assumptions that hold for the complete set of events, but standard initial analysis tools can still provide insight.

library(extRemes)
data("PORTw")

plot(PORTw$TMX1, type = "l", xlab = "Year", ylab = "Maximum winter temperature", col = "red")
plot(PORTw$TMN0, type = "l", xlab = "Year", ylab = "Minimum winter temperature", col = "darkblue")


Visual inspection

  • Trend - none
  • Cycle - none
  • Clustering- none
  • Pairwise correlation - maybe


Both are non-cyclic, and can probably be described using an additive model, since the random fluctuations in the data are roughly constant in size over time:

Exponential Model

Let's try fitting simple exponential smoothing.
Convert to a time series object:
> maxts <- ts(PORTw$TMX1, start=c(1927))
Fit the model with no trend or seasonal component:
> fit1 <- HoltWinters(maxts, beta=FALSE, gamma=FALSE)
> fit1
Holt-Winters exponential smoothing without trend and without seasonal component.
Call:
HoltWinters(x = maxts, beta = FALSE, gamma = FALSE)
Smoothing parameters:
 alpha: 0.1368945
 beta : FALSE
 gamma: FALSE
Coefficients:
      [,1]
a 16.49764
plot(fit1)
A simple measure of fit: the sum of squared errors.
fit1$SSE
[1] 763.3207 
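With no trend or seasonal component, the forecast from this model is simply flat at the last smoothed level. A quick check (a sketch using the base predict method for HoltWinters objects):

predict(fit1, n.ahead = 5)                               # flat forecasts at the final level (~16.5)
predict(fit1, n.ahead = 5, prediction.interval = TRUE)   # the same, with 95% prediction intervals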

ARIMA Model

Difference the time series:
maxtsdiff <- diff(maxts, differences = 1)

acf(maxtsdiff, lag.max = 20)


Basic visual inspection: maybe something at a 1-year lag.

 maxtsarima <- arima(maxts, order=c(0,1,1))
> maxtsarima
Series: maxts
ARIMA(0,1,1)                  
Coefficients:
          ma1
      -1.0000
s.e.   0.0532

sigma^2 estimated as 9.636:  log likelihood=-173.07
AIC=350.15   AICc=350.33   BIC=354.56
The ma1 coefficient at -1 hints that the differencing was unnecessary. Let R work out whether and what order of ARIMA would be appropriate (auto.arima is from the forecast package):

> mtsARIMA <- auto.arima(maxts)
> mtsARIMA
Series: maxts
ARIMA(0,0,0) with non-zero mean
Coefficients:
      intercept
        16.3154
s.e.     0.3737
sigma^2 estimated as 9.494:  log likelihood=-173.01
AIC=350.02   AICc=350.21   BIC=354.46

No moving average happening; the preferred model is just a constant mean.
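Since auto.arima settled on ARIMA(0,0,0) with a non-zero mean, the forecast is just that mean. A quick look (a sketch; forecast() and auto.arima() both come from the forecast package):

library(forecast)            # provides auto.arima() and forecast()
forecast(mtsARIMA, h = 5)    # flat point forecasts at the estimated mean (~16.3), with intervals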

Pair Wise Correlation

Start with scatter plots
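A quick way to get all the pairwise scatter plots in one go (a sketch, using the same columns as the correlations below):

# Pairwise scatter plots of max winter temperature, min winter temperature and the AO index
pairs(PORTw[, c("TMX1", "TMN0", "AOindex")])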




Check correlation coefficients

> cor(PORTw$TMX1, PORTw$TMN0)
[1] 0.1802413
> cor(PORTw$TMX1, PORTw$AOindex)
[1] 0.3944286
> cor(PORTw$TMN0, PORTw$AOindex)
[1] 0.01206764
> cor(PORTw$TMX1, PORTw$AOindex, method="kendall")
[1] 0.3019692

Possibly something to work with between the Arctic Oscillation index and maximum winter temperature.

Sunday, August 24, 2014

Update QGIS from 2.2.0-8 to 2.4.0-1


Updating QGIS on Mac OSX 10.9.4

To update my QGIS I needed to download 2 packages:

http://www.kyngchaos.com/software/qgis

http://www.kyngchaos.com/software/frameworks#gdal_complete

I deleted the old QGIS from the Applications directory.

I investigated updating GDAL to GDAL 1.11 using anaconda, but

 the following combinations of packages create a conflict with the
remaining packages:
  - python 2.7*
  - gdal 1.11*

So I installed GDAL 1.11 using
http://www.kyngchaos.com/software/frameworks#gdal_complete

Then installed QGIS 2.4.0

Friday, August 15, 2014

Days since 0001-01-01

Viewing output from a CSIRO ACCESS 1.3 CMIP5 run, I saw the time units were 'days since 0001-01-01', and the coder in me had heart palpitations.

The joys of Julian calendars and leap years: does anyone know how many days lie between now and then?

Looking at an ACCESS post-processing script
https://trac.nci.org.au/trac/access_tools/browser/app/trunk/app.py?rev=85

The reference dates are 1 (0001-01-01) and 719163 (1970-01-01).

As I am currently just mucking around, I am just glad that someone knows how many days have passed since January 1, Year 1.

Now I can transform to my frame of reference more readily.
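A quick cross-check in R (a sketch; R's Date class uses the proleptic Gregorian calendar, whereas model output often uses other calendars such as the 365-day calendar, so this only confirms the Gregorian count):

# Days between 0001-01-01 and 1970-01-01 on the proleptic Gregorian calendar
as.numeric(as.Date("1970-01-01") - as.Date("0001-01-01"))
# should give 719162, i.e. 1970-01-01 is day 719163 when 0001-01-01 is counted as day 1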

Sunday, June 15, 2014

Climate Change in 4 Dimensions


From April 8 2014 till June 17 I participated in an online course:

www.coursera.org

Climate Change in 4 Dimensions
University of California, San Diego.

Overall I learnt a lot about climate change beyond the science aspects.

During the course I kept notes to provide some feedback and ideas for improvement.
I captured this in this post.

Lecture 5

Early part is very US centric 

Global average warming chart after ocean acidification and coral: the text is too small to read.

Lecture 6

Very informative and enjoyable; Professor David G. Victor engaged me well. He introduced the stock-problem economic model to me, which is already introducing new ideas to my thinking.


Lecture 9


Weekly activity
The survey is USA-oriented. Of course the US should act independently of other countries; the question for me is whether Australia should act independently of other countries.

This week's graded quiz was not up to scratch. Usually only one question, or none, is badly worded, easily misunderstood, or difficult to attribute to any of the readings or lectures. This week there were multiple, including:
Questions 8 and 9: difficult to understand/read.
"Heat waves and heavy precipitation" is not a trend.

Lecture 10

Feedback: 
The slide from Rogelj & Meinshausen (2012) is out of date, and its name is inconsistent with the other slides.
Chaparral is not a common international word, and is not what would grow in Australia.
A question in the ungraded quiz is unclear:
"Based on global climate models, what regions will see the biggest precipitation increase with global warming?"
The question is actually about North America, which is not clear.

Concepts

Uses indices to show warming, particularly in the USA; definitely regional/spatial.
Indices

  • Mean annual temperature
  • Number of cold days
  • Number of hot days
  • Agriculture-based - frost, super-hot days, frost-free (growing season) forecast.
    • The Keeling curve shows the growing season has increased by about 10 days, in agreement with agricultural models
Heat Wave

  • Multiple factors, some of which are more common due to climate change
  • Europe 2003, 2006, Australia 2009 (Karoly), Russia 2010, USA 2009


Models don't seem to capture very cold days. 

Midterm

Question 7 unclear
If someone makes a prediction, and it comes true, does that mean that the prediction is correct?
Does it mean the hypothesis or the prediction is true?

Question 9 
Debatable if one or more answers are correct

Week 8 

Lecture 14 
Ice, snow and water
Does NOT mention the West Antarctic ice sheet information?
Projections continue to change with increasing understanding?
Parts of the lecture are dating quickly.

Lecture 15
Very directed to a Californian audience. The assumed level of knowledge about the geography of California was higher than my own, even though I have travelled there. I think this would exclude some of the Coursera audience.

Quiz  Question 9
"Please fill in the blank: By 2050, the number of ‘extremely hot’ days could increase ________."
This statement was not in the lecture notes, so how does one know which location or study to cite? Does it relate to global, USA, California, San Diego, or Australian 'extremely hot' days?

Lecture 19

Spelling mistake in the "Check your knowledge quiz" question 4
Arpanet

Sunday, May 25, 2014

Thoughts on Pattern Scaling

During the last week I have been considering ideas put forward at the

Pattern Scaling, Climate Model Emulators and their Application to the new Scenario Process
NCAR, Boulder Colorado, April 23-25 2014

and

Lopez et al. (2013), Robustness of pattern scaled climate change scenarios for adaptation decision support.

Major Aims



  • "Fit empirical / statistical relation b/w impact relevant climate variables and large scale quantities obtainable through simple models"
  • "Run simple models under arbitrary scenarios and recover impact relevant outcomes by applying those relations"



  • User Needs
    • Impact research
    • policy makers
    • Social Economic
    • higher resolution

  • Uncertainty
    • Handling
    • Quantifying
    • sufficiently low uncertainty for outcome information produced to be useful.


Standard Pattern Scaling

  • Developed, tested and applied for 20 years
  • provide a simplified representation of climate system responses.

    • local (or regional) changes in these variables tend to increase linearly with global warming over the coming century
    • local change can be seen as a 'response' to the global warming (GW)

    The critical assumption is that there is a linear relationship between a scalar and a geographical response pattern (see the sketch below).
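In its simplest form that assumption says: local change ≈ fixed spatial pattern × global mean warming. A toy sketch of the idea (all numbers hypothetical, not from the workshop material):

# Pattern scaling in one line: scale a fixed local response pattern by global warming
pattern      <- c(1.4, 0.8, 1.1)              # hypothetical local warming per deg C of global warming
deltaTglobal <- c(0.5, 1.0, 2.0)              # hypothetical global mean warming levels (deg C)
deltaV       <- outer(pattern, deltaTglobal)  # local change: one row per location, one column per level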

Flaws / Concerns

  • main climate mechanisms are not linear
  • feedback
  • timescale in response change
  • patterns evolve

Uncertainty 

Uncertainty hard to capture
  • model uncertainty
    • Multimodel ensemble can reduce uncertainty
  • scenario uncertainty
  • depend on statistical assumption
  • analysis of variance 
    • map std dev.

Wednesday, May 14, 2014

NCL Regional Temperature Anomalies



Data
HadCRUT 4
Surface Temperature Anomalies (C with respect to 1961-1990)

Choose a region

Getting my regions into a reusable, correctly projected, and regridded form was more difficult than I thought it would be.
Publicly available global region-based netCDF files are not common.
In the end, I used a shapefile to create a netCDF with a single time dimension and an integer layer of region ids at each 1-degree cell.

Australia

Just going with continents at this stage, and Australia is the country I was born in.


Start with a single time step:
; read only desired  time 1970
 x  = fin->temperature_anomaly(1440,:,:)

 xr = where(region.eq.2,x,x@_FillValue)

 print(avg(xr))

Then a time series for a single year, Oct 2012 to Sept 2013:

; Time Series average for Australia 2013
 x1=fin->temperature_anomaly(1953:1964,:,:)
 rconform = conform(x1, region, (/1,2/))
 xr1 =mask(x1, rconform, 2)

 xa1 = dim_avg_n(xr1,(/1 , 2/))
 t1 = ispan(0,12,1)

 wks   = gsn_open_wks ("x11","xy")                ; open workstation
 res                  = True                     ; plot mods desired
 res@tiMainString     = "Average Australian Temperature Anomaly 2013" ;


The result is similar to the BOM's.


Then an annual time series for 1990-2013:

 yStart = 1990
 yEnd = 2013
 tStart  = (yStart - T_OFFSET ) * 12;
 tEnd = (yEnd - T_OFFSET) * 12 + 11
 x2=fin->temperature_anomaly(tStart:tEnd,:,:)
    
 rconform2 = conform(x2, region, (/1,2/))
 xr2 =mask(x2, rconform2, 2)
 copy_VarCoords(x2, xr2) ; need dim metadata retained for clim functions
 xa2 = dim_avg_n(xr2,(/1 , 2/))
  xannual = month_to_annual(xa2, 1)  ; Annual average temperature

  printVarSummary(xannual)  

 wks   = gsn_open_wks ("ps","xy")                ; open workstation
 res                  = True                     ; plot mods desired
 res@tiMainString     = "Annual Mean  Australian Temperature  1990-2013" ;
 res@tiYAxisString = "Anomalies" ; y-axis label      

 res@gsnYRefLine           = 0.              ; reference line   
 res@gsnXYBarChart         = True            ; create bar chart 
 res@gsnAboveYRefLineColor = "red"           ; above ref line fill red
 res@gsnBelowYRefLineColor = "blue"          ; below ref line fill blue

res@tiXAxisString = "Year"

 plot  = gsn_csm_xy (wks,ispan(yStart,yEnd,1),xannual,res) ; create plot  


Then a summer time series for 2000-2013:

 ; Summer Time Average Temperature anomaly for 2000's
 yStart = 2000
 yEnd = 2013
 tStart  = (yStart - T_OFFSET ) * 12 ;
 tEnd = (yEnd - T_OFFSET) * 12 + 11
 x3=fin->temperature_anomaly(tStart:tEnd,:,:)
    
 rconform3 = conform(x3, region, (/1,2/)) ;Create Mask grid with time dim of data
 xr3 =mask(x3, rconform3, 2) ; mask out data not in region 2 , Australia
 copy_VarCoords(x3, xr3) ; need dim metadata retained for clim functions

 xseasonal = month_to_seasonN(xr3, (/ "DJF"/))  ; Seasonal (DJF) average temperature
 printVarSummary(xseasonal)  

 xa3 = dim_avg_n(xseasonal,(/2 , 3/)) ; Average across  long and lat dimension
                                ; average across region as all other values masked 
 copy_VarCoords_2(xseasonal,xa3)
 printVarSummary(xa3)

 print(xa3)

 wks   = gsn_open_wks ("x11","xy")                ; open workstation
 res                  = True                     ; plot mods desired
 res@tiMainString     = "Summer Mean Australian Temperature 2000-2013" ;
 res@tiYAxisString = "Anomalies" ; y-axis label      

 res@gsnYRefLine           = 0.              ; reference line   
 res@gsnXYBarChart         = True            ; create bar chart 
 res@gsnAboveYRefLineColor = "red"           ; above ref line fill red
 res@gsnBelowYRefLineColor = "blue"          ; below ref line fill blue

res@tiXAxisString = "Year"

 plot  = gsn_csm_xy (wks,ispan(yStart,yEnd,1),xa3(0,:),res) ; create plot 


Happy with this; time to move on.

Thursday, May 8, 2014

Install CDO


Mac OSX Mavericks

Download https://code.zmaw.de/projects/cdo/files

In terminal

gunzip cdo-current.tar.gz

tar -xf cdo-current.tar

cd cdo-1.6.4r5

./configure

make

sudo make install

Tuesday, May 6, 2014

Projections and Gridding

Today I have been thinking about projections

CMIP5 data is netCDF; the projection is provided in the metadata, resolution 0.5° x 0.5°.

I want to perform polygon-region analysis.

I need the 'masking' of the CMIP5 to be efficient. Visualisation is not a concern.

What file format should I use for storing polygons?

  • projection 
  • resolution
I have been looking at different programming languages, particularly R and NCL.

NCL is built to handle  netCDF data efficiently.
But I am having trouble grasping how I can use my polygons efficiently. The polygons are initially shapefiles. The example polygon code focuses on using polygons in visualisation. The example shapefile masking code seems very inefficient. 

Focusing on NCL

If I know the CMIP5 resolution and projection, I should be able to create a netCDF file with a layer / a variable / a slot in an array per region, with binary values.

The max and min latitude of the region could be used to reduce the data extracted from the netCDF:

; read only desired area & times
x     = in->SST(tStrt:tLast,{latS:latN},{lonL:lonR})

Merge / AND / if / where
_FillValue is kind of like null for this variable, ignored by many functions.
  if(.not.isatt(data,"_FillValue")) then
    data@_FillValue = default_fillvalue(typeof(data))          ;-- make sure "data" has a missing value
  end if

x is CMIP5 netCDF data
regions is all my polygons

 x and regions should have the same dimensions
 xr = where(regions.eq.17, x, x@_FillValue)
Or once I have a single binary object per region
 x = in->SST(tStrt:tLast,{r17.latS:r17.latN},{r17.lonL:r17.lonR})
 xr = where(r17, x, x@_FillValue)
There is a memory concern here.
I am going with the concept of working region by region, that is:
For each region
  • For each netCDF file
    • Open netCDF file
    • Populate variable (X) with netCDF data by region and time period
    • Close netCDF file
    • Perform any calculation which will optimise memory footprint
      • x <- X
    • Delete X, keep derived data.
I still have the region shapefile. Regrid to a standard CMIP5 grid.

Ahhh, there is no standard gridding for CMIP5.
Initially going for bilinear interpolation to a 1x1 rectilinear grid.
Why Bilinear

Mora, C., et al. (2013). "Biotic and human vulnerability to projected changes in ocean biogeochemistry over the 21st century." PLoS biology 11(10): e1001682.

Why Rectilinear grid
  • Simple
  • Too many people think in rectangles
  • I don't like how the areas are so different
  • Consider variation at later stage
Why 1 x 1
  • Because resolution should be reasonable with all the CMIP5 0.5 degree data

Thursday, May 1, 2014

NCL Try outs


Installing NCL

Put this in my .bash_profile file:
# NCL additions
export NCARG_ROOT=/Volumes/Data/Users/fmacgill/Projects/ncl62/
export PATH=$NCARG_ROOT/bin:$PATH
Library Issue
Had issues with a library; solved by installing gcc47 through MacPorts.

http://www.ncl.ucar.edu/Download/macosx.shtml#libgomp

Put this in my .bash_profile file:
   export DYLD_FALLBACK_LIBRARY_PATH=/opt/local/lib/gcc47/

clmMonLLT

Decided to try out NCL with this example code.
Downloaded clim0_4.ncl
Downloaded xieArkin_T42.nc
Wanted to see the output immediately, so changed line 31 to open an X11 window instead of a PS file:
wks   = gsn_open_wks("x11" ,"climo")        ; open x11 window instead of ps file
So what's going on?

Wednesday, April 30, 2014

Initial Region Masking map

There are many ways to break up the globe into regions.
I have already spent hours trying to find classification schemes for creating regions.
I have already spent hours trying to locate shapefiles, GeoTIFFs, or any GIS-compatible sources.

A waste of time: the data exists, others have it. I need to work out who to ask and gain the confidence to ask.

Need to focus on the process

Josh O'Brien's answer at http://stackoverflow.com/questions/20146809/how-can-i-plot-a-continents-map-with-r was the starting point.
SRES regions are IPCC emission scenario regions.


The same code, tweaked:

library(rworldmap)
library(rgeos)

sPDF <- getMap()
sres <-
    sapply(levels(sPDF$SRES),
           FUN = function(i) {
               ## Merge polygons within a continent
               poly <- gUnionCascaded(subset(sPDF, SRES==i))
               ## Give each polygon a unique ID
               poly <- spChFIDs(poly, i)
               ## Make SPDF from SpatialPolygons object
               SpatialPolygonsDataFrame(poly,
                                        data.frame(SRES=i, row.names=i))
           },
           USE.NAMES=TRUE)

## Bind the 11 SRES-level SPDFs into a single SPDF
sres <- Reduce(spRbind, sres)

## Plot to check that it worked
plot(sres, col=heat.colors(nrow(sres)))

## Check that it worked by looking at the SPDF's data.frame
## (to which you can add attributes you really want to plot on)
data.frame(sres)

Save for future use; use .RData for faster loading in the future.
> class(sres)
[1] "SpatialPolygonsDataFrame"
attr(,"package")
[1] "sp"
> proj4string(sres)
[1] "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"


> setwd("~/Projects/R/test")
> getwd()
[1] "/Volumes/Data/Users/fmacgill/Projects/R/test"

> plot(sres, col=heat.colors(nrow(sres)))
> data_name <- sres
> save(data_name, file="SRESGlobal.RData" )

Clear Variables and check retrieval

> sres <- 1
> data_name <- 1
> load("SRESGlobal.RData")
> sres <- data_name
> plot(sres, col=heat.colors(nrow(sres)))
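Looking ahead, the same SpatialPolygonsDataFrame could be rasterised onto a 1-degree global grid of integer region ids and written out as netCDF for later masking work (a minimal sketch, assuming the raster and ncdf4 packages are available; the regionID field and file name are made up here for illustration):

library(raster)

r <- raster(nrows = 180, ncols = 360, xmn = -180, xmx = 180, ymn = -90, ymx = 90)
sres$regionID <- seq_len(nrow(sres))                 # integer id per SRES region (illustrative)
regionGrid <- rasterize(sres, r, field = "regionID") # burn region ids into the 1-degree grid

# Write a single-layer netCDF of region ids (format "CDF" needs the ncdf4 package)
writeRaster(regionGrid, "SRESregions1deg.nc", format = "CDF",
            varname = "region", overwrite = TRUE)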

Monday, March 10, 2014

Download Data

Downloaded some simple time series data.

How do I keep the data, the source, and crap together?

Metadata


What Metadata do I need?

I don't know.

Starting with

  • source / url
  • units
  • location
  • type
  • startDate
  • endDate
  • increment
Going to save a matlab struct per time series. 
This is not going to be an optimal solution, but it is better than nothing, and better than doing nothing.

Python


I need to work with netCDF files

Currently I know Java and am familiar with MATLAB.

Java is clunky with numbers.
I have found it difficult to produce reusable code with MATLAB.
R sounds appealing, but Python has more support in the college for netCDF and GIS applications.

--> Introducing myself to Python:

  1. Download and install iPython
  2. Watch intro to iPython 
  3. Download zip from Unidata Training Workshop from GitHub
    https://github.com/Unidata/tds-python-workshop
  4. Unzip in directory
  5. Start ipython in this directory
    ipython notebook --pylab inline
This went nowhere.

Talked with my supervisors; going elsewhere at the moment.

Tuesday, February 25, 2014

Submitted Abstract to Conference



Today I submitted an abstract to a conference.

The research was from my masters. It felt good to have something out there even if it is not related to my current research.

Thursday, February 20, 2014

Wiki

Did a mini wiki demo last week for my supervisor.
Good support.

Already in soft release. Multiple people using the 'Getting Started' Printer setup page.

Continuing to add pages and information for release to the group next week.

New Machine

Software

Matlab 2013b
Google Chrome

Google Drive
Evernote
Evernote Web Clipper for Chrome
Endnote V7
Spotify

Google Chrome

  • Linked mainly to my PhD Google+ account. 
  • Will know that I also have a Google Account
  • Will never be used for Banking, PayPal, government interactions.

Storage

Online Backup

One of the extra tasks I have taken on for the group is to look at storage and online backup. 


Storage

Document Storage


Options
1. GoogleDrive / GoogleDocs through GoogleApps
2. University-hosted file server costing 250 per 1 TB

Recommendation

1. GoogleDrive / GoogleDocs through GoogleApps
  • Currently provided by GoogleDocs and Google Drive.
  • Everyone seems comfortable with it.
  • The main issue is that there is no encryption security provided.

Large data set storage


Currently the students using the largest datasets have access to an Earth Science hosted database for their work. Leave this for a later date.

Version Control Repository

Why – good practice to store versions of software.

Options
1. Git:
   GitHub
   • The uni has an evangelist who would help with an introduction
   • I have used it
   • Private repositories hosted in the cloud by the reputable GitHub are available to education groups for free, but further investigation shows there are quite a few hoops
   Bitbucket
   • The uni has an evangelist who would help with an introduction
   • Bitbucket allows unlimited private repositories with up to 5 users
   • Other departments use it
2. Subversion:
   • PIK uses it
   • I can administer it
   • We would need to have a box:
     • Upfront costs
     • Location issues

Recommendation
1. Bitbucket, unless we are likely to have more than 5 users.


Wiki

Why – a great way to communicate within the PhD group, particularly as a single point for newcomers to get established and to record any issues or workarounds.
           
Recommendation

Google Sites included as part of GoogleApps Education


Laptop Backup


Important Considerations

  • Off-site
  • Cost
  • Size
  • Encryption security
  • Longevity
  • Ease of use
  • OS support (Windows, Mac, Linux)
  • Organisation solution

The uni does not support or have a recommended solution.

Dual solutions provide the best disaster plan:
  • Regular manual hard disk / USB backup
  • Online automated backup, with reasonable bandwidth use and the ability to turn it off when in a low-bandwidth situation


Non-Free Solutions
  • SpiderOak: ~$5 per month per user, 200 GB (e)
  • iDrive: from $25 per year for 150 GB (e) for individuals; also org. pricing, needs further investigation; starts with 5 GB free
  • CrashPlan: ~$5 per month per user; need to investigate educational discounts; starts with 2 GB free

           
Free

                   Mega                       GoogleDrive   iDrive
Size               50 GB                      30 GB         5 GB
Encryption         No                         No            Yes
Longevity          Unknown / could be issue   Good          Good
Windows            Yes                        Yes           Yes
OS X               No                         Yes           Yes
Linux              No                         Yes           Yes
Org solution       No                         Somewhat      Upgrade
Location choice    Yes                        No            Yes

Distributed Computing (CPU) power

Not Considered