Calculating daily means from GNSS-IR data

This document demonstrates how we calculate the daily means displayed in the plots on the site pages (e.g. Brest) from the distributed high frequency data. For simplicity, we're using the Pandas data analysis toolbox, although similar code could be written without it.

Loading data

We'll use Brest as an example - this code loads the data into a data frame, then we'll look at the first few rows.

See the general example notebook for an overview of what the columns mean. We're going to create daily averages by taking a simple mean of all observations on each day. But to prevent that being biased by how the observations are distributed over the tidal cycle, we're going to do it on residual data (data with the tide removed). We'll also used the adjusted height variable, which accounts for vertical movement of the water as the satellite passes overhead. You'll notice this operation will result in values around zero - don't worry, we'll add the mean sea level value back in later.

Separating the signals

Before creating the average, we're going to need to separate the data into the different signals used. There's uncertainty over the exact location measurements are taken from in the GNSS sensor (the antenna phase centre) and this can location can be different for signals of different frequencies.

We'll create an extra variable dictating which GNSS constellation the constellation belongs to (0=GPS, 1=GLONASS, 2=Galileo, 3=BeiDou - remember the 'prn' variable gives us this information)

Now, we want to group the variables by constallation and signal - this will give a separate data frame for each combination of constellation and signal. We'll display how many cases we've got for each case

So lots of GPS and GLONASS data, a small amount from Galileo, and a tiny bit of BeiDou. Next we calculate the daily averages, considering each constellation/signal group separately

We'll do some rearranging - we'll convert this into a dictionary where the keys are the (constellation, signal) pairs, and the values are the residual values for this frequency. We'll show the top of the GPS L2 channel, that's (0,2)

Calculating daily means

Now, we'll take the daily mean of each frequency in turn - again we'll display GPS L2 as an example. The groupby function just splits the data into daily chunks and takes the mean of each chunk.

Time for some more rearranging - lets convert this into one data frame, with each column representing one of the channels

The display of the column headers is a bit unclear there, but never mind. You'll also see that at the start of the record, we only have data from the main two channels of GPS and GLONASS. Next we want to account for the slight differences in antenna phase centre for different channels by comparing them to a reference channel. We've used GPS L2 to do this, as we know it's present in all the records we've processed.

And we just take the average of all the rows in this table to create an average offset for each channel in the table. Obviously it'll be zero for GPS L2.

Subtract these values from the relevant column to put all the values in approximately the same reference frame as the GPS L2 data

And finally we can calculate our daily means!

We want to get it in the reference frame of the original data - this information is actually stored in the original file header, but I've just copied the values here. It works like this:

Comparing with tide gauge data

Finally, let's compare that with some tide gauge data. The best place to obtain quality controlled tide gauge data from Brest is from the supplying authority, SHOM, but UHLSC have an ERDDAP server we can use to get daily means without any extra effort.

We skip the second line of data using the parameter skiprows=[1] (as python uses zero-indexing) as this line contains the units of the parameters. The sea level is in millimetres, but we'll deal with that in a moment.

We need to convert that from millimetres to metres, and then apply a similar correction as in the general example notebook to put it in the same reference frame as the GNSS-IR data, then we can plot them both.

In the case of Brest, I can use the information on the PSMSL datum page from Brest and the knowledge that the UHSLC record uses the HZ 1996 datum there to transform the data to approximately the same reference frame as the GNSS-IR data.

There's still a slight offset, perhaps due to issues around the antenna phase centre of GPS L2 itself, and also the time we've referred the sea level and GNSS-IR data to the GRS80 ellipsoid is slightly different, but you can see the records match well.