# Calculating percentiles

One thing I've been wanting to do for quite some time now is calculating a confidence interval for the metrics measured by Fluxmon. I still have to figure out whether or not it's the same thing, but for starters I'd like to calculate percentiles for those values.

Hans-Jürgen Schönig was nice enough to email me the slides for his presentation, where he showed pretty much exactly how this is done:

```select
cm.check_id,
cm.value,
ntile(100) over (order by cm.value)
from
monitoring_checkmeasurement cm
inner join monitoring_sensorvariable sv on variable_id=sv.id
where
cm.check_id=287 and
sv.name = 'wr_ios'
group by
cm.check_id,
cm.value;
```

So let's see:

``` check_id |      value       | ntile
----------+------------------+-------
287 | 229.136365378731 |     1
287 | 236.251859354762 |     1
287 | 236.290934270205 |     1
287 | 237.183575731604 |     1
287 | 237.468216034691 |     1
287 | 238.002109663627 |     1
287 | 240.047801429467 |     1
287 | 241.096133107505 |     1
287 | 241.142928017446 |     1
287 | 241.271274374722 |     1
287 |  241.43676748922 |     1
287 | 241.631330132004 |     1
287 | 241.648369093682 |     1
287 | 242.301309617281 |     1
287 | 242.528538355955 |     1
287 | 242.601739331639 |     2
287 |  242.76048787876 |     2
287 | 242.940793596404 |     2
287 | 242.977103423104 |     2
287 | 243.035109171986 |     2
287 | 243.294571376249 |     2
287 | 244.215759503865 |     2
287 | 244.259601351866 |     2
287 | 244.432728072524 |     2
287 | 244.516693368984 |     2
287 | 244.867435321636 |     2
287 | 245.093520281211 |     2
287 | 245.122918617753 |     2
287 | 245.158186023094 |     2
287 |  245.18250418663 |     2```

That's a good start, but I'd like to get one value per percentile. And I guess those would be most useful if I had the MAX for ntiles <= 50 and the MIN for those > 50.

```select
check_id,
CASE WHEN entile <= 50 THEN MAX(value) ELSE MIN(value) END as boundary,
entile
from (
select
cm.check_id,
cm.value,
ntile(100) over (order by cm.value) as entile
from
monitoring_checkmeasurement cm
inner join monitoring_sensorvariable sv on variable_id=sv.id
where
cm.check_id = 287 and
sv.name = 'wr_ios'
group by
cm.check_id,
cm.value
) as x
group by
x.check_id,
x.entile
order by x.entile;
```

This looks better:

``` check_id |     boundary     | entile
----------+------------------+--------
287 | 242.528538355955 |      1
287 |  245.18250418663 |      2
287 | 247.452813643881 |      3
287 | 249.174154040999 |      4
287 | 250.819910098371 |      5
287 | 252.093448602465 |      6
287 | 253.222732199382 |      7
287 | 254.289464902005 |      8
287 | 255.301277856935 |      9
287 | 256.807463466736 |     10
287 | 257.886738494251 |     11
287 | 259.363337856962 |     12
287 | 260.591828244757 |     13
287 | 261.250520159227 |     14
287 | 262.297351865419 |     15
287 | 263.099028123455 |     16
287 | 263.982356107804 |     17
287 | 264.603170887436 |     18
287 | 265.187531126404 |     19
287 | 266.111760914008 |     20
287 |  267.20722187206 |     21
287 | 268.096699408997 |     22
287 | 268.853926667567 |     23
287 | 269.744325622614 |     24
287 | 270.188142168672 |     25
287 | 270.593971794008 |     26
287 | 271.236878900381 |     27
287 | 272.133271901325 |     28
287 | 272.613305290942 |     29
287 | 273.139119588124 |     30
287 | 273.941309947633 |     31
287 | 274.798961636146 |     32
287 | 275.440183189522 |     33
287 | 276.138383373647 |     34
287 | 276.611642523528 |     35
287 | 277.093934420074 |     36
287 | 277.987756333866 |     37
287 | 278.571849180595 |     38
287 | 279.254178302784 |     39
287 | 279.905348607647 |     40
287 | 280.309358977164 |     41
287 | 280.665632815261 |     42
287 | 281.240222774005 |     43
287 | 281.841771506664 |     44
287 | 282.310636865485 |     45
287 | 283.014585882153 |     46
287 | 283.835777648474 |     47
287 | 284.455286722166 |     48
287 | 285.137465659157 |     49
287 | 285.578659445801 |     50
287 | 285.581995639674 |     51
287 | 286.326444822576 |     52
287 | 286.688561129296 |     53
287 | 287.223072002403 |     54
287 | 287.919565845853 |     55
287 | 288.721280851317 |     56
287 | 289.432329385731 |     57
287 | 289.959411908295 |     58
287 | 290.651482471229 |     59
287 | 291.372769922591 |     60
287 | 291.842573577583 |     61
287 | 292.736031355903 |     62
287 | 293.187362152819 |     63
287 | 293.838025307673 |     64
287 |  294.57972141345 |     65
287 | 295.082963391352 |     66
287 |  296.24698891679 |     67
287 | 297.600830745376 |     68
287 |  298.87510821896 |     69
287 | 299.566450949587 |     70
287 |  300.56043720098 |     71
287 | 301.530900814087 |     72
287 | 302.510107864529 |     73
287 | 303.340573447187 |     74
287 | 304.723269421615 |     75
287 |  305.39145573103 |     76
287 |  306.51173792586 |     77
287 | 307.668492795808 |     78
287 | 308.466223822391 |     79
287 | 309.823047453998 |     80
287 | 311.146362389198 |     81
287 | 313.607204835349 |     82
287 |   314.9906630401 |     83
287 | 316.958319112472 |     84
287 | 318.777340519479 |     85
287 | 320.171519126339 |     86
287 | 322.255329479526 |     87
287 | 324.608642117519 |     88
287 | 327.670448175898 |     89
287 |   331.5988884139 |     90
287 |  337.78420647411 |     91
287 | 342.050233338103 |     92
287 | 349.403819865874 |     93
287 | 358.749356396326 |     94
287 | 376.297282027477 |     95
287 |  403.67843072412 |     96
287 | 422.926274104004 |     97
287 | 487.446437625305 |     98
287 | 605.442699965559 |     99
287 | 684.885164154152 |    100
(100 rows)```

Btw, the average of all values is:

``` check_id |       avg
----------+------------------
287 | 301.871884282724
(1 row)```

So, what do we learn from that?