Calculating percentiles
One thing I've been wanting to do for quite some time now is calculating a confidence interval for the metrics measured by Fluxmon. I still have to figure out whether or not it's the same thing, but for starters I'd like to calculate percentiles for those values.
Hans-Jürgen Schönig was nice enough to email me the slides for his presentation, where he showed pretty much exactly how this is done:
select cm.check_id, cm.value, ntile(100) over (order by cm.value) from monitoring_checkmeasurement cm inner join monitoring_sensorvariable sv on variable_id=sv.id where cm.check_id=287 and sv.name = 'wr_ios' group by cm.check_id, cm.value;
So let's see:
check_id | value | ntile ----------+------------------+------- 287 | 229.136365378731 | 1 287 | 236.251859354762 | 1 287 | 236.290934270205 | 1 287 | 237.183575731604 | 1 287 | 237.468216034691 | 1 287 | 238.002109663627 | 1 287 | 240.047801429467 | 1 287 | 241.096133107505 | 1 287 | 241.142928017446 | 1 287 | 241.271274374722 | 1 287 | 241.43676748922 | 1 287 | 241.631330132004 | 1 287 | 241.648369093682 | 1 287 | 242.301309617281 | 1 287 | 242.528538355955 | 1 287 | 242.601739331639 | 2 287 | 242.76048787876 | 2 287 | 242.940793596404 | 2 287 | 242.977103423104 | 2 287 | 243.035109171986 | 2 287 | 243.294571376249 | 2 287 | 244.215759503865 | 2 287 | 244.259601351866 | 2 287 | 244.432728072524 | 2 287 | 244.516693368984 | 2 287 | 244.867435321636 | 2 287 | 245.093520281211 | 2 287 | 245.122918617753 | 2 287 | 245.158186023094 | 2 287 | 245.18250418663 | 2
That's a good start, but I'd like to get one value per percentile. And I guess those would be most useful if I had the MAX for ntiles <= 50 and the MIN for those > 50.
select check_id, CASE WHEN entile <= 50 THEN MAX(value) ELSE MIN(value) END as boundary, entile from ( select cm.check_id, cm.value, ntile(100) over (order by cm.value) as entile from monitoring_checkmeasurement cm inner join monitoring_sensorvariable sv on variable_id=sv.id where cm.check_id = 287 and sv.name = 'wr_ios' group by cm.check_id, cm.value ) as x group by x.check_id, x.entile order by x.entile;
This looks better:
check_id | boundary | entile ----------+------------------+-------- 287 | 242.528538355955 | 1 287 | 245.18250418663 | 2 287 | 247.452813643881 | 3 287 | 249.174154040999 | 4 287 | 250.819910098371 | 5 287 | 252.093448602465 | 6 287 | 253.222732199382 | 7 287 | 254.289464902005 | 8 287 | 255.301277856935 | 9 287 | 256.807463466736 | 10 287 | 257.886738494251 | 11 287 | 259.363337856962 | 12 287 | 260.591828244757 | 13 287 | 261.250520159227 | 14 287 | 262.297351865419 | 15 287 | 263.099028123455 | 16 287 | 263.982356107804 | 17 287 | 264.603170887436 | 18 287 | 265.187531126404 | 19 287 | 266.111760914008 | 20 287 | 267.20722187206 | 21 287 | 268.096699408997 | 22 287 | 268.853926667567 | 23 287 | 269.744325622614 | 24 287 | 270.188142168672 | 25 287 | 270.593971794008 | 26 287 | 271.236878900381 | 27 287 | 272.133271901325 | 28 287 | 272.613305290942 | 29 287 | 273.139119588124 | 30 287 | 273.941309947633 | 31 287 | 274.798961636146 | 32 287 | 275.440183189522 | 33 287 | 276.138383373647 | 34 287 | 276.611642523528 | 35 287 | 277.093934420074 | 36 287 | 277.987756333866 | 37 287 | 278.571849180595 | 38 287 | 279.254178302784 | 39 287 | 279.905348607647 | 40 287 | 280.309358977164 | 41 287 | 280.665632815261 | 42 287 | 281.240222774005 | 43 287 | 281.841771506664 | 44 287 | 282.310636865485 | 45 287 | 283.014585882153 | 46 287 | 283.835777648474 | 47 287 | 284.455286722166 | 48 287 | 285.137465659157 | 49 287 | 285.578659445801 | 50 287 | 285.581995639674 | 51 287 | 286.326444822576 | 52 287 | 286.688561129296 | 53 287 | 287.223072002403 | 54 287 | 287.919565845853 | 55 287 | 288.721280851317 | 56 287 | 289.432329385731 | 57 287 | 289.959411908295 | 58 287 | 290.651482471229 | 59 287 | 291.372769922591 | 60 287 | 291.842573577583 | 61 287 | 292.736031355903 | 62 287 | 293.187362152819 | 63 287 | 293.838025307673 | 64 287 | 294.57972141345 | 65 287 | 295.082963391352 | 66 287 | 296.24698891679 | 67 287 | 297.600830745376 | 68 287 | 298.87510821896 | 69 287 | 299.566450949587 | 70 287 | 300.56043720098 | 71 287 | 301.530900814087 | 72 287 | 302.510107864529 | 73 287 | 303.340573447187 | 74 287 | 304.723269421615 | 75 287 | 305.39145573103 | 76 287 | 306.51173792586 | 77 287 | 307.668492795808 | 78 287 | 308.466223822391 | 79 287 | 309.823047453998 | 80 287 | 311.146362389198 | 81 287 | 313.607204835349 | 82 287 | 314.9906630401 | 83 287 | 316.958319112472 | 84 287 | 318.777340519479 | 85 287 | 320.171519126339 | 86 287 | 322.255329479526 | 87 287 | 324.608642117519 | 88 287 | 327.670448175898 | 89 287 | 331.5988884139 | 90 287 | 337.78420647411 | 91 287 | 342.050233338103 | 92 287 | 349.403819865874 | 93 287 | 358.749356396326 | 94 287 | 376.297282027477 | 95 287 | 403.67843072412 | 96 287 | 422.926274104004 | 97 287 | 487.446437625305 | 98 287 | 605.442699965559 | 99 287 | 684.885164154152 | 100 (100 rows)
Btw, the average of all values is:
check_id | avg ----------+------------------ 287 | 301.871884282724 (1 row)
So, what do we learn from that?