<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>why = WHY_NOT; (Posts about storage)</title><link>https://blog.svedr.in/</link><description></description><atom:link href="https://blog.svedr.in/categories/storage.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><lastBuildDate>Sun, 15 Feb 2026 11:29:07 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Ceph Bluestore/Filestore latency</title><link>https://blog.svedr.in/posts/ceph-BluestoreFilestore-latency/</link><dc:creator>Svedrin</dc:creator><description>&lt;div&gt;&lt;p&gt;We’ve been looking deeply into Ceph Storage latency, comparing
BlueStore and FileStore, and looking at ways to get below the
magic 2ms write latency mark in our Proxmox clusters. Here’s what we
found.&lt;/p&gt;
&lt;p&gt;The endeavour was sparked by our desire to run ZooKeeper on our
Proxmox Clusters. ZooKeeper is highly sensitive to IO latency: If writes
are too slow, it will log messages like this one:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;fsync-ing the write ahead log in SyncThread:1 took 1376ms which will adversely effect operation latency.File size is 67108880 bytes. See the ZooKeeper troubleshooting guide&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Subsequently, ZooKeeper nodes will consider themselves broken and
restart. If the thing that’s slow is your Ceph cluster, this means that
all three VMs will be affected at the same time, and you’ll end up
losing your ZooKeeper cluster altogether.&lt;/p&gt;
&lt;p&gt;We mitigated this by moving ZooKeeper to local disks, and getting rid
of the Ceph layer in between. But that is obviously not a satisfactory
solution, so we’ve spent some time looking into Ceph latency.&lt;/p&gt;
&lt;p&gt;Unfortunately, there’s not a lot of advice to be found other than
“buy faster disks”. This didn’t seem to cut it for us: Our hosts were
reporting 0.1ms of disk latency, while the VMs measured 2ms of latency.
If our hosts had weighed in at 1.8ms, I’d be willing to believe that we
have a disk latency issue - but not with the discrepancy that we were
seeing. So let’s dive in and see if we can find other issues.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.svedr.in/posts/ceph-BluestoreFilestore-latency/"&gt;Read more…&lt;/a&gt; (11 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>linux</category><category>storage</category><category>virtualization</category><guid>https://blog.svedr.in/posts/ceph-BluestoreFilestore-latency/</guid><pubDate>Mon, 01 Feb 2021 10:08:06 GMT</pubDate></item><item><title>Speeding up Ceph recovery</title><link>https://blog.svedr.in/posts/speeding-up-ceph-recovery/</link><dc:creator>Svedrin</dc:creator><description>&lt;p&gt;Note to self: Here’s the command to speed up Ceph recovery by
backfilling more than one PG at a time:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ceph tell osd.* injectargs '--osd_max_backfills 16'&lt;/code&gt;&lt;/pre&gt;</description><category>storage</category><guid>https://blog.svedr.in/posts/speeding-up-ceph-recovery/</guid><pubDate>Mon, 07 Jan 2019 11:44:04 GMT</pubDate></item><item><title>Filesystem Tuning</title><link>https://blog.svedr.in/posts/filesystem-tuning/</link><dc:creator>Svedrin</dc:creator><description>&lt;div&gt;&lt;p&gt;&lt;a href="https://blog.svedr.in/stories/storage-performance/"&gt;A while back&lt;/a&gt;, I
promised I’d write about file system tuning someday. Since it hasn’t
really happened yet, I thought I’d do it now.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.svedr.in/posts/filesystem-tuning/"&gt;Read more…&lt;/a&gt; (3 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>linux</category><category>storage</category><guid>https://blog.svedr.in/posts/filesystem-tuning/</guid><pubDate>Tue, 03 Jul 2018 20:14:53 GMT</pubDate></item><item><title>Setting up Ceph FS on a Proxmox cluster</title><link>https://blog.svedr.in/posts/setting-up-ceph-fs-on-a-proxmox-cluster/</link><dc:creator>Svedrin</dc:creator><description>&lt;div&gt;&lt;p&gt;Proxmox apparently does not yet support running CephFS, but it can be
done using a bunch of manual steps. Here’s how.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.svedr.in/posts/setting-up-ceph-fs-on-a-proxmox-cluster/"&gt;Read more…&lt;/a&gt; (2 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>linux</category><category>storage</category><category>virtualization</category><guid>https://blog.svedr.in/posts/setting-up-ceph-fs-on-a-proxmox-cluster/</guid><pubDate>Mon, 02 Jul 2018 12:12:38 GMT</pubDate></item><item><title>Manually creating a Ceph OSD</title><link>https://blog.svedr.in/posts/manually-creating-a-ceph-osd/</link><dc:creator>Svedrin</dc:creator><description>&lt;div&gt;&lt;p&gt;When setting up the &lt;a class="reference external" href="https://pve.proxmox.com/wiki/Ceph_Server"&gt;Ceph Server&lt;/a&gt; scenario for Proxmox, the PVE guide suggests using the
&lt;a class="reference external" href="https://pve.proxmox.com/wiki/Ceph_Server#Creating_Ceph_OSDs"&gt;pveceph createosd&lt;/a&gt; command for creating OSDs. Unfortunately, this command
assumes that you want to dedicate a complete hard drive to your OSD and format it using ZFS. I tend to disagree: Not only do I prefer RAIDs
because their caches reduce latency, I also always have LVM in between so that I'm flexible with the disk space allocation. And I'm not
really a huge fan of ZFS &lt;a class="reference external" href="https://blog.svedr.in/posts/resilvering-a-zfsonlinux-disk.html"&gt;ever since it bit me&lt;/a&gt;, although they have
&lt;a class="reference external" href="https://github.com/zfsonlinux/zfs/issues/3625"&gt;fixed that issue&lt;/a&gt; by now. Still, I'm staying with my trusty XFS.&lt;/p&gt;
&lt;p&gt;That of course means that I'll have to create my OSDs differently because &lt;cite&gt;pveceph createosd&lt;/cite&gt; isn't going to work. Here's how I do it.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.svedr.in/posts/manually-creating-a-ceph-osd/"&gt;Read more…&lt;/a&gt; (1 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>storage</category><category>virtualization</category><guid>https://blog.svedr.in/posts/manually-creating-a-ceph-osd/</guid><pubDate>Wed, 24 Jan 2018 15:44:15 GMT</pubDate></item><item><title>Ceph CRUSH map with multiple storage tiers</title><link>https://blog.svedr.in/posts/ceph-crush-map-with-multiple-storage-tiers/</link><dc:creator>Svedrin</dc:creator><description>&lt;div&gt;&lt;p&gt;At work, we're running a virtualization server that has two kinds of storage built-in: An array of fast SAS disks, and another one of
slow-but-huge SATA disks. We're running OSDs on both of them, and I wanted to distinguish between them when creating RBD images, so that
I could choose the performance characteristics of the pool. I'm not sure whether this post is outdated by now (Jan 2018), since there's a "class"
thing in the CRUSH map all of a sudden. However, here's what we're currently running.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.svedr.in/posts/ceph-crush-map-with-multiple-storage-tiers/"&gt;Read more…&lt;/a&gt; (2 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>storage</category><guid>https://blog.svedr.in/posts/ceph-crush-map-with-multiple-storage-tiers/</guid><pubDate>Wed, 24 Jan 2018 15:40:01 GMT</pubDate></item><item><title>Locating dying disks in LSI RAID using StorCLI</title><link>https://blog.svedr.in/posts/locating-dying-disks-in-lsi-raid-using-storcli/</link><dc:creator>Svedrin</dc:creator><description>&lt;div&gt;&lt;p&gt;I often find myself in need of locating disks in an LSI RAID that are not quite dead yet, but in the process of dying. &lt;a class="reference external" href="https://www.google.de/#q=storcli+%22critical+disks%22"&gt;Google&lt;/a&gt; knows how to do that using MegaCli, but I totally &lt;strong&gt;hate&lt;/strong&gt; that tool and want to do the same thing using storcli instead, which is a bit less insane. Here's how.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.svedr.in/posts/locating-dying-disks-in-lsi-raid-using-storcli/"&gt;Read more…&lt;/a&gt; (2 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>storage</category><guid>https://blog.svedr.in/posts/locating-dying-disks-in-lsi-raid-using-storcli/</guid><pubDate>Wed, 12 Apr 2017 11:14:31 GMT</pubDate></item><item><title>Storage fun with Steam</title><link>https://blog.svedr.in/posts/storage-fun-with-steam/</link><dc:creator>Svedrin</dc:creator><description>&lt;div&gt;&lt;p&gt;Yesterday evening, I enjoyed a nice game of Dishonored 2. After dealing with the Crown Killer in a non-lethal and somewhat stealthy way, I shut off my PC, went to sleep, and set out to continue my endeavour tonight. When I started my PC and fired up Dishonored again, my PC completely froze. I hit the reset button, started Task Manager before starting Dishonored, and I discovered that Steam chose to completely smash my hard drive to pieces. Here's what I saw.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.svedr.in/posts/storage-fun-with-steam/"&gt;Read more…&lt;/a&gt; (2 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>storage</category><guid>https://blog.svedr.in/posts/storage-fun-with-steam/</guid><pubDate>Tue, 22 Nov 2016 18:29:31 GMT</pubDate></item><item><title>Ceph CRUSH map editing script</title><link>https://blog.svedr.in/posts/ceph-crush-map-editing-script/</link><dc:creator>Svedrin</dc:creator><description>&lt;div&gt;&lt;p&gt;If you're working with Ceph, you'll find yourself updating the CRUSH map sooner or later. For that, you regularly need to get the current map, decompile it, edit it, compile it, and upload it again. Here's a little script that makes this easier.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.svedr.in/posts/ceph-crush-map-editing-script/"&gt;Read more…&lt;/a&gt; (1 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>linux</category><category>storage</category><guid>https://blog.svedr.in/posts/ceph-crush-map-editing-script/</guid><pubDate>Wed, 13 Jul 2016 14:43:50 GMT</pubDate></item><item><title>Disk alignment and caching</title><link>https://blog.svedr.in/posts/disk-alignment-and-caching/</link><dc:creator>Svedrin</dc:creator><description>&lt;div&gt;&lt;p&gt;So now that we've
&lt;a class="reference external" href="https://blog.svedr.in/posts/measuring-storage-performance/"&gt;conducted measurements&lt;/a&gt;
and &lt;a class="reference external" href="https://blog.svedr.in/posts/how-to-run-relevant-benchmarks/"&gt;run benchmarks&lt;/a&gt;,
what do we do with the results? How does the system need to be built
to deliver good performance? What options do we have?&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.svedr.in/posts/disk-alignment-and-caching/"&gt;Read more…&lt;/a&gt; (8 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>linux</category><category>storage</category><guid>https://blog.svedr.in/posts/disk-alignment-and-caching/</guid><pubDate>Sun, 29 Nov 2015 19:29:41 GMT</pubDate></item></channel></rss>