
<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Linux &#8211; Conetix</title>
	<atom:link href="https://testing.conetix.com.au/blog/category/virtual-servers/linux/feed/" rel="self" type="application/rss+xml" />
	<link>https://testing.conetix.com.au</link>
	<description>Premier Web Hosting Provider</description>
	<lastBuildDate>Sat, 24 Oct 2020 09:21:36 +0000</lastBuildDate>
	<language>en-AU</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://testing.conetix.com.au/wp-content/uploads/favicon.png</url>
	<title>Linux &#8211; Conetix</title>
	<link>https://testing.conetix.com.au</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>CentOS 8: What&#8217;s New?</title>
		<link>https://testing.conetix.com.au/blog/centos-8-whats-new/</link>
		
		<dc:creator><![CDATA[Tim Butler]]></dc:creator>
		<pubDate>Fri, 14 Aug 2020 04:55:59 +0000</pubDate>
				<category><![CDATA[Hosting]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Virtual Servers]]></category>
		<category><![CDATA[centos]]></category>
		<category><![CDATA[centos 8]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[updates]]></category>
		<guid isPermaLink="false">https://conetix.com.au/?p=11274</guid>

					<description><![CDATA[Here at Conetix, we&#8217;re big fans of the Community Enterprise Operating System (CentOS) platform as it runs the majority of our web hosting infrastructure. The version stability and long term support for security updates (thanks also due to the Red Hat release cycles) make it an excellent base platform to ensure we have both a...  <a class="excerpt-read-more" href="https://testing.conetix.com.au/blog/centos-8-whats-new/" title="Read CentOS 8: What&#8217;s New?">Read more &#187;</a>]]></description>
										<content:encoded><![CDATA[
<p>Here at Conetix, we&#8217;re big fans of the Community Enterprise Operating System (CentOS) platform as it runs the majority of our web hosting infrastructure. </p>



<p>The version stability and long term support for security updates (thanks also due to the Red Hat release cycles) make it an excellent base platform to ensure we have both a secure and reliable system.</p>



<p>While nearly all of our systems have been using the release 7 (CentOS 7) variant, we&#8217;ve begun deploying newer systems to CentOS 8 now that some of the initial compatibility issues have been ironed out.</p>



<p>In this article, we&#8217;ll outline some of the new features rolled in and why they&#8217;re a great reason to use it for hosting.</p>





<h2 class="wp-block-heading">Updated software management</h2>



<p>For those used to using <em>YUM</em> (the package manager for Red Hat / CentOS), a new version is now available. Known as DNF (Dandified YUM), it features increased performance, greater flexibility through a far more detailed API, lower memory usage and dozens of other improvements.</p>



<p>Nearly all of the commands are exactly the same, and you can still run them as &#8220;yum&#8221; commands, as <em>yum</em> is now simply a symlink to the new &#8220;dnf&#8221; command. For example:</p>



<pre class="wp-block-code"><code>yum install httpd
dnf install httpd</code></pre>



<p>Both of these commands call exactly the same thing, so you can use the old commands while you transition. We&#8217;ve definitely seen the speed improvement when running updates so already it&#8217;s a great bonus!</p>
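<p>If you&#8217;d like to confirm the symlink for yourself, a minimal sketch along these lines should do it. The <em>resolve_command</em> helper is a hypothetical name of ours (it isn&#8217;t part of DNF), and it assumes a <em>readlink</em> which supports the <em>-f</em> flag, as the GNU coreutils version does:</p>

```shell
# Hypothetical helper: print the real file a command name resolves to,
# following any symlinks along the way.
resolve_command() {
    # Look the command up on the PATH; give up if it isn't installed.
    cmd_path="$(command -v "$1")" || return 1
    # Follow the symlink chain to the final target.
    readlink -f "$cmd_path"
}

# Example usage (run manually):
#   resolve_command yum
#   resolve_command dnf
```

<p>On a CentOS 8 system we&#8217;d expect both calls to print the same path, which shows that there&#8217;s only one program behind the two names.</p>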



<figure class="wp-block-table is-style-stripes"><table><tbody><tr><td class="has-text-align-right" data-align="right"><strong>Official Documentation: </strong></td><td><a href="https://dnf.readthedocs.io/en/latest/">https://dnf.readthedocs.io/en/latest/</a></td></tr></tbody></table></figure>



<h2 class="wp-block-heading">OpenSSL 1.1.1</h2>



<p>We&#8217;re lucky enough that the use of NGINX across our hosting platform allowed us to roll <a href="https://testing.conetix.com.au/support/tls-1-2-and-1-3-support/">TLS 1.3</a> out last year, but the inclusion of OpenSSL 1.1.1 in CentOS 8 means all services (including Apache and outbound PHP) can now use TLS 1.3 directly as well.</p>



<p>This allows for greater security and therefore greater protection of data against Man in the Middle (MITM) attacks.</p>
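<p>As a rough way to check whether a given server&#8217;s OpenSSL is new enough, the sketch below compares a version string against 1.1.1 (the first OpenSSL release with TLS 1.3 support). The <em>supports_tls13</em> function is a hypothetical helper of ours, it assumes a <em>sort</em> with the <em>-V</em> version-sort flag (GNU coreutils), and it only understands OpenSSL-style version numbers:</p>

```shell
# Hypothetical helper: succeed if the given OpenSSL version string
# (e.g. "1.1.1c") is at least 1.1.1.
supports_tls13() {
    # Version-sort the minimum and the supplied version together;
    # if 1.1.1 sorts first (or they are equal), the version is new enough.
    lowest="$(printf '%s\n%s\n' "1.1.1" "$1" | sort -V | head -n 1)"
    [ "$lowest" = "1.1.1" ]
}

# Example usage (run manually), assuming "openssl version" prints the
# version number as its second field:
#   supports_tls13 "$(openssl version | awk '{print $2}')"
```

<p>This only tells you what the library is capable of. Each service (Apache, PHP and so on) still needs to be configured to allow TLS 1.3.</p>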



<h2 class="wp-block-heading">PHP 7.2 </h2>



<p>While we use Plesk to provide updated PHP versions (including <a href="https://testing.conetix.com.au/support/what-versions-php-does-conetix-support/">PHP 7.4</a>), having PHP 7.2 as the base version allows for easier compatibility with systems such as <a href="https://wp-cli.org/">WP-CLI</a> by default. </p>



<h2 class="wp-block-heading">MariaDB 10.3</h2>



<p>Like PHP, Conetix has always run the latest compatible MariaDB version (10.3) on our systems, but this required replacing system defaults with externally packaged ones (from MariaDB themselves).</p>



<p>Having this now as the distribution default removes one extra level of management and potential update risk as there&#8217;s more testing involved with the system defaults.</p>



<h2 class="wp-block-heading">Updated Kernel</h2>



<p>As CentOS 8 is based on the 4.18 Linux kernel, it incorporates all of the additional features and improvements made since 3.10 (CentOS 7). While some of these features have been backported to CentOS 7, having a more up-to-date baseline allows for greater expansion in the future as well.</p>



<p>While the individual feature changes are too numerous to list here (a summary is available <a href="https://kernelnewbies.org/Linux_4.18">here</a> for those really interested!), some of the key stats will give an idea on the level of change:</p>



<ul class="wp-block-list"><li>12,879 changes</li><li>1,668 developers</li><li>553,000 lines of code added</li><li>652,000 lines of code removed </li></ul>



<p><em>Source: <a href="https://lwn.net/Articles/760690/">https://lwn.net/Articles/760690/</a></em></p>



<h2 class="wp-block-heading">Increased support timeline</h2>



<p>CentOS 8 will receive full updates until 2024 with security (and critical bugs) updates until 2029. Here&#8217;s a table to compare:</p>



<figure class="wp-block-table"><table><thead><tr><th>Distro</th><th>Full Updates</th><th>Security Updates</th></tr></thead><tbody><tr><td>CentOS 6</td><td>May 2017</td><td>November 2020</td></tr><tr><td>CentOS 7</td><td>August 2020</td><td>June 2024</td></tr><tr><td>CentOS 8</td><td>May 2024</td><td>May 2029</td></tr></tbody></table></figure>



<p>This means that if you install a CentOS 8 system today, you&#8217;ll be receiving security patches all the way until May 2029!</p>



<h2 class="wp-block-heading">Many other updates</h2>



<p>If you&#8217;re using CentOS 8 in a non-hosting environment then of course there are hundreds of other features (including an updated GUI etc) which may be applicable to you as well.</p>



<p>For a detailed overview of the new features and changes, please check out the <a href="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/8.2_release_notes/overview">Red Hat Enterprise Linux 8 Release Notes</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>That one odd sheep: A tale of a bad kernel update</title>
		<link>https://testing.conetix.com.au/blog/that-one-odd-sheep-a-tale-of-a-bad-kernel-update/</link>
		
		<dc:creator><![CDATA[Tim Butler]]></dc:creator>
		<pubDate>Thu, 23 Jan 2020 03:20:00 +0000</pubDate>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Virtual Servers]]></category>
		<category><![CDATA[centos]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[kernel]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[servers]]></category>
		<category><![CDATA[updates]]></category>
		<guid isPermaLink="false">https://conetix.com.au/?p=9551</guid>

					<description><![CDATA[Across thousands of systems set up in the exact same manner, every now and then there will be one which doesn’t quite look like the others and doesn’t behave like the others. In some instances, it’s easiest to destroy and start again but in others, these outliers represent an opportunity to work out what went wrong and whether improvements can be implemented to prevent future issues.]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading"><strong>Overview</strong></h2>



<p>When managing large numbers of servers, we rely on standardisation and orchestration tools to be able to reliably replicate new systems and manage all the others. This is the classic “cattle vs pets” analogy, where cattle are the systems that simply perform their duty and, if there are any issues, you can destroy and re-create them. Pets, on the other hand, are the servers you love and care for, with hand-crafted configurations, and having to rebuild one will cause you lots of tears.</p>



<p>Using SaltStack as our orchestration tool allows us to have known states for servers, as well as installation and build procedures where the same result is repeated over and over without any human intervention or the human error of a mistyped command.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>And this usually works, until you find one. odd. Sheep.&nbsp;</p></blockquote>



<p>Across thousands of systems set up in the exact same manner, every now and then there will be one which doesn’t quite look like the others and doesn’t behave like the others. In some instances, it’s easiest to destroy and start again but in others, these outliers represent an opportunity to work out what went wrong and whether improvements can be implemented to prevent future issues.</p>



<h2 class="wp-block-heading"><strong>A failed kernel update</strong></h2>



<p>This particular odd sheep was a CentOS 7 based Virtual Machine (VM), running the latest 7.7 release, with no abnormal changes compared to all other systems. The build process was the same as that used for the others, using <a href="https://www.saltstack.com/">SaltStack</a> to build the system, deploy apps and manage ongoing changes to ensure consistency across the fleet.</p>



<p>After an update and reboot, this particular VM didn’t return, instead dropping to a <a href="https://fedoramagazine.org/initramfs-dracut-and-the-dracut-emergency-shell/">Dracut Emergency Shell</a>. Odd, as this was the first time I’d ever encountered the dracut emergency shell and I hadn’t had to deal with a Linux VM failing to boot in many, many years.&nbsp;</p>



<figure class="wp-block-image"><img decoding="async" src="https://lh4.googleusercontent.com/kSYNA6KWr8SPT_R3xo2QtxXHj7XTfkkLFX4k5smVwMSF52DwVl1JaI5kBoLkXtHTwoJg6Erucsv_P6-lL8jMl9--xQoeZ1qeVAe1saTIoUY-Vv0z4JRvRRaSgL47xj5xJncmvPcl" alt=""/></figure>



<p>This wasn’t good. The initial warnings indicated that it couldn’t find the root or swap partitions. As these were Logical Volume Management (LVM) based and quite standard, it was very odd that they simply couldn’t be found. Was it disk corruption? Were they marked inactive by mistake?&nbsp;</p>



<p>The first step was to have a look at what LVM was reporting:</p>



<pre class="wp-block-preformatted prettyprint">lvm lvdisplay</pre>



<p>No results. That’s odd, it’s not seeing any logical volumes at all. The next step was to take a look at the physical volumes:</p>



<pre class="wp-block-preformatted prettyprint">lvm pvdisplay</pre>



<p>No results again. At this point it’s starting to dawn on me that this wasn’t going to be a quick fix.&nbsp;</p>



<p>Ok, last resort: let’s simply see all of the drives which have been presented to the OS:</p>



<pre class="wp-block-preformatted prettyprint">blkid</pre>



<p>No results. blkid simply lists low-level block device metadata, so the fact that it couldn’t return any data was highly concerning. It meant that the VM itself wasn’t seeing any disks after the initial ramfs (initramfs) had been loaded.</p>



<h2 class="wp-block-heading"><strong>What is initramfs?&nbsp;</strong></h2>



<p>While it’s greatly simplifying things, <a href="https://en.wikipedia.org/wiki/Initial_ramdisk">initramfs</a> is the successor to the initrd system, providing the minimum possible services to allow the root file system to be accessed and for the boot to continue.&nbsp;</p>



<p>Put crudely, the initramfs resolves the classic chicken and egg problem. To mount the root filesystem you need drivers, but those drivers are located on … the root filesystem.&nbsp;</p>



<p>In order to get around this issue, there’s a tiny little preloaded set of drivers (initramfs) so that the system can at least mount the root filesystem to load more detailed drivers and continue the boot.</p>



<p>For this broken instance, clearly there was a failure in these drivers as it couldn’t load the filesystem!</p>



<p>As both grub and the initial boot at least loaded, it was evident that the VM had a disk attached and that this wasn’t a hypervisor fault at play.</p>



<h2 class="wp-block-heading"><strong>Next step &#8211; Rescue Mode</strong></h2>



<p>Out of the box CentOS keeps the last two kernels installed so that you have the ability to fall-back in case of error &#8230; which is exactly the situation we were in. However, in this instance the previous kernel exhibited exactly the same fault.</p>



<p>Thankfully, CentOS 7 has a basic <a href="https://docs.centos.org/en-US/centos/install-guide/Rescue_Mode/">rescue mode</a> built in. This allowed us to boot the system up using a slightly older kernel version and it loaded without issue. At least this gave me some hope, especially since it validated that the filesystem itself was ok and that the system wasn&#8217;t a complete write-off.&nbsp;</p>



<p>To create a bit of further confusion, the rescue mode <em>also</em> uses initramfs and had no issues. Why did it work, yet not the other two kernels? It was becoming a case of finding more questions at each step rather than answers!</p>



<h2 class="wp-block-heading"><strong>Rebuilding initramfs</strong></h2>



<p>If my assumption was correct that there were drivers missing from the initramfs, then the first step would be to rebuild it. For nearly all modern Linux systems, this is handled via <a href="https://en.wikipedia.org/wiki/Dracut_(software)">Dracut</a>.&nbsp;</p>



<p>Referencing the CentOS 7 <a href="https://wiki.centos.org/TipsAndTricks/CreateNewInitrd">Tips and Tricks Wiki</a>, we can force a rebuild using a single line:</p>



<pre class="wp-block-preformatted prettyprint">dracut -f /boot/initramfs-3.10.0-1062.4.1.el7.x86_64.img</pre>



<p>This runs through and finds the required drivers to get root mounted (or at least it should!) and simply returns once complete. Hoping for an easy win, I rebooted the VM only to discover that the same fault existed. Back to the rescue mode again!</p>



<p>So that I could see more output, this time I used the -v flag to produce verbose output:</p>



<pre class="wp-block-preformatted prettyprint">dracut -v -f /boot/initramfs-3.10.0-1062.4.1.el7.x86_64.img</pre>



<p>The output showed each of the dracut modules determining whether or not it needed to be included. Everything in the process was showing clean results, skipping where not required or including where things were required. Again, no easy wins here to find a simple show-stopper!</p>



<p>Knowing that dracut does a hardware detection in order to optimise what to load, my next step was to ensure that it was picking up the underlying hypervisor (<a href="https://www.redhat.com/en/topics/virtualization/what-is-KVM">KVM</a>) correctly. A simple command (also used by dracut) to run is:</p>



<pre class="wp-block-preformatted prettyprint">systemd-detect-virt</pre>



<p>This detected kvm correctly, so it should have known the drivers to load. Just in case there was some other odd mismatch, I also tried the <em>-N</em> flag to disable the Host-Only mode which should have added all additional drivers.</p>



<p><strong>No luck.</strong></p>



<p>In case there was some rare package corruption or the kernel itself had somehow failed to install cleanly, I flushed the yum cache and reinstalled the kernel before proceeding:</p>



<pre class="wp-block-preformatted prettyprint">yum clean all<br>yum reinstall kernel-devel<br>yum reinstall kernel</pre>



<p>Still no luck, the rabbit hole was simply getting deeper and deeper without an end in sight.</p>



<h2 class="wp-block-heading"><strong>Determining the drivers required</strong></h2>



<p>To verify what drivers have actually been included, I scanned the initramfs image using:</p>



<pre class="wp-block-preformatted prettyprint">lsinitrd /boot/initramfs-3.10.0-1062.4.1.el7.x86_64.img</pre>



<p>This gives a very verbose output of the files and modules contained within the image, allowing us to see exactly what was bundled in. For example, we can see the library files for LVM loaded:&nbsp;</p>



<pre class="wp-block-preformatted prettyprint">-r-xr-xr-x &nbsp; 1 root     root&nbsp; &nbsp; &nbsp; &nbsp; 11328 Nov  4 12:23 usr/lib64/device-mapper/libdevmapper-event-lvm2mirror.so 
-r-xr-xr-x &nbsp; 1 root     root&nbsp; &nbsp; &nbsp; &nbsp; 15664 Nov  4 12:23 usr/lib64/device-mapper/libdevmapper-event-lvm2thin.so 
-r-xr-xr-x &nbsp; 1 root     root&nbsp; &nbsp; &nbsp; &nbsp; 15640 Nov  4 12:23 usr/lib64/device-mapper/libdevmapper-event-lvm2vdo.so </pre>



<p>However, KVM uses the virtio drivers and they were nowhere to be seen. To confirm what was different, I then inspected the rescue initramfs image:&nbsp;</p>



<pre class="wp-block-preformatted prettyprint">lsinitrd /boot/initramfs-0-rescue-5e8eb9af2347493a99c3b0b496485b3d.img | grep virtio</pre>



<p>The virtio drivers were present:</p>



<pre class="wp-block-preformatted prettyprint">-rw-r--r-- &nbsp; 1 root     root &nbsp; &nbsp; &nbsp; &nbsp; 7744 Apr 21  2018 usr/lib/modules/3.10.0-862.el7.x86_64/kernel/drivers/block/virtio_blk.ko.xz 
-rw-r--r-- &nbsp; 1 root     root&nbsp; &nbsp; &nbsp; &nbsp; 12944 Apr 21  2018 usr/lib/modules/3.10.0-862.el7.x86_64/kernel/drivers/char/virtio_console.ko.xz 
drwxr-xr-x &nbsp; 2 root     root&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 0 Oct 21 15:53 usr/lib/modules/3.10.0-862.el7.x86_64/kernel/drivers/gpu/drm/virtio 
-rw-r--r-- &nbsp; 1 root     root&nbsp; &nbsp; &nbsp; &nbsp; 23260 Apr 21  2018 usr/lib/modules/3.10.0-862.el7.x86_64/kernel/drivers/gpu/drm/virtio/virtio-gpu.ko.xz 
-rw-r--r-- &nbsp; 1 root     root&nbsp; &nbsp; &nbsp; &nbsp; 14296 Apr 21  2018 usr/lib/modules/3.10.0-862.el7.x86_64/kernel/drivers/net/virtio_net.ko.xz 
-rw-r--r-- &nbsp; 1 root     root &nbsp; &nbsp; &nbsp; &nbsp; 8176 Apr 21  2018 usr/lib/modules/3.10.0-862.el7.x86_64/kernel/drivers/scsi/virtio_scsi.ko.xz 
drwxr-xr-x &nbsp; 2 root     root&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 0 Oct 21 15:53 usr/lib/modules/3.10.0-862.el7.x86_64/kernel/drivers/virtio 
-rw-r--r-- &nbsp; 1 root     root &nbsp; &nbsp; &nbsp; &nbsp; 4556 Apr 21  2018 usr/lib/modules/3.10.0-862.el7.x86_64/kernel/drivers/virtio/virtio.ko.xz 
-rw-r--r-- &nbsp; 1 root     root &nbsp; &nbsp; &nbsp; &nbsp; 9664 Apr 21  2018 usr/lib/modules/3.10.0-862.el7.x86_64/kernel/drivers/virtio/virtio_pci.ko.xz 
-rw-r--r-- &nbsp; 1 root     root &nbsp; &nbsp; &nbsp; &nbsp; 8280 Apr 21  2018 usr/lib/modules/3.10.0-862.el7.x86_64/kernel/drivers/virtio/virtio_ring.ko.xz </pre>



<p>My question immediately was &#8220;why were they missing? Why weren&#8217;t they included, even when disabling the host only mode for dracut?&#8221;</p>



<p>Firstly, let’s verify they exist for the new kernel version:</p>



<pre class="wp-block-preformatted prettyprint">ls -lah /usr/lib/modules/3.10.0-1062.4.1.el7.x86_64/kernel/drivers/block/virtio_blk.ko.xz
-rw-r--r--. 1 root root 7.7K Oct 19 03:29 /usr/lib/modules/3.10.0-1062.4.1.el7.x86_64/kernel/drivers/block/virtio_blk.ko.xz</pre>



<p>Each file existed for the new kernel, so there was no reason why dracut shouldn’t be including them.</p>
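<p>Rather than checking one file at a time, a small sketch like the following can list every virtio module installed for a kernel. The <em>list_virtio_modules</em> function is a hypothetical helper (not part of dracut or kmod) and simply wraps <em>find</em>:</p>

```shell
# Hypothetical helper: list every virtio kernel module file found
# underneath the given modules directory, sorted for easy comparison.
list_virtio_modules() {
    # Matches compressed and uncompressed modules, e.g. virtio_blk.ko.xz
    find "$1" -type f -name 'virtio*.ko*' | sort
}

# Example usage (run manually):
#   list_virtio_modules /usr/lib/modules/3.10.0-1062.4.1.el7.x86_64
```

<p>The sorted output makes it easier to compare the modules on disk against what <em>lsinitrd</em> reports for the image.</p>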



<h2 class="wp-block-heading"><strong>Further issues present themselves</strong></h2>



<p>While I was focused on the KVM drivers themselves, I completely overlooked the fact that <em><strong>all</strong></em> kernel drivers were missing. This was a classic case of looking for a needle in a haystack while using tools to filter results and not seeing that the rest of the haystack was <em>missing</em>.</p>



<p>Looking at it again, I ran a simple file count between the two images:</p>



<pre class="wp-block-preformatted prettyprint">lsinitrd /boot/initramfs-3.10.0-1062.4.1.el7.x86_64.img | grep "usr/lib/modules/" | wc -l</pre>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p><strong>Result: </strong>12</p></blockquote>



<pre class="wp-block-preformatted prettyprint">lsinitrd /boot/initramfs-0-rescue-5e8eb9af2347493a99c3b0b496485b3d.img | grep "usr/lib/modules/" | wc -l</pre>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p><strong>Result: </strong>816</p></blockquote>



<p>Wowsers. While the rescue kernel should contain more drivers as it’s non-host specific, it’s clear that 800+ files vs 12 shows there&#8217;s a definite problem.</p>
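<p>The comparison above could be turned into a rough sanity check to run after each rebuild. This is only a sketch: <em>count_module_entries</em> and <em>has_enough_modules</em> are hypothetical helper names of ours, and the right threshold will depend on your own systems:</p>

```shell
# Hypothetical helper: count the lines on stdin which refer to files
# under usr/lib/modules/ (the same filter used in the commands above).
count_module_entries() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so mask the exit code to keep the function usable in scripts.
    grep -c "usr/lib/modules/" || true
}

# Hypothetical helper: succeed only if the listing on stdin contains
# at least the given number of module entries.
has_enough_modules() {
    minimum="$1"
    count="$(count_module_entries)"
    [ "$count" -ge "$minimum" ]
}

# Example usage (run manually), with 100 as an arbitrary threshold:
#   lsinitrd /boot/initramfs-3.10.0-1062.4.1.el7.x86_64.img | has_enough_modules 100
```

<p>With the numbers we saw, the broken image (12 entries) would fail a threshold of 100 while the rescue image (816 entries) would pass.</p>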



<p>To get a bit more info from the initial build, I copied the image to a temp directory, then extracted the files:</p>



<pre class="wp-block-preformatted prettyprint">/usr/lib/dracut/skipcpio initramfs-3.10.0-1062.4.1.el7.x86_64.img&nbsp; | zcat | cpio -ivd</pre>



<p>This allowed me to see which kernel modules dracut thought it was loading by reading the build file:</p>



<pre class="wp-block-preformatted prettyprint">less usr/lib/dracut/loaded-kernel-modules.txt</pre>



<p>Of course, there were 53 modules in the list, including the expected virtio drivers:</p>



<pre class="wp-block-preformatted prettyprint">ablk_helper
aesni_intel
ata_generic
ata_piix
bochs_drm
cdrom
crc32_pclmul
crc32c_intel
crc_t10dif
crct10dif_common
crct10dif_generic
crct10dif_pclmul
cryptd
dm_log
dm_mirror
dm_mod
dm_region_hash
drm
drm_kms_helper
e1000
fb_sys_fops
floppy
gf128mul
ghash_clmulni_intel
glue_helper
i2c_core
i2c_piix4
iosf_mbi
ip_tables
isofs
joydev
libata
libcrc32c
lrw
parport
parport_pc
pata_acpi
pcspkr
ppdev
sd_mod
serio_raw
sr_mod
syscopyarea
sysfillrect
sysimgblt
ttm
virtio
virtio_balloon
virtio_console
virtio_pci
virtio_ring
virtio_scsi
xfs</pre>



<p>Trawling through the extracted files, I found one other file which stood out as being zero bytes:</p>



<pre class="wp-block-preformatted prettyprint">ls -lah /tmp/extracted-initramfs/usr/lib/modules/3.10.0-1062.4.1.el7.x86_64/modules.dep 
-rw-------. 1 root root 0 Nov&nbsp; 4 15:54 usr/lib/modules/3.10.0-1062.4.1.el7.x86_64/modules.dep</pre>



<p>The <a href="https://linux.die.net/man/5/modules.dep"><em>modules.dep</em></a> file is a list of all the kernel module dependencies, generated so that modprobe commands and similar can determine what modules need to be loaded first. Looking at the source for the <a href="https://github.com/dracutdevs/dracut/blob/master/modules.d/90kernel-modules/module-setup.sh">dracut kernel module</a>, I could see that it references a locally generated <em>modules.dep</em> to determine file inclusions and, with that file being blank, it simply wasn’t including any files!</p>
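<p>Had I known to look for it, a quick search for empty <em>modules.dep</em> files would have found this much sooner. A minimal sketch, where <em>find_empty_modules_dep</em> is a hypothetical helper name and the directory is wherever you extracted the image:</p>

```shell
# Hypothetical helper: print the path of every zero-byte modules.dep
# file found underneath the given directory.
find_empty_modules_dep() {
    # -size 0 only matches files with no content at all
    find "$1" -type f -name modules.dep -size 0
}

# Example usage (run manually):
#   find_empty_modules_dep /tmp/extracted-initramfs
```

<p>Any path printed is a <em>modules.dep</em> with no content, which is exactly the symptom we had here.</p>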



<p>Finally, at least we’re starting to narrow in on the cause.</p>



<h2 class="wp-block-heading"><strong>Finding the root cause</strong></h2>



<p>I first checked that the kernel itself had a clean module dependency file, which at 266K is clearly far from blank:&nbsp;</p>



<pre class="wp-block-preformatted prettyprint">ls -lah /usr/lib/modules/3.10.0-1062.4.1.el7.x86_64/modules.dep 
-rw-r--r--. 1 root root 266K Nov&nbsp; 1 12:13 /usr/lib/modules/3.10.0-1062.4.1.el7.x86_64/modules.dep</pre>



<p>At least from a main kernel perspective, it was able to build and generate a list of dependencies.</p>



<p>Our next steps were to determine how dracut performs the module dependency check and <em>why</em> this was failing.</p>



<p>Again, to eliminate any weird install issues I reinstalled dracut:</p>



<pre class="wp-block-preformatted">yum reinstall dracut</pre>



<p>The cycle of fun and repeating of the same error continued. As there’s a dozen or so other packages installed which could be at fault, I installed an additional plugin for <strong>yum</strong> to verify installed packages and their integrity. This was done via:</p>



<pre class="wp-block-preformatted prettyprint">yum install yum-plugin-verify</pre>



<p>To run, you can then simply call:</p>



<pre class="wp-block-preformatted prettyprint">yum verify</pre>



<p>Depending on the size of your installation and the speed of your server, this may take several minutes to complete and will provide a verbose output of any file which doesn’t match the installed package integrity. While this found a few faults, they had nothing to do with the kernel modules and were simply changed timestamps on a few Apache directories.</p>



<p>Out of absolute desperation and despite an hour of Googling previously, I ran one more search, for “initramfs has no modules”, and found this result: <a href="https://stackoverflow.com/questions/53607020/initramfs-has-no-modules/53671006#53671006">https://stackoverflow.com/questions/53607020/initramfs-has-no-modules/53671006#53671006</a></p>



<p>This was for a completely different distro, but at this stage anything was worth a try. At this point I was happy to accept any miracle, no matter how vague.</p>



<p>Reinstallation of kmod (which handles the management of the Kernel modules) was as simple as:</p>



<pre class="wp-block-preformatted prettyprint">yum reinstall kmod</pre>



<p>Dracut was of course called to rebuild and all fingers, toes and any other objects I could find were all crossed.&nbsp;</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p><strong>Success.&nbsp;</strong></p></blockquote>



<p>I couldn’t believe it. We now had 819 kernel modules within the image, in line with exactly what was expected. Just so that I could confirm a standard upgrade was going to work, the simplest way was to reinstall the kernel again:</p>



<pre class="wp-block-preformatted prettyprint">yum reinstall kernel</pre>



<p>Rebooting the VM into the new kernel confirmed it was working exactly as expected, just like hundreds of others with the exact same build and environment. The sheep had returned to the herd.</p>



<h2 class="wp-block-heading"><strong>Conclusion</strong></h2>



<p>For those who have made it to the end without nodding off, the obvious question here is why didn’t we just restore from a backup?&nbsp;</p>



<p>The simple answer is because of the delay between the updates being applied (and the new kernel) and the actual reboot of the VM. Without knowing the root cause, the recovery point where it would work could be days or even weeks back and certainly create a significant delta when it comes to changed data on the VM.</p>



<p>The desire to find the fault and resolve it of course meant that I simply couldn’t leave it alone anyway. Like many in the IT field, the desire to solve the puzzle sometimes overrides the more cost-effective business logic of simply throwing it away and building it again.&nbsp;</p>



<p>While I have a better understanding of the initramfs build process and system now and managed to fix the problem, the simple underlying cause of <em><strong>why</strong> </em>or where the kmod package had failed still is and will remain a mystery. </p>



<p><em>Main Photo by&nbsp;<a href="https://unsplash.com/@daanstevens?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Daan Stevens</a>&nbsp;on&nbsp;<a href="https://unsplash.com/s/photos/herd-sheep?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></em></p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
