Two days ago I corrupted my file system during a failed resume from standby on Fedora 19. This feature has never quite worked correctly and randomly makes the kernel panic. Usually, I hard reboot my laptop and everything is fine but that time, something went wrong and when it came back up:
systemd-fsck[605]: /dev/sda2: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
systemd-fsck[605]: (i.e., without -a or -p options)
[ 13.652068] systemd-fsck[605]: fsck failed with error code 4.
Welcome to emergency mode. Use "systemctl default" or ^D to activate default mode.
Give root password for maintenance
(or type Control-D to continue):
In this case /dev/sda2 is my root partition and since it was mounted even in maintenance mode, attempting to run fsck on it would output:
fsck.ext4 /dev/sda2
e2fsck 1.42.7 (21-Jan-2013)
/dev/sda2 is mounted.
e2fsck: Cannot continue, aborting.
Which makes sense as common knowledge tells us that running fsck on a mounted file system will most likely do more damage to it.
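The "error code 4" in the boot log is fsck's exit status, which the fsck(8) man page documents as a bit mask; 4 means errors were found and left uncorrected. Here is a minimal sketch of a decoder for that status. describe_fsck_status is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Sketch: decode an fsck exit status, which is a bit mask (see fsck(8)).
# describe_fsck_status is a made-up helper name for illustration only.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    # Each entry is "bit:meaning", with meanings taken from the fsck(8) man page.
    for pair in \
        "1:errors corrected" \
        "2:system should be rebooted" \
        "4:errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:checking canceled by user request" \
        "128:shared-library error"
    do
        bit=${pair%%:*}
        text=${pair#*:}
        if [ $(( status & bit )) -ne 0 ]; then
            # Join multiple meanings with ", "
            out="${out:+$out, }$text"
        fi
    done
    echo "$out"
}

# The status reported by systemd-fsck in the log above
describe_fsck_status 4
```

Because the status is a bit mask, several conditions can be reported at once. A status of 5, for example, combines "errors corrected" with "errors left uncorrected".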
The best option
Your best option is simply to boot into another Linux, be it on a different partition, a USB drive or a CD, and run fsck manually on the faulty partition, which can easily be unmounted if necessary because no OS is using it. Easy. Normally yes, but my stupid 2008 MacBook Pro cannot boot through USB into anything other than Mac OS X, my CD drive has been dead for years and I recently got rid of my OS X partition. To make things more complicated, I’m in Thailand at the moment and obviously not able to take apart my computer to grab the hard drive and stick it into a working system.
The other option (if you cannot boot into another Linux)
In order to assess the damage, I ran fsck in dry-run mode and piped the output to more to make reading more practical:
fsck.ext4 -n /dev/sda2 | more
From there, I could ensure that no critical files had been damaged and, while keeping in mind that it’s always a gamble to use a corrupted file system, I proceeded to boot into the system to make some backups. That out of the way, I did some research on the web on how to fix a root file system that I had to boot into and sadly, not many things turned up, for it’s not an ideal solution. Forcing the system to do it at boot time by creating a file named forcefsck at the root and writing y in it (echo y > /forcefsck) no longer works, and adding fsck.mode=force on the kernel command line did not fix the problem, as fsck will not fix errors on its own without authorization, i.e. someone entering yes on the keyboard. I tried a few other tricks but none worked. I had no choice but to keep my fingers crossed and use the system as is.
A few days later, I decided to get back to the issue and while researching alternative solutions, I read that it was possible to fix errors on a read-only file system, and it turns out the system can also boot from one. And it worked, so for posterity here is the technique:
- Put your root partition into read-only mode by modifying the faulty partition’s line in /etc/fstab (but remember your old settings):
UUID=fd1d0fad-3a4c-457f-9b5e-eed021cce3d1 / ext4 remount,ro 1 1
Note: If you’re already in maintenance mode at this point, you may be able to remount your file system in read only mode by running “mount -o remount,ro /” and skipping the reboot (thanks Jay).
- Reboot
- Switch to runlevel 1 just to minimize the number of interfering processes (skip this step if you are running the session over SSH [thanks Josh]):
init 1
- Fix your file system (replace /dev/sda2 with your partition’s device), which should now work because the root partition is in read only:
fsck /dev/sda2
- Reboot
- Make your root file system readable/writable:
mount -o remount,rw /dev/sda2
- Restore your /etc/fstab to its original state.
- Reboot
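The fstab edit in the first step, and its restoration in the second-to-last one, can be sketched as a pair of small shell functions. This is only an illustration under a few assumptions: the function names and the ".orig" backup suffix are made up, and the "defaults" options in the sample line are a guess at what an unmodified entry might look like. The demo runs on a scratch file rather than the real /etc/fstab.

```shell
#!/bin/sh
# Sketch: keep a copy of fstab before switching the root entry to read-only,
# so the original settings can be put back afterwards.
# set_root_readonly, restore_fstab and the ".orig" suffix are hypothetical.

set_root_readonly() {
    fstab=$1
    # Keep the original settings next to the file
    cp -p "$fstab" "$fstab.orig" || return 1
    # Rewrite the options field (4th column) of the entry mounted on "/".
    # Comment lines and other entries are printed unchanged.
    awk '$1 !~ /^#/ && $2 == "/" { $4 = "remount,ro" } { print }' \
        "$fstab.orig" > "$fstab"
}

restore_fstab() {
    fstab=$1
    [ -f "$fstab.orig" ] || return 1
    cp -p "$fstab.orig" "$fstab"
}

# Demonstration on a scratch copy ("defaults" is an assumed original value)
demo=$(mktemp)
printf '%s\n' \
    '# sample fstab' \
    'UUID=fd1d0fad-3a4c-457f-9b5e-eed021cce3d1 / ext4 defaults 1 1' > "$demo"

set_root_readonly "$demo"
cat "$demo"          # root entry now uses remount,ro

restore_fstab "$demo"
cat "$demo"          # back to the original line

rm -f "$demo" "$demo.orig"
```

On a real system the argument would be /etc/fstab, and the restore can only be run once the root file system is writable again.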
Voilà, your system is safe to use again. Hopefully this will have gotten you out of a sticky situation like it did for me. If errors keep coming up, it’s probably a sign that your hard drive is failing, and before you lose it completely you should mirror your data to a new one.
Yess!
Works on my Wheezy NAS. (Seagate GoFlex Net)
Hours of searching.
Found only, “touch /forcefsck”. which is not working on any modern linux system…
Many thanks for this!
Thank you very much.
You saved my life. If it had not worked I think I would have re-installed the OS.
Thanks again.
Glad it saved you the trouble of restarting from scratch.
After following all these steps on my CentOS 7, I finally ended up at grub-rescue after reboot.
thank you. worked .
Hi!
You can use: shutdown -rF now
it will force e2fsck’s on all the volumes when it reboots.
Regards.
Thanks for the advice. That’s the first thing I remember trying; however, I recall this technique not working for me at the time (Fedora 20). Hopefully, it was due to a bug that has been fixed by now.
Thanks for your post. Would it also be possible to do such a check by mounting the root partition read-only while the system is running?
You wrote “Voilà, your system is not safe to use again.” I hope you meant “now” :-)
You’re welcome!
I don’t think it would be possible to switch a mounted root file system to read-only, for the simple reason that you’d have to unmount it and mount it again to properly flush all write caches. With all sorts of files open on that partition, your system would simply not let you do it. Anyway, it may be technically feasible, but my search on the web did not yield anything.
“Voilà” is borrowed from French and valid in this case I believe. However thanks for pointing out that I had originally written “Voilà, your system is not safe to use again.” :)
Thanks
This is not a good idea on CentOS 6. On reboot with root set readonly: “Can’t mount root filesystem. Boot has failed, sleeping forever”. To get anything back I had to mount this disk on another OS to restore the fstab on my root (since I have logical volumes I used “mount -t ext4 /dev/vg_MYNAME/lv_root /mnt/tmp”, being thankful my two systems did not have logical volume name conflicts). Glad I’m all virtual, but it is still quite pitiful that there is not a simple built-in way to do this for what might be simple fixable root file system errors.
Tangent: as for fixing my file system issues, my original issue is that periodically the file system has been going readonly, and so I was looking at “fsck -n /”, which reported “Free blocks count wrong”. But running “fsck -f /dev/mapper/vg_MYNAME-lv_root” when mounted under another OS did not find this issue. Good luck if you need to figure out that you need to add the “ext4” force switch “-f”, since mounting under another system previewed it clean and looked very different than when live. I assume the live “-n” error “Free blocks count wrong” on a mounted file system was unreliable, based on “http://serverfault.com/questions/561282/unable-to-repair-ext4-filesystem-getting-errors”. So I will move on, and wait for the next recurrence, assuming I have pending disk problems coming my way (I’m on a white box ESXi with little physical disk info but no other VMs complaining [yet], so I live on the edge).
Typo here:
fsck,ext4 should be fsck.ext4
Thanks! Fixed.
All of my systems boot with an initramfs, so what I do is break to a shell before the root is mounted, mount it read-only and copy over e2fsck and its supporting shared libraries (you’ll very rarely find it in a ramdisk image), unmount it, check it from the shell and continue booting. If the FS is completely unmounted after the check, you don’t even need to reboot!
Skip step 3 if you are connecting to a remote server over ssh. Running “init 1” will close your ssh session and not let you back in.
Thanks, I’ll update the post.
Thanks it worked :-)
Thanks man… You saved me a lot of time…
This was useful but it doesn’t work as written (at least on Fedora) because if you have set the root file system readonly when you reboot you ultimately get a message to that effect and you can’t issue the mount command to reset it. You have to do something like boot a rescue CD and get it to mount your system in /mnt/sysimage and edit /etc/fstab in that, and then reboot.
I think the trick here is the remount option in fstab. It remounts the filesystem in read-only mode after it has failed to mount it in read/write, but the original intention was to mount it as such, which enables you to issue another mount command afterwards to switch it back to that mode.
Thank you so much!
Works for me on Kali 2.0
What do you type for that …”7: Restore your /etc/fstab to its original state ” ?
What should you type ? Be more specific please.
Hey Bob, what I mean is that you have to revert whatever changes you made to your /etc/fstab file during step 1.
Dude, you made my day. I had a situation where my / filesystem was marked as corrupted because of a sudden shutdown. No rescue disk, no other computer to mount my disk. But this worked like a charm for me.
Thanks for sharing this!
You’re welcome! Glad it worked for you.
instead of changing the fstab, you can remount the root filesystem as read only and then FSCK it (no reboot needed). To do this you do :
mount -o remount,ro /
Thanks Jay. However, I believe this will only work if there are no files open on the root filesystem.
Jay is right. In maintenance mode there would be no files open in the root filesystem.
“In maintenance mode there would be no files open in root filesystem”
Doesn’t this assume the system has an independent “/boot” partition? If /boot is part of / then it needs to be mounted for access to /boot.
No matter the case, triggering a check of the root FS has had several ad-hoc solutions, each with dependencies that limit its success. It’s difficult (for me anyway) to understand why there’s still no kernel-level solution that works for any/all Linux distributions. Some simple command like “fsutil –dirty –set ‘/'” or even a one-word command like “rootfsck-onboot.” BSD has a universal solution but it’s tied to the sysv init system.
Does this technique work on Ubuntu?
I don’t see any reason why it shouldn’t.
Thanks, much, that did the trick… I usually use a CD, but did not have one for this old Fedora OS. This is something I did in Old Solaris/SunOS to fix root. Solaris 10+ boots into a early boot stage now to fix root.
Thanks. It works.
Thanx dude u are a life saver
Finally something that works on Centos7 in an easy & straightforward manner.
Works exactly as described. I too skipped the init 1 as ssh gets disconnected shortly after executing it.
Thank you!
Thanks! This completely saved my box and kept me from losing a day or more of work!
Thank you! I just spent nearly a month getting Fedora set up alongside Windows on my mother’s glacial and practically unusable laptop. I was talking all month about how great Linux is, and showing her all the ways I could configure it for her and make her machine fly again. Then the night before she gets on a plane back to her home, Fedora refuses to boot! Read-only file system! No! Hours searching online and trying different solutions came to naught. Then I found your post and followed the fsck steps. Before attempting the final reboot, I raised my arms in triumph, or really in a bleak mockery of triumph, since I fully expected to be deflated and have my grim outlook on the universe confirmed. But no! Fedora logo, then KDE logo! Victory! My mother’s machine lives another day. Thank you.
You’re very welcome Jeff. That was a heartwarming message.
No reason to make it this complicated. Just boot up your system, switch to a terminal, then use Sysrq+U to remount everything read-only and force fsck to run. Reboot and you’re done.
Thank you dude. This was very helpful and solved my problem. I had a system running fine, then all of a sudden it switched to RO mode. Your steps were spot on. Thank you again.
In maintenance mode
cat /proc/mounts show ro
/dev/mapper/vg_xyz-lv_root / ext4 ro,relatime,barrier=1,data=ordered 0 0
fsck /dev/mapper/vg_xyz-lv_root
Rebooted the system and everything returned to normal.
Thank you!!
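The check in the comment above (looking at /proc/mounts to confirm that root is mounted ro before running fsck) can be wrapped in a small function. This is a rough sketch only. root_mount_mode is a hypothetical name, and the demo reads a sample file built from the line quoted above rather than the live /proc/mounts:

```shell
#!/bin/sh
# Sketch: report whether "/" is mounted ro or rw, based on a /proc/mounts
# style file. root_mount_mode is a made-up helper name.
root_mount_mode() {
    mounts_file=${1:-/proc/mounts}
    # The last entry for "/" wins, since later mounts shadow earlier ones.
    awk '$2 == "/" {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
    }
    END { if (mode == "") exit 1; print mode }' "$mounts_file"
}

# Demo with the line quoted in the comment above
sample=$(mktemp)
printf '%s\n' \
    '/dev/mapper/vg_xyz-lv_root / ext4 ro,relatime,barrier=1,data=ordered 0 0' \
    > "$sample"

if [ "$(root_mount_mode "$sample")" = "ro" ]; then
    echo "root is read-only, fsck can be attempted"
else
    echo "root is still writable, do not run fsck"
fi
rm -f "$sample"
```

Called without an argument, the function reads /proc/mounts on the running system.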
how to restore /etc/fstab to its original state????
Not sure how, but there’s a couple of guides around that will show you how to rebuild it by hand.
Hi Team,
Mine is CentOS 7, running under postgres. I removed some files and it keeps asking for the root maintenance password. Now I am not even able to access it via PuTTY; its IP address is not even reachable. Please assist, I am new to Linux and still in college.
regards,
Rufftone
Did you follow the steps outlined in the article? If you don’t even have access to the system though, you’re probably out of luck.
thanks bro that was great
You have no idea how helpful this was…
After applying tons of failed tips, this worked like a charm on my Cent OS 7 virtual private server.
Thank you very much
You’re very welcome!
Great, thanks! Saved me a lot of time!