Checking an HDD for bad sectors in Linux. Linux: check disk

Did Linus Torvalds imagine, when creating his brainchild, that Linux would be used in embedded systems: not only in cheap home routers, but also in such serious telecom solutions as the Avaya PBX?

Recently I had to restore the Avaya PBX of one large customer. It is an Avaya G650 Media Gateway (chassis) with an Avaya S8400 Server (processor), whose system disk is a 2 GB CompactFlash card. Such a card can loosely be considered an SSD with an IDE interface.
Imagine my surprise when I connected the CF card through a card reader and saw the familiar structure of a Linux file system. This, of course, simplified checking the health of the CF card.

How to check the file system of a Linux disk for errors

The Microsoft DOS operating system (oh yes, I remember version 5.0, which fit on one floppy disk!) had the CHKDSK check command. Linux has something similar.
To check a Linux disk for file system errors, you first need to find out the names of the file systems to check:

# df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda    20G   4.0G  15G    21%   /
/dev/sdd1   1G    455M  555M   46%   /media/Np%blsl3648B4Jjeiedgyy
/dev/sdd6   1G    98M   902M   10%   /media/10.13-23dd
none        246M  0     246M   0%    /dev/shm

For the CF drive under test, these are /dev/sdd1 and /dev/sdd6.
Next, you need to unmount the filesystems under test:

# sudo umount /dev/sdd1
# sudo umount /dev/sdd6

Then run the check on each of them:

# fsck -y /dev/sdd1
# fsck -y /dev/sdd6

The -y parameter automatically answers yes to all questions, which is what most users do anyway.
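Running fsck on a mounted filesystem can damage it, so it is worth confirming that the partition really is unmounted before the check. Below is a minimal sketch of such a guard. The function name is_mounted is hypothetical, and the sketch assumes the mount table is readable at /proc/mounts, as on typical Linux systems.

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds (exit status 0) if DEVICE appears in the first column of the
# mount table. MOUNTS_FILE defaults to /proc/mounts (an assumption about
# the system); the second argument mainly makes the function testable.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example use (not executed here):
#   if is_mounted /dev/sdd1; then
#       echo "/dev/sdd1 is still mounted, unmount it first" >&2
#   else
#       fsck -y /dev/sdd1
#   fi
```

The guard only compares device names literally, so a partition mounted under another name (for example through a device-mapper alias) would not be detected.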

Linux fsck file system checker results

In my case, there were errors on one of the partitions, which the utility corrected. After that the CF drive was returned to its place and the Avaya PBX was restored.
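fsck also reports its result through the exit status, which is a sum of flag values documented in its manual page (1 means errors were corrected, 4 means errors were left uncorrected, and so on). The sketch below decodes such a status into words. The function name describe_fsck_status and the short labels are hypothetical; only the numeric flags come from the fsck manual.

```shell
#!/bin/sh
# describe_fsck_status CODE
# Prints a space-separated list of labels for the flags set in CODE.
# The flag values follow the fsck manual page; the labels are made up
# for this example.
describe_fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    for pair in 1:errors-corrected 2:reboot-needed \
                4:errors-left-uncorrected 8:operational-error \
                16:usage-error 32:cancelled 128:shared-library-error; do
        bit=${pair%%:*}
        name=${pair#*:}
        if [ $((code & bit)) -ne 0 ]; then
            out="$out $name"
        fi
    done
    echo "${out# }"
}

# Example use (not executed here):
#   fsck -y /dev/sdd1
#   describe_fsck_status $?
```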

Any computer is a complex device that consists of many components, and none of them is immune to failure. In this article, we will look at how to recognize in time one of the serious problems with storage devices, be it a hard disk or a flash drive: how to check a disk for bad sectors in Linux.

Any storage device consists of many small blocks (sectors) that store information in the form of zeros and ones (bits). If, for some reason, the operating system cannot write information to a certain sector, that sector can be considered "bad".

A sector can become damaged for a variety of reasons:

Manufacturing defects.
Power loss while data is being written.
Physical wear and tear of the drive.

A small number of bad sectors are found on almost any drive. But it is worth paying attention if their number increases over time. This may indicate the imminent physical death of the drive, and it's time for you to think about replacing it.

Let's look at the Linux utilities we can use to check a disk for bad sectors.

Checking the drive for bad sectors using badblocks

Badblocks is a standard Linux utility for finding bad sectors. It is installed by default in almost any distribution, and it can check both internal hard drives and external drives.

First, let's see what drives are connected to our system and what partitions they have. To do this, we need another standard Linux utility, fdisk.

Naturally, the commands must be run with superuser rights:

sudo fdisk -l

The -l parameter tells fdisk to show the list of partitions and exit.

Now that we know what partitions we have, we can check them for bad sectors. To do this, we will use the badblocks utility as follows:

sudo badblocks -v /dev/sda1 > badsectors.txt

We pass the following parameters:

-v - print detailed information about the results of the check.
/dev/sda1 - the partition that we want to check for bad sectors.
> badsectors.txt - redirect the output of the command to the badsectors.txt file.

If bad sectors were found, we need to instruct the operating system not to write data to them in the future. To do this, we need the Linux utilities for working with file systems:

e2fsck, if we are fixing a partition with a Linux ext filesystem (ext2, ext3, ext4).
fsck, if we are fixing a filesystem other than ext.

We enter the following command:

sudo e2fsck -l badsectors.txt /dev/sda1

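Or, if our filesystem is not ext:

sudo fsck -l badsectors.txt /dev/sda1

In e2fsck, the -l parameter tells the utility to use the list of bad sectors from the badsectors.txt file, which we received earlier when checking with the badblocks utility. Be aware that the generic fsck front end treats -l as a request to lock the device rather than as a bad-block list, so for a non-ext filesystem consult the manual of the checker that belongs to that filesystem.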
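One detail matters when a badblocks list is passed to e2fsck: e2fsck interprets the numbers in units of the filesystem block size, so badblocks should be run with the same block size (its -b option). The sketch below extracts the block size from the output of tune2fs -l. The function name fs_block_size is hypothetical, and the device names in the usage comment are only examples.

```shell
#!/bin/sh
# fs_block_size
# Reads the output of `tune2fs -l DEVICE` on stdin and prints the value
# of its "Block size:" line (for example 4096).
fs_block_size() {
    awk -F: '$1 == "Block size" { gsub(/[ \t]/, "", $2); print $2 }'
}

# Example use (not executed here; /dev/sda1 is a placeholder):
#   bs=$(sudo tune2fs -l /dev/sda1 | fs_block_size)
#   sudo badblocks -b "$bs" -v /dev/sda1 > badsectors.txt
#   sudo e2fsck -l badsectors.txt /dev/sda1
```

The e2fsck manual itself recommends the -c option as the simpler route, because e2fsck then calls badblocks with matching parameters on its own.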

Checking a drive for bad sectors in Linux with smartmontools

Now let's look at a more modern and reliable way to check a disk for bad sectors in Linux. Modern ATA/SATA and SCSI/SAS drives, including SSDs, have a built-in self-monitoring system, S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology), which monitors the parameters of the drive and helps to detect deterioration of the drive at an early stage. To work with S.M.A.R.T. in Linux, there is the smartmontools package.

Let's install it first. If your distribution is based on Debian/Ubuntu, enter:

sudo apt install smartmontools

If your distribution is based on RHEL/CentOS, enter:

sudo yum install smartmontools

Now that we have installed smartmontools we can see the help page using the command:

Let's get down to working with the utility. We run the following command with the -H parameter so that the utility shows us the health status of the drive:

sudo smartctl -H /dev/sda1

In our case, the check completes and the utility tells us that everything is in order with the drive!

Additionally, you can specify the -a (--all) parameter to get even more information about your drive, or -x (--xall) to also include the rest of the drive's parameters.

Conclusions

In this article, we looked at ways to check drives for bad sectors under Linux in order to anticipate possible failures in time and not lose data.

A computer is a device that relies on the interaction of many components, and any of them can start to malfunction over time. One of the most common causes of a misbehaving machine is bad sectors on the disk, so the disk needs to be tested periodically. Linux provides everything needed for this.

What are bad blocks and why do they appear

A block (sector) is a small cell on a disk that stores information in the form of bits (0 and 1). When the system fails to write data to such a cell, it is called a bad sector. There may be several reasons for the appearance of such blocks:

manufacturing defects;
power loss while data is being written;
physical wear and tear of the disk.

Almost all media have a few defects from the start. Over time their number may grow, which indicates an imminent failure of the device. There are several ways to test a disk for errors in Linux.

Linux disk check

Several operating systems run on the Linux kernel, including Ubuntu and Debian. The disk check procedure is universal and suitable for each of them. It is time to test the media when the disk subsystem is under heavy load, when read and write speeds have dropped noticeably, or when these operations start to produce errors.

Many are familiar with the Windows program Victoria HDD. Developers have written counterparts of it for Linux.

Badblocks

Badblocks is a disk utility that comes with Ubuntu and other Linux distributions by default. The program can test both internal hard drives and external drives.

Important! All terminal commands in this article must be run with sudo, since they require superuser rights.

Before testing a disk in Linux, you should check which drives are connected to the system using fdisk -l. It will also show the partitions available on them.

Now you can proceed to the actual test for bad sectors. Badblocks is run as follows:

badblocks -v /dev/sdk1 > bsector.txt

The command uses the following options and operands:

-v - prints a detailed report on the performed check;
/dev/sdk1 - the partition being checked;
> bsector.txt - writes the results to a text file.
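The report file written by badblocks contains one block number per line, so the number of bad blocks can be read off by counting those lines. Here is a minimal sketch; the function name count_bad_blocks is hypothetical.

```shell
#!/bin/sh
# count_bad_blocks FILE
# Prints how many lines of FILE consist solely of a block number,
# which is the format badblocks uses for its output list.
count_bad_blocks() {
    awk '/^[0-9]+$/ { n++ } END { print n + 0 }' "$1"
}

# Example use (not executed here):
#   count_bad_blocks bsector.txt
```

Comparing this number between checks made at different times shows whether the count is growing, which is the warning sign described earlier.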

If bad blocks are found when checking the disk, you need to run e2fsck or fsck, depending on the file system used. They will prevent data from being written to the non-working sectors. For ext2, ext3, or ext4 file systems, run the following command:

e2fsck -l bsector.txt /dev/sdk1

Otherwise:

fsck -l bsector.txt /dev/sdk1

In e2fsck, the -l parameter tells the program that the bad blocks are listed in the bsector.txt file and should be excluded. Note that the generic fsck front end treats -l as a request to lock the device rather than as a bad-block list, so for other file systems check the manual of the file-system-specific checker.

Gparted

The utility checks the Linux file system without resorting to a text-based interface.

The tool is not included in the operating system by default, so you need to install it by running the command:

apt-get install gparted

Available drives are displayed in the main application window. An exclamation mark next to a partition's name indicates that it is time to test it. The check is started with the "Check" item in the "Partition" menu on the top panel, after the desired partition has been selected. When the scan is complete, the utility displays the result.

Checking HDDs and other storage devices with GParted is available to users of Ubuntu, CentOS, Debian and other distributions running on the Linux kernel, as well as FreeBSD.

Smartmontools

The tool tests the drive itself and does so with greater reliability. Modern hard drives have a built-in S.M.A.R.T. self-monitoring module, which analyzes the drive's data and helps to detect a malfunction at an early stage. Smartmontools is designed to work with this module.

The installation is started through the terminal:

apt install smartmontools - for Ubuntu/Debian;
yum install smartmontools - for CentOS.

To view information about the state of the hard disk, enter the line:

smartctl -H /dev/sdk1

How long a check takes depends on the size of the disk. When it finishes, the program reports whether or not the drive has problems such as bad sectors.

The utility has other options: -a (--all) and -x (--xall). Help is called for more information:

Safecopy

When you need to test a hard drive in Linux, you should be ready for any result.

The Safecopy application copies data from a damaged device to a working one. The source can be a hard drive or removable media. The tool ignores I/O errors, read errors and bad blocks, and keeps running without interruption. It runs as fast as the computer allows.

Note! The utility is not intended for recovering deleted files. It extracts the information stored in damaged sectors.

To install Safecopy on Linux, enter the line into the terminal:

Scanning is started with the command:

safecopy /dev/sdk1 /home/files/

Here, the first path is the damaged disk, the second is the directory where the files will be saved.

The program is capable of creating an image of the file system of an unstable storage device.

What to do if an error is found in the Ubuntu system program

Installing new software or changing system settings may trigger the message "An error has been detected in the system program." Many ignore it, since it does not affect normal work.

The problem is usually encountered by users of Ubuntu 16.04. In this case there is no need to test the HDD, since the cause is more likely a software failure. The message reports that a program terminated unexpectedly and offers to send a report to the developers. If you agree, a browser window opens where you need to fill out a 4-step form. This first option is cumbersome and does not guarantee that the error will disappear.

The second method suppresses the message only when it is triggered by the same program. To use it, tick the "Do not show more for this program" option at the next notification.

The third method is to disable Apport, the utility responsible for collecting crash information and sending reports in Ubuntu. This approach completely eliminates the error pop-ups. It is also possible to disable only the notifications and leave the collection service running. To do this, run:

gsettings set com.ubuntu.update-notifier show-apport-crashes false

Data will continue to be collected in the /var/crash folder. It needs to be cleaned out periodically so that it does not fill up the disk:

To disable the Apport service completely, open its configuration file from the terminal:

gksu gedit /etc/default/apport

In the text that appears, change the value of the enabled field from 1 to 0. To re-enable the service later, restore the default value.
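The same change can be made without a graphical editor. The following is a sketch only: the function name disable_apport_in is hypothetical, and it assumes the file contains a line reading exactly enabled=1, as described above.

```shell
#!/bin/sh
# disable_apport_in FILE
# Rewrites a line "enabled=1" in FILE to "enabled=0" and leaves every
# other line untouched. The result is written back with cat so that the
# file keeps its owner and permissions.
disable_apport_in() {
    file=$1
    tmp=$(mktemp) || return 1
    sed 's/^enabled=1$/enabled=0/' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example use (not executed here; needs root privileges):
#   disable_apport_in /etc/default/apport
```

Swapping the two values in the sed expression gives the reverse operation for re-enabling the service.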

Conclusion

To prevent loss of files, it is recommended that you test your hard drive and removable media periodically. Linux offers several approaches to solving the problem. You can choose from a list of utilities that identify bad sectors and transfer information to a normally functioning device.

You should check your hard drive from time to time. I believe there is nothing more valuable than the information on a hard drive (apart from our lives, of course), and it would be such a shame if your family photos, videos, notes and work reports, passwords and other important data disappeared. How do you check a hard drive in Linux, in our case in Ubuntu, and what programs exist for testing our helpers and saviors, the hard drives? You should check a hard disk not from the system installed on it, but from a LiveCD/USB. One such valuable build is Parted Magic, although an Ubuntu CD/USB will do as well. It is a complete toolkit for working with hard drives: it has GParted for resizing HDD partitions (an analogue of Acronis Disk Director), CloneZilla for creating exact copies of your system disks or partitions for later recovery, GSmartControl for reporting on the status of your disk, and much more. So let's start our overview of programs for checking a hard drive in Ubuntu.

Console program Badblocks

To find out how your hard disk or disks are partitioned, and to select a partition to check, run the command:

sudo fdisk -l

To start scanning for bad sectors, just execute this command in the Terminal:

sudo badblocks -sv /dev/sdb1

where:

/dev/sdb1 - the partition being checked,

-s - shows the progress of the scan as a percentage,

-v - displays detailed information about the check.

If you need a text report, run the following command:

sudo badblocks -s /dev/sdb1 > errors.txt

Instead of /dev/sdb1, specify the desired partition of your hard disk. The report will appear as the text file errors.txt in the current directory (your Home directory if you ran the command from there). If bad blocks were found, it is advisable to mark them so that the system does not use them while working with the disk. To mark bad sectors, run the command:

sudo e2fsck -l errors.txt /dev/sdb1

The -l key makes the program use the errors.txt file as the list of "broken" sectors. But you can avoid the two commands above and run just one:

sudo e2fsck -ct /dev/sdb1

The e2fsck program is part of the E2fsprogs software package, which also includes badblocks, and the -c key makes e2fsck run the badblocks utility to find bad sectors.

To check the filesystem (ext2/ext3/ext4), run the following command:

e2fsck -y /dev/<your disk partition or entire disk>

The -y key tells the utility to answer yes to all questions.

Other commonly used parameters:

-p, -a - automatically "repair" the file system without asking any questions;
-f - forced check. The check is performed in any case, even if the file system does not appear to need it;
-c - runs the badblocks program to find and mark bad sectors on the disk;
-v - displays detailed information about the check.

You can also use fsck instead of e2fsck. Everyone is free to choose whichever is more convenient.

Disks program

Ubuntu has a great program, Disks, which displays information on all devices connected to the system (hard disks, flash drives, CD/DVD drives, etc.). By running it, you can view the S.M.A.R.T. data of the disk you are interested in.

Program GSmartControl

And finally, I want to recommend GSmartControl, a graphical shell (GUI) for the console program smartctl. You can find it in the Ubuntu Software Center, or install it through the Terminal with the command:

sudo apt-get install gsmartcontrol

The program shows the complete S.M.A.R.T. information.

Well, now you have learned how to check a hard drive in Linux. May this information serve you well! Good luck!

If there is one thing you really do not want to face in your operating system, it is an unexpected hard drive failure. With backups and RAID storage you can put all your data back in place very quickly, but losing a hardware device can take a huge toll on your budget, especially if you didn't plan for it.

To avoid such problems, you can use smartmontools. It is a software package for managing and monitoring storage devices using Self-Monitoring, Analysis and Reporting Technology, or simply SMART.

Most modern ATA/SATA and SCSI/SAS storage devices provide a SMART interface. The purpose of SMART is to monitor the reliability of the hard drive, detect various errors and respond to them in a timely manner. Smartmontools consists of two utilities, smartctl and smartd. Together they provide a powerful system for monitoring HDDs and warning of possible failure in Linux. Checking a hard drive in Linux is discussed in detail below.

The smartmontools package is in the official repositories of most Linux distributions, so installation is reduced to a single command. On Debian and Debian-based systems, run:

aptitude install smartmontools

And for Red Hat:

yum install smartmontools

Now you can proceed to diagnose your hard drive in Linux.

Checking a hard drive with smartctl

First, find out which hard drives are connected to your system:

ls -l /dev | grep -E "sd|hd"

The output will look something like this:

Here sdX is the name of an HDD device connected to the computer.
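The grep pattern above also matches partitions such as sda1. If only whole disks are wanted, the names can be filtered a little more strictly. This is a sketch under the assumption that disks follow the classic sdX/hdX naming; the function name whole_disks is hypothetical, and drives with other naming schemes (for example NVMe devices) are not covered.

```shell
#!/bin/sh
# whole_disks
# Reads device names (one per line, without the /dev/ prefix) on stdin
# and prints only those that look like whole sdX/hdX disks, i.e. names
# without a trailing partition number.
whole_disks() {
    grep -E '^(sd|hd)[a-z]+$'
}

# Example use (not executed here):
#   ls /dev | whole_disks
```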

To display information about a specific hard drive (device model, serial number, firmware version, ATA version, availability of the SMART interface), run smartctl with the --info option and the name of the hard drive. For example, for /dev/sda:

smartctl --info /dev/sda

Although you may not pay attention to the ATA version, it is one of the most important factors when looking for a replacement device. Each new version of ATA is backward compatible with the previous ones. For example, older ATA-1 and ATA-2 devices will work fine on ATA-6 and ATA-7 interfaces, but not vice versa. When the ATA versions of the device and the interface do not match, the capabilities of the hardware will not be fully used. In this case, it is best to choose an ATA-7 hard drive as a replacement.

You can run a hard disk check in Ubuntu with the command:

smartctl -s on -a /dev/sda

Here the -s on option enables SMART on the specified device. You can omit it if SMART support is already enabled. The disk information is divided into several sections. The READ SMART DATA section contains general information about the health of the hard drive.

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

This test can either pass (PASSED) or fail (FAILED). In the latter case, failure is imminent; start backing up the data from this disk.
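For use in scripts, the result line can be checked mechanically. The sketch below assumes the ATA-style wording shown above; other drive types may word the line differently. The function name smart_health_passed is hypothetical.

```shell
#!/bin/sh
# smart_health_passed
# Reads smartctl output on stdin and succeeds (exit status 0) only if it
# contains the overall-health line with the result PASSED.
smart_health_passed() {
    grep -q 'overall-health self-assessment test result: PASSED'
}

# Example use (not executed here):
#   if ! smartctl -H /dev/sda | smart_health_passed; then
#       echo "SMART health check did not pass, back up your data" >&2
#   fi
```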

The next thing to look at when diagnosing an HDD in Linux is the SMART attribute table.

The SMART table contains the parameters defined for a particular disk by its manufacturer, as well as the failure threshold for each of them. The table is filled in and updated automatically by the drive's firmware.

ID# - attribute ID, usually a decimal number between 1 and 255;
ATTRIBUTE_NAME - the name of the attribute;
FLAG - attribute handling flag;
VALUE - the normalized value of this attribute in the range 1 to 253, where 253 is the best state and 1 is the worst. Depending on the attribute, the initial value can be from 100 to 200;
WORST - the worst value recorded so far;
THRESH - the lowest acceptable VALUE; once it is reached, the disk must be reported as unusable;
TYPE - attribute type, either Pre-fail or Old_age. Pre-fail attributes are critical: if a disk fails the check for one of them, it is considered FAILED. Old_age attributes are not critical;
UPDATED - shows how often the attribute is updated;
WHEN_FAILED - set to FAILING_NOW if the attribute value is less than or equal to THRESH, or "-" if it is higher. In the case of FAILING_NOW, it is best to back up as soon as possible, especially if the attribute type is Pre-fail;
RAW_VALUE - the raw value, in a format defined by the manufacturer.
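Since the table is column-oriented, the WHEN_FAILED column can be scanned with a short awk filter. This is a sketch assuming the ten-column layout listed above, with WHEN_FAILED as the ninth field; the function name failing_attributes is hypothetical.

```shell
#!/bin/sh
# failing_attributes
# Reads a SMART attribute table on stdin and prints "NAME: WHEN_FAILED"
# for every attribute row whose WHEN_FAILED column is not "-".
# Rows are recognized by a numeric ID in the first column.
failing_attributes() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && $9 != "-" { print $2 ": " $9 }'
}

# Example use (not executed here):
#   smartctl -A /dev/sda | failing_attributes
```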

By now you may be thinking: smartctl is a good tool, but I cannot run it manually every time; it would be nice to automate the whole thing so that the program runs periodically and informs me of the results. This is possible with smartd.

Configuring smartd and smartctl for real-time diagnostics and monitoring

Real-time diagnostics of an HDD in Linux is very easy to configure. First, open the smartd configuration file, /etc/smartd.conf:

nano /etc/smartd.conf

Add the following line:

/dev/sda -m [email protected] -M test

-m - the e-mail address to which the results are sent. This can be a local user address, the superuser's address, or an external address if the server is configured to send e-mail;
-M - the frequency of sending e-mails:
once - send only one message about each disk problem;
daily - send a message every day if a problem was found;
diminishing - send reminders at increasing intervals (after one day, then two days, then four days, and so on) if a problem was found;
test - send a test message when smartd starts;
exec - run the specified program instead of the default mail command.

Save the changes and restart smartd. You should receive an email with the following content:

You can also run tests on a schedule. For this, use the -s option with a regular expression of the form "T/MM/DD/DN/HH", where:

T - test type:
L - long test;
S - short test;
C - conveyance test (ATA only);
O - offline test.

The rest of the characters define the date and time of the test:

MM - month of the year;
DD - day of the month;
HH - hour of the day;
DN - day of the week (from 1 for Monday to 7 for Sunday);
MM, DD and HH are written with two decimal digits.

A period means any value, an expression in parentheses (A|B|C) means one of three options, and an expression in square brackets [1-5] means a range (from 1 to 5).

For example, to run a long test of your Linux hard drive every weekday at 1 pm, add the following line to smartd.conf:

DEVICESCAN -s (L/../../[1-5]/13)
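A schedule expression can be tried out before it goes into smartd.conf by matching it against a "T/MM/DD/DN/HH" string with grep. This only approximates what smartd does: the function names are hypothetical, and anchoring the expression to the whole string is an assumption of this sketch.

```shell
#!/bin/sh
# schedule_matches REGEX STAMP
# Succeeds if the extended regular expression REGEX matches the whole
# of STAMP, a string of the form T/MM/DD/DN/HH.
schedule_matches() {
    printf '%s\n' "$2" | grep -Eq "^($1)\$"
}

# current_stamp TYPE
# Prints the stamp for test type TYPE (L, S, C or O) at the current
# time, using date(1): %m month, %d day, %u weekday (1 = Monday), %H hour.
current_stamp() {
    printf '%s/%s\n' "$1" "$(date +%m/%d/%u/%H)"
}

# Example use (not executed here):
#   schedule_matches '(L/../../[1-5]/13)' "$(current_stamp L)" && echo "a long test would be due now"
```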

Conclusions

If you want to quickly check the mechanical operation of a hard disk, see its physical condition or perform a more or less complete scan of the disk surface, use smartmontools. Remember to run scans regularly, and you will thank yourself later. Have you done this before? Will you start? Or do you use other methods? Write in the comments!

Translation source.