Learn Linux 15: Storage Media

Introduction

In this chapter, we will consider data at the device level. Linux has amazing capabilities for handling storage devices, whether physical storage such as hard disks, network storage, or virtual storage devices such as RAID (Redundant Array of Independent Disks) and LVM (Logical Volume Manager). This chapter will introduce the following commands:

  • mount - Mount a file system
  • umount - Unmount a file system
  • fsck - Check and repair a file system
  • fdisk - Manipulate disk partition table
  • mkfs - Create a file system
  • dd - Convert and copy a file
  • genisoimage (mkisofs) - Create an ISO 9660 image file
  • wodim (cdrecord) - Write data to optical storage media
  • md5sum - Calculate an MD5 checksum

Mounting And Unmounting Storage Devices

Recent advances in the Linux desktop have made storage device management extremely easy for desktop users. For the most part, we attach a device to our system and it “just works.” In the old days (say, 2004), this stuff had to be done manually. On non-desktop systems (i.e., servers) this is still a largely manual procedure since servers often have extreme storage needs and complex configuration requirements.

The first step in managing a storage device is attaching the device to the file system tree. This process, called mounting, allows the device to interact with the operating system. Unix-like operating systems (like Linux) maintain a single file system tree with devices attached at various points. This contrasts with other operating systems such as Windows that maintain separate file system trees for each device (for example C:, D:, etc.).

A file named /etc/fstab (short for “file system table”) lists the devices (typically hard disk partitions) that are to be mounted at boot time. Here is an example /etc/fstab file from an early Fedora system.

LABEL=/12         /             ext4    defaults        1 1
LABEL=/home       /home         ext4    defaults        1 2
LABEL=/boot       /boot         ext4    defaults        1 2
tmpfs             /dev/shm      tmpfs   defaults        0 0
devpts            /dev/pts      devpts  gid=5,mode=620  0 0
sysfs             /sys          sysfs   defaults        0 0
proc              /proc         proc    defaults        0 0
LABEL=SWAP-sda3   swap          swap    defaults        0 0

Most of the file systems listed in this example file are virtual and not applicable to our discussion. For our purposes, the interesting ones are the first three.

LABEL=/12         /             ext4    defaults        1 1
LABEL=/home       /home         ext4    defaults        1 2
LABEL=/boot       /boot         ext4    defaults        1 2

These are the hard disk partitions. Each line of the file consists of six fields, as described below:

  • Field 1 - Device - Traditionally, this field contains the actual name of a device file associated with the physical device, such as /dev/sda1 (the first partition of the first detected hard disk). But with today’s computers, which have many devices that are hot pluggable (like USB drives), many modern Linux distributions associate a device with a text label instead. This label (which is added to the storage media when it is formatted) can be either a simple text label or a randomly generated UUID (Universally Unique Identifier). This label is read by the operating system when the device is attached to the system. That way, no matter which device file is assigned to the actual physical device, it can still be correctly identified.
  • Field 2 - Mount point - The directory where the device is attached to the file system tree.
  • Field 3 - File system type - Linux allows many file system types to be mounted. Most native Linux file systems are Fourth Extended File System (ext4), but many others are supported, such as FAT16 (msdos), FAT32 (vfat), NTFS (ntfs), CD-ROM (iso9660), etc.
  • Field 4 - Options - File systems can be mounted with various options. It is possible, for example, to mount file systems as read-only or to prevent any programs from being executed from them (a useful security feature for removable media).
  • Field 5 - Frequency - A single number that specifies if and when a file system is to be backed up with the dump command.
  • Field 6 - Order - A single number that specifies in what order file systems should be checked with the fsck command.
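The six fields are easy to pick apart with standard tools. The following is a minimal sketch, not a standard utility: the function name describe_fstab and its output format are made up for illustration, and the sample lines are copied from the example above. It reads fstab-style lines and reports the mount point, the file system type, the fsck order, and how the device in field 1 is identified.

```shell
#!/bin/sh
# Sketch only: describe_fstab is a hypothetical helper, not part of any package.
# It reads fstab-style lines from the named files (or standard input).
describe_fstab() {
    awk '
        /^[ \t]*(#|$)/ { next }               # skip comments and blank lines
        {
            kind = "device file"              # e.g. /dev/sda1
            if ($1 ~ /^LABEL=/)     kind = "label"
            else if ($1 ~ /^UUID=/) kind = "UUID"
            else if ($1 !~ /^\//)   kind = "other"   # e.g. tmpfs, proc
            print $2, $3, "fsck-order=" $6, "source=" kind
        }
    ' "$@"
}

# Sample data taken from the example file above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
LABEL=/12         /             ext4    defaults        1 1
LABEL=/home       /home         ext4    defaults        1 2
tmpfs             /dev/shm      tmpfs   defaults        0 0
EOF

describe_fstab "$sample"
rm -f "$sample"
```

On a real system, the same function could be pointed at /etc/fstab itself, which is normally readable without superuser privileges.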

Viewing A List Of Mounted File Systems

The mount command is used to mount file systems. Entering the command without arguments will display a list of the file systems currently mounted.

[user@linux ~]$ mount
/dev/sda2 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda5 on /home type ext4 (rw)
/dev/sda1 on /boot type ext4 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
/dev/sdd1 on /media/disk type vfat (rw,nosuid,nodev,noatime,uhelper=hal,uid=500,utf8,shortname=lower)
twin4:/musicbox on /misc/musicbox type nfs4 (rw,addr=192.168.1.4)

The format of the listing is as follows: device on mount_point type filesystem_type (options).

For example, the first line shows that device /dev/sda2 is mounted as the root file system, is of type ext4, and is both readable and writable (the option rw). This listing also has two interesting entries at the bottom of the list. The next-to-last entry shows a 2GB SD memory card in a card reader mounted at /media/disk, and the last entry is a network drive mounted at /misc/musicbox.
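Because every line follows the same pattern, the listing is straightforward to take apart in a script. Here is a small sketch, assuming the classic "device on mount_point type filesystem_type (options)" layout described above. The function name parse_mount_line is hypothetical, and output from newer versions of mount may carry extra trailing text that this pattern does not handle.

```shell
#!/bin/sh
# Sketch only: parse_mount_line is a hypothetical helper.
# It rewrites each "device on mount_point type fstype (options)" line
# read from standard input as labeled fields.
parse_mount_line() {
    sed -n 's/^\(.*\) on \(.*\) type \([^ ]*\) (\(.*\))$/device=\1 mount_point=\2 type=\3 options=\4/p'
}

# Two lines taken from the listing above.
parse_mount_line <<'EOF'
/dev/sda2 on / type ext4 (rw)
twin4:/musicbox on /misc/musicbox type nfs4 (rw,addr=192.168.1.4)
EOF
```

To try it against the live system, the output of mount can be piped into the function: mount | parse_mount_line.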

For our first experiment, we will work with a CD-ROM. First, let’s look at a system before a CD-ROM is inserted.

[user@linux ~]$ mount
/dev/mapper/VolGroup00-LogVol00 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext4 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

This listing is from a CentOS system, which is using LVM (Logical Volume Manager) to create its root file system. Like many modern Linux distributions, this system will attempt to automatically mount the CD-ROM after insertion.

[user@linux ~]$ mount
/dev/mapper/VolGroup00-LogVol00 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext4 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/sdc on /media/live-1.0.10-8 type iso9660 (ro,noexec,nosuid,nodev,uid=500)

After we insert the disc, we see the same listing as before with one additional entry. At the end of the listing we see that the CD-ROM (which is device /dev/sdc on this system) has been mounted on /media/live-1.0.10-8 and is type iso9660 (a CD-ROM). For the purposes of our experiment, we’re interested in the name of the device. When you conduct this experiment yourself, the device name will most likely be different.

In the examples that follow, it is vitally important that you pay close attention to the actual device names in use on your system and do not use the names used in this text! Also note that audio CDs are not the same as CD-ROMs. Audio CDs do not contain file systems and thus cannot be mounted in the usual sense.

Now that we have the device name of the CD-ROM drive, let’s unmount the disc and remount it at another location in the file system tree. To do this, we become the superuser and unmount the disc with the umount (notice the spelling) command.

[user@linux ~]$ su -
Password:
[root@linux ~]# umount /dev/sdc

The next step is to create a new mount point for the disc. A mount point is simply a directory somewhere on the file system tree. There’s nothing special about it. It doesn’t even have to be an empty directory, though if you mount a device on a non-empty directory, you will not be able to see the directory’s previous contents until you unmount the device. For our purposes, we will create a new directory.

[root@linux ~]# mkdir /mnt/cdrom

Finally, we mount the CD-ROM at the new mount point. The -t option is used to specify the file system type.

[root@linux ~]# mount -t iso9660 /dev/sdc /mnt/cdrom

Afterward, we can examine the contents of the CD-ROM via the new mount point.

[root@linux ~]# cd /mnt/cdrom
[root@linux cdrom]# ls

Notice what happens when we try to unmount the CD-ROM.

[root@linux cdrom]# umount /dev/sdc
umount: /mnt/cdrom: device is busy

Why is this? The reason is that we cannot unmount a device if the device is being used by someone or some process. In this case, we changed our working directory to the mount point for the CD-ROM, which causes the device to be busy. We can easily remedy the issue by changing the working directory to something other than the mount point.

[root@linux cdrom]# cd
[root@linux ~]# umount /dev/sdc

Now the device unmounts successfully.
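A script can guard against the same mistake. The sketch below checks whether the shell's current working directory lies inside a given mount point before an unmount is attempted. The function name cwd_is_inside is hypothetical, and /mnt/cdrom is simply the mount point used in this example.

```shell
#!/bin/sh
# Sketch only: cwd_is_inside is a hypothetical helper.
# Succeeds when the current working directory is the given directory
# or somewhere beneath it.
cwd_is_inside() {
    mount_point=${1%/}                  # drop a trailing slash, if any
    case "$(pwd -P)/" in
        "$mount_point"/*) return 0 ;;   # inside the mount point
        *)                return 1 ;;   # somewhere else
    esac
}

if cwd_is_inside /mnt/cdrom; then
    echo "Working directory is inside /mnt/cdrom; run cd before unmounting."
else
    echo "Working directory is outside /mnt/cdrom."
fi
```

This covers only our own shell. Other processes can keep a device busy too. Where the fuser program is installed, fuser -vm /mnt/cdrom lists the processes using the mounted file system.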

Unmounting a device entails writing all the remaining data to the device so that it can be safely removed. If the device is removed without unmounting it first, the possibility exists that not all the data destined for the device has been transferred. In some cases, this data may include vital directory updates, which will lead to file system corruption, one of the worst things that can happen on a computer.

Determining Device Names

It’s sometimes difficult to determine the name of a device. In the old days, it wasn’t very hard. A device was always in the same place, and it didn’t change. Unix-like systems like it that way. When Unix was developed, “changing a disk drive” involved using a forklift to remove a washing machine–sized device from the computer room. In recent years, the typical desktop hardware configuration has become quite dynamic, and Linux has evolved to become more flexible than its ancestors.

In the examples in the previous section, we took advantage of the modern Linux desktop’s capability to “automagically” mount the device and then determine the name after the fact. But what if we are managing a server or some other environment where this does not occur? How can we figure it out?

First, let’s look at how the system names devices. If we list the contents of the /dev directory (where all devices live), we can see that there are lots and lots of devices.

[user@linux ~]$ ls /dev

In addition, we often see symbolic links such as /dev/cdrom, /dev/dvd, and /dev/floppy, which point to the actual device files, provided as a convenience.

If you are working on a system that does not automatically mount removable devices, you can use the following technique to determine how the removable device is named when it is attached. First, start a real-time view of the /var/log/messages or /var/log/syslog file (you may require superuser privileges for this).

[user@linux ~]$ sudo tail -f /var/log/messages

The last few lines of the file will be displayed and then will pause. Next, plug in the removable device. In this example, we will use a 16MB flash drive. Almost immediately, the kernel will notice the device and probe it.

Jul 23 10:07:53 linux kernel: usb 3-2: new full speed USB device using uhci_hcd and
Jul 23 10:07:53 linux kernel: usb 3-2: configuration #1 chosen from 1 choice
Jul 23 10:07:53 linux kernel: scsi3 : SCSI emulation for USB Mass Storage devices
Jul 23 10:07:58 linux kernel: scsi scan: INQUIRY result too short (5), using 36
Jul 23 10:07:58 linux kernel: scsi 3:0:0:0: Direct-Access Easy Disk .00 PQ: 0
Jul 23 10:07:59 linux kernel: sd 3:0:0:0: [sdb] 31263 512-byte hardware sectors (16
Jul 23 10:07:59 linux kernel: sd 3:0:0:0: [sdb] Write Protect is off
Jul 23 10:07:59 linux kernel: sd 3:0:0:0: [sdb] Assuming drive cache: write through
Jul 23 10:07:59 linux kernel: sd 3:0:0:0: [sdb] 31263 512-byte hardware sectors (16
Jul 23 10:07:59 linux kernel: sd 3:0:0:0: [sdb] Write Protect is off
Jul 23 10:07:59 linux kernel: sd 3:0:0:0: [sdb] Assuming drive cache: write through
Jul 23 10:07:59 linux kernel: sdb: sdb1
Jul 23 10:07:59 linux kernel: sd 3:0:0:0: [sdb] Attached SCSI removable disk
Jul 23 10:07:59 linux kernel: sd 3:0:0:0: Attached scsi generic sg3 type 0

After the display pauses again, press ctrl-C to get the prompt back. The interesting parts of the output are the repeated references to [sdb], which matches our expectation of a SCSI disk device name. Knowing this, these two lines become particularly illuminating:

Jul 23 10:07:59 linux kernel: sdb: sdb1
Jul 23 10:07:59 linux kernel: sd 3:0:0:0: [sdb] Attached SCSI removable disk

This tells us the device name is /dev/sdb for the entire device and /dev/sdb1 for the first partition on the device. As we have seen, working with Linux is full of interesting detective work!
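The detective work can be partly automated. The following sketch pulls the bracketed disk names out of kernel log lines like the ones above and prints each as a /dev path. The function name list_sd_devices is hypothetical, and the pattern assumes names of the form sd followed by letters, as seen in this log.

```shell
#!/bin/sh
# Sketch only: list_sd_devices is a hypothetical helper.
# It extracts names such as [sdb] from log lines on standard input
# and prints each distinct name once, as a /dev path.
list_sd_devices() {
    sed -n 's|.*\[\(sd[a-z][a-z]*\)].*|/dev/\1|p' | sort -u
}

# Sample lines taken from the log output above.
list_sd_devices <<'EOF'
Jul 23 10:07:59 linux kernel: sd 3:0:0:0: [sdb] Assuming drive cache: write through
Jul 23 10:07:59 linux kernel: sdb: sdb1
Jul 23 10:07:59 linux kernel: sd 3:0:0:0: [sdb] Attached SCSI removable disk
EOF
```

On a live system, the input could come from the log file instead, for example sudo tail -n 50 /var/log/messages | list_sd_devices.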

With our device name in hand, we can now mount the flash drive. The device name will remain the same as long as it remains physically attached to the computer and the computer is not rebooted.

Creating New File Systems

Suppose that we want to reformat the flash drive with a Linux native file system, rather than the FAT32 system it has now. This involves two steps.

  • (optional) Create a new partition layout if the existing one is not to our liking.
  • Create a new, empty file system on the drive.

Warning: In the following exercise, we are going to format a flash drive. Use a drive that contains nothing you care about because it will be erased!

Manipulating Partitions With fdisk Command

fdisk is one of a host of available programs (both command line and graphical) that allows us to interact directly with disk-like devices (such as hard disk drives and flash drives) at a very low level. With this tool we can edit, delete, and create partitions on the device. To work with our flash drive, we must first unmount it (if needed) and then invoke the fdisk program.

[user@linux ~]$ sudo umount /dev/sdb1
[user@linux ~]$ sudo fdisk /dev/sdb

Notice that we must specify the device in terms of the entire device, not by partition number. After the program starts up, we will see the following prompt.

Command (m for help):

Entering an m will display the program menu.

Command action
  a   toggle a bootable flag
  b   edit bsd disklabel
  c   toggle the dos compatibility flag
  d   delete a partition
  l   list known partition types
  m   print this menu
  n   add a new partition
  o   create a new empty DOS partition table
  p   print the partition table
  q   quit without saving changes
  s   create a new empty Sun disklabel
  t   change a partition's system id
  u   change display/entry units
  v   verify the partition table
  w   write table to disk and exit
  x   extra functionality (experts only)

Command (m for help):

The first thing we want to do is examine the existing partition layout. We do this by entering p to print the partition table for the device.

Command (m for help): p

Disk /dev/sdb: 16 MB, 16006656 bytes
1 heads, 31 sectors/track, 1008 cylinders
Units = cylinders of 31 * 512 = 15872 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               2        1008       15608+   b  W95 FAT32

In this example, we see a 16MB device with a single partition (1) that uses 1,007 of the available 1,008 cylinders on the device (cylinders 2 through 1,008). The partition is identified as a Windows 95 FAT32 partition. Some programs will use this identifier to limit the kinds of operations that can be done to the disk, but most of the time it is not critical to change it. However, in the interest of this demonstration, we will change it to indicate a Linux partition. To do this, we must first find out what ID is used to identify a Linux partition. In the previous listing, we see that the ID b is used to specify the existing partition. To see a list of the available partition types, we refer to the program menu.

If we enter l at the prompt, a large list of possible types is displayed. Among them we see b for our existing partition type and 83 for Linux. Going back to the menu, we see the t action to change a partition ID. We enter t at the prompt and enter the new ID.

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): 83
Changed system type of partition 1 to 83 (Linux)

This completes all the changes we need to make. Up to this point, the device has been untouched (all the changes have been stored in memory, not on the physical device), so we will write the modified partition table to the device and exit. To do this, we enter w at the prompt.

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: If you have created or modified any DOS 6.x
partitions, please see the fdisk manual page for additional
information.
Syncing disks.
[user@linux ~]$

If we had decided to leave the device unaltered, we could have entered q at the prompt, which would have exited the program without writing the changes. We can safely ignore the ominous-sounding warning message.
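Before moving on, the numbers in the partition table printed earlier can be cross-checked with a little shell arithmetic. This sketch uses only the values from that listing (Start 2, End 1008, and units of 31 * 512 bytes). The variable names are chosen for illustration.

```shell
#!/bin/sh
# Values copied from the fdisk listing for /dev/sdb1.
start=2
end=1008
bytes_per_cylinder=$((31 * 512))          # "Units = cylinders of 31 * 512"

cylinders=$((end - start + 1))            # both ends are included
bytes=$((cylinders * bytes_per_cylinder))

echo "cylinders in partition: $cylinders"
echo "bytes in partition:     $bytes"
echo "whole 1K blocks:        $((bytes / 1024))"
echo "bytes left over:        $((bytes % 1024))"
```

The result is 15,608 whole blocks with half a block left over, which agrees with the Blocks column (15608+), where the plus sign marks a partition that does not end on a whole-block boundary. To print a partition table without entering the interactive program, sudo fdisk -l /dev/sdb can be used.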

Creating A New File System With mkfs Command

With our partition editing done (lightweight though it might have been), it’s time to create a new file system on our flash drive. To do this, we will use mkfs (short for “make file system”), which can create file systems in a variety of formats. To create an ext4 file system on the device, we use the -t option to specify the ext4 system type, followed by the device name of the partition we want to format (here /dev/sdb1, not the whole device /dev/sdb).

[user@linux ~]$ sudo mkfs -t ext4 /dev/sdb1
mke2fs 2.23.2 (12-Jul-2011)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
3904 inodes, 15608 blocks
780 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=15990784
2 block groups
8192 blocks per group, 8192 fragments per group
1952 inodes per group
Superblock backups stored on blocks:
  8193

Writing inode tables: done
Creating journal (1024 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 34 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
[user@linux ~]$

The program will display a lot of information when ext4 is the chosen file system type. To reformat the device to its original FAT32 file system, specify vfat as the file system type.

[user@linux ~]$ sudo mkfs -t vfat /dev/sdb1

This process of partitioning and formatting can be used anytime additional storage devices are added to the system. While we worked with a tiny flash drive, the same process can be applied to internal hard disks and other removable storage devices like USB hard drives.
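For readers without a spare flash drive, formatting can be practiced on an ordinary file instead of a real device. The sketch below is one way to do that, under a few assumptions: the file name practice.img is made up, the image is created in a temporary directory, and mkfs.ext4 (the program that mkfs -t ext4 runs) may not be on an ordinary user's search path, in which case the formatting step is skipped.

```shell
#!/bin/sh
# Sketch only: practice on a 16 MiB image file rather than a real device.
workdir=$(mktemp -d)
img="$workdir/practice.img"               # hypothetical file name

# Create 16 blocks of 1 MiB each, filled with zeros.
dd if=/dev/zero of="$img" bs=1048576 count=16 2>/dev/null

if command -v mkfs.ext4 >/dev/null 2>&1; then
    # -q keeps the output short; -F tells the program to proceed even
    # though the target is a regular file and not a block device.
    mkfs.ext4 -q -F "$img" && echo "ext4 file system created in $img"
else
    echo "mkfs.ext4 not found; $img is an empty 16 MiB image"
fi
```

No superuser privileges are needed because only a regular file is written, so a mistake here cannot erase a disk.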

Testing And Repairing File Systems

In our earlier discussion of the /etc/fstab file, we saw some mysterious digits at the end of each line. Each time the system boots, it routinely checks the integrity of the file systems before mounting them. This is done by the fsck program (short for “file system check”). The last number in each fstab entry specifies the order in which the devices are to be checked. In our previous example, we see that the root file system is checked first, followed by the home and boot file systems. Devices with a zero as the last digit are not routinely checked.
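To see the checking order at a glance, the last field can be extracted and sorted. This is a minimal sketch. The function name fsck_order is hypothetical, and the sample lines are the ones from our earlier /etc/fstab example.

```shell
#!/bin/sh
# Sketch only: fsck_order is a hypothetical helper.
# It prints "order mount_point" for every entry whose sixth field is
# greater than zero, lowest order first. Entries with the same order
# keep the sequence they have in the file (sort -s).
fsck_order() {
    awk '$1 !~ /^#/ && NF >= 6 && $6 > 0 { print $6, $2 }' "$@" | sort -s -n -k1,1
}

fsck_order <<'EOF'
LABEL=/12         /             ext4    defaults        1 1
LABEL=/home       /home         ext4    defaults        1 2
LABEL=/boot       /boot         ext4    defaults        1 2
tmpfs             /dev/shm      tmpfs   defaults        0 0
proc              /proc         proc    defaults        0 0
EOF
```

The virtual file systems drop out of the list because their last field is zero.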

In addition to checking the integrity of file systems, fsck can also repair corrupt file systems with varying degrees of success, depending on the amount of damage. On Unix-like file systems, recovered portions of files are placed in the lost+found directory, located in the root of each file system.

To check our flash drive (which should be unmounted first), we could do the following.

[user@linux ~]$ sudo fsck /dev/sdb1
fsck 1.40.8 (13-Mar-2016)
e2fsck 1.40.8 (13-Mar-2016)
/dev/sdb1: clean, 11/3904 files, 1661/15608 blocks

These days, file system corruption is quite rare unless there is a hardware problem, such as a failing disk drive. On most systems, file system corruption detected at boot time will cause the system to stop and direct you to run fsck before continuing.
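When fsck is run from a script, its exit status tells us what happened. The status is a sum of the condition codes documented in the fsck manual page. The sketch below decodes it. The function name explain_fsck_status is hypothetical, and the wording of each message is paraphrased.

```shell
#!/bin/sh
# Sketch only: explain_fsck_status is a hypothetical helper.
# It prints one line for each condition bit set in an fsck exit status.
explain_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for pair in \
        "1:file system errors corrected" \
        "2:system should be rebooted" \
        "4:file system errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:checking canceled by user request" \
        "128:shared-library error"
    do
        bit=${pair%%:*}                   # number before the colon
        text=${pair#*:}                   # message after the colon
        if [ $((status & bit)) -ne 0 ]; then
            echo "$text"
        fi
    done
}

explain_fsck_status 0
explain_fsck_status 5
```

In practice, the function would be called right after the check, as in sudo fsck /dev/sdb1 followed by explain_fsck_status $? on the next line.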

Moving Data Directly To And From Devices

While we usually think of data on our computers as being organized into files, it is also possible to think of the data in “raw” form. If we look at a disk drive, for example, we see that it consists of a large number of “blocks” of data that the operating system sees as directories and files. However, if we could treat a disk drive as simply a large collection of data blocks, we could perform useful tasks, such as cloning devices.

The dd program performs this task. It copies blocks of data from one place to another. It uses a unique syntax (for historical reasons).

dd if=input_file of=output_file [bs=block_size [count=blocks]]
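The bs and count operands are easiest to understand with a harmless experiment. In this sketch, the input is /dev/zero (an endless supply of zero bytes) and the output is a temporary file, so no real device is touched. The variable names are chosen for illustration.

```shell
#!/bin/sh
# Copy 4 blocks of 512 bytes each from /dev/zero into a scratch file.
out=$(mktemp)
dd if=/dev/zero of="$out" bs=512 count=4 2>/dev/null

# The file size should equal bs * count.
size=$(wc -c < "$out" | tr -d ' ')
echo "copied $size bytes"
```

Without count, dd keeps copying until it reaches the end of the input, which is why count is needed for an endless source like /dev/zero.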

Let’s say we had two USB flash drives of the same size and we wanted to exactly copy the first drive to the second. If we attached both drives to the computer and they are assigned to devices /dev/sdb and /dev/sdc, respectively, we could copy everything on the first drive to the second drive with the following:

dd if=/dev/sdb of=/dev/sdc

Alternatively, if only the first device were attached to the computer, we could copy its contents to an ordinary file for later restoration or copying.

dd if=/dev/sdb of=flash_drive.img
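The same kind of copy can be rehearsed safely with ordinary files standing in for the devices. This sketch builds a small stand-in file, copies it with dd, and uses cmp to confirm that the copy is identical byte for byte. The file names are hypothetical, and source.img plays the role of /dev/sdb.

```shell
#!/bin/sh
# Sketch only: ordinary files stand in for real devices.
workdir=$(mktemp -d)
src="$workdir/source.img"                 # plays the role of /dev/sdb
copy="$workdir/flash_drive.img"

# Build the stand-in "device": a line of text followed by 63 KiB of zeros.
printf 'pretend this is a flash drive\n' > "$src"
dd if=/dev/zero bs=1024 count=63 >> "$src" 2>/dev/null

# Same form as the commands above: input file in, output file out.
dd if="$src" of="$copy" 2>/dev/null

# cmp -s is silent and succeeds only if the two files are identical.
if cmp -s "$src" "$copy"; then
    echo "copy is identical"
else
    echo "copy differs"
fi
```

With real devices, it pays to double-check the of= name before pressing Enter, because dd overwrites whatever it names without asking for confirmation.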

Summary

In this chapter, we looked at the basic storage management tasks. There are, of course, many more. Linux supports a vast array of storage devices and file system schemes. It also offers many features for interoperability with other systems.