Planning the Architecture
Before any disk is touched, a solid design plan turns a messy set of drives into a coherent storage foundation. Start by cataloguing every physical device – model, capacity, and health status. A quick smartctl -a /dev/sdX run flags impending failures long before they become system‑wide problems. With the inventory in hand, decide how many disks will belong to each logical array. The simplest split is one array for the root file system and another for data, but larger setups often combine multiple RAID levels into a single, flexible pool.
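As a sketch of that inventory pass, the loop below prints the model, capacity, and SMART health verdict for every candidate drive; the device names sdb through sde are placeholders for whatever your system actually exposes.

    # Hypothetical device list – substitute the disks present on your system
    for disk in /dev/sd{b,c,d,e}; do
        echo "=== $disk ==="
        # Model and capacity as the kernel sees the block device
        lsblk -d -o NAME,MODEL,SIZE "$disk"
        # Overall SMART verdict; the full report comes from 'smartctl -a'
        smartctl -H "$disk"
    done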
Choosing the right RAID level is a balancing act between speed, capacity, and safety. If uptime is critical, RAID‑1 mirrors every write across all members, trading space for instant resilience. RAID‑10 pairs mirroring and striping to provide the performance of striping with the protection of mirroring, but it requires at least four disks. RAID‑5 and RAID‑6 offer better capacity efficiency by storing parity alongside the data; RAID‑6 keeps two independent parity blocks per stripe, tolerating two simultaneous disk failures. In a server that hosts a web application, a RAID‑10 for the OS and a RAID‑6 for the database can give both performance and survivability.
Disk alignment cannot be ignored, especially with SSDs or NVMe devices. Starting every partition on a 1 MiB boundary keeps logical blocks aligned with the physical erase unit, eliminating needless read/write amplification. When building a striped array with mdadm, the --chunk option sets how much data lands on each member before the array moves to the next one; the full stripe width is the chunk size multiplied by the number of data-bearing disks. RAID‑1 involves no striping, so the chunk size is irrelevant there; for RAID‑5 or RAID‑10, 128 KiB or 256 KiB are common choices, and the filesystem should be created with matching stripe settings. A mismatch does not corrupt data, but it produces inefficient I/O patterns and a measurable performance penalty.
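A quick way to confirm alignment is shown below; the device name and the 256 KiB chunk are illustrative, and modern partitioning tools already default to 1 MiB boundaries.

    # Check that partition 1 starts on an optimal (1 MiB) boundary
    parted /dev/sdb align-check optimal 1

    # Worked example of stripe width for a 4-disk RAID-10 with a 256 KiB chunk:
    # 2 data-bearing disks per stripe x 256 KiB chunk = 512 KiB full stripe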
With the RAID plan set, lay out the LVM hierarchy. A volume group (VG) can span one or more physical volumes (PVs), and logical volumes (LVs) are carved from that pool. A single VG that contains all arrays simplifies management but couples every LV to the same redundancy. Splitting VGs – one for the system, one for logs, one for applications – offers isolation: a degraded array only affects the VG that owns it. Decide early which model suits your growth trajectory.
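As an illustration of the split-VG model, the eventual commands would look roughly like the sketch below; it assumes two arrays, /dev/md0 for the system and /dev/md1 for data, and the volume group and logical volume names are placeholders.

    # One VG per array keeps a degraded array's blast radius contained
    pvcreate /dev/md0 /dev/md1
    vgcreate vg_sys  /dev/md0
    vgcreate vg_data /dev/md1

    # Carve LVs from each pool; sizes are illustrative
    lvcreate -L 30G  -n lv_root vg_sys
    lvcreate -L 10G  -n lv_logs vg_data
    lvcreate -L 200G -n lv_app  vg_data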
Next, sketch the backup strategy. RAID protects against disk loss but not against accidental deletion or corruption. LVM snapshots capture a point‑in‑time copy without shutting down the system. Plan the snapshot size and retention policy: the copy‑on‑write area must absorb every block that changes while the snapshot exists, so a busy 20 GiB LV may need 5 GiB or more, while 1 GiB can be enough for a quiet volume – an undersized snapshot simply fills up and becomes invalid. Snapshots consume free VG space, so monitor that budget closely. Integrate a cron job that creates a snapshot, triggers an off‑site backup, and then deletes the snapshot. This cycle turns a risky manual process into a repeatable routine.
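A minimal sketch of that cycle, assuming a logical volume named vg_data/lv_app and an off-site target reachable over SSH (both placeholders), might be scheduled from cron as follows.

    #!/bin/bash
    # /usr/local/sbin/snapshot-backup.sh – illustrative names and sizes
    # crontab entry: 30 2 * * * /usr/local/sbin/snapshot-backup.sh
    set -euo pipefail

    # 1. Create a point-in-time snapshot; the 5G CoW space must absorb all
    #    writes that land on lv_app while the backup runs.
    lvcreate --size 5G --snapshot --name lv_app_snap /dev/vg_data/lv_app

    # 2. Mount it read-only and push the contents off-site.
    mkdir -p /mnt/snap
    mount -o ro /dev/vg_data/lv_app_snap /mnt/snap
    rsync -a /mnt/snap/ backup@offsite.example.com:/backups/lv_app/

    # 3. Tear the snapshot down so it stops consuming free VG space.
    umount /mnt/snap
    lvremove -f /dev/vg_data/lv_app_snap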
Finally, document everything. A concise diagram that maps each physical disk to its RAID level, the corresponding PV, the VG, and the LVs that depend on it reduces the learning curve for anyone who touches the system later. Keep the documentation in a versioned repository alongside the server’s configuration scripts. With planning done, you can move from theory to implementation confidently.
Building a Software RAID Array for LVM Use
With the architecture mapped, the next step is to materialise the physical layer using mdadm. Begin by ensuring that none of the target disks are in use. If a disk already has partitions, wipe the partition table with sgdisk --zap-all /dev/sdX and recreate a fresh table. Use fdisk or gdisk to carve a single partition that covers the entire device. Set the partition type to Linux RAID – 0xFD in fdisk’s MBR scheme, FD00 in gdisk’s GPT scheme – so that the partition’s purpose is obvious to both tools and administrators.
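Sketching that preparation for one hypothetical member disk, the sequence looks roughly like this; repeat it for every member.

    # Destroy any existing partition table and leftover RAID/LVM signatures
    sgdisk --zap-all /dev/sdb
    wipefs --all /dev/sdb

    # Create one partition spanning the device, typed as Linux RAID (FD00 on GPT)
    sgdisk --new=1:0:0 --typecode=1:FD00 /dev/sdb

    # Confirm the result
    sgdisk --print /dev/sdb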
After every disk carries an identical partition spanning the full device, launch mdadm --create to bring the array online. A typical RAID‑10 command looks like this:
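The device names here are assumptions – four freshly partitioned members sdb1 through sde1 – and the 256 KiB chunk simply mirrors the value discussed in the planning section.

    # Assemble a new 4-disk RAID-10 array as /dev/md0
    mdadm --create /dev/md0 \
          --level=10 \
          --raid-devices=4 \
          --chunk=256 \
          /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1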
Immediately after the command, the kernel begins an initial resync that brings every member into a consistent state – copying data between mirror halves on RAID‑1/10, or computing and writing parity on RAID‑5/6. Monitoring can be done with watch -n 1 cat /proc/mdstat; the output shows the progress percentage and the estimated time remaining. When the resync reaches 100%, the array is stable and ready to be treated as a regular block device.
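Besides the live view from /proc/mdstat, mdadm itself can summarise the array; /dev/md0 is the device name assumed throughout this example.

    # One-shot summary: state, resync progress, and per-member health
    mdadm --detail /dev/md0

    # Machine-readable resync progress (sectors done / total, or "none")
    cat /sys/block/md0/md/sync_completed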
Persisting the array configuration across reboots requires updating /etc/mdadm.conf. Running mdadm --detail --scan >> /etc/mdadm.conf appends the UUID and device list, ensuring that the array is assembled automatically at boot. Without this step, a system restart could leave the array unassembled, making the underlying storage inaccessible.
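A hedged sketch of that step follows; note that the config path is /etc/mdadm/mdadm.conf on Debian-family systems and /etc/mdadm.conf on Red Hat-family systems, and the initramfs command shown is the Debian/Ubuntu variant.

    # Append the array definition (UUID and member list) to the config file
    mdadm --detail --scan >> /etc/mdadm.conf

    # Regenerate the initramfs so early boot can assemble the array
    update-initramfs -u          # Debian/Ubuntu
    # dracut --force             # RHEL-family equivalent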
Naming the MD device adds an extra layer of human readability. Although LVM stores its own metadata labels, a descriptive array name helps when troubleshooting. blkid only reads identifiers; to name the array itself, pass --name=raid10_main to mdadm --create (or change it later with mdadm --assemble --update=name), and udev will expose the array as a stable /dev/md/raid10_main symlink. When multiple arrays exist – say raid1_sys and raid5_data – consistent naming conventions prevent confusion.
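To confirm the name took effect – the device path and the raid10_main name are assumptions carried over from the example above:

    # The array name is stored in the metadata ...
    mdadm --detail /dev/md0 | grep -i name
    # ... and udev exposes it as a stable symlink
    ls -l /dev/md/raid10_main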
At this point, you have a live RAID array that presents itself as /dev/md0. The next layer is LVM, which will treat that array as a physical volume. Decide whether to use the array whole or to partition it: if you create a partition table on the MD device itself, it will expose partitions such as /dev/md0p1, and LVM must then reference those; for a raw array used as a single physical volume, reference /dev/md0 directly. Consistency is key: choose one representation and stick with it throughout the rest of the setup.
Creating Logical Volumes on Top of the RAID and Managing Them
Now that the RAID array is alive, it’s time to feed it into LVM. The first LVM step is pvcreate, which turns the block device into a physical volume. For a single array, the command is straightforward:
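Assuming the /dev/md0 array built in the previous section:

    # Initialise the RAID array as an LVM physical volume
    pvcreate /dev/md0

    # Verify the PV was registered and shows the expected size
    pvs /dev/md0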