CHAPTER 4


The File-paths to Success

In this chapter we take a look at what filesystems are and how different operating systems try to make sense of them. We explain the difference between separate root filesystems and unified root filesystems and why knowing this up front might save you a lot of headaches.

We then move on to looking at the filesystem presented to you on your Pi. We explore the standard layout and explain what goes where and what it actually does. We will also show off some simple commands that highlight some benefits of the Unix “everything is a file” approach to system design.

Next we will show you how to move around your system and you’ll learn about fully qualified and relative paths. We’ll show you how to create directories and files (and then copy, move and delete them) and we’ll show you how to create the Linux equivalent of shortcuts.

We’ll then round off the chapter with an exploration of Linux file permissions and show you how Linux decides who can do what to your files. We’ll also touch on users and groups and show you how to set ownership and file permissions on your own files.

What is a Filing System?

We had actually planned to start this section off with a dictionary definition of what a filing system is, followed by our interpretation. Unfortunately, as most of the definitions we could find basically said “a filing system is a system for filing”, we’re going to have to skip that part and go straight to our own definition. It does suggest, however, that although a filing system is easy to talk about, it’s a lot harder to define. For us though, we think a filing system is:

A way of sorting and categorizing data that makes finding something easier.

This definition covers everything from having a simple in tray (at least you know where the document is even if you can’t put your hand on it right away), to the way you view your contacts on your phone (that’s an alphabetical system right there) all the way to some of the more exotic systems such as the Dewey decimal system used to categorize non-fiction content at your local library.

Actually libraries are an interesting use-case for filing systems because most of them have at least two going on at the same time. Generally libraries tend to arrange fictional work by the surname (and then forename) of the author whereas the non-fiction section is sorted based on their Dewey Decimal number. By using two systems at once, they solve a fundamental problem. Storing books by the author’s name is great if you have a favorite author and want to find more of their books. When it comes to non-fiction though, you probably want to browse a selection of books on a given topic and more often than not you have no particular author in mind. If those books were sorted by author name you’d probably never be able to find anything.

Some libraries take this a step further by sorting fictional books first by genre (say Sci-Fi or Fantasy) and then within that category they sort by author name. This has some pretty neat benefits. You can still go straight to your favorite author as you’ll know the genre, but you can also browse for similar books with the same genre. If you are looking for the next best thing in the fantasy genre, you can simply stand in that section and flick through.

So far so good. Filing systems make finding things easier and we can see that it’s important to use the right filing system for the task, otherwise you’d probably be better off with no filing system at all. But what does this have to do with computers? Well, one of the major tasks for any computer is to process information and that means that it has to be able to store and retrieve information easily. Older computer systems did use the alphabetical system and it worked very well, but as soon as you started to build up a number of files, it started to get very messy and finding what you wanted became a challenge. This was addressed by having the ability to create a directory (often called a folder these days) in which you could put related files. This helped a lot, but the same old problem started to creep up on people. Yes, you could have a directory for accounts and another for tax returns, but what if you were an accounting firm and you had lots of accounts and lots of tax returns? Soon you were back to where you started, as you either had a small number of directories with large numbers of files or a large number of directories with only one or two files in them.

The last change to filesystem structure which remains familiar to this day was to allow directories to contain other directories. This would allow you to have a great deal of flexibility to store content in the most appropriate ways. As always, there’s more than one way to arrange a given set of files, but by and large the system has served us well and with modern search technology, finding what we’re looking for is easier than ever.

More than one Filesystem

Here is where we hit a bit of a snag with this approach. A filesystem sits on top of some form of storage. That can be a hard disk, a USB stick, a DVD or any number of storage mediums. Each device is effectively independent from any other. This makes sense because you can take a USB stick from your laptop and then use it on your desktop machine. Clearly there is no link between the filesystem on the USB stick and your laptop because otherwise your desktop wouldn’t have all the information it would need to find the files.

Each filesystem has a ‘root directory’. This is the first directory on a device and it holds all of the other files and directories. Like the root of a tree, all the files and directories branch off from this central location. This does pose a bit of a problem though. If each device has its own filesystem and root directory, how do you easily present this structure to the end user? It turns out that there are two approaches to solving this problem. The first is separate roots and the second is a unified filesystem.

Separate Roots

This approach was adopted by Microsoft all the way back in the days of MS-DOS and it’s still with us today. The approach is very simple. Each device has its own root entry and the user can then use this entry to locate the device of interest and then can simply navigate the filesystem as normal to find the file. Under Windows, each root is assigned its own letter. For historical reasons the system disk on Windows is referred to as ‘C:’ which you’ll often hear people refer to as ‘the C drive’. Floppy drives are assigned A: and B: (PCs with hard disks tended to have two floppy drives) and as few people had more than one hard disk, the CDROM drive tended to be assigned to D:.

This system is really easy to use and there is never any confusion about where a particular file is as you can tell simply by looking at the drive letter. However there are a couple of issues with this design. First, as root devices are assigned to letters and there are only 26 letters in the alphabet, you are limited to 26 devices. Hardly an issue to home users even today, but back when businesses were using large mainframes, they could have hundreds of such devices.

The second issue is that the user has to be aware of the physical location of their file. That can actually add complexity because it ties the location of the file to the physical device that the file is on. If a bigger disk is added to the system, and the files were moved to that new location, their root device would change. Any person (or software) that depends on a file’s location would need to be updated, and that’s never fun.

In short, although the system employed by Windows is simple and effective, it can become an administrative headache on very large and distributed systems. These days there are lots of ways around that and new technologies have emerged to hide this structure beneath the surface, so it really isn’t a huge issue for Windows systems today. However back when Unix reigned supreme, these technologies weren’t available and they went with the other approach, a unified filesystem.

Unified Filesystem

Linux, being based on Unix, has a unified filesystem. That means that unlike Windows, which has multiple root devices, a Linux system has only one and it is always mounted on ‘/’, which is Unix-speak for the root directory. If there is only one root directory, then how does Linux handle additional devices, all of which we know contain their own filesystems?

The solution is to take the root directory of the new device and then attach it to an existing directory in the tree. This is known as mounting a filesystem. This allows Linux to have an almost unlimited number of devices as it can attach them anywhere in the existing tree. This means that you could mount one device and then mount another device on a directory inside the first device. From the user’s point of view, they can move through all these different directories as though they were all on one big device. The physical structure (that is all the disks and the raw data on them) is completely hidden from the end users. They have no way to tell they’ve crossed from one device to another.

This solves the issues highlighted with the separate root approach in that the unified filesystem is consistent regardless of what the underlying mechanism is. In fact you can mount network file shares and even virtual filesystems into the tree. Of course the benefits of the separate root approach have now become the downside for the unified approach. It’s no longer easy to see where things physically live and this again can make things more confusing. Like Windows, Linux systems have evolved over time so that these problems aren’t quite as pronounced as they would otherwise be. However these benefits are mostly seen in the GUI and tend to be far less pronounced when using the command line.
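If you ever do need to know which filesystem a directory lives on, the command line can tell you. As a minimal sketch (the df command and its -P option are our addition here and aren’t covered elsewhere in this chapter), the following asks which mounted filesystem holds the root directory:

```shell
# Ask which mounted filesystem holds the root directory.
# -P requests the portable format: one line per filesystem.
root_info=$(df -P /)
printf '%s\n' "$root_info"

# The last column of the last line is the mount point.
mount_point=$(printf '%s\n' "$root_info" | awk 'END { print $NF }')
printf 'Mounted on: %s\n' "$mount_point"
```

Swapping / for any other path should show the filesystem that particular path sits on, which is a handy way to spot where one device ends and another begins.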

The Mac has to be Different

As an interesting comparison, OS X on the Mac makes use of both approaches. Under the covers, OS X is a Unix based system and so it does have a unified filesystem. However unless you go to the command line (and most Mac users don’t), you will never see evidence of this. When you attach a USB stick to the Mac, it will mount it in a similar way to Linux, but it will show it to the user as though it were its own root device, in a similar approach to Windows. However the Mac doesn’t assign a drive letter, it just sets up a unique name.

Bring it all Together

Admittedly that’s a fair bit of theory and you’re probably wondering where all the fun commands are that we promised to show you. You might also be wondering why we just bored you with all that and when (if ever) such information might be useful. The reason we’ve put emphasis on this theory up front is because when we started out with Linux, a lot of the things that caught us out were preconceptions from using other operating systems. When one of us first installed Linux some 15 years ago, it took several hours of swearing before he figured out what “You need a root ‘/’ device” meant. This isn’t just a Linux thing: dabbling with the BSD family of operating systems and its alternative way of handling disks caused us to wipe the wrong disk because we were thinking Linux and not BSD.

So hopefully this last section will save you from some of the pain that we went through when we first started out. You don’t need to memorize all the theory, but if you come away from this section with an appreciation that despite many similarities, Linux is not Windows or a Mac, then you will be ahead of the game.

Everything as a File

Now normally when we discuss this topic, we start by talking about hard disk partitions, because most people are familiar with those and there’s a nice easy mapping between the way Linux presents them and the physical device itself. As the Pi doesn’t technically have a hard disk (it pulls some strings to make the SD Card look and feel like one), this easy example isn’t actually available on the Pi. The thing is, it’s too good an example to pass up and so we’re going to stick with the classic. So without further ado, here are the files that describe the disk set up on one of our servers:

/dev/sda
/dev/sda1
/dev/sda2
/dev/sdb
/dev/sdb1

As you can see, these files all live in the /dev directory which is where Linux keeps all the device files. There’s quite a bit more to say on this, but we’ll come back to it a little later as trust us, it will make much more sense if we cover it last. So for now, let’s ignore the directory and focus on our collection of files. You’ll notice that in this example, all of the files start with the ‘sd’ prefix. This is because they are all ‘SCSI Disks’ and share the same driver. Out of interest, for machines that have IDE based disks, the prefix is ‘hd’ for ‘Hard Disk’.

So now we know that we have some SCSI based disks (SATA disks also show up as SCSI) but what else can we tell from those files? Well, you’ll notice that we have sda and we have sdb. The first SCSI disk on the system is assigned to sda and the second is assigned to sdb. It will probably come as no surprise to learn that the third disk will end up being assigned to sdc. As we don’t see an sdc in our example, we can assume that there are only two disks on this system.

Now we’re really getting somewhere. We can see that we have two SCSI disks on our system and now we just have one more thing to discuss - the numbers stuck on the end of the names. In this case, the number refers to the partition number on the drive. The first partition is 1 and the second partition is 2. Nice and simple with no real surprises, we’re sure. With this additional information we can specify not only which disk but also which partition on that disk.
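To make the naming scheme concrete, here is a small sketch that pulls a device name apart using nothing but shell parameter expansion. The variable names are ours, the path is simply one of the example files listed earlier, and we assume a single-letter disk name; nothing here touches a real disk.

```shell
dev=/dev/sdb1            # one of the example device files

name=${dev##*/}          # strip the directory part        -> sdb1
part=${name#sd?}         # strip 'sd' plus the disk letter  -> 1
disk=${name%"$part"}     # strip the partition number       -> sdb

echo "disk: $disk, partition: $part"
```

Running this prints "disk: sdb, partition: 1", which is exactly how we read the name by eye: second SCSI disk, first partition.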

This does of course raise the question of why you should care. Surely there must be an easier way to find out what disks are attached to the system? Well you’d be right, but we’re not looking at this to discover what’s connected to the system, rather we’re looking at how we can specify a particular device that we want to access. Remember, everything in Linux is a file and so even physical devices are represented in that way. When we want to access a hard disk or a particular partition, we do so by accessing the relevant file.

Some real world examples would be useful here. When you want to make changes to the partition table on a hard disk, you use the ‘fdisk’ tool. The fdisk tool will of course need to know which disk you want to work on. Assuming you want to partition your second disk, the command would look like this:

fdisk /dev/sdb

Because we’re partitioning the drive itself, we want to refer to the whole device and /dev/sdb lets us do that. But let’s say we have done our partitioning and we have one big partition that we want to format for Linux to use. The command for that is mkfs.ext4 (on modern Linux distributions) and as before, you’re going to need to tell the command what you want to format. In this case we want the first partition on our second disk so the command would look like this:

mkfs.ext4 /dev/sdb1

Again, this makes sense as we’re referring to a specific partition on a specific device. No big leaps of faith here, but we can now share a little secret with you. The mkfs.ext4 command doesn’t really care what filename you give it. If you had forgotten to put the ‘1’ on the end, it would have happily formatted the entire device. This is a case of “everything as a file” at work - because everything is a file and files are all accessed in the same way, our tools simply read and write to files - they neither know nor care what it is they’re writing to!

This concept takes a bit more effort to really understand. Under the covers Linux manages all of these different devices for us, each with its own specific requirements and drivers. To keep things simple for application developers (not to mention the users), Linux hides all of this complexity and instead presents each device to us as a specific file. This brings us nicely back to the /dev directory (we told you it would come up). The files in that directory are special in that they don’t really exist, at least not as physical files that are stored anywhere. Linux creates a virtual filesystem that it then populates with files for the devices that are available on a given computer. As our Pi doesn’t have any SATA or SCSI disks, you won’t find any /dev/sd files on your Pi.
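You can see “everything as a file” in action without going anywhere near a disk. The sketch below uses /dev/zero and /dev/null, two harmless device files found on practically every Linux system (we haven’t introduced them in the text, so treat them as our own choice of example). The point is that dd and echo read and write them exactly as they would any ordinary file.

```shell
# Read 16 bytes from the /dev/zero device as if it were a regular file.
bytes=$(dd if=/dev/zero bs=16 count=1 2>/dev/null | wc -c | tr -d ' ')
echo "read $bytes bytes from /dev/zero"

# Writing works the same way: /dev/null accepts anything and throws it away.
echo "this text goes nowhere" > /dev/null
```

Neither command needed to know it was talking to a device rather than a file on disk, which is precisely why a tool such as mkfs.ext4 will accept whatever filename you hand it.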

Filesystem Layout

Okay, now let’s take a look at how the filesystem is laid out on your Pi. You’ll find that the structure is very similar to what you’d find on any Linux system so not only will this help you master your Pi but you’ll also be able to navigate servers as well. So let’s get stuck in…

/ (Root Directory)

This is the root directory (not to be confused with the root user’s home directory) and represents the top of the file hierarchy. Everything goes somewhere in or under this directory, no exceptions. As we discussed earlier in the chapter, Linux has a unified hierarchy and this is where it all starts.

/root

This is the home directory for the root user and spends a large amount of its time being confused with the root directory (that is ‘/’). It can get more confusing because if you are in the root directory and someone asks you to “go into the root directory”, which one do they actually mean? Often you can easily tell from context, but if not, don’t be afraid to ask for clarification; you’d be surprised how often this one comes up.

/etc

/etc is arguably one of the most important directories on your system. It contains all of the configuration files for not only your system but also the applications that you might have installed (such as the Apache web server). Many users new to Linux take care to back up their applications but often forget about the configurations stored in /etc. At the end of the day the application can usually be replaced quite easily, but getting that config right from scratch can be a real headache. This is a directory you want to take great care of.

/proc

/proc is a virtual filesystem that the kernel uses to expose its internal state to userland tools. Everything you ever needed to know about the system state or running processes can be found in /proc. Two common examples are the CPU configuration (stored in /proc/cpuinfo) and memory usage (stored in /proc/meminfo). Most of this information is read only, which makes sense as it is just a virtual representation. However some files do allow communication to go both ways and you can potentially use them to tweak kernel and system settings whilst the machine is running.
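Because these are presented as files, you can read them with the same tools you would use on any text file. A minimal sketch, assuming a Linux system with /proc mounted (the loop and the seen counter are ours, purely for illustration):

```shell
seen=0
# These are ordinary reads; the kernel generates the content on demand.
for f in /proc/cpuinfo /proc/meminfo; do
    if [ -r "$f" ]; then
        echo "== first lines of $f =="
        head -n 3 "$f"
        seen=$((seen + 1))
    fi
done
echo "read $seen virtual files"
```

The exact contents will differ from machine to machine, so don’t worry if what you see on your Pi looks different from what you’d get on a desktop.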

/var

/var is generally where you’ll find files created by your applications and the system itself. For example most applications will store their logs in /var/log/ and many will store their lock files in /var/run/. The Apache web server used to use /var/www/ for storing a website’s files. In fact we highlighted this feature in a previous book. On some modern distributions this is no longer the case, with such files tending to be located in the /srv/ directory.

/boot

Traditionally the /boot directory actually lived on its own small partition on the first hard disk. At the time, the majority of computers were unable to boot from a single big partition, so it was very common to see these split out. On modern machines, this is no longer a problem and so this directory is often just an ordinary directory on the root filesystem. As its name suggests, it holds the key files needed to boot a system, including the bootloader and the Linux kernel itself.

/bin and /sbin

These locations store user and administrative programs respectively. Usually a normal user only has /bin in their path and so they effectively can’t see applications in /sbin. There are some applications that the user can access even though they’re generally used only by administrators, but the user needs to know where they are. Generally this doesn’t pose a problem and you won’t ever need to go looking for anything.
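If you’re curious which directories your own shell searches, the PATH variable holds the answer. Here’s a short sketch (the variable names path_count and ls_path are ours) that prints each entry on its own line and then asks the shell where it finds the ls program:

```shell
# Print each directory in the search path on its own line.
printf '%s\n' "$PATH" | tr ':' '\n'

# Count how many directories the shell will search.
path_count=$(printf '%s\n' "$PATH" | tr ':' '\n' | wc -l | tr -d ' ')
echo "$path_count directories in PATH"

# Ask the shell which file it would run for a given command name.
ls_path=$(command -v ls)
echo "ls lives at $ls_path"
```

A program that lives in a directory not on this list can still be run, but you would have to type its full path.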

/dev

We’ve already touched on this directory in the “Everything as a file” section. It contains a file for each device or sub-device on the current system and provides a way for system tools (and of course users) to easily access the hardware on a particular machine. Apart from the disk devices that we touched on earlier, there are also devices for graphics cards, sound cards, virtual terminals and more.

Note  You won’t find network cards in the /dev directory as they’re considered a special case. To find information on network interfaces, you need to use the ‘sudo ifconfig -a’ command. This will list all the network devices that Linux knows about.

/home

Traditionally all user home directories would be stored under /home and this was often its own partition, disk or network share. The idea was to keep user data separate from system data and applications. Most Unix based systems still follow this rule but there are occasions where you will find some home directories living elsewhere, e.g. the root user’s home directory. The Mac for example stores home directories in /Users and some businesses will put different users in different locations based upon their needs. While /home generally only contains home directories, there’s no requirement that home directories actually reside here.

/lib

This directory contains library files that are needed by various applications. Libraries allow functionality to be packaged up and then shared by other applications. A good example might be a database driver. So that these applications can find those libraries, they need to be installed in a known location and with a known file name. It’s rare that you’d need to poke around in this directory, but if you do you should be careful because breaking things in here could affect the stability of your system.

/lost+found

We mention this one for completeness but it’s not really part of the filesystem structure per se. Rather it’s where files get placed when the filesystem loses track of them. For example if a disk were to be damaged, you would need to run a disk repair. Some files may be recovered but for various reasons, it may be impossible to determine where that file came from. If that happens, Linux will place those lost souls in this directory. We’ve never had cause to look into this directory and most likely you’ll never need to look in there either. Lastly, this directory can turn up in the root directory of any mounted filesystem, not just in / because each filesystem has to track its lost souls independently.

/media

This directory is relatively new to Linux and was added to make a clear distinction between mounted external devices (such as those in /mnt) and removable media such as USB sticks, cameras and media players. These are usually handled automatically under Linux and so usually you won’t manually add or remove anything in this directory.

/mnt

Short for mount, the /mnt directory was traditionally where you would mount additional filesystems. If you wanted to attach a network share or an external hard disk, you would create a directory in /mnt and mount in there. Floppy disks and CDROMS were also generally found here as /mnt/floppy and /mnt/cdrom respectively. However in recent years this directory has mostly fallen by the wayside.

/usr

This is where most of the software on the machine ends up and so it is often the largest directory on a server (at least if you don’t count user home areas). Although it’s useful to know where the software lives on your machine, because everything is handled automatically for you, everything in here should “just work”.

/opt

This directory is a jack of all trades. On some systems it is packed full of applications and on others it remains completely empty. It’s usually used for third party software and applications. For example the Oracle database server installs in /opt/ by default. You probably won’t find much cause to use this directory, but be careful if you do because it’s easy to forget things that are sitting in /opt/ when it comes to taking backups and so forth.

/srv

Another relative newcomer, the /srv/ directory is the designated location for storing data for services that serve files. This is the new home for the Apache web server for example. Although this directory seems to be present, some applications still don’t make use of it either out of custom or simply because everyone is used to the contents being somewhere else. If you’re looking for things that used to be in /var/ this is probably a good place to look next.

/sys

This directory contains system information and like /proc it is stored only in memory, as it’s a virtual system. It doesn’t seem to get an awful lot of use and we ourselves have never even had to look inside this directory. However, we have it on good authority that /sys is very useful and something we should be very happy to have.

/tmp

This directory is the computer’s scratch pad and all sorts of applications create files in this directory. It is used whenever temporary file storage is needed (such as during processing). Historically this directory was supposed to be emptied on a reboot, but in practice it rarely was. With newer distributions moving to systemd, /tmp will almost certainly be vaporized on a reboot. Be very careful if you’re putting anything in here, as it is unlikely to survive you rebooting your Pi.
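Scripts typically don’t pick their own filenames in the scratch area; they ask for one. As a hedged sketch, here is how that might look with the mktemp command (our addition, not covered elsewhere in this chapter), which usually creates its file under /tmp unless your system is configured otherwise:

```shell
# Ask the system for a uniquely named scratch file (usually under /tmp).
scratch=$(mktemp)
echo "working notes" > "$scratch"
contents=$(cat "$scratch")
echo "stored '$contents' in $scratch"

# Tidy up ourselves rather than waiting for a reboot to do it.
rm -f "$scratch"
```

Cleaning up after yourself like this is good manners, and it also means you never come to rely on a file that may vanish the next time the Pi restarts.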

Wrapping it up

That pretty much sums up the key areas on the Linux filesystem. No doubt there are other little nooks and crannies that you might come across as you look around your system but they will probably be a subset of these. Remember, although this structure is followed by most systems, most doesn’t necessarily mean all and you might come across some differences or places where directories are used in other ways.

Putting it to Work

We’ve covered a huge amount so far in this chapter. We’ve looked at how filesystems work and how Unix systems (and Linux in particular) have adopted a unified approach. We looked at “Everything as a file” and we showed how these special files and filesystems fit into the grand scheme of things. We wrapped all of that up with an overview of what goes where on the filesystem and what each directory does. What we haven’t done yet is actually use the filesystem ourselves.

So far we have been pretty much “hands free” but that’s all going to change in this section. We’re going to start off by showing you how to create directories and move about on the file system. Once we’ve given you the power to create, we’ll then give you the power to destroy and you’ll be able to remove files and directories at will (not always a good idea mind). We’ll wrap this section up with a quick overview on Linux file permissions and how to read and set them. Let us begin!

Where are we? Using pwd

The first thing we need to show you is how to figure out where you are on the system. The easiest way to get your bearings is to look at the command prompt. We talked about this in the last chapter. As a quick refresher here is what the command prompt looks like while we’re sitting in our home directory:

[pi@raspberrypi ~]$

The tilde (aka the squiggly line) is shorthand for the current user’s home directory. We highlighted this feature when we first introduced it so we won’t go through it again, but let’s see what it looks like when we’re in the /usr/lib directory:

[pi@raspberrypi lib]$

Well this could pose a problem. We know we’re in /usr/lib but the prompt only shows us the last part of the path. This is actually a good idea because although you can set your prompt to show the whole path (and it seems like a good idea at the time) you will soon get irritated when the path takes up most of the screen. That doesn’t solve the problem though. For all we know we could be in /usr/lib or /danger/lib. Needless to say this could have unpleasant consequences. So what we need is the ‘pwd’ command. We touched on this useful tool in the last chapter as well, but in case you missed it, here it is again:

[pi@raspberrypi lib]$ pwd
/usr/lib
[pi@raspberrypi lib]$
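To see the difference between what the prompt shows and what pwd reports, here is a small sketch. It assumes /usr/lib exists on your system (it does on the Pi), and the basename command and variable names are ours, used only to mimic the prompt’s habit of showing the last part of the path.

```shell
cd /usr/lib

full=$(pwd)                 # the complete path
last=$(basename "$full")    # only the final component, as the prompt shows

echo "prompt shows: $last"
echo "pwd reports:  $full"
```

Two quite different directories could both show up as lib in the prompt, but pwd will always tell them apart.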

This tool is very useful for telling us where we are but it doesn’t tell us anything about what’s in the same location with us. It’s like being blindfolded and told you’re standing in the kitchen; it’s a great start, but you’re still effectively blind. You’d certainly like to know who and what is in the room with you and for that we need the ‘ls’ command. This command does have a large number of options though and so we will only cover the most common ones, those that we use every day. Actually you can just remember them as recipes because you’ll often just pass the same options time after time (or at least we do).

First, let’s get back to our home directories with:

[pi@raspberrypi lib]$ cd ~

What’s in here with us? Using ls

Now let’s see what’s in here with us:

[pi@raspberrypi ~]$ ls
Desktop    python_games
[pi@raspberrypi ~]$

Although we can’t easily show it in the book, Desktop and python_games are both colored deep blue. This tells us they are directories. At present we don’t have any files in this directory, or do we? Actually we do, but they are considered hidden files. Under Linux any file that begins with a period (or full stop) is considered hidden. There’s nothing special about the files themselves and more often than not they are various config or temporary files that various applications have created. We generally don’t want these files cluttering up our display, so ls and friends don’t show them. We can however force ls to show us those files with the ‘-a’ flag like so:

[pi@raspberrypi ~]$ ls -a
.  ..  .bash_history  .bash_logout  .bashrc  Desktop  .profile  python_games
[pi@raspberrypi ~]$
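If you’d like to try this without poking around your real home directory, the following sketch does the same thing in a throwaway directory. The file names and the use of mktemp -d to create the directory are our own choices for illustration.

```shell
demo=$(mktemp -d)     # throwaway directory so we don't clutter home
cd "$demo"
touch visible .hidden

plain=$(ls)           # hidden files are left out
all=$(ls -a)          # -a includes them, plus . and ..

echo "ls:    $plain"
# $all is left unquoted on purpose so the entries print on one line.
echo "ls -a:" $all
```

The plain ls reports only the file called visible, while ls -a also lists .hidden along with the two special entries . (this directory) and .. (the parent directory).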

Creating Files to Play with, Using Touch

For now we’ll leave the -a flag alone and create our own files to play with. As we haven’t covered how to create and edit text files yet (we will show you how to do that in Chapter 7) we’ll introduce you to another little tool called ‘touch’. Under Linux, files carry several timestamps, and the two you’ll meet most often are the last accessed timestamp and the last modified timestamp. These allow you to see when a file was last read and when it was last updated. This is useful from an administration point of view because you can see which files are actively being used, and various tools (such as backup scripts) use the last modified timestamp to figure out if a file has changed since they last looked at it. Sometimes it is useful to be able to update those timestamps without changing the contents of the file and this is where touch comes in. It touches the file, which updates the timestamps, but if the file doesn’t currently exist, then touch will create it for you. In other words it’s a great tool for creating empty files. So, let’s start off by creating a couple of originally named files:

[pi@raspberrypi ~]$ touch raspberry
[pi@raspberrypi ~]$ touch pi
[pi@raspberrypi ~]$ ls
Desktop pi python_games raspberry
[pi@raspberrypi ~]$

And that’s all there is to it. As you can see from the ls that followed it, we now have two additional files. This time they are gray in color which tells us that they’re normal files. By coloring the entries for us, ls makes it much easier to see what we’re doing. For example, any file that is executable will be colored in green, but we’ll come back to file permissions later in this chapter.
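Both sides of touch’s personality can be checked in a few lines. In this sketch (run in a throwaway directory of our own making, with variable names that are ours) we create an empty file and then touch a file that already has something in it, to confirm the contents are left alone:

```shell
demo=$(mktemp -d)
cd "$demo"

touch raspberry                     # file doesn't exist, so touch creates it
if [ -f raspberry ] && [ ! -s raspberry ]; then
    created="empty file"            # -s is true only for non-empty files
fi

echo "hello" > pi                   # a file with some content
touch pi                            # only the timestamps are updated
after=$(cat pi)

echo "raspberry is an $created; pi still contains '$after'"
```

This is why touch is safe to use on files you care about: it never alters what’s inside them.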

So far we only have two files, but if we had ten or twenty, it would start to get a bit crowded in here. The way to handle that is of course to create directories to store our files (and potentially other directories) in and that’s what we’ll look at next.

Note  Directories and folders are basically the same thing. Originally called directories, Microsoft started referring to them as folders which they felt was a better description. Although Linux used to use the term directories, as it has become more of a desktop operating system with people moving over from Windows, the folder terminology has become increasingly common.

Somewhere to Store our Files, Using Mkdir

To create a directory, we use the ‘mkdir’ or ‘make directory’ command. Unsurprisingly this will create a new directory. However if there is a file with the same name or the directory already exists, you will get an error message. Let’s create a directory called ‘pifun’ to store our files:

[pi@raspberrypi ~]$ mkdir pifun
[pi@raspberrypi ~]$ mkdir pifun
mkdir: cannot create directory 'pifun': File exists
[pi@raspberrypi ~]$

As you can see, trying to create the directory twice will cause an error. Don’t be distracted by ‘File exists’; the same message is used whether the name that already exists belongs to a file or a directory. Another quick ls and we’ll see that things are moving along quite nicely:

[pi@raspberrypi ~]$ ls
Desktop pi pifun python_games raspberry
[pi@raspberrypi ~]$
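One option worth knowing about is mkdir’s ‘-p’ flag, which creates any missing parent directories along the way and doesn’t complain if the directory is already there. The sketch below is only an illustration: it works in a scratch directory, and the names projects and blinky are our own invention.

```shell
demo=$(mktemp -d)

# -p creates pifun, projects and blinky in one go.
mkdir -p "$demo/pifun/projects/blinky"

# Asking for a directory that already exists is not an error with -p.
repeat=failed
mkdir -p "$demo/pifun" && repeat=ok

listing=$(ls "$demo/pifun")
echo "second mkdir: $repeat, pifun contains: $listing"
rm -r "$demo"
```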

Making Use of Our New Directory, Using the mv Command

So now that we have our directory, let’s tidy up the mess we’ve been making. As we want to move the files into our new directory (rather than just create a copy of them), we’ll need to use the ‘mv’ (or ‘move’) command. This command is a bit more complicated than the ones we’ve covered before as it takes two arguments rather than one. This makes sense though because not only do we need to tell mv what we want to move but we also need to tell it where we want it to move the file to. As with most file commands under Linux, the first argument is the source and the second argument is the destination. Let’s move those files now:

[pi@raspberrypi ~]$ mv pi pifun
[pi@raspberrypi ~]$ mv raspberry pifun
[pi@raspberrypi ~]$ ls
Desktop pifun python_games
[pi@raspberrypi ~]$

So far so good. Now we want to make sure that our files arrived in one piece. There are two ways we could do that. We could go into the pifun directory with ‘cd’ and then run the ls command or alternatively, we could simply give ls the path to the directory that we want to look into. We’ve already used the first approach, so let’s give the second approach a try:

[pi@raspberrypi ~]$ ls pifun
pi raspberry
[pi@raspberrypi ~]$
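The mv command is also how you rename things on Linux. If the destination is an existing directory, the file moves into it; if the destination is a name that isn’t in use yet, the file simply takes that name. Here is a small sketch of both cases, run in a scratch directory (the new name pie is just for illustration):

```shell
demo=$(mktemp -d)
touch "$demo/pi"
mkdir "$demo/pifun"

# Destination is an existing directory: the file moves into it.
mv "$demo/pi" "$demo/pifun"

# Destination is a new name: the file is renamed.
mv "$demo/pifun/pi" "$demo/pifun/pie"

result=$(ls "$demo/pifun")
echo "$result"
rm -r "$demo"
```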

Time for Some Cloning, How to use the cp Command

Admittedly this isn’t very exciting, but as a newly minted administrator, you’ll be spending a lot of time moving files about and checking where things are. So far you’ve learned how to move a file, but what if you just want to copy a file instead? When taking a backup or getting a selection of files ready to send to a friend, you want to actually keep the originals. For this we use the ‘cp’ command which you’ve probably already correctly guessed is short for ‘copy’. Let’s move into our new directory and copy one of our files:

[pi@raspberrypi ~]$ cd pifun
[pi@raspberrypi pifun]$ cp pi pi2
[pi@raspberrypi pifun]$ ls
pi pi2 raspberry
[pi@raspberrypi pifun]$

That worked well, we now have pi and pi2 just as we expected. Let’s try the same thing again only this time, we’ll copy a directory:

[pi@raspberrypi pifun]$ mkdir moarpi
[pi@raspberrypi pifun]$ cp moarpi moarpi2
cp: omitting directory 'moarpi'
[pi@raspberrypi pifun]$

That time it didn’t quite go according to plan. The reason our copy attempt failed was because by default ‘cp’ will only copy individual files. It won’t copy entire directories. The reasoning behind this is that when you copy a directory, you copy everything within it, including all of its files and directories. This could be a lot of data and could take up a lot of space as well as a lot of time to complete. Forcing us to be explicit about our intentions (which will soon become second nature to you) means that when we mean to copy a single file but accidentally pick a directory, we will get stopped before any copying takes place.

That’s all well and good but what if you really did want to copy that directory? When we first looked at the copy command we mentioned tasks like taking backups and let’s be honest, you’re much more likely to want to back up a directory than a list of specific files. We can get the behavior we’re looking for by telling ‘cp’ that we want to copy recursively. This will then copy the directory and anything within that directory to the destination. We specify this by using the ‘-r’ flag like so:

[pi@raspberrypi pifun]$ cp -r moarpi moarpi2
[pi@raspberrypi pifun]$ ls
moarpi moarpi2 pi pi2 raspberry
[pi@raspberrypi pifun]$

Unlike the copy command, when you move a directory there is no need to specify that you want to do so recursively. This is because moving a directory without its contents wouldn’t make an awful lot of sense.
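Our moarpi directory was empty, so it is hard to tell that ‘-r’ really does bring the contents along. The following sketch repeats the experiment in a scratch directory with a couple of files inside (slice1, slice2 and backup are names we made up for the purpose):

```shell
demo=$(mktemp -d)
mkdir "$demo/moarpi"
touch "$demo/moarpi/slice1" "$demo/moarpi/slice2"

# Without -r, cp refuses to copy the directory.
refused=no
cp "$demo/moarpi" "$demo/backup" 2>/dev/null || refused=yes

# With -r, the directory and everything inside it is copied.
cp -r "$demo/moarpi" "$demo/backup"
copied=$(ls "$demo/backup")

echo "plain cp refused: $refused"
echo "$copied"
rm -r "$demo"
```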

The Power to Destroy, Using the rm Command

So far we’ve shown you how to create files and directories and how to copy and move them about. Now we’re going to show you how to destroy those files with the remove or ‘rm’ command. It goes without saying that the rm command is one of the most dangerous in your arsenal. It can quite easily destroy an entire server if you’re not careful and we know people who have accidentally done just that.

For a change of pace, let’s look at how we would delete an empty directory. We can use the ‘rmdir’ command for this which is short for ‘remove directory’. Now the catch with this command is that it will only delete directories that are completely empty. If there is even a single file inside, this command will fail. This makes it very safe to use, but not all that practical as generally when you delete a directory, you also want to remove all of its contents. Let’s kill two birds with one stone:

[pi@raspberrypi pifun]$ rmdir moarpi2
[pi@raspberrypi pifun]$ rm moarpi
rm: cannot remove 'moarpi': Is a directory
[pi@raspberrypi pifun]$

We were able to delete moarpi2 with ‘rmdir’ because the directory itself was empty, but when we tried using the rm command, it refused to cooperate. This is because the rm command was written with similar reasoning to the copy command. As removing a directory is far more dangerous than simply copying it, this is probably a good thing. We can use the same ‘-r’ flag to tell rm to delete recursively:

[pi@raspberrypi pifun]$ rm -r moarpi
[pi@raspberrypi pifun]$

Success! Now sometimes when you try this, especially on a large directory with lots of files and sub-directories, rm will keep stopping to ask you questions. For example, some files might be write protected, and rm will want you to confirm each one before it deletes it. You can tell rm to stop asking and just get on with it by using the ‘-f’ flag. This means ‘force’ and is akin to saying “Damn the torpedoes! Full speed ahead!”. That sounds like a great idea until you stop to think what would happen if you ran this command (which you should NEVER do):

[pi@raspberrypi pifun]$ rm -rf /

If you accidentally run that command, rm will proceed to delete absolutely everything on your system. If you happen to have a USB hard disk attached or you’ve mounted some network shares, then you’re in real trouble because rm won’t confine itself to your internal disks - it will crawl through the entire tree deleting everything in its wake. Recent versions of rm do refuse to work on / itself unless you explicitly switch that safety catch off, but they offer no such protection for any other path, so a stray space or a mistyped directory name can still do enormous damage. This is one of the reasons why you should use a normal user account for day to day tasks. Your own user will not have sufficient privileges to delete anything critical to the system - but even then, chances are high that you can still damage all your attached media. You need to be very very careful whenever you use the rm command and you’d better double and triple check it because Linux will assume you know what you’re doing and it won’t ask for confirmation!

The ‘rm’ command can also remove files simply by providing the path to the file. You don’t need to use the ‘-r’ flag for this operation so you can simply do:

[pi@raspberrypi pifun]$ rm pi2
[pi@raspberrypi pifun]$ ls
pi raspberry
[pi@raspberrypi pifun]$

And that in a nutshell is how you move about and manipulate the file system.
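As a recap of the difference between rmdir and rm -r, here is a short sketch you can run safely because it only ever touches a scratch directory that it creates itself (the names full and crumb are invented for the example):

```shell
demo=$(mktemp -d)
mkdir "$demo/full"
touch "$demo/full/crumb"

# rmdir only removes empty directories, so this attempt is refused.
rmdir_refused=no
rmdir "$demo/full" 2>/dev/null || rmdir_refused=yes

# rm -r removes the directory together with its contents.
rm -r "$demo/full"

remaining=$(ls "$demo")
echo "rmdir refused: $rmdir_refused, left behind: '$remaining'"

# The scratch directory is now empty, so plain rmdir is enough to tidy up.
rmdir "$demo"
```

Getting into the habit of reaching for rmdir first, and only using rm -r when you are sure you want the contents gone as well, is a cheap way to avoid accidents.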

Fully Qualified and Relative Paths

In Linux there are two ways to specify a path. You can either give a fully qualified path that starts with a forward slash, or you can give a relative path which starts with either a filename, directory name, a dot or two dots. Strange though these might sound, they are both just ways of providing a specific location to your programs.

A path is considered fully qualified when it starts from a fixed reference point i.e. the root directory. Regardless of where you are on a system, a fully qualified path will always point to the same location. It’s like the old bell tower in the middle of town, if you give anyone directions using that as your reference point, you have a common anchor that both you and your friend know how to reach.

On the other hand a relative path depends on your current location to make sense. You can specify paths using ./ to mean the current directory or ../ to mean the next directory up. If you had a path that looks like ../../test.txt, this would only work from a few specific locations. It’s nice and short as well as being easy to type. The same file might be accessible with /home/pi/test.txt. Unlike the relative path, this one can be used from anywhere on the file system without any problem.

So when should you use one or the other? The answer is, you should use whichever option is the most convenient or makes the most sense for the task. Sometimes it’s faster or easier to use a fully qualified path. Other times you are buried deep down in the tree and writing fully qualified paths would be tedious at best and totally confusing at worst.
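To make the difference concrete, the sketch below builds a miniature copy of the layout from the example inside a scratch directory and then asks the realpath command (part of GNU coreutils, which we are assuming is installed) to turn each path into its fully qualified form. The directory called deep is our own addition.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/home/pi/pifun/deep"
touch "$demo/home/pi/test.txt"

# Standing inside 'deep', the file is two directories up.
relative=$(cd "$demo/home/pi/pifun/deep" && realpath ../../test.txt)

# The fully qualified path works no matter where we are standing.
absolute=$(cd / && realpath "$demo/home/pi/test.txt")

echo "$relative"
echo "$absolute"
rm -r "$demo"
```

Both lines of output are identical: the relative path and the fully qualified path are two routes to the same file, and the relative one only works because of where we were standing when we used it.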

Users and Groups

We’re only going to touch on the basics of users and groups here so that you know enough to understand the file permissions section that is coming up next. Users and groups are key to the way Linux secures your files and you’ll need to know about them before we move on to the next section.

In the Unix way of thinking every person has their own username. A username identifies a particular person or entity (for example a web server might have its own username) on a particular system. So far we’ve spent most of our time as the ‘pi’ user but we’ve also seen that we can become the ‘root’ user. Your username is the key that Linux uses to identify you as you.

Groups are similarly straightforward. Each user belongs to one primary group, but may actually be a member of any number of groups on the system. On a university system, a student’s username might have its own private group (standard practice on Red Hat and Debian systems these days) but they might also belong to a student group and a research group. They might also belong to a group specific to their department. Groups are useful to us administrators because we can group a selection of users together and treat them as a single entity. This makes things such as file permissions much easier to manage.

When you create a new user on your Pi, you will automatically create a group with the same name. On some systems, users would by default join a shared group (often simply called ‘users’) - but as you’ll see in the next section, this could lead to accidentally giving people access to files which they shouldn’t have. Because a private group is by definition private, no one else will be a member and so no one can gain access to your files just because they happen to be in the same group as you. This is why on modern systems you’ll usually see that the owner and group of a file happen to be the same.

We cover users and groups in more depth in Chapter 8 on BASH.
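In the meantime, you can ask Linux who it thinks you are with the id command. This is just a quick sketch; the output will of course depend on your own system, and on a fresh Pi you would expect the first two lines to both say pi.

```shell
user=$(id -un)         # your username
primary=$(id -gn)      # your primary (usually private) group
all_groups=$(id -Gn)   # every group you are a member of

echo "user: $user"
echo "primary group: $primary"
echo "all groups: $all_groups"
```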

File Permissions

File permissions allow you to express who you want to be able to access your files and what exactly they’re allowed to do with them. There are three different permissions that you can set. The first is whether or not someone can read your file. The second is whether they can write to your file and third is whether they can execute it (i.e. run it like an application).

Of course just being able to set these permissions on a file isn’t particularly flexible. You might want to give access to only a certain group of people and restrict everyone else. This is where users and groups come into play. On Linux there are effectively three roles that a given user might fall into. The first is user and refers to the owner of the file. The second is group which refers to the group that owns the file. The last is technically known as world but it is also often referred to as other.

Each role gets its own combination of permissions, that is you can define whether any of those three roles can read, write or execute your files. We’re going to show you how to do just that but before we do that, we need to show you how to see what permissions are actually in effect and so now is a good time to show you how to use the ‘-lh’ option for the ‘ls’ command. Let’s try running it now:

[pi@raspberrypi pifun]$ ls -lh
-rw-r--r-- 1 pi pi 0 Oct 7 16:29 pi
-rw-r--r-- 1 pi pi 0 Oct 7 16:58 pi2
-rw-r--r-- 1 pi pi 0 Oct 7 16:29 raspberry
[pi@raspberrypi pifun]$

The ‘-lh’ argument specifies that we want ls to show us a long listing, with one file per line along with its details (-l), and that we want file sizes to be in human readable format (-h). Without the human readable flag, ls will show us all sizes in bytes which when you’re dealing with large files is not very easy to read. It doesn’t really matter in this example because our files are empty anyway.

There are two things that we’re really interested in as far as file permissions go. The first describes the permissions currently in force and the second shows us which user and group owns the file. Let’s break it down for the raspberry file:

-rw-r--r-- 1 pi pi 0 Oct 7 16:29 raspberry

The file permissions part is:

-rw-r--r--

There are ten possible slots in that list and for now, we’re only interested in the last nine. If a particular permission is missing (or in the case of the first hyphen where it’s a normal file), ls will show a hyphen. A normal file always has a hyphen in the first slot. If it was referencing a directory, the first slot would be a ‘d’ to highlight that it’s not a file. This slot can also be an ‘l’ if the file is a link (or shortcut) and we’ll show you how to use these later in this chapter. For now though we can ignore the first slot and focus on the final nine slots.

The remaining nine slots are grouped in threes to give us three groups. These correspond to user, group and world roles respectively. Each of the three slots in each group represents a specific permission: read, write and execute. Where the permission is set, you will see a letter, but when the permission is not set, you will get the hyphen. If our raspberry file had all permissions set, it would look like this:

-rwxrwxrwx

Let’s split that out a bit so it’s a little easier to read:

-   rwx   rwx  rwx

So if we look at the first three, we can see that the owner has read, write and execute permissions. We can also see that group and world also have full permissions. To interpret what these permissions mean though we really need to know who actually owns the file. Let’s take a look at the part of the line that shows who owns the file:

pi pi

Well, that wasn’t too painful. Remember on modern Linux machines, users have their own private groups which are named after the user. That’s what we’re seeing here. The first ‘pi’ refers to the owner which is of course the pi user. By default when a file is created the group ownership is set to the user’s default group. In this case that would be our private group which is also called ‘pi’. So if we look at our original file entry:

-rw-r--r-- 1 pi pi 0 Oct 7 16:29 raspberry

we can read this as “The pi user has read and write privileges. The group has read privileges and world also has read privileges”. Linux applies these permissions in a specific order based on who you are. If your username matches the owner of the file, then the user permissions will apply when you try to access it. If you’re not the owner but you are in the same group as the file, then Linux will apply the group permissions to you. If you’re neither the owner nor in the same group, Linux will apply the permissions from the world role. In our example though, the permissions for world and group are identical, so if you’re not the owner, you will get the same level of access regardless. However only the owner can actually save changes to the file. There is one exception to this rule however - the root user. The root user is effectively immune to file permissions and can change permissions and file ownership for any file on the system, regardless of who the owner actually is.
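If you only want the permissions and the owner without the rest of the ls output, the stat command can print just those pieces. The sketch below assumes GNU stat (its -c option is not available everywhere) and uses chmod, which is covered properly in the Setting File Permissions section, purely to make sure the scratch file starts out with the same permissions as our raspberry file.

```shell
demo=$(mktemp -d)
touch "$demo/raspberry"
chmod u=rw,go=r "$demo/raspberry"     # read/write for the owner, read for everyone else

perms=$(stat -c %A "$demo/raspberry") # %A = the ten permission slots
owner=$(stat -c %U "$demo/raspberry") # %U = the owning user
group=$(stat -c %G "$demo/raspberry") # %G = the owning group

echo "$perms owner=$owner group=$group"
rm -r "$demo"
```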

The execute permission allows you to execute a file as a program. This is a security feature so you can effectively stop people executing commands that you don’t want them to. However you have to be careful because if a user can read your file, there is nothing stopping them from copying it to their own file and making that file executable. The execute bit has another purpose when it comes to directories. Obviously you can’t execute a directory, so instead this flag means that the user (or group or world) is allowed to enter the directory and reach the things inside it. Browsing the directory, that is doing an ‘ls’ on it, is governed by the read permission instead. If you give a user permission to read a directory but not to execute, they can have a peek and see the names of what’s hiding in there, but they won’t be able to access any of it. If you give them execute but not read, they will be able to open a file inside (provided the file’s own permissions allow it), but they wouldn’t be able to browse for it; they would have to know the file name in advance.

That’s really all there is to it. There is a feature called “Access Control Lists” (ACLs, which Linux stores using extended file attributes) but we’re not going to cover those in this book. They provide a great deal more flexibility than the standard model but are similarly more complicated. If you’re used to how Windows handles permissions, then you’ll find that ACLs are a bit more in line with what you’re used to.
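A quick experiment makes the directory behaviour concrete. This is only a sketch: it assumes you run it as a normal user (root is immune to these checks, so everything would succeed), it assumes GNU stat, and the names vault and secret are invented for the demonstration. The chmod command it relies on is explained in the Setting File Permissions section.

```shell
demo=$(mktemp -d)
mkdir "$demo/vault"
echo "hello" > "$demo/vault/secret"

# Read without execute: the names can be listed, but nothing inside can be opened.
chmod u=r,go= "$demo/vault"
mode_read_only=$(stat -c %A "$demo/vault")
ls "$demo/vault"
cat "$demo/vault/secret" 2>/dev/null || echo "cannot open secret"

# Execute without read: no listing, but a file whose name you know can be opened.
chmod u=x,go= "$demo/vault"
mode_exec_only=$(stat -c %A "$demo/vault")
ls "$demo/vault" 2>/dev/null || echo "cannot list vault"
known=$(cat "$demo/vault/secret")
echo "$known"

chmod u=rwx "$demo/vault"   # give ourselves access back so we can tidy up
rm -r "$demo"
```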

Setting File Permissions

First we’re going to look at how we can set file permissions and for this, Table 4-1 will be very helpful.

Table 4-1. Setting Permissions

Role              How to apply          What to apply
u - user          + - add               r - read
g - group         - - remove            w - write
o - other/world   = - explicitly set    x - execute
a - all

We will be using the chmod command which changes file permissions. You can specify permissions as a combination of the above values. These can be combined in three different ways. You can add permissions, take permissions away and explicitly set permissions. The difference is that the first two will leave all the other permissions intact after they’ve done their thing. If you explicitly set permissions, any unspecified permissions will be revoked.
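The difference between adding and explicitly setting is easiest to see side by side. In this sketch we use GNU stat to print the permission slots after each step; the scratch directory is our own and the file is simply called pi to match the examples in this chapter.

```shell
demo=$(mktemp -d)
touch "$demo/pi"
chmod u=rw,g=r,o=r "$demo/pi"
start=$(stat -c %A "$demo/pi")

# Add: the group gains write, everything else is left exactly as it was.
chmod g+w "$demo/pi"
after_add=$(stat -c %A "$demo/pi")

# Explicitly set: the group ends up with execute and nothing else.
chmod g=x "$demo/pi"
after_set=$(stat -c %A "$demo/pi")

echo "$start -> $after_add -> $after_set"
rm -r "$demo"
```

Notice that in the last step the group lost both read and write even though we never mentioned them; that is what “any unspecified permissions will be revoked” means in practice.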

Let’s start out removing all permissions from everyone for our pi file:

[pi@raspberrypi pifun]$ chmod a=  pi
[pi@raspberrypi pifun]$ ls -lh
---------- 1 pi pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

We can see that all file permissions have been removed from the file, but how does the command actually work? Well, permissions are specified with three parts: who you want the change to apply to, how you want the change applied and what you want the change to be. In this case we applied the changes to ‘a’ which is basically shorthand for ‘ugo’ i.e. it applies the changes to everyone. We used the equals sign which means we want to explicitly set the permissions and then we didn’t actually supply any permissions. If a permission is absent it is assumed not to be set and so in our example by not supplying any permissions we effectively revoked all of them regardless of what they were previously.

Seeing as it’s our file, we want to give ourselves full permissions. Admittedly the execute bit is not much use in this case (but you’ll find it invaluable when you start scripting - see Chapter 8) but we’re going to give it to ourselves anyway. We can do that with this command:

[pi@raspberrypi pifun]$ chmod u+rwx pi
[pi@raspberrypi pifun]$ ls -lh
-rwx------ 1 pi pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

Let’s pick this command apart. We specified that we wanted to change only the user’s permissions, that we wanted to add them (not that it mattered in this case because we’d removed all permissions beforehand so an equals sign would have done the same job) and that we wanted read, write and execute privileges. To wrap up this example let’s restore read access to the group and world roles:

[pi@raspberrypi pifun]$ chmod go+r pi
[pi@raspberrypi pifun]$ ls -lh
-rwxr--r-- 1 pi pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

Just for completeness let’s step through this last example. We want to apply the permissions to the group and other roles, we want to add the permissions to what is already there and we want to grant read privileges. And that is pretty much it for setting file permissions. There is an alternative style that uses numbers rather than letters to specify what permissions you want to set. To get the same permissions that we have already (i.e. the command doesn’t have any visible effect) we would use:

[pi@raspberrypi pifun]$ chmod 744 pi
[pi@raspberrypi pifun]$ ls -lh
-rwxr--r-- 1 pi pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

In this system, each permission has its own value. Read is 4, write is 2 and execute is 1. To set the permissions, you add up the numbers to get the total for each role. For example to set all permissions you’d add 4 and 2 and 1 to get 7. For read only you would simply do 4 plus 0 plus 0 which of course gives you 4. Put them all together (user first, then group, then world) and we get 744. This syntax is the original syntax used on most Unix systems. Using the letters is a relatively new idea but at the end of the day they both achieve the same results. The main benefit of the new syntax is that it’s a lot clearer and easier to pick up. Personally we tend to use the number style but that’s only because we’ve been doing it for so long and it’s become second nature to us. You should feel free to use whatever system you feel the most comfortable with.
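If the arithmetic feels a bit abstract, you can let the shell do the sums for you. The sketch below builds the number 744 one digit at a time and then checks that the number style and the letter style really do produce the same result. It assumes GNU stat, and the file names by_number and by_letter are ours.

```shell
# Each digit is the sum of read (4), write (2) and execute (1).
user=$((4 + 2 + 1))    # rwx
group=$((4 + 0 + 0))   # r--
world=$((4 + 0 + 0))   # r--
mode="$user$group$world"

demo=$(mktemp -d)
touch "$demo/by_number" "$demo/by_letter"
chmod "$mode" "$demo/by_number"
chmod u=rwx,go=r "$demo/by_letter"

number_style=$(stat -c %A "$demo/by_number")
letter_style=$(stat -c %A "$demo/by_letter")
echo "$mode gives $number_style, letters give $letter_style"
rm -r "$demo"
```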

So now you can manipulate permissions like a guru but we are still missing the second part of the puzzle: we haven’t shown you how to change ownership of the file. This is actually a lot less common than you might think, far less common than tweaking the occasional file permission that’s for sure. There’s also another little wrinkle. A normal user (that is anyone other than root) cannot actually change which user owns the file. The reason for this is that if you accidentally assigned the file to another user, you would have no way to actually get that file back again. Of course being all knowing and all seeing the root user can change the ownership for any file on the system.

We can simulate “rootness” by using sudo. As discussed earlier this little command acts as a filter of sorts. It always runs as root, regardless of who executes it, and executes commands as root on their behalf. To prevent any shenanigans, sudo will check the user and the command they are trying to run against an approved list. If you’re on that list (and the pi user is) you can execute all sorts of magic without ever technically becoming root yourself.

To use sudo all we have to do is prefix the command with the ‘sudo’ command. That’s pretty much it. When you first run sudo it will ask you for a password. This is the password for your particular user and not the password for the root user. The aim is that you can prove that you are the pi user and then sudo will check to see what the pi user is allowed to do. This means that if you have lots of users on your computer, and you want to let them do some more powerful commands but don’t want to give them root access, you can set up sudo to allow them to execute a specific command without having to hand over the keys to the mansion.

Let’s start off by trying to give the file to the root user using the ‘chown’ (or change ownership) command:

[pi@raspberrypi pifun]$ chown root pi
chown: changing ownership of 'pi': Operation not permitted
[pi@raspberrypi pifun]$

The ‘Operation not permitted’ message is Linux’s way of telling us to get stuffed. To pull this off we’ll need root privileges so let’s put sudo to work for us and run the command again:

[pi@raspberrypi pifun]$ sudo chown root pi
[pi@raspberrypi pifun]$ ls -lh
-rwxr--r-- 1 root pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

Success! We were able to change the owner to the root user. This will work with any valid user and any file or directory that you wish to change. There is another command called ‘chgrp’ which you won’t be surprised to know allows you to change which group owns a particular file. Now there is a bit of an issue with this command as well. Although normal users are allowed to change the group, they are only allowed to change it to a group of which they’re a member. If your user is only a member of its private group, then you won’t be able to do an awful lot with this command either.

Once again it’s root and sudo to the rescue. As root can do whatever it pleases, it can change the group accordingly. As it happens, it looks an awful lot like our last command:

[pi@raspberrypi pifun]$ sudo chgrp root pi
[pi@raspberrypi pifun]$ ls -lh
-rwxr--r-- 1 root root 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

And there we go - the file now belongs to the root group and the root user. When you do have to change file ownership though it’s much more common to need to change both the user and the group that owns the file. It’s relatively rare to change just the group (we can’t remember when we last used the chgrp command). The chown command provides a shortcut that allows us to set both a new owner and a new group at the same time. Let’s use this shortcut now to return the ownership of the file to our pi user. We’ll still need to use sudo of course:

[pi@raspberrypi pifun]$ sudo chown pi:pi pi
[pi@raspberrypi pifun]$ ls -lh
-rwxr--r-- 1 pi pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

With the shortcut you just specify the user and the group separated by a colon. One last thing we need to cover with these commands is that they only operate specifically on the file you provide. If you provide a directory rather than a file, they will change the ownership of the directory but those changes won’t filter down through to all the files. Sometimes that’s what you want, but more often you want the changes to propagate. Unlike the cp and rm commands that use -r, these two commands use -R (that is, they use the capital letter rather than the lowercase letter). Be careful when you use this because often file permissions and ownership are precisely set and if you waltz through obliterating them with your new version, there’s no way to undo the damage. As always double check what you’ve typed before you press the enter key.
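The chmod command accepts the same capital -R flag, and because changing permissions on your own files doesn’t need root, it makes for an easy way to watch the flag at work. This is a sketch in a scratch directory, assuming GNU stat; chown -R and chgrp -R propagate in the same way.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/pifun/moarpi"
touch "$demo/pifun/moarpi/pi"
chmod u=rw,go=r "$demo/pifun/moarpi/pi"

# Without -R only the directory we named is changed.
chmod go= "$demo/pifun"
inner_before=$(stat -c %A "$demo/pifun/moarpi/pi")

# With -R the change reaches everything beneath it as well.
chmod -R go= "$demo/pifun"
inner_after=$(stat -c %A "$demo/pifun/moarpi/pi")

echo "before: $inner_before, after: $inner_after"
rm -r "$demo"
```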

Shortcuts and Links

Linux allows you to create links (or shortcuts) by using the ln command (short for link). There are two types of link, one is called hard and one is called soft. A soft link is more like what you might see on a Windows system after using the “create shortcut” feature. It creates a file that is just a pointer to the real location of the file elsewhere on disk. The hard link however is more interesting. When you use a hard link you have effectively created two names for the same file. That might sound like semantics, and with most modern applications being able to follow a soft link there’s rarely a need to use a hard link. Hard links are also restricted to a single file system and that file system has to support them (most Linux filesystems do). The main benefit of a hard link is that the hard link is completely indistinguishable from the original file, they are simply two names pointing to the same location. To avoid confusion and to allow links to work across filesystems, you should use a soft link.

Let’s do a quick example to show this in action:

[pi@raspberrypi pifun]$ ln pi pi1
[pi@raspberrypi pifun]$ ln -s pi pi2
[pi@raspberrypi pifun]$ ls -lh
-rw-rw-r-- 2 miggyx miggyx    0 Oct  8 08:14 pi
-rw-rw-r-- 2 miggyx miggyx    0 Oct  8 08:14 pi1
lrwxrwxrwx 1 miggyx miggyx    3 Oct  8 08:33 pi2 -> pi
[pi@raspberrypi pifun]$

Let’s have a look at what we’ve got here. pi and pi1 are identical in every way but that’s not really a surprise because apart from the name, they are the same file. You will notice that the number after the file permissions block now shows ‘2’ for pi and pi1. This tells us that there are currently two filenames pointing at this particular file. Also not much of a surprise seeing as we’re the ones that created the second entry. Much more interesting is pi2 which we created with a soft link. First we can see that the file permissions have all been set. This isn’t a problem because when Linux follows the soft link to the real file, it’s the real file’s permissions that will be used to define who can access the file. The soft link really just points out the location. We can also see that the filename itself is a bit different. It shows the filename that we originally gave the soft link but it also shows the file that the soft link points to. In this case the file happens to be in the same directory but it could just as easily have been anywhere on the system.

That’s really all there is to it for creating links. They can be useful when you want to make one directory or file appear to be in a new location. For example a program might write to a data directory and you want to move that directory on to a bigger disk. No problem, you can move it to the bigger disk and then create a soft link to it with the same name. The application probably won’t even notice. This can really save you a lot of headache, especially when time is something of a premium (and let’s be honest when is it not?).
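If you want to convince yourself that a hard link really is a second name for the same file while a soft link is only a pointer, the following sketch pokes at both. It assumes GNU stat and runs in a scratch directory; the file names match the earlier example.

```shell
demo=$(mktemp -d)
echo "original" > "$demo/pi"
ln "$demo/pi" "$demo/pi1"       # hard link: a second name for the same file
ln -s pi "$demo/pi2"            # soft link: a pointer to the name 'pi'

inode_pi=$(stat -c %i "$demo/pi")     # %i = the file's internal id number
inode_pi1=$(stat -c %i "$demo/pi1")
names=$(stat -c %h "$demo/pi")        # %h = how many names point at the file
target=$(readlink "$demo/pi2")        # where the soft link points

# Remove the original name: the hard link still works, the soft link is left dangling.
rm "$demo/pi"
via_hard=$(cat "$demo/pi1")
via_soft=ok
cat "$demo/pi2" 2>/dev/null || via_soft=broken

echo "same file: $inode_pi = $inode_pi1, names: $names, soft link target: $target"
echo "after removing pi: hard link reads '$via_hard', soft link is $via_soft"
rm -r "$demo"
```

The flip side of the soft link’s flexibility is visible in the last step: it points at a name rather than at the data, so if that name goes away the link is left pointing at nothing.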

Summary

This chapter has given you the inside scoop on all things filesystem. We’ve looked at the history and shown why our filesystems look the way they do. We then touched on how they hang together and how the Linux filesystem itself is structured. We then went on to put that to good use and brought you up to speed on all the basics for creating, copying, moving and deleting your files.

We then looked at file permissions and how they are enforced and how we can go about setting them to match our needs. We also looked at the more traditional way of setting file permissions should you ever come across it. We then showed you how Linux applies these permissions and how you can change which user and group owns a specific file. We then rounded everything off by touching on how you can create links and the differences between both soft and hard varieties.

In the next chapter we’re going to expose you to all of the most common commands that you’ll find on your Pi. These are the commands that will become part of your toolbox that you’ll regularly dip into. In fact we use all of these commands in our daily work. So, onwards to Chapter 5!
