CHAPTER 4

The File-Paths to Success

This chapter takes a look at what a filesystem is and how different operating systems try to make sense of it. We explain the difference between separate root filesystems and unified root filesystems, and why knowing this up front might save you a lot of headaches.

We then move on to looking at the filesystem presented to you on your Pi. We explore the standard layout and explain what goes where and what it actually does. We will also show off some simple commands that highlight some benefits of the UNIX “everything-is-a-file” approach to system design.

Next we will show you how to move around your system, and you’ll learn about fully qualified and relative paths. We’ll show you how to create directories and files (and then copy, move, and delete them), and we’ll explore Linux file permissions and show you how Linux decides who can do what to your files. We’ll also touch on users and groups, and show you how to set ownership and file permissions on your own files.

We’ll then round off the chapter by showing you how to create the Linux equivalent of shortcuts.

What Is a Filing System?

We actually planned to start this section with a dictionary definition of what a filing system is, followed by our interpretation. Unfortunately, most of the definitions we could find basically said “a filing system is a system for filing,” so we’ll skip that part and go straight to our own definition. It does suggest, however, that although a filing system is easy to talk about, it is a lot harder to define. We think a filing system is this:

  • A way of sorting and categorizing data that makes finding something easier

This definition covers everything from having a simple in-tray (at least you know where the document is, even if you can’t put your hand on it right away); to the way you view your contacts on your phone (that’s an alphabetical system right there); all the way to some of the more exotic systems such as the Dewey Decimal System, used to categorize nonfiction content at your local library.

Actually, libraries are an interesting use-case for filing systems because most of them have at least two going on at the same time. Generally libraries tend to arrange fictional work by the surname (and then forename) of the author, whereas the nonfiction section is sorted based on Dewey Decimal numbers. By using two systems at once, they solve a fundamental problem. Storing books by the author’s name is great if you have a favorite author and want to find more books. When it comes to nonfiction, though, you probably want to browse a selection of books on a given topic, without a particular author in mind. If those books were sorted by author name, you’d probably never be able to find anything.

Some libraries take this a step further by sorting fictional books first by genre (say Sci-Fi or Fantasy) and then within that category they sort by author name. This has some pretty neat benefits. You can still go straight to your favorite author because you’ll know the genre, but you can also browse for similar books with the same genre. If you are looking for the next best thing in the fantasy genre, you can simply stand in that section and flick through.

So far, so good. Filing systems make finding things easier, and we can see that it’s important to use the correct filing system for the task; otherwise, you’d probably be better off with no filing system at all.

But what does this have to do with computers? Well, one of the major tasks for any computer is to process information, which means that it has to be able to store and retrieve information easily. Older computer systems did use the alphabetical system and it worked very well, but as soon as there was a buildup of a number of files, it started to get very messy; and finding information became a challenge.

This issue was addressed by having the ability to create a directory (often called a folder these days), in which you could put related files. This helped a lot, but the same old problem started to creep up on people. Yes, you could have a directory for accounts and another for tax returns, but what if you were an accounting firm and you had lots of accounts and lots of tax returns? Soon you were back to where you started because you either had a small number of directories with large amounts of files or a large number of directories with only one or two files in them.

Around the same time, the term filing system (used predominantly for paper-based systems) became shortened to filesystem, which is almost exclusively used to refer to the way a computer stores data on a disk. Because this is the term you will come across most often, we’ll switch to using the filesystem term as well from now on.

The last change to filesystem structure that remains familiar to this day was to allow directories to contain other directories. This would allow you to have a great deal of flexibility to store content in the most appropriate ways. As always, there’s more than one way to arrange a given set of files, but by and large the system has served us well and with modern search technology, finding what we’re looking for is easier than ever.

More than One Filesystem

Here is where we hit a bit of a snag with this approach. A filesystem sits on top of some form of storage. That can be a hard disk, a USB stick, a DVD, or any number of storage media. Each device is effectively independent from any other. This makes sense because you can take a USB stick from your laptop and then use it on your desktop machine. Clearly there is no link between the filesystem on the USB stick and your laptop because otherwise your desktop wouldn’t have all the information it would need to find the files.

Each filesystem has a root directory, which is the main directory on a device and holds all the other files and directories. Like the root of a tree, all the files and directories branch off from this central location. This does pose a bit of a problem, though. If each device has its own filesystem and root directory, how do you easily present this structure to the end user? It turns out that there are two approaches to solving this problem. The first is separate roots; the second is a unified filesystem.

Separate Roots

This approach was adopted by Microsoft all the way back in the days of MS-DOS and it’s still with us today. The approach is very simple. Each device has its own root entry and the user can then use this entry to locate the device of interest and then can simply navigate the filesystem as normal to find the file. Under Windows, each root is assigned its own letter. For historical reasons, the system disk on Windows is referred to as C:, which you’ll often hear people refer to as the C drive. Floppy drives are assigned A: and B: (PCs with hard disks tended to have two floppy drives) and because few people had more than one hard disk, the CD-ROM drive tended to be assigned to D:.

This system is really easy to use, and there is never any confusion about where a particular file is because you can tell simply by looking at the drive letter. However, there are a couple of issues with this design:

  • First, because root devices are assigned to letters, and there are only 26 letters in the alphabet, you are limited to 26 devices. Hardly an issue to home users even today, but back when businesses were using large mainframes, they could have hundreds of such devices.
  • The second issue is that the user has to be aware of the physical location of the file, which can actually add complexity because it ties the location of the file to the physical device that the file is on. If a bigger disk is added to the system, and the files were moved to that new location, their root device would change. Any person (or software) that depends on a file’s location would need to be updated, and that’s never fun.

In short, although the system employed by Windows is simple and effective, it can become an administrative headache on very large and distributed systems. These days there are lots of ways around that, and new techniques have emerged to hide this structure beneath the surface, such as the Microsoft Distributed File System (DFS) for network shares and libraries for grouping content in one place on your desktop such as documents and music. So it really isn’t a huge issue for Windows systems today. However, back when UNIX reigned supreme, these technologies weren’t available, and they went with the other approach: a unified filesystem.

Unified Filesystem

Linux, being based on UNIX, has a unified filesystem. That means that unlike Windows, which has multiple root devices, a Linux system has only one and it is always mounted on /, which is UNIX-speak for the root directory. If there is only one root directory, then how does Linux handle additional devices, all of which we know contain their own filesystems?

The solution is to take the root directory of the new device and then attach it to an existing directory in the tree. This is known as mounting a filesystem, which allows Linux to have an almost unlimited number of devices because it can attach them anywhere in the existing tree. This means that you could mount one device and then mount another device on a directory inside the first device. From the user’s point of view, they can move through all these different directories as if they were all on one big device. The physical structure (that is, all the disks and the raw data on them) is completely hidden from the end users. They have no way to tell they’ve crossed from one device to another.

This solves the issues highlighted with the separate root approach in that the unified filesystem is consistent regardless of what the underlying mechanism is. In fact, you can mount network file shares and even virtual filesystems into the tree. Of course, the benefits of the separate root approach have now become the downside for the unified approach. It’s no longer easy to see where things are going, and this again can make things more confusing. Like Windows, Linux systems have evolved over time so that these problems aren’t quite as pronounced as they would otherwise be. However, these benefits are mostly seen in the GUI and tend to be far less pronounced when using the command line.
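One way to see where a directory really lives is to ask the system which filesystem backs it. The following is a minimal sketch, assuming the standard df tool and a Linux kernel that exposes /proc/mounts; the output will differ from machine to machine.

```shell
# Ask which filesystem (and therefore which device) holds the root directory.
# -P prints one entry per line in a stable, portable layout.
root_line=$(df -P / | tail -n 1)
echo "$root_line"

# List everything currently mounted into the unified tree.
# Each line shows: device, mount point, filesystem type, options.
cat /proc/mounts
```

The last column of the df output is the mount point, which is the directory where that device was attached to the tree.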

The Mac has to be Different

As an interesting comparison, OS X on the Mac makes use of both approaches. Under the covers, OS X is a UNIX-based system and so it does have a unified filesystem. However, unless you go to the command line (and most Mac users don’t), you will never see evidence of this. When you attach a USB stick to the Mac, it will mount it in much the same way as Linux, but it will show it to the user as if it were its own root device, much as Windows does. However, the Mac doesn’t assign a drive letter; it just sets up a unique name.

Bring it All Together

Admittedly that’s a fair bit of theory and you’re probably wondering where all the fun commands are that we promised to show you. You might also be wondering why we just bored you with all that, and when (if ever) such information might be useful. The reason we’ve put emphasis on this theory up front is because when we started out with Linux, a lot of the things that caught us out were preconceptions from using other operating systems. When one of us first installed Linux some 15 years ago, it took several hours of swearing before he figured out what “you need a root / device” meant. This isn’t just a Linux thing, either: dabbling with the BSD family of operating systems and its alternative way of handling disks caused us to wipe the wrong disk because we were thinking Linux and not BSD.

So hopefully this last section will save you from some of the pain that we went through when we first started out. You don’t need to memorize all the theory, but if you come away from this section with an appreciation that despite many similarities, Linux is not Windows or a Mac, then you will be ahead of the game.

Everything as a File

Now normally when we discuss this topic, we start by talking about hard disk partitions because most people are familiar with them, and there’s a nice easy mapping between the way Linux presents them and the physical device. Because the Pi doesn’t technically have a hard disk (it pulls some strings to make the SD card look and feel like one), this easy example isn’t actually available on the Pi. The thing is, it’s too good an example to pass up, so we’re going to stick with the classic. So without further ado, here are the files that describe the disk setup on one of our servers:

/dev/sda
/dev/sda1
/dev/sda2
/dev/sdb
/dev/sdb1

As you can see, these files all live in the /dev directory, which is where Linux keeps all the device files. There’s quite a bit more to say on this, but we’ll come back to it a little later because, trust us, it will make much more sense if we cover it last. So for now, let’s ignore the directory and focus on our collection of files. You’ll notice that in this example, all the files start with the sd prefix because they are all “SCSI disks” and share the same driver. For machines that have IDE-based disks, the prefix is hd for hard disk.

So now we know that we have some SCSI-based disks (SATA disks also show up as SCSI), but what else can we tell from those files? Well, you’ll notice that we have sda and we have sdb. The first SCSI disk on the system is assigned to sda, and the second is assigned to sdb. No surprise that the third disk will end up being assigned to sdc. Because we don’t see an sdc in our example, we can assume that there are only two disks on this system.

Now we’re really getting somewhere. There are two SCSI disks on our system, and we just have one more thing to discuss: the numbers stuck on the end of the names. In this case, the number refers to the partition number on the drive. The first partition is 1, and the second partition is 2. Nice and simple with no real surprises, we’re sure. With this additional information, we not only specify which disk but also which partition on that disk.

This does, of course, beg the question of why you should care. Surely there must be an easier way to find out which disks are attached to the system? Well you’d be right, but we’re not looking at this to discover what’s connected to the system. Instead, we’re looking at how we can specify a particular device that we want to access. This is because everything in Linux is a file, so even physical devices are represented in that way. When we want to access a hard disk or we want to access a particular partition, we do so by accessing the relevant file. Not only physical devices follow this rule; directories are files as well. In day-to-day use, you don’t have to think about this, but it’s a very powerful feature. By treating everything as a file, Linux provides a standard way for all programs to talk to all things. This greatly simplifies development and allows us to do some impressive things, such as redirecting input and output or using the output of one program as the input for another. We cover this in more depth in Chapter 5.
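To make the idea concrete, here is a small sketch using two device files that exist on practically every Linux system, /dev/null and /dev/zero. They are our choice for illustration because, unlike a disk device, they are harmless to experiment with.

```shell
# The first character of each mode string is the file type:
# 'c' means character device, 'b' block device, 'd' directory, '-' regular file.
ls -l /dev/null /dev/zero

# Ordinary redirection writes to a device exactly as it would to a regular file.
echo "this text is discarded" > /dev/null

# Ordinary tools can read from a device too: take four bytes from /dev/zero
# and display them in hexadecimal.
zero_bytes=$(head -c 4 /dev/zero | od -An -tx1)
echo "$zero_bytes"
```

Neither echo nor head was written with these devices in mind; they simply open a file and get on with it.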

Some real-world examples are useful here. When you want to make changes to the partition table on a hard disk, you use the fdisk tool. The fdisk tool will need to know which disk you want to work on. Assuming that you want to partition your second disk, the command would look like this:

$ fdisk /dev/sdb

Because we’re partitioning the drive, we want to refer to the whole device, and /dev/sdb lets us do that. But let’s say we have done our partitioning, and we have one big partition that we want to format for Linux to use. The command for that is mkfs.ext4 (on modern Linux distributions), and as before, you’re going to need to tell the command what you want to format. In this case, we want the first partition on our second disk so the command would look like this:

$ mkfs.ext4 /dev/sdb1

Again, this makes sense because we’re referring to a specific partition on a specific device. No big leaps of faith here, but we can now share a little secret with you. The mkfs.ext4 command doesn’t really care what filename you give it. If you forgot to put the 1 on the end, it would have happily formatted the entire device. This is “everything as a file” at work: because devices are presented as files, and files are all accessed in the same way, our tools simply read and write to files; they neither know nor care what it is they’re writing to!

This concept takes a bit more effort to really understand. Under the covers, Linux manages all these different devices for us, each with its own specific requirements and drivers. To keep things simple for application developers (not to mention the users), Linux hides all this complexity and instead presents each device as a specific file. This brings us nicely back to the /dev directory (we told you it would come up). The files in that directory are special in that they don’t really exist, at least not as physical files that are stored anywhere. Linux creates a virtual filesystem that it then populates with files for the devices that are available on a given computer. Because our Pi doesn’t have any SATA or SCSI disks, you won’t find any /dev/sd files on your Pi.
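If you would like to see which disk-style device files your own machine has, the following sketch is one way to look. It assumes a Linux kernel that provides /proc/partitions; on a Pi, the SD card typically shows up under names beginning with mmcblk rather than sd, but treat the patterns below as examples rather than an exhaustive list.

```shell
# The kernel's own table of block devices and their partitions.
cat /proc/partitions

# Report which of the usual disk-style device files exist on this machine.
# An unmatched pattern is left as literal text, so the -e test filters it out.
found=0
for dev in /dev/sd* /dev/mmcblk*; do
    if [ -e "$dev" ]; then
        echo "found: $dev"
        found=$((found + 1))
    fi
done
echo "$found matching device file(s)"
```

Nothing here modifies a device; it only lists names, so it is safe to run as a normal user.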

Filesystem Layout

Let’s look at how the filesystem is laid out on your Pi. You’ll find that the structure is very similar to what you’d find on any Linux system, so not only will this section help you master your Pi but you’ll also be able to navigate servers as well.

/  (Root Directory)

This is the root directory (not to be confused with the root user’s home directory) that represents the top of the file hierarchy. Everything goes somewhere in or under this directory, no exceptions. As we discussed earlier in the chapter, Linux has a unified hierarchy, and this is where it all starts.

/root

This is the home directory for the root user and spends a large amount of its time being confused with the root directory (/). It can get more confusing because if you are in the root directory and someone asks you to “go into the root directory,” which one do they actually mean? Often you can easily tell from the context, but if not, don’t be afraid to ask for clarification; you’d be surprised how often this one comes up.

/etc

/etc is arguably one of the most important directories on your system. It contains most of the configuration files for not only your system but also the applications that you might have installed (such as the Apache web server). Many users new to Linux take care to back up their applications, but often forget about the configurations stored in /etc. At the end of the day, the application can usually be replaced quite easily, but getting that config right from scratch can be a real headache. This is a directory you want to take great care of.

/proc

/proc is a virtual filesystem that the kernel uses to provide easy access to system information from userland tools (that is, software that runs outside of the kernel). Everything you ever needed to know about the system state or running processes can be found in /proc. Two common examples are the CPU configuration (stored in /proc/cpuinfo) and memory usage (stored in /proc/meminfo). Most of this information is read-only, which makes sense because it is just a virtual representation. However, some files do allow communication to go both ways, and you can potentially use them to tweak kernel and system settings while the machine is running.
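Because these are presented as ordinary files, the usual text tools work on them. Here is a short sketch that reads the two files mentioned above; the exact fields you see depend on your hardware and kernel.

```shell
# Describe the processor(s); the fields vary between CPU types.
head -n 5 /proc/cpuinfo

# Pull a single figure out of the memory report.
mem_total=$(grep '^MemTotal:' /proc/meminfo)
echo "$mem_total"

# Every running process has a directory named after its process ID;
# 'self' always refers to whichever process is doing the looking.
ls /proc/self/
```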

/var

/var is generally where you’ll find files created by your applications and the system itself. For example, most applications will store their logs in /var/log/, and many will store their lock files in /var/run/. Lock files don’t have any meaningful content; an application just uses them to work out whether it is already running so that you can’t accidentally have two copies running at once. On Raspbian, the Apache web server uses /var/www/ for storing a website’s files. In other distributions, this is not the case; such files tend to be located in the /srv/ directory. That said, more mature applications may still be in /var, even on modern distributions.

/boot

Traditionally, the /boot directory actually lived on its own small partition on the first hard disk. At the time, the majority of computers were unable to boot from a single big partition, so it was very common to see these split out. On modern machines, this is no longer a problem (although the Pi still requires a separate boot partition), so this directory is often included in the root filesystem directly. As its name suggests, it holds the key files needed to boot a system, including the bootloader and the Linux kernel.

/bin and /sbin

These locations store user and administrative programs, respectively. When you try to run a program from the command line, there are specific places that the system will look for the application. This list of locations is known as the path. Usually a normal user only has /bin in their path, so they effectively can’t see applications in /sbin. There are some applications that the user can access even though they’re generally used only by administrators, but the user needs to know where they are. This usually doesn’t pose a problem, and you won’t ever need to go looking for anything.
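You can inspect the path for yourself. The sketch below prints it and asks the shell where it would find the ls program; the check for /sbin/ifconfig at the end is only an example of a program that may live outside a normal user’s path, and it may not be installed on every system.

```shell
# The search path is a colon-separated list of directories,
# searched from left to right.
echo "$PATH"

# Ask the shell where it would find a given program.
ls_location=$(command -v ls)
echo "$ls_location"

# A program outside the path can still be run by typing its full location.
if [ -x /sbin/ifconfig ]; then
    echo "/sbin/ifconfig is available by its full path"
fi
```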

/dev

We already touched on this directory in the “Everything as a File” section. It contains a file for each device or subdevice on the current system and provides a way for system tools (and, of course, users) to easily access the hardware on a particular machine. Besides the disk devices discussed earlier, there are also devices for graphics cards, sound cards, virtual terminals, and more.

Note   You won’t find network cards in the /dev directory because they’re considered a special case. To find information on network interfaces, you need to use the sudo ifconfig -a command. This will list all the network devices that Linux knows about. For more information, check out Chapters 3 and 9.

/home

Traditionally, all user home directories would be stored under /home and this was often its own partition, disk, or network share. The idea was to keep user data separate from system data and applications. Most UNIX-based systems still follow this rule, but there are occasions when you will find some home directories living elsewhere (e.g., the root user’s home directory). The Mac, for example, stores home directories in /Users, and some businesses will put different users in different locations based upon their needs. Although /home generally contains only home directories, there’s no requirement that home directories actually reside here.
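On a typical standalone Linux system, each local account’s home directory is recorded in /etc/passwd, so you can check where a given user’s home actually is. This is a minimal sketch under that assumption; systems that fetch their accounts from a network directory may not list every user in that file.

```shell
# Field 1 of /etc/passwd is the user name and field 6 is the home directory.
root_home=$(awk -F: '$1 == "root" { print $6 }' /etc/passwd)
echo "root's home directory: $root_home"

# The current user's home directory is also held in the HOME variable.
echo "my home directory: $HOME"
```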

/lib

This directory contains library files that are needed by various applications. Libraries allow functionality to be packaged up and then shared by other applications. A good example might be a database driver that allows applications to access a database. So that these applications can find those libraries, they need to be installed in a known location and with a known file name. It’s rare that you’d need to poke around in this directory, but if you do, you should be careful because breaking things in here could affect the stability of your system.

/lost+found

We mention this one for completeness, but it’s not really part of the filesystem structure per se. Rather it’s where files get placed when the filesystem loses track of them. For example, if a disk were to be damaged, you would need to run a disk repair. Some files may be recovered, but for various reasons, it may be impossible to determine where they came from. If that happens, Linux will place those lost souls in this directory. We’ve never had cause to look into this directory, and most likely you’ll never need to look in there either.

/mnt

Short for mount, the /mnt directory was traditionally where you would mount additional filesystems. If you wanted to attach a network share or an external hard disk, you would create a directory in /mnt and mount it there. Floppy disks and CD-ROMs also need to be mounted before use and were generally mounted here as /mnt/floppy and /mnt/cdrom, respectively. However, in recent years, this directory has mostly fallen by the wayside because more people have removable media rather than always-attached storage (see the next section on /media).

/media

This directory is relatively new to Linux and was added to make a clear distinction between mounted external devices (such as those in /mnt) and removable media such as USB sticks, cameras, and media players. These are usually handled automatically under Linux, and so usually you won’t manually add or remove anything in this directory.

/usr

This is where most of the software on the machine ends up and so it is often the largest directory on a server (at least if you don’t count user home areas). Although it’s useful to know where the software lives on your machine, because everything is handled automatically for you, everything in here should “just work”.

/opt

This directory is a jack of all trades. On some systems, it is packed full of applications; on others, it remains completely empty. It’s usually used for third-party software and applications. For example the Oracle database server installs in /opt/ by default. You probably won’t find much cause to use this directory, but be careful if you do because it is easy to forget things that are sitting in /opt/ when it comes to taking backups and so forth.

/srv

Another relative newcomer, the /srv/ directory is the designated location for storing data for services that serve files. Although this directory seems to be present, some applications still don’t make use of it either out of custom or simply because everyone is used to the contents being somewhere else. If you’re looking for things that used to be in /var/, this is probably a good place to look next.

/sys

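This directory contains system information and, like /proc/, it is a virtual filesystem: the kernel generates its contents in memory, and nothing in it is stored on disk. Where /proc/ concentrates on processes and general system state, /sys/ describes the devices and drivers that the kernel knows about. In day-to-day use it doesn’t seem to get an awful lot of attention, and we have never even had to look inside this directory. However, it needs to be useful only once for you to be glad it was there.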

/tmp

This directory is the computer’s scratch pad, and all sorts of applications create files in this directory. It is used whenever temporary file storage is used (such as during processing). This directory is emptied by default on a reboot, but it can still potentially get quite large. In theory, all applications are supposed to treat /tmp/ as transitory; that is, they should never expect a file to be there after a restart.
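Scripts usually don’t pick their own names in /tmp/; they ask for a unique one. The following is a small sketch using the mktemp tool, which by default creates its file under /tmp (or wherever the TMPDIR variable points, if it is set).

```shell
# Create a uniquely named, empty scratch file and remember its name.
scratch_file=$(mktemp)
echo "working in $scratch_file"

# Use it like any other file.
echo "intermediate results" > "$scratch_file"
cat "$scratch_file"

# Tidy up afterward rather than relying on the next reboot.
rm "$scratch_file"
```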

Wrapping it Up

That pretty much sums up the key areas on the Linux filesystem. No doubt there are other little nooks and crannies that you might come across as you look around your system, but they will probably be a subset of these. Remember that although this structure is followed by most systems, most doesn’t necessarily mean all, and you might come across some differences or cases where directories are used in other ways.

Putting it to Work

We’ve covered a huge amount of information so far in this chapter. We’ve looked at how filesystems work and how UNIX systems (and Linux in particular) have adopted a unified approach. We looked at “everything as a file” and we showed how these special files and filesystems fit into the grand scheme of things. We wrapped all that up with an overview of what goes where on the filesystem and what each location does. What we haven’t done yet is actually use the filesystem.

So far, we have been pretty much “hands-free,” but that’s all going to change in this section. We’ll start by showing you how to create directories and move about on the filesystem. Once we’ve given you the power to create, we’ll then give you the power to destroy, and you’ll be able to remove files and directories at will (not always a good idea). We’ll wrap this section up with a quick overview on Linux file permissions and how to read and set them. Let us begin!

Where Are We? Using pwd

The first thing we need to show you is how to figure out where you are on the system. The easiest way to get your bearings is to look at the command prompt. We talked about this in the last chapter—as a quick refresher, here is what the command prompt looks like while we’re sitting in our home directory:

pi@raspberrypi: ~$

The tilde (the squiggly line) is shorthand for the current user’s home directory. We highlighted this feature already when we first introduced it, so we won’t go through it again, but what if we want to know which directory we are actually in? What we need is the pwd command. We touched on this useful tool in the last chapter as well, but in case you missed it, here it is again:

pi@raspberrypi: ~$ pwd
/home/pi
pi@raspberrypi: ~$
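To see pwd follow you around, try it from a couple of other places. This sketch uses the cd command to change directory; the two directories chosen are just examples that exist on any Linux system.

```shell
# Move to the root directory and ask where we are.
cd /
root_dir=$(pwd)
echo "$root_dir"

# Move somewhere else and ask again.
cd /etc
etc_dir=$(pwd)
echo "$etc_dir"

# The shell expands a lone tilde to the current user's home directory.
echo ~
```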

This tool is very useful for telling us where we are, but it doesn’t tell us anything about what’s in the same location with us. It’s like being blindfolded and told you’re standing in the kitchen; it’s a great start, but you’re still effectively blind. You’d certainly like to know who and what is in the room with you; for that, you need the ls command. This command does have a large number of options, though, and so we will only cover the most common ones, those that we use every day. Actually you can just remember them as recipes because you’ll often just pass the same options time after time (or at least we do).

What’s in Here with Us? Using ls

Now let’s see what’s in here with us:

pi@raspberrypi: ~$ ls
Desktop    python_games
pi@raspberrypi: ~$

Although we can’t easily show it in the book, Desktop and python_games are both colored deep blue. This tells us they are directories. At present, we don’t have any files in this directory, or do we? Actually we do, but they are considered hidden files. Under Linux, any file whose name begins with a period (full stop) is considered hidden. There’s nothing special about the files; they are config or temporary files that various applications have created. We generally don’t want these files cluttering up our display, so ls and friends don’t show them. We can, however, force ls to show us those files with the -a flag like so (although you will most likely see slightly different results on your Pi):

pi@raspberrypi: ~$ ls -a
.              .cache
..             .config
.bash_history  .dbus
.bash_logout   Desktop
.bashrc        python_games
.dbus          .ssh
pi@raspberrypi: ~$
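If you’d like to convince yourself that a leading period is all it takes, here is a small sketch that builds a throwaway directory (created with mktemp, so nothing in your home directory is touched) containing one visible and one hidden file. The file names are made up for the demonstration.

```shell
# Create a scratch directory with one ordinary file and one hidden file.
demo_dir=$(mktemp -d)
touch "$demo_dir/visible" "$demo_dir/.hidden"

# A plain ls skips names that begin with a period.
plain_listing=$(ls "$demo_dir")
echo "$plain_listing"

# With -a, the hidden file appears, along with . (this directory)
# and .. (the parent directory).
full_listing=$(ls -a "$demo_dir")
echo "$full_listing"

# Clean up the scratch directory.
rm -r "$demo_dir"
```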

Creating Files to Play with: Using touch

For now, we’ll leave the -a flag alone and create our own files to play with. Because we haven’t covered how to create and edit text files yet (we will show you how to do that in Chapter 6), we’ll introduce you to another little tool called touch. Under Linux, files traditionally have three timestamps:

  • The last accessed timestamp
  • The last modified timestamp
  • The last changed timestamp (updated when the file’s metadata, such as its permissions, changes)

These timestamps allow you to see when a file was last read and when it was last updated. This is useful from an administration point of view because you can see which files are actively being used, and various tools (such as backup scripts) use the last modified timestamp to figure out whether a file has changed since they last looked at it. Sometimes it is useful to be able to update that timestamp without changing the contents of the file, and this is where touch comes in. It touches the file, which updates the timestamp, but if the file doesn’t currently exist, touch will create it for you. In other words, it’s a great tool for creating empty files. So, let’s start by creating a couple of originally named files:

pi@raspberrypi: ~$ touch raspberry
pi@raspberrypi: ~$ touch pi
pi@raspberrypi: ~$ ls
Desktop pi python_games raspberry
pi@raspberrypi: ~$

And that’s all there is to it. As you can see from the ls that followed it, we now have two additional files. This time they are gray in color, which tells us that they’re normal files. By coloring the entries for us, ls makes it much easier to see what we’re doing. For example, any file that is executable will be colored green, but we’ll come back to file permissions later in this chapter.
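The other half of touch, updating the timestamp on a file that already exists, is easy to see in a scratch directory. This sketch assumes the GNU versions of the tools, as found on Raspbian, because it uses date -r to read a file’s last modified time; the date we wind the file back to is arbitrary.

```shell
demo_dir=$(mktemp -d)
target="$demo_dir/raspberry"

# The file does not exist yet, so touch creates it, and it is empty.
touch "$target"
if [ -s "$target" ]; then created_empty=no; else created_empty=yes; fi

# Wind the timestamp back to 15 June 2000, 12:00 (format: YYYYMMDDhhmm).
touch -t 200006151200 "$target"
old_year=$(date -r "$target" +%Y)

# A plain touch brings the timestamp up to the present.
touch "$target"
new_year=$(date -r "$target" +%Y)

echo "empty: $created_empty, before: $old_year, after: $new_year"
rm -r "$demo_dir"
```

The contents of the file never change; only the time recorded against it does.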

So far we only have two files, but if we had 10 or 20, it would start to get a bit crowded in here. The way to handle that is, of course, to create directories to store our files (and potentially other directories) in, and that’s what we’ll look at next.

Note   Directories and folders are basically the same thing. They were originally called directories, but Microsoft started referring to them as folders, which it felt was a better description. Linux traditionally used the term directories, but as it has become more of a desktop operating system, with people moving over from Windows, the folder terminology has become increasingly common.

Somewhere to Store Our Files: Using mkdir

To create a directory, we use the mkdir (or make directory) command. Unsurprisingly, it will create a new directory. However, if there is a file with the same name, or if the directory already exists, you will get an error message. Let’s create a directory called pifun to store our files:

pi@raspberrypi: ~$ mkdir pifun
pi@raspberrypi: ~$ mkdir pifun
mkdir: cannot create directory 'pifun': File exists
pi@raspberrypi: ~$

As you can see, trying to create the directory twice will cause an error. Don’t be distracted by File exists; this could actually refer to either a directory or a file. Another quick ls, and we’ll see that things are moving along quite nicely:

pi@raspberrypi: ~$ ls
Desktop pi pifun python_games raspberry
pi@raspberrypi: ~$
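If the "File exists" error gets in your way (in a script, for example), mkdir has a -p flag that is worth knowing about. The sketch below runs in a scratch directory, and the projects/2012/sheets path is purely hypothetical. With -p, mkdir stays quiet if the directory is already there and creates any missing parent directories along the way.

```shell
scratch=$(mktemp -d)
cd "$scratch"

mkdir pifun                      # plain mkdir: works the first time
if ! mkdir pifun 2>/dev/null; then
    echo "second mkdir failed, as expected"
fi

mkdir -p pifun                   # -p: no error if the directory already exists
mkdir -p projects/2012/sheets    # -p also creates missing parent directories
ls -R
```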

Making Use of a New Directory: Using the mv Command

So now that we have our directory, let’s tidy up the mess we’ve been making. Because we want to move the files into our new directory (rather than just create a copy of them), we’ll need to use the mv (move) command. This command is a bit more complicated than the ones we’ve covered before because it takes two arguments (that is, the things you want the command to act on) rather than one. This makes sense, though, because not only do we need to tell mv what we want to move but we also need to tell it where we want it to move the file to. As with most file commands under Linux, the first argument is the source, and the second argument is the destination. Let’s move those files now:

pi@raspberrypi: ~$ mv pi pifun
pi@raspberrypi: ~$ mv raspberry pifun
pi@raspberrypi: ~$ ls
Desktop pifun python_games
pi@raspberrypi: ~$

So far, so good. Now we want to make sure that our files arrived in one piece. There are two ways to do that. We could go into the pifun directory with cd and then run the ls command or, alternatively, we could simply give ls the path to the directory that we want to look into. We’ve already used the first approach, so let’s give the second approach a try:

pi@raspberrypi: ~$ ls pifun
pi raspberry
pi@raspberrypi: ~$
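Two more things about mv are handy to know, and the sketch below shows both in a scratch directory. First, you can list several sources at once, as long as the last argument is a directory. Second, if the destination is a name that doesn’t exist yet, mv renames the file rather than moving it into a directory. The new name pie is made up for the example.

```shell
scratch=$(mktemp -d)
cd "$scratch"

mkdir pifun
touch pi raspberry

mv pi raspberry pifun        # several sources: the last argument must be a directory
mv pifun/pi pifun/pie        # destination is a new name: this renames the file
ls pifun
```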

Time for Some Cloning: How to Use the cp Command

Admittedly this isn’t very exciting, but as a newly minted administrator, you’ll be spending a lot of time moving files about and checking where things are. So far, you’ve learned how to move a file, but what if you just want to copy a file instead? When making a backup or getting a selection of files ready to send to a friend, you want to actually keep the originals. For this, we use the cp command (short for copy). Let’s move into our new directory and copy one of our files:

pi@raspberrypi: ~$ cd pifun
[pi@raspberrypi pifun]$ cp pi pi2
[pi@raspberrypi pifun]$ ls
pi pi2 raspberry
[pi@raspberrypi pifun]$

That worked well: we now have pi and pi2, just as expected. Let’s try the same thing again, only this time we’ll copy a directory:

[pi@raspberrypi pifun]$ mkdir moarpi
[pi@raspberrypi pifun]$ cp moarpi moarpi2
cp: omitting directory 'moarpi'
[pi@raspberrypi pifun]$

This time it didn’t quite go according to plan. Our copy attempt failed because, by default, cp will only copy individual files; it won’t copy entire directories. Even though directories are technically treated as files, the way we use them via various commands mimics a real-world folder. The reasoning behind this is that when you copy a directory, you copy everything within it, including all its files and directories. This could be a lot of data and could take up a lot of space as well as a lot of time to complete. Forcing us to be explicit about our intentions (which soon becomes second nature) means that when we mean to copy a single file but accidentally pick a directory, we will get stopped before any copying takes place.

That’s all well and good, but what if you really did want to copy that directory? When we first looked at the cp command, we mentioned tasks such as taking backups, and (let’s be honest) you’re much more likely to want to back up a directory than a list of specific files. We can get the behavior we’re looking for by telling cp that we want to copy recursively. This will then copy the directory and anything within that directory to the destination. We specify this by using the -r flag like so:

[pi@raspberrypi pifun]$ cp -r moarpi moarpi2
[pi@raspberrypi pifun]$ ls
moarpi moarpi2 pi pi2 raspberry
[pi@raspberrypi pifun]$

Unlike the copy command, when you move a directory, there is no need to specify that you want to do so recursively because moving a directory without its contents wouldn’t make an awful lot of sense.
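Our moarpi directory was empty, so it is hard to see that the copy really was recursive. The sketch below repeats the experiment in a scratch directory with something inside; the names nested, a, b, copy1, copy2, and renamed are invented for the demonstration.

```shell
scratch=$(mktemp -d)
cd "$scratch"

mkdir -p moarpi/nested
touch moarpi/a moarpi/nested/b

if ! cp moarpi copy1 2>/dev/null; then
    echo "cp without -r refused to copy the directory"
fi

cp -r moarpi copy2            # copies the directory and everything inside it
mv moarpi renamed             # mv needs no flag to move or rename a directory
ls -R copy2 renamed
```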

The Power to Destroy: Using the rm Command

So far, we’ve shown you how to create files and directories, and how to copy and move them about. Now we’re going to show you how to destroy those files with the remove or rm command. It goes without saying that the rm command is one of the most dangerous in your arsenal. It can quite easily destroy an entire server if you’re not careful, and we know people who have accidentally done just that.

For a change of pace, let’s look at how we can delete an empty directory. Before we look at how to do it with rm, we can use the rmdir command for this (which is short for remove directory). Now the catch with this command is that it will delete only directories that are completely empty. If there is even a single file inside, this command will fail. This makes it very safe to use, but not all that practical because generally when you delete a directory, you also want to remove all its contents. Let’s kill two birds with one stone:

[pi@raspberrypi pifun]$ rmdir moarpi2
[pi@raspberrypi pifun]$ rm moarpi
rm: cannot remove 'moarpi': Is a directory
[pi@raspberrypi pifun]$

We could delete moarpi2 with rmdir because the directory was empty, but when we tried using the rm command, it refused to cooperate. This is because the rm command was written with similar reasoning to the cp command. Because removing a directory is far more dangerous than simply copying it, this is probably a good thing. We can use the same -r flag to tell rm to delete recursively:

[pi@raspberrypi pifun]$ rm -r moarpi
[pi@raspberrypi pifun]$

Success! Now sometimes when you try this, especially on a large directory with lots of files and subdirectories, you can end up with lots of issues that cause rm to give up. For example, some files might be write-protected. You can suppress these errors by using the -f flag, which means force and is akin to saying, “Damn the torpedoes! Full speed ahead!” That sounds like a great idea until you stop to think what would happen if you ran this command (which you should never do):

[pi@raspberrypi pifun]$ rm -rf /

If you accidentally run that command, rm will set out to delete absolutely everything on your system. (Modern versions of GNU rm refuse to operate on / itself unless you add --no-preserve-root, but a near miss such as rm -rf /* gets no such protection, so don’t rely on the safety net.) If you happen to have a USB hard disk attached or you’ve mounted some network shares, you’re in real trouble because rm won’t confine itself to your internal disks; it will crawl through the entire tree deleting everything in its wake. This is one of the reasons why you should use a normal user account for day-to-day tasks. Your own user will not have sufficient privileges to delete anything critical to the system, but even then, chances are high that you can still damage all your attached media. You need to be very, very careful whenever you use the rm command and you’d better double- and triple-check it because Linux will assume that you know what you’re doing and won’t ask for confirmation!

The rm command can also remove files simply by providing the path to the file. You don’t need to use the -r flag for this operation, so you can simply do this:

[pi@raspberrypi pifun]$ rm pi2
[pi@raspberrypi pifun]$ ls
pi raspberry
[pi@raspberrypi pifun]$
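If the lack of a confirmation prompt makes you nervous, rm has an -i flag that asks before each removal. The sketch below is one cautious routine, run in a scratch directory with made-up names (full, empty, keepme). Because a script cannot type an answer, we feed the letter n to rm on standard input, which is the same as answering "no" at the prompt.

```shell
scratch=$(mktemp -d)
cd "$scratch"

mkdir -p full/inner empty
touch full/inner/file keepme

rmdir empty                          # fine: the directory is empty
if ! rmdir full 2>/dev/null; then
    echo "rmdir refused: full is not empty"
fi

# -i asks before each removal; here we answer "n" on standard input.
echo n | rm -i keepme 2>/dev/null

ls -R full                           # look before you delete
rm -r full                           # then remove the tree
```

Listing a directory with ls before handing the very same path to rm -r is a cheap habit that catches most typos.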

And that in a nutshell is how you move about and manipulate the filesystem.

Fully Qualified and Relative Paths

In Linux, there are two ways to specify a path to a file or directory:

  • You can give a fully qualified path that starts with a forward slash
  • You can give a relative path that starts with a filename, directory name, a dot, or two dots

Strange though these might sound, they are both just ways of providing a specific location to your programs.

A path is considered fully qualified when it starts from a fixed reference point (i.e., the root directory). Regardless of where you are on a system, a fully qualified path will always point to the same location. It’s like the old bell tower in the middle of town; if you give anyone directions using that as your reference point, you have a common anchor that both you and your friend know how to reach.

On the other hand, a relative path depends on your current location to make sense. You can specify paths using ./ to mean the current directory or ../ to mean the parent directory (the next one up from the current location). If you had a path that looks like ../../test.txt, this would only work from a few specific locations (where the file is two levels above your current directory). However, it’s nice and short as well as being easy to type. Of course, the same file (if it were created in your home directory) would be accessible with /home/pi/test.txt, which unlike the relative path, can be used from anywhere on the filesystem pretty easily. Relative paths can be tricky when you’re several levels deep and you’re not sure where you are, but you do know the full path of the file you want. If you are in /home/pi/Documents/Work/Projects_2012/Spreadsheets/ and you want to access test.txt, you could either use the full path or alternatively use ../../../../test.txt. As you can see, in this case, the relative path is more confusing than the fully qualified path.

So when should you use one or the other? You should use whichever option is the most convenient or makes the most sense for the task. Sometimes it’s faster or easier to use a fully qualified path. Other times, you are buried deep down in the tree, and writing fully qualified paths would be tedious at best and totally confusing at worst.
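To convince yourself that a relative path and a fully qualified path can name the same file, you can try something like the sketch below. It builds a pretend home/pi tree inside a scratch directory (so it is not your real home directory) and assumes the realpath command from GNU coreutils is installed, which it normally is on a Pi.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/home/pi/Documents/Work"
touch "$scratch/home/pi/test.txt"

cd "$scratch/home/pi/Documents/Work"

absolute="$scratch/home/pi/test.txt"   # works from any directory
relative="../../test.txt"              # only works from two levels below pi

# realpath prints the fully qualified path that a relative path resolves to.
resolved=$(realpath "$relative")
echo "$relative resolves to $resolved"

# -ef is true when both paths name the same file.
if [ "$relative" -ef "$absolute" ]; then
    echo "both paths point at the same file"
fi
```

Change directory somewhere else and the relative path stops working, while the fully qualified one carries on regardless.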

Users and Groups

We’re only going to touch on the basics of users and groups here so that you know enough to understand the file permissions section that is coming up next. Users and groups are key to the way Linux secures your files, and you’ll need to know about them before we move on to the next section.

In the UNIX way of thinking, all people have their own usernames. A username identifies a particular person or entity (for example, a web server might have its own username) on a particular system. So far, we’ve spent most of our time as the pi user, but we’ve also seen that we can become the root user. Your username is the key that Linux uses to identify you as you.

Groups are similarly straightforward. Each user belongs to one primary group, but may actually be a member of any number of groups on the system. On a university system, a student’s username might have its own private group (standard practice on Red Hat and Debian systems these days) but they might also belong to a student group and a research group. They might also belong to a group specific to their department. Groups are useful to administrators because we can group a selection of users together and treat them as a single entity. This makes things such as file permissions much easier to manage.

When you create a new user on your Pi, you will automatically create a group with the same name. On some systems, users would by default join a shared users group, but as you’ll see in the next section, this could lead to accidentally giving people access to files that they shouldn’t have. Because a private group is by definition private, no one else will be a member, and no one can gain access to your files just because they happen to be in the same group as you. This is why on modern systems you’ll usually see that the owner and group of a file happen to be the same.
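You can ask the system who you are and which groups you belong to with the id command. This is only a quick sketch; the variable names are ours, and the output will of course depend on your own system.

```shell
me=$(id -un)           # your username
primary=$(id -gn)      # your primary group (on a Pi, usually the same as your username)
all_groups=$(id -Gn)   # every group you belong to

echo "user:          $me"
echo "primary group: $primary"
echo "all groups:    $all_groups"
```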

File Permissions

File permissions allow you to determine who you want to be able to access your files and what exactly they’re allowed to do with them. There are three different permissions that you can set. The first is whether or not someone can read your file. The second is whether they can write to your file. The third is whether they can execute it (i.e., run it like an application).

Of course, just being able to set these permissions on a file isn’t particularly flexible. You might want to give access to only a certain group of people and restrict everyone else. This is where users and groups come into play. On Linux, there are three roles that a given user might fall into:

  • User: Refers to the owner of the file
  • Group: Refers to the group that owns the file
  • World: Refers to everyone else (also often referred to as other)

Each role gets its own combination of permissions; that is, you can define whether any of those three roles can read, write, or execute your files. We’re going to show you how to do just that, but first we need to show you how to see what permissions are actually in effect. Now is a good time to introduce the -lh option for the ls command. Let’s try running it now:

[pi@raspberrypi pifun]$ ls -lh
-rw-r--r-- 1 pi pi 0 Oct 7 16:29 pi
-rw-r--r-- 1 pi pi 0 Oct 7 16:29 raspberry
[pi@raspberrypi pifun]$

The -lh argument specifies that we want ls to use its long listing format (-l), which shows the permissions, owner, group, size, and modification time of each file, and that we want file sizes to be in human-readable format (-h). Without the human-readable flag, ls will show us all sizes in bytes, which is not very easy to read when you’re dealing with large files. It doesn’t really matter in this example because our files are empty anyway.

There are two things that we’re really interested in as far as file permissions go. The first describes the permissions currently in force, and the second shows us which user and group owns the file. Let’s break it down for the raspberry file:

-rw-r--r-- 1 pi pi 0 Oct 7 16:29 raspberry

The file permissions part is:

-rw-r--r--

There are ten possible slots in that list. If a particular permission is missing, ls will show a hyphen. However, a normal file always has a hyphen in the first slot. If it were referencing a directory, the first slot would be a d to highlight that it’s not a file. This slot can also be an l if the file is a link (or shortcut) and we’ll show you how to use these in the next section. For now though we can ignore the first slot and focus on the final nine slots.

The remaining nine slots are grouped in threes to give us three groups. These correspond to the user, group, and world roles, respectively. Each of the three slots in each group represents a specific permission: read, write, and execute. Where the permission is set, you will see a letter, but when the permission is not set, you will get the hyphen. If our raspberry file had all permissions set, it would look like this:

-rwxrwxrwx

Let’s split that out a bit so it’s a little easier to read:

-   rwx   rwx  rwx

So if we look at the first three permissions, we can see that the owner has read, write, and execute permissions. We can also see that group and world also have full permissions. To interpret what these permissions mean, though, we really need to know who actually owns the file. Let’s take a look at the part of the line that shows who owns the file:

pi pi

Well, that wasn’t too painful. Remember that on modern Linux machines, users have their own private groups that are named after the user. That’s what we’re seeing here. The first pi refers to the owner, which is, of course, the pi user. By default, when a file is created, the group ownership is set to the user’s default group. In this case, that would be our private group which is also called pi (the second pi). So look at our original file entry:

-rw-r--r-- 1 pi pi 0 Oct 7 16:29 raspberry

We can read this as “The pi user has read and write privileges. The group has read privileges and world also has read privileges.” Linux applies these permissions in a specific order based on who you are.

  1. If your username matches the owner of the file, the user permissions will apply when you try to access it.
  2. If you’re not the owner, but you are in the same group as the file, Linux will apply the group permissions to you.
  3. If you’re neither the owner nor in the same group, Linux will apply the permissions from the world role.

In our example, though, the permissions for world and group are identical, so if you’re not the owner, you will get the same level of access either way. However, only the owner can actually save changes to the file. There is one exception to these rules: the root user. The root user is effectively immune to file permissions and can change permissions and file ownership for any file on the system, regardless of who the owner actually is.
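If you find the ten slots hard to read at a glance, a few lines of shell can split them up for you. The describe_perms function below is a hypothetical helper of our own (it is not a standard command), and it only distinguishes directories and links from everything else, which it simply calls a file.

```shell
# describe_perms MODE: explain a ten-character mode string such as -rw-r--r--
describe_perms() {
    mode=$1
    case $mode in
        d*) kind="directory" ;;
        l*) kind="link" ;;
        *)  kind="file" ;;
    esac
    # cut -c picks characters by position: 2-4 user, 5-7 group, 8-10 world.
    user=$(printf '%s' "$mode" | cut -c2-4)
    group=$(printf '%s' "$mode" | cut -c5-7)
    world=$(printf '%s' "$mode" | cut -c8-10)
    echo "$kind user=$user group=$group world=$world"
}

describe_perms "-rw-r--r--"
describe_perms "drwxr-x---"
```

For our raspberry file, this prints file user=rw- group=r-- world=r--, which matches the way we read the entry by hand.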

The execute permission allows you to execute a file as a program. This is a security feature so you can effectively stop people executing commands that you don’t want them to. However, you have to be careful because if users can read your file, there is nothing stopping them from copying it to their own file and making that file executable.

Permissions have a slightly different meaning when it comes to directories. The read permission allows the user (or group or world) to list the directory’s contents; they can do an ls on it and see the names of what’s hiding in there. Obviously, you can’t execute a directory, so instead the execute bit means that the user is allowed to enter the directory and reach the files inside it. The write permission allows them to create, rename, and delete entries. If you give users permission to execute a directory but not to read it, they will be able to open a file inside, but they won’t be able to browse for it; they would have to know the filename in advance. If you give them read but not execute, they can see the filenames but can’t actually get at the files. The same rules apply when opening files in GUI applications, where a directory you can’t browse is annoying because you have to type in a specific path, and most applications expect you to browse to the file you want.

That’s really all there is to it. There is a feature called access control lists (ACLs), which are stored in “extended file attributes,” but we’re not going to cover them in this book. They provide a great deal more flexibility than the standard model, but are correspondingly more complicated. If you’re used to the way Windows handles permissions, you’ll find that ACLs are a bit more in line with what you’re used to.

Setting File Permissions

First we’re going to look at how we can set file permissions (see Table 4-1).

Table 4-1. Available arguments for the chmod command

Part of Permission Setting   Possible Arguments
Role                         u - user
                             g - group
                             o - other/world
                             a - all
How to apply                 + - add
                             - - remove
                             = - explicitly set
What to apply                r - read
                             w - write
                             x - execute

We will be using the chmod command, which changes file permissions. You can specify permissions as a combination of the preceding values. These can be combined in three different ways.

  • Add permissions
  • Take permissions away
  • Explicitly set permissions

The difference is that the first two will leave all the other permissions intact after they’ve done their thing. If you explicitly set permissions, any unspecified permissions will be revoked.
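The difference between adding, removing, and explicitly setting is easiest to see side by side. The sketch below works on a scratch file and assumes GNU stat, whose %A format prints the same ten-slot string that ls -l shows.

```shell
scratch=$(mktemp -d)
file="$scratch/pi"
touch "$file"

chmod u=rw,g=r,o=r "$file"            # start from a known state: -rw-r--r--
start=$(stat -c %A "$file")

chmod g+w "$file"                     # add: only the group write bit changes
added=$(stat -c %A "$file")

chmod o-r "$file"                     # remove: only the world read bit changes
removed=$(stat -c %A "$file")

chmod g=r "$file"                     # set: group becomes exactly r--, losing w
set_exact=$(stat -c %A "$file")

printf '%s\n' "$start" "$added" "$removed" "$set_exact"
```

Notice that the final command never mentions the write permission, yet the group loses it anyway. That is what explicitly setting means.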

Let’s start out by removing all permissions from everyone for our pi file:

[pi@raspberrypi pifun]$ chmod a=  pi
[pi@raspberrypi pifun]$ ls -lh
---------- 1 pi pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

All file permissions have been removed from the file, but how does the command actually work? Permissions are specified with three parts (corresponding to the three rows in Table 4-1):

  • Who you want the change to apply to
  • How you want the change applied
  • What you want the change to be

In this case we applied the changes to a, which is basically shorthand for ugo (it applies the changes to everyone). We used the equals sign, which means we want to explicitly set the permissions and then we didn’t actually supply any permissions. If a permission is absent, it is assumed not to be set, so in our example by not supplying any permissions, we effectively revoked them all, regardless of what they were previously.

Seeing as it is our file, we want to give ourselves full permissions. Admittedly, the execute bit is not much use in this case (you’ll find it invaluable when you start scripting—see Chapter 7), but we’re going to give it to ourselves anyway. We can do that with this command:

[pi@raspberrypi pifun]$ chmod u+rwx pi
[pi@raspberrypi pifun]$ ls -lh
-rwx------ 1 pi pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

Let’s pick this command apart. We specified that we wanted to change only the user’s permissions, that we wanted to add them (not that it mattered in this case because we’d removed all permissions beforehand so an equals sign would have done the same job) and that we wanted read, write, and execute privileges. To wrap up this example, let’s restore read access to the group and world roles:

[pi@raspberrypi pifun]$ chmod go+r pi
[pi@raspberrypi pifun]$ ls -lh
-rwxr--r-- 1 pi pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

Just for completeness, let’s step through this last example. We want to apply the permissions to the group and other roles, we want to add the permissions to what is already there, and we want to grant read privileges. And that is pretty much it for setting file permissions. There is, however, an alternative style that uses numbers rather than letters to specify what permissions you want to set. To set the same permissions that the file already has (so the command doesn’t actually change anything), we would use this:

[pi@raspberrypi pifun]$ chmod 744 pi
[pi@raspberrypi pifun]$ ls -lh
-rwxr--r-- 1 pi pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

In this system, each permission has its own value:

  • Read is 4
  • Write is 2
  • Execute is 1

To set the permissions, you add up the numbers to get the total for each set of permissions (user, group, world). For example, to set all permissions you’d add 4 and 2 and 1 to get 7. For read only, you would simply do 4 plus 0 plus 0, which of course gives you 4. Put them together for user, group, and world, and we get 744. This numeric syntax is the original one used on UNIX systems. Using the letters came later, but at the end of the day they both achieve the same results. The main benefit of the letter syntax is that it’s a lot clearer and easier to pick up. Personally, we tend to use the number style, but that’s only because we’ve been doing it for so long and it’s become second nature to us. You should feel free to use whatever system you feel the most comfortable with.
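If you’d rather let the computer do the sums, the arithmetic is easy to sketch in shell. The perm_digit function is a hypothetical helper of ours, not a standard command, and the check at the end assumes GNU stat, where %a prints the numeric mode and %A the letter form.

```shell
# perm_digit RWX: turn a three-character permission group into its digit.
perm_digit() {
    total=0
    case $1 in r??) total=$((total + 4)) ;; esac   # read    = 4
    case $1 in ?w?) total=$((total + 2)) ;; esac   # write   = 2
    case $1 in ??x) total=$((total + 1)) ;; esac   # execute = 1
    echo "$total"
}

mode="$(perm_digit rwx)$(perm_digit r--)$(perm_digit r--)"
echo "rwx r-- r-- is $mode"

# Check against chmod itself on a scratch file.
file=$(mktemp)
chmod "$mode" "$file"
stat -c '%a %A' "$file"
```

The three digits are worked out independently, one per role, which is why 744 reads as "7 for the user, 4 for the group, 4 for world."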

So now you can manipulate permissions like a guru, but we are still missing the second part of the puzzle; we haven’t shown you how to change ownership of the file. This is actually a lot less common than you might think; far less common than tweaking the occasional file permission, that’s for sure. There’s also another little wrinkle. A normal user (anyone other than root) cannot actually change which user owns the file because if you accidentally assigned the file to another user, you would have no way to get that file back again. Of course, being all-knowing and all-seeing, the root user can change the ownership for any file on the system.

We can simulate “rootness” by using sudo. As discussed earlier, this little command acts as a filter of sorts. It always runs as root, regardless of who executes it, and executes commands as root on their behalf. To prevent any shenanigans, sudo will check the user and the command they are trying to run against an approved list. If you’re on that list (and the pi user is), you can execute all sorts of magic without ever technically becoming root yourself.

To use sudo, all we have to do is prefix the command we want to run with the sudo command. That’s pretty much it. When you first run sudo, it may ask you for a password. This is the password for your particular user, not the password for the root user. The aim is that you can prove that you are the pi user, and then sudo will check to see what the pi user is allowed to do. This means that if you have lots of users on your computer, and you want to let them run some more powerful commands but don’t want to give them root access, you can set up sudo to allow them to execute a specific command without having to hand over the keys to the mansion.

Let’s start off by trying to give the file to the root user using the chown (or change ownership) command:

[pi@raspberrypi pifun]$ chown root pi
chown: changing ownership of 'pi': Operation not permitted
[pi@raspberrypi pifun]$

The “Operation not permitted” message is Linux’s way of telling us to get stuffed. As mentioned earlier, only the root user is allowed to change the owner of a file. To pull this off, we’ll need root privileges, so let’s put sudo to work for us and run the command again:

[pi@raspberrypi pifun]$ sudo chown root pi
[pi@raspberrypi pifun]$ ls -lh
-rwxr--r-- 1 root pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

Success! We were able to change the owner to the root user. This will work with any valid user and any file or directory that you wish to change. There is another command called chgrp which you won’t be surprised to know allows you to change which group owns a particular file. Now there is a bit of an issue with this command as well. Although normal users are allowed to change the group, they are only allowed to change it to a group of which they’re a member. If your user is only a member of its private group, then you won’t be able to do an awful lot with this command either.

Once again, it’s root and sudo to the rescue. Because root can do whatever it pleases, it can change the group accordingly. As it happens, it looks an awful lot like our last command:

[pi@raspberrypi pifun]$ sudo chgrp root pi
[pi@raspberrypi pifun]$ ls -lh
-rwxr--r-- 1 root root 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

And there we go—the file now belongs to the root group and the root user. When you do have to change file ownership, it’s much more common to need to change both the user and the group that owns the file. It’s relatively rare to change just the group (we can’t remember when we last used the chgrp command). The chown command provides a shortcut that allows us to set both a new owner and a new group at the same time. Let’s use this shortcut now to return the ownership of the file to our pi user. We’ll still need to use sudo, of course:

[pi@raspberrypi pifun]$ sudo chown pi:pi pi
[pi@raspberrypi pifun]$ ls -lh
-rwxr--r-- 1 pi pi 0 Oct  8 03:52 pi
[pi@raspberrypi pifun]$

With the shortcut, you just specify the user and the group separated by a colon. One last thing we need to cover with these commands is that they only operate on the file you provide. If you provide a directory rather than a file, the change is applied to the directory itself, but it won’t filter down to the files inside. Sometimes that’s what you want, but more often you want the changes to propagate. Unlike the cp and rm commands, which we used with -r, chown and chgrp (and chmod, for that matter) use -R (the capital letter rather than the lowercase letter). Be careful when you use this because often file permissions are precisely set, and if you waltz through obliterating them with your new version, there’s no way to undo the damage. As always, double-check what you’ve typed before you press the enter key.
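Here is a small sketch of the difference that -R makes. We use chmod and chgrp because they can be tried without sudo; the tree, sub, and file names are invented, and the example assumes GNU stat. The chgrp line hands the files to your own primary group (looked up with id -g), which a normal user is always allowed to do.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/tree/sub"
touch "$scratch/tree/sub/file"
chmod 755 "$scratch/tree" "$scratch/tree/sub"
chmod 644 "$scratch/tree/sub/file"

chmod go-rwx "$scratch/tree"              # without -R: only the top directory changes
top=$(stat -c %A "$scratch/tree")
inner=$(stat -c %A "$scratch/tree/sub/file")

chmod -R go-rwx "$scratch/tree"           # with -R: every file and directory below it too
inner_after=$(stat -c %A "$scratch/tree/sub/file")

# chgrp -R works the same way; your own primary group is always allowed.
chgrp -R "$(id -g)" "$scratch/tree"

echo "$top $inner $inner_after"
```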

Shortcuts and Links

Linux allows you to create links (or shortcuts) by using the ln command (short for link). There are two types of links:

  • A soft link is like what you might see on a Windows system after using the Create Shortcut feature. It creates a file that is just a pointer to the real location of the file elsewhere on disk. You create a soft link by specifying -s when using the ln command.
  • The hard link is more interesting. When you use a hard link you have effectively created two names for the same file. That might sound like semantics, and with most modern applications being able to follow a soft link, there’s rarely a need to use a hard link. Hard links are also restricted to a single filesystem that has to support them (most Linux filesystems do). The main benefit of a hard link is that the hard link is completely indistinguishable from the original file; they are simply two names pointing to the same location. You don’t need to specify anything to create a hard link as ln will do so by default.

Note   To avoid confusion and to allow links to work across filesystems, you should use a soft link.

Let’s do a quick example to show this in action:

[pi@raspberrypi pifun]$ ln pi pi1
[pi@raspberrypi pifun]$ ln -s pi pi2
[pi@raspberrypi pifun]$ ls -lh
-rw-rw-r-- 2 pi pi    0 Oct  8 08:14 pi
-rw-rw-r-- 2 pi pi    0 Oct  8 08:14 pi1
lrwxrwxrwx 1 pi pi    2 Oct  8 08:33 pi2 -> pi
[pi@raspberrypi pifun]$

Let’s have a look at what we’ve got here. pi and pi1 are identical in every way, but that’s not really a surprise because apart from the name, they are the same file. Notice that the number after the file permissions block now shows 2 for pi and pi1. This tells us that there are currently two filenames pointing at this particular file. That’s also not much of a surprise because we’re the ones who created the second entry.

Much more interesting is pi2, which we created with a soft link. First we can see that the file permissions have all been set. This isn’t a problem because when Linux follows the soft link to the real file, it’s the real file’s permissions that will be used to define who can access the file. The soft link really just points out the location. We can also see that the filename itself is a bit different. It shows the filename that we originally gave the soft link, but it also shows the file that the soft link points to. In this case, the file happens to be in the same directory, but it could just as easily have been anywhere on the system.
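One way to see the difference between the two kinds of link is to delete the original name and watch what happens to each. The sketch below does this in a scratch directory and assumes GNU stat, where %i prints the inode number (the file’s internal identity) and %h the number of names pointing at it.

```shell
scratch=$(mktemp -d)
cd "$scratch"

echo "hello" > pi
ln pi pi1            # hard link: a second name for the same file
ln -s pi pi2         # soft link: a small file that points at the name "pi"

stat -c '%n inode=%i links=%h' pi pi1 pi2
readlink pi2         # prints the target stored in the soft link

rm pi                # remove the original name
cat pi1              # the data is still reachable through the hard link
if ! cat pi2 2>/dev/null; then
    echo "pi2 is now a dangling link"
fi
```

pi and pi1 report the same inode number because they really are the same file, while pi2 has an inode of its own and stops working as soon as the name it points to disappears.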

That’s really all there is to it for creating links. They can be useful when you want to make one directory or file appear to be in a new location. For example, a program might write to a data directory, and you want to move that directory on to a bigger disk. No problem; you can move it to the bigger disk and then create a soft link to it with the same name. The application probably won’t even notice. This can really save you a lot of headache, especially when time is something of a premium. (And let’s be honest: when is it not?)
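The data directory trick can be sketched in a few lines. Everything here lives in a scratch directory, and the app, data, and bigdisk names are stand-ins for a real application directory and a real bigger disk.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/app/data" "$scratch/bigdisk"
echo "records" > "$scratch/app/data/log.txt"

mv "$scratch/app/data" "$scratch/bigdisk/data"      # move the directory to the bigger disk
ln -s "$scratch/bigdisk/data" "$scratch/app/data"   # leave a soft link at the old location

cat "$scratch/app/data/log.txt"                     # the old path still works
```

Note that we gave ln a fully qualified path as the target. A relative target is interpreted relative to the directory that contains the link, which is an easy thing to get wrong.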

Summary

This chapter has given you the inside scoop on all things filesystem. We’ve looked at the history and shown why our filesystems look the way they do. We then touched on how they hang together and how the Linux filesystem is structured. We put that to good use and brought you up to speed on all the basics for creating, copying, moving, and deleting your files.

We discussed file permissions and how they are enforced, and how we can set them to match our needs. We also looked at the more traditional way of setting file permissions (in case it is ever needed). We then showed you how Linux applies these permissions and how you can change which user and group owns a specific file. We rounded everything off by discussing how you can create links and the differences between both soft and hard varieties.

In the next chapter, we’re going to expose you to all the most common commands that you’ll find on your Pi. These are the commands that will become part of your toolbox that you’ll regularly dip into. In fact, we use all these commands in our daily work. So, onward to Chapter 5!
