8
Linux System Calls
So far, we’ve presented a variety of functions that your program can invoke
to perform system-related tasks, such as parsing command-line options, manipulating
processes, and mapping memory. If you look under the hood, you’ll find that
these functions fall into two categories, based on how they are implemented.
• A library function is an ordinary function that resides in a library external to your
program. Most of the library functions we’ve presented so far are in the standard
C library, libc. For example, getopt_long and mkstemp are functions provided in
the C library.
A call to a library function is just like any other function call. The arguments are
placed in processor registers or onto the stack, and execution is transferred to
the start of the function’s code, which typically resides in a loaded shared library.
• A system call is implemented in the Linux kernel. When a program makes a
system call, the arguments are packaged up and handed to the kernel, which
takes over execution of the program until the call completes. A system call isn’t
an ordinary function call, and a special procedure is required to transfer control
to the kernel. However, the GNU C library (the implementation of the standard
C library provided with GNU/Linux systems) wraps Linux system calls with
functions so that you can call them easily. Low-level I/O functions such as open
and read are examples of system calls on Linux.
The set of Linux system calls forms the most basic interface between programs
and the Linux kernel. Each call presents a basic operation or capability.
Some system calls are very powerful and can exert great influence on the
system. For instance, some system calls enable you to shut down the Linux
system or to allocate system resources and prevent other users from accessing
them. Only processes running with superuser privilege (programs run by the
root account) can invoke these calls; they fail if invoked by a nonsuperuser
process.
Note that a library function may invoke one or more other library functions or system
calls as part of its implementation.
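To make the distinction concrete, here is a minimal sketch that reaches the same kernel service in two ways. It uses getpid and glibc's generic syscall function, neither of which has been introduced in this chapter, so treat both as illustrative assumptions; the helper names pid_via_wrapper and pid_via_raw_syscall are hypothetical.

```c
#define _GNU_SOURCE        /* make sure glibc declares syscall() */
#include <sys/syscall.h>   /* SYS_getpid */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* getpid, syscall */

/* Ask the kernel for the process ID through the C library's wrapper
   function, which looks like any other function call.  */
pid_t pid_via_wrapper (void)
{
  return getpid ();
}

/* Make the same request through glibc's generic syscall() entry point,
   naming the system call by its number.  */
pid_t pid_via_raw_syscall (void)
{
  return (pid_t) syscall (SYS_getpid);
}
```

Both functions should return the same value when called from the same process. In ordinary programs the wrapper is the better choice, because it hides the system call number and the calling convention.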
Linux currently provides about 200 different system calls. A listing of system calls
for your version of the Linux kernel is in /usr/include/asm/unistd.h. Some of these
are for internal use by the system, and others are used only in implementing specialized
library functions. In this chapter, we’ll present a selection of system calls that are
likely to be the most useful to application and system programmers.
Most of these system calls are declared in <unistd.h>.
8.1 Using strace
Before we start discussing system calls, it will be useful to present a command with
which you can learn about and debug system calls. The strace command traces the
execution of another program, listing any system calls the program makes and any
signals it receives.
To watch the system calls and signals in a program, simply invoke strace, followed
by the program and its command-line arguments. For example, to watch the system
calls that are invoked by the hostname[1] command, use this command:
% strace hostname
This produces a couple of screens of output. Each line corresponds to a single system
call. For each call, the system call’s name is listed, followed by its arguments (or
abbreviated arguments, if they are very long) and its return value. Where possible, strace
conveniently displays symbolic names instead of numerical values for arguments and
return values, and it displays the fields of structures passed by a pointer into the system
call. Note that strace does not show ordinary function calls.
In the output from strace hostname, the first line shows the execve system call
that invokes the hostname program:[2]
execve("/bin/hostname", ["hostname"], [/* 49 vars */]) = 0
[1] hostname invoked without any flags simply prints the computer’s hostname to
standard output.
[2] In Linux, the exec family of functions is implemented via the execve system call.
The first argument is the name of the program to run; the second is its argument list,
consisting of only a single element; and the third is its environment list, which strace
omits for brevity. The next 30 or so lines are part of the mechanism that loads the
standard C library from a shared library file.
Toward the end are system calls that actually help do the program’s work. The
uname system call is used to obtain the system’s hostname from the kernel:
uname({sys="Linux", node="myhostname", ...}) = 0
Observe that strace helpfully labels the fields (sys and node) of the structure
argument. This structure is filled in by the system call: Linux sets the sys field to the
operating system name and the node field to the system’s hostname. The uname call is
discussed further in Section 8.15, “uname.”
Finally, the write system call produces output. Recall that file descriptor 1
corresponds to standard output. The third argument is the number of bytes to write,
and the return value is the number of bytes that were actually written.
write(1, "myhostname\n", 11) = 11
This may appear garbled when you run strace because the output from the hostname
program itself is mixed in with the output from strace.
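The two calls at the end of the trace can be reproduced with a short function. This is only a sketch of a program that would produce similar uname and write lines under strace, not the actual source of hostname; the name print_nodename is hypothetical. Note that strace abbreviates the structure's field names: in struct utsname they are sysname and nodename.

```c
#include <stdio.h>
#include <sys/utsname.h>
#include <unistd.h>

/* Fetch the node name with uname and emit it, followed by a newline,
   with a single write to standard output.  Returns the number of bytes
   written, or -1 on error.  */
ssize_t print_nodename (void)
{
  struct utsname info;
  /* Room for the node name, a newline, and the terminating NUL.  */
  char line[sizeof (info.nodename) + 1];
  int len;

  if (uname (&info) == -1)
    return -1;
  len = snprintf (line, sizeof (line), "%s\n", info.nodename);
  if (len < 0 || (size_t) len >= sizeof (line))
    return -1;
  /* File descriptor 1 (STDOUT_FILENO) is standard output.  */
  return write (STDOUT_FILENO, line, (size_t) len);
}
```

Calling print_nodename from main and running the result under strace should show one uname call and one write call whose third argument equals its return value.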
If the program you’re tracing produces lots of output, it is sometimes more
convenient to redirect the output from strace into a file. Use the option -o filename to
do this.
Understanding all the output from strace requires detailed familiarity with the
design of the Linux kernel and execution environment. Much of this is of limited
interest to application programmers. However, some understanding is useful for
debugging tricky problems or understanding how other programs work.
8.2 access: Testing File Permissions
The access system call determines whether the calling process has access permission
to a file. It can check any combination of read, write, and execute permission, and it
can also check for a file’s existence.
The access call takes two arguments. The first is the path to the file to check. The
second is a bitwise OR of R_OK, W_OK, and X_OK, corresponding to read, write, and
execute permission. The return value is 0 if the process has all the specified permissions. If
the file exists but the calling process does not have the specified permissions, access
returns -1 and sets errno to EACCES (or EROFS, if write permission was requested for a
file on a read-only file system).
If the second argument is F_OK, access simply checks for the file’s existence. If the file
exists, the return value is 0; if not, the return value is -1 and errno is set to ENOENT. Note
that errno may instead be set to EACCES if a directory in the file path is inaccessible.
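As a small illustration of combining the permission flags, the sketch below checks read and write permission in a single access call and maps the errno values mentioned above to short descriptions. The helper names can_read_and_write and access_failure_reason are hypothetical, chosen only for this example.

```c
#include <errno.h>
#include <unistd.h>

/* Return 1 if the calling process may both read and write PATH, and 0
   otherwise.  When the result is 0, errno holds the reason.  */
int can_read_and_write (const char* path)
{
  /* The flags are ORed together; access succeeds only if every
     requested permission is granted.  */
  return access (path, R_OK | W_OK) == 0;
}

/* Translate the errno values discussed in the text into short
   descriptions.  */
const char* access_failure_reason (int err)
{
  switch (err) {
  case ENOENT: return "does not exist";
  case EACCES: return "access denied";
  case EROFS:  return "read-only file system";
  default:     return "other error";
  }
}
```

A caller would test can_read_and_write first and, only if it returns 0, pass errno to access_failure_reason before making any other library call that might overwrite errno.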
The program shown in Listing 8.1 uses access to check for a file’s existence and to
determine read and write permissions. Specify the name of the file to check on the
command line.
Listing 8.1 (check-access.c) Check File Access Permissions
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

int main (int argc, char* argv[])
{
  char* path;
  int rval;

  /* The file to check must be given on the command line. */
  if (argc < 2) {
    fprintf (stderr, "usage: %s FILE\n", argv[0]);
    return 1;
  }
  path = argv[1];

  /* Check file existence. */
  rval = access (path, F_OK);
  if (rval == 0)
    printf ("%s exists\n", path);
  else {
    if (errno == ENOENT)
      printf ("%s does not exist\n", path);
    else if (errno == EACCES)
      printf ("%s is not accessible\n", path);
    return 0;
  }

  /* Check read access. */
  rval = access (path, R_OK);
  if (rval == 0)
    printf ("%s is readable\n", path);
  else
    printf ("%s is not readable (access denied)\n", path);

  /* Check write access. */
  rval = access (path, W_OK);
  if (rval == 0)
    printf ("%s is writable\n", path);
  else if (errno == EACCES)
    printf ("%s is not writable (access denied)\n", path);
  else if (errno == EROFS)
    printf ("%s is not writable (read-only filesystem)\n", path);

  return 0;
}
For example, to check access permissions for a file named README on a CD-ROM,
invoke it like this:
% ./check-access /mnt/cdrom/README
/mnt/cdrom/README exists
/mnt/cdrom/README is readable
/mnt/cdrom/README is not writable (read-only filesystem)
8.3 fcntl: Locks and Other File Operations
The fcntl system call is the access point for several advanced operations on file
descriptors. The first argument to fcntl is an open file descriptor, and the second is a
value that indicates which operation is to be performed. For some operations, fcntl
takes an additional argument. We’ll describe here one of the most useful fcntl
operations, file locking. See the fcntl man page for information about the others.
The fcntl system call allows a program to place a read lock or a write lock on a
file, somewhat analogous to the mutex locks discussed in Chapter 5, “Interprocess
Communication.” A read lock is placed on a readable file descriptor, and a write lock
is placed on a writable file descriptor. More than one process may hold a read lock on
the same file at the same time, but only one process may hold a write lock, and the
same file may not be both locked for read and locked for write. Note that placing a
lock does not actually prevent other processes from opening the file, reading from it,
or writing to it, unless they acquire locks with fcntl as well.
To place a lock on a file, first create and zero out a struct flock variable. Set the
l_type field of the structure to F_RDLCK for a read lock or F_WRLCK for a write lock.
Then call fcntl, passing a file descriptor to the file, the F_SETLKW operation code, and
a pointer to the struct flock variable. If another process holds a lock that prevents a
new lock from being acquired, fcntl blocks until that lock is released.
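The steps above can be wrapped in a small helper. This is a minimal sketch rather than a definitive implementation: the function names lock_whole_file and unlock_whole_file are hypothetical, and the unlock helper relies on F_UNLCK, the standard fcntl lock type for releasing a lock, which this section has not described. Zeroing the structure leaves the l_whence, l_start, and l_len fields at 0, which on Linux selects the whole file.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Place a lock of the given TYPE (F_RDLCK or F_WRLCK) on the whole file
   open on FD, blocking until the lock is granted.  Returns 0 on
   success and -1 on error, with errno set by fcntl.  */
int lock_whole_file (int fd, short type)
{
  struct flock lock;

  /* Zero the structure, then set only the lock type.  */
  memset (&lock, 0, sizeof (lock));
  lock.l_type = type;
  /* F_SETLKW waits if another process holds a conflicting lock.  */
  return fcntl (fd, F_SETLKW, &lock) == -1 ? -1 : 0;
}

/* Release any lock this process holds on the file open on FD.  */
int unlock_whole_file (int fd)
{
  return lock_whole_file (fd, F_UNLCK);
}
```

A read lock needs a descriptor opened for reading and a write lock needs one opened for writing, so a descriptor opened with O_RDWR can be used with either type.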
The program in Listing 8.2 opens a file for writing whose name is provided on the
command line, and then places a write lock on it. The program waits for the user to
hit Enter and then unlocks and closes the file.
Listing 8.2 (lock-file.c) Create a Write Lock with fcntl
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main (int argc, char* argv[])
{
  char* file = argv[1];
  int fd;
  struct flock lock;

  printf ("opening %s\n", file);
  /* Open a file descriptor to the file. */
  fd = open (file, O_WRONLY);
  /* Stop here if the file could not be opened for writing. */
  if (fd == -1) {
    perror ("open");
    return 1;
  }
  printf ("locking\n");
  /* Initialize the flock structure. */
  memset (&lock, 0, sizeof(lock));
  lock.l_type = F_WRLCK;
  /* Place a write lock on the file. */
  fcntl (fd, F_SETLKW, &lock);
continues