Hacking the Linux 2.6 kernel, Part 2: Making your first hack
Kernel source, system calls, and kernel modules and patches
Skill Level: Introductory
Freelance writer
Software Engineer
IBM
02 Aug 2005
In this second of a two-part series, discover the organization of the Linux kernel
source, build an understanding of system calls, and craft your own kernel modules
and patches.
Section 1. Before you start
Learn what these tutorials can teach you, and what you need to run the examples in
them.
About this series
The capability of being modified is perhaps one of Linux's greatest strengths, and
anyone who has dabbled with the source code has at least stood at the gates of the
kingdom, if not opened them up and walked inside.
These two tutorials are intended to get you started. They are for anyone who knows
a little bit of programming and who wants to contribute to the development of Linux,
who feels that something is missing in the kernel and wants to fix that, or who just
wants to find out how a real operating system works.
developerWorks®
ibm.com/developerWorks
About this tutorial
This tutorial is a sequel to "Hacking the Linux 2.6 kernel, Part 1: Getting ready."
Please read Part 1 before diving into Part 2.
We start where Part 1 left off by providing an overview of the kernel source. In this
tutorial, we review where the various parts of the kernel are located in the source
tree, what order they execute in, and how to go looking for a particular piece of code.
We then explain system calls, teach you how to make your own modules, and finally
instruct you on how to create, apply, and submit patches.
Prerequisites
To run the examples in this tutorial, you need a Linux box, root access on this Linux
box (or a sympathetic admin), the ability to reboot this box several times a day, an
installed compilation environment, and a way to get the kernel source.
The system prerequisites are covered in detail in Part 1 under "Requirement details."
If you're not up on these details, you'll probably want to brush up before going on to
the next section of this tutorial.
Section 2. Overview of the kernel source
The source tree
Let's start with the top-level directory of the Linux source tree, which is usually but
not always in /usr/src/linux-<version>. We won't get too detailed, because
the Linux source changes constantly, but we'll try to give you enough information to
figure out where a certain driver or function is.
Makefile : This file is the top-level makefile for the whole source tree. It defines a
lot of useful variables and rules, such as the default gcc compilation flags.
Documentation/ : This directory contains a lot of useful (but often out of date)
information about configuring the kernel, running with a ramdisk, and similar things.
The help entries corresponding to different configuration options are not found here,
though -- they're found in Kconfig files in each source directory.
arch/ : All the architecture-specific code is in this directory and in the
include/asm-<arch> directories. Each architecture has its own directory
underneath this directory. For example, the code for a PowerPC-based computer
would be found under arch/ppc. You will find low-level memory management,
interrupt handling, early initialization, assembly routines, and much more in these
directories.
crypto/ : This is a cryptographic API for use by the kernel itself.
drivers/ : As a general rule, code to run peripheral devices is found in
subdirectories of this directory. This includes video drivers, network card drivers,
low-level SCSI drivers, and other similar things. For example, most network card
drivers are found in drivers/net. Some higher-level code to glue all the drivers of
one type together may or may not be included in the same directory as the low-level
drivers themselves.
fs/ : Both the generic filesystem code (known as the VFS, or Virtual File System)
and the code for each different filesystem are found in this directory. One of the most
commonly used filesystems in Linux is the ext2 filesystem; the code to read the ext2
format is found in fs/ext2. Not all of the filesystems compile or run; the more
obscure filesystems are always good candidates for someone looking for a kernel
project.
include/ : Most of the header files included at the beginning of a .c file are found
in this directory. Architecture-specific include files are in asm-<arch>. Part of the
kernel build process creates the symbolic link from asm to asm-<arch>, so that
#include <asm/file.h> will get the proper file for that architecture without
having to hardcode it into the .c file. The other directories contain
non-architecture-specific header files. If a structure, constant, or variable is used in
more than one .c file, it should probably be in one of these header files.
init/ : This directory contains the file main.c, code for creating early userspace,
and other initialization code. main.c can be thought of as the kernel "glue." We'll
talk more about main.c in the next section. Early userspace provides functionality
that needs to be available while a Linux kernel is coming up, but that doesn't need to
be run inside the kernel itself.
ipc/ : IPC stands for interprocess communication. This directory contains the code for shared
memory, semaphores, and other forms of IPC.
kernel/ : Generic kernel-level code that doesn't fit anywhere else goes in here. The
upper-level system-call code is here, along with the printk() code, the scheduler,
signal-handling code, and much more. The files have informative names, so you can
type ls kernel/ and guess fairly accurately at what each file does.
lib/ : Routines of generic usefulness to all kernel code are put in here. Common
string operations, debugging routines, and command-line parsing code are all in
here.
mm/ : High-level memory-management code is in this directory. Virtual memory (VM)
is implemented through these routines in conjunction with the low-level
architecture-specific routines usually found in arch/<arch>/mm/. Early-boot
memory management (needed before the memory subsystem is fully set up) is done
here, as well as memory mapping of files, management of page caches, memory
allocation, and swap out of pages in RAM (along with many other things).
net/ : The high-level networking code is here. The low-level network drivers pass
received packets up to this level and get packets to send from it. This level may pass
the data to a user-level application, discard the data, or use it in-kernel, depending on
the packet. The net/core directory contains code useful to most of the different
network protocols, as do some of the files in the net/ directory itself. Specific
network protocols are implemented in subdirectories of net/. For example, IP
(version 4) code is found in the directory net/ipv4.
scripts/ : This directory contains scripts that are useful in building the kernel, but
does not include any code that is incorporated into the kernel itself. The various
configuration tools keep their files in here, for example.
security/ : Code for different Linux security models can be found here, such as
NSA Security-Enhanced Linux and socket and network security hooks, as well as
other security options.
sound/ : Drivers for sound cards and other sound-related code are placed here.
usr/ : This directory contains code that builds a cpio-format archive containing a
root filesystem image which will be used for early userspace.
Where does it all come together?
The central connecting point of the whole Linux kernel is the file init/main.c.
Each architecture executes some low-level set-up functions and then executes the
function called start_kernel (which is found in init/main.c).
The order of execution of code looks something like this:
Architecture-specific set-up code (in arch/<arch>/*)
|
v
The function start_kernel() (in init/main.c)
|
v
The function init() (in init/main.c)
|
v
The user level "init" program
More details on the order of execution
In more detail, this is what happens:
Architecture-specific set-up code that:
  - Unzips and moves the kernel code itself, if necessary
  - Initializes the hardware
      - This may include setting up low-level memory management
  - Transfers control to the function start_kernel()
start_kernel(), among other things:
  - Prints out the kernel version and command line
  - Starts output to the console
  - Enables interrupts
  - Calibrates the delay loop
  - Calls rest_init(), which:
      - Starts a kernel thread to run the init() function
      - Enters the idle loop
init():
  - Starts the other processors (on SMP machines)
  - Starts the device subsystems
  - Mounts the root filesystem
  - Frees up unused kernel memory
  - Runs /sbin/init (or /etc/init, or...)
At this point, the user-level init program is running; it will do things like start
networking services and run getty (which manages logins) on your console(s).
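The order of execution described above can be sketched as a small user-space mock. This is not kernel code: the function names prefixed with mock_, the boot_log array, and the record() helper are all invented for illustration, and the real rest_init() starts init() in a separate kernel thread rather than calling it directly as this simplified version does.

```c
/* User-space mock of the boot order; NOT kernel code. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_STEPS 16

/* Records the order in which the mock stages run (hypothetical helper state). */
static const char *boot_log[MAX_STEPS];
static int boot_steps;

static void record(const char *stage)
{
    assert(boot_steps < MAX_STEPS);   /* the mock only has a handful of stages */
    boot_log[boot_steps++] = stage;
    printf("[%d] %s\n", boot_steps, stage);
}

/* Stands in for init() in init/main.c. */
static void mock_init(void)
{
    record("init()");
    record("user-level init program");
}

/* Stands in for rest_init(). */
static void mock_rest_init(void)
{
    record("rest_init()");
    mock_init();            /* real kernel: runs in a new kernel thread */
    record("idle loop");    /* real kernel: the boot CPU idles from here on */
}

/* Stands in for start_kernel() in init/main.c. */
static void mock_start_kernel(void)
{
    record("start_kernel()");
    mock_rest_init();
}

/* Stands in for the architecture-specific set-up code. */
void mock_arch_setup(void)
{
    boot_steps = 0;
    record("architecture-specific set-up");
    mock_start_kernel();
}
```

Calling mock_arch_setup() from a main() function prints the stages in the order they are entered, which mirrors the diagram: architecture set-up, then start_kernel(), then init(), then the user-level init program.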
You can figure out when a subsystem is initialized from start_kernel() or
init() by putting in your own printk() calls and seeing when the messages from that
subsystem appear relative to your own. For example, if you wanted to
find out when the ALSA sound system was initialized, put printk() calls at the beginning
of start_kernel() and init() and look for where "Advanced Linux Sound
Architecture [...]" is printed out relative to your messages. (Part 2 offers help and tips
for using the printk() function.)
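Once the markers are in place, reading the log comes down to comparing line positions. The following user-space sketch shows that comparison under some loud assumptions: the marker strings, the locate_init() and find_line() helpers, and the sample_log contents are all made up for illustration, and real kernel log output will look different.

```c
/* User-space sketch: where does a subsystem message fall relative to two markers? */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Invented sample log, for illustration only; real kernel logs differ. */
static const char *const sample_log[] = {
    "MARK: entering start_kernel()",   /* hypothetical marker text */
    "MARK: entering init()",           /* hypothetical marker text */
    "Advanced Linux Sound Architecture [...]",
};

/* Returns the index of the first line containing needle, or -1 if none does. */
static int find_line(const char *const lines[], size_t n, const char *needle)
{
    for (size_t i = 0; i < n; i++)
        if (strstr(lines[i], needle) != NULL)
            return (int)i;
    return -1;
}

/* Describes where the subsystem message sits relative to the two markers. */
const char *locate_init(const char *const lines[], size_t n,
                        const char *subsys,
                        const char *first_marker,
                        const char *second_marker)
{
    assert(lines != NULL);

    int s = find_line(lines, n, subsys);
    int a = find_line(lines, n, first_marker);
    int b = find_line(lines, n, second_marker);

    if (s < 0 || a < 0 || b < 0)
        return "unknown (message or marker missing)";
    if (s < a)
        return "before the first marker";
    if (s < b)
        return "between the markers";
    return "after the second marker";
}
```

With the sample log above, the ALSA line is reported as coming after the second marker, which matches the earlier outline in which init() starts the device subsystems.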
Finding things in the kernel source tree
So, you want to start working on, say, the USB driver. Where do you start looking for
the USB code?
First, you can try a find command from the top-level kernel directory: