Hacking the Linux 2.6 kernel, Part 2: Making your first hack

Kernel source, system calls, and kernel modules and patches

Skill Level: Introductory

Lina Mårtensson (linam@tyst.nu)

Freelance writer

Valerie Henson (val@nmt.edu)

Software Engineer

IBM

02 Aug 2005

In this second of a two-part series, discover the organization of the Linux kernel

source, build an understanding of system calls, and craft your own kernel modules

and patches.

Section 1. Before you start

Learn what these tutorials can teach you, and what you need to run the examples in

them.

About this series

The capability of being modified is perhaps one of Linux's greatest strengths, and

anyone who has dabbled with the source code has at least stood at the gates of the

kingdom, if not opened them up and walked inside.

These two tutorials are intended to get you started. They are for anyone who knows

a little bit of programming and who wants to contribute to the development of Linux,

who feels that something is missing in the kernel and wants to fix that, or who just

wants to find out how a real operating system works.

Making your first hack

Page 1 of 25

developerWorks®

ibm.com/developerWorks

About this tutorial

This tutorial is a sequel to "Hacking the Linux 2.6 kernel, Part 1: Getting ready."

Please read Part 1 before diving into Part 2.

We start where Part 1 left off by providing an overview of the kernel source. In this

tutorial, we review where the various parts of the kernel are located in the source

tree, what order they execute in, and how to go looking for a particular piece of code.

We then explain system calls, teach you how to make your own modules, and finally

instruct you on how to create, apply, and submit patches.

Prerequisites

To run the examples in this tutorial, you need a Linux box, root access on this Linux

box (or a sympathetic admin), the ability to reboot this box several times a day, an

installed compilation environment, and a way to get the kernel source.

The system prerequisites are covered in detail in Part 1 under "Requirement details."

If you're not up on these details, you'll probably want to brush up before going on to

the next section of this tutorial.

Section 2. Overview of the kernel source

The source tree

Let's start with the top-level directory of the Linux source tree, which is usually but

not always in /usr/src/linux-<version>. We won't get too detailed, because

the Linux source changes constantly, but we'll try to give you enough information to

figure out where a certain driver or function is.

Makefile : This file is the top-level makefile for the whole source tree. It defines a

lot of useful variables and rules, such as the default gcc compilation flags.

Documentation/ : This directory contains a lot of useful (but often out of date)

information about configuring the kernel, running with a ramdisk, and similar things.

The help entries corresponding to different configuration options are not found here,

though -- they're found in Kconfig files in each source directory.

arch/ : All the architecture-specific code is in this directory and in the

include/asm-<arch> directories. Each architecture has its own directory

underneath this directory. For example, the code for a PowerPC-based computer

would be found under arch/ppc. You will find low-level memory management,

interrupt handling, early initialization, assembly routines, and much more in these

directories.

crypto/ : This is a cryptographic API for use by the kernel itself.

drivers/ : As a general rule, code to run peripheral devices is found in

subdirectories of this directory. This includes video drivers, network card drivers,

low-level SCSI drivers, and other similar things. For example, most network card

drivers are found in drivers/net. Some higher level code to glue all the drivers of

one type together may or may not be included in the same directory as the low-level

drivers themselves.

fs/ : Both the generic filesystem code (known as the VFS, or Virtual File System)

and the code for each different filesystem are found in this directory. One of the most

commonly used filesystems in Linux is the ext2 filesystem; the code to read the ext2

format is found in fs/ext2. Not all of the filesystems compile or run; the more obscure filesystems are always good candidates for someone looking for a kernel project.

include/ : Most of the header files included at the beginning of a .c file are found in this directory. Architecture-specific include files are in asm-<arch>. Part of the kernel build process creates the symbolic link from asm to asm-<arch>, so that #include <asm/file.h> will get the proper file for that architecture without having to hardcode it into the .c file. The other directories contain non-architecture-specific header files. If a structure, constant, or variable is used in more than one .c file, it should probably be in one of these header files.
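
The asm symbolic link described above can be tried out in a scratch directory, without touching a real kernel tree. This is only a sketch under stated assumptions: the directory comes from mktemp, the header name file.h and the choice of ppc are stand-ins, and in a real tree the link is created by the kernel build rather than by hand.

```shell
#!/bin/sh
# Sketch only: mimic the include/asm -> include/asm-<arch> link in a
# throwaway directory (not a real kernel tree).
demo=$(mktemp -d)
mkdir -p "$demo/include/asm-ppc"
echo '/* ppc-specific header */' > "$demo/include/asm-ppc/file.h"

# The kernel build creates a link like this for the configured architecture;
# here we create it by hand for illustration.
ln -s asm-ppc "$demo/include/asm"

# A generic path (asm/file.h) now resolves to the architecture-specific file.
link_target=$(readlink "$demo/include/asm")
header_text=$(cat "$demo/include/asm/file.h")
echo "asm -> $link_target"
echo "$header_text"

# Clean up the scratch directory.
rm -rf "$demo"
```

Because the link is relative, the same layout would work wherever the tree is unpacked; pointing asm at a different asm-<arch> directory is all it takes to switch architectures in this toy setup.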

init/ : This directory contains the file main.c, code for creating early userspace, and other initialization code. main.c can be thought of as the kernel "glue." We'll talk more about main.c in the next section. Early userspace provides functionality that needs to be available while a Linux kernel is coming up, but that doesn't need to be run inside the kernel itself.

ipc/ : IPC stands for interprocess communication. This directory contains the code for shared memory, semaphores, and other forms of IPC.

kernel/ : Generic kernel-level code that doesn't fit anywhere else goes in here. The

upper-level system-call code is here, along with the printk() code, the scheduler,

signal-handling code, and much more. The files have informative names, so you can

type ls kernel/ and guess fairly accurately at what each file does.

lib/ : Routines of generic usefulness to all kernel code are put in here. Common

string operations, debugging routines, and command-line parsing code are all in

here.

mm/ : High-level memory-management code is in this directory. Virtual memory (VM)

is implemented through these routines in conjunction with the low-level

architecture-specific routines usually found in arch/<arch>/mm/. Early-boot

memory management (needed before the memory subsystem is fully set up) is done

here, as well as memory mapping of files, management of page caches, memory

allocation, and swapping out of pages in RAM (along with many other things).

net/ : The high-level networking code is here. The low-level network drivers pass received packets up to this level and get packets to send from it. Depending on the packet, this level may pass the data to a user-level application, discard the data, or use it in-kernel. The net/core directory contains code useful to most of the different network protocols, as do some of the files in the net/ directory itself. Specific network protocols are implemented in subdirectories of net/. For example, IP (version 4) code is found in the directory net/ipv4.

scripts/ : This directory contains scripts that are useful in building the kernel, but

does not include any code that is incorporated into the kernel itself. The various

configuration tools keep their files in here, for example.

security/ : Code for different Linux security models can be found here, such as

NSA Security-Enhanced Linux and socket and network security hooks, as well as

other security options.

sound/ : Drivers for sound cards and other sound-related code are placed here.

usr/ : This directory contains code that builds a cpio-format archive containing a

root filesystem image which will be used for early userspace.

Where does it all come together?

The central connecting point of the whole Linux kernel is the file init/main.c. Each architecture executes some low-level set-up functions and then executes the function called start_kernel (which is found in init/main.c).

The order of execution of code looks something like this:

Architecture-specific set-up code (in arch/<arch>/*)

The function start_kernel() (in init/main.c)

The function init() (in init/main.c)

The user-level "init" program

More details on the order of execution

In more detail, this is what happens:

• Architecture-specific set-up code that:
    • Unzips and moves the kernel code itself, if necessary
    • Initializes the hardware
        • This may include setting up low-level memory management
    • Transfers control to the function start_kernel()

• start_kernel(), which, among other things:
    • Prints out the kernel version and command line
    • Starts output to the console
    • Enables interrupts
    • Calibrates the delay loop
    • Calls rest_init(), which:
        • Starts a kernel thread to run the init() function
        • Enters the idle loop

• init(), which:
    • Starts the other processors (on SMP machines)
    • Starts the device subsystems
    • Mounts the root filesystem
    • Frees up unused kernel memory
    • Runs /sbin/init (or /etc/init, or...)

At this point, the user-level init program is running; it will do things like start networking services and run getty (which presents the login prompt) on your console(s).

You can figure out when a subsystem is initialized from start_kernel() or init() by putting in your own printks and seeing where the printks from that subsystem appear relative to your own. For example, if you wanted to find out when the ALSA sound system was initialized, put printks at the beginning of start_kernel() and init() and look for where "Advanced Linux Sound Architecture [...]" is printed out relative to your printks. (Part 1 offers help and tips for using the printk() function.)
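
Once the kernel has booted with your printks in place, comparing positions in the kernel log can be done by eye or with a few shell commands. The following is a minimal sketch, not output from a real boot: the log lines are made up for illustration, the "HACK:" marker strings are hypothetical messages you would have added yourself, and on a real system the lines would come from the output of dmesg rather than from a temporary file.

```shell
#!/bin/sh
# Sketch only: a made-up kernel log. On a real system, capture it with
# something like: dmesg > "$logfile"
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
HACK: start_kernel() begins
Calibrating delay loop...
HACK: init() begins
Advanced Linux Sound Architecture [...]
EOF

# Print the line number of the first log line containing a fixed string.
line_of() {
    grep -n -F -- "$1" "$logfile" | head -n 1 | cut -d: -f1
}

start_line=$(line_of 'HACK: start_kernel() begins')
init_line=$(line_of 'HACK: init() begins')
alsa_line=$(line_of 'Advanced Linux Sound Architecture')

# Decide where the subsystem's message falls relative to our two markers.
if [ "$alsa_line" -gt "$init_line" ]; then
    verdict="after the init() marker"
elif [ "$alsa_line" -gt "$start_line" ]; then
    verdict="between the start_kernel() and init() markers"
else
    verdict="before the start_kernel() marker"
fi
echo "ALSA message appears $verdict"

rm -f "$logfile"
```

With the sample log above, the ALSA line comes after the init() marker, which would suggest the subsystem is initialized from init() rather than directly from start_kernel().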

Finding things in the kernel source tree

So, you want to start working on, say, the USB driver. Where do you start looking for

the USB code?

First, you can try a find command from the top-level kernel directory:
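
A sketch of such a search follows. It runs against a scratch directory standing in for the kernel tree so that it can be tried anywhere; the directory and placeholder file names are assumptions for illustration, and the name pattern is one plausible choice rather than necessarily the authors' exact command.

```shell
#!/bin/sh
# Sketch only: build a tiny stand-in for a source tree so the search
# can run without a real kernel checkout.
tree=$(mktemp -d)
mkdir -p "$tree/drivers/usb" "$tree/drivers/net" "$tree/include/linux"
: > "$tree/drivers/usb/placeholder.c"
: > "$tree/drivers/net/placeholder.c"
: > "$tree/include/linux/usb.h"

# From the top of the tree, list every file or directory whose name
# contains "usb". In a real tree you would simply cd to the top-level
# kernel directory and run the find command there.
matches=$(cd "$tree" && find . -name '*usb*' | sort)
echo "$matches"

rm -rf "$tree"
```

In this toy tree the search reports the drivers/usb directory and the usb.h header; a real kernel tree would list many more entries, which you can narrow by adding a path or a stricter pattern.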
