Monday, August 23, 2010

Linux Device Driver One Liners

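Types of Driver Classes.

Char Driver : Accessed as a stream of bytes, normally sequentially (e.g. serial ports, consoles).
Block Driver : Permits random access in fixed-size blocks, buffered by the kernel (e.g. disks).
Network Driver : Exchanges packets of data directly with the kernel network stack. Not mapped to a node in the file system.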

Tainted Kernel : A kernel is marked tainted when, among other reasons, a module without a GPL ( or compatible ) license is loaded.

Device Driver Structure

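__init -> Marks the initialization function, which is registered with module_init() and executed first, at the time of loading. Not mandatory.
__exit -> Marks the cleanup function, which is registered with module_exit() and executed at the time of driver unloading.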

Parameters can be passed to a device driver using "module_param()". If write permission is given, the value can be changed on the fly with " $ echo "value" > /sys/module/<module>/parameters/<parameter> "

Finally, the system calls made on a device are mapped to the entry points in the driver's "file_operations" (fops) structure. For example, the close() system call ends up in the "release()" entry point of the driver.
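Here is a small user-space sketch of that mapping, assuming a Linux box where /dev/zero exists. The helper name read_zeros() is made up for this example. Each call goes through the kernel to the driver behind /dev/zero: open() to its open(), read() to its read(), and close() to release() when the driver provides one.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read 'n' bytes (at most 64) from /dev/zero.
 * Returns how many of the bytes read were zero, or -1 on error. */
int read_zeros(size_t n)
{
    unsigned char buf[64];
    int zeros = 0;

    if (n > sizeof buf)
        n = sizeof buf;

    int fd = open("/dev/zero", O_RDONLY);   /* -> driver's open()    */
    if (fd < 0)
        return -1;

    ssize_t got = read(fd, buf, n);         /* -> driver's read()    */
    close(fd);                              /* -> driver's release() */
    if (got < 0)
        return -1;

    for (ssize_t i = 0; i < got; i++)
        if (buf[i] == 0)
            zeros++;
    return zeros;
}
```

Calling read_zeros(16) from main() should give back 16, since /dev/zero hands out nothing but zero bytes.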

Major Number : Identifies the driver associated with a device.
Minor Number : Used only within the device driver, to distinguish the individual devices ( or functions ) it handles.
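The pair can also be read from user space. A minimal sketch, assuming Linux with glibc (the major() and minor() macros come from sys/sysmacros.h); the helper name device_numbers() is invented for this example.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Hypothetical helper: look up the major/minor pair of a device node.
 * Returns 0 on success, -1 if the path cannot be stat()ed or is not a
 * character/block device. */
int device_numbers(const char *path, unsigned int *maj, unsigned int *min)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISCHR(st.st_mode) && !S_ISBLK(st.st_mode))
        return -1;

    *maj = major(st.st_rdev);   /* which driver          */
    *min = minor(st.st_rdev);   /* which device under it */
    return 0;
}
```

On Linux, /dev/null is major 1, minor 3, the same pair that "ls -l /dev/null" prints.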

Important Structures

File Structure : The kernel creates a new file structure on every open(). It is passed to the rest of the driver's entry points. Memory allocated separately per open() and attached to this structure is called "private data".
Inode Structure : There is just 1 inode structure per device file, no matter how many times it is opened.

task_struct = keeps all info about tasks.

Interrupt & Exception

  • An interrupt can never be lost.
  • Interrupt lines can be shared. All handlers registered on a shared line get called; the one responsible for the device consumes the interrupt, the rest discard it.
  • Interrupts are either maskable or non-maskable.
  • Under SMP, a given interrupt is handled by only one CPU at a time.
  • Interrupt handlers cannot sleep.
  • Interrupt handlers cannot call schedule().

Top Half : The actual interrupt routine. Acknowledges the interrupt and copies data from the device to a buffer. Small and fast.
Bottom Half : Does the deferred processing work. There are 3 types of bottom half.
  • Softirqs : Run in interrupt context. Cannot sleep.
  • Tasklets : Run in interrupt context. Cannot sleep.
  • Workqueues : Run in process context. Can sleep.

Timers

Jiffies : A counter which increments on every timer interrupt.
TSC : Time Stamp Counter, a CPU register which increments on every clock cycle. H/W dependent.
HPET : High Precision Event Timer. A hardware timer.

IOCTLS

ioctl() is a driver entry point. User space calls it to send device-specific commands that do not fit read() / write().

PROCFS

Virtual FS in memory, mostly read-only. Used for debugging and information gathering.

SysFS

Virtual File system in memory.

Race conditions.

Happen when two or more threads access the same resource at the same time. The mechanisms below prevent this.

Atomic function : Executes as a single operation which cannot be interrupted.

Mutex : Sleepable locking mechanism.
  • Interruptible mutex : a signal can break the wait.
  • Non-interruptible mutex : signals cannot break the wait; only the owner can release the lock.
  • It is non-recursive.
Spin lock : Non-sleepable locking mechanism ( the waiter busy-loops ).


Semaphore : Sleepable locking mechanism.

  • Counting semaphore : More than one thread can hold the semaphore.
  • Binary semaphore : Count of one, behaves like a mutex.

Memory management

  • Zones ( 32-bit x86 ):
  • 0 - 16 MB : DMA
  • 16 - 896 MB : Normal
  • 896 MB - 4 GB : High.

kmalloc() : Allocates physically contiguous memory.
GFP_KERNEL : Normal allocation, can sleep and block.
GFP_ATOMIC : For interrupt context, cannot block.
GFP_DMA : Memory suitable for DMA.
kfree() : Frees memory allocated by kmalloc().

vmalloc() : Allocates memory that is contiguous in virtual address space, but not necessarily in physical memory. Cannot be used in interrupt context or for DMA.
vfree() : Frees memory allocated by vmalloc().

Boot Memory + early allocation

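A device driver can typically allocate only about 4MB in one kmalloc() call ( the exact limit depends on architecture and configuration ). To circumvent this, "alloc_bootmem()" can be used at boot time, by code built into the kernel, to reserve larger regions.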

To move data between user space and kernel space use the functions below, because directly dereferencing a pointer passed from user space inside the kernel is unsafe.
  • put_user()
  • get_user()
  • copy_to_user()
  • copy_from_user()

Storage Systems One Liners

NAS : Network Attached Storage exposes a file system. E.g. NFS, CIFS/SMB.

SAN : Storage Area Network exposes volumes or block devices. ( SCSI )

iSCSI : Exposes SCSI devices as volumes. Can be accessed via TCP/IP.
  • iSCSI Target : Actual data destination, the real storage space. ( Server )
  • iSCSI Initiator : Client side. Finds and connects to the target.
FC : Fibre Channel, a storage protocol. Normally used with SAN.

AoE : ATA over Ethernet. ATA commands carried directly over the LAN.

FCoE : Fibre Channel over Ethernet.

Benefits of Storage
  • Disaster Recovery
  • Easy backup / Storage points
  • Easy management

Unix Theory One Liners.

Type of Files

  • HardLink : Multiple paths to the same inode. Cannot span different file systems.
  • SoftLink : Alternate path to a file. Can span file systems.
  • Special Files : Kernel objects represented as files. E.g. Char, Block.
  • Pipes : IPC mechanism. Supports one-way communication between related processes ( parent / child ).
  • FIFO : Also called named pipes. IPC mechanism. Unrelated processes can communicate.
  • Sockets : IPC mechanism. Two or more unrelated processes can communicate, on the same machine or across machines.

Process vs Threads

  • Process : a program in memory, running, which can be scheduled on the CPU. Has its own memory / address space.
  • Thread : a unit of execution within a process. Shares most of the data structures of the process except the stack, which is separate for each thread. The threading API under UNIX is pthreads; NPTL is its implementation on Linux.

  • fork() = creates a new process. The new process is called the "child". It is an exact replica of the parent when it is created. It gets its own copy of stack and heap through COW ( copy-on-write ).
  • fork() returns twice, once in the child (0) and once in the parent ( pid of child ).

  • If the child dies before the parent has collected its status with wait(), the child is called a zombie.
  • If the parent dies before the child, the child process is called an orphan.


What child process gets ...

  • The parent's signal dispositions ( installed handlers and ignored signals ).

What child does not get ...

  • File locks held by the parent.
  • Pending signals ( the child starts with none ).

  • exit() - Terminates the calling process.
  • atexit() - Registers a user defined function which is called when the process exits normally.

Parent getting status of child.

  • wait() - Parent blocks until a child dies and retrieves its exit status.
  • waitpid() - Gets the status of the single child whose pid matches.
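fork() and waitpid() fit together as in the sketch below. The function name child_exit_status() is invented for this example; the child does nothing except exit with the given code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with 'code' and
 * return the exit status the parent collects, or -1 on error. */
int child_exit_status(int code)
{
    int status;
    pid_t pid = fork();          /* returns twice */

    if (pid < 0)
        return -1;               /* fork failed, no child */

    if (pid == 0)
        _exit(code);             /* child: fork() returned 0 */

    /* parent: fork() returned the pid of the child */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the parent collects the status right away, the child does not linger as a zombie.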

Daemons = processes running in the background with no terminal attached.


I/O

  • Buffered I/O = Writes / reads go through a buffer of fixed size. Can be forced out with a flush ( fflush() ). The opposite is non-buffered ( here the user manages its own buffer and write/read frequency ).
  • Multiplexed I/O = Wait on more than one file descriptor at once. ( select(), poll() )
  • Scatter/Gather I/O = Read/Write to many buffers in a single call. ( readv(), writev() )
  • MMAP = Map a file into memory. ( mmap() )
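As a sketch of the gather half of scatter/gather I/O: three separate buffers go out in one writev() call. A pipe stands in for the file so the result can be read straight back; the function name gather_write_demo() and the buffer contents are made up for this example.

```c
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical demo: write three buffers with a single writev() into a
 * pipe, read them back into 'out' as one string. Returns the number of
 * bytes read back, or -1 on error. */
ssize_t gather_write_demo(char *out, size_t outlen)
{
    char a[] = "scatter", b[] = "/", c[] = "gather";
    struct iovec iov[3] = {
        { .iov_base = a, .iov_len = strlen(a) },
        { .iov_base = b, .iov_len = strlen(b) },
        { .iov_base = c, .iov_len = strlen(c) },
    };
    int fds[2];

    if (outlen == 0 || pipe(fds) != 0)
        return -1;

    ssize_t written = writev(fds[1], iov, 3);   /* one call, three buffers */
    close(fds[1]);

    ssize_t got = -1;
    if (written > 0)
        got = read(fds[0], out, outlen - 1);
    close(fds[0]);

    if (got >= 0)
        out[got] = '\0';
    return got;
}
```

readv() is the mirror image: one call that fills several buffers in order.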

Locking

  • Advisory = Not enforced. Works only if every process cooperates. ( e.g. flock() )
  • Mandatory = Enforced by the kernel.
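What "not enforced" means can be seen with flock(). In this sketch one descriptor takes an exclusive lock on a temp file; a second descriptor that asks for the lock is refused, but a write that never asks goes through anyway. The function name advisory_lock_demo() and the temp file name are made up, and a writable /tmp is assumed.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical demo. Returns 0 on success and fills in:
 *   lock_refused  - 1 if the second descriptor could not get the lock
 *   write_allowed - 1 if it could still write without asking */
int advisory_lock_demo(int *lock_refused, int *write_allowed)
{
    char path[] = "/tmp/advisory-demo-XXXXXX";
    int fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;

    int fd2 = open(path, O_WRONLY);
    if (fd2 < 0) {
        close(fd1);
        unlink(path);
        return -1;
    }

    int rc = flock(fd1, LOCK_EX);               /* first holder */
    if (rc == 0) {
        /* a cooperating party asks first, and is told no */
        *lock_refused = (flock(fd2, LOCK_EX | LOCK_NB) != 0);
        /* a non-cooperating party just writes, nothing stops it */
        *write_allowed = (write(fd2, "x", 1) == 1);
    }

    close(fd2);
    close(fd1);
    unlink(path);
    return rc;
}
```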

Signals

  • Are asynchronous notifications.

Saturday, July 24, 2010

The Owl Kernel

This Owl Kernel looks promising; stay tuned to this site. It will be better than your CS lectures.

Wednesday, May 12, 2010

Lvalue & Rvalue.

The Lvalue and Rvalue discussion took up many minutes of a student's time in his college days. It brought smiles to the face of the interviewer and a frown to the face of the interviewee. But even with an impressive resume, a programmer does not give it much thought once he/she gets out of college and clears the interview. This is one topic which is harder to explain than to put into action, because it comes naturally, which is a very hard thing to come by in 'C'.

The rule, simply put: LV is the address of the storage location, and RV is the actual contents.

Let's see what Lvalue (LV) & Rvalue (RV) are with a simple 'C' program.

Lesson:01: This will not compile.

...
int a = 0;
/* Here 'a' is RV, and its actual contents is 0 (zero). The numeral 5 is LV (or trying to be), but what then is the address of '5'? This program will fail with an "lvalue required ..." error. By this the compiler is saying: to store the contents ( RV aka 0 ) I need an address. With the numeral 5 I am unable to get one. So I am angry and I will not proceed. */
5 = a;
printf("a = %d\n",a);
...

Lesson:02: This will compile.

...
int a = 0;
/* Here 'a' is LV. And its address will be '&a'. While numerical 5 is RV and its contents is also 5. */
a = 5;
printf("a = %d\n",a);
...


Lesson:03: This will compile. This is extension of Lesson:02 to explain things better.

...
int a = 0;
int b = 5;
/* Here 'a' is LV. And its address will be '&a'. While for 'b', the compiler will take the contents which is 5 , so here 'b' is RV */
a = b;
printf("a = %d\n",a);
...
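One more sketch along the same lines, not from the original lessons: an LV does not have to be a plain variable name. Anything that designates storage will do, such as a dereferenced pointer or an array element. The function name lvalue_demo() is made up for this example.

```c
/* Hypothetical demo of LVs that are not plain variable names. */
int lvalue_demo(void)
{
    int a = 0;
    int arr[3] = {0, 0, 0};
    int *p = &a;

    *p = 5;             /* '*p' is LV: it designates the storage of 'a'   */
    arr[1] = a + 1;     /* 'arr[1]' is LV; 'a + 1' is RV, a bare value 6  */
    /* (a + 1) = 7; */  /* would not compile: 'a + 1' has no address      */

    return a + arr[1];  /* 5 + 6 */
}
```

Uncommenting the middle line brings back the same "lvalue required ..." complaint as in Lesson:01.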


Thanks
Arshad

Sunday, May 9, 2010

The Computer Architecture, The Engineers and The Programmer.

I was lucky to see the computer when it was still growing in India. My GWBASIC course in school (sometimes I used to get a BASICA floppy) would happily run on Modi-Olivetti 8086 machines. And I started to call myself a programmer. With GWBASIC ( programming language ) I thought I could do anything with the machine. But in fact it would drive me crazy thinking how people could code Prince of Persia, while I would struggle to get a "block" to move in graphic mode (XORing; I learned about blitting later). A few years later, while going through Ray Duncan's book ( Advanced MS-DOS Programming ), it hit me: I came to know about the 4 screens in x86 Architecture, which you loop through to give an impression of movement.

Architecture... is this important ?

A few years rolled by and I was caught up in the likes of Assembly, C & the Linux file-system. I used to encounter "sizeof(int)", but Architecture still got step-motherly treatment from me. Architecture for me was "Morris Mano" and it ended with college.

One day, being confident in C and assembly language, I decided to code my own OS, as a challenge and as a test: if I claim I know every bit of this computer system, let's write a minimal 32-bit protected-mode OS. After all, I know C and assembly; how hard could this be?

I was wrong. Very wrong. My pride took a beating, my ego was thrashed. With all the C and assembly I could not budge an inch with my new OS. I understood that a programming language is just a vehicle to express my ideas to the CPU. I must befriend the structure of the computer, the Architecture. So I shelved my C & UNIX books and got hold of the Intel/AMD developer's guides. My OS has made huge strides since then.

Now I understood GDTs & IDTs, and why LDTs are seldom used after the 386. Now I can take a bare CPU and bring it up. I completely understood the relationship between ring 0-3 and DPL, the call-gates, and why, if you do not set up an interrupt handler during OS bring-up and that interrupt is generated ... it goes for a triple fault. For much more please refer to the Intel/AMD manuals. ;-)

For people working on board bring-up, device drivers, or any OS, knowledge of CPU Architecture is compulsory, and it completes the programmer. Any training institution, college or university which does not stress Architecture leaves its students incomplete. And I see many esteemed ones doing this ... unfortunately.

I'll conclude by quoting the starting note of the Intel Architecture S/W Developer's Manual: " ... Refer all four volumes when evaluating your design needs".

Saturday, May 8, 2010

Writing Device Driver, without installing full kernel source

It is possible to write a device driver without installing and compiling
the full kernel source. But there is a disadvantage with this method:
changes to the kernel itself ( e.g. enabling lguest etc ... ) will not be
possible. For changes to kernel options you will have to install the
full kernel source and do the cycle of make ; make install ; make modules
etc...

Ok , back to topic. Quick way to write device driver....

; install a few packages
# apt-get install make build-essential exuberant-ctags libncurses5-dev
; get your kernel version
# uname -r
; install kernel headers
# apt-get install linux-headers-$(uname -r)

; Fire away your device driver (kernel) code.

Allow Root to Remote login via ssh

; install package
# apt-get install openssh-server

; edit sshd_config
# vi /etc/ssh/sshd_config
And have the line "PermitRootLogin yes" uncommented. Or add
your own.

; restart ssh server
# /etc/init.d/ssh restart

Sunday, January 24, 2010

Blending C and Assembly (nasm)

First, the difference between 'extern' and 'global'.
extern : Assures the assembler that the symbol is defined someplace else.
global : Any symbol marked global can be referenced from other object files.
===========================================================================
blank.asm ( name of file )

; simply returns 0
GLOBAL _blank
section .text
_blank:
mov eax, 0
ret

===========================================================================
test.c ( name of file )

#include <stdio.h>

extern int _blank();

int
main(){
int ret = 5;
ret = _blank();
printf("Ret is [%d]\n",ret);
return 0;
}

===========================================================================

$ nasm -felf blank.asm -> outputs -> blank.o ( 32-bit ELF; on a 64-bit system use -felf64, or build test.c with gcc -m32 )
$ gcc -o test test.c blank.o
$ ./test
Ret is [0]

Friday, January 15, 2010

Install pydbg and paimei under windows

01. Get python 2.4.4 for windows.(http://www.python.org/download/releases/2.4.4)
-> Run the installer and install python ( say to c:\python24 )

02. Get ctypes for this version. (http://downloads.sourceforge.net/ctypes/ctypes-1.0.1.win32-py2.4.exe?modtime=1161376216&big_mirror=0)
-> Install by double clicking (Follow the leads given by the installer.)

03. Download paimei. ( http://www.openrce.org/downloads/download_file/208 )
-> ( say your paimei file is PaiMei-1.1-win32.exe )

04. Get source from http://paimei.googlecode.com/svn/trunk
-> put it under c:\paimei_src ( just a folder of your choice )

05. copy (step 03) PaiMei-1.1-win32.exe under "c:\paimei_src\installer"

06. run c:\paimei_src\__install_requirements.py
->
/* Actual output of __install_requirements.py */
looking for PaiMei -> PyDbg ... FOUND
looking for PaiMei -> PIDA ... FOUND
looking for PaiMei -> pGRAPH ... FOUND
looking for PaiMei -> Utilities ... FOUND

Install PaiMei framework libraries to Python site packages? y

...
/* end */

07. That's it!
-> echo "from pydbg import *" >> test.py
-> python test.py

It should run without any errors.


... And it's very fitting to say here

Happy Hacking...

Monday, January 11, 2010

Map partition from a disk for mounting

/* Say there is a disk image (disk_image.img) whose individual partitions need
to be mounted */

# file disk_image.img
disk_image.img : x86 boot sector; partition 4: ID=0x4, active, starthead 1, startsector 32, 8160 sectors, extended partition table (last)\011

/* a simple matter of using the kpartx utility */
# kpartx -av disk_image.img

/* the output will be ... */
loop1p1 : < string showing start - ends >
...

/* mount the required partition */
# mount /dev/mapper/loop1p1 /mnt/ -o loop

/* clean up: unmount first, then remove the mappings */
# umount /mnt/
# kpartx -d disk_image.img

Wednesday, January 6, 2010

Working with Eclipse and pydev under windows

1. Get Python from "http://www.python.org/ftp/python/3.1.1/python-3.1.1.msi"

2. Get Eclipse from "http://www.eclipse.org/downloads/download.php?file=/technology/epp/downloads/release/galileo/SR1/eclipse-cpp-galileo-SR1-win32.zip"

3. Extract both to your favorite folder.

4. Launch eclipse and go to Help->Install Software Updates

4a. Input "http://pydev.org/updates" where it asks to "select for site", and
follow the instructions. Eclipse will restart to reflect the new changes.

5. Under Eclipse go to Window->Preferences

5a. Under this go to Pydev->"Interpreter-Python"

5b. Click the "New" button on the top left, input the path to your python
interpreter (python.exe) and click Apply.