Monday, September 24, 2007

CompSci Research interest...

So I made an innovative discovery last month... probably nothing you all are interested in, but I wanted to document it as mine :-). I've never seen this done before and I'm a bit happy about it.

DISCLAIMER: This is probably not interesting to anyone who reads this blog - but it's a curious mental exercise for myself (and you)!

A common problem in multi-threaded design is locking... as in two or more threads/processes/CPU's are sharing access to a common piece of information - be it a structure, block of memory, what have you.

And lately we have been using several mutex locks (mutually exclusive) - meaning one execution task can OWN it for reading and writing. Or perhaps we use semaphores (many shared) which allows several execution tasks to own it.

A more interesting innovation is a read/write lock - which is used when readers outnumber writers (in general). Basically you have many tasks reading this shared data - but only a few writing to it.

There are existing systems for read/write locks in place - specifically in pthreads (unix typically) and ERESOURCES in windows. I was forced recently to write my own due to the inadequacy of the windows kernel to provide a usable lock in DISPATCH/DPC interrupt mode.

Background: The Windows XP kernel uses the same mechanism to lock as it does for pre-emption (executing higher-priority code while in the middle of doing something else). This "feature" means that when a thread wants to lock for something that takes a long while (oh, like say, encryption) that in effect you are killing everything else running on that particular CPU.

So, the solution I propose is to use an atomic read/write lock that does not utilize DISPATCH/DPC mode at all. Where this is tricky though - is that in order to utilize this - you have to atomically compare and store both the read lock and write lock - at the same time.

So, here's the bit of hackery! It turns out there is no "dual operand" compare and store on X86. There was on the 68040 - but it's long dead. But what we DO have is a long long compare and store instruction - essentially a 64 bit CAS (compare and store) called CMPXCHG8B.

We're creating a pair of spin locks here - the read lock allows up to 2^32 readers and the write lock is for a single writer. The writer gets precedence over readers by forcing them to wait until completion.

The hackery comes in play in that we implement the locks in a structure like this - and make it's total length 64 bits like so:


// Packed on 4 byte boundary
__declspec(align(4)) typedef struct {
LONG ro_locks;
LONG rw_gate;
} GATE_LOCK;


Ah HA! So this suddenly becomes a much simpler problem. Now we can operate on both the read/write locks atomically with a single instruction!

So, to create a read lock - we do this:

volatile __int64 c, old;
// c is volatile, should be atomic
c = lock->gateLock.ro_locks;
// We're comparing the entire gate lock with the ro_locks count
// Should block if a read-write lock has been set

while( (old = CAS( ( volatile unsigned __int64 *) & lock->gateLock,
c + 1, c )) != c+1) {
c = old & 0xffffffff;
} // while RO_LOCKS


Notice the 0xffffffff - this is masking (setting to zero) the top bits in the 64 bit integer - thus requiring that the write lock is not in place.

Here's the write lock:

// Set the gate lock, drain out the readers
XCHNG( (volatile LONG *) &lock->gateLock.rw_gate, 1);
// Wait for the readers to empty out
while( (__int64)1<<32 != CAS( (volatile unsigned __int64 *) & lock->gateLock,
(__int64 ) 1 << 32,
(__int64 ) lock->gateLock.rw_gate << 31 ))
} // while RW_GATE


Right now I'm using a basic mutex lock to ensure that writers are only let in one at a time - but I'll update this code to be mutex-free soon. This will involve testing for zero and updating the value to 1 - a basic spin-lock in essence.

6 comments:

Mark Zuniga said...

It's funny. Because I work in a field loaded with Luddites, I always feel like the Mr. IT. The reality is that I have an above-average understanding of programs like Word or Adobe Acrobat, and I can sometimes fake my way around a solution (e.g. I taught myself how to use *.tiff to essentially type on a printed form). In terms of your work, I'm using leeches to suck out the bad humors.

Convivialdingo said...

Well, I feel the same way about legal cases!

I know just enough to get myself through, but have little background in established precedent and case histories.

Unknown said...

I can't say that I'm surprised that the Windows XP scheduler is as dumb as it is, but I would expect that it would give CPU time to other processes as needed. Is that not the case?

I've also never heard of ERESOURCES. Is that what the typical win32 thread use?

Convivialdingo said...

Hi matt - nice to meet you.

For this class of locks (spinlock) the plain answer is no. You are stopping the schedular because the spinlock escalates the interrupt level of the CPU.

Quick overview:
The XP kernel has 3 levels of software interrupt precedence controlled by the Interrupt Dispatch Level. Passive, APC, and DPC.

Normally we think of an interrupt as hardware - for instance the clock chip signals the CPU that a tick has happened. When that happens the CPU records the EIP/return address frame, looks up the interrupt number and executes the interrupt code.

Software interrupts can be generated as well, using the int instruction - just like the old DOS days.

A higher interrupt can interrupt code running at a lower level.

What is happening here is that the XP kernel utilizes this software interrupt to stop the schedular from doing it's job - also known as preemption.

But the kernel ALSO utilizes this software interrupt to perform locking. And so you're paying an enormous price of stopping everything on that CPU for a simple lock.

What I'm attempting to do is to lower this cost by avoiding the interrupt - which should allow other threads to execute. The danger of deadlock/livelock should be carefully avoided. Read only locks are reentrant - meaning that you can lock again within a lock. Read/write locks are not - and that's where you have to be careful.

Anyway - hope you enjoyed my mini-course here :)

Johnny Ong said...

so u are a computer geek too

Convivialdingo said...

Yup, I'm a geek.