Friday, August 29, 2008

Sharing Memory Between Drivers and Applications

At one time or another, most driver writers will have the need to share memory between a driver and a user-mode program. And, as with most such things, there are a wide variety of ways to accomplish the goal of sharing a block of memory between a driver and a user-mode application. Some of these approaches are decidedly right and some are wrong. Two of the easiest techniques are:

• The application sends an IOCTL to the driver, providing a pointer to a buffer that the driver and the application thereafter share.

• The driver allocates a block of memory (from nonpaged pool for example), maps that block of memory back in the address space of a specific user-mode process, and returns the address to the application.

For the sake of brevity, we’ll restrict our discussion to these two, straightforward, techniques. Other perfectly acceptable techniques include sharing a named section that’s backed by either the paging file or a memory mapped file. Perhaps we’ll discuss those in a future article. Also, note that this article won’t specifically address sharing memory that’s resident on a device. While many of the concepts are the same, sharing device memory with a user-mode program brings with it its own set of special challenges.

Sharing Buffers Using IOCTLs
Sharing memory between a driver and a user-mode app using a buffer described with an IOCTL is the simplest form of “memory sharing”. After all, it’s identical to the way drivers support other, more typical, I/O requests. The base address and length of the buffer to be shared are specified by the application in the OutBuffer of a call to the Win32 function DeviceIoControl().

The only interesting decision for the driver writer who uses this method of buffer sharing is which buffer method (or “transfer type”, as it’s known) to specify for the IOCTL. Either METHOD_DIRECT (that is, METHOD_IN_DIRECT or METHOD_OUT_DIRECT, which describe the buffer with an MDL) or METHOD_NEITHER (using user virtual addresses) will work. If METHOD_DIRECT is used, the user buffer will be locked into memory. The driver will also need to call MmGetSystemAddressForMdlSafe() to map the described data buffer into kernel virtual address space. An advantage of this method is that the driver can access the shared memory buffer from an arbitrary process context, and at any IRQL.
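The transfer type is not a separate parameter: it is encoded in the two low-order bits of the IOCTL code itself. The sketch below re-creates the CTL_CODE bit layout in portable C so that it builds without the Windows headers. The MY_ prefixed macros, the function number 0x800, and the name IOCTL_SHARE_BUFFER are made up for illustration; a real driver would include the SDK/WDK headers and use CTL_CODE directly.

```c
#include <stdint.h>

/* Values mirror the Windows SDK definitions; they are redefined here
   (with a MY_ prefix) only so the sketch builds without the SDK headers. */
#define MY_FILE_DEVICE_UNKNOWN 0x00000022u
#define MY_FILE_ANY_ACCESS     0u

#define MY_METHOD_BUFFERED     0u
#define MY_METHOD_IN_DIRECT    1u
#define MY_METHOD_OUT_DIRECT   2u
#define MY_METHOD_NEITHER      3u

/* Same bit layout as CTL_CODE: device type | access | function | method */
#define MY_CTL_CODE(type, function, method, access)            \
    (((uint32_t)(type) << 16) | ((uint32_t)(access) << 14) |   \
     ((uint32_t)(function) << 2) | (uint32_t)(method))

/* Hypothetical control code meaning "here is the buffer we will share". */
#define IOCTL_SHARE_BUFFER                                      \
    MY_CTL_CODE(MY_FILE_DEVICE_UNKNOWN, 0x800,                  \
                MY_METHOD_OUT_DIRECT, MY_FILE_ANY_ACCESS)

/* Extract the transfer type from a control code (its low two bits). */
uint32_t method_from_ctl_code(uint32_t code)
{
    return code & 3u;
}
```

Because the method is part of the code, switching an IOCTL from one transfer type to another changes its numeric value, so the application and the driver must be rebuilt against the same definition.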

There are a number of restrictions and caveats inherent in using METHOD_NEITHER to describe a shared memory buffer. Basically, these are the same ones that apply any time a driver uses this method. Chief among these is the rule that the driver must only access the buffer in the context of the requesting process. This is because access to the shared buffer is via the buffer’s user virtual address. This will almost certainly mean that the driver must be at the top of the device stack, called directly by the user application via the I/O Manager. There can be no intermediate or file system drivers layered above the driver. Again practically speaking, this probably also means that the driver is restricted to accessing the user buffer from within its dispatch routines, when called by the requesting process.

Another important restriction inherent in using METHOD_NEITHER is that access by the driver to the user buffer must always be done at IRQL PASSIVE_LEVEL. This is because the I/O manager hasn’t locked the user buffer in memory, and it could be paged out when accessed by the driver. If the driver can’t meet this requirement, it will need to build an MDL and then lock the buffer in memory.

Another, perhaps less immediately obvious, restriction to this method – regardless of the transfer type chosen – is that the memory to be shared must be allocated by the user mode application. The amount of memory that can be allocated can be restricted, for example, due to quota limitations. Additionally, user applications cannot allocate physically contiguous or non-cached memory. Still, if all a driver and a user mode application need to do is pass data back and forth using a reasonably-sized data buffer, this technique can be both easy and useful.

As easy as it is, using IOCTLs to share memory between a driver and a user-mode application is also one of the most frequently misused schemes. One common mistake new NT driver writers make when using this scheme is that they complete the IOCTL sent by the application after having retrieved the buffer address from it. This is a very bad thing. Why? What happens if the user application suddenly exits, for example, due to an exception? With no I/O operation in progress to track the reference on the user buffer, the driver could unintentionally overwrite a random chunk of memory. Another problem is that when using METHOD_DIRECT, if the IRP with the MDL is completed the buffer will no longer be mapped into system address space. An attempt to access the previously valid kernel virtual address (obtained using MmGetSystemAddressForMdlSafe()) will crash the system. This is generally to be avoided.

Mapping Kernel Memory To User Mode
That leaves us with the second scheme mentioned above: Mapping a buffer allocated in kernel mode into the user virtual address space of a specified process. This scheme is surprisingly easy, uses APIs familiar to most NT driver writers, and yet allows the driver to retain maximum control of the type of memory being allocated.

The driver uses whatever standard method it desires to allocate the buffer to be shared. For example, if the driver needs a device (logical) address appropriate for DMA, as well as a kernel virtual address for the memory block, it could allocate the memory using AllocateCommonBuffer(). If no special memory characteristics are required and the amount of memory to be shared is modest, the driver can allocate the buffer from nonpaged pool.

The driver allocates an MDL to describe the buffer using IoAllocateMdl(). In addition to allocating the MDL from the I/O Manager’s look-aside list, this function fills in the MDL’s “fixed” part. Next, to fill in the variable part of the MDL (the part with the page pointers) the driver calls MmBuildMdlForNonPagedPool().
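The variable part of the MDL holds one page entry for every page the buffer touches, so its size depends on both the buffer's length and where the buffer starts within a page. The following sketch reproduces, in portable C, the arithmetic of the WDK's ADDRESS_AND_SIZE_TO_SPAN_PAGES macro. The names span_pages, SKETCH_PAGE_SHIFT and SKETCH_PAGE_SIZE are illustrative, and 4 KB pages are assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u                        /* assumes 4 KB pages */
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/* Number of pages touched by the range [va, va + size): the same
   arithmetic as the WDK's ADDRESS_AND_SIZE_TO_SPAN_PAGES macro. */
size_t span_pages(uintptr_t va, size_t size)
{
    uintptr_t offset_in_page = va & (SKETCH_PAGE_SIZE - 1);

    return (size_t)((offset_in_page + size + (SKETCH_PAGE_SIZE - 1))
                    >> SKETCH_PAGE_SHIFT);
}
```

A page-aligned 4K buffer therefore needs a single page entry, while the same 4K starting in the middle of a page spans two pages and needs two.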

With an MDL built that describes the buffer to be shared, the driver is now ready to map that buffer into the address space of the user process. This is accomplished using the function MmMapLockedPagesSpecifyCache() (for Win2K) or MmMapLockedPages() (for NT V4).

The only “tricks” you need to know about calling either of the MmMapLocked…() functions are (a) you must call the function from within the context of the process into which you want to map the buffer, and (b) you specify UserMode for the AccessMode parameter. The value returned from the MmMapLocked…() call is the user virtual address into which the buffer described by the MDL has been mapped. The driver can return that to the user application in a buffer in response to an IOCTL. That’s all there is to it. Put together, the code to accomplish this process is shown in Figure 1.

PVOID
CreateAndMapMemory()
{
    PVOID buffer;
    PMDL  mdl;
    PVOID userVAToReturn;

    //
    // Allocate a 4K buffer to share with the application
    //
    buffer = ExAllocatePoolWithTag(NonPagedPool,
                                   PAGE_SIZE,
                                   'MpaM');

    if(!buffer) {
        return(NULL);
    }

    //
    // Allocate and initialize an MDL that describes the buffer
    //
    mdl = IoAllocateMdl(buffer,
                        PAGE_SIZE,
                        FALSE,
                        FALSE,
                        NULL);

    if(!mdl) {
        ExFreePool(buffer);
        return(NULL);
    }

    //
    // Finish building the MDL -- Fill in the "page portion"
    //
    MmBuildMdlForNonPagedPool(mdl);

#if NT_40

    //
    // Map the buffer into user space
    //
    // NOTE: This function bug checks if out of PTEs
    //
    userVAToReturn = MmMapLockedPages(mdl,
                                      UserMode);

#else

    //
    // The preferred V5 way to map the buffer into user space
    //
    // NOTE: When mapping with UserMode, a failure is reported by
    // raising an exception rather than by returning NULL, so the
    // call is wrapped in a try/except block.
    //
    __try {

        userVAToReturn =
            MmMapLockedPagesSpecifyCache(mdl,                 // MDL
                                         UserMode,            // Mode
                                         MmCached,            // Caching
                                         NULL,                // Address
                                         FALSE,               // Bugcheck?
                                         NormalPagePriority); // Priority

    } __except(EXCEPTION_EXECUTE_HANDLER) {

        userVAToReturn = NULL;
    }

    //
    // If we get NULL back, the request didn't work.
    // I'm thinkin' that's better than a bug check anyday.
    //
    if(!userVAToReturn) {

        IoFreeMdl(mdl);
        ExFreePool(buffer);
        return(NULL);
    }

#endif

    //
    // Store away both the mapped VA and the MDL address, so that
    // later we can call MmUnmapLockedPages(StoredPointer, StoredMdl).
    // (StoredPointer and StoredMdl are assumed to be declared
    // elsewhere in the driver.)
    //
    StoredPointer = userVAToReturn;
    StoredMdl = mdl;

    DbgPrint("UserVA = 0x%p\n", userVAToReturn);

    return(userVAToReturn);
}



Figure 1 — Allocating a Buffer & Mapping Into User Mode


Of course, this method does have the disadvantage that the call to MmMapLocked…() must be done in the context of the process into which you want the buffer to be mapped. This might at first make this method appear no more flexible than the method that uses an IOCTL with METHOD_NEITHER. However, unlike that method, this one only requires one function (MmMapLocked…()) to be called in the target process’ context. Because many drivers for OEM devices are in device stacks of one above the bus (that is, there is no device above them, and no driver but the bus driver below them), this condition will be easily met. For the rare device driver that will want to share a buffer directly with a user-mode application that’s located deep within a device stack, an enterprising driver writer can probably find a safe way to call MmMapLocked…() in the context of the requesting process.

After the shared memory buffer has been mapped, like the method that uses the IOCTL with METHOD_DIRECT, the reference to the shared buffer can take place from an arbitrary process context, and even at elevated IRQL (because the shared buffer is not pageable).

If you use this method, there is one final thing that you’ll have to keep in mind: You will have to ensure that your driver provides a method to unmap those pages that you mapped into the user process any time the user process exits. Failure to do this will cause the system to crash as soon as the app exits, which is definitely to be avoided. One easy way that we’ve found of doing this is to unmap the pages whenever the application closes the device. Because closing the handle – expected or otherwise – always results in an IRP_MJ_CLEANUP being received by your driver for the File Object that represented the application’s open instance of your device, you can be sure this will work. You want to perform this operation at CLEANUP time, not CLOSE, because you can be (relatively) assured that you will get the cleanup IRP in the context of the requesting thread.

Other Challenges
Regardless of the mechanism used, the driver and application will need a common method of synchronizing access to the shared buffer. This can be done in a variety of ways. Probably the simplest mechanism is sharing one or more named events. When an application calls CreateEvent() with a name, the named event is automatically created in the Object Manager’s BaseNamedObjects directory. A driver can open, and share, these event objects by calling IoCreateNotificationEvent(), and specifying the same name as was specified in user mode (except, of course, specifying “\BaseNamedObjects” as the directory).
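The only detail that tends to trip people up here is the name translation, so here is a minimal sketch of it in portable C. The helper kernel_event_name and the event name used below are hypothetical; in a real driver the resulting name would be a wide-character UNICODE_STRING (for example, one initialized with RtlInitUnicodeString()) passed to IoCreateNotificationEvent().

```c
#include <stdio.h>
#include <stddef.h>

/* Build the kernel-mode object name for an event that the application
   created under the given Win32 name. Returns 0 on success, or -1 if
   the result would not fit in 'out'. */
int kernel_event_name(const char *win32_name, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "\\BaseNamedObjects\\%s", win32_name);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For an application that called CreateEvent() with the (made-up) name SharedBufReady, the driver would open \BaseNamedObjects\SharedBufReady.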

In Summary
We’ve looked at two methods for allowing a driver and a user-mode application to share a data buffer: Using a buffer created by a user application and passed to a driver via an IOCTL, and using a buffer created by the driver and mapped into the application’s address space using one of the MmMapLocked…() functions. Both methods are relatively simple, as long as you follow a few rules. Have fun!

1 comment:

wedday said...

DeviceIoControl Buffers
From paulsan@microsoftSPAM.com Mon Jan 11 12:03:46 1999
From: "-Paul"
Newsgroups: comp.os.ms-windows.programmer.nt.kernel-mode
Subject: Re: DeviceIoControl input/output buffer access in driver
Date: Mon, 11 Jan 1999 12:03:46 -0800
Organization: Microsoft Corp.
Message-ID: <77dlgd$kfe@news.dns.microsoft.com>


Here is an explanation of buffers and DeviceIoControl.

First, here are the parameters,

BOOL DeviceIoControl(
HANDLE hDevice, // handle to device of interest
DWORD dwIoControlCode, // control code of operation
// to perform
LPVOID lpInBuffer, // pointer to buffer to supply
// input data
DWORD nInBufferSize, // size of input buffer
LPVOID lpOutBuffer, // pointer to buffer to receive
// output data
DWORD nOutBufferSize, // size of output buffer
LPDWORD lpBytesReturned, // pointer to variable to receive
// output byte count
LPOVERLAPPED lpOverlapped // pointer to overlapped structure
// for asynchronous operation
);

METHOD_BUFFERED
user-mode perspective
lpInBuffer - optional, contains data that is written to the driver
lpOutBuffer - optional, contains data that is read from the driver after the call has completed

lpInBuffer and lpOutBuffer can be two buffers or a single shared buffer. If a single shared buffer is used, the input data is overwritten by the output data.


I/O Manager perspective
examines nInBufferSize and nOutBufferSize. Allocates memory from non-paged pool and puts the address of this pool in Irp->AssociatedIrp.SystemBuffer. The size of this buffer is equal to the size of the larger of the two buffers. This buffer is accessible at any IRQL.

copies nInBufferSize to irpSp->Parameters.DeviceIoControl.InputBufferLength
copies nOutBufferSize to
irpSp->Parameters.DeviceIoControl.OutputBufferLength
copies contents of lpInBuffer to SystemBuffer allocated above
calls your driver

Device Driver perspective
you have one buffer, Irp->AssociatedIrp.SystemBuffer. You read input data from this buffer and you write output data to the same buffer, overwriting the input data.

Before calling IoCompleteRequest, you must
- set IoStatus.Status to an appropriate NtStatus
- if IoStatus.Status == STATUS_SUCCESS
set IoStatus.Information to the
number of bytes you want copied
from the SystemBuffer back into
lpOutBuffer.



I/O Manager Completion Routine perspective
looks at IoStatus block, if IoStatus.Status = STATUS_SUCCESS, then
copies the number of bytes specified by IoStatus.Information from
Irp->AssociatedIrp.SystemBuffer into lpOutBuffer
completes the request
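The METHOD_BUFFERED sequence described above can be modelled in ordinary user-mode C. This is only a sketch of the data flow, not I/O Manager code: model_buffered_ioctl, model_dispatch_fn and model_upcase_dispatch are hypothetical names, a return value of 0 stands in for STATUS_SUCCESS, and the clamp on the returned byte count is a guard added for the model.

```c
#include <stdlib.h>
#include <string.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's dispatch routine: it sees one
   system buffer, reads input from it, writes output over it, and reports
   how many bytes should go back to the caller (IoStatus.Information).
   Returns 0 for success. */
typedef int (*model_dispatch_fn)(unsigned char *system_buffer,
                                 size_t in_len, size_t out_len,
                                 size_t *information);

/* User-mode model of the steps the post attributes to the I/O Manager. */
int model_buffered_ioctl(model_dispatch_fn dispatch,
                         const void *in_buf, size_t in_len,
                         void *out_buf, size_t out_len,
                         size_t *bytes_returned)
{
    /* One buffer, sized to the larger of the two user buffers. */
    size_t sys_len = in_len > out_len ? in_len : out_len;
    unsigned char *sys = calloc(sys_len ? sys_len : 1, 1);
    size_t information = 0;
    int status;

    *bytes_returned = 0;
    if (!sys)
        return -1;

    if (in_len)
        memcpy(sys, in_buf, in_len);              /* copy the input in */

    status = dispatch(sys, in_len, out_len, &information);

    if (status == 0) {
        if (information > out_len)
            information = out_len;                /* guard for this model only */
        if (information)
            memcpy(out_buf, sys, information);    /* copy the output back */
        *bytes_returned = information;
    }

    free(sys);
    return status;
}

/* Example "driver": upper-cases the input in place and returns it. */
int model_upcase_dispatch(unsigned char *system_buffer,
                          size_t in_len, size_t out_len,
                          size_t *information)
{
    size_t n = in_len < out_len ? in_len : out_len;
    size_t i;

    for (i = 0; i < n; i++) {
        if (system_buffer[i] >= 'a' && system_buffer[i] <= 'z')
            system_buffer[i] = (unsigned char)(system_buffer[i] - 'a' + 'A');
    }
    *information = n;
    return 0;
}
```

Note that the example driver never sees the caller's buffers at all; everything it reads and writes goes through the single system buffer, and only the number of bytes it reports is copied back.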


METHOD_IN_DIRECT
user-mode perspective
lpInBuffer - optional, contains data that is written to the driver. This buffer is used in the exact same fashion as METHOD_BUFFERED. To avoid confusion, mentally rename this buffer to lpControlBuffer. This is typically a small, optional buffer that might contain a control structure with useful information for the device driver. This buffer is small and is double buffered.

lpOutBuffer - NOT OPTIONAL, This LARGE buffer contains data that is read by the driver. To avoid confusion, mentally rename this buffer to lpDataTransferBuffer. This is physically the same buffer that the device driver will read from. There is no double buffering. Technically, this buffer is still optional, but since you are using this buffering method, what would be the point???


I/O Manager perspective
If lpInBuffer exists, allocates memory from non-paged pool and puts the address of this pool in Irp->AssociatedIrp.SystemBuffer. This buffer is accessible at any IRQL.

copies nInBufferSize to irpSp->Parameters.DeviceIoControl.InputBufferLength
copies nOutBufferSize to
irpSp->Parameters.DeviceIoControl.OutputBufferLength
copies contents of lpInBuffer to SystemBuffer allocated above

So far this is completely identical to METHOD_BUFFERED. Most likely lpInBuffer (mentally renamed to lpControlBuffer) is very small in size.

For lpOutBuffer (mentally renamed to lpDataTransferBuffer), an MDL is allocated. lpOutBuffer is probed and locked into memory. Then, the user buffer virtual addresses are checked to be sure they are readable in the caller's access mode.

The MDL address is stored in Irp->MdlAddress.
Your driver is called.

Device Driver perspective
The device driver can read the copy of lpOutBuffer [should be lpInBuffer--jeh] via Irp->AssociatedIrp.SystemBuffer. Anything written by the device driver to this buffer is lost. The I/O Manager does not copy any data back to the user-mode buffers as it did in the completion routine for METHOD_BUFFERED.

Art Baker's book is wrong in this respect (page 168, "data going from the driver back to the caller is passed through an intermediate system-space buffer" and page 177, "When the IOCTL IRP is completed, the contents of the system buffer will be copied back into the callers original output buffer").

The device driver accesses the Win32 buffer [lpOutBuffer -- jeh] directly via Irp->MdlAddress. The driver uses whatever MDL APIs it needs to read the buffer. Usually, this buffer is to be written to some mass storage media or some similar operation. Since this is a large data transfer, assume a completion routine is required.

mark the Irp pending
queue it
return status pending


Device Driver Completion Routine perspective
standard completion routine operations
set IoStatus.Status to an appropriate NtStatus
IoStatus.Information is not needed
complete the request


[I disagree with the "IoStatus.Information is not needed" comment. This longword is passed back to the caller as the returned lpBytesReturned value, and the application may want to see this. This represents the number of bytes of the lpOutBuffer actually written to the device. -- jeh]

I/O Manager Completion Routine perspective
standard I/O Manager completion routine operations
unmap the pages
deallocate the Mdl
complete the request


METHOD_OUT_DIRECT
user-mode perspective
lpInBuffer - optional, contains data that is written to the driver. This buffer is used in the exact same fashion as METHOD_BUFFERED. To avoid confusion, mentally rename this buffer to lpControlBuffer. This is typically a small, optional buffer that might contain a control structure with useful information for the device driver. This buffer is small and is double buffered.

lpOutBuffer - NOT OPTIONAL, This LARGE buffer contains data that is written by the driver and read by the user-mode application when the request is completed. To avoid confusion, mentally rename this buffer to lpDataTransferBuffer. This is physically the same buffer that the device driver will write to. There is no double buffering. Technically, this buffer is still optional, but since you are using this buffering method, what would be the point???

I/O Manager perspective
If lpInBuffer exists, allocates memory from non-paged pool and puts the address of this pool in Irp->AssociatedIrp.SystemBuffer. This buffer is accessible at any IRQL.

copies nInBufferSize to irpSp->Parameters.DeviceIoControl.InputBufferLength
copies nOutBufferSize to irpSp->Parameters.DeviceIoControl.OutputBufferLength
copies contents of lpInBuffer to SystemBuffer allocated above

So far this is completely identical to METHOD_BUFFERED. Most likely lpInBuffer (mentally renamed to lpControlBuffer) is very small in size.

For lpOutBuffer (mentally renamed to lpDataTransferBuffer), an MDL is allocated. lpOutBuffer is probed and locked into memory. Then the user buffer's addresses are checked to make sure the caller could write to them in the caller's access mode.

The MDL address is stored in Irp->MdlAddress.
Your driver is called.

Device Driver perspective
The device driver can read the copy of lpOutBuffer [should be lpInBuffer--jeh] via Irp->AssociatedIrp.SystemBuffer. Anything written by the device driver to this buffer is lost.

The device driver accesses the Win32 buffer [lpOutBuffer -- jeh] directly via Irp->MdlAddress. The driver uses whatever MDL APIs it needs to write data to the buffer. Usually, this buffer is to be read from some mass storage media or some similar operation. Since this is a large data transfer, assume a completion routine is required.

mark the Irp pending
queue it
return status pending

Device Driver Completion Routine perspective
standard completion routine operations
set IoStatus.Status to an appropriate NtStatus
IoStatus.Information is not needed
complete the request

[I disagree with the "IoStatus.Information is not needed" comment. This longword is passed back to the caller as the returned lpBytesReturned value, and the application may want to see this. This represents the number of bytes of the lpOutBuffer actually read from the device. -- jeh]

I/O Manager Completion Routine perspective
standard I/O Manager completion routine operations
unmap the pages
deallocate the Mdl
complete the request
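The difference between the two buffers in the direct methods can also be modelled in user-mode C. Again, this is a sketch of the data flow only: model_direct_ioctl, model_direct_fn and model_fill_dispatch are hypothetical names, and handing the driver a plain pointer to the caller's data buffer stands in for the mapping a real driver obtains through Irp->MdlAddress.

```c
#include <stdlib.h>
#include <string.h>
#include <stddef.h>

/* Hypothetical driver callback for the model. 'control' is a private copy
   of lpInBuffer; 'data' aliases the caller's lpOutBuffer. The return value
   plays the role of IoStatus.Information. */
typedef size_t (*model_direct_fn)(unsigned char *control, size_t control_len,
                                  unsigned char *data, size_t data_len);

size_t model_direct_ioctl(model_direct_fn dispatch,
                          const void *in_buf, size_t in_len,
                          void *out_buf, size_t out_len)
{
    unsigned char *control = malloc(in_len ? in_len : 1);
    size_t information;

    if (!control)
        return 0;
    if (in_len)
        memcpy(control, in_buf, in_len);    /* double-buffered control data */

    /* No copy of out_buf is made: the driver works on the caller's memory. */
    information = dispatch(control, in_len, out_buf, out_len);

    /* Nothing is copied back, so any writes to 'control' are lost. */
    free(control);

    return information;     /* surfaces to the caller as *lpBytesReturned */
}

/* Example METHOD_OUT_DIRECT-style "driver": fills the data buffer with the
   byte named in the control buffer, then scribbles on its control copy. */
size_t model_fill_dispatch(unsigned char *control, size_t control_len,
                           unsigned char *data, size_t data_len)
{
    unsigned char fill = control_len ? control[0] : 0;

    memset(data, fill, data_len);
    if (control_len)
        control[0] = 0xFF;                  /* never reaches the caller */
    return data_len;
}
```

In this model the data written through the direct pointer is visible to the caller immediately, the scribble on the control copy never is, and the returned count is what jeh's note says the application sees in lpBytesReturned.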

METHOD_NEITHER
I/O Manager perspective
Irp->UserBuffer = lpOutBuffer;
IrpSp->Parameters.DeviceIoControl.Type3InputBuffer = lpInBuffer;

No comments here. Don't use METHOD_NEITHER unless you know what you are doing. Simple rule.

If your IOCtl involves no data transfer buffers, then METHOD_NEITHER is the fastest path through the I/O Manager that involves an Irp.

Final Comment
Don't touch Irp->UserBuffer. This is a bookmark for the I/O Manager. Two major problems can occur. 1 - page fault at high IRQL, or 2 - you write something to Irp->UserBuffer and the I/O Manager overwrites you in its completion routine. File systems access Irp->UserBuffer, but FSD writers know all of the above and know when it is safe to touch Irp->UserBuffer.