.NET 4 and Memory Mapped Files


I think I am really going to like .NET 4:


I certainly welcome this new addition to .NET; for inter-process communication on a single machine, it’s fabulous. Move over, WCF and Named Pipes: memory mapped files are coming through. As soon as .NET 4 ships, memory mapped files will be my first choice for the scenarios that were previously reserved for WCF and Named Pipes.
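To make the idea concrete, here is a minimal sketch of the pattern: one side writes a length-prefixed message into a mapped file and the other side maps the same file and reads it back. It uses Python's standard `mmap` module purely as a language-neutral illustration of the operating system feature, not the .NET 4 API. The helper names (`create_backing_file`, `write_message`, `read_message`), the 4-byte length prefix, and the one-page size are all assumptions made for this example, and it deliberately leaves out the cross-process synchronization a real design would need.

```python
import mmap
import os
import struct
import tempfile

SIZE = 4096  # assumed size of the shared region: one page is plenty here


def create_backing_file(path, size=SIZE):
    """Create a zero-filled file that both sides will map."""
    with open(path, "wb") as f:
        f.write(b"\x00" * size)


def write_message(path, payload):
    """Map the file and store a 4-byte little-endian length plus the payload."""
    if len(payload) > SIZE - 4:
        raise ValueError("payload does not fit in the mapped region")
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), SIZE) as view:
        view[0:4] = struct.pack("<I", len(payload))
        view[4:4 + len(payload)] = payload


def read_message(path):
    """Map the same file (normally from another process) and read the payload."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), SIZE) as view:
        (length,) = struct.unpack("<I", view[0:4])
        return bytes(view[4:4 + length])


if __name__ == "__main__":
    channel = os.path.join(tempfile.mkdtemp(), "channel.bin")
    create_backing_file(channel)
    write_message(channel, b"hello from the writer")
    print(read_message(channel))
```

In a real setup the writer and reader would be separate processes that agree on the file path (or mapping name) and use something like a named mutex or event to signal when a message is ready.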

With great power comes great responsibility: the memory mapped file approach seriously ups the ante when it comes to understanding and using multithreading in your code. Incidentally, I highly recommend Joe Duffy’s Concurrent Programming on Windows; it is on my must-read list for any developer working on Windows today.

If this is a topic that you enjoy, you can find more information about this feature in .NET 4 on Salvador Patuel’s blog. As usual, if you have any comments, questions, flames, or enhancements, I would love to hear from you. In the meantime, think deeply and enjoy.


Robert said...

I'm a little puzzled, but pardon my ignorance since I haven't had to code for Inter-Process Communications myself. Why would mapping to a physical file (usually on a disk) give good performance for IPC? Surely IPC should use methods which keep the data in main memory (RAM) where possible, since RAM is several orders of magnitude faster than disk access? I realise that the actual write to disk probably doesn't happen (due to lazy write), but it still strikes me as odd that mapping to a physical file (e.g. a file on the desktop in your example) is faster than creating a named pipe. Can someone please explain?

Anonymous said...

More importantly, the power of a memory mapped file would only be unleashed if you could use pointers in an unsafe context. Mapping a file and then performing a read, which copies into a buffer, is enormously wasteful and, in fact, basically the same as what P/Invoke does.

Do you know whether we'll be able to use an unsafe context to get a pointer to this? Because otherwise I don't see the point.
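The difference between the two access styles can be sketched as follows. This is an analogy in Python's standard `mmap` module, not an answer about the .NET 4 API: `copy_read` pulls bytes out of the mapping into a fresh buffer, while `in_place_increment` touches the mapped memory directly through a `memoryview`, with no intermediate copy. Both function names are made up for this illustration.

```python
import mmap


def copy_read(mapping, offset, count):
    """Copying access: returns a new bytes object holding the data."""
    mapping.seek(offset)
    return mapping.read(count)


def in_place_increment(mapping, offset):
    """Direct access: modify one byte of the mapped memory without copying."""
    # The memoryview exposes the mapping's own memory; it must be released
    # before the mapping can be closed, which the with-block takes care of.
    with memoryview(mapping) as view:
        view[offset] = (view[offset] + 1) % 256
```

The copying style costs one allocation and one memory copy per access, which is what the comment above objects to; the direct style is closer to what a raw pointer would give you.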

Stephen Remde said...

Wikipedia says:

"Accessing memory mapped files is faster than using direct read and write operations for two reasons. Firstly, a system call is orders of magnitude slower than a simple change of program's local memory. Secondly, in most operating systems the memory region mapped actually is the kernel's page cache (file cache), meaning that no copies need to be created in user space."

How much this affects managed code, I don't know.

Robert said...

Yes, I understand that it's quicker for reading/writing files. But I was talking about IPC. As good old Wikipedia also explains, Named Pipes avoid extra I/O operations, since they remain in memory. MMFs will produce some disk I/O whenever the file system driver decides to commit. I guess that's why MMFs are ideally suited to small amounts of data, which makes it less likely that the file system (or disk I/O subsystem, whatever it is) will decide to commit anything to disk.

Here's some food for thought:

To be fair, it's doubtful this will ever be an issue for most applications.

Incidentally, from web searching, it appears that MMFs are the simplest way for 3 or more processes to communicate (e.g. if for some reason you want data to be sent from 1 process to 2 more processes on the same PC). So there's another advantage to them.

Rob said...

I don't know about IPC, but the exciting part for me is replicating Perl's tie() method, where one can easily tie a data structure to a file on disk. This helps greatly when running large, complex batch applications, as you can use and fill up your data structures without worrying about running out of memory. When a batch app takes 3 days to run, this can be quite painful.
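A rough sketch of that tie-like idea, written with Python's standard `mmap` and `struct` modules as a stand-in (it is neither Perl's tie() nor the .NET 4 API): a fixed-length array of 64-bit integers whose storage is a mapped file, so values survive the process and the operating system pages data in and out as needed. The class name `FileBackedIntArray` and its layout are assumptions invented for this example.

```python
import mmap
import os
import struct


class FileBackedIntArray:
    """Fixed-length array of signed 64-bit ints stored in a mapped file."""

    ITEM = struct.Struct("<q")  # one little-endian 64-bit integer per slot

    def __init__(self, path, length):
        if length <= 0:
            raise ValueError("length must be positive")
        self._length = length
        size = length * self.ITEM.size
        if not os.path.exists(path):
            # New file: zero-fill it so every slot starts at 0.
            with open(path, "wb") as f:
                f.write(b"\x00" * size)
        self._file = open(path, "r+b")
        self._map = mmap.mmap(self._file.fileno(), size)

    def _offset(self, index):
        if not 0 <= index < self._length:
            raise IndexError(index)
        return index * self.ITEM.size

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        return self.ITEM.unpack_from(self._map, self._offset(index))[0]

    def __setitem__(self, index, value):
        self.ITEM.pack_into(self._map, self._offset(index), value)

    def close(self):
        self._map.flush()  # ask the OS to write dirty pages to the file
        self._map.close()
        self._file.close()
```

Usage looks like ordinary indexing: create `FileBackedIntArray(path, n)`, assign `arr[i] = value`, call `close()`, and a later run that opens the same path with the same length sees the stored values.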

Dan Finch said...

According to this article, memory-mapped files do not necessarily correspond to a physical file. In fact, named pipes make use of memory-mapped files.
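The point about mappings that have no file on disk can be shown with a small sketch, again using Python's standard `mmap` module rather than the .NET 4 API. Passing `-1` as the file descriptor asks the operating system for an anonymous region; on Windows such a mapping is backed by the system paging file instead of a named file. The helper names `make_scratch_region` and `roundtrip` are hypothetical.

```python
import mmap


def make_scratch_region(size=4096):
    """Create a mapping with no file behind it (file descriptor -1)."""
    return mmap.mmap(-1, size)


def roundtrip(payload):
    """Write bytes into an anonymous mapping and read them back."""
    region = make_scratch_region(max(len(payload), 1))
    try:
        region[:len(payload)] = payload
        return bytes(region[:len(payload)])
    finally:
        region.close()
```

Nothing in this sketch touches the file system, which is why a mapping used only for IPC need not cause the disk traffic discussed earlier in the thread unless the system is under memory pressure.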