DESCRIPTION:
The Operating Systems and Networks Lab within the Computer Science Department builds and evaluates real-world high-performance systems software. A project in distributed cluster resource virtualization led by Professor Gopalan has developed the MemX system to give businesses a competitive edge in processing large quantities of data quickly. MemX pools together and virtualizes cluster-wide memory in order to transparently execute unmodified large-memory and large-dataset applications over a commodity Gigabit Ethernet LAN, without the need for any code changes, recompilation, or relinking. MemX offers a means to overcome the limits on physical memory (DRAM) capacity within a single machine and to efficiently harness the unused memory capacity across a low-latency Gigabit Ethernet cluster.
In most installed systems, large-memory applications that exceed memory limits will page (swap) from DRAM to the local physical disk, incurring large disk Input/Output (I/O) latencies. By contrast, access to the memory of remote machines across a Gigabit Ethernet LAN can be orders of magnitude faster than access to local disk. MemX exploits this observation to alleviate the I/O bottlenecks associated with such applications. It enables unmodified large-memory applications to transparently access unused memory resources across the cluster with only microsecond latencies. MemX uses local disk only as a last resort, i.e., after exhausting all available remote memory.
The MemX system allows any application to use the virtualized remote memory pool either as a low-latency swap device or as a low-latency file system for storing large non-persistent datasets. MemX fills the widening performance gap between fast access to local DRAM and slow access to local and remote disks. In a network of multiple remote memory consumers (clients) and remote memory contributors (servers), MemX allows a single client to harness memory from multiple servers and a single server to share its memory pool among multiple clients, thus efficiently multiplexing cluster-wide memory resources.
MemX also automatically balances load among back-end servers, features a fault-recovery mechanism, and adapts dynamically to resource variations as the aggregated memory space grows and shrinks with contributors joining or leaving the network. In addition, MemX supports large-memory applications that execute within a virtual machine (VM) environment, such as web services hosted in data centers or high-performance and grid computing applications.
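The placement behavior described above (one client spreading pages over several servers, several clients sharing one server, and local disk used only when remote memory is exhausted) can be illustrated with a small sketch. This is a toy model, not MemX's implementation: the class names `MemoryServer` and `RemoteMemoryPool`, the "most free pages first" balancing rule, and the in-memory dictionary standing in for the local disk are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryServer:
    """Hypothetical remote memory contributor with a fixed page capacity."""
    name: str
    capacity_pages: int
    pages: dict = field(default_factory=dict)  # (client, page_no) -> data

    @property
    def free_pages(self) -> int:
        return self.capacity_pages - len(self.pages)


class RemoteMemoryPool:
    """Toy pool: place each page on the server with the most free pages
    (an assumed balancing rule) and fall back to disk only when all are full."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.disk = {}      # stand-in for the local swap disk
        self.location = {}  # (client, page_no) -> MemoryServer, or None for disk

    def _discard(self, key):
        # Drop any previous copy of the page before rewriting it.
        if key in self.location:
            server = self.location.pop(key)
            (self.disk if server is None else server.pages).pop(key)

    def write_page(self, client, page_no, data):
        key = (client, page_no)
        self._discard(key)
        target = max(self.servers, key=lambda s: s.free_pages, default=None)
        if target is not None and target.free_pages > 0:
            target.pages[key] = data
            self.location[key] = target
        else:
            # Last resort: every remote server is full.
            self.disk[key] = data
            self.location[key] = None

    def read_page(self, client, page_no):
        key = (client, page_no)
        server = self.location[key]
        return self.disk[key] if server is None else server.pages[key]


# Two servers with three pages of capacity in total; two clients share them.
pool = RemoteMemoryPool([MemoryServer("s1", 2), MemoryServer("s2", 1)])
for n in range(3):
    pool.write_page("clientA", n, f"A-{n}".encode())
pool.write_page("clientB", 0, b"B-0")  # remote memory is exhausted by now
```

In this sketch the three pages from `clientA` fill both servers, so the page from `clientB` lands on the disk stand-in. A real system would also need the fault recovery and membership handling mentioned above, which this model omits.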
Detailed performance evaluations of MemX using a number of benchmarks show a 2X to 20X speed-up over local and iSCSI disks for unmodified applications such as in-memory sorting, 3-D rendering, sql-bench, and network simulations.
POTENTIAL APPLICATIONS:
MemX can transparently support the most demanding memory-hungry workloads. A few examples of such applications include:
- Application-specific pre-processing of large GIS and mapping databases
- Low-latency database transactions for web services
- Back-end support for cloud computing
- Data mining
- Multimedia processing
- Image processing
- Large-scale scientific simulations
ADVANTAGES:
- Plug-n-play access to the cluster-wide memory pool without rewriting, recompiling, or relinking any applications or libraries.
- Application processing speeds 2-20X faster than local disk.
- Uses the installed base of commodity Gigabit Ethernet LANs.
- Less expensive than purchasing a single large-DRAM machine.
- Built-in reliability and dynamic resource adaptation features.
PATENT STATUS:
Patent pending. US Patent Application #11/957,410 filed 12/15/07.
ADDITIONAL REFERENCE INFORMATION:
A description of ongoing MemX research at the Operating Systems and Networks Lab (OSNET), along with citations of publications, can be accessed at:
http://osnet.cs.binghamton.edu/projects/memx.html