[Design View / Design Solution]
Master On-Chip Embedded Multiprocessor Coherence
Although snoopy virtual-bus approaches are the first step, hybrid snoopy-directory schemes will be the next trend in embedded coherence.
Deadlock/Livelock
In addition to choosing the serialization method and the type of coherence protocol, cache-coherence protocol designers must guarantee that the protocol is free of deadlock and livelock under limited resource and buffer constraints. This is particularly relevant when coherence is built on a packet-switched interconnect.
There are two types of deadlocks—interconnect and protocol. Both generally occur due to buffer constraints in a packet-switched interconnect. Protocol deadlocks should be carefully considered when designing coherence protocols (Fig. 3a). Common schemes that prevent deadlocks include separating a transaction's request path from the reply/response path, and guaranteeing that a cache or a memory agent responds to a request in any state.
To accomplish the first scheme, designers usually use virtual channels [7] (Fig. 3b). Transactions flowing in any one virtual channel follow FIFO order, and a blocking event in the stream causes backpressure that can propagate all the way back to the source. Hence, as long as the sinks of transactions make forward progress, so does the system.
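The effect of separating request and reply paths can be illustrated with a small simulation. The sketch below is not taken from any particular protocol: the two-agent topology, the names Channel and simulate, and the buffer depths are all assumptions chosen for illustration. Each agent must emit a reply before it can retire a request from its inbound FIFO, and replies are always sunk on arrival.

```python
from collections import deque


class Channel:
    """Bounded FIFO; a full channel exerts backpressure on its producer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.fifo = deque()

    def try_push(self, msg):
        if len(self.fifo) >= self.capacity:
            return False  # backpressure: the producer must hold the message
        self.fifo.append(msg)
        return True


def simulate(separate_channels, capacity=2, requests_per_agent=4, max_cycles=1000):
    """Return (completed_requests, deadlocked) for two agents that query each other."""

    def make_link():
        if separate_channels:
            # Hypothetical virtual channels: one FIFO per message class.
            return {"req": Channel(capacity), "rsp": Channel(capacity)}
        shared = Channel(capacity)  # requests and replies share one FIFO
        return {"req": shared, "rsp": shared}

    links = [make_link(), make_link()]  # links[i] carries traffic toward agent i
    to_send = [requests_per_agent, requests_per_agent]
    completed = 0
    total = 2 * requests_per_agent

    for _ in range(max_cycles):
        progress = False
        # Injection phase: each agent eagerly fills the path toward its peer.
        for me in (0, 1):
            out = links[1 - me]
            while to_send[me] and out["req"].try_push("req"):
                to_send[me] -= 1
                progress = True
        # Service phase: handle the head of each distinct inbound FIFO, replies first.
        for me in (0, 1):
            inbound, out = links[me], links[1 - me]
            channels = [inbound["rsp"]]
            if inbound["req"] is not inbound["rsp"]:
                channels.append(inbound["req"])
            for ch in channels:
                if not ch.fifo:
                    continue
                if ch.fifo[0] == "rsp":
                    ch.fifo.popleft()  # the requester is a sink for replies
                    completed += 1
                    progress = True
                elif out["rsp"].try_push("rsp"):
                    ch.fifo.popleft()  # request retired only once its reply is sent
                    progress = True
        if completed == total:
            return completed, False
        if not progress:
            return completed, True  # work remains but nothing can move
    return completed, False
```

With these defaults, simulate(False) reports a deadlock with no request completed: both shared FIFOs fill with requests, and neither agent can find room for a reply. simulate(True) completes all eight requests, because the reply channel always drains at its sink.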
Livelock in a distributed system takes place when agents stay active, for example by repeatedly reissuing transactions, yet none of them makes forward progress. At the processor, this shows up as the program counter stalled at some Load/Store. It frequently occurs when multiple caches repeatedly try, and fail, to gain ownership of the same cache line. If a global serial order is properly established in the system, each agent can handle requests in that order. The global serial order itself must be established fairly, and the various resources (ports, buses, buffers) must all be fairly allocated among the threads/processors.
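One common way to make the serialization point fair is a round-robin arbiter. The following is a minimal sketch under assumed names (RoundRobinArbiter, fixed_priority_grant, count_grants); it contrasts round-robin with a fixed-priority scheme, which can starve low-priority agents when a high-priority agent requests continuously.

```python
class RoundRobinArbiter:
    """Grants one requester per round, rotating priority after each grant."""

    def __init__(self, n_agents):
        self.n = n_agents
        self.last = n_agents - 1  # so that agent 0 is checked first initially

    def grant(self, requesting):
        """Return the granted agent id from the set `requesting`, or None."""
        for offset in range(1, self.n + 1):
            candidate = (self.last + offset) % self.n
            if candidate in requesting:
                self.last = candidate  # the winner gets lowest priority next round
                return candidate
        return None


def fixed_priority_grant(requesting):
    """Unfair baseline: the lowest-numbered requester always wins."""
    return min(requesting) if requesting else None


def count_grants(grant_fn, n_agents=3, rounds=9):
    """Count grants per agent when every agent requests in every round."""
    counts = [0] * n_agents
    everyone = set(range(n_agents))
    for _ in range(rounds):
        counts[grant_fn(everyone)] += 1
    return counts
```

Under constant contention from three agents over nine rounds, count_grants(fixed_priority_grant) yields [9, 0, 0], while count_grants(RoundRobinArbiter(3).grant) yields [3, 3, 3].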
Another concern related to livelock prevention is flow control. A system's flow control limits resource allocation. Flow control designed in an ad hoc manner can result in livelock. A common case is the overuse of retries or negative acknowledgements (NACKs) in response to requests.
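The NACK hazard can be made concrete with a toy model. In the sketch below, every name and parameter is hypothetical: a home agent serves one cache line for service_time cycles per grant and NACKs any request that arrives while the line is busy. Requester A retries every cycle, and requester B retries every other cycle. The reserve option, in which the home remembers NACKed requesters in FIFO order and grants only to the oldest one, is just one possible remedy and is not prescribed by the text above.

```python
from collections import deque


def run_home(reserve, cycles=12, service_time=2):
    """Return the number of grants each requester receives.

    reserve=False: the home NACKs whenever the line is busy and keeps no state.
    reserve=True:  the home queues NACKed requesters and serves the oldest first.
    """
    requesters = [("A", 1, 0), ("B", 2, 1)]  # (name, retry period, first attempt)
    grants = {"A": 0, "B": 0}
    busy_until = 0      # the line is free once the current cycle reaches this value
    pending = deque()   # NACKed requesters, oldest first (used only with reserve)

    for t in range(cycles):
        for name, period, phase in requesters:
            if t < phase or (t - phase) % period != 0:
                continue  # this requester does not retry in this cycle
            line_free = t >= busy_until
            my_turn = not reserve or not pending or pending[0] == name
            if line_free and my_turn:
                if pending and pending[0] == name:
                    pending.popleft()
                grants[name] += 1
                busy_until = t + service_time
            elif reserve and name not in pending:
                pending.append(name)  # NACK, but remember who was turned away
    return grants
```

With the default timing, run_home(False) gives A six grants and B none, because B's retries always land while the line is busy. With run_home(True), B is guaranteed a turn once it reaches the head of the queue.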
Other Considerations
Beyond deadlock and livelock, designers should consider the following issues:
Cache hierarchies and DMA: Issues of deadlock surface as transactions traverse the cache hierarchy. Usually, one can adopt the same mechanism used in the broader protocol to keep the requests and replies on separate (virtual or real) channels/FIFOs.
Another issue concerns determining the level at which to enforce coherence (L1 Cache, L2 Cache, or L3 Cache). Where will I/O insert cache lines into, or extract them from, the coherence domain? Solutions that involve issues of inclusion are usually very application- or system-specific. Hints for prefetching and data placement can be supplied to the coherence system by incorporating them into its transaction set. An obvious example is routing, where the headers of an incoming IP packet need to be matched against a table to determine the destination buffer/interface for that packet. These headers can be placed close to the lower cache levels by coloring transactions with hints, such as Read/Write Hit/Miss policies.
Synchronization and barrier operations: Many ISAs offer various atomic primitives that must be mapped onto the coherence system. Load-linked/store-conditional (LL/SC), for instance, is a common atomic primitive in modern ISAs [4]. This form of atomicity is prone to livelock if not implemented correctly, and can lead to deadlock. Weaker memory systems require a safety net, called barriers, to force a certain ordering behavior (usually between Stores and Loads issued from a processor or thread) during sensitive code sequences. This is generally achieved by inserting special barrier instructions supported by the ISA. The coherence system may need to respond to these operations by dynamically stalling certain transactions to support their behavior.
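The LL/SC semantics can be modeled in a few lines. This is a toy, single-threaded sketch with assumed names (Line, atomic_increment), not a description of any real ISA: each CPU holds a link flag on the line, and a successful store clears every link, so a competing store-conditional fails and must retry. The retry bound in atomic_increment is likewise an assumption, added so that a lack of forward progress surfaces as an error.

```python
class Line:
    """One memory word with a per-CPU link (reservation) flag."""

    def __init__(self, value=0, n_cpus=2):
        self.value = value
        self.linked = [False] * n_cpus

    def load_linked(self, cpu):
        self.linked[cpu] = True  # start monitoring the line for this CPU
        return self.value

    def store_conditional(self, cpu, new_value):
        if not self.linked[cpu]:
            return False  # another store intervened since the load-linked
        self.value = new_value
        self.linked = [False] * len(self.linked)  # a store breaks every link
        return True


def atomic_increment(line, cpu, max_attempts=8):
    """Retry LL/SC until the store succeeds; return the number of attempts."""
    for attempt in range(1, max_attempts + 1):
        old = line.load_linked(cpu)
        if line.store_conditional(cpu, old + 1):
            return attempt
    raise RuntimeError("no forward progress: possible livelock")


def interleaved_demo():
    """CPU 1 wins the race, CPU 0's first store fails, then CPU 0 retries."""
    line = Line()
    v0 = line.load_linked(0)
    v1 = line.load_linked(1)
    ok1 = line.store_conditional(1, v1 + 1)  # succeeds and clears CPU 0's link
    ok0 = line.store_conditional(0, v0 + 1)  # fails: link was broken
    attempts = atomic_increment(line, 0)     # the retry succeeds uncontended
    return ok1, ok0, attempts, line.value
```

Calling interleaved_demo() returns (True, False, 1, 2): both increments take effect exactly once even though the first attempt by CPU 0 is rejected.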
Further Reading:
Hellestrand, G. R., Rapid Design of Software-Rich Chips: Executable Specification -> Realization, white paper, VaST Systems Technology Corp., Oct. 2002.