Thread programming II

My previous post was about threads, and this one continues the topic. The trend toward multi-core processors is expected to continue, and software has to be threaded to take full advantage of it. In other words, multithreading and multi-core go hand in hand.

Processor manufacturers are no longer simply raising the clock speed to improve performance, since that produces too much heat, draws much more power, and costs more. When a system is scaled out by adding cores instead, the requirements (cost, cooling, and heat) stay very close to those of a single processor.

Typically the clock speed of a dual-core processor is lower than that of a single-core CPU, so the performance gain can be marginal. Then why is this exciting? Modern processors can run up to 4 hardware threads (hyper-threads) per core. What if we have 32 cores? That will be 128 threads! In consumer-facing applications we could see the difference in games, virus scanners, and other processor-intensive multi-threaded applications. In the near future we may have thousands of cores in a single CPU, especially for server-side processing.

Multiprocessing has two models. If we dedicate one or more processors to a specific task (ex: the OS), the remaining processors can be used for user tasks. This is called asymmetric multiprocessing. The more popular model balances the load among the cores. This is called symmetric multiprocessing, and it can improve overall performance by up to 80% per processor. Why not 100%? Because the cores share the same system bus and memory bandwidth, and thread management adds overhead of its own.
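
To make the 80% figure concrete, here is a rough back-of-the-envelope sketch. It assumes one simple reading of that number: the first processor counts as 1.0 and every additional processor adds at most 0.8 of a processor's worth of work. The function name `estimated_speedup` and the linear model are my own illustration, not a measured result.

```python
def estimated_speedup(processors: int, gain_per_extra: float = 0.8) -> float:
    """Upper-bound speedup under an assumed linear model.

    The first processor contributes 1.0; each additional processor
    contributes gain_per_extra (0.8 = the "up to 80%" figure).
    """
    if processors < 1:
        raise ValueError("need at least one processor")
    return 1.0 + (processors - 1) * gain_per_extra


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        # Rounded only for display; the model itself is illustrative.
        print(n, "processors ->", round(estimated_speedup(n), 2), "x")
```

Under this assumed model, 4 processors give about 3.4x rather than 4x, and the gap to the ideal keeps growing as processors are added.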

Thread Design Considerations:
- Threads should conflict with each other as little as possible.

- Proper mapping of functional/Data decomposition into one or more threads:
The first step in designing a threaded system is to determine where the system spends most of its execution time. Next, find the computations that can be divided into independent tasks (functional decomposition). If a stream of instructions can be partitioned into groups (methods) that are independent of each other, we can assign a thread to each method/group. Data may be shared among the threads, but sharing should be kept to a minimum. If the computation involves a large dataset, we can instead divide the data into subsets (data decomposition), each with its associated computation, and assign a thread to each one. Note: in functional decomposition each thread may be assigned a different function, and the threads may share data. In data decomposition the threads run the same computation (ex: a loop) over different parts of the dataset.

- The critical region (if present) should be kept as small as possible, so that other waiting threads get to execute it sooner.

- We may try to apply an existing design pattern or derive one. Implementation involves selecting a level of abstraction (from assembly level to high level) and then selecting among the options available at that level. For example, at the high level the options range from runtime APIs (.NET Thread) to OS APIs (Win32) to auto-parallelization by compilers to language extensions (OpenMP) to internal thread libraries. We can even mix all of the above in a single application. The next step is debugging/testing, where non-deterministic errors such as race conditions and deadlocks should be uncovered. I believe finding/reproducing errors is the hardest task in thread programming. The final step is tuning for performance, which may lead to redesign.

- Once an application is coded for Hyper-Threading, no recoding is necessary for multi-core systems. Modern compilers may insert threading code.
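
The decomposition and critical-region points above can be sketched in a few lines. This is only a minimal illustration using Python's standard `threading` module. The function names `parallel_sum` and `min_and_max` are hypothetical, and the thread count is an arbitrary choice. Note also that in standard CPython builds the global interpreter lock keeps CPU-bound threads from running on several cores at once, so this shows the structure of the design rather than a real speedup.

```python
import threading


def parallel_sum(values, num_threads=4):
    """Data decomposition: every thread runs the same loop on its own chunk."""
    total = 0
    lock = threading.Lock()
    # Chunk size rounded up; at least 1 so range() below never gets step 0.
    chunk = max(1, (len(values) + num_threads - 1) // num_threads)

    def worker(part):
        nonlocal total
        local = sum(part)      # independent work, touches no shared state
        with lock:             # critical region: one short update only
            total += local

    threads = [
        threading.Thread(target=worker, args=(values[i:i + chunk],))
        for i in range(0, len(values), chunk)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def min_and_max(values):
    """Functional decomposition: each thread runs a different function."""
    results = {}

    def find_min():
        results["min"] = min(values)   # each thread writes its own key

    def find_max():
        results["max"] = max(values)

    threads = [threading.Thread(target=find_min),
               threading.Thread(target=find_max)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results["min"], results["max"]


if __name__ == "__main__":
    data = list(range(1, 101))
    print(parallel_sum(data))
    print(min_and_max(data))
```

Each worker in `parallel_sum` does its summing on a local variable and holds the lock only for the final addition, which keeps the critical region as small as the design notes ask for. The two threads in `min_and_max` share the input list read-only, so they do not conflict.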

 
