Tuesday, November 3, 2009

Patterns in Parallel Computing - Task Decomposition Pattern in a nut shell

In parallel programing we can find patterns that can use to achieve the parallelization. Here I listed the patterns that used in parallel programing.
  • Task Decomposition Pattern
  • Data Decomposition Pattern
  • Group Task Pattern
  • Order Task Pattern
  • Data Sharing Pattern
  • Design Evaluation Pattern
Task Decomposition in a nut shell...

When achieving task based parallelism programmer must aware about the following points
  • He must understands the which are the computationally intensive parts of the problem.
  • Key Data structures
  • How the data is used as the problem's solution unfolds.
Fundamentally every parallel algorithm involves a collection of tasks that can executes concurrently and the challenge is to find those tasks and crafts an algorithm that lets them run concurrently.It will be easy to achieve task based decomposition when it is set of independent task.

Lets' see what are the factors which influence in parallel design
  • Flexibility : Allow to adapted to different implementation requirements.
  • Efficiency : Parallel program must scale efficiently with the size of the parallel computer. Ex: Task decomposition needs to keep all the PEs busy.
  • Simplicity : Task decomposition needs to be complex enough to get the job done.
In a task based decomposition we look at the problem as a collection of distinct tasks paying particular attention to
  • The actions that are carried out to solve the problem.
  • Whether these actions are distinct and relatively independent.
Tasks can be found in many different places.
  • Each task corresponds to a distinct call to a function defining a task for each function call... leads to what is called a functional decomposition.
  • Distinct iterations of the loops with in an algorithm.
  • A large data structure is decomposed and multiple units of execution concurrently update different chunks of the data structure. In this case the tasks are those updates on individual chunks.
Matrix multiplication
When we consider a multiplication of two matrices A and B(C=A.B). We can produce a task based decomposition by considering the calculation of each element of the product as a separate task. Each task needs to access to one row of A and one column of B. All tasks are independent. Also all the data that is shared among the tasks A and B is read-only. That mean an implementation of a shared memory environment.

But this algorithm has a poor performance because the memory access time is slow compared to floating point arithmetic. Bandwidth of the memory subsystem would limit the performance.

But we can design a algorithm that maximize the reuse of data loaded into a processor's caches. This can be achieved by two different ways.
  • Using the Group Tasks Pattern: tasks which use similar sort of elements are grouped and run on the same UE.
  • Data Decomposition: Design the algorithm from the beginning around the way the matrices fit into the caches.
Most of the cases we will use the Task decomposition with the conjunction of Group Tasks pattern and Data decomposition.

0 comments:

Post a Comment

Custom Search

My Video Bar

Loading...