Difference between revisions of "Consolidate Domain Decomposition"
From MohidWiki
Revision as of 19:31, 30 November 2014
The program Consolidate Domain Decomposition uses the actor model to parallelize file processing tasks. For instance, suppose there are hydrodynamic and water quality files from 2 sub-domains:
- hyd_submod_1.hdf and wq_submod_1.hdf; and
- hyd_submod_2.hdf and wq_submod_2.hdf.
The desired consolidated files are hyd.hdf and wq.hdf, resulting from the concatenation of:
- hyd_submod_1.hdf and hyd_submod_2.hdf to produce hyd.hdf; and
- wq_submod_1.hdf and wq_submod_2.hdf to produce wq.hdf.
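The grouping of sub-domain files into consolidation targets can be sketched in a few lines of Python. This is only an illustration, not the program's actual code: the naming pattern `<name>_submod_<n>.hdf` is assumed from the example above, and `group_by_target` and `SUBMOD_PATTERN` are hypothetical names.

```python
import re
from collections import defaultdict

# Assumed naming pattern, inferred from the example: <name>_submod_<n>.hdf
SUBMOD_PATTERN = re.compile(r"^(?P<name>.+)_submod_(?P<index>\d+)\.hdf$")


def group_by_target(filenames):
    """Map each consolidated file name to its sub-domain files, ordered by sub-domain index."""
    groups = defaultdict(list)
    for filename in filenames:
        match = SUBMOD_PATTERN.match(filename)
        if match is None:
            continue  # not a sub-domain result file, skip it
        target = match.group("name") + ".hdf"
        groups[target].append((int(match.group("index")), filename))
    # Sort by the numeric sub-domain index and drop the index from the result
    return {target: [name for _, name in sorted(parts)] for target, parts in groups.items()}


files = ["hyd_submod_1.hdf", "wq_submod_1.hdf", "hyd_submod_2.hdf", "wq_submod_2.hdf"]
jobs = group_by_target(files)
```

Each key of `jobs` is then one consolidation job (one output file) together with the input files that must be concatenated to produce it.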
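Topology
Figure: diagram of the architecture for the consolidation of domain decomposition HDF result files.
There are 2 kinds of MPI processes running:
- a Parser; and
- 2 workers.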
The Parser identifies result files to be consolidated. It keeps 2 lists:
- a list of jobs to be done; and
- a list of workers.
The list of workers is used at the end, when there are no more pending jobs: a message is sent to every worker notifying it that it should terminate.
The list of jobs works like this:
- when a Worker is idle, it sends a request to the Parser;
- the Parser sends the first unfinished job to the Worker and moves it from the first to the last position of the list;
- when a Worker finishes a job, it sends a "processed" message to the Parser;
- the Parser removes the completed job from the list.
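This strategy for the jobs list provides fault tolerance: a job stays on the list until a Worker reports it as processed, so a job whose Worker fails before finishing is eventually handed out again.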
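The job-list behaviour described above can be illustrated with a small single-process Python sketch. It leaves out MPI messaging entirely and only models the list itself; `JobList`, `dispatch`, `complete` and `done` are hypothetical names, not taken from the program.

```python
from collections import deque


class JobList:
    """Pending jobs kept by the Parser (illustrative model, no MPI involved)."""

    def __init__(self, jobs):
        self.pending = deque(jobs)

    def dispatch(self):
        """Return the first unfinished job and move it to the last position."""
        if not self.pending:
            return None
        job = self.pending[0]
        self.pending.rotate(-1)  # first element goes to the end of the list
        return job

    def complete(self, job):
        """Remove a job once a Worker has reported it as processed."""
        if job in self.pending:
            self.pending.remove(job)

    def done(self):
        """True when no jobs are pending, i.e. Workers can be told to terminate."""
        return not self.pending


job_list = JobList(["hyd.hdf", "wq.hdf"])
first = job_list.dispatch()    # handed to one Worker
second = job_list.dispatch()   # handed to the other Worker
job_list.complete(second)      # only the second Worker reports back
retry = job_list.dispatch()    # the unreported job is handed out again
job_list.complete(retry)
```

In this model a job is never lost when a Worker fails silently, because only a "processed" report removes it. The trade-off is that a slow but healthy Worker may see its job dispatched a second time.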