Using ROOT PROOF

If you have a large amount of data to process, you can substantially speed up the data processing in your ROOT macros with the Parallel ROOT Facility (PROOF). Information on the PROOF system is presented here.
PROOF is part of the ROOT environment and is included in the ROOT distribution, so no additional software needs to be installed. It uses data-independent parallelism: because events are not correlated with each other, different events can be processed in parallel, which leads to good scalability. One of the most important properties of PROOF is transparency: the same program code can run both sequentially and in parallel. PROOF targets three parallel architectures. PROOF-Lite parallelizes data processing on a single multiprocessor or multi-core machine. PROOF parallelizes processing on heterogeneous computing clusters and in the GRID system.
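The idea of event-level parallelism can be illustrated with a minimal sketch in plain C++ (no ROOT required). It is not how PROOF is implemented: PROOF hands out packets of events to its workers dynamically, while the static split below is only an illustration. The names ProcessEvent, EventRange, SplitEvents, RunSequential and RunPartitioned are hypothetical, and the chunks are processed in an ordinary loop here, whereas in PROOF each chunk would go to a separate worker.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for per-event reconstruction: the result depends
// only on the event itself, never on any other event.
long ProcessEvent(int eventId) {
    return 2L * eventId + 1;
}

// Half-open range of event numbers [first, last) assigned to one worker.
struct EventRange {
    int first;
    int last;
};

// Split nEvents events starting at startEvent into nWorkers contiguous
// ranges whose sizes differ by at most one event.
std::vector<EventRange> SplitEvents(int startEvent, int nEvents, int nWorkers) {
    assert(nEvents >= 0 && nWorkers > 0);
    std::vector<EventRange> ranges;
    const int base = nEvents / nWorkers;
    const int extra = nEvents % nWorkers;
    int first = startEvent;
    for (int w = 0; w < nWorkers; ++w) {
        const int size = base + (w < extra ? 1 : 0);
        ranges.push_back({first, first + size});
        first += size;
    }
    return ranges;
}

// Sequential processing: a single loop over all events.
long RunSequential(int startEvent, int nEvents) {
    long sum = 0;
    for (int e = startEvent; e < startEvent + nEvents; ++e) {
        sum += ProcessEvent(e);
    }
    return sum;
}

// Partitioned processing: the same ProcessEvent code runs on every chunk,
// and the partial results are merged at the end. Because events are
// independent, each chunk could be handled by a different worker.
long RunPartitioned(int startEvent, int nEvents, int nWorkers) {
    long total = 0;
    for (const EventRange& r : SplitEvents(startEvent, nEvents, nWorkers)) {
        long partial = 0;
        for (int e = r.first; e < r.last; ++e) {
            partial += ProcessEvent(e);
        }
        total += partial;  // merge step
    }
    return total;
}
```

Since no event depends on another, RunPartitioned(0, 1000, 5) gives the same merged result as RunSequential(0, 1000), whatever the number of chunks. This is the property that lets the same per-event code be used with and without PROOF.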
PROOF support was added to the software of the experiment, and the reconstruction code was rewritten to follow the PROOF rules and classes. The last parameter of the reconstruction macro, run_type, has the default value 'local', which means sequential processing, i.e. without PROOF. There are two ways to use PROOF in the reconstruction: on the user's local multi-core machine or on a PROOF cluster:

  1. To parallelize event processing on your local multi-core machine with PROOF-Lite, pass the string "proof" as the last parameter. You can limit the number of workers with the "workers" option, e.g. for MpdRoot:

    $ root 'reco.C("file_to_process.root", "result_file.root", 0, 1000, "proof")'

    where 0 is the number of the first event and 1000 is the number of events to process. In this case PROOF parallelizes the processing of 1000 events, with the number of workers equal to the number of logical processors of your machine.
    In the following case PROOF parallelizes event processing with 5 workers, as given in the last parameter:

    $ root 'reco.C("file_to_process.root", "result_file.root", 0, 1000, "proof:workers=5")'

  2. To speed up event reconstruction on a PROOF cluster with a PROOF server, the last parameter specifies the server and can also limit the number of workers.
    In the following case PROOF parallelizes the processing of 1000 events on the NICA prototype cluster, with the number of processes equal to the number of logical processors allocated for PROOF:

    $ root 'reco.C("file_to_process.root", "result_file.root", 0, 1000, "proof:mpd@nc10.jinr.ru:21001")'

    To parallelize event processing with PROOF on the NICA prototype cluster using 15 workers (processes):

    $ root 'reco.C("file_to_process.root", "result_file.root", 0, 1000, "proof:mpd@nc10.jinr.ru:21001:workers=15")'

The reconstruction macro contains comment lines at the beginning with different examples of running the event reconstruction in parallel with PROOF.
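
The run_type values used in the examples follow a simple pattern: "local", "proof", or "proof:" followed by an optional server address and an optional "workers=N" part. The following plain C++ sketch shows one way such a string could be split into its parts. How the reconstruction macro actually parses run_type is not shown on this page, so RunConfig and ParseRunType are hypothetical names, and the rule "empty server means PROOF-Lite" is an assumption drawn only from the examples.

```cpp
#include <string>

// Hypothetical result of interpreting a run_type string.
struct RunConfig {
    bool useProof = false;  // false for "local" (sequential processing)
    std::string master;     // server address; empty is assumed to mean PROOF-Lite
    int workers = 0;        // 0 means "not given, use the default"
};

// Parse strings such as "local", "proof", "proof:workers=5",
// "proof:mpd@nc10.jinr.ru:21001" or
// "proof:mpd@nc10.jinr.ru:21001:workers=15".
// Note: std::stoi throws if the text after "workers=" is not a number.
RunConfig ParseRunType(const std::string& runType) {
    RunConfig cfg;
    const std::string prefix = "proof";
    if (runType.compare(0, prefix.size(), prefix) != 0) {
        return cfg;  // "local" or anything else: no PROOF
    }
    cfg.useProof = true;

    // Drop the "proof" prefix and the ':' that may follow it.
    std::string rest = runType.substr(prefix.size());
    if (!rest.empty() && rest[0] == ':') {
        rest.erase(0, 1);
    }

    // Extract the optional trailing "workers=N" part.
    const std::string key = "workers=";
    const std::string::size_type pos = rest.rfind(key);
    if (pos != std::string::npos) {
        cfg.workers = std::stoi(rest.substr(pos + key.size()));
        rest.erase(pos);
        if (!rest.empty() && rest.back() == ':') {
            rest.pop_back();
        }
    }

    // Whatever remains is treated as the server address (may be empty).
    cfg.master = rest;
    return cfg;
}
```

With this sketch, ParseRunType("proof:mpd@nc10.jinr.ru:21001:workers=15") yields the server "mpd@nc10.jinr.ru:21001" and 15 workers, while ParseRunType("proof") yields an empty server and the default number of workers.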

The use of PROOF in the software of the experiment is also described in this MS PowerPoint presentation. If you frequently run a time-consuming analysis macro and would like to parallelize it with PROOF, or if you have any questions about PROOF parallelization in our software, please email gertsen@jinr.ru.
