Concurrent Clustering#
Goal#
In an image covering a large region of the sky, identifying all the clusters can be a time-consuming operation. One possible approach to speed up this process is to explore different regions of the sky in parallel, using some kind of concurrency technology. All the applications we have developed so far are sequential, but all modern operating systems (Linux, macOS, Windows…) allow one application to run several tasks concurrently, potentially taking advantage of modern multi-core processors. This exercise is a brief introduction to multithreading and multiprocessing through asynchronous programming. It relies on the concurrent.futures library, which makes this rather easy.
A demonstration example#
Look at the example below.
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import concurrent.futures
import time


def add(value1, value2):
    time.sleep(2)
    return value1 + value2


exe = concurrent.futures.ProcessPoolExecutor()

time0 = time.time()

future_res1 = exe.submit(add, 10, 1)
future_res2 = exe.submit(add, 20, 2)

print(f"res1: {future_res1.result()}")
print(f"res2: {future_res2.result()}")

time1 = time.time()

print(f"time: {time1 - time0} seconds")
Instead of calling add() directly, the program submits those calls to the instance of ProcessPoolExecutor, which executes the function within a child process. The call to submit does not block the execution. Instead of getting the result directly, one gets a "future" result, which will be ready later. When one calls future_res1.result(), it blocks the execution until the result is actually available. The usual practice is therefore to first submit all the computing tasks, so that they can start running concurrently, and then to retrieve all the results.
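To make this pattern concrete, here is a minimal sketch (not part of the exercise code) that reuses the add() function above: all the tasks are submitted first, and the results are collected afterwards. The helper concurrent.futures.as_completed() can also be used when the results should be retrieved in completion order instead of submission order.

import concurrent.futures
import time


def add(value1, value2):
    time.sleep(2)
    return value1 + value2


# The __main__ guard is required on platforms that start child processes
# by spawning a fresh interpreter (Windows, macOS).
if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor() as exe:
        # All tasks are submitted first, so they start running concurrently.
        futures = [exe.submit(add, 10 * i, i) for i in range(1, 5)]
        # Only now do we block, collecting results in submission order.
        results = [future.result() for future in futures]
    print(results)  # expected: [11, 22, 33, 44]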
The example uses an instance of ProcessPoolExecutor, which launches multiple processes. Multiprocessing may or may not accelerate a program, depending on the time saved by using several cores simultaneously compared to the time lost creating the new processes. If the computing time is too small compared to the process creation time, the program will not speed up.
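As a hedged illustration (the exact numbers depend on your machine), a micro-benchmark along these lines usually shows the process pool losing against a plain loop when each individual task is tiny:

import concurrent.futures
import time


def tiny_task(x):
    # A task far too small to amortise the cost of the process pool.
    return x * x


if __name__ == "__main__":
    values = range(1000)

    start = time.time()
    sequential = [tiny_task(v) for v in values]
    print(f"sequential loop: {time.time() - start:.4f} s")

    start = time.time()
    with concurrent.futures.ProcessPoolExecutor() as exe:
        pooled = list(exe.map(tiny_task, values))
    print(f"process pool:    {time.time() - start:.4f} s")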
You can also try ThreadPoolExecutor, which launches multiple threads instead. Threads are lightweight processes, faster to create, but for many Python programs multithreading does not accelerate the execution, because Python suffers from a "Global Interpreter Lock" (GIL): only one thread at a time can interpret Python code. In the example above, there are not enough Python instructions for the GIL to be a problem (and time.sleep() releases it while waiting). In the steps below, multithreading may also prove efficient, because many time-consuming tasks are performed by NumPy, an external library written in C that does not need the Python interpreter and therefore does not suffer from the GIL.
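As a hedged illustration (assuming NumPy is installed, and keeping in mind that the actual speed-up depends on your machine and BLAS configuration), threads can help when the heavy work happens inside NumPy:

import concurrent.futures
import numpy as np


def heavy_numpy_task(size):
    # The matrix product runs inside NumPy/BLAS, which releases the GIL.
    matrix = np.random.rand(size, size)
    return float((matrix @ matrix).sum())


with concurrent.futures.ThreadPoolExecutor() as exe:
    futures = [exe.submit(heavy_numpy_task, 1000) for _ in range(4)]
    totals = [future.result() for future in futures]
print(totals)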
Implementation details#
This project can be started by duplicating your solution to exercise 4 into a new file called pja_concurrent_clustering.py. Add timing instructions, and run the code with the data file global.fits, which is significantly larger and takes longer to process. Save the output, so that you can check that your future modifications do not break the results.
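A possible way to wrap the timing, where process_file() is a hypothetical name standing for your exercise 4 entry point:

import time


def process_file(path):
    # Hypothetical placeholder standing for your exercise 4 clustering code.
    return []


time0 = time.time()
clusters = process_file("global.fits")
time1 = time.time()
print(f"time: {time1 - time0} seconds")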
If you find that processing the data file global.fits takes too long, a first optional optimization is to reimplement the convolution using the function https://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.signal.convolve2d.html. This should significantly speed up the convolution.
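A possible sketch, where image and kernel are placeholders for your FITS pixel array and your existing smoothing kernel:

import numpy as np
from scipy.signal import convolve2d

# Placeholders: in the exercise, image comes from the FITS file and kernel
# is the smoothing kernel you already use.
image = np.random.rand(512, 512)
kernel = np.ones((5, 5)) / 25.0

# mode="same" keeps the output the same shape as the input image.
smoothed = convolve2d(image, kernel, mode="same", boundary="fill", fillvalue=0.0)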
Then, the goal of this project is to use the library concurrent.futures in order to introduce concurrency into your clustering and take advantage of several hardware cores. Always check that your final numerical results stay identical.
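One possible structure (a sketch, not the required solution) is to split the image into horizontal bands and cluster each band in a separate process; find_clusters() is a hypothetical name standing for your existing clustering routine, and clusters sitting on a band boundary would still need special care.

import concurrent.futures
import numpy as np


def find_clusters(band):
    # Hypothetical placeholder standing for the clustering of one sub-image.
    return []


def cluster_concurrently(image, n_bands=4):
    # Split the pixel array into horizontal bands, one future per band.
    bands = np.array_split(image, n_bands, axis=0)
    with concurrent.futures.ProcessPoolExecutor() as exe:
        futures = [exe.submit(find_clusters, band) for band in bands]
        results = [future.result() for future in futures]
    # Flatten the per-band lists into a single list of clusters.
    return [cluster for band_clusters in results for cluster in band_clusters]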
Result#
Try to run sequentially, with a ThreadPoolExecutor, and with a ProcessPoolExecutor. Do you manage to speed up the execution?
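A hypothetical way to compare the three modes with the same unit of work, where process_band() stands for whatever piece of your pipeline you parallelised:

import concurrent.futures
import time


def process_band(band):
    # Hypothetical placeholder for one unit of clustering work.
    return []


def run_with(executor_class, bands):
    label = executor_class.__name__ if executor_class else "sequential"
    start = time.time()
    if executor_class is None:
        results = [process_band(band) for band in bands]
    else:
        with executor_class() as exe:
            results = list(exe.map(process_band, bands))
    print(f"{label}: {time.time() - start:.2f} s")
    return results

Calling run_with(None, bands), run_with(concurrent.futures.ThreadPoolExecutor, bands) and run_with(concurrent.futures.ProcessPoolExecutor, bands) then gives directly comparable timings.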