Python can be faster than MATLAB in some cases, for example when part of the program calls directly into C. If we want to do parallel computation in Python, we need to use the ‘multiprocessing’ package.
Very detailed introduction and examples can be found in this
There are many different ways of achieving multiprocessing (as opposed to multithreading, which is restricted by the GIL in Python). Here I show the simplest realization, using a “Pool”:
import multiprocessing as mp
import numpy as np

# use a function to define what we want to do
def fun(a, b, c, d):
    # do something here
    return a + b + c + d  # placeholder computation

if __name__ == '__main__':  # this guard is needed on Windows
    # number of processes you are going to use (example value)
    processnum = 4
    # placeholder inputs; replace them with your own parameters
    loop_mat, b, c, d = 10, 1, 2, 3
    pool = mp.Pool(processes=processnum)
    # there are many parameters and we can choose one of them,
    # for example "a", as the loop variable to be calculated in parallel
    results = [pool.apply_async(fun, args=(a, b, c, d)) for a in range(loop_mat)]
    # use get() to obtain each calculated value; it blocks until that result is ready
    value_1 = [p.get() for p in results]
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for the worker processes to exit
In the example above, the pool.apply_async function hands the tasks to the worker processes of the pool, which can run on different cores simultaneously.
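To see that the tasks really run in separate worker processes, here is a minimal sketch in which each task reports the process ID that executed it. The names which_process and run_tasks are made up for this illustration, and the task and worker counts are arbitrary.

```python
import multiprocessing as mp
import os


def which_process(task_id):
    # Return the task id together with the PID of the process that ran it.
    return task_id, os.getpid()


def run_tasks(n_tasks, n_workers):
    # Submit n_tasks jobs to a pool of n_workers processes and collect
    # the (task_id, pid) pairs in submission order.
    with mp.Pool(processes=n_workers) as pool:
        handles = [pool.apply_async(which_process, args=(i,)) for i in range(n_tasks)]
        return [h.get() for h in handles]


if __name__ == '__main__':
    for task_id, pid in run_tasks(n_tasks=8, n_workers=4):
        print(f"task {task_id} ran in process {pid}")
    print("parent process:", os.getpid())
```

The printed PIDs should differ from the parent's PID. How the tasks are spread over the workers depends on scheduling, so the exact pattern can change from run to run.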
It should be emphasized that parallel computing pays off when a single process is very slow, that is, when each task takes a long time on one core. We shouldn’t distribute many small jobs that each need only a very short time to complete, because then most of the time will be spent distributing the work.
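One common way around the small-job problem is to group the small jobs into batches, so that each call to a worker does a meaningful amount of work. The sketch below is only an illustration: small_job, run_batch, make_batches and the batch size of 10000 are all assumptions and are not taken from any real workload.

```python
import multiprocessing as mp


def small_job(x):
    # A deliberately tiny task, far too cheap to be worth one dispatch each.
    return x * x


def run_batch(batch):
    # Process a whole batch inside one worker call, so the dispatch cost
    # is paid once per batch instead of once per item.
    return [small_job(x) for x in batch]


def make_batches(items, batch_size):
    # Split a list into consecutive slices of at most batch_size items.
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


if __name__ == '__main__':
    items = list(range(100000))
    with mp.Pool(processes=4) as pool:
        handles = [pool.apply_async(run_batch, args=(b,))
                   for b in make_batches(items, 10000)]
        # Flatten the per-batch lists back into one list, in the original order.
        squares = [y for h in handles for y in h.get()]
    print(len(squares), squares[:5])
```

Pool.map also accepts a chunksize argument that does similar grouping automatically. The right batch size depends on how expensive each job is and is best found by timing.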
In many cases when we use multiprocessing we want to know the progress, and showing it is not an easy task. We may first think of printing something during the process. However, the printed content only shows up when all the results have been joined and returned. After searching on the internet, I found the solution.
The answer from Zeawoas shows the correct use:
for anybody looking for a simple solution working with Pool.apply_async():
from multiprocessing import Pool
from tqdm import tqdm
from time import sleep

def work(x):
    sleep(0.5)
    return x**2

n = 10

p = Pool(4)
pbar = tqdm(total=n)
res = [p.apply_async(work, args=(
    i,), callback=lambda _: pbar.update(1)) for i in range(n)]
results = [p.get() for p in res]
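If tqdm is not installed, the same callback idea can be sketched with the standard library only. The Progress class below is a hypothetical stand-in for the tqdm bar and is not part of the quoted answer. It relies on the fact that the apply_async callback runs in the parent process, so a plain counter is enough. The `if __name__ == '__main__':` guard is added here because, as noted earlier, it is needed on Windows.

```python
import multiprocessing as mp
import sys
from time import sleep


class Progress:
    # Minimal text progress counter (hypothetical helper, stands in for tqdm).
    def __init__(self, total):
        self.total = total
        self.done = 0

    def update(self, _result=None):
        # Called by the pool, in the parent process, each time a task finishes.
        self.done += 1
        sys.stdout.write(f"\r{self.done}/{self.total} tasks finished")
        sys.stdout.flush()


def work(x):
    sleep(0.5)
    return x ** 2


if __name__ == '__main__':
    n = 10
    progress = Progress(total=n)
    with mp.Pool(4) as pool:
        res = [pool.apply_async(work, args=(i,), callback=progress.update)
               for i in range(n)]
        results = [r.get() for r in res]
    print()
    print(results)
```

The callback should stay short, because it blocks the thread that handles incoming results while it runs.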