10. parallel
Functions for parallel computations on a single multicore machine using the standard library multiprocessing.
The focus is not on programming details but on how to speed up some computations.
- If your computation is already fast (e.g. <1 s), go on without parallelisation. In the optimal case the speedup equals the number of CPU cores.
- If you want to use a cluster with all its CPUs, this is not the way (you need MPI).
Parallelisation is no magic, and this module is a convenience for non-specialists in parallel computing. The main idea is to pass additional parameters to the processes (a pool of workers) and loop only over one parameter given as a list. Opening and closing of the pool is hidden inside the function. In this way we can use all CPUs of a multicore machine.
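As an illustration, a minimal sketch of the underlying pattern these functions wrap (not the actual implementation; f, a and b are placeholders):

import multiprocessing as mp
from functools import partial

def f(q, a, b):
    # some computation depending on the loop value q and fixed parameters a, b
    return q * a + b

if __name__ == '__main__':
    qlist = range(10)
    # open a pool using all CPU cores
    with mp.Pool(mp.cpu_count()) as pool:
        # fix the additional parameters; the pool loops over qlist in parallel
        results = pool.map(partial(f, a=2.0, b=1.0), qlist)
    # the pool is closed on leaving the with-block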
During testing I found that shared memory does not really speed things up if we just want to calculate a function, e.g. for a list of different Q values depending on model parameters. Here the pickling of numpy arrays is efficient enough compared to the computation we do. The amount of data pickled should not be too large, as each process gets a copy and pickling needs time.
If speed is an issue and shared memory becomes important, I advise using Fortran with OpenMP, as is done for ff.cloudScattering with parallel computation and shared memory. For me this was easier than the various other solutions around.
We use only unmodified input data here and return a new dataset, so we do not need to care about what happens if one process changes data needed in another process (race conditions, …); the data is not shared anyway. Please keep this in mind and don't complain if you find a way to modify input data.
For easier debugging (to find the position of an error in the pdb debugger) use the option debug. In this case multiprocessing is not used and the debugger locates the error correctly.
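For example (a minimal sketch; the model function and its parameters are placeholders):

import jscatter as js

def model(q, a):
    return [q, q * a]

# with debug>0 the loop runs serially in the calling process,
# so pdb stops at the true error location inside model
res = js.parallel.doForList(model, looplist=[0.1, 0.2, 0.3], a=2, debug=1)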
See the example in doForList.
Parallel functions
doForList(funktion, looplist, *args, **kwargs)
    Calculates funktion for values in looplist in a pool of workers in parallel using multiprocessing.
doForQlist(funktion, qList, *args, **kwargs)
    Calculates funktion for qList in a pool of workers using multiprocessing.
psphereAverage(funktion[, relError])
    Parallel evaluation of the spherical average of funktion.
Helper functions
randomPointsOnSphere(NN[, r, skip])
    N quasi-random points on a sphere of radius r, based on a low-discrepancy sequence.
randomPointsInCube(NN[, skip, dim])
    N quasi-random points in a cube of edge 1, based on a low-discrepancy sequence.
rphitheta2xyz(RPT)
    Transformation of spherical coordinates [r,phi,theta] to cartesian coordinates [x,y,z].
fibonacciLatticePointsOnSphere(NN[, r])
    Fibonacci lattice points on a sphere with radius r (default r=1).
haltonSequence(size, dim[, skip])
    Quasi-random numbers from the Halton sequence in the interval [0,1].
jscatter.parallel.doForList(funktion, looplist, *args, **kwargs)
Calculates funktion for values in looplist in a pool of workers in parallel using multiprocessing.
Like multiprocessing map_async, but all given arguments are distributed automatically.
Parameters:
- funktion : function
  Function to process with arguments (looplist[i], *args, **kwargs). The return value of funktion should contain the parameters, or at least the loopover value, to allow a check if desired.
- looplist : list
  List of values to loop over.
- loopover : string or int, optional
  Name of the argument to loop over with the values in looplist. If not given, the first argument is used; it should then not be passed explicitly.
- ncpu : int, optional
  Number of cpus in the pool:
  - not given or 0 -> all cpus are used
  - int > 0 -> min(ncpu, mp.cpu_count())
  - int < 0 -> number of cpus not to use
- cb : function, optional
  Callback after each calculation.
- debug : int
  debug > 0 allows serial output for testing.
Returns:
- list : list of function return values as [result1, result2, …]
  The order of the return values is not guaranteed to match the order of looplist.
Notes
The return array of funktion may be prepended with the value looplist[i] as reference, e.g.:
def f(x, a, b, c, d):
    return [x, x + a + b + c + d]
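Since the order of results is not guaranteed (see Returns), this prepended reference can be used to restore the looplist order; a minimal sketch:

import jscatter as js
import numpy as np

def f(x, a):
    return [x, x + a]

results = js.parallel.doForList(f, looplist=np.arange(10), a=1)
# sort by the prepended loop value to restore the order of looplist
results = sorted(results, key=lambda r: r[0])
resultarray = np.array(results)  # column 0: loop values, column 1: f values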
Examples
import jscatter as js
import numpy as np

def f(x, a, b, c, d):
    res = x + a + b + c + d
    return [x, res]

# loop over first argument, here x
res = js.parallel.doForList(f, looplist=np.arange(100), a=1, b=2, c=3, d=11)
# loop over 'd' ignoring the given d=11 (which can be omitted here)
res = js.parallel.doForList(f, looplist=np.arange(100), loopover='d', x=0, a=1, b=2, c=3, d=11)
jscatter.parallel.doForQlist(funktion, qList, *args, **kwargs)
Calculates funktion for qList in a pool of workers using multiprocessing.
Calculates [funktion(Qi, *args, **kwargs) for Qi in qList] in parallel. The return value of funktion will contain the value Qi as reference.
Parameters:
- funktion : function
  Function to process with arguments (qList[i], *args, **kwargs).
- qList : list
  List of values for the first argument of funktion. The qList value prepends the arguments args.
- ncpu : int, optional
  Number of cpus in the pool:
  - not given or 0 -> all cpus are used
  - int > 0 -> min(ncpu, mp.cpu_count())
  - int < 0 -> number of cpus not to use
- cb : function, optional
  Callback after each calculation.
- debug : int
  debug > 0 allows serial output for testing.
Returns:
- list : ndim is function_return.ndim + 1
  The list elements are prepended with the value qList[i] as reference.
Examples
import jscatter as js
import numpy as np

def f(x, a, b, c, d):
    return [x + a + b + c + d]

# loop over the first argument, here x
js.parallel.doForQlist(f, qList=np.arange(100), a=1, b=2, c=3, d=11)
jscatter.parallel.fibonacciLatticePointsOnSphere(NN, r=1)
Fibonacci lattice points on a sphere with radius r (default r=1).
This can be used to integrate efficiently over a sphere with well distributed points.
Parameters:
- NN : integer
  Number of points generated is 2*NN+1.
- r : float, default 1
  Radius of the sphere.
Returns:
- list of [r,phi,theta] triples in radians
  phi: azimuth, -pi < phi < pi; theta: polar angle, 0 < theta < pi
References
[1] Á. González, Measurement of Areas on a Sphere Using Fibonacci and Latitude–Longitude Lattices, Mathematical Geosciences 42, 49-64 (2009).
Examples
import jscatter as js
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

points = js.formel.fibonacciLatticePointsOnSphere(1000)
pp = list(filter(lambda a: (a[1] > 0) & (a[1] < np.pi/2) & (a[2] > 0) & (a[2] < np.pi/2), points))
pxyz = js.formel.rphitheta2xyz(pp)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(pxyz[:, 0], pxyz[:, 1], pxyz[:, 2], color="k", s=20)
ax.set_xlim([-1, 1])
ax.set_ylim([-1, 1])
ax.set_zlim([-1, 1])
ax.set_aspect("equal")
plt.tight_layout()
plt.show(block=False)

points = js.formel.fibonacciLatticePointsOnSphere(1000)
pp = list(filter(lambda a: (a[2] > 0.3) & (a[2] < 1), points))
v = js.formel.rphitheta2xyz(pp)
R = js.formel.rotationMatrix([1, 0, 0], np.deg2rad(-30))
pxyz = np.dot(R, v.T).T
# points in polar coordinates
prpt = js.formel.xyz2rphitheta(np.dot(R, pxyz.T).T)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(pxyz[:, 0], pxyz[:, 1], pxyz[:, 2], color="k", s=20)
ax.set_xlim([-1, 1])
ax.set_ylim([-1, 1])
ax.set_zlim([-1, 1])
ax.set_aspect("equal")
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.tight_layout()
plt.show(block=False)
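The examples above only visualise the lattice; as a sketch of the integration use mentioned in the description, the mean over the lattice points approximates a spherical average (here of z², whose exact average over the unit sphere is 1/3):

import jscatter as js
import numpy as np

points = js.formel.fibonacciLatticePointsOnSphere(1000)  # [r,phi,theta] triples
xyz = js.formel.rphitheta2xyz(points)                    # convert to cartesian
average = (xyz[:, 2]**2).mean()                          # spherical average of z**2
print(average)                                           # close to 1/3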
jscatter.parallel.haltonSequence(size, dim, skip=0)
Quasi-random numbers from the Halton sequence in the interval [0,1].
To use them as coordinate points, transpose the array.
Parameters:
- size : int
  Number of samples from the sequence.
- dim : int
  Number of dimensions.
- skip : int
  Number of points to skip in the Halton sequence.
Returns:
- array
References
[1] https://mail.python.org/pipermail/scipy-user/2013-June/034741.html; authors Sebastien Paris and Josef Perktold (translation from C).
[2] https://en.wikipedia.org/wiki/Low-discrepancy_sequence
Examples
import jscatter as js
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for i, color in enumerate(['b', 'g', 'r', 'y']):
    # create Halton sequence (skipping already used points) and shift it to the needed shape
    pxyz = js.parallel.haltonSequence(400, 3, skip=400*i).T * 2 - 1
    ax.scatter(pxyz[:, 0], pxyz[:, 1], pxyz[:, 2], color=color, s=20)
ax.set_xlim([-1, 1])
ax.set_ylim([-1, 1])
ax.set_zlim([-1, 1])
ax.set_aspect("equal")
plt.tight_layout()
plt.show(block=False)
jscatter.parallel.psphereAverage(funktion, relError=300, *args, **kwargs)
Parallel evaluation of the spherical average of funktion.
A Fibonacci lattice or Monte Carlo integration with a pseudo-random grid is used.
Parameters:
- funktion : function
  Function to evaluate. The function's first argument gets the cartesian coordinates [x,y,z] of a point on the unit sphere.
- relError : float, default 300
  Determines how the points on the sphere are selected:
  - >1: Fibonacci lattice with relError*2+1 points.
  - 0<relError<1: pseudo-random points on the sphere (see randomPointsOnSphere). Stops if the relative improvement of the mean is less than relError (using steps of 40 new points). The final error is (stddev of N points)/sqrt(N), as for Monte Carlo methods, even if it is not a correct 1-sigma error in this case.
- args, kwargs :
  Forwarded to funktion.
Returns:
- array-like with the values from funktion and the error appended
Notes
- Works also on single-core machines.
- For integration over a continuous function, such as a form factor in scattering, the random points are not statistically independent. Think of neighbouring points on an isosurface: they are correlated, and therefore the standard deviation is biased. In this case the Fibonacci lattice is the better choice, as the standard deviation of a random sample is then not a measure of the error but rather a measure of the variation over the isosurface.
Examples
import jscatter as js

def f(x, r):
    return [js.formel.xyz2rphitheta(x)[1:].sum() * r]

js.parallel.psphereAverage(f, relError=500, r=1)
js.parallel.psphereAverage(f, relError=0.01, r=1)
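As a minimal check against an analytically known case (assuming the documented cartesian [x,y,z] input): the spherical average of z² over the unit sphere is exactly 1/3, so the first returned value should be close to 1/3, with the error appended:

import jscatter as js

def f(xyz):
    # z**2 evaluated on the unit sphere; the exact spherical average is 1/3
    return [xyz[2]**2]

print(js.parallel.psphereAverage(f, relError=500))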
jscatter.parallel.randomPointsInCube(NN, skip=0, dim=3)
N quasi-random points in a cube of edge 1, based on a low-discrepancy sequence.
For numerical integration quasi-random numbers are better than random samples, as the error drops faster [1] (see the sketch after the example below). Here the Halton sequence is used to generate the points. Skipping points makes the sequence additive without repeating points.
Parameters:
- NN : int
  Number of points to generate.
- skip : int
  Number of points to skip in the Halton sequence.
- dim : int, default 3
  Dimension of the cube.
Returns:
- array of [x,y,z]
References
[1] https://en.wikipedia.org/wiki/Low-discrepancy_sequence
Examples
# random cubes of random points in cube
import jscatter as js
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
N = 30
cubes = js.parallel.randomPointsInCube(20) * 3
for i, color in enumerate(['b', 'g', 'r', 'y', 'k'] * 3):
    points = js.parallel.randomPointsInCube(N, skip=N*i).T
    pxyz = points * 0.3 + cubes[i][:, None]
    ax.scatter(pxyz[0, :], pxyz[1, :], pxyz[2, :], color=color, s=20)
ax.set_xlim([0, 3])
ax.set_ylim([0, 3])
ax.set_zlim([0, 3])
ax.set_aspect("equal")
plt.tight_layout()
plt.show(block=False)
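A minimal sketch of the faster error decay mentioned above, integrating x·y·z over the unit cube (exact value 1/8) with Halton points versus numpy pseudo-random points:

import jscatter as js
import numpy as np

# integrate f(x,y,z) = x*y*z over the unit cube; the exact value is (1/2)**3 = 1/8
for N in [100, 1000, 10000]:
    quasi = js.parallel.randomPointsInCube(N)   # N x 3 quasi-random points
    pseudo = np.random.rand(N, 3)               # N x 3 pseudo-random points
    Iq = quasi.prod(axis=1).mean()
    Ip = pseudo.prod(axis=1).mean()
    # the quasi-random error should drop faster with N
    print(N, abs(Iq - 0.125), abs(Ip - 0.125))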
jscatter.parallel.randomPointsOnSphere(NN, r=1, skip=0)
N quasi-random points on a sphere of radius r, based on a low-discrepancy sequence.
For numerical integration quasi-random numbers are better than random samples, as the error drops faster [1]. Here the Halton sequence is used to generate the points. Skipping points makes the sequence additive without repeating points.
Parameters:
- NN : int
  Number of points to generate.
- r : float
  Radius of the sphere.
- skip : int
  Number of points to skip in the Halton sequence.
Returns:
- array of [r,phi,theta] triples in radians
References
[1] https://en.wikipedia.org/wiki/Low-discrepancy_sequence
Examples
import jscatter as js
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for i, color in enumerate(['b', 'g', 'r', 'y']):
    points = js.parallel.randomPointsOnSphere(400, skip=400*i)
    points = points[points[:, 1] > 0, :]
    pxyz = js.formel.rphitheta2xyz(points)
    ax.scatter(pxyz[:, 0], pxyz[:, 1], pxyz[:, 2], color=color, s=20)
ax.set_xlim([-1, 1])
ax.set_ylim([-1, 1])
ax.set_zlim([-1, 1])
ax.set_aspect("equal")
plt.tight_layout()
plt.show(block=False)
jscatter.parallel.rphitheta2xyz(RPT)
Transformation of spherical coordinates [r,phi,theta] to cartesian coordinates [x,y,z].
Parameters:
- RPT : array Nx3
  Nx3 array of [r,phi,theta] coordinates with
  r : float, length
  phi : float, azimuth, -pi < phi < pi
  theta : float, polar angle, 0 < theta < pi
Returns:
- Array with the same dimension as RPT.
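A minimal usage sketch, assuming the conventional mapping x = r·sinθ·cosφ, y = r·sinθ·sinφ, z = r·cosθ implied by the angle definitions above:

import jscatter as js
import numpy as np

rpt = np.array([[1, 0, 0],               # north pole        -> [0, 0, 1]
                [1, 0, np.pi/2],         # equator, phi=0    -> [1, 0, 0]
                [1, np.pi/2, np.pi/2]])  # equator, phi=pi/2 -> [0, 1, 0]
print(js.parallel.rphitheta2xyz(rpt))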