ghPython – New component and parallel modules
Just in time for Christmas… ghPython 0.6.0.3 was released this week and it has two new features that I’m really excited about.
A little background
David Rutten was visiting the McNeel Seattle office in November to discuss future work on Grasshopper and Rhino. When David is in town it always gives me the chance to brainstorm with him about features that users ask for. Two requests we commonly hear are “how can I do what X component does, but through RhinoCommon/code?” and “how can I improve performance on my computer with many CPUs?”
Out of these chats came the two major new features in ghPython 0.6.0.3: the ability to call components from Python, and an easy way to run a function over a list using multiple threads. ghPython 0.6.0.3 ships with a new package (ghpythonlib) that supports these two new features.
Components As Functions (node-in-code)
There is a module in ghpythonlib called components which attempts to make every component available in python in the form of an easy to call function. Here’s a sample to help paint the picture.
import ghpythonlib.components as ghcomp

# call Voronoi component with input points
curves = ghcomp.Voronoi(points)
# call Area component with curves from Voronoi
centroids = ghcomp.Area(curves).centroid
Notice that the above sample is just three lines of script (plus two lines of comments to help describe what is happening).
Here is a sample gh file
Of course you can mix in other python to perform calculations on the results of the component function calls. I tweaked the above example to find the curve generated from Voronoi that has the largest area.
import ghpythonlib.components as ghcomp

curves = ghcomp.Voronoi(points)
areas = ghcomp.Area(curves).area

# find the biggest curve in the set
max_area = -1
max_curve = None
for i, curve in enumerate(curves):
    if areas[i] > max_area:
        max_area = areas[i]
        max_curve = curve
Remember, this can be done for almost every component in Grasshopper (including every installed add-on component). I use the term “almost” because there are cases where the function call doesn’t make sense. These cases are for things like Kangaroo or timers, where the state of the component is important between iterations. Fortunately this is pretty rare.
Along with the new functionality that this provides, I also found myself simplifying existing gh definition files by simply lumping together a bunch of related components into a single python script.
Use those CPUs
Along with components is another module in ghpythonlib called parallel. This module has a single function called “run”, which takes a list of data and a function that should be called for each item in the list. The run function calls your function on as many threads as there are processors in your computer and then properly collects the results, so you get a list of return values in the same order as the input list. The return value is whatever your custom function returns. I could show how this is done with the previous samples, but those already run so fast that there is no need to multithread them. Instead I put together a sample that typically takes around a second to complete on my computer: slicing a brep with 100 planes.
import ghpythonlib.components as ghcomp
import ghpythonlib.parallel

# custom function that is executed by parallel.run
def slice_at_angle(plane):
    result = ghcomp.BrepXPlane(brep, plane)
    if result:
        return result.curves

if parallel:
    slices = ghpythonlib.parallel.run(slice_at_angle, planes, True)
else:
    slices = ghcomp.BrepXPlane(brep, planes).curves
In the above image I’m passing the variable called parallel into the python script with a value of false. This makes the code execute on a single thread and, as you can see from the profiler, the performance is the same in the python script as it is just using the BrepXPlane component (which is expected).
Now when I toggle the input parallel variable to a value of true, the parallel.run function is executed. This function calls my custom slice_at_angle function 105 times, each time passing in a single plane and all on multiple threads. On my computer with 4 CPUs the execution time drops from one second to 313 milliseconds! A 3X speed boost by just adding a couple lines of script.
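ghpythonlib.parallel is only available inside Rhino, but the fan-out/collect idea it implements can be sketched with just the standard library. The snippet below is a stand-in, not the actual implementation: a thread pool calls a function once per input item and Pool.map collects the results in input order, just like parallel.run does. The slow_square function is a placeholder for per-item work such as a brep/plane intersection.

```python
# Sketch of the parallel.run pattern using only the standard library.
# multiprocessing.dummy provides a thread-based Pool; this is a stand-in
# for ghpythonlib.parallel, not its real implementation.
from multiprocessing.dummy import Pool  # thread-based pool


def slow_square(x):
    # placeholder for per-item work such as a Brep/plane intersection
    return x * x


def run(function, data_list):
    pool = Pool()  # defaults to one worker thread per CPU
    try:
        # Pool.map preserves input order in its result list
        return pool.map(function, data_list)
    finally:
        pool.close()
        pool.join()


results = run(slow_square, range(10))
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Because Pool.map keeps results in input order, the caller never has to worry about which thread finished first.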
Give this new build of ghPython a try. I’m sure there will be questions and probably a bug or two to fix, but it gets fun pretty fast once you get the hang of it.
This is AWESOME!
How do you handle component output variables with spaces in the name?
The autocomplete for functions should provide a description of the input and output names. The results are in the form of a named tuple, which means you can access the output using indices like a regular tuple, or using names as I show in my samples.
Ah, perfect. thanks!
Thank you!!!!!!!!!!!
Brilliant
Hi Steve,
A great innovation !
But do I need to install some other stuff besides the ghpython.gha? The autocomplete does not work and the example file above results in this error:
Runtime error (FormatException): Guid should contain 32 digits with 4 dashes (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx).
Traceback:
line 105, in __build_module, “C:\Users\sander\AppData\Roaming\McNeel\Rhinoceros\5.0\Plug-ins\IronPython (814d908a-e25c-493d-97e9-ee3861957f49)\settings\lib\ghpythonlib\components.py”
line 126, in , “C:\Users\sander\AppData\Roaming\McNeel\Rhinoceros\5.0\Plug-ins\IronPython (814d908a-e25c-493d-97e9-ee3861957f49)\settings\lib\ghpythonlib\components.py”
line 1, in script
thx
S
Crossposted this to the GH forum; a previous post from Giulio makes me think a botched SR6 install is the culprit.
Reinstalling all of Rhino fixed this.
That’s a great improvement with the access to the gh component!!
Thanks Steve! Great work!
Awesome!
I am confused with something:
The “run” function takes exactly 3 arguments: the name of the function, an argument of that function (the one that contains a list), and a flatten boolean. The reason why you defined the function “slice_at_angle” with only one argument (plane) and just used “brep” as a global variable inside it is because:
If you defined “slice_at_angle” with two arguments (brep, plane), both of these would have to be supplied to the “run” function, as the “run” function represents a call to the “slice_at_angle” function. And currently the run function only supports working on one list, not two? Meaning: the second argument of “slice_at_angle” could not be passed?
I understand that it is pointless to use the “brep” input data for multi-threading as it is just a single surface, not a list. But what happens when we have a grasshopper component that has two or three inputs, each consisting of a list? Then only one of those inputs could be passed as an argument to the “run” function, right? And again we would define our function (in this case “slice_at_angle”) with one argument too?
Does this make any sense?
The run function doesn’t really know anything specific about Grasshopper. All it does is take an input list and a function; for every item in the input list, it calls the function and collects the result. The function receives each item in the list as the argument. The list could be a list of tuples, and in that case each tuple would be passed to the function.
If you used a different list, say
items = [(brep, plane) for plane in planes]
Then (brep, plane) would be passed as the input argument to the slice function. I hope that makes sense.
What am I doing wrong?
Here’s a better example of what I mean. I’ll use a gist since replying with python scripts here is pretty ugly
Thank you
Hi Steve,
I was wondering if I could manage myself to adapt the code to some quite cpu consuming functions we use here, but already stumbled over the first one:
cap a brep.
Could you give me a hint what’s wrong?
It works for “non-parallel” use, but I get a “line 10” error, “brep is not iterable”.
Could you post this question at http://discourse.mcneel.com instead? Answering questions with python samples works WAY better over there than here on this wordpress blog. Thanks.
Are you just passing a single brep into the component or a list of breps? I would also use different variable names so it is clearer what the code is doing. Maybe the input should be “breps”, which is what you would pass to your parallel.run function, and then leave the add_caps_for_open_breps function the same because you are working with a single brep from the list.
Here’s what I came up with (and I get about a 3X speed boost):
hey steve
Trying to use the BlendCurve component with ghpythonlib, I get no error messages, everything seems fine, but I get a null result.
Following up on my previous question on the BlendCurve component: what am I doing wrong?
import ghpythonlib.components as ghcomp
output = ghcomp.BlendCurve(curveA,curveB,1.0,1.0,2)
Are you setting the Type Hints on the input to curve?
many thanks steve. that solved the issue. d
I have a stupid question: why aren’t all the gh components that would benefit from running parallel like this?
🙂 Really happy to see this, perfect reason to jump on python over the holidays.
They probably eventually will in Grasshopper 2. What we have right now seems like a great way to prototype multi-threading component code and to find any bottlenecks or bugs in the system.
Hi Steve,
This is very interesting! Thanks. I do have a question though. In your example 2, when I duplicate the sphere (say 10 times) and graft the results, the Python component in parallel mode is actually three times slower than the original component. Do you get the same behavior?
Hi Ramon,
Can you email me the gh definition that you are testing with so I can see exactly what you are testing?
Thanks
Steve, loving this new functionality! I’m having a little trouble with the ghcomp.Treestatistics() command as it doesn’t seem to return a list of data trees as the Grasshopper Component would. Can you confirm that it’s working on your end?
Thanks
Hi Steve,
I’m really excited to try out the new parallel processing feature of ghpython, and I actually have a case where its application would be perfect. I need to intersect a lot of little mesh cubes with rays, which is a component that takes 8 minutes!!! to run in my Grasshopper script (99% of the total script’s processing time).
I’m trying to parallelize the MeshRay function, but have had limited success. The parallel python component I wrote performs almost twice as slowly as the native Grasshopper one. I adapted a little test script (https://www.dropbox.com/s/dnsu4w5x66c4rar/ghpython_MT_JS.gh) with a much simpler mesh, which shows a performance decrease from the native grasshopper MeshRay. I would have thought that dramatically increasing the mesh complexity would show dividends from using the parallel Python component, but it only seems to get worse.
Do you know if there’s something I’m doing incorrectly?
Thanks very much for your help! I’ve been over it many times 😦
Hi steve
I have an array of hourly light measurements throughout a year.
This gives me an array with {0} .. {1000} as my measure points and (n)=8760 as the length of each list in the array.
Here’s the question, because this computer could really benefit from multicore here.
How can I split up an operation when looking at those numbers? Let’s say find the maximum value in each point, or find the hour with the highest value of all my numbers summed up.
Is there any list of the class names of the GH components that can be called from python? I really need to get the names of the other intersection components so I can execute and call them from a python script, if it is possible. Cheers!
What intersection components are you referring to? You should get a list of what is available when you hit the period in your script and see an autocomplete list.