Parallel processing creates errors in BehaviorSpace output

This is a report for NetLogo 6.4, running on Macs and PCs.

A colleague and I have written a procedure to output patch information to a CSV file for use in BehaviorSpace experiments (copied below). This works most of the time, but sometimes, when a new run starts (I assume this is the point at which output from runs on parallel processors is patched together), extra spaces, characters, or newlines are inserted. This does not happen every time the run changes, but it does happen sometimes.

Here is an example of two lines from the CSV output. The first line is correct; the second one is not. behaviorspace-run-number is the last number in each line. The second line switches from run 3 to run 5.

-1,-8,10,1,12,14,18,3,
10,-2,-4,10,1,12,19,16,5,

Extra commas are added to the end of both lines, and a spurious 10 is added to the beginning of the second line. One run ends at this point and a new run starts.

Here is another example:

-1,4,10,5,12,21,14,8
-1,10,10,5,16,17,8,814,15,16,5

No extra commas are added, but a spurious 14,15,16,5 is appended to the end of the second line.

This seems to be a serious bug that produces incorrect output with parallel processing.

Here is our output code:
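
(A minimal sketch rather than the verbatim procedure; elevation stands in for our actual patch variables.)

    to save-patches
      ;; append one row per patch to a shared CSV file at the end of each run,
      ;; writing the header row only if the file does not exist yet
      let new-file? not file-exists? "patch-output.csv"
      file-open "patch-output.csv"  ;; opens for appending if the file already exists
      if new-file? [ file-print "pxcor,pycor,elevation,run" ]
      foreach sort patches [ p ->
        file-print (word [pxcor] of p "," [pycor] of p ","
          [elevation] of p "," behaviorspace-run-number)
      ]
      file-close
    end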

Hi Michael,

Each BehaviorSpace run happens on a separate thread and is therefore subject to race conditions on shared resources such as files. Internally, this issue has been resolved, which is why we provide the various output-file options for BehaviorSpace experiments. Is there any reason why you can’t include this patch data in the metrics section of your experiment? It appears to me that the formatting you’re doing here could be done automatically using the spreadsheet and lists output files. Let me know if anything needs to be clarified or if I misunderstood the problem.

Isaac B.

P.S.: This doesn’t make a difference if using the provided output files works for you, but it is worth noting that your assumption that parallel runs are patched together at the start of each new run is incorrect. Runs are only patched together at the end of the experiment.

Unfortunately, AFAICT it cannot. We want a structured CSV file for analysis, in which each patch is a row and each patch variable is a column. I’d hoped that the new lists format might generate that, but it does not seem to be able to.

BehaviorSpace is mostly designed to provide output in which each row is the state of the model at each step, or at the end of an individual run. We need each row to be the state of an individual agent (patches, in our case).

export-world sort of gets at this, but it includes everything, requiring a lot of cutting and pasting, and it can’t be set to export only one agent type. And I can’t think of how export-world could be included in a BehaviorSpace experiment with parameter sweeps and repeated runs.

I’m open to suggestions, however.

Re the P.S.: something is producing extraneous characters at the ends of runs when we run in parallel, and it disappears when we use a single core. That is why I suggested it has something to do with how the parallel threads are combined. To build the output file, we have to write to it at the end of each run (in the post-run commands). Maybe it has something to do with how the files are written.

Hi Michael,

Thanks for clarifying; I understand the problem now. Unfortunately, BehaviorSpace currently does not support any interaction between threads, because each run is supposed to be a self-contained instance of the model. Accessing a shared resource from multiple threads without explicit synchronization is undefined behavior in any programming language, so what you’re trying to do is a conceptual issue, not a bug in BehaviorSpace specifically.

This could theoretically be solved with some sort of implicit synchronization, but that could significantly increase the run time of experiments. It’s also generally better to avoid undefined behavior than to try to deal with its consequences. I think it would be worth looking into adding explicit syntax for BehaviorSpace experiments to mark access to a shared resource like a file. Let me know your thoughts on that.

For the time being, the simplest solution (from a NetLogo perspective) would be to create a separate file for each run, then combine all the files together at the end of the experiment. This would avoid any strange behavior resulting from parallel runs.
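
For example (a sketch only; the file-name pattern and columns here are illustrative, so keep whatever layout you write now):

    to save-patches
      ;; one file per run, so no two threads ever write to the same file
      file-open (word "patch-output-" behaviorspace-run-number ".csv")
      foreach sort patches [ p ->
        file-print (word [pxcor] of p "," [pycor] of p ","
          [elevation] of p "," behaviorspace-run-number)
      ]
      file-close
    end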

Isaac B.

Syntax to deal with this is a good idea. CoMSES.Net has been working for several years to develop workflows that enable NetLogo and other models to run in high-throughput computing environments like OSG. This is sort of BehaviorSpace on steroids, and conceptually a very good way for modelers to scale up their work. It has been a tough slog, however. We now have templates to help prepare models for this environment, but automated tools for distributing the jobs and then putting the results back together again remain a large challenge.

The best workaround I’ve found so far is simply to run the experiments with only 1 thread.

Is there a reason that creating a separate file for each run, as Isaac suggested, won’t work?

It will work, but we need a single file in the end. That would require patching together files from multiple runs, each with its own header row, which would need to be deleted: considerable manual effort after the fact before we can analyze the data. Setting BehaviorSpace to a single thread creates the needed file automatically and doesn’t take very long to do so.

Okay, if running on a single thread is fine, then that’s great. If you want help writing a little script (in Python or bash) that you could run from the command line to patch all the files together at the end, let us know and we can help with that.

If it helps at all, I would suggest creating each individual run file without a header, since you plan to remove the headers at the end. Then, when patching the files together, write the header first; after that you can append each file as is without having to remove anything.
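
For example, in Python (a minimal sketch, assuming per-run files named patch-output-<run>.csv written without headers; the file and column names are illustrative):

    import glob

    # header matching the per-run files' column layout (illustrative names)
    header = "pxcor,pycor,elevation,run\n"

    # collect the per-run files before creating the combined file,
    # so the output file is not swept up by the glob
    paths = sorted(glob.glob("patch-output-*.csv"))

    with open("combined-patches.csv", "w") as out:
        out.write(header)  # single header row, written once
        for path in paths:  # lexicographic order; fine when row order doesn't matter
            with open(path) as f:
                out.write(f.read())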

Isaac B.

This is pretty much what our save-patches procedure does automatically. It adds a header if the output file does not exist; otherwise, it just appends the patch output, without a header, to the existing file at the end of each run.

Yes, that’s my goal. Since what you’re doing now doesn’t work correctly in parallel, I suggested a workaround that gets you the same functionality without any undefined behavior.