Parallel processing creates errors in BehaviorSpace output

This is a report for NetLogo 6.4, running on Macs and PCs.

A colleague and I have written a procedure to output patch information to a CSV file for use in BehaviorSpace experiments (copied below). This works most of the time, but sometimes, when a new run starts (I assume this is the point at which output from runs on parallel processors is patched together), extra spaces, characters, or newlines are inserted. This does not happen every time the run changes, but it does happen sometimes.

Here is an example of two lines from the CSV output. The first line is correct; the second one is not. behaviorspace-run-number is the last number in each line. The second line switches from run 3 to run 5.

-1,-8,10,1,12,14,18,3,
10,-2,-4,10,1,12,19,16,5,

Extra commas are added to the end of both lines, and a spurious 10 is added to the beginning of the second line. One run ends at this point and a new run starts.

Here is another example:

-1,4,10,5,12,21,14,8
-1,10,10,5,16,17,8,814,15,16,5

No extra commas are added, but a spurious 14,15,16,5 is appended to the end of the second line.

This seems to be a serious bug that produces incorrect output with parallel processing.

Here is our output code:
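
(A minimal sketch rather than the verbatim procedure; elevation stands in for our actual patch variables.)

    to save-patches
      ;; append one row per patch to a shared CSV file at the end of each run,
      ;; writing the header row only if the file does not exist yet
      let new-file? not file-exists? "patch-output.csv"
      file-open "patch-output.csv"  ;; opens for appending if the file already exists
      if new-file? [ file-print "pxcor,pycor,elevation,run" ]
      foreach sort patches [ p ->
        file-print (word [pxcor] of p "," [pycor] of p ","
          [elevation] of p "," behaviorspace-run-number)
      ]
      file-close
    end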

Hi Michael,

Each BehaviorSpace run happens on a separate thread and is therefore subject to race conditions on shared resources such as files. Internally, this issue has been resolved, which is why we provide the various output-file options for BehaviorSpace experiments. Is there any reason why you can’t include this patch data in the metrics section of your experiment? It appears to me that the formatting you’re doing here could be done automatically using the spreadsheet and lists output files. Let me know if anything needs to be clarified or if I misunderstood the problem.

Isaac B.

P.S.: This doesn’t make a difference if using the provided output files works for you, but it is worth noting that your assumption that parallel runs are patched together at the start of each new run is incorrect. Runs are only patched together at the end of the experiment.

Unfortunately, AFAICT it cannot. We want a structured CSV file for analysis, in which each patch is a row and each patch variable is a column. I’d hoped that the new lists format might generate that, but it does not seem to be able to.

BehaviorSpace is mostly designed to provide output in which each row is the state of the model at each step, or at the end of an individual run. We need each row to be the state of an individual agent (patches, in our case).

export-world sort of gets at this, but it includes everything, requiring a lot of cutting and pasting, and it can’t be set to export only one agent type. And I can’t think of how export-world could be included in a BehaviorSpace experiment with parameter sweeps and repeated runs.

I’m open to suggestions, however.

Re the P.S.: something is producing extraneous characters at the ends of runs when we run in parallel, and it disappears when we use a single core. That is why I suggested it has something to do with how the parallel threads are combined. To build the output file, we have to write to it at the end of each run (in the post-run commands). Maybe it has something to do with how the files are written.

Hi Michael,

Thanks for clarifying; I understand the problem now. Unfortunately, BehaviorSpace currently does not support any interaction between threads, because each run is supposed to be a self-contained instance of the model. Accessing a shared resource from multiple threads without explicit synchronization is undefined behavior in any programming language, so what you’re trying to do is a conceptual issue, not a bug in BehaviorSpace specifically.

This could theoretically be solved with some sort of implicit synchronization, but that could significantly increase the run time of experiments. It’s also generally better to avoid undefined behavior than to try to deal with its consequences. I think it would be worth looking into adding explicit syntax for BehaviorSpace experiments to mark access to a shared resource like a file. Let me know your thoughts on that.

For the time being, the simplest solution (from a NetLogo perspective) would be to create a separate file for each run, then combine all the files together at the end of the experiment. This would avoid any strange behavior resulting from parallel runs.
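
For example (a sketch only; the file-name pattern and columns here are illustrative, so keep whatever layout you write now):

    to save-patches
      ;; one file per run, so no two threads ever write to the same file
      file-open (word "patch-output-" behaviorspace-run-number ".csv")
      foreach sort patches [ p ->
        file-print (word [pxcor] of p "," [pycor] of p ","
          [elevation] of p "," behaviorspace-run-number)
      ]
      file-close
    end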

Isaac B.

Syntax to deal with this is a good idea. CoMSES.Net has been working for several years to develop workflows that enable NetLogo and other models to run in high-throughput computing environments like OSG. This is sort of BehaviorSpace on steroids, and conceptually a very good way for modelers to scale up their work. It has been a tough slog, however. We now have templates to help prepare models for this environment, but automated tools for distributing the jobs and then putting the results back together again remain a large challenge.

The best workaround I’ve found so far is simply to run the experiments with only 1 thread.

Is there a reason that creating a separate file for each run, as Isaac suggested, won’t work?

It will work, but we need a single file in the end. That would require patching together files from multiple runs, each with its own header row, which would need to be deleted: considerable manual effort after the fact before we can analyze the data. Setting BehaviorSpace to a single thread creates the needed file automatically and doesn’t take very long to do so.

Okay, if running on a single thread is fine, then that’s great. If you want help writing a little script (in Python or bash) that you could run from the command line to patch all the files together at the end, let us know and we can help with that.

If it helps at all, I would suggest creating each individual run file without a header, since you plan to remove the headers at the end. Then, when patching the files together, write the header first; after that you can append each file as is without having to remove anything.
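
For example, in Python (a minimal sketch, assuming per-run files named patch-output-<run>.csv written without headers; the file and column names are illustrative):

    import glob

    # header matching the per-run files' column layout (illustrative names)
    header = "pxcor,pycor,elevation,run\n"

    # collect the per-run files before creating the combined file,
    # so the output file is not swept up by the glob
    paths = sorted(glob.glob("patch-output-*.csv"))

    with open("combined-patches.csv", "w") as out:
        out.write(header)  # single header row, written once
        for path in paths:  # lexicographic order; fine when row order doesn't matter
            with open(path) as f:
                out.write(f.read())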

Isaac B.

This is pretty much what our save-patches procedure does automatically. It adds a header if the output file does not exist; otherwise, it just appends the patch output, without a header, to the existing file at the end of each run.

Yes, that’s my goal. Since what you’re doing now doesn’t work correctly in parallel, I suggested a workaround that gets you the same functionality without any undefined behavior.