We used “Run metrics when” in a BehaviorSpace experiment, and the table output seems to have errors.
In the “Run metrics when” box we put the name of a reporter that reports true only when the simulation time (from the time extension) matches one of a list of dates and times at which we want output. We should get 7 outputs per model run.
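For reference, here is a sketch of the kind of reporter I mean (the names are illustrative, not the actual code from our model):

    extensions [ time ]
    globals [
      current-time   ;; a LogoTime that the model advances each tick
      output-times   ;; a list of LogoTimes at which we want output
    ]

    to-report output-time?
      ;; true only when the simulated date/time matches one of the listed output times
      report not empty? filter [ t -> time:is-equal? t current-time ] output-times
    end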
The experiment had 16 runs, all of which were executed at the same time on 16 processors.
For all the model runs we get all 7 lines of output, each with the correct value of the output reporter that provides the simulated date and time. (I am going to try to attach an Excel file with this table, sorted by run number. It has fewer than 7 lines of output for some runs because I killed the experiment before all runs finished.)
However, for some but not all runs, we got exactly the same values for all the other output reporters on all 7 output lines: the number of fish, their size, etc. were given the same values on every date. I verified from other output that the values should have changed. BehaviorSpace kept reporting the values from the first output of a model run instead of updating them.
I can provide the test case…
Steve R.
Apparently I cannot upload an Excel file so here is a screen capture of it.
Hi Steve,
We can look into it if you send us a reproducible example - model, experiment name, and table output.
It might help to put random-seed (474 + behaviorspace-run-number) (any number in place of 474 is fine) before the setup, so each run will have its own predictable random seed.
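For example, the experiment’s setup commands could look roughly like this (474 is arbitrary):

    random-seed (474 + behaviorspace-run-number)
    setup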
Hi Aaron,
I will send you the code etc. (it’s a big ugly one).
But I have a new clue: I realized that the runs had run-time errors. When I modified the code to prevent the run-time errors, the BehaviorSpace results looked as they should. So perhaps the problem is an interaction with error handling?
Steve
I would be interested to know if the result is any different with table output, since it writes to the destination file at different times than the other three formats. Also, to clarify, would the desired behavior be stopping any output after an error, outputting blank lines after an error, or outputting an error message for each data point after an error?
Isaac B.
EDIT: Sorry, I read the initial post wrong. I guess my corrected statement is I would be interested to know if the result is any different with spreadsheet output.
I just tried this with the version of Flocking I made with run-time errors. I got the same unexpected behavior with spreadsheet output: when there were run-time errors, (a) the spreadsheet contained results for every tick instead of only when “Run metrics when” was true, and (b) the results (value of the output reporter) were the same for every tick.
Desired behavior:
a) I appreciate what BehaviorSpace does now, which is to throw up a run-time error dialog but keep going. Often run-time errors are rare and not that important, so the experiment isn’t necessarily worthless. But of course it needs to be obvious to the user that there were errors.

b) One improvement I can suggest to help users know there were errors: the run-time error dialog comes up behind the two other BehaviorSpace dialogs, and you can only move the top one (“Running Experiment”) while an experiment is running. Even if you move that top dialog, the run-time error dialog is hidden behind the “BehaviorSpace” dialog, which cannot be moved. So you can’t read the run-time error dialog while the experiment runs. It would be nice if the error dialogs popped up in front, or somewhere else where they aren’t hidden behind anything, or if you could move the “BehaviorSpace” dialog to see them. In fact, it would be nice for several reasons to be able to move the BehaviorSpace dialog while experiments are running, because it hides part of the Interface.

c) It would also be nice if the BehaviorSpace output files told the user that there were run-time errors during a run. (BehaviorSpace already does a great job of reporting errors in the output reporters themselves.) Maybe a flag in the output for any run with errors?

d) It’s really not bad the way it is, so I would hesitate to make other things worse to get these desired behaviors.
It isn’t a good idea to allow interactions below the run dialog, since the user can’t be trusted not to change the model while it’s running. The best compromise I can think of at the moment is to hide the experiments dialog while the run dialog is open, then reopen it once the experiment completes or is aborted. Would that be an acceptable fix, if it is possible? I suspect the blocking behavior of the run dialog is also the cause of the error dialog issue you mentioned; I’ll see if I can resolve that issue without removing the blocking behavior.
I did some investigation into the issue, and here’s what I have found so far:
When a runtime error occurs, the code immediately returns from the current procedure. This means that if you encounter a runtime error before tick is called in the go procedure, the procedure will exit without calling tick, so the next time around the tick number will be the same and the error is likely to occur a second time. I made a minimal reproducible example with a divide-by-zero error to test this, and found that putting the call to tick before the divide-by-zero line resulted in the spreadsheet output having the correct number of rows, whereas putting the call to tick after the divide-by-zero line resulted in the spreadsheet output having an erroneous row for every tick after the error occurred. With this in mind, I think this portion of the issue (and its fix for users) should be documented, but it is not an issue that needs to be fixed by us.
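Here is a sketch of the kind of minimal example I mean (illustrative rather than the exact code used):

    to setup
      clear-all
      reset-ticks
    end

    to go
      tick                ;; case 1: tick comes before the error line
      if ticks >= 5 [
        let d 0
        print 1 / d       ;; divide-by-zero runtime error from tick 5 onward
      ]
      ;; case 2: move tick here, after the error line, and the spreadsheet
      ;; output gains an erroneous row for every step after the error
    end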
The problem is that regardless of the location of the call to tick, all the data after the error is invalid, since there was one iteration of go that exited before all the commands had been run. Therefore, it makes the most sense to me that all output files should stop outputting data for a run that encounters a runtime error, ending with a row marked “Runtime Error” or something like that. This would alert users to the problem while also resolving the aforementioned behavior, regardless of the location of the call to tick in the go procedure.
One could also argue that this is more than just a BehaviorSpace issue: it’s a fundamental issue with the way errors are handled in NetLogo. Therefore, another potential fix would be to report runtime errors but continue execution of the code wherever possible. In most (if not all) cases, a runtime error only applies to the line where it occurred, but the code could continue running. This would obviously create bogus results, but as long as the error is still reported to the user, it may be less confusing. For example, in your case it would have allowed you to identify the bug in your model, but you wouldn’t have had anything to report to the development team.
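For comparison, model code can already get that kind of report-but-keep-going behavior locally by wrapping the risky commands in carefully; a sketch (the code inside the first block is just a stand-in for whatever might error):

    to go
      carefully [
        ;; stand-in for model code that might raise a runtime error
        let d 0
        print 1 / d
      ] [
        ;; the error is reported, but execution continues with the next command
        print (word "runtime error: " error-message)
      ]
      tick
    end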
I follow your thinking, but the situation I’m looking at (and probably a much more common one) is when the runtime error is not in the go procedure.
In my example (modify the Flocking model’s “flock” procedure to remove its “if any? flockmates” check), the problem with BehaviorSpace table output persists if I move “tick” to the top of the go procedure.
By the way, moving “tick” around is not always trivial; in some models that would have a big effect. That’s especially true in models that use the time extension to tie ticks to date/time values that affect the simulation.
Thank you for letting me know about the issues with moving the tick command; I take back that suggestion. I looked at the Flocking example you gave, and the runtime error is actually still occurring during the go procedure, since flock is called at the beginning of go. Since moving tick is not an option, I think the best approach is to truncate the BehaviorSpace output after the runtime error and add a final line indicating at which tick it happened.
I do have one other idea, although it is probably too big of a change for now. Fairly often, it seems, someone has a bug related to ticks being incorrect in BehaviorSpace experiments. One fix would be to automatically synchronize the tick counter to the number of steps after the go procedure is called during a BehaviorSpace experiment. This may be a breaking change, but it would solve many prior issues with experiments.