NetLogo program to filter large output files

SteveRailsback · August 1, 2025, 8:30pm

BehaviorSpace experiments and long simulations can produce output files that are inconveniently large, for example too large to open in a spreadsheet. Often, we are interested in only a subset of the output, such as the ticks when the number of turtles is non-zero.

I wrote a NetLogo program that reads a CSV file, filters out unwanted lines, and writes a new CSV file with only the lines that the user wants to keep. The user defines which data to keep by editing some example filtering statements. The program can use the Time extension to filter output by date or time.

The program is designed so that multiple files can be filtered into one new file, which is convenient if each run in a BehaviorSpace experiment produces its own output file.

The code is here:
OutputFileFilterer.nlogo (16.1 KB)
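
For readers who want to see the read-filter-write idea outside NetLogo, here is a minimal Python sketch. It is not the code in the attached program. It assumes the first line of the file is the column header; the function name `filter_csv`, the column name `count-turtles`, and the sample data are all made up for illustration.

```python
import csv
import io

def filter_csv(src, dst, keep):
    """Copy the header and every row for which keep(row) is true.

    src and dst are open text streams; keep is a function that takes a
    row (a dict keyed by column name) and returns True or False.
    Returns the number of data rows written.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        if keep(row):
            writer.writerow(row)
            kept += 1
    return kept

# Hypothetical sample output: keep only ticks with at least one turtle.
sample = io.StringIO("tick,count-turtles\n0,5\n1,0\n2,3\n")
out = io.StringIO()
n_kept = filter_csv(sample, out, lambda r: int(r["count-turtles"]) > 0)
```

Because the rule for what to keep is passed in as a function, the same large file can be filtered several different ways without touching the reading and writing code.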

ERSUCC · August 9, 2025, 7:00pm

Hi Steve,

For many cases, I think you could use the “Run metrics when” field added in 6.4 to achieve your desired output. For instance, in your example above, you could add the expression count turtles > 0 to the “Run metrics when” field to get the output you described. You could do a similar thing with the time extension, for example by adding an expression like time:get "hour" dt = 5 to the “Run metrics when” field. Does this resolve the issue, or are there cases for which this would not achieve the desired output?

Isaac B.

SteveRailsback · August 10, 2025, 8:47pm

Hi Isaac,

My large-file-filter program preceded the availability of the “Run metrics when” capability in NetLogo 6.4. I agree, that capability can be used to reduce unnecessary BehaviorSpace output. But sometimes you might want to filter a big output file differently for different kinds of results; you might not know when you set up an experiment exactly how you’re going to look at the output.

We also still use it when each run in a BehaviorSpace experiment writes its own output files (which you can do by making behaviorspace-run-number part of the file name). You can then concatenate all those files (e.g., by using a Windows or Linux command) into one file and filter it to get the results you want (and get rid of extraneous header lines).
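
As a rough illustration of the concatenate-and-drop-headers step (a Python sketch, not the NetLogo program itself), something like the following could work. It assumes every per-run file starts with the same header line; the name `merge_runs` and the sample data are hypothetical.

```python
import csv
import io

def merge_runs(sources, dst):
    """Concatenate CSV streams, writing the header line only once.

    Returns the number of data rows written.
    """
    writer = csv.writer(dst, lineterminator="\n")
    header = None
    rows_written = 0
    for src in sources:
        for row in csv.reader(src):
            if not row:
                continue  # skip blank lines
            if header is None:
                header = row  # first header seen: keep it
                writer.writerow(row)
            elif row == header:
                continue  # repeated header from a later file
            else:
                writer.writerow(row)
                rows_written += 1
    return rows_written

# Hypothetical per-run outputs, each with its own header line.
run1 = io.StringIO("run,tick,count-turtles\n1,0,5\n1,1,0\n")
run2 = io.StringIO("run,tick,count-turtles\n2,0,4\n")
merged = io.StringIO()
n_rows = merge_runs([run1, run2], merged)
```

With real files, the in-memory streams would be replaced by files opened with `open(path, newline="")`.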

And sometimes, for example when producing code-testing output not via BehaviorSpace, it is much easier to just create gigantic output files and then filter them in various ways than to write code to limit the output.

Steve

ERSUCC · August 11, 2025, 5:58pm

That makes sense. It sounds like maybe there should be a separate tool for analysis of various output files, rather than adding this tool specifically to BehaviorSpace. This could be a good candidate for something to add to a future test or analysis extension, along with the other things you mentioned recently.