Processing streams of data¶
It is often the case that, during the same recording session, the experimentalist records only some temporal chunks and not the whole experiment. However, because the neurons are the same all over the recording, it is better to process them as a single datafile. The code can handle such streams of data, either from multiple sources (several data files), or within the same source if supported by the file format (chunks in a single file).
Chunks spread over several files¶
You can use the multi-files
stream mode in the [data]
section.
Note
If you just want to process several independent files, coming from different recording sessions, you need to use the batch mode (see the documentation on the parameters)
For the sake of clarity, we assume that all your files are labelled
mydata_0.extension
mydata_1.extension
- …
mydata_N.extension
Launch the code on the first file:
>> spyking-circus mydata_0.extension
The code will create a parameter file, mydata_0.params
. Edit the file, and in the [data]
section, set stream_mode
to multi-files
. Relaunch the code on the first file only:
>> spyking-circus mydata_0.extension
The code will now display something like:
##################################################################
##### Welcome to the SpyKING CIRCUS #####
##### #####
##### Written by P.Yger and O.Marre #####
##################################################################
Steps : fitting
GPU detected : True
Number of CPU : 12
Number of GPU : 6
Shared Memory : True
Parallel HDF5 : False
Hostfile : /home/spiky/spyking-circus/circus.hosts
##################################################################
------------------------- Informations -------------------------
| Number of recorded channels : 256
| Number of analyzed channels : 256
| Data type : uint16
| Sampling rate : 20 kHz
| Header offset for the data : 1881
| Duration of the recording : 184 min 37 s
| Width of the templates : 5 ms
| Spatial radius considered : 250 um
| Stationarity : True
| Waveform alignment : True
| Skip strong artefacts : True
| Template Extraction : median-raw
| Streams : multi-files (19 found)
------------------------------------------------------------------
The key line here is the one stating that the code has detected 19 files, and will process them as a single one.
Note
The multi-files mode assumes that all files have the same properties: mapping, data type, data offset, … It has to be the case if they are all coming from the same recording session
- While running, in its first phase (filtering), two options are possible:
- if your file format allows write access, and
overwrite
is set toTrue
in thedata
section, then every individual data file will be overwritten and filtered on site - if your file format does not allow write access, or
overwrite
isFalse
, the code will filter and concatenate all files into a new file, saved as afloat32
binary file calledmydata_all_sc.extension
. Templates are then detected onto this single files, and fitting is also applied onto it.
- if your file format allows write access, and
Chunks contained in several folders¶
For some particular file formats (i.e. openephys), all the data are stored within a single folder, and your experiment may be split over several folders. In order to deal with that, the code can virtually concatenate files found in several folders, using the mode multi-folders
. When activating such a mode for the stream_mode
, the code will search for all folders, at the root of the file currently used, and will search inside all of them if compatible recordings can be found. If yes, they will all be concatenated virtually, such that all the folders are processed as a whole
Chunks contained in the same datafile¶
For more complex data structures, several recordings sessions can be saved within the same datafile. Assuming the file format allows it (see the documentation on the file formats), the code can still stream all those chunks of data in order to process them as a whole. To do so, use exactly the same procedure as below, except that the stream_mode
may be different, for example single-file
.
Visualizing results from several streams¶
Multi-files¶
As said, results are obtained on a single file mydata_all.extension
, resulting of the concatenation of all the individual files. So when you are launching the GUI:
>> circus-gui-matlab mydata_0.extension
what you are seeing are all the spikes on all files. Here you can delete/merge templates, see the devoted GUI section for that (GUI). Note that you need to process data in such a manner, because otherwise, if looking at all results individually, you would have a very hard time keeping track of the templates over several files. Plus, you would not get all the information contained in the whole recording (the same neuron could be silent during some temporal chunks, but spiking during others).
Getting individual results from streams¶
Once your manual sorting session is done, you can simply split the results in order to get one result file per data file. To do so, simply launch:
>> circus-multi mydata_0.extension
- This will create several files
mydata_0.results.hdf5
mydata_1.results.hdf5
- …
mydata_N.results.hdf5
In each of them, you’ll find the spike times of the given streams, between 0 and T, if T is the length of file i.