Generated Files

This section reviews the files generated by the algorithm, and the step at the end of which each one is produced. In what follows, we assume that the data file is path/mydata.extension. All generated files are written to the directory path/mydata/. To learn more about what each step of the algorithm does, please see the details on the algorithm, or wait for the publication.
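As a small illustration of this naming convention, the sketch below derives the location of a generated file from the path of the data file. The helper name `generated_file` is hypothetical and is not part of the software; it only mirrors the path/mydata/mydata.<kind>.hdf5 layout described in this section.

```python
from pathlib import Path

def generated_file(data_file, kind):
    """Return the expected path of a generated HDF5 file.

    `kind` is the middle part of the file name, e.g. "basis",
    "clusters", "templates", "overlap" or "result".
    """
    data_file = Path(data_file)
    name = data_file.stem                  # "mydata" for "path/mydata.extension"
    return data_file.parent / name / f"{name}.{kind}.hdf5"

basis_path = generated_file("path/mydata.extension", "basis")
result_path = generated_file("path/mydata.extension", "result")
```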

Whitening

At the end of this step, a single HDF5 file, mydata.basis.hdf5, is produced. It contains several objects:

  • /thresholds The N thresholds, one per electrode. Note that the values are positive, and should be multiplied by the threshold parameter in the configuration file (see documentation on parameters)
  • /spatial The spatial matrix used for whitening the data (size N x N)
  • /temporal The temporal filter used for whitening the data (size Nt, where Nt is the temporal width of the template)
  • /proj and /rec The projection matrix obtained by PCA, and its inverse, used to represent a single waveform (size Nt x F, where F is the number of features kept, 5 by default)
  • /waveforms 1000 randomly chosen waveforms over all channels
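As a minimal sketch of how these objects might be used, the snippet below scales the stored thresholds by the configuration value and projects one waveform onto the PCA features. Several things here are assumptions rather than facts from the software: the parameter name `spike_thresh` is used only for illustration, /proj is taken to have shape Nt x F and /rec shape F x Nt, and `basis` stands for any object indexed by dataset name. A plain dictionary with synthetic values is used here so the example runs on its own; a file opened read-only with h5py supports the same indexing.

```python
import numpy as np

def effective_thresholds(basis, spike_thresh):
    """Scale the stored per-electrode thresholds by the configuration value."""
    return spike_thresh * np.asarray(basis["thresholds"][...])

def project_waveform(basis, waveform):
    """Project one waveform (length Nt) onto the F features, then map it back."""
    proj = np.asarray(basis["proj"][...])   # assumed shape (Nt, F)
    rec = np.asarray(basis["rec"][...])     # assumed shape (F, Nt)
    features = waveform @ proj
    reconstructed = features @ rec
    return features, reconstructed

# Synthetic stand-in for mydata.basis.hdf5 (N = 4 electrodes, Nt = 31, F = 5)
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.normal(size=(31, 5)))   # 31 x 5, orthonormal columns
basis = {"thresholds": np.array([1.0, 2.0, 1.5, 0.5]), "proj": q, "rec": q.T}

thresholds = effective_thresholds(basis, spike_thresh=6.0)
features, reconstructed = project_waveform(basis, rng.normal(size=31))
```

The reconstructed waveform is only an approximation of the original, since F features are kept out of Nt samples.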

Clustering

At the end of this step, several files are produced:
  • mydata.clusters.hdf5 An HDF5 file that encapsulates a lot of information about the clusters, for every electrode: which points were selected, the spike times of those points, the labels assigned by the clustering, and the rho and delta values produced by the clustering algorithm used [Rodriguez and Laio, 2014]. More precisely, the file has the following fields:

    • /data_i: the data points collected on electrode i, after PCA
    • /clusters_i: the labels of those points after clustering
    • /times_i: the spike times of those points
    • /debug_i: a 2D array with rhos and deltas for those points (see clustering algorithm)
    • /electrodes: an array with the preferred electrodes of all K templates
  • mydata.templates.hdf5 An HDF5 file storing all the templates, and also their orthogonal projections. The template matrix therefore holds 2K entries, twice the number of templates K, and only the first K entries are the real templates. Note also that every template has a range of allowed amplitudes (limits), and that the norms (norms) are saved for internal purposes. More precisely, the file has the following fields:

    • /temp_shape: the dimensions of the template matrix, N x Nt x 2K, where N is the number of electrodes, Nt the temporal width of the templates, and K the number of templates. Only the first K components are real templates
    • /temp_x: the x values to reconstruct the sparse matrix
    • /temp_y: the y values to reconstruct the sparse matrix
    • /temp_data: the values to reconstruct the sparse matrix
    • /norms: the 2K norms of all templates
    • /limits: the K limits [amin, amax] of the real templates
    • /maxoverlap: a K x K matrix with only the maximum value of the overlaps across the temporal dimension
    • /maxlag: a K x K matrix with the indices leading to the maxoverlap values obtained. In a nutshell, for all pairs of templates, those are the temporal shifts leading to the maximum of the cross-correlation between templates
  • mydata.overlap.hdf5 An HDF5 file used internally during the fitting procedure. This file can be quite large, and is also saved using a sparse structure. More precisely, the file has the following fields:

    • /over_shape: the dimensions of the overlap matrix, 2K x 2K x (2Nt - 1), where K is the number of templates and Nt the temporal width of the templates
    • /over_x: the x values to reconstruct the sparse matrix
    • /over_y: the y values to reconstruct the sparse matrix
    • /over_data: the values to reconstruct the sparse matrix
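The fields above can be explored with a few lines of NumPy. The sketch below groups the spike times of one electrode by cluster label, rebuilds the dense template array from its sparse triplets, and shows why the temporal dimension of the overlaps has 2Nt - 1 entries (one per possible lag between two templates). It rests on assumptions that are not stated in this section: the sparse template matrix is taken to be 2D, with one row per (electrode, time) pair and one column per template, and the overlap is computed here as a plain cross-correlation summed over electrodes, which may differ from what the software does internally. The function names are hypothetical, and dictionaries with synthetic values stand in for the opened HDF5 files.

```python
import numpy as np

def times_per_cluster(clusters_file, i):
    """Group the spike times of electrode i by the label given by the clustering."""
    labels = np.ravel(clusters_file["clusters_%d" % i][...])
    times = np.ravel(clusters_file["times_%d" % i][...])
    return {int(label): np.sort(times[labels == label]) for label in np.unique(labels)}

def dense_templates(templates_file):
    """Rebuild the N x Nt x 2K template array from its sparse triplets."""
    n_elec, n_t, n_temp = (int(v) for v in np.ravel(templates_file["temp_shape"][...]))
    rows = np.ravel(templates_file["temp_x"][...]).astype(int)
    cols = np.ravel(templates_file["temp_y"][...]).astype(int)
    data = np.ravel(templates_file["temp_data"][...])
    # Assumption: 2D sparse layout, rows = (electrode, time), columns = templates
    flat = np.zeros((n_elec * n_t, n_temp), dtype=data.dtype)
    flat[rows, cols] = data
    return flat.reshape(n_elec, n_t, n_temp)

def best_lag(template_a, template_b):
    """Maximum overlap of two N x Nt templates over the 2*Nt - 1 possible lags."""
    n_t = template_a.shape[1]
    overlap = np.zeros(2 * n_t - 1)
    for row_a, row_b in zip(template_a, template_b):   # one row per electrode
        overlap += np.correlate(row_a, row_b, mode="full")
    lags = np.arange(-(n_t - 1), n_t)                  # lag of each entry
    return float(overlap.max()), int(lags[np.argmax(overlap)])

# Synthetic stand-ins for the two files (N = 2, Nt = 5, K = 2)
clusters_file = {
    "clusters_0": np.array([0, 1, 0, 1, 1]),
    "times_0": np.array([50, 10, 30, 20, 40]),
}
templates_file = {
    "temp_shape": np.array([2, 5, 4]),
    "temp_x": np.array([1, 3, 9]),
    "temp_y": np.array([0, 1, 3]),
    "temp_data": np.array([1.0, 1.0, 0.5]),
}

grouped = times_per_cluster(clusters_file, 0)
templates = dense_templates(templates_file)
real = templates[:, :, : templates.shape[2] // 2]      # first K entries only
value, lag = best_lag(real[:, :, 1], real[:, :, 0])
```

A dense array is used here for readability; for real recordings a sparse matrix type would be preferable, since the overlap file in particular can be large.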

Fitting

At the end of this step, a single HDF5 file, mydata.result.hdf5, is produced. It contains several objects:

  • /spiketimes/temp_i for a template i, the times at which this particular template has been fitted.
  • /amplitudes/temp_i for a template i, the amplitudes used at the given spike times. Note that those amplitudes have two components, but only the first one is relevant. The second one is used for the orthogonal template, and does not need to be analyzed.
  • /gspikes/elec_i if the collect_all mode was activated, then for electrode i, the times of the spikes peaking there that have not been fitted.

Note

Spike times are saved in time steps, not in seconds.
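Since the spike times are stored in time steps, dividing by the sampling rate of the recording gives times in seconds. The sketch below does this for one template and keeps only the first amplitude component. The function name and the `sampling_rate` argument are hypothetical (the sampling rate has to be supplied by the user), the amplitudes are assumed to be stored as an array with one row per spike and two columns, and a nested dictionary with synthetic values stands in for the opened mydata.result.hdf5 file.

```python
import numpy as np

def spikes_in_seconds(result, i, sampling_rate):
    """Return (times in seconds, relevant amplitudes) for template i."""
    steps = np.ravel(result["spiketimes"]["temp_%d" % i][...])
    amplitudes = np.asarray(result["amplitudes"]["temp_%d" % i][...])
    # Only the first component is relevant; the second one belongs
    # to the orthogonal template.
    return steps / float(sampling_rate), amplitudes[:, 0]

# Synthetic stand-in for mydata.result.hdf5
result = {
    "spiketimes": {"temp_0": np.array([20000, 50000])},
    "amplitudes": {"temp_0": np.array([[1.0, 0.1], [0.9, -0.2]])},
}

times_s, amps = spikes_in_seconds(result, 0, sampling_rate=20000)
```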

Converting

At the end of this step, several numpy files are produced in the path path/mydata.GUI. They are all related to phy; see the dedicated documentation for details.