File Formats

Input Data Format

EMGFlow accepts data as plaintext .CSV files. Each file should have the following layout:

  • Row 1 - Column headers
  • Col 1 - Labelled 'Time'; contains the timestamps of the sampled data
  • Col 2:n - Assumed to be sEMG or related signal data

As an example, here are the first few rows of some sample data included in the package, with added NaN values:

Time      EMG_zyg      EMG_cor
0.0005    -0.145569    -0.097046
0.0010    -0.129089    -0.100708
0.0015    -0.118713    -0.101929
0.0020    NaN          NaN
0.0025    -0.104370    -0.097656
...       ...          ...

For calculations to be correct, ensure the 'Time' column is evenly spaced, i.e. the difference between each pair of consecutive rows is the same.
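
The even-spacing requirement can be checked before processing a file. The following is a minimal sketch, not part of EMGFlow: the helper name has_uniform_time_step and its tolerance are assumptions chosen for illustration, and the data is an in-memory copy of the sample rows shown above.

```python
import io

import numpy as np
import pandas as pd

# In-memory stand-in for an input file, using the sample rows shown above.
csv_text = """Time,EMG_zyg,EMG_cor
0.0005,-0.145569,-0.097046
0.0010,-0.129089,-0.100708
0.0015,-0.118713,-0.101929
0.0020,NaN,NaN
0.0025,-0.104370,-0.097656
"""


def has_uniform_time_step(df, time_col="Time", rel_tol=1e-6):
    """Hypothetical helper: True if consecutive timestamps differ by a (nearly) constant step."""
    steps = np.diff(df[time_col].to_numpy(dtype=float))
    if steps.size == 0:
        # Zero or one row: there is no step to compare.
        return True
    # Compare every step to the first one, allowing for floating-point rounding.
    return bool(np.allclose(steps, steps[0], rtol=rel_tol, atol=0.0))


df = pd.read_csv(io.StringIO(csv_text))
print(has_uniform_time_step(df))
```

A relative tolerance is used because timestamps such as 0.0015 are not exactly representable as floating-point numbers, so the computed steps differ by tiny rounding errors even when the file is evenly sampled.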

EMGFlow uses pd.read_csv() from the Pandas package to read in files. According to the Pandas documentation, the following values are interpreted as NaN by default:

" ", "#N/A", "#N/A N/A", "#NA", "-1.#IND", "-1.#QNAN", "-NaN", "-nan", "1.#IND", "1.#QNAN", "<NA>", "N/A", "NA", "NULL", "NaN", "None", "n/a", "nan", "null"

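To illustrate how such markers are handled, the sketch below reads a small in-memory file with pd.read_csv() using its default settings and counts the missing values per column. The data is made up for illustration, and EMGFlow may pass additional arguments to pd.read_csv() that are not shown here.

```python
import io

import pandas as pd

# Made-up rows mixing several of the default NaN markers: "NA", "NaN",
# an empty field, and "null".
csv_text = """Time,EMG_zyg,EMG_cor
0.0005,-0.145569,NA
0.0010,NaN,-0.100708
0.0015,,null
0.0020,-0.104370,-0.097656
"""

df = pd.read_csv(io.StringIO(csv_text))

# Count the values that were parsed as NaN in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Because every marker is converted to NaN on read, the signal columns remain numeric rather than being parsed as text.
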
Output Data Format

EMGFlow outputs a single plaintext .CSV file containing the features extracted from all processed input files. The output file has the following format:

File_path                EMG_Zyg_Min           ...    EMG_Zyg_SpecPmissing    EMG_Cor_Min           ...    EMG_Cor_SpecPmissing
01\sample_data_01.csv    0.0027489562796772    ...    0.00525                 0.0035784079507369    ...    0.0058
01\sample_data_02.csv    0.0049906242832924    ...    0.00085                 0.0040041486806442    ...    0.00035
02\sample_data_03.csv    0.0020494782305965    ...    0.0168                  0.0021287377144706    ...    0.01805
02\sample_data_04.csv    0.0024170458470961    ...    0.0021                  0.0021508113881666    ...    0.001

Within the output feature file, each muscle's features are grouped into one contiguous block of columns. For example, muscle 1 spans columns 2:36, muscle 2 spans columns 37:71, and so on. The total number of columns therefore depends on the number of muscles recorded in the input data.
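
The column arithmetic can be sketched as follows. The function muscle_column_span is a hypothetical helper, not an EMGFlow function, and the block width of 35 columns is an assumption inferred from the spans quoted above (2:36 and 37:71), not taken from the package source.

```python
def muscle_column_span(muscle_index, features_per_muscle=35, first_feature_col=2):
    """Hypothetical helper: 1-based (start, end) columns for a 1-based muscle index.

    Column 1 holds the file path, so the first feature block starts at column 2.
    The default block width of 35 is inferred from the spans in the text.
    """
    if muscle_index < 1:
        raise ValueError("muscle_index is 1-based and must be >= 1")
    start = first_feature_col + (muscle_index - 1) * features_per_muscle
    end = start + features_per_muscle - 1
    return start, end


for k in (1, 2, 3):
    print(k, muscle_column_span(k))
```

Under these assumptions, the first two results reproduce the spans given above, and each further muscle adds one more block of the same width.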

Column name definitions:

Column name                Feature                                      Type
EMG_Min                    Minimum voltage                              Time-series
EMG_Max                    Maximum voltage                              Time-series
EMG_Mean                   Mean voltage                                 Time-series
EMG_SD                     Standard deviation of voltage                Time-series
EMG_Skew                   Skew of voltage                              Time-series
EMG_Kurtosis               Kurtosis of voltage                          Time-series
EMG_IEMG                   Integrated EMG of voltage                    Time-series
EMG_MAV                    Mean absolute value of voltage               Time-series
EMG_MMAV1                  Modified mean absolute value 1 of voltage    Time-series
EMG_MMAV2                  Modified mean absolute value 2 of voltage    Time-series
EMG_SSI                    Simple square integral of voltage            Time-series
EMG_VAR                    Variance of voltage                          Time-series
EMG_VOrder                 V-order of voltage                           Time-series
EMG_RMS                    Root mean square of voltage                  Time-series
EMG_WL                     Waveform length of voltage                   Time-series
EMG_LOG                    Log-detector of voltage                      Time-series
EMG_MFL                    Maximum fractal length of voltage            Time-series
EMG_AP                     Average power of voltage                     Time-series
EMG_Timeseries_Pmissing    Percentage of missing data                   Time-series
EMG_Max_Freq               Maximum frequency                            Spectral
EMG_MDF                    Median frequency                             Spectral
EMG_MNF                    Mean frequency                               Spectral
EMG_Twitch_Ratio           Twitch ratio of frequency                    Spectral
EMG_Twitch_Index           Twitch index of frequency                    Spectral
EMG_Twitch_Slope           Twitch slope of frequency                    Spectral
EMG_SC                     Spectral centroid of frequency               Spectral
EMG_SFlt                   Spectral flatness of frequency               Spectral
EMG_SFlx                   Spectral flux of frequency                   Spectral
EMG_SS                     Spectral spread of frequency                 Spectral
EMG_SDec                   Spectral decrease of frequency               Spectral
EMG_SE                     Spectral entropy of frequency                Spectral
EMG_SR                     Spectral rolloff of frequency                Spectral
EMG_SBw                    Spectral bandwidth                           Spectral
EMG_Spec_PMissing          Percentage of missing data                   Spectral

Sample Data Files

The sample data used in EMGFlow is taken from PeakAffectDS.

PeakAffectDS Name    EMGFlow Name          Timeframe
01-03-01.csv         sample_data_01.csv    50.0005s : 60.0000s
01-04-01.csv         sample_data_02.csv    130.0005s : 140.0000s
02-06-02.csv         sample_data_03.csv    10.0005s : 20.0000s
02-07-02.csv         sample_data_04.csv    15.0005s : 25.0000s

File Manipulations

The sample data files were manipulated in the following ways:

  • sample_data_01.csv: Added a chunk of NaN values.
  • sample_data_02.csv: Added scattered individual NaN values.
  • sample_data_03.csv: Added several large chunks of NaN values, creating "data islands".
  • sample_data_04.csv: Injected a small, high-intensity, band-limited noise pulse.
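
For readers who want to build similar test files, the two simplest manipulations (a contiguous NaN chunk and scattered individual NaN values) could be reproduced roughly as below. This is only an illustrative sketch on a synthetic signal: the positions, sizes, and random seed are arbitrary assumptions, not the values used to create the sample data.

```python
import numpy as np

# Synthetic stand-in for one signal column (not real sEMG data).
rng = np.random.default_rng(0)
signal = rng.normal(size=1000)

# A contiguous chunk of missing values (arbitrary position and length).
with_chunk = signal.copy()
with_chunk[200:300] = np.nan

# Scattered individual missing values at random, non-repeating positions.
with_scattered = signal.copy()
scattered_idx = rng.choice(signal.size, size=20, replace=False)
with_scattered[scattered_idx] = np.nan

print(np.isnan(with_chunk).sum(), np.isnan(with_scattered).sum())
```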