Functions¶
ECG¶
Submodule for NeuroKit.
- ecg_analyze(data, sampling_rate=1000, method='auto', subepoch_rate=[None, None])[source]¶
Performs ECG analysis on either epochs (event-related analysis) or on longer periods of data such as resting- state data.
- Parameters
data (Union[dict, pd.DataFrame]) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by ecg_process() or bio_process(). Can also take a dict containing sets of separate periods of data.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’ where the right method will be chosen based on the mean duration of the data (‘event-related’ for duration under 10s).
subepoch_rate (list) – For event-related analysis,, a smaller “sub-epoch” within the epoch of an event can be specified. The ECG rate-related features of this “sub-epoch” (e.g., ECG_Rate, ECG_Rate_Max), relative to the baseline (where applicable), will be computed. The first value of the list specifies the start of the sub-epoch and the second specifies the end of the sub-epoch (in seconds), e.g., subepoch_rate = [1, 3] or subepoch_rate = [1, None]. Defaults to [None, None].
- Returns
DataFrame – A dataframe containing the analyzed ECG features. If event-related analysis is conducted, each epoch is indicated by the Label column. See ecg_eventrelated() and ecg_intervalrelated() docstrings for details.
See also
bio_process
,ecg_process
,epochs_create
,ecg_eventrelated
,ecg_intervalrelated
Examples
>>> import neurokit2 as nk >>> >>> # Example 1: Download the data for event-related analysis >>> data = nk.data("bio_eventrelated_100hz") >>> >>> # Process the data for event-related analysis >>> df, info = nk.bio_process(ecg=data["ECG"], sampling_rate=100) >>> events = nk.events_find(data["Photosensor"], threshold_keep='below', ... event_conditions=["Negative", "Neutral", ... "Neutral", "Negative"]) >>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=1.9) >>> >>> # Analyze >>> analyze_epochs = nk.ecg_analyze(epochs, sampling_rate=100) >>> analyze_epochs
>>> >>> # Example 2: Download the resting-state data >>> data = nk.data("bio_resting_5min_100hz") >>> >>> # Process the data >>> df, info = nk.ecg_process(data["ECG"], sampling_rate=100) >>> >>> # Analyze >>> analyze_df = nk.ecg_analyze(df, sampling_rate=100) >>> analyze_df
- ecg_clean(ecg_signal, sampling_rate=1000, method='neurokit')[source]¶
Clean an ECG signal.
Prepare a raw ECG signal for R-peak detection with the specified method.
- Parameters
ecg_signal (Union[list, np.array, pd.Series]) – The raw ECG channel.
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.
method (str) – The processing pipeline to apply. Can be one of ‘neurokit’ (default), ‘biosppy’, ‘pantompkins1985’, ‘hamilton2002’, ‘elgendi2010’, ‘engzeemod2012’.
- Returns
array – Vector containing the cleaned ECG signal.
See also
ecg_findpeaks
,signal_rate
,ecg_process
,ecg_plot
Examples
>>> import pandas as pd >>> import neurokit2 as nk >>> import matplotlib.pyplot as plt >>> >>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000) >>> signals = pd.DataFrame({"ECG_Raw" : ecg, ... "ECG_NeuroKit" : nk.ecg_clean(ecg, sampling_rate=1000, method="neurokit"), ... "ECG_BioSPPy" : nk.ecg_clean(ecg, sampling_rate=1000, method="biosppy"), ... "ECG_PanTompkins" : nk.ecg_clean(ecg, sampling_rate=1000, method="pantompkins1985"), ... "ECG_Hamilton" : nk.ecg_clean(ecg, sampling_rate=1000, method="hamilton2002"), ... "ECG_Elgendi" : nk.ecg_clean(ecg, sampling_rate=1000, method="elgendi2010"), ... "ECG_EngZeeMod" : nk.ecg_clean(ecg, sampling_rate=1000, method="engzeemod2012")}) >>> signals.plot() <AxesSubplot:>
References
Jiapu Pan and Willis J. Tompkins. A Real-Time QRS Detection Algorithm. In: IEEE Transactions on Biomedical Engineering BME-32.3 (1985), pp. 230–236.
Hamilton, Open Source ECG Analysis Software Documentation, E.P.Limited, 2002.
- ecg_delineate(ecg_cleaned, rpeaks=None, sampling_rate=1000, method='dwt', show=False, show_type='peaks', check=False)[source]¶
Delineate QRS complex.
Function to delineate the QRS complex.
Cardiac Cycle: A typical ECG heartbeat consists of a P wave, a QRS complex and a T wave. The P wave represents the wave of depolarization that spreads from the SA-node throughout the atria. The QRS complex reflects the rapid depolarization of the right and left ventricles. Since the ventricles are the largest part of the heart, in terms of mass, the QRS complex usually has a much larger amplitude than the P-wave. The T wave represents the ventricular repolarization of the ventricles.On rare occasions, a U wave can be seen following the T wave. The U wave is believed to be related to the last remnants of ventricular repolarization.
- Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().
rpeaks (Union[list, np.array, pd.Series]) – The samples at which R-peaks occur. Accessible with the key “ECG_R_Peaks” in the info dictionary returned by ecg_findpeaks().
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 500.
method (str) – Can be one of ‘peak’ for a peak-based method, ‘cwt’ for continuous wavelet transform or ‘dwt’ (default) for discrete wavelet transform.
show (bool) – If True, will return a plot to visualizing the delineated waves information.
show_type (str) – The type of delineated waves information showed in the plot.
check (bool) – Defaults to False. If True, replaces the delineated features with np.nan if its standardized distance from R-peaks is more than 3.
- Returns
waves (dict) – A dictionary containing additional information. For derivative method, the dictionary contains the samples at which P-peaks, Q-peaks, S-peaks, T-peaks, P-onsets and T-offsets occur, accessible with the key “ECG_P_Peaks”, “ECG_Q_Peaks”, “ECG_S_Peaks”, “ECG_T_Peaks”, “ECG_P_Onsets”, “ECG_T_Offsets” respectively.
For wavelet methods, in addition to the above information, the dictionary contains the samples at which QRS-onsets and QRS-offsets occur, accessible with the key “ECG_P_Peaks”, “ECG_T_Peaks”, “ECG_P_Onsets”, “ECG_P_Offsets”, “ECG_Q_Peaks”, “ECG_S_Peaks”, “ECG_T_Onsets”, “ECG_T_Offsets”, “ECG_R_Onsets”, “ECG_R_Offsets” respectively.
signals (DataFrame) – A DataFrame of same length as the input signal in which occurences of peaks, onsets and offsets marked as “1” in a list of zeros.
See also
ecg_clean
,signal_fixpeaks
,ecg_peaks
,signal_rate
,ecg_process
,ecg_plot
Examples
>>> import neurokit2 as nk >>> >>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000) >>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000) >>> _, rpeaks = nk.ecg_peaks(cleaned, sampling_rate=1000) >>> signals, waves = nk.ecg_delineate(cleaned, rpeaks, sampling_rate=1000) >>> nk.events_plot(waves["ECG_P_Peaks"], cleaned) <Figure ...> >>> nk.events_plot(waves["ECG_T_Peaks"], cleaned) <Figure ...>
References
Martínez, J. P., Almeida, R., Olmos, S., Rocha, A. P., & Laguna, P. (2004). A wavelet-based ECG delineator: evaluation on standard databases. IEEE Transactions on biomedical engineering, 51(4), 570-581.
Performs event-related ECG analysis on epochs.
- Parameters
epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().
silent (bool) – If True, silence possible warnings.
subepoch_rate (list) – A smaller “sub-epoch” within the epoch of an event can be specified. The ECG rate-related features of this “sub-epoch” (e.g., ECG_Rate, ECG_Rate_Max), relative to the baseline (where applicable), will be computed. The first value of the list specifies the start of the sub-epoch and the second specifies the end of the sub-epoch (in seconds), e.g., subepoch_rate = [1, 3] or subepoch_rate = [1, None]. Defaults to [None, None].
- Returns
DataFrame – A dataframe containing the analyzed ECG features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following:
”ECG_Rate_Max”: the maximum heart rate after stimulus onset.
”ECG_Rate_Min”: the minimum heart rate after stimulus onset.
”ECG_Rate_Mean”: the mean heart rate after stimulus onset.
”ECG_Rate_SD”: the standard deviation of the heart rate after stimulus onset.
”ECG_Rate_Max_Time”: the time at which maximum heart rate occurs.
”ECG_Rate_Min_Time”: the time at which minimum heart rate occurs.
”ECG_Phase_Atrial”: indication of whether the onset of the event concurs with respiratory systole (1) or diastole (0).
”ECG_Phase_Ventricular”: indication of whether the onset of the event concurs with respiratory systole (1) or diastole (0).
”ECG_Phase_Atrial_Completion”: indication of the stage of the current cardiac (atrial) phase (0 to 1) at the onset of the event.
”ECG_Phase_Ventricular_Completion”: indication of the stage of the current cardiac (ventricular) phase (0 to 1) at the onset of the event.
We also include the following experimental features related to the parameters of a quadratic model:
”ECG_Rate_Trend_Linear”: The parameter corresponding to the linear trend.
”ECG_Rate_Trend_Quadratic”: The parameter corresponding to the curvature.
”ECG_Rate_Trend_R2”: the quality of the quadratic model. If too low, the parameters might not be reliable or meaningful.
See also
events_find
,epochs_create
,bio_process
Examples
>>> import neurokit2 as nk >>> >>> # Example with simulated data >>> ecg, info = nk.ecg_process(nk.ecg_simulate(duration=20)) >>> >>> # Process the data >>> epochs = nk.epochs_create(ecg, events=[5000, 10000, 15000], ... epochs_start=-0.1, epochs_end=1.9) >>> nk.ecg_eventrelated(epochs) Label Event_Onset ... ECG_Phase_Completion_Ventricular ECG_Quality_Mean 1 1 ... ... ... ... 2 2 ... ... ... ... 3 3 ... ... ... ...
[3 rows x 17 columns] >>> >>> # Example with real data >>> data = nk.data(“bio_eventrelated_100hz”) >>> >>> # Process the data >>> df, info = nk.bio_process(ecg=data[“ECG”], sampling_rate=100) >>> events = nk.events_find(data[“Photosensor”], … threshold_keep=’below’, … event_conditions=[“Negative”, “Neutral”, … “Neutral”, “Negative”]) >>> epochs = nk.epochs_create(df, events, sampling_rate=100, … epochs_start=-0.1, epochs_end=1.9) >>> nk.ecg_eventrelated(epochs) #doctest: +ELLIPSIS
Label Condition … ECG_Phase_Completion_Ventricular ECG_Quality_Mean
1 1 Negative … … … 2 2 Neutral … … … 3 3 Neutral … … … 4 4 Negative … … …
[4 rows x 18 columns]
- ecg_findpeaks(ecg_cleaned, sampling_rate=1000, method='neurokit', show=False)[source]¶
Find R-peaks in an ECG signal.
Low-level function used by ecg_peaks() to identify R-peaks in an ECG signal using a different set of algorithms. See ecg_peaks() for details.
- Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.
method (string) – The algorithm to be used for R-peak detection. Can be one of ‘neurokit’ (default), ‘pantompkins1985’, ‘hamilton2002’, ‘christov2004’, ‘gamboa2008’, ‘elgendi2010’, ‘engzeemod2012’, ‘kalidas2017’, ‘martinez2003’, ‘rodrigues2021’ or ‘promac’.
show (bool) – If True, will return a plot to visualizing the thresholds used in the algorithm. Useful for debugging.
- Returns
info (dict) – A dictionary containing additional information, in this case the samples at which R-peaks occur, accessible with the key “ECG_R_Peaks”.
See also
ecg_clean
,signal_fixpeaks
,ecg_peaks
,ecg_rate
,ecg_process
,ecg_plot
Examples
>>> import neurokit2 as nk >>> >>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000) >>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000) >>> info = nk.ecg_findpeaks(cleaned) >>> nk.events_plot(info["ECG_R_Peaks"], cleaned) <Figure ...>
>>> >>> # Different methods >>> neurokit = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="neurokit"), method="neurokit") >>> pantompkins1985 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="pantompkins1985"), method="pantompkins1985") >>> nabian2018 = nk.ecg_findpeaks(cleaned, method="nabian2018") >>> hamilton2002 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="hamilton2002"), method="hamilton2002") >>> martinez2003 = nk.ecg_findpeaks(cleaned, method="martinez2003") >>> christov2004 = nk.ecg_findpeaks(cleaned, method="christov2004") >>> gamboa2008 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="gamboa2008"), method="gamboa2008") >>> elgendi2010 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="elgendi2010"), method="elgendi2010") >>> engzeemod2012 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="engzeemod2012"), method="engzeemod2012") >>> kalidas2017 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="kalidas2017"), method="kalidas2017") >>> rodrigues2021 = nk.ecg_findpeaks(cleaned, method="rodrigues2021") >>> >>> # Visualize >>> nk.events_plot([neurokit["ECG_R_Peaks"], ... pantompkins1985["ECG_R_Peaks"], ... nabian2018["ECG_R_Peaks"], ... hamilton2002["ECG_R_Peaks"], ... christov2004["ECG_R_Peaks"], ... gamboa2008["ECG_R_Peaks"], ... elgendi2010["ECG_R_Peaks"], ... engzeemod2012["ECG_R_Peaks"], ... kalidas2017["ECG_R_Peaks"], ... martinez2003["ECG_R_Peaks"], ... rodrigues2021["ECG_R_Peaks"]], cleaned) <Figure ...> >>> >>> # Method-agreement >>> ecg = nk.ecg_simulate(duration=10, sampling_rate=500) >>> ecg = nk.signal_distort(ecg, ... sampling_rate=500, ... noise_amplitude=0.2, noise_frequency=[25, 50], ... artifacts_amplitude=0.2, artifacts_frequency=50) >>> nk.ecg_findpeaks(ecg, sampling_rate=1000, method="promac", show=True) {'ECG_R_Peaks': array(...)}
References
Rodrigues, Tiago & Samoutphonh, Sirisack & Plácido da Silva, Hugo & Fred, Ana. (2021). A Low-Complexity R-peak Detection Algorithm with Adaptive Thresholding for Wearable Devices.
Gamboa, H. (2008). Multi-modal behavioral biometrics based on hci and electrophysiology. PhD ThesisUniversidade.
Zong, W., Heldt, T., Moody, G. B., & Mark, R. G. (2003). An open-source algorithm to detect onset of arterial blood pressure pulses. In Computers in Cardiology, 2003 (pp. 259-262). IEEE.
Hamilton, P. (2002, September). Open source ECG analysis. In Computers in cardiology (pp. 101-104). IEEE.
Pan, J., & Tompkins, W. J. (1985). A real-time QRS detection algorithm. IEEE transactions on biomedical engineering, (3), 230-236.
Engelse, W. A. H., & Zeelenberg, C. (1979). A single scan algorithm for QRS detection and feature extraction IEEE Comput Cardiol. Long Beach: IEEE Computer Society.
Lourenço, A., Silva, H., Leite, P., Lourenço, R., & Fred, A. L. (2012). Real Time Electrocardiogram Segmentation for Finger based ECG Biometrics. In Biosignals (pp. 49-54).
Nabian, M., Yin, Y., Wormwood, J., Quigley, K. S., Barrett, L. F., Ostadabbas, S. (2018). An Open-Source Feature Extraction Tool for the Analysis of Peripheral Physiological Data. IEEE Journal of Translational Engineering in Health and Medicine, 6, 1-11. doi:10.1109/jtehm.2018.2878000
Performs ECG analysis on longer periods of data (typically > 10 seconds), such as resting-state data.
- Parameters
data (Union[dict, pd.DataFrame]) – A DataFrame containing the different processed signal(s) as different columns, typically generated by ecg_process() or bio_process(). Can also take a dict containing sets of separately processed DataFrames.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
- Returns
DataFrame – A dataframe containing the analyzed ECG features. The analyzed features consist of the following:
”ECG_Rate_Mean”: the mean heart rate.
”ECG_HRV”: the different heart rate variability metrices.
See hrv_summary() docstrings for details.
See also
bio_process
,ecg_eventrelated
Examples
>>> import neurokit2 as nk >>> >>> # Download data >>> data = nk.data("bio_resting_5min_100hz") >>> >>> # Process the data >>> df, info = nk.ecg_process(data["ECG"], sampling_rate=100) >>> >>> # Single dataframe is passed >>> nk.ecg_intervalrelated(df, sampling_rate=100) ECG_Rate_Mean HRV_MeanNN ... 0 ...
>>> epochs = nk.epochs_create(df, events=[0, 15000], sampling_rate=100, ... epochs_end=150) >>> nk.ecg_intervalrelated(epochs) Label ECG_Rate_Mean ... 1 ...
…
- ecg_peaks(ecg_cleaned, sampling_rate=1000, method='neurokit', correct_artifacts=False)[source]¶
Find R-peaks in an ECG signal.
Find R-peaks in an ECG signal using the specified method.
- Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.
method (string) – The algorithm to be used for R-peak detection. Can be one of ‘neurokit’ (default), ‘pantompkins1985’, ‘hamilton2002’, ‘christov2004’, ‘gamboa2008’, ‘elgendi2010’, ‘engzeemod2012’ or ‘kalidas2017’.
correct_artifacts (bool) – Whether or not to identify artifacts as defined by Jukka A. Lipponen & Mika P. Tarvainen (2019): A robust algorithm for heart rate variability time series artefact correction using novel beat classification, Journal of Medical Engineering & Technology, DOI: 10.1080/03091902.2019.1640306.
- Returns
signals (DataFrame) – A DataFrame of same length as the input signal in which occurences of R-peaks marked as “1” in a list of zeros with the same length as ecg_cleaned. Accessible with the keys “ECG_R_Peaks”.
info (dict) – A dictionary containing additional information, in this case the samples at which R-peaks occur, accessible with the key “ECG_R_Peaks”, as well as the signals’ sampling rate.
See also
ecg_clean
,ecg_findpeaks
,ecg_process
,ecg_plot
,signal_rate
,signal_fixpeaks
Examples
>>> import neurokit2 as nk >>> >>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000) >>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000) >>> signals, info = nk.ecg_peaks(cleaned, correct_artifacts=True) >>> nk.events_plot(info["ECG_R_Peaks"], cleaned) <Figure ...>
References
Gamboa, H. (2008). Multi-modal behavioral biometrics based on hci and electrophysiology. PhD ThesisUniversidade.
W. Zong, T. Heldt, G.B. Moody, and R.G. Mark. An open-source algorithm to detect onset of arterial blood pressure pulses. In Computers in Cardiology, 2003, pages 259–262, 2003.
Hamilton, Open Source ECG Analysis Software Documentation, E.P.Limited, 2002.
Jiapu Pan and Willis J. Tompkins. A Real-Time QRS Detection Algorithm. In: IEEE Transactions on Biomedical Engineering BME-32.3 (1985), pp. 230–236.
C. Zeelenberg, A single scan algorithm for QRS detection and feature extraction, IEEE Comp. in Cardiology, vol. 6, pp. 37-42, 1979
A. Lourenco, H. Silva, P. Leite, R. Lourenco and A. Fred, “Real Time Electrocardiogram Segmentation for Finger Based ECG Biometrics”, BIOSIGNALS 2012, pp. 49-54, 2012.
- ecg_phase(ecg_cleaned, rpeaks=None, delineate_info=None, sampling_rate=None)[source]¶
Compute cardiac phase (for both atrial and ventricular).
Finds the cardiac phase, labelled as 1 for systole and 0 for diastole.
- Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().
rpeaks (list or array or DataFrame or Series or dict) – The samples at which the different ECG peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with ecg_findpeaks() or ecg_peaks().
delineate_info (dict) – A dictionary containing additional information of ecg delineation and can be obtained with ecg_delineate().
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to None.
- Returns
signals (DataFrame) – A DataFrame of same length as ecg_signal containing the following columns:
”ECG_Phase_Atrial”: cardiac phase, marked by “1” for systole and “0” for diastole.
”ECG_Phase_Completion_Atrial”: cardiac phase (atrial) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.
”ECG_Phase_Ventricular”: cardiac phase, marked by “1” for systole and “0” for diastole.
”ECG_Phase_Completion_Ventricular”: cardiac phase (ventricular) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.
See also
Examples
>>> import neurokit2 as nk >>> >>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000) >>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000) >>> _, rpeaks = nk.ecg_peaks(cleaned) >>> signals, waves = nk.ecg_delineate(cleaned, rpeaks, sampling_rate=1000) >>> >>> cardiac_phase = nk.ecg_phase(ecg_cleaned=cleaned, rpeaks=rpeaks, ... delineate_info=waves, sampling_rate=1000) >>> nk.signal_plot([cleaned, cardiac_phase], standardize=True)
- ecg_plot(ecg_signals, rpeaks=None, sampling_rate=None, show_type='default')[source]¶
Visualize ECG data.
- Parameters
ecg_signals (DataFrame) – DataFrame obtained from ecg_process().
rpeaks (dict) – The samples at which the R-peak occur. Dict returned by ecg_process(). Defaults to None.
sampling_rate (int) – The sampling frequency of the ECG (in Hz, i.e., samples/second). Needs to be supplied if the data should be plotted over time in seconds. Otherwise the data is plotted over samples. Defaults to None. Must be specified to plot artifacts.
show_type (str) – Visualize the ECG data with ‘default’ or visualize artifacts thresholds with ‘artifacts’ produced by ecg_fixpeaks(), or ‘full’ to visualize both.
- Returns
fig – Figure representing a plot of the processed ecg signals (and peak artifacts).
Examples
>>> import neurokit2 as nk >>> >>> ecg = nk.ecg_simulate(duration=15, sampling_rate=1000, heart_rate=80) >>> signals, info = nk.ecg_process(ecg, sampling_rate=1000) >>> nk.ecg_plot(signals, sampling_rate=1000, show_type='default') <Figure ...>
See also
- ecg_process(ecg_signal, sampling_rate=1000, method='neurokit')[source]¶
Process an ECG signal.
Convenience function that automatically processes an ECG signal.
- Parameters
ecg_signal (Union[list, np.array, pd.Series]) – The raw ECG channel.
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.
method (str) – The processing pipeline to apply. Defaults to “neurokit”.
- Returns
signals (DataFrame) – A DataFrame of the same length as the ecg_signal containing the following columns:
”ECG_Raw”: the raw signal.
”ECG_Clean”: the cleaned signal.
”ECG_R_Peaks”: the R-peaks marked as “1” in a list of zeros.
”ECG_Rate”: heart rate interpolated between R-peaks.
”ECG_P_Peaks”: the P-peaks marked as “1” in a list of zeros
”ECG_Q_Peaks”: the Q-peaks marked as “1” in a list of zeros .
”ECG_S_Peaks”: the S-peaks marked as “1” in a list of zeros.
”ECG_T_Peaks”: the T-peaks marked as “1” in a list of zeros.
”ECG_P_Onsets”: the P-onsets marked as “1” in a list of zeros.
- ”ECG_P_Offsets”: the P-offsets marked as “1” in a list of zeros
(only when method in ecg_delineate is wavelet).
- ”ECG_T_Onsets”: the T-onsets marked as “1” in a list of zeros
(only when method in ecg_delineate is wavelet).
”ECG_T_Offsets”: the T-offsets marked as “1” in a list of zeros.
- ”ECG_R_Onsets”: the R-onsets marked as “1” in a list of zeros
(only when method in ecg_delineate is wavelet).
- ”ECG_R_Offsets”: the R-offsets marked as “1” in a list of zeros
(only when method in ecg_delineate is wavelet).
”ECG_Phase_Atrial”: cardiac phase, marked by “1” for systole and “0” for diastole.
”ECG_Phase_Ventricular”: cardiac phase, marked by “1” for systole and “0” for diastole.
”ECG_Atrial_PhaseCompletion”: cardiac phase (atrial) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.
”ECG_Ventricular_PhaseCompletion”: cardiac phase (ventricular) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.
info (dict) – A dictionary containing the samples at which the R-peaks occur, accessible with the key “ECG_Peaks”, as well as the signals’ sampling rate.
See also
ecg_clean
,ecg_findpeaks
,ecg_plot
,signal_rate
,signal_fixpeaks
Examples
>>> import neurokit2 as nk >>> >>> ecg = nk.ecg_simulate(duration=15, sampling_rate=1000, heart_rate=80) >>> signals, info = nk.ecg_process(ecg, sampling_rate=1000) >>> nk.ecg_plot(signals) <Figure ...>
- ecg_quality(ecg_cleaned, rpeaks=None, sampling_rate=1000, method='averageQRS', approach=None)[source]¶
Quality of ECG Signal.
The “averageQRS” method computes a continuous index of quality of the ECG signal, by interpolating the distance of each QRS segment from the average QRS segment present in the data. This index is therefore relative: 1 corresponds to heartbeats that are the closest to the average sample and 0 corresponds to the most distant heartbeat from that average sample. Note that 1 does not necessarily means “good”: if the majority of samples are bad, than being close to the average will likely mean bad as well. Use this index with care and plot it alongside your ECG signal to see if it makes sense.
The “zhao2018” method (Zhao et la., 2018) extracts several signal quality indexes (SQIs): QRS wave power spectrum distribution pSQI, kurtosis kSQI, and baseline relative power basSQI. An additional R peak detection match qSQI was originally computed in the paper but left out in this algorithm. The indices were originally weighted with a ratio of [0.4, 0.4, 0.1, 0.1] to generate the final classification outcome, but because qSQI was dropped, the weights have been rearranged to [0.6, 0.2, 0.2] for pSQI, kSQI and basSQI respectively.
- Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG signal in the form of a vector of values.
rpeaks (tuple or list) – The list of R-peak samples returned by ecg_peaks(). If None, peaks is computed from the signal input.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
method (str) – The method for computing ECG signal quality, can be “averageQRS” (default) or “zhao2018”.
approach (str) – The data fusion approach as documented in Zhao et al. (2018). Can be “simple” or “fuzzy”. The former performs simple heuristic fusion of SQIs and the latter performs fuzzy comprehensive evaluation. If None (default), simple heuristic fusion is used.
**kwargs – Keyword arguments to be passed to signal_power() in the computation of basSQI and pSQI.
- Returns
array or str – Vector containing the quality index ranging from 0 to 1 for “averageQRS” method, returns string classification (“Unacceptable”, “Barely Acceptable” or “Excellent”) of the signal for “zhao2018 method”.
See also
References
Zhao, Z., & Zhang, Y. (2018). “SQI quality evaluation mechanism of single-lead ECG signal based on simple heuristic fusion and fuzzy comprehensive evaluation”. Frontiers in Physiology, 9, 727.
Examples
>>> import neurokit2 as nk >>> >>> ecg = nk.ecg_simulate(duration=30, sampling_rate=300, noise=0.2) >>> ecg_cleaned = nk.ecg_clean(ecg, sampling_rate=300) >>> quality = nk.ecg_quality(ecg_cleaned, sampling_rate=300) >>> >>> nk.signal_plot([ecg_cleaned, quality], standardize=True) >>> >>> # Zhao et al. (2018) method >>> quality = nk.ecg_quality(ecg_cleaned, sampling_rate=300, method="zhao2018", approach="fuzzy")
- ecg_rate(peaks, sampling_rate=1000, desired_length=None, interpolation_method='monotone_cubic')¶
Calculate signal rate from a series of peaks.
This function can also be called either via
ecg_rate()
,`ppg_rate()
orrsp_rate()
(aliases provided for consistency).- Parameters
peaks (Union[list, np.array, pd.DataFrame, pd.Series, dict]) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed it is of the same length as the input signal in which occurrences of R-peaks are marked as “1”, with such containers obtained with e.g., ecg_findpeaks() or rsp_findpeaks().
sampling_rate (int) – The sampling frequency of the signal that contains peaks (in Hz, i.e., samples/second). Defaults to 1000.
desired_length (int) – If left at the default None, the returned rated will have the same number of elements as peaks. If set to a value larger than the sample at which the last peak occurs in the signal (i.e., peaks[-1]), the returned rate will be interpolated between peaks over desired_length samples. To interpolate the rate over the entire duration of the signal, set desired_length to the number of samples in the signal. Cannot be smaller than or equal to the sample at which the last peak occurs in the signal. Defaults to None.
interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.
- Returns
array – A vector containing the rate.
See also
signal_findpeaks
,signal_fixpeaks
,signal_plot
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1) >>> info = nk.signal_findpeaks(signal) >>> >>> rate = nk.signal_rate(peaks=info["Peaks"], desired_length=len(signal)) >>> fig = nk.signal_plot(rate) >>> fig
- ecg_rsp(ecg_rate, sampling_rate=1000, method='vangent2019')[source]¶
Extract ECG Derived Respiration (EDR).
This implementation is far from being complete, as the information in the related papers prevents me from getting a full understanding of the procedure. Help is required!
- Parameters
ecg_rate (array) – The heart rate signal as obtained via ecg_rate().
sampling_rate (int) – The sampling frequency of the signal that contains the R-peaks (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – Can be one of ‘vangent2019’ (default), ‘soni2019’, ‘charlton2016’ or ‘sarkar2015’.
- Returns
array – A Numpy array containing the heart rate.
Examples
>>> import neurokit2 as nk >>> import pandas as pd >>> >>> # Get heart rate >>> data = nk.data("bio_eventrelated_100hz") >>> rpeaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100) >>> ecg_rate = nk.signal_rate(rpeaks, sampling_rate=100, desired_length=len(rpeaks)) >>> >>> >>> # Get ECG Derived Respiration (EDR) >>> edr = nk.ecg_rsp(ecg_rate, sampling_rate=100) >>> nk.standardize(pd.DataFrame({"EDR": edr, "RSP": data["RSP"]})).plot() <AxesSubplot:> >>> >>> # Method comparison (the closer to 0 the better) >>> nk.standardize(pd.DataFrame({"True RSP": data["RSP"], ... "vangent2019": nk.ecg_rsp(ecg_rate, sampling_rate=100, method="vangent2019"), ... "sarkar2015": nk.ecg_rsp(ecg_rate, sampling_rate=100, method="sarkar2015"), ... "charlton2016": nk.ecg_rsp(ecg_rate, sampling_rate=100, method="charlton2016"), ... "soni2019": nk.ecg_rsp(ecg_rate, sampling_rate=100, ... method="soni2019")})).plot() <AxesSubplot:>
References
van Gent, P., Farah, H., van Nes, N., & van Arem, B. (2019). HeartPy: A novel heart rate algorithm for the analysis of noisy signals. Transportation research part F: traffic psychology and behaviour, 66, 368-378.
Sarkar, S., Bhattacherjee, S., & Pal, S. (2015). Extraction of respiration signal from ECG for respiratory rate estimation.
Charlton, P. H., Bonnici, T., Tarassenko, L., Clifton, D. A., Beale, R., & Watkinson, P. J. (2016). An assessment of algorithms to estimate respiratory rate from the electrocardiogram and photoplethysmogram. Physiological measurement, 37(4), 610.
Soni, R., & Muniyandi, M. (2019). Breath rate variability: a novel measure to study the meditation effects. International Journal of Yoga, 12(1), 45.
- ecg_segment(ecg_cleaned, rpeaks=None, sampling_rate=1000, show=False)[source]¶
Segment an ECG signal into single heartbeats.
- Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().
rpeaks (dict) – The samples at which the R-peaks occur. Dict returned by ecg_peaks(). Defaults to None.
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.
show (bool) – If True, will return a plot of heartbeats. Defaults to False.
- Returns
dict – A dict containing DataFrames for all segmented heartbeats.
Examples
>>> import neurokit2 as nk >>> >>> ecg = nk.ecg_simulate(duration=15, sampling_rate=1000, heart_rate=80) >>> ecg_cleaned = nk.ecg_clean(ecg, sampling_rate=1000) >>> nk.ecg_segment(ecg_cleaned, rpeaks=None, sampling_rate=1000, show=True) {'1': Signal Index Label ... '2': Signal Index Label ... '19': Signal Index Label ...}
- ecg_simulate(duration=10, length=None, sampling_rate=1000, noise=0.01, heart_rate=70, heart_rate_std=1, method='ecgsyn', random_state=None)[source]¶
Simulate an ECG/EKG signal.
Generate an artificial (synthetic) ECG signal of a given duration and sampling rate using either the ECGSYN dynamical model (McSharry et al., 2003) or a simpler model based on Daubechies wavelets to roughly approximate cardiac cycles.
- Parameters
duration (int) – Desired recording length in seconds.
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).
length (int) – The desired length of the signal (in samples).
noise (float) – Noise level (amplitude of the laplace noise).
heart_rate (int) – Desired simulated heart rate (in beats per minute). The default is 70. Note that for the ECGSYN method, random fluctuations are to be expected to mimick a real heart rate. These fluctuations can cause some slight discrepancies between the requested heart rate and the empirical heart rate, especially for shorter signals.
heart_rate_std (int) – Desired heart rate standard deviation (beats per minute).
method (str) – The model used to generate the signal. Can be ‘simple’ for a simulation based on Daubechies wavelets that roughly approximates a single cardiac cycle. If ‘ecgsyn’ (default), will use an advanced model desbribed McSharry et al. (2003).
random_state (int) – Seed for the random number generator.
- Returns
array – Vector containing the ECG signal.
Examples
>>> import pandas as pd >>> import neurokit2 as nk >>> >>> ecg1 = nk.ecg_simulate(duration=10, method="simple") >>> ecg2 = nk.ecg_simulate(duration=10, method="ecgsyn") >>> pd.DataFrame({"ECG_Simple": ecg1, ... "ECG_Complex": ecg2}).plot(subplots=True) array([<AxesSubplot:>, <AxesSubplot:>], dtype=object)
See also
rsp_simulate
,eda_simulate
,ppg_simulate
,emg_simulate
References
McSharry, P. E., Clifford, G. D., Tarassenko, L., & Smith, L. A. (2003). A dynamical model for
generating synthetic electrocardiogram signals. IEEE transactions on biomedical engineering, 50(3), 289-294. - https://github.com/diarmaidocualain/ecg_simulation
PPG¶
Submodule for NeuroKit.
- ppg_analyze(data, sampling_rate=1000, method='auto')[source]¶
Performs PPG analysis on either epochs (event-related analysis) or on longer periods of data such as resting- state data.
- Parameters
data (Union[dict, pd.DataFrame]) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by ppg_process() or bio_process(). Can also take a dict containing sets of separate periods of data.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’ where the right method will be chosen based on the mean duration of the data (‘event-related’ for duration under 10s).
- Returns
DataFrame – A dataframe containing the analyzed PPG features. If event-related analysis is conducted, each epoch is indicated by the Label column. See ppg_eventrelated() and ppg_intervalrelated() docstrings for details.
See also
bio_process
,ppg_process
,epochs_create
,ppg_eventrelated
,ppg_intervalrelated
Examples
>>> import neurokit2 as nk >>> >>> # Example 1: Simulate data for event-related analysis >>> ppg = nk.ppg_simulate(duration=20, sampling_rate=1000) >>> >>> # Process data >>> ppg_signals, info = nk.ppg_process(ppg, sampling_rate=1000) >>> epochs = nk.epochs_create(ppg, events=[5000, 10000, 15000], ... epochs_start=-0.1, epochs_end=1.9) >>> >>> # Analyze >>> analyze_epochs = nk.ppg_analyze(epochs, sampling_rate=1000) >>> analyze_epochs
>>> >>> # Example 2: Download the resting-state data >>> data = nk.data("bio_resting_5min_100hz") >>> >>> # Process the data >>> df, info = nk.ppg_process(data["PPG"], sampling_rate=100) >>> >>> # Analyze >>> analyze_df = nk.ppg_analyze(df, sampling_rate=100) >>> analyze_df
- ppg_clean(ppg_signal, sampling_rate=1000, heart_rate=None, method='elgendi')[source]¶
Clean a photoplethysmogram (PPG) signal.
Prepare a raw PPG signal for systolic peak detection.
- Parameters
ppg_signal (Union[list, np.array, pd.Series]) – The raw PPG channel.
heart_rate (Union[int, float]) – The heart rate of the PPG signal. Applicable only if method is “nabian2018” to check that filter frequency is appropriate.
sampling_rate (int) – The sampling frequency of the PPG (in Hz, i.e., samples/second). The default is 1000.
method (str) – The processing pipeline to apply. Can be one of “elgendi” or “nabian2018”. The default is “elgendi”.
- Returns
clean (array) – A vector containing the cleaned PPG.
See also
Examples
>>> import neurokit2 as nk >>> import pandas as pd >>> import matplotlib.pyplot as plt >>> >>> # Simulate and clean signal >>> ppg = nk.ppg_simulate(heart_rate=75, duration=30) >>> ppg_elgendi = nk.ppg_clean(ppg, method='elgendi') >>> ppg_nabian = nk.ppg_clean(ppg, method='nabian2018', heart_rate=75) >>> >>> # Plot and compare methods >>> signals = pd.DataFrame({"PPG_Raw" : ppg, ... "PPG_Elgendi" : ppg_elgendi, ... "PPG_Nabian" : ppg_nabian}) >>> signals.plot() <AxesSubplot:>
References
Nabian, M., Yin, Y., Wormwood, J., Quigley, K. S., Barrett, L. F., & Ostadabbas, S. (2018).
An Open-Source Feature Extraction Tool for the Analysis of Peripheral Physiological Data. IEEE Journal of Translational Engineering in Health and Medicine, 6, 1-11. doi:10.1109/jtehm.2018.2878000
Performs event-related PPG analysis on epochs.
- Parameters
epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().
silent (bool) – If True, silence possible warnings.
- Returns
DataFrame – A dataframe containing the analyzed PPG features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following:
”PPG_Rate_Baseline”: the baseline heart rate (at stimulus onset).
”PPG_Rate_Max”: the maximum heart rate after stimulus onset.
”PPG_Rate_Min”: the minimum heart rate after stimulus onset.
”PPG_Rate_Mean”: the mean heart rate after stimulus onset.
”PPG_Rate_SD”: the standard deviation of the heart rate after stimulus onset.
”PPG_Rate_Max_Time”: the time at which maximum heart rate occurs.
”PPG_Rate_Min_Time”: the time at which minimum heart rate occurs.
We also include the following experimental features related to the parameters of a quadratic model:
”PPG_Rate_Trend_Linear”: The parameter corresponding to the linear trend.
”PPG_Rate_Trend_Quadratic”: The parameter corresponding to the curvature.
”PPG_Rate_Trend_R2”: the quality of the quadratic model. If too low, the parameters might not be reliable or meaningful.
See also
events_find
,epochs_create
,ppg_process
Examples
>>> import neurokit2 as nk >>> >>> # Example with simulated data >>> ppg, info = nk.ppg_process(nk.ppg_simulate(duration=20)) >>> >>> # Process the data >>> epochs = nk.epochs_create(ppg, events=[5000, 10000, 15000], ... epochs_start=-0.1, epochs_end=1.9) >>> nk.ppg_eventrelated(epochs) Label Event_Onset ... PPG_Rate_Trend_Linear PPG_Rate_Trend_R2 1 1 ... ... ... ... 2 2 ... ... ... ... 3 3 ... ... ... ...
[3 rows x 12 columns]
- ppg_findpeaks(ppg_cleaned, sampling_rate=1000, method='elgendi', show=False)[source]¶
Find systolic peaks in a photoplethysmogram (PPG) signal.
- Parameters
ppg_cleaned (Union[list, np.array, pd.Series]) – The cleaned PPG channel as returned by ppg_clean().
sampling_rate (int) – The sampling frequency of the PPG (in Hz, i.e., samples/second). The default is 1000.
method (str) – The processing pipeline to apply. Can be one of “elgendi”. The default is “elgendi”.
show (bool) – If True, returns a plot of the thresholds used during peak detection. Useful for debugging. The default is False.
- Returns
info (dict) – A dictionary containing additional information, in this case the samples at which systolic peaks occur, accessible with the key “PPG_Peaks”.
See also
Examples
>>> import neurokit2 as nk >>> import matplotlib.pyplot as plt >>> >>> ppg = nk.ppg_simulate(heart_rate=75, duration=30) >>> ppg_clean = nk.ppg_clean(ppg) >>> info = nk.ppg_findpeaks(ppg_clean) >>> peaks = info["PPG_Peaks"] >>> >>> plt.plot(ppg, label="raw PPG") >>> plt.plot(ppg_clean, label="clean PPG") >>> plt.scatter(peaks, ppg[peaks], c="r", label="systolic peaks") >>> plt.legend()
References
Elgendi M, Norton I, Brearley M, Abbott D, Schuurmans D (2013) Systolic Peak Detection in
Acceleration Photoplethysmograms Measured from Emergency Responders in Tropical Conditions. PLoS ONE 8(10): e76585. doi:10.1371/journal.pone.0076585.
Performs PPG analysis on longer periods of data (typically > 10 seconds), such as resting-state data.
- Parameters
data (Union[dict, pd.DataFrame]) – A DataFrame containing the different processed signal(s) as different columns, typically generated by ppg_process(). Can also take a dict containing sets of separately processed DataFrames.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
- Returns
DataFrame – A dataframe containing the analyzed ECG features. The analyzed features consist of the following:
”PPG_Rate_Mean”: the mean heart rate.
”ECG_HRV”: the different heart rate variability metrices.
See hrv_summary() docstrings for details.
See also
Examples
>>> import neurokit2 as nk >>> >>> # Download data >>> data = nk.data("bio_resting_5min_100hz") >>> >>> # Process the data >>> df, info = nk.ppg_process(data["PPG"], sampling_rate=100) >>> >>> # Single dataframe is passed >>> nk.ppg_intervalrelated(df, sampling_rate=100) PPG_Rate_Mean HRV_MeanNN ... 0 ...
>>> epochs = nk.epochs_create(df, events=[0, 15000], sampling_rate=100, ... epochs_end=150) >>> nk.ppg_intervalrelated(epochs) Label PPG_Rate_Mean ... 1 ...
…
- ppg_plot(ppg_signals, sampling_rate=None)[source]¶
Visualize photoplethysmogram (PPG) data.
- Parameters
ppg_signals (DataFrame) – DataFrame obtained from ppg_process().
sampling_rate (int) – The sampling frequency of the PPG (in Hz, i.e., samples/second). Needs to be supplied if the data should be plotted over time in seconds. Otherwise the data is plotted over samples. Defaults to None.
- Returns
fig – Figure representing a plot of the processed PPG signals.
Examples
>>> import neurokit2 as nk >>> >>> # Simulate data >>> ppg = nk.ppg_simulate(duration=10, sampling_rate=1000, heart_rate=70) >>> >>> # Process signal >>> signals, info = nk.ppg_process(ppg, sampling_rate=1000) >>> >>> # Plot >>> nk.ppg_plot(signals) <Figure ...>
See also
- ppg_process(ppg_signal, sampling_rate=1000, **kwargs)[source]¶
Process a photoplethysmogram (PPG) signal.
Convenience function that automatically processes a photoplethysmogram signal.
- Parameters
ppg_signal (Union[list, np.array, pd.Series]) – The raw PPG channel.
sampling_rate (int) – The sampling frequency of emg_signal (in Hz, i.e., samples/second).
- Returns
signals (DataFrame) – A DataFrame of same length as emg_signal containing the following columns: - “PPG_Raw”: the raw signal. - “PPG_Clean”: the cleaned signal. - “PPG_Rate”: the heart rate as measured based on PPG peaks. - “PPG_Peaks”: the PPG peaks marked as “1” in a list of zeros.
info (dict) – A dictionary containing the information of peaks and the signals’ sampling rate.
See also
Examples
>>> import neurokit2 as nk >>> >>> ppg = nk.ppg_simulate(duration=10, sampling_rate=1000, heart_rate=70) >>> signals, info = nk.ppg_process(ppg, sampling_rate=1000) >>> fig = nk.ppg_plot(signals) >>> fig
- ppg_rate(peaks, sampling_rate=1000, desired_length=None, interpolation_method='monotone_cubic')¶
Calculate signal rate from a series of peaks.
This function can also be called either via
ecg_rate()
,`ppg_rate()
orrsp_rate()
(aliases provided for consistency).- Parameters
peaks (Union[list, np.array, pd.DataFrame, pd.Series, dict]) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed it is of the same length as the input signal in which occurrences of R-peaks are marked as “1”, with such containers obtained with e.g., ecg_findpeaks() or rsp_findpeaks().
sampling_rate (int) – The sampling frequency of the signal that contains peaks (in Hz, i.e., samples/second). Defaults to 1000.
desired_length (int) – If left at the default None, the returned rated will have the same number of elements as peaks. If set to a value larger than the sample at which the last peak occurs in the signal (i.e., peaks[-1]), the returned rate will be interpolated between peaks over desired_length samples. To interpolate the rate over the entire duration of the signal, set desired_length to the number of samples in the signal. Cannot be smaller than or equal to the sample at which the last peak occurs in the signal. Defaults to None.
interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.
- Returns
array – A vector containing the rate.
See also
signal_findpeaks
,signal_fixpeaks
,signal_plot
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1) >>> info = nk.signal_findpeaks(signal) >>> >>> rate = nk.signal_rate(peaks=info["Peaks"], desired_length=len(signal)) >>> fig = nk.signal_plot(rate) >>> fig
- ppg_simulate(duration=120, sampling_rate=1000, heart_rate=70, frequency_modulation=0.2, ibi_randomness=0.1, drift=0, motion_amplitude=0.1, powerline_amplitude=0.01, burst_number=0, burst_amplitude=1, random_state=None, show=False)[source]¶
Simulate a photoplethysmogram (PPG) signal.
Phenomenological approximation of PPG. The PPG wave is described with four landmarks: wave onset, location of the systolic peak, location of the dicrotic notch and location of the diastolic peaks. These landmarks are defined as x and y coordinates (in a time series). These coordinates are then interpolated at the desired sampling rate to obtain the PPG signal.
- Parameters
duration (int) – Desired recording length in seconds. The default is 120.
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second). The default is 1000.
heart_rate (int) – Desired simulated heart rate (in beats per minute). The default is 70. Note that for the ECGSYN method, random fluctuations are to be expected to mimick a real heart rate. These fluctuations can cause some slight discrepancies between the requested heart rate and the empirical heart rate, especially for shorter signals.
frequency_modulation (float) – Float between 0 and 1. Determines how pronounced respiratory sinus arrythmia (RSA) is (0 corresponds to absence of RSA). The default is 0.3.
ibi_randomness (float) – Float between 0 and 1. Determines how much random noise there is in the duration of each PPG wave (0 corresponds to absence of variation). The default is 0.1.
drift (float) – Float between 0 and 1. Determines how pronounced the baseline drift (.05 Hz) is (0 corresponds to absence of baseline drift). The default is 1.
motion_amplitude (float) – Float between 0 and 1. Determines how pronounced the motion artifact (0.5 Hz) is (0 corresponds to absence of motion artifact). The default is 0.1.
powerline_amplitude (float) – Float between 0 and 1. Determines how pronounced the powerline artifact (50 Hz) is (0 corresponds to absence of powerline artifact). Note that powerline_amplitude > 0 is only possible if ‘sampling_rate’ is >= 500. The default is 0.1.
burst_amplitude (float) – Float between 0 and 1. Determines how pronounced high frequency burst artifacts are (0 corresponds to absence of bursts). The default is 1.
burst_number (int) – Determines how many high frequency burst artifacts occur. The default is 0.
show (bool) – If true, returns a plot of the landmarks and interpolated PPG. Useful for debugging.
random_state (int) – Seed for the random number generator. Keep it fixed for reproducible results.
- Returns
ppg (array) – A vector containing the PPG.
See also
ecg_simulate
,rsp_simulate
,eda_simulate
,emg_simulate
Examples
>>> import neurokit2 as nk >>> >>> ppg = nk.ppg_simulate(duration=40, sampling_rate=500, heart_rate=75, random_state=42)
HRV¶
- hrv(peaks, sampling_rate=1000, show=False, **kwargs)[source]¶
Computes indices of Heart Rate Variability (HRV).
Computes HRV indices in the time-, frequency-, and nonlinear domain. Note that a minimum duration of the signal containing the peaks is recommended for some HRV indices to be meaninful. For instance, 1, 2 and 5 minutes of high quality signal are the recomended minima for HF, LF and LF/HF, respectively. See references for details.
- Parameters
peaks (dict) – Samples at which cardiac extrema (i.e., R-peaks, systolic peaks) occur. Can be a list of indices or the output(s) of other functions such as ecg_peaks, ppg_peaks, ecg_process or bio_process.
sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.
show (bool, optional) – If True, returns the plots that are generates for each of the domains.
- Returns
DataFrame – Contains HRV metrics from three domains: - frequency (see hrv_frequency) - time (see hrv_time) - non-linear (see hrv_nonlinear) If RSP data is provided (e.g., output of bio_process): - rsa
Otherwise, to compute ECG-derived respiration, use hrv_rsa If no raw respiratory data is available, users can also choose to use `ecg_rsp <https://neurokit2.readthedocs.io/en/latest/functions.html#neurokit2.ecg.ecg_rsp`_ to obtain ECG-derived respiratory signal, although this is not an ideal procedure.
See also
ecg_peaks
,ppg_peaks
,hrv_time
,hrv_frequency
,hrv_nonlinear
,hrv_rsa
Examples
>>> import neurokit2 as nk >>> >>> # Download data >>> data = nk.data("bio_resting_5min_100hz") >>> >>> # Clean signal and Find peaks >>> ecg_cleaned = nk.ecg_clean(data["ECG"], sampling_rate=100) >>> peaks, info = nk.ecg_peaks(ecg_cleaned, sampling_rate=100, correct_artifacts=True) >>> >>> # Compute HRV indices >>> hrv_indices = nk.hrv(peaks, sampling_rate=100, show=True) >>> hrv_indices >>> >>> # Compute HRV from processed signals >>> signals, info = nk.bio_process(data, sampling_rate=100) >>> hrv = nk.hrv(signals, sampling_rate=100, show=True) >>> hrv
References
Pham, T., Lau, Z. J., Chen, S. H. A., & Makowski, D. (2021). Heart Rate Variability in Psychology:
A Review of HRV Indices and an Analysis Tutorial. Sensors, 21(12), 3998. https://doi:10.3390/s21123998
Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac
electrophysiology review, 6(3), 239-244.
Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms.
Frontiers in public health, 5, 258.
- hrv_frequency(peaks, sampling_rate=1000, ulf=(0, 0.0033), vlf=(0.0033, 0.04), lf=(0.04, 0.15), hf=(0.15, 0.4), vhf=(0.4, 0.5), psd_method='welch', show=False, silent=True, normalize=True, order_criteria=None, **kwargs)[source]¶
Computes frequency-domain indices of Heart Rate Variability (HRV).
Note that a minimum duration of the signal containing the peaks is recommended for some HRV indices to be meaningful. For instance, 1, 2 and 5 minutes of high quality signal are the recomended minima for HF, LF and LF/HF, respectively. See references for details.
- Parameters
peaks (dict) – Samples at which cardiac extrema (i.e., R-peaks, systolic peaks) occur. Can be a list of indices or the output(s) of other functions such as ecg_peaks, ppg_peaks, ecg_process or bio_process.
sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.
ulf (tuple, optional) – Upper and lower limit of the ultra-low frequency band. By default (0, 0.0033).
vlf (tuple, optional) – Upper and lower limit of the very-low frequency band. By default (0.0033, 0.04).
lf (tuple, optional) – Upper and lower limit of the low frequency band. By default (0.04, 0.15).
hf (tuple, optional) – Upper and lower limit of the high frequency band. By default (0.15, 0.4).
vhf (tuple, optional) – Upper and lower limit of the very-high frequency band. By default (0.4, 0.5).
psd_method (str) – Method used for spectral density estimation. For details see signal.signal_power. By default “welch”.
silent (bool) – If False, warnings will be printed. Default to True.
show (bool) – If True, will plot the power in the different frequency bands.
normalize (bool) – Normalization of power by maximum PSD value. Default to True. Normalization allows comparison between different PSD methods.
order_criteria (str) – The criteria to automatically select order in parametric PSD (only used for autoregressive (AR) methods such as ‘burg’). Defaults to None.
**kwargs (optional) – Other arguments.
- Returns
DataFrame – Contains frequency domain HRV metrics: - ULF: The spectral power density pertaining to ultra low frequency band i.e., .0 to .0033 Hz by default. - VLF: The spectral power density pertaining to very low frequency band i.e., .0033 to .04 Hz by default. - LF: The spectral power density pertaining to low frequency band i.e., .04 to .15 Hz by default. - HF: The spectral power density pertaining to high frequency band i.e., .15 to .4 Hz by default. - VHF: The variability, or signal power, in very high frequency i.e., .4 to .5 Hz by default. - LFn: The normalized low frequency, obtained by dividing the low frequency power by the total power. - HFn: The normalized high frequency, obtained by dividing the low frequency power by the total power. - LnHF: The log transformed HF.
See also
ecg_peaks
,ppg_peaks
,hrv_summary
,hrv_time
,hrv_nonlinear
Examples
>>> import neurokit2 as nk >>> >>> # Download data >>> data = nk.data("bio_resting_5min_100hz") >>> >>> # Find peaks >>> peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100) >>> >>> # Compute HRV indices >>> hrv_welch = nk.hrv_frequency(peaks, sampling_rate=100, show=True, psd_method="welch") >>> hrv_burg = nk.hrv_frequency(peaks, sampling_rate=100, show=True, psd_method="burg") >>> hrv_lomb = nk.hrv_frequency(peaks, sampling_rate=100, show=True, psd_method="lomb") >>> hrv_multitapers = nk.hrv_frequency(peaks, sampling_rate=100, show=True, psd_method="multitapers")
References
Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac
electrophysiology review, 6(3), 239-244.
Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms.
Frontiers in public health, 5, 258.
Boardman, A., Schlindwein, F. S., & Rocha, A. P. (2002). A study on the optimum order of
autoregressive models for heart rate variability. Physiological measurement, 23(2), 325.
Bachler, M. (2017). Spectral Analysis of Unevenly Spaced Data: Models and Application in Heart
Rate Variability. Simul. Notes Eur., 27(4), 183-190.
- hrv_nonlinear(peaks, sampling_rate=1000, show=False, **kwargs)[source]¶
Computes nonlinear indices of Heart Rate Variability (HRV).
See references for details.
- Parameters
peaks (dict) – Samples at which cardiac extrema (i.e., R-peaks, systolic peaks) occur. Can be a list of indices or the output(s) of other functions such as ecg_peaks, ppg_peaks, ecg_process or bio_process.
sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.
dfa_windows (list) – A list of tuples containing the number of heartbeats to compute the DFA short term scaling exponent, α1 and the long term scaling exponent, α2, respectively. Defaults to [[4, 11], [12, None]], where α1 is estimated from 4 to 11 heartbeats and α2 is estimated from a larger number of heartbeats, i.e., 11 beats and above, based on Acharya et al. (2002).
show (bool, optional) – If True, will return a Poincaré plot, a scattergram, which plots each RR interval against the next successive one. The ellipse centers around the average RR interval. By default False.
**kwargs (optional) – Other arguments to be passed into fractal_dfa() and fractal_correlation().
- Returns
DataFrame – Contains non-linear HRV metrics:
Characteristics of the Poincaré Plot Geometry:
SD1: SD1 is a measure of the spread of RR intervals on the Poincaré plot
perpendicular to the line of identity. It is an index of short-term RR interval fluctuations, i.e., beat-to-beat variability. It is equivalent (although on another scale) to RMSSD, and therefore it is redundant to report correlations with both (Ciccone, 2017).
SD2: SD2 is a measure of the spread of RR intervals on the Poincaré plot along the
line of identity. It is an index of long-term RR interval fluctuations.
SD1SD2: the ratio between short and long term fluctuations of the RR intervals
(SD1 divided by SD2).
S: Area of ellipse described by SD1 and SD2 (
pi * SD1 * SD2
). It is
proportional to SD1SD2.
CSI: The Cardiac Sympathetic Index (Toichi, 1997), calculated by dividing the
longitudinal variability of the Poincaré plot (
4*SD2
) by its transverse variability (4*SD1
).CVI: The Cardiac Vagal Index (Toichi, 1997), equal to the logarithm of the product of
longitudinal (
4*SD2
) and transverse variability (4*SD1
).CSI_Modified: The modified CSI (Jeppesen, 2014) obtained by dividing the square of
the longitudinal variability by its transverse variability.
Indices of Heart Rate Asymmetry (HRA), i.e., asymmetry of the Poincaré plot (Yan, 2017):
GI: Guzik’s Index, defined as the distance of points above line of identity (LI)
to LI divided by the distance of all points in Poincaré plot to LI except those that are located on LI.
SI: Slope Index, defined as the phase angle of points above LI divided by the
phase angle of all points in Poincaré plot except those that are located on LI.
AI: Area Index, defined as the cumulative area of the sectors corresponding to
the points that are located above LI divided by the cumulative area of sectors corresponding to all points in the Poincaré plot except those that are located on LI.
PI: Porta’s Index, defined as the number of points below LI divided by the total
number of points in Poincaré plot except those that are located on LI.
SD1d and SD1a: short-term variance of contributions of decelerations
(prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively (Piskorski, 2011).
C1d and C1a: the contributions of heart rate decelerations and accelerations
to short-term HRV, respectively (Piskorski, 2011).
SD2d and SD2a: long-term variance of contributions of decelerations
(prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively (Piskorski, 2011).
C2d and C2a: the contributions of heart rate decelerations and accelerations
to long-term HRV, respectively (Piskorski, 2011).
SDNNd and SDNNa: total variance of contributions of decelerations
(prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively (Piskorski, 2011).
Cd and Ca: the total contributions of heart rate decelerations and
accelerations to HRV.
Indices of Heart Rate Fragmentation (Costa, 2017):
PIP: Percentage of inflection points of the RR intervals series.
IALS: Inverse of the average length of the acceleration/deceleration segments.
PSS: Percentage of short segments.
PAS: IPercentage of NN intervals in alternation segments.
Indices of Complexity:
ApEn: The approximate entropy measure of HRV, calculated by entropy_approximate().
SampEn: The sample entropy measure of HRV, calculated by entropy_sample().
ShanEn: The Shannon entropy measure of HRV, calculated by entropy_shannon().
FuzzyEn: The fuzzy entropy measure of HRV, calculated by entropy_fuzzy().
MSE: The multiscale entropy measure of HRV, calculated by entropy_multiscale().
CMSE: The composite multiscale entropy measure of HRV, calculated by entropy_multiscale().
RCMSE: The refined composite multiscale entropy measure of HRV, calculated by entropy_multiscale().
CD: The Correlation Dimension of the HR signal, calculated by fractal_correlation().
HFD: The Higuchi’s Fractal Dimension of the HR signal, calculated by fractal_higuchi().
kmax is set to “default”.
KFD: The Katz’s Fractal Dimension of the HR signal, calculated by fractal_katz().
LZC: The Lempel-Ziv complexity of the HR signal, calculated by fractal_lempelziv().
DFA_alpha1: The monofractal detrended fluctuation analysis of the HR signal corresponding
to short-term correlations, calculated by fractal_dfa().
DFA_alpha2: The monofractal detrended fluctuation analysis of the HR signal corresponding
to long-term correlations, calculated by fractal_dfa().
DFA_alpha1_ExpRange: The multifractal detrended fluctuation analysis of the HR signal
corresponding to short-term correlations, calculated by fractal_dfa(). ExpRange is the range of singularity exponents, correspoinding to the width of the singularity spectrum.
DFA_alpha2_ExpRange: The multifractal detrended fluctuation analysis of the HR signal
corresponding to long-term correlations, calculated by fractal_dfa(). ExpRange is the range of singularity exponents, correspoinding to the width of the singularity spectrum.
DFA_alpha1_ExpMean: Multifractal DFA. ExpMean is the mean of singularity exponents.
DFA_alpha2_ExpMean: Multifractal DFA. ExpMean is the mean of singularity exponents.
DFA_alpha1_DimRange: The multifractal detrended fluctuation analysis of the HR signal
corresponding to short-term correlations, calculated by fractal_dfa(). DimRange is the range of singularity dimensions, correspoinding to the height of the singularity spectrum.
DFA_alpha2_DimRange: The multifractal detrended fluctuation analysis of the HR signal
corresponding to long-term correlations, calculated by fractal_dfa(). DimRange is the range of singularity dimensions, correspoinding to the height of the singularity spectrum.
DFA_alpha1_DimMean: Multifractal DFA. Dimmean is the mean of singularity dimensions.
DFA_alpha2_DimMean: Multifractal DFA. Dimmean is the mean of singularity dimensions.
See also
ecg_peaks
,ppg_peaks
,hrv_frequency
,hrv_time
,hrv_summary
Examples
>>> import neurokit2 as nk >>> >>> # Download data >>> data = nk.data("bio_resting_5min_100hz") >>> >>> # Find peaks >>> peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100) >>> >>> # Compute HRV indices >>> hrv = nk.hrv_nonlinear(peaks, sampling_rate=100, show=True) >>> hrv
References
Yan, C., Li, P., Ji, L., Yao, L., Karmakar, C., & Liu, C. (2017). Area asymmetry of heart
rate variability signal. Biomedical engineering online, 16(1), 112.
Ciccone, A. B., Siedlik, J. A., Wecht, J. M., Deckert, J. A., Nguyen, N. D., & Weir, J. P.
(2017). Reminder: RMSSD and SD1 are identical heart rate variability metrics. Muscle & nerve, 56(4), 674-678.
Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms.
Frontiers in public health, 5, 258.
Costa, M. D., Davis, R. B., & Goldberger, A. L. (2017). Heart rate fragmentation: a new
approach to the analysis of cardiac interbeat interval dynamics. Front. Physiol. 8, 255 (2017).
Jeppesen, J., Beniczky, S., Johansen, P., Sidenius, P., & Fuglsang-Frederiksen, A. (2014).
Using Lorenz plot and Cardiac Sympathetic Index of heart rate variability for detecting seizures for patients with epilepsy. In 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 4563-4566). IEEE.
Piskorski, J., & Guzik, P. (2011). Asymmetric properties of long-term and total heart rate
variability. Medical & biological engineering & computing, 49(11), 1289-1297.
Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac
electrophysiology review, 6(3), 239-244.
Brennan, M. et al. (2001). Do Existing Measures of Poincaré Plot Geometry Reflect Nonlinear
Features of Heart Rate Variability?. IEEE Transactions on Biomedical Engineering, 48(11), 1342-1347.
Toichi, M., Sugiura, T., Murai, T., & Sengoku, A. (1997). A new method of assessing cardiac
autonomic function and its comparison with spectral analysis and coefficient of variation of R–R interval. Journal of the autonomic nervous system, 62(1-2), 79-84.
Acharya, R. U., Lim, C. M., & Joseph, P. (2002). Heart rate variability analysis using
correlation dimension and detrended fluctuation analysis. Itbm-Rbm, 23(6), 333-339.
- hrv_rsa(ecg_signals, rsp_signals=None, rpeaks=None, sampling_rate=1000, continuous=False, window=None, window_number=None)[source]¶
Respiratory Sinus Arrhythmia (RSA)
Respiratory sinus arrhythmia (RSA), also referred to as ‘cardiac coherence’ or ‘physiological coherence’ (though these terms are often encompassing a wider meaning), is the naturally occurring variation in heart rate during the breathing cycle. Metrics to quantify it are often used as a measure of parasympathetic nervous system activity. Neurophysiology informs us that the functional output of the myelinated vagus originating from the nucleus ambiguus has a respiratory rhythm. Thus, there would a temporal relation between the respiratory rhythm being expressed in the firing of these efferent pathways and the functional effect on the heart rate rhythm manifested as RSA. Importantly, several methods exist to quantify RSA:
The Peak-to-trough (P2T) algorithm measures the statistical range in milliseconds of the heart
period oscillation associated with synchronous respiration. Operationally, subtracting the shortest heart period during inspiration from the longest heart period during a breath cycle produces an estimate of RSA during each breath. The peak-to-trough method makes no statistical assumption or correction (e.g., adaptive filtering) regarding other sources of variance in the heart period time series that may confound, distort, or interact with the metric such as slower periodicities and baseline trend. Although it has been proposed that the P2T method “acts as a time-domain filter dynamically centered at the exact ongoing respiratory frequency” (Grossman, 1992), the method does not transform the time series in any way, as a filtering process would. Instead the method uses knowledge of the ongoing respiratory cycle to associate segments of the heart period time series with either inhalation or exhalation (Lewis, 2012).
The Porges-Bohrer (PB) algorithm assumes that heart period time series reflect the sum of several
component time series. Each of these component time series may be mediated by different neural mechanisms and may have different statistical features. The Porges-Bohrer method applies an algorithm that selectively extracts RSA, even when the periodic process representing RSA is superimposed on a complex baseline that may include aperiodic and slow periodic processes. Since the method is designed to remove sources of variance in the heart period time series other than the variance within the frequency band of spontaneous breathing, the method is capable of accurately quantifying RSA when the signal to noise ratio is low.
- Parameters
ecg_signals (DataFrame) – DataFrame obtained from ecg_process(). Should contain columns ECG_Rate and ECG_R_Peaks. Can also take a DataFrame comprising of both ECG and RSP signals, generated by bio_process().
rsp_signals (DataFrame) – DataFrame obtained from rsp_process(). Should contain columns RSP_Phase and RSP_PhaseCompletion. No impact when a DataFrame comprising of both the ECG and RSP signals are passed as ecg_signals. Defaults to None.
rpeaks (dict) – The samples at which the R-peaks of the ECG signal occur. Dict returned by ecg_peaks(), ecg_process(), or bio_process(). Defaults to None.
sampling_rate (int) – The sampling frequency of signals (in Hz, i.e., samples/second).
continuous (bool) – If False, will return RSA properties computed from the data (one value per index). If True, will return continuous estimations of RSA of the same length as the signal. See below for more details.
window (int) – For calculating RSA second by second. Length of each segment in seconds. If None (default), window will be set at 32 seconds.
window_number (int) – Between 2 and 8. For caculating RSA second by second. Number of windows to be calculated in Peak Matched Multiple Window. If None (default), window_number will be set at 8.
- Returns
rsa (dict) – A dictionary containing the RSA features, which includes:
”RSA_P2T_Values”: the estimate of RSA during each breath cycle, produced by subtracting the shortest heart period (or RR interval) from the longest heart period in ms.
”RSA_P2T_Mean”: the mean peak-to-trough across all cycles in ms
”RSA_P2T_Mean_log”: the logarithm of the mean of RSA estimates.
”RSA_P2T_SD”: the standard deviation of all RSA estimates.
”RSA_P2T_NoRSA”: the number of breath cycles from which RSA could not be calculated.
”RSA_PorgesBohrer”: the Porges-Bohrer estimate of RSA, optimal when the signal to noise ratio is low, in ln(ms^2).
Example
>>> import neurokit2 as nk >>> >>> # Download data >>> data = nk.data("bio_eventrelated_100hz") >>> >>> # Process the data >>> ecg_signals, info = nk.ecg_process(data["ECG"], sampling_rate=100) >>> rsp_signals, _ = nk.rsp_process(data["RSP"], sampling_rate=100) >>> >>> # Get RSA features >>> nk.hrv_rsa(ecg_signals, rsp_signals, info, sampling_rate=100, continuous=False) {'RSA_P2T_Mean': ..., 'RSA_P2T_Mean_log': ..., 'RSA_P2T_SD': ..., 'RSA_P2T_NoRSA': ..., 'RSA_PorgesBohrer': ..., 'RSA_Gates_Mean': ..., 'RSA_Gates_Mean_log': ..., 'RSA_Gates_SD': ...} >>> >>> # Get RSA as a continuous signal >>> rsa = nk.hrv_rsa(ecg_signals, rsp_signals, info, sampling_rate=100, continuous=True) >>> rsa RSA_P2T RSA_Gates 0 ... ... 1 ... ... 2 ... ... ... ... ...
[15000 rows x 2 columns] >>> nk.signal_plot([ecg_signals[“ECG_Rate”], rsp_signals[“RSP_Rate”], rsa], standardize=True)
References
Servant, D., Logier, R., Mouster, Y., & Goudemand, M. (2009). La variabilité de la fréquence cardiaque. Intérêts en psychiatrie. L’Encéphale, 35(5), 423–428. doi:10.1016/j.encep.2008.06.016
Lewis, G. F., Furman, S. A., McCool, M. F., & Porges, S. W. (2012). Statistical strategies to quantify respiratory sinus arrhythmia: Are commonly used metrics equivalent?. Biological psychology, 89(2), 349-364.
Zohar, A. H., Cloninger, C. R., & McCraty, R. (2013). Personality and heart rate variability: exploring pathways from personality to cardiac coherence and health. Open Journal of Social Sciences, 1(06), 32.
- hrv_time(peaks, sampling_rate=1000, show=False, **kwargs)[source]¶
Computes time-domain indices of Heart Rate Variability (HRV).
See references for details.
- Parameters
peaks (dict) – Samples at which cardiac extrema (i.e., R-peaks, systolic peaks) occur. Can be a list of indices or the output(s) of other functions such as ecg_peaks, ppg_peaks, ecg_process or bio_process.
sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.
show (bool) – If True, will plot the distribution of R-R intervals.
- Returns
DataFrame – Contains time domain HRV metrics: - MeanNN: The mean of the RR intervals. - SDNN: The standard deviation of the RR intervals. -SDANN1, SDANN2, SDANN5: The standard deviation of average RR intervals extracted from n-minute segments of time series data (1, 2 and 5 by default). Note that these indices require a minimal duration of signal to be computed (3, 6 and 15 minutes respectively) and will be silently skipped if the data provided is too short. -SDNNI1, SDNNI2, SDNNI5: The mean of the standard deviations of RR intervals extracted from n-minute segments of time series data (1, 2 and 5 by default). Note that these indices require a minimal duration of signal to be computed (3, 6 and 15 minutes respectively) and will be silently skipped if the data provided is too short. - RMSSD: The square root of the mean of the sum of successive differences between adjacent RR intervals. It is equivalent (although on another scale) to SD1, and therefore it is redundant to report correlations with both (Ciccone, 2017). - SDSD: The standard deviation of the successive differences between RR intervals. - CVNN: The standard deviation of the RR intervals (SDNN) divided by the mean of the RR intervals (MeanNN). - CVSD: The root mean square of the sum of successive differences (RMSSD) divided by the mean of the RR intervals (MeanNN). - MedianNN: The median of the absolute values of the successive differences between RR intervals. - MadNN: The median absolute deviation of the RR intervals. - HCVNN: The median absolute deviation of the RR intervals (MadNN) divided by the median of the absolute differences of their successive differences (MedianNN). - IQRNN: The interquartile range (IQR) of the RR intervals. - pNN50: The proportion of RR intervals greater than 50ms, out of the total number of RR intervals. - pNN20: The proportion of RR intervals greater than 20ms, out of the total number of RR intervals. - TINN: A geometrical parameter of the HRV, or more specifically, the baseline width of the RR intervals distribution obtained by triangular interpolation, where the error of least squares determines the triangle. It is an approximation of the RR interval distribution. - HTI: The HRV triangular index, measuring the total number of RR intervals divded by the height of the RR intervals histogram.
See also
ecg_peaks
,ppg_peaks
,hrv_frequency
,hrv_summary
,hrv_nonlinear
Examples
>>> import neurokit2 as nk >>> >>> # Download data >>> data = nk.data("bio_resting_5min_100hz") >>> >>> # Find peaks >>> peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100) >>> >>> # Compute HRV indices >>> hrv = nk.hrv_time(peaks, sampling_rate=100, show=True)
References
Ciccone, A. B., Siedlik, J. A., Wecht, J. M., Deckert, J. A., Nguyen, N. D., & Weir, J. P.
(2017). Reminder: RMSSD and SD1 are identical heart rate variability metrics. Muscle & nerve, 56(4), 674-678.
Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac
electrophysiology review, 6(3), 239-244.
Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms.
Frontiers in public health, 5, 258.
RSP¶
Submodule for NeuroKit.
- rsp_amplitude(rsp_cleaned, peaks, troughs=None, interpolation_method='monotone_cubic')[source]¶
Compute respiratory amplitude.
Compute respiratory amplitude given the raw respiration signal and its extrema.
- Parameters
rsp_cleaned (Union[list, np.array, pd.Series]) – The cleaned respiration channel as returned by rsp_clean().
peaks (list or array or DataFrame or Series or dict) – The samples at which the inhalation peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
troughs (list or array or DataFrame or Series or dict) – The samples at which the inhalation troughs occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
interpolation_method (str) – Method used to interpolate the amplitude between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.
- Returns
array – A vector containing the respiratory amplitude.
See also
rsp_clean
,rsp_peaks
,signal_rate
,rsp_process
,rsp_plot
Examples
>>> import neurokit2 as nk >>> import pandas as pd >>> >>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15) >>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000) >>> info, signals = nk.rsp_peaks(cleaned) >>> >>> amplitude = nk.rsp_amplitude(cleaned, signals) >>> fig = nk.signal_plot(pd.DataFrame({"RSP": rsp, "Amplitude": amplitude}), subplots=True) >>> fig
- rsp_analyze(data, sampling_rate=1000, method='auto', subepoch_rate=[None, None])[source]¶
Performs RSP analysis on either epochs (event-related analysis) or on longer periods of data such as resting- state data.
- Parameters
data (dict or DataFrame) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by rsp_process() or bio_process(). Can also take a dict containing sets of separate periods of data.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’ where the right method will be chosen based on the mean duration of the data (‘event-related’ for duration under 10s).
subepoch_rate (list) – For event-related analysis,, a smaller “sub-epoch” within the epoch of an event can be specified. The ECG rate-related features of this “sub-epoch” (e.g., RSP_Rate, RSP_Rate_Max), relative to the baseline (where applicable), will be computed. The first value of the list specifies the start of the sub-epoch and the second specifies the end of the sub-epoch (in seconds), e.g., subepoch_rate = [1, 3] or subepoch_rate = [1, None]. Defaults to [None, None].
- Returns
DataFrame – A dataframe containing the analyzed RSP features. If event-related analysis is conducted, each epoch is indicated by the Label column. See rsp_eventrelated() and rsp_intervalrelated() docstrings for details.
See also
bio_process
,rsp_process
,epochs_create
,rsp_eventrelated
,rsp_intervalrelated
Examples
>>> import neurokit2 as nk
>>> # Example 1: Download the data for event-related analysis >>> data = nk.data("bio_eventrelated_100hz") >>> >>> # Process the data for event-related analysis >>> df, info = nk.bio_process(rsp=data["RSP"], sampling_rate=100) >>> events = nk.events_find(data["Photosensor"], threshold_keep='below', ... event_conditions=["Negative", "Neutral", "Neutral", "Negative"]) >>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=1.9) >>> >>> # Analyze >>> nk.rsp_analyze(epochs, sampling_rate=100) >>> >>> # Example 2: Download the resting-state data >>> data = nk.data("bio_resting_5min_100hz") >>> >>> # Process the data >>> df, info = nk.rsp_process(data["RSP"], sampling_rate=100) >>> >>> # Analyze >>> nk.rsp_analyze(df, sampling_rate=100)
- rsp_clean(rsp_signal, sampling_rate=1000, method='khodadad2018')[source]¶
Preprocess a respiration (RSP) signal.
Clean a respiration signal using different sets of parameters, such as ‘khodadad2018’ (linear detrending followed by a fifth order 2Hz low-pass IIR Butterworth filter) or BioSPPy (second order0.1 - 0.35 Hz bandpass Butterworth filter followed by a constant detrending).
- Parameters
rsp_signal (Union[list, np.array, pd.Series]) – The raw respiration channel (as measured, for instance, by a respiration belt).
sampling_rate (int) – The sampling frequency of rsp_signal (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.
- Returns
array – Vector containing the cleaned respiratory signal.
See also
rsp_findpeaks
,signal_rate
,rsp_amplitude
,rsp_process
,rsp_plot
Examples
>>> import pandas as pd >>> import neurokit2 as nk >>> >>> rsp = nk.rsp_simulate(duration=30, sampling_rate=50, noise=0.01) >>> signals = pd.DataFrame({ "RSP_Raw": rsp, ... "RSP_Khodadad2018": nk.rsp_clean(rsp, sampling_rate=50, method="khodadad2018"), ... "RSP_BioSPPy": nk.rsp_clean(rsp, sampling_rate=50, method="biosppy")}) >>> fig = signals.plot() >>> fig
References
Performs event-related RSP analysis on epochs.
- Parameters
epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().
silent (bool) – If True, silence possible warnings.
subepoch_rate (list) – A smaller “sub-epoch” within the epoch of an event can be specified. The ECG rate-related features of this “sub-epoch” (e.g., RSP_Rate, RSP_Rate_Max), relative to the baseline (where applicable), will be computed. The first value of the list specifies the start of the sub-epoch and the second specifies the end of the sub-epoch (in seconds), e.g., subepoch_rate = [1, 3] or subepoch_rate = [1, None]. Defaults to [None, None].
- Returns
DataFrame – A dataframe containing the analyzed RSP features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following: - “RSP_Rate_Max”: the maximum respiratory rate after stimulus onset. - “RSP_Rate_Min”: the minimum respiratory rate after stimulus onset. - “RSP_Rate_Mean”: the mean respiratory rate after stimulus onset. - “RSP_Rate_SD”: the standard deviation of the respiratory rate after stimulus onset. - “RSP_Rate_Max_Time”: the time at which maximum respiratory rate occurs. - “RSP_Rate_Min_Time”: the time at which minimum respiratory rate occurs. - “RSP_Amplitude_Max”: the change in maximum respiratory amplitude from before stimulus onset. - “RSP_Amplitude_Min”: the change in minimum respiratory amplitude from before stimulus onset. - “RSP_Amplitude_Mean”: the change in mean respiratory amplitude from before stimulus onset. - “RSP_Amplitude_SD”: the standard deviation of the respiratory amplitude after stimulus onset. - “RSP_Phase”: indication of whether the onset of the event concurs with respiratory inspiration (1) or expiration (0). - “RSP_PhaseCompletion”: indication of the stage of the current respiration phase (0 to 1) at the onset of the event.
See also
events_find
,epochs_create
,bio_process
Examples
>>> import neurokit2 as nk >>> >>> # Example with simulated data >>> rsp, info = nk.rsp_process(nk.rsp_simulate(duration=120)) >>> epochs = nk.epochs_create(rsp, events=[5000, 10000, 15000], epochs_start=-0.1, epochs_end=1.9) >>> >>> # Analyze >>> rsp1 = nk.rsp_eventrelated(epochs) >>> rsp1 >>> >>> # Example with real data >>> data = nk.data("bio_eventrelated_100hz") >>> >>> # Process the data >>> df, info = nk.bio_process(rsp=data["RSP"], sampling_rate=100) >>> events = nk.events_find(data["Photosensor"], threshold_keep='below', ... event_conditions=["Negative", "Neutral", "Neutral", "Negative"]) >>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=2.9) >>> >>> # Analyze >>> rsp2 = nk.rsp_eventrelated(epochs) >>> rsp2
- rsp_findpeaks(rsp_cleaned, sampling_rate=1000, method='khodadad2018', amplitude_min=0.3)[source]¶
Extract extrema in a respiration (RSP) signal.
Low-level function used by rsp_peaks() to identify inhalation peaks and exhalation troughs in a preprocessed respiration signal using different sets of parameters. See rsp_peaks() for details.
- Parameters
rsp_cleaned (Union[list, np.array, pd.Series]) – The cleaned respiration channel as returned by rsp_clean().
sampling_rate (int) – The sampling frequency of ‘rsp_cleaned’ (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.
amplitude_min (float) – Only applies if method is “khodadad2018”. Extrema that have a vertical distance smaller than (outlier_threshold * average vertical distance) to any direct neighbour are removed as false positive outliers. I.e., outlier_threshold should be a float with positive sign (the default is 0.3). Larger values of outlier_threshold correspond to more conservative thresholds (i.e., more extrema removed as outliers).
- Returns
info (dict) – A dictionary containing additional information, in this case the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs”, respectively.
See also
rsp_clean
,rsp_fixpeaks
,rsp_peaks
,signal_rate
,rsp_amplitude
,rsp_process
,rsp_plot
Examples
>>> import neurokit2 as nk >>> >>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15) >>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000) >>> info = nk.rsp_findpeaks(cleaned) >>> fig = nk.events_plot([info["RSP_Peaks"], info["RSP_Troughs"]], cleaned) >>> fig
- rsp_fixpeaks(peaks, troughs=None)[source]¶
Correct RSP peaks.
Low-level function used by rsp_peaks() to correct the peaks found by rsp_findpeaks(). Doesn’t do anything for now for RSP. See rsp_peaks() for details.
- Parameters
peaks (list or array or DataFrame or Series or dict) – The samples at which the inhalation peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
troughs (list or array or DataFrame or Series or dict) – The samples at which the inhalation troughs occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
- Returns
info (dict) – A dictionary containing additional information, in this case the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs”, respectively.
See also
rsp_clean
,rsp_findpeaks
,rsp_peaks
,rsp_amplitude
,rsp_process
,rsp_plot
Examples
>>> import neurokit2 as nk >>> >>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15) >>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000) >>> info = nk.rsp_findpeaks(cleaned) >>> info = nk.rsp_fixpeaks(info) >>> fig = nk.events_plot([info["RSP_Peaks"], info["RSP_Troughs"]], cleaned) >>> fig
Performs RSP analysis on longer periods of data (typically > 10 seconds), such as resting-state data.
- Parameters
data (DataFrame or dict) – A DataFrame containing the different processed signal(s) as different columns, typically generated by rsp_process() or bio_process(). Can also take a dict containing sets of separately processed DataFrames.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
- Returns
DataFrame – A dataframe containing the analyzed RSP features. The analyzed features consist of the following: - “RSP_Rate_Mean”: the mean respiratory rate. - “RSP_Amplitude_Mean”: the mean respiratory amplitude. - “RSP_RRV”: the different respiratory rate variability metrices. See rsp_rrv() docstrings for details. - “RSP_Phase_Duration_Inspiration”: the average inspiratory duration. - “RSP_Phase_Duration_Expiration”: the average expiratory duration. - “RSP_Phase_Duration_Ratio “: the inspiratory-to-expiratory time ratio (I/E).
See also
bio_process
,rsp_eventrelated
Examples
>>> import neurokit2 as nk >>> >>> # Download data >>> data = nk.data("bio_resting_5min_100hz") >>> >>> # Process the data >>> df, info = nk.rsp_process(data["RSP"], sampling_rate=100)
>>> # Single dataframe is passed >>> nk.rsp_intervalrelated(df, sampling_rate=100) >>> >>> epochs = nk.epochs_create(df, events=[0, 15000], sampling_rate=100, epochs_end=150) >>> nk.rsp_intervalrelated(epochs)
- rsp_peaks(rsp_cleaned, sampling_rate=1000, method='khodadad2018', amplitude_min=0.3)[source]¶
Identify extrema in a respiration (RSP) signal.
This function rsp_findpeaks() and rsp_fixpeaks to identify and process inhalation peaks and exhalation troughs in a preprocessed respiration signal using different sets of parameters, such as:
`Khodadad et al. (2018)
<https://iopscience.iop.org/article/10.1088/1361-6579/aad7e6/meta>`_
`BioSPPy
<https://github.com/PIA-Group/BioSPPy/blob/master/biosppy/signals/resp.py>`_
- Parameters
rsp_cleaned (Union[list, np.array, pd.Series]) – The cleaned respiration channel as returned by rsp_clean().
sampling_rate (int) – The sampling frequency of ‘rsp_cleaned’ (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.
amplitude_min (float) – Only applies if method is “khodadad2018”. Extrema that have a vertical distance smaller than (outlier_threshold * average vertical distance) to any direct neighbour are removed as false positive outliers. i.e., outlier_threshold should be a float with positive sign (the default is 0.3). Larger values of outlier_threshold correspond to more conservative thresholds (i.e., more extrema removed as outliers).
- Returns
info (dict) – A dictionary containing additional information, in this case the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs”, respectively, as well as the signals’ sampling rate.
peak_signal (DataFrame) – A DataFrame of same length as the input signal in which occurences of inhalation peaks and exhalation troughs are marked as “1” in lists of zeros with the same length as rsp_cleaned. Accessible with the keys “RSP_Peaks” and “RSP_Troughs” respectively.
See also
rsp_clean
,signal_rate
,rsp_findpeaks
,rsp_fixpeaks
,rsp_amplitude
,rsp_process
,rsp_plot
Examples
>>> import neurokit2 as nk >>> import pandas as pd >>> >>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15) >>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000) >>> peak_signal, info = nk.rsp_peaks(cleaned, sampling_rate=1000) >>> >>> data = pd.concat([pd.DataFrame({"RSP": rsp}), peak_signal], axis=1) >>> fig = nk.signal_plot(data) >>> fig
- rsp_phase(peaks, troughs=None, desired_length=None)[source]¶
Compute respiratory phase (inspiration and expiration).
Finds the respiratory phase, labelled as 1 for inspiration and 0 for expiration.
- Parameters
peaks (list or array or DataFrame or Series or dict) – The samples at which the inhalation peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
troughs (list or array or DataFrame or Series or dict) – The samples at which the inhalation troughs occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
desired_length (int) – By default, the returned respiration rate has the same number of elements as peaks. If set to an integer, the returned rate will be interpolated between peaks over desired_length samples. Has no effect if a DataFrame is passed in as the peaks argument.
- Returns
signals (DataFrame) – A DataFrame of same length as rsp_signal containing the following columns: - “RSP_Phase”: breathing phase, marked by “1” for inspiration and “0” for expiration. - “RSP_Phase_Completion”: breathing phase completion, expressed in percentage (from 0 to 1), representing the stage of the current respiratory phase.
See also
Examples
>>> import neurokit2 as nk >>> >>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15) >>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000) >>> peak_signal, info = nk.rsp_peaks(cleaned) >>> >>> phase = nk.rsp_phase(peak_signal, desired_length=len(cleaned)) >>> fig = nk.signal_plot([rsp, phase], standardize=True) >>> fig
- rsp_plot(rsp_signals, sampling_rate=None)[source]¶
Visualize respiration (RSP) data.
- Parameters
rsp_signals (DataFrame) – DataFrame obtained from rsp_process().
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).
Examples
>>> import neurokit2 as nk >>> >>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15) >>> rsp_signals, info = nk.rsp_process(rsp, sampling_rate=1000) >>> fig = nk.rsp_plot(rsp_signals) >>> fig
- Returns
fig – Figure representing a plot of the processed rsp signals.
See also
- rsp_process(rsp_signal, sampling_rate=1000, method='khodadad2018')[source]¶
Process a respiration (RSP) signal.
Convenience function that automatically processes a respiration signal with one of the following methods:
`Khodadad et al. (2018)
<https://iopscience.iop.org/article/10.1088/1361-6579/aad7e6/meta>`_
`BioSPPy
<https://github.com/PIA-Group/BioSPPy/blob/master/biosppy/signals/resp.py>`_
- Parameters
rsp_signal (Union[list, np.array, pd.Series]) – The raw respiration channel (as measured, for instance, by a respiration belt).
sampling_rate (int) – The sampling frequency of rsp_signal (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.
- Returns
signals (DataFrame) – A DataFrame of same length as rsp_signal containing the following columns: - “RSP_Raw”: the raw signal. - “RSP_Clean”: the cleaned signal. - “RSP_Peaks”: the inhalation peaks marked as “1” in a list of zeros. - “RSP_Troughs”: the exhalation troughs marked as “1” in a list of zeros. - “RSP_Rate”: breathing rate interpolated between inhalation peaks. - “RSP_Amplitude”: breathing amplitude interpolated between inhalation peaks. - “RSP_Phase”: breathing phase, marked by “1” for inspiration and “0” for expiration. - “RSP_PhaseCompletion”: breathing phase completion, expressed in percentage (from 0 to 1), representing the stage of the current respiratory phase.
info (dict) – A dictionary containing the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs” respectively, as well as the signals’ sampling rate.
See also
rsp_clean
,rsp_findpeaks
,signal_rate
,rsp_amplitude
,rsp_plot
Examples
>>> import neurokit2 as nk >>> >>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15) >>> signals, info = nk.rsp_process(rsp, sampling_rate=1000) >>> fig = nk.rsp_plot(signals) >>> fig
- rsp_rate(rsp_cleaned, peaks=None, sampling_rate=1000, window=10, hop_size=1, method='peak', peak_method='khodadad2018', interpolation_method='monotone_cubic')[source]¶
Find respiration rate by cross-correlation method. :Parameters: * rsp_cleaned (Union[list, np.array, pd.Series]) – The cleaned respiration channel as returned by rsp_clean().
peaks (Union[list, np.array, pd.Series, pd.DataFrame]) – The respiration peaks as returned by rsp_peaks(). If None (default), respiration peaks will be automatically identified from the rsp_clean signal.
sampling_rate (int) – The sampling frequency of ‘rsp_cleaned’ (in Hz, i.e., samples/second).
window (int) – The duration of the sliding window (in second). Default to 10 seconds.
hop_size (int) – The number of samples between each successive window. Default to 1 sample.
method (str) – Method can be either “peak” or “xcorr”. In “peak” method, rsp_rate is calculated from the periods between respiration peaks. In “xcorr” method, cross-correlations between the changes in respiration with a bank of sinusoids of different frequencies are caclulated to indentify the principal frequency of oscillation.
peak_method (str) – Method to identify respiration peaks. Can be one of “khodadad2018” (default) or “biosppy”.
interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.
- Returns
rsp_rate (np.ndarray) – Instantenous respiration rate.
Example
>>> import neurokit2 as nk >>> rsp_signal = nk.data("rsp_200hz.txt").iloc[:,0] >>> sampling_rate=200 >>> rsp_cleaned = nk.rsp_clean(rsp_signal) >>> rsp_rate_peak = nk.rsp_rate(rsp_cleaned, sampling_rate=sampling_rate, method="peaks") >>> rsp_rate_xcorr = nk.rsp_rate(rsp_cleaned, sampling_rate=sampling_rate, method="xcorr")
- rsp_rrv(rsp_rate, peaks=None, sampling_rate=1000, show=False, silent=True)[source]¶
Computes time domain and frequency domain features for Respiratory Rate Variability (RRV) analysis.
- Parameters
rsp_rate (array) – Array containing the respiratory rate, produced by signal_rate().
peaks (dict) – The samples at which the inhalation peaks occur. Dict returned by rsp_peaks(). Defaults to None.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
show (bool) – If True, will return a Poincaré plot, a scattergram, which plots each breath-to-breath interval against the next successive one. The ellipse centers around the average breath-to-breath interval. Defaults to False.
silent (bool) – If False, warnings will be printed. Default to True.
- Returns
DataFrame – DataFrame consisting of the computed RRV metrics, which includes: - “RRV_SDBB”: the standard deviation of the breath-to-breath intervals.
”RRV_RMSSD”: the root mean square of successive differences of the breath-to-breath intervals.
”RRV_SDSD”: the standard deviation of the successive differences between adjacent
breath-to-breath intervals.
”RRV_BBx”: the number of successive interval differences that are greater than x seconds.
”RRV-pBBx”: the proportion of breath-to-breath intervals that are greater than x seconds,
out of the total number of intervals.
”RRV_VLF”: spectral power density pertaining to very low frequency band i.e., 0 to .04 Hz by default.
”RRV_LF”: spectral power density pertaining to low frequency band i.e., .04 to .15 Hz by default.
”RRV_HF”: spectral power density pertaining to high frequency band i.e., .15 to .4 Hz by default.
”RRV_LFHF”: the ratio of low frequency power to high frequency power.
”RRV_LFn”: the normalized low frequency, obtained by dividing the low frequency power by the total power.
”RRV_HFn”: the normalized high frequency, obtained by dividing the low frequency power by total power.
”RRV_SD1”: SD1 is a measure of the spread of breath-to-breath intervals on the Poincaré
plot perpendicular to the line of identity. It is an index of short-term variability.
”RRV_SD2”: SD2 is a measure of the spread of breath-to-breath intervals on the Poincaré
plot along the line of identity. It is an index of long-term variability.
”RRV_SD2SD1”: the ratio between short and long term fluctuations of the breath-to-breath
intervals (SD2 divided by SD1).
”RRV_ApEn”: the approximate entropy of RRV, calculated by entropy_approximate().
”RRV_SampEn”: the sample entropy of RRV, calculated by entropy_sample().
”RRV_DFA_alpha1”: the “short-term” fluctuation value generated from Detrended Fluctuation
Analysis i.e. the root mean square deviation from the fitted trend of the breath-to-breath intervals. Will only be computed if mora than 160 breath cycles in the signal.
”RRV_DFA_alpha2”: the long-term fluctuation value. Will only be computed if mora than 640 breath
cycles in the signal.
RRV_alpha1_ExpRange: Multifractal DFA. ExpRange is the range of singularity exponents, correspoinding to the
width of the singularity spectrum.
RRV_alpha2_ExpRange: Multifractal DFA. ExpRange is the range of singularity exponents, correspoinding to the
width of the singularity spectrum.
RRV_alpha1_ExpMean: Multifractal DFA. ExpMean is the mean of singularity exponents.
RRV_alpha2_ExpMean: Multifractal DFA. ExpMean is the mean of singularity exponents.
RRV_alpha1_DimRange: Multifractal DFA. DimRange is the range of singularity dimensions, correspoinding to the
height of the singularity spectrum.
RRV_alpha2_DimRange: Multifractal DFA. DimRange is the range of singularity dimensions, correspoinding to the
height of the singularity spectrum.
RRV_alpha1_DimMean: Multifractal DFA. Dimmean is the mean of singularity dimensions.
RRV_alpha2_DimMean: Multifractal DFA. Dimmean is the mean of singularity dimensions.
See also
signal_rate
,rsp_peaks
,signal_power
,entropy_sample
,entropy_approximate
Examples
>>> import neurokit2 as nk >>> >>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15) >>> rsp, info = nk.rsp_process(rsp) >>> rrv = nk.rsp_rrv(rsp, show=True)
References
Soni, R., & Muniyandi, M. (2019). Breath rate variability: a novel measure to study the meditation
effects. International Journal of Yoga, 12(1), 45.
- rsp_simulate(duration=10, length=None, sampling_rate=1000, noise=0.01, respiratory_rate=15, method='breathmetrics', random_state=None)[source]¶
Simulate a respiratory signal.
Generate an artificial (synthetic) respiratory signal of a given duration and rate.
- Parameters
duration (int) – Desired length of duration (s).
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).
length (int) – The desired length of the signal (in samples).
noise (float) – Noise level (amplitude of the laplace noise).
respiratory_rate (float) – Desired number of breath cycles in one minute.
method (str) – The model used to generate the signal. Can be ‘sinusoidal’ for a simulation based on a trigonometric sine wave that roughly approximates a single respiratory cycle. If ‘breathmetrics’ (default), will use an advanced model desbribed Noto, et al. (2018).
random_state (int) – Seed for the random number generator.
- Returns
array – Vector containing the respiratory signal.
Examples
>>> import pandas as pd >>> import numpy as np >>> import neurokit2 as nk >>> >>> rsp1 = nk.rsp_simulate(duration=30, method="sinusoidal") >>> rsp2 = nk.rsp_simulate(duration=30, method="breathmetrics") >>> fig = pd.DataFrame({"RSP_Simple": rsp1, "RSP_Complex": rsp2}).plot(subplots=True) >>> fig
References
Noto, T., Zhou, G., Schuele, S., Templer, J., & Zelano, C. (2018). Automated analysis of breathing waveforms using BreathMetrics: A respiratory signal processing toolbox. Chemical Senses, 43(8), 583–597. https://doi.org/10.1093/chemse/bjy045
See also
rsp_clean
,rsp_findpeaks
,signal_rate
,rsp_process
,rsp_plot
EDA¶
Submodule for NeuroKit.
- eda_analyze(data, sampling_rate=1000, method='auto')[source]¶
Performs EDA analysis on either epochs (event-related analysis) or on longer periods of data such as resting- state data.
- Parameters
data (Union[dict, pd.DataFrame]) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by eda_process() or bio_process(). Can also take a dict containing sets of separate periods of data.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’ where the right method will be chosen based on the mean duration of the data (‘event-related’ for duration under 10s).
- Returns
DataFrame – A dataframe containing the analyzed EDA features. If event-related analysis is conducted, each epoch is indicated by the Label column. See eda_eventrelated() and eda_intervalrelated() docstrings for details.
See also
bio_process
,eda_process
,epochs_create
,eda_eventrelated
,eda_intervalrelated
Examples
>>> import neurokit2 as nk
>>> # Example 1: Download the data for event-related analysis >>> data = nk.data("bio_eventrelated_100hz") >>> >>> # Process the data for event-related analysis >>> df, info = nk.bio_process(eda=data["EDA"], sampling_rate=100) >>> events = nk.events_find(data["Photosensor"], threshold_keep='below', ... event_conditions=["Negative", "Neutral", "Neutral", "Negative"]) >>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=1.9) >>> >>> # Analyze >>> analyze_epochs = nk.eda_analyze(epochs, sampling_rate=100) >>> analyze_epochs >>> >>> # Example 2: Download the resting-state data >>> data = nk.data("bio_resting_8min_100hz") >>> >>> # Process the data >>> df, info = nk.eda_process(data["EDA"], sampling_rate=100) >>> >>> # Analyze >>> analyze_df = nk.eda_analyze(df, sampling_rate=100) >>> analyze_df
- eda_autocor(eda_cleaned, sampling_rate=1000, lag=4)[source]¶
Computes autocorrelation measure of raw EDA signal i.e., the correlation between the time series data and a specified time-lagged version of itself.
- Parameters
eda_cleaned (Union[list, np.array, pd.Series]) – The cleaned EDA signal.
sampling_rate (int) – The sampling frequency of raw EDA signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
lag (int) – Time lag in seconds. Defaults to 4 seconds to avoid autoregressive correlations approaching 1, as recommended by Halem et al. (2020).
- Returns
float – Autocorrelation index of the eda signal.
See also
Examples
>>> import neurokit2 as nk >>> >>> # Simulate EDA signal >>> eda_signal = nk.eda_simulate(duration=5, scr_number=5, drift=0.1) >>> eda_cleaned = nk.eda_clean(eda_signal) >>> cor = nk.eda_autocor(eda_cleaned) >>> cor
References
Halem, S., van Roekel, E., Kroencke, L., Kuper, N., & Denissen, J. (2020). Moments That Matter? On the Complexity of Using Triggers Based on Skin Conductance to Sample Arousing Events Within an Experience Sampling Framework. European Journal of Personality.
- eda_changepoints(eda_cleaned)[source]¶
Calculate the number of change points using of the skin conductance signal in terms of mean and variance. Defaults to an algorithm penalty of 10000, as recommended by Halem et al. (2020).
- Parameters
eda_cleaned (Union[list, np.array, pd.Series]) – The cleaned EDA signal.
- Returns
float – Number of changepoints in the
See also
Examples
>>> import neurokit2 as nk >>> >>> # Simulate EDA signal >>> eda_signal = nk.eda_simulate(duration=5, scr_number=5, drift=0.1) >>> eda_cleaned = nk.eda_clean(eda_signal) >>> changepoints = nk.eda_changepoints(eda_cleaned) >>> changepoints
References
Halem, S., van Roekel, E., Kroencke, L., Kuper, N., & Denissen, J. (2020). Moments That Matter? On the Complexity of Using Triggers Based on Skin Conductance to Sample Arousing Events Within an Experience Sampling Framework. European Journal of Personality.
- eda_clean(eda_signal, sampling_rate=1000, method='neurokit')[source]¶
Preprocess Electrodermal Activity (EDA) signal.
- Parameters
eda_signal (Union[list, np.array, pd.Series]) – The raw EDA signal.
sampling_rate (int) – The sampling frequency of rsp_signal (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of ‘neurokit’ (default) or ‘biosppy’.
- Returns
array – Vector containing the cleaned EDA signal.
See also
Examples
>>> import pandas as pd >>> import neurokit2 as nk >>> >>> eda = nk.eda_simulate(duration=30, sampling_rate=100, scr_number=10, noise=0.01, drift=0.02) >>> signals = pd.DataFrame({"EDA_Raw": eda, ... "EDA_BioSPPy": nk.eda_clean(eda, sampling_rate=100,method='biosppy'), ... "EDA_NeuroKit": nk.eda_clean(eda, sampling_rate=100, method='neurokit')}) >>> fig = signals.plot() >>> fig
Performs event-related EDA analysis on epochs.
- Parameters
epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().
silent (bool) – If True, silence possible warnings.
- Returns
DataFrame – A dataframe containing the analyzed EDA features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist the following:
”EDA_SCR”: indication of whether Skin Conductance Response (SCR) occurs following the event (1 if an SCR onset is present and 0 if absent) and if so, its corresponding peak amplitude, time of peak, rise and recovery time. If there is no occurrence of SCR, nans are displayed for the below features.
”EDA_Peak_Amplitude”: the maximum amplitude of the phasic component of the signal.
”SCR_Peak_Amplitude”: the peak amplitude of the first SCR in each epoch.
”SCR_Peak_Amplitude_Time”: the timepoint of each first SCR peak amplitude.
”SCR_RiseTime”: the risetime of each first SCR i.e., the time it takes for SCR to reach peak amplitude from onset.
”SCR_RecoveryTime”: the half-recovery time of each first SCR i.e., the time it takes for SCR to decrease to half amplitude.
See also
events_find
,epochs_create
,bio_process
Examples
>>> import neurokit2 as nk >>> >>> # Example with simulated data >>> eda = nk.eda_simulate(duration=15, scr_number=3) >>> >>> # Process data >>> eda_signals, info = nk.eda_process(eda, sampling_rate=1000) >>> epochs = nk.epochs_create(eda_signals, events=[5000, 10000, 15000], sampling_rate=1000, ... epochs_start=-0.1, epochs_end=1.9) >>> >>> # Analyze >>> nk.eda_eventrelated(epochs) >>> >>> # Example with real data >>> data = nk.data("bio_eventrelated_100hz") >>> >>> # Process the data >>> df, info = nk.bio_process(eda=data["EDA"], sampling_rate=100) >>> events = nk.events_find(data["Photosensor"], threshold_keep='below', ... event_conditions=["Negative", "Neutral", "Neutral", "Negative"]) >>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=6.9) >>> >>> # Analyze >>> nk.eda_eventrelated(epochs)
- eda_findpeaks(eda_phasic, sampling_rate=1000, method='neurokit', amplitude_min=0.1)[source]¶
Identify Skin Conductance Responses (SCR) in Electrodermal Activity (EDA).
Low-level function used by eda_peaks() to identify Skin Conductance Responses (SCR) peaks in the phasic component of Electrodermal Activity (EDA) with different possible methods. See eda_peaks() for details.
- Parameters
eda_phasic (Union[list, np.array, pd.Series]) – The phasic component of the EDA signal (from eda_phasic()).
sampling_rate (int) – The sampling frequency of the EDA signal (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of “neurokit” (default), “gamboa2008”, “kim2004” (the default in BioSPPy), “vanhalem2020” or “nabian2018”.
amplitude_min (float) – Only used if ‘method’ is ‘neurokit’ or ‘kim2004’. Minimum threshold by which to exclude SCRs (peaks) as relative to the largest amplitude in the signal.
- Returns
info (dict) – A dictionary containing additional information, in this case the aplitude of the SCR, the samples at which the SCR onset and the SCR peaks occur. Accessible with the keys “SCR_Amplitude”, “SCR_Onsets”, and “SCR_Peaks” respectively.
See also
eda_simulate
,eda_clean
,eda_phasic
,eda_fixpeaks
,eda_peaks
,eda_process
,eda_plot
Examples
>>> import neurokit2 as nk >>> >>> # Get phasic component >>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1, noise=0) >>> eda_cleaned = nk.eda_clean(eda_signal) >>> eda = nk.eda_phasic(eda_cleaned) >>> eda_phasic = eda["EDA_Phasic"].values >>> >>> # Find peaks >>> gamboa2008 = nk.eda_findpeaks(eda_phasic, method="gamboa2008") >>> kim2004 = nk.eda_findpeaks(eda_phasic, method="kim2004") >>> neurokit = nk.eda_findpeaks(eda_phasic, method="neurokit") >>> vanhalem2020 = nk.eda_findpeaks(eda_phasic, method="vanhalem2020") >>> nabian2018 = nk.eda_findpeaks(eda_phasic, method="nabian2018") >>> fig = nk.events_plot([gamboa2008["SCR_Peaks"], kim2004["SCR_Peaks"], vanhalem2020["SCR_Peaks"], ... neurokit["SCR_Peaks"], nabian2018["SCR_Peaks"]], eda_phasic) >>> fig
References
Gamboa, H. (2008). Multi-modal behavioral biometrics based on hci and electrophysiology. PhD ThesisUniversidade.
Kim, K. H., Bang, S. W., & Kim, S. R. (2004). Emotion recognition system using short-term monitoring of physiological signals. Medical and biological engineering and computing, 42(3), 419-427.
van Halem, S., Van Roekel, E., Kroencke, L., Kuper, N., & Denissen, J. (2020). Moments That Matter? On the Complexity of Using Triggers Based on Skin Conductance to Sample Arousing Events Within an Experience Sampling Framework. European Journal of Personality.
Nabian, M., Yin, Y., Wormwood, J., Quigley, K. S., Barrett, L. F., & Ostadabbas, S. (2018). An Open-Source Feature Extraction Tool for the Analysis of Peripheral Physiological Data. IEEE journal of translational engineering in health and medicine, 6, 2800711. https://doi.org/10.1109/JTEHM.2018.2878000
- eda_fixpeaks(peaks, onsets=None, height=None)[source]¶
Correct Skin Conductance Responses (SCR) peaks.
Low-level function used by eda_peaks() to correct the peaks found by eda_findpeaks(). Doesn’t do anything for now for EDA. See eda_peaks() for details.
- Parameters
peaks (list or array or DataFrame or Series or dict) – The samples at which the SCR peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with eda_findpeaks().
onsets (list or array or DataFrame or Series or dict) – The samples at which the SCR onsets occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with eda_findpeaks(). Defaults to None.
height (list or array or DataFrame or Series or dict) – The samples at which the amplitude of the SCR peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with eda_findpeaks(). Defaults to None.
- Returns
info (dict) – A dictionary containing additional information, in this case the aplitude of the SCR, the samples at which the SCR onset and the SCR peaks occur. Accessible with the keys “SCR_Amplitude”, “SCR_Onsets”, and “SCR_Peaks” respectively.
See also
eda_simulate
,eda_clean
,eda_phasic
,eda_findpeaks
,eda_peaks
,eda_process
,eda_plot
Examples
>>> import neurokit2 as nk >>> >>> # Get phasic component >>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1, noise=0) >>> eda_cleaned = nk.eda_clean(eda_signal) >>> eda = nk.eda_phasic(eda_cleaned) >>> eda_phasic = eda["EDA_Phasic"].values >>> >>> # Find and fix peaks >>> info = nk.eda_findpeaks(eda_phasic) >>> info = nk.eda_fixpeaks(info) >>> >>> fig = nk.events_plot(info["SCR_Peaks"], eda_phasic) >>> fig
Performs EDA analysis on longer periods of data (typically > 10 seconds), such as resting-state data.
- Parameters
data (Union[dict, pd.DataFrame]) – A DataFrame containing the different processed signal(s) as different columns, typically generated by eda_process() or bio_process(). Can also take a dict containing sets of separately processed DataFrames.
- Returns
DataFrame – A dataframe containing the analyzed EDA features. The analyzed features consist of the following: - “SCR_Peaks_N”: the number of occurrences of Skin Conductance Response (SCR). - “SCR_Peaks_Amplitude_Mean”: the mean amplitude of the SCR peak occurrences.
See also
bio_process
,eda_eventrelated
Examples
>>> import neurokit2 as nk >>> >>> # Download data >>> data = nk.data("bio_resting_8min_100hz") >>> >>> # Process the data >>> df, info = nk.eda_process(data["EDA"], sampling_rate=100) >>> >>> # Single dataframe is passed >>> nk.eda_intervalrelated(df) >>> >>> epochs = nk.epochs_create(df, events=[0, 25300], sampling_rate=100, epochs_end=20) >>> nk.eda_intervalrelated(epochs)
- eda_peaks(eda_phasic, sampling_rate=1000, method='neurokit', amplitude_min=0.1)[source]¶
Identify Skin Conductance Responses (SCR) in Electrodermal Activity (EDA).
Identify Skin Conductance Responses (SCR) peaks in the phasic component of Electrodermal Activity (EDA) with different possible methods, such as:
`Gamboa, H. (2008)
<http://www.lx.it.pt/~afred/pub/thesisHugoGamboa.pdf>`_ - Kim et al. (2004)
- Parameters
eda_phasic (Union[list, np.array, pd.Series]) – The phasic component of the EDA signal (from eda_phasic()).
sampling_rate (int) – The sampling frequency of the EDA signal (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of “neurokit” (default), “gamboa2008”, “kim2004” (the default in BioSPPy), “vanhalem2020” or “nabian2018”.
amplitude_min (float) – Only used if ‘method’ is ‘neurokit’ or ‘kim2004’. Minimum threshold by which to exclude SCRs (peaks) as relative to the largest amplitude in the signal.
- Returns
info (dict) – A dictionary containing additional information, in this case the aplitude of the SCR, the samples at which the SCR onset and the SCR peaks occur. Accessible with the keys “SCR_Amplitude”, “SCR_Onsets”, and “SCR_Peaks” respectively. It also contains the signals’ sampling rate.
signals (DataFrame) – A DataFrame of same length as the input signal in which occurences of SCR peaks are marked as “1” in lists of zeros with the same length as eda_cleaned. Accessible with the keys “SCR_Peaks”.
See also
Examples
>>> import neurokit2 as nk >>> >>> # Get phasic component >>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1, noise=0, sampling_rate=100) >>> eda_cleaned = nk.eda_clean(eda_signal, sampling_rate=100) >>> eda = nk.eda_phasic(eda_cleaned, sampling_rate=100) >>> eda_phasic = eda["EDA_Phasic"].values >>> >>> # Find peaks >>> _, kim2004 = nk.eda_peaks(eda_phasic, method="kim2004") >>> _, neurokit = nk.eda_peaks(eda_phasic, method="neurokit") >>> _, nabian2018 = nk.eda_peaks(eda_phasic, method="nabian2018") >>> nk.events_plot([nabian2018["SCR_Peaks"], kim2004["SCR_Peaks"], neurokit["SCR_Peaks"]], eda_phasic) <Figure ...>
References
Gamboa, H. (2008). Multi-modal behavioral biometrics based on hci and electrophysiology. PhD ThesisUniversidade.
Kim, K. H., Bang, S. W., & Kim, S. R. (2004). Emotion recognition system using short-term monitoring of physiological signals. Medical and biological engineering and computing, 42(3), 419-427.
van Halem, S., Van Roekel, E., Kroencke, L., Kuper, N., & Denissen, J. (2020). Moments That Matter? On the Complexity of Using Triggers Based on Skin Conductance to Sample Arousing Events Within an Experience Sampling Framework. European Journal of Personality.
Nabian, M., Yin, Y., Wormwood, J., Quigley, K. S., Barrett, L. F., & Ostadabbas, S. (2018). An Open-Source Feature Extraction Tool for the Analysis of Peripheral Physiological Data. IEEE journal of translational engineering in health and medicine, 6, 2800711. https://doi.org/10.1109/JTEHM.2018.2878000
- eda_phasic(eda_signal, sampling_rate=1000, method='highpass')[source]¶
Decompose Electrodermal Activity (EDA) into Phasic and Tonic components.
Decompose the Electrodermal Activity (EDA) into two components, namely Phasic and Tonic, using different methods including cvxEDA (Greco, 2016) or Biopac’s Acqknowledge algorithms.
- Parameters
eda_signal (Union[list, np.array, pd.Series]) – The raw EDA signal.
sampling_rate (int) – The sampling frequency of raw EDA signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – The processing pipeline to apply. Can be one of “cvxEDA”, “median”, “smoothmedian”, “highpass”, “biopac”, or “acqknowledge”.
- Returns
DataFrame – DataFrame containing the ‘Tonic’ and the ‘Phasic’ components as columns.
See also
Examples
>>> import neurokit2 as nk >>> >>> # Simulate EDA signal >>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1) >>> >>> # Decompose using different algorithms >>> cvxEDA = nk.eda_phasic(nk.standardize(eda_signal), method='cvxeda') >>> smoothMedian = nk.eda_phasic(nk.standardize(eda_signal), method='smoothmedian') >>> highpass = nk.eda_phasic(nk.standardize(eda_signal), method='highpass') >>> >>> data = pd.concat([cvxEDA.add_suffix('_cvxEDA'), smoothMedian.add_suffix('_SmoothMedian'), ... highpass.add_suffix('_Highpass')], axis=1) >>> data["EDA_Raw"] = eda_signal >>> fig = data.plot() >>> fig >>> >>> eda_signal = nk.data("bio_eventrelated_100hz")["EDA"] >>> data = nk.eda_phasic(nk.standardize(eda_signal), sampling_rate=100) >>> data["EDA_Raw"] = eda_signal >>> fig = nk.signal_plot(data, standardize=True) >>> fig
References
cvxEDA: https://github.com/lciti/cvxEDA.
Greco, A., Valenza, G., & Scilingo, E. P. (2016). Evaluation of CDA and CvxEDA Models. In Advances in Electrodermal Activity Processing with Applications for Mental Health (pp. 35-43). Springer International Publishing.
Greco, A., Valenza, G., Lanata, A., Scilingo, E. P., & Citi, L. (2016). cvxEDA: A convex optimization approach to electrodermal activity processing. IEEE Transactions on Biomedical Engineering, 63(4), 797-804.
- eda_plot(eda_signals, sampling_rate=None)[source]¶
Visualize electrodermal activity (EDA) data.
- Parameters
eda_signals (DataFrame) – DataFrame obtained from eda_process().
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second). Defaults to None.
- Returns
fig – Figure representing a plot of the processed EDA signals.
Examples
>>> import neurokit2 as nk >>> >>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1, noise=0, sampling_rate=250) >>> eda_signals, info = nk.eda_process(eda_signal, sampling_rate=250) >>> fig = nk.eda_plot(eda_signals) >>> fig
See also
- eda_process(eda_signal, sampling_rate=1000, method='neurokit')[source]¶
Process Electrodermal Activity (EDA).
Convenience function that automatically processes electrodermal activity (EDA) signal.
- Parameters
eda_signal (Union[list, np.array, pd.Series]) – The raw EDA signal.
sampling_rate (int) – The sampling frequency of rsp_signal (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of “biosppy” or “neurokit” (default).
- Returns
signals (DataFrame) – A DataFrame of same length as eda_signal containing the following columns:
”EDA_Raw”: the raw signal.
”EDA_Clean”: the cleaned signal.
”EDA_Tonic”: the tonic component of the signal, or the Tonic Skin Conductance Level (SCL).
”EDA_Phasic”: the phasic component of the signal, or the Phasic Skin Conductance Response (SCR).
”SCR_Onsets”: the samples at which the onsets of the peaks occur, marked as “1” in a list of zeros.
”SCR_Peaks”: the samples at which the peaks occur, marked as “1” in a list of zeros.
”SCR_Height”: the SCR amplitude of the signal including the Tonic component. Note that cumulative effects of close- occurring SCRs might lead to an underestimation of the amplitude.
”SCR_Amplitude”: the SCR amplitude of the signal excluding the Tonic component.
”SCR_RiseTime”: the time taken for SCR onset to reach peak amplitude within the SCR.
”SCR_Recovery”: the samples at which SCR peaks recover (decline) to half amplitude, marked as “1” in a list of zeros.
info (dict) – A dictionary containing the information of each SCR peak (see eda_findpeaks()), as well as the signals’ sampling rate.
See also
eda_simulate
,eda_clean
,eda_phasic
,eda_findpeaks
,eda_plot
Examples
>>> import neurokit2 as nk >>> >>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1, noise=0) >>> signals, info = nk.eda_process(eda_signal, sampling_rate=1000) >>> fig = nk.eda_plot(signals) >>> fig
- eda_simulate(duration=10, length=None, sampling_rate=1000, noise=0.01, scr_number=1, drift=- 0.01, random_state=None)[source]¶
Simulate Electrodermal Activity (EDA) signal.
Generate an artificial (synthetic) EDA signal of a given duration and sampling rate.
- Parameters
duration (int) – Desired recording length in seconds.
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second). Defaults to 1000Hz.
length (int) – The desired length of the signal (in samples). Defaults to None.
noise (float) – Noise level (amplitude of the laplace noise). Defaults to 0.01.
scr_number (int) – Desired number of skin conductance responses (SCRs), i.e., peaks. Defaults to 1.
drift (float or list) – The slope of a linear drift of the signal. Defaults to -0.01.
random_state (int) – Seed for the random number generator. Defaults to None.
- Returns
array – Vector containing the EDA signal.
Examples
>>> import neurokit2 as nk >>> import pandas as pd >>> >>> eda = nk.eda_simulate(duration=10, scr_number=3) >>> fig = nk.signal_plot(eda) >>> fig
See also
ecg_simulate
,rsp_simulate
,emg_simulate
,ppg_simulate
References
Bach, D. R., Flandin, G., Friston, K. J., & Dolan, R. J. (2010). Modelling event-related skin conductance responses. International Journal of Psychophysiology, 75(3), 349-356.
- eda_sympathetic(eda_signal, sampling_rate=1000, frequency_band=[0.045, 0.25], method='posada', show=False)[source]¶
Obtain electrodermal activity (EDA) indexes of sympathetic nervous system.
Derived from Posada-Quintero et al. (2016), who argue that dynamics of the sympathetic component of EDA signal is represented in the frequency band of 0.045-0.25Hz. See https://biosignal.uconn.edu/wp-content/uploads/sites/2503/2018/09/09_Posada_2016_AnnalsBME.pdf
- Parameters
eda_signal (Union[list, np.array, pd.Series]) – The EDA signal (i.e., a time series) in the form of a vector of values.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
frequency_band (list) – List indicating the frequency range to compute the the power spectral density in. Defaults to [0.045, 0.25].
method (str) – Can be one of ‘ghiasi’ or ‘posada’.
show (bool) – If True, will return a plot.
See also
signal_filter
,signal_power
,signal_psd
- Returns
dict – A dictionary containing the EDA symptathetic indexes, accessible by keys ‘EDA_Symp’ and ‘EDA_SympN’ (normalized, obtained by dividing EDA_Symp by total power). Plots power spectrum of the EDA signal within the specified frequency band if show is True.
Examples
>>> import neurokit2 as nk >>> >>> eda = nk.data('bio_resting_8min_100hz')['EDA'] >>> indexes_posada = nk.eda_sympathetic(eda, sampling_rate=100, method='posada', show=True) >>> indexes_ghiasi = nk.eda_sympathetic(eda, sampling_rate=100, method='ghiasi', show=True)
References
Ghiasi, S., Grecol, A., Nardelli, M., Catrambonel, V., Barbieri, R., Scilingo, E., & Valenza, G. (2018).
A New Sympathovagal Balance Index from Electrodermal Activity and Instantaneous Vagal Dynamics: A Preliminary Cold Pressor Study. 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). doi:10.1109/embc.2018.8512932 - Posada-Quintero, H. F., Florian, J. P., Orjuela-Cañón, A. D., Aljama-Corrales, T., Charleston-Villalobos, S., & Chon, K. H. (2016). Power spectral density analysis of electrodermal activity for sympathetic function assessment. Annals of biomedical engineering, 44(10), 3124-3135.
EMG¶
Submodule for NeuroKit.
- emg_activation(emg_amplitude=None, emg_cleaned=None, sampling_rate=1000, method='threshold', threshold='default', duration_min='default', size=None, threshold_size=None, **kwargs)[source]¶
Detects onset in EMG signal based on the amplitude threshold.
- Parameters
emg_amplitude (array) – At least one EMG-related signal. Either the amplitude of the EMG signal, obtained from
emg_amplitude()
for methods like ‘threshold’ or ‘mixture’), and / or the cleaned EMG signal (for methods like ‘pelt’, ‘biosppy’ or ‘silva’).emg_cleaned (array) – At least one EMG-related signal. Either the amplitude of the EMG signal, obtained from
emg_amplitude()
for methods like ‘threshold’ or ‘mixture’), and / or the cleaned EMG signal (for methods like ‘pelt’, ‘biosppy’ or ‘silva’).sampling_rate (int) – The sampling frequency of
emg_signal
(in Hz, i.e., samples/second).method (str) – The algorithm used to discriminate between activity and baseline. Can be one of ‘mixture’ (default) or ‘threshold’. If ‘mixture’, will use a Gaussian Mixture Model to categorize between the two states. If ‘threshold’, will consider as activated all points which amplitude is superior to the threshold. Can also be pelt or biosppy or silva.
threshold (str) – If
method
is ‘mixture’, then it corresponds to the minimum probability required to be considered as activated (default to 0.33). If method is ‘threshold’, then it corresponds to the minimum amplitude to detect as onset i.e., defaults to one tenth of the standard deviation ofemg_amplitude
. Ifmethod
is ‘silva’, defaults to 0.05. Ifmethod
is ‘biosppy’, defaults to 1.2 times of the mean of the absolute of the smoothed, full-wave-rectified signal. Ifmethod
is ‘pelt’, threshold defaults to None as changepoints are used as a basis for detection.duration_min (float) – The minimum duration of a period of activity or non-activity in seconds. If ‘default’, will be set to 0.05 (50 ms).
size (float or int) – Detection window size (seconds). Applicable only if
method
is ‘biosppy’ or ‘silva’. If None, defaults to 0.05 for ‘biosppy’ and 20 for ‘silva’.threshold_size (int) – Window size for calculation of the adaptive threshold. Must be bigger than the detection window size. Applicable only if
method
is ‘silva’. If None, defaults to 22.kwargs (optional) – Other arguments.
- Returns
info (dict) – A dictionary containing additional information, in this case the samples at which the onsets, offsets, and periods of activations of the EMG signal occur, accessible with the key “EMG_Onsets”, “EMG_Offsets”, and “EMG_Activity” respectively.
activity_signal (DataFrame) – A DataFrame of same length as the input signal in which occurences of onsets, offsets, and activity (above the threshold) of the EMG signal are marked as “1” in lists of zeros with the same length as emg_amplitude. Accessible with the keys “EMG_Onsets”, “EMG_Offsets”, and “EMG_Activity” respectively.
See also
emg_simulate
,emg_clean
,emg_amplitude
,emg_process
,emg_plot
Examples
>>> import neurokit2 as nk >>> >>> # Simulate signal and obtain amplitude >>> emg = nk.emg_simulate(duration=10, burst_number=3) >>> emg_cleaned = nk.emg_clean(emg) >>> emg_amplitude = nk.emg_amplitude(emg_cleaned) >>> >>> # Threshold method >>> activity, info = nk.emg_activation(emg_amplitude=emg_amplitude, method="threshold") >>> fig = nk.events_plot([info["EMG_Offsets"], info["EMG_Onsets"]], emg_cleaned) >>> fig >>> >>> # Threshold method >>> activity, info = nk.emg_activation(emg_cleaned=emg_cleaned, method="pelt") >>> fig = nk.events_plot([info["EMG_Offsets"], info["EMG_Onsets"]], emg_cleaned) >>> fig >>> >>> # Biosppy method >>> activity, info = nk.emg_activation(emg_cleaned=emg_cleaned, method="biosppy") >>> fig = nk.events_plot([info["EMG_Offsets"], info["EMG_Onsets"]], emg_cleaned) >>> fig >>> >>> # Silva method >>> activity, info = nk.emg_activation(emg_cleaned=emg_cleaned, method="silva") >>> fig = nk.events_plot([info["EMG_Offsets"], info["EMG_Onsets"]], emg_cleaned) >>> fig
References
BioSPPy: https://github.com/PIA-Group/BioSPPy/blob/master/biosppy/signals/emg.py
Modified emg.py for BioSPPy: https://gist.github.com/tostasmistas/747f4585198411c8c4bda5f312f27dfb
Silva H, Scherer R, Sousa J, Londral A , “Towards improving the ssability of
electromyographic interfacess”, Journal of Oral Rehabilitation, pp. 1–2, 2012.
- emg_amplitude(emg_cleaned)[source]¶
Compute electromyography (EMG) amplitude.
Compute electromyography amplitude given the cleaned respiration signal, done by calculating the linear envelope of the signal.
- Parameters
emg_cleaned (Union[list, np.array, pd.Series]) – The cleaned electromyography channel as returned by emg_clean().
- Returns
array – A vector containing the electromyography amplitude.
See also
emg_clean
,emg_rate
,emg_process
,emg_plot
Examples
>>> import neurokit2 as nk >>> import pandas as pd >>> >>> emg = nk.emg_simulate(duration=10, sampling_rate=1000, burst_number=3) >>> cleaned = nk.emg_clean(emg, sampling_rate=1000) >>> >>> amplitude = nk.emg_amplitude(cleaned) >>> fig = pd.DataFrame({"EMG": emg, "Amplitude": amplitude}).plot(subplots=True) >>> fig
- emg_analyze(data, sampling_rate=1000, method='auto')[source]¶
Performs EMG analysis on either epochs (event-related analysis) or on longer periods of data such as resting- state data.
- Parameters
data (Union[dict, pd.DataFrame]) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by emg_process() or bio_process(). Can also take a dict containing sets of separate periods of data.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’ where the right method will be chosen based on the mean duration of the data (‘event-related’ for duration under 10s).
- Returns
DataFrame – A dataframe containing the analyzed EMG features. If event-related analysis is conducted, each epoch is indicated by the Label column. See emg_eventrelated() and emg_intervalrelated() docstrings for details.
See also
bio_process
,emg_process
,epochs_create
,emg_eventrelated
,emg_intervalrelated
Examples
>>> import neurokit2 as nk >>> import pandas as pd
>>> # Example with simulated data >>> emg = nk.emg_simulate(duration=20, sampling_rate=1000, burst_number=3) >>> emg_signals, info = nk.emg_process(emg, sampling_rate=1000) >>> epochs = nk.epochs_create(emg_signals, events=[3000, 6000, 9000], sampling_rate=1000, ... epochs_start=-0.1, epochs_end=1.9) >>> >>> # Event-related analysis >>> analyze_epochs = nk.emg_analyze(epochs, method="event-related") >>> analyze_epochs >>> >>> # Interval-related analysis >>> analyze_df = nk.emg_analyze(emg_signals, method="interval-related") >>> analyze_df
- emg_clean(emg_signal, sampling_rate=1000)[source]¶
Preprocess an electromyography (emg) signal.
Clean an EMG signal using a set of parameters, such as: in `BioSPPy <https://github.com/PIA-Group/BioSPPy/blob/e65da30f6379852ecb98f8e2e0c9b4b5175416c3/biosppy/signals/emg.py>>`_: fourth order 100 Hz highpass Butterworth filter followed by a constant detrending.
- Parameters
emg_signal (Union[list, np.array, pd.Series]) – The raw EMG channel.
sampling_rate (int) – The sampling frequency of emg_signal (in Hz, i.e., samples/second). Defaults to 1000.
- Returns
array – Vector containing the cleaned EMG signal.
See also
Examples
>>> import pandas as pd >>> import neurokit2 as nk >>> >>> emg = nk.emg_simulate(duration=10, sampling_rate=1000) >>> signals = pd.DataFrame({"EMG_Raw": emg, "EMG_Cleaned":nk.emg_clean(emg, sampling_rate=1000)}) >>> fig = signals.plot() >>> fig
Performs event-related EMG analysis on epochs.
- Parameters
epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().
silent (bool) – If True, silence possible warnings.
- Returns
DataFrame – A dataframe containing the analyzed EMG features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following:
”EMG_Activation”: indication of whether there is muscular activation following the onset
of the event (1 if present, 0 if absent) and if so, its corresponding amplitude features and the number of activations in each epoch. If there is no activation, nans are displayed for the below features. - “EMG_Amplitude_Mean”: the mean amplitude of the activity. - “EMG_Amplitude_Max”: the maximum amplitude of the activity. - “EMG_Amplitude_SD”: the standard deviation of the activity amplitude. - “EMG_Amplitude_Max_Time”: the time of maximum amplitude. - “EMG_Bursts”: the number of activations, or bursts of activity, within each epoch.
See also
emg_simulate
,emg_process
,events_find
,epochs_create
Examples
>>> import neurokit2 as nk >>> >>> # Example with simulated data >>> emg = nk.emg_simulate(duration=20, sampling_rate=1000, burst_number=3) >>> emg_signals, info = nk.emg_process(emg, sampling_rate=1000) >>> epochs = nk.epochs_create(emg_signals, events=[3000, 6000, 9000], sampling_rate=1000, ... epochs_start=-0.1,epochs_end=1.9) >>> nk.emg_eventrelated(epochs)
Performs EMG analysis on longer periods of data (typically > 10 seconds), such as resting-state data.
- Parameters
data (Union[dict, pd.DataFrame]) – A DataFrame containing the different processed signal(s) as different columns, typically generated by emg_process() or bio_process(). Can also take a dict containing sets of separately processed DataFrames.
- Returns
DataFrame – A dataframe containing the analyzed EMG features. The analyzed features consist of the following: - “EMG_Activation_N”: the number of bursts of muscular activity. - “EMG_Amplitude_Mean”: the mean amplitude of the muscular activity.
See also
bio_process
,emg_eventrelated
Examples
>>> import neurokit2 as nk >>> >>> # Example with simulated data >>> emg = nk.emg_simulate(duration=40, sampling_rate=1000, burst_number=3) >>> emg_signals, info = nk.emg_process(emg, sampling_rate=1000) >>> >>> # Single dataframe is passed >>> nk.emg_intervalrelated(emg_signals) >>> >>> epochs = nk.epochs_create(emg_signals, events=[0, 20000], sampling_rate=1000, epochs_end=20) >>> nk.emg_intervalrelated(epochs)
- emg_plot(emg_signals, sampling_rate=None)[source]¶
Visualize electromyography (EMG) data.
- Parameters
emg_signals (DataFrame) – DataFrame obtained from emg_process().
sampling_rate (int) – The sampling frequency of the EMG (in Hz, i.e., samples/second). Needs to be supplied if the data should be plotted over time in seconds. Otherwise the data is plotted over samples. Defaults to None.
- Returns
fig – Figure representing a plot of the processed emg signals.
Examples
>>> import neurokit2 as nk >>> >>> emg = nk.emg_simulate(duration=10, sampling_rate=1000, burst_number=3) >>> emg_signals, _ = nk.emg_process(emg, sampling_rate=1000) >>> fig = nk.emg_plot(emg_signals) >>> fig
See also
ecg_process
- emg_process(emg_signal, sampling_rate=1000)[source]¶
Process a electromyography (EMG) signal.
Convenience function that automatically processes an electromyography signal.
- Parameters
emg_signal (Union[list, np.array, pd.Series]) – The raw electromyography channel.
sampling_rate (int) – The sampling frequency of emg_signal (in Hz, i.e., samples/second).
- Returns
signals (DataFrame) – A DataFrame of same length as emg_signal containing the following columns: - “EMG_Raw”: the raw signal. - “EMG_Clean”: the cleaned signal. - “EMG_Amplitude”: the signal amplitude, or the activation level of the signal. - “EMG_Activity”: the activity of the signal for which amplitude exceeds the threshold specified, marked as “1” in a list of zeros. - “EMG_Onsets”: the onsets of the amplitude, marked as “1” in a list of zeros. - “EMG_Offsets”: the offsets of the amplitude, marked as “1” in a list of zeros.
info (dict) – A dictionary containing the information of each amplitude onset, offset, and peak activity (see emg_activation()), as well as the signals’ sampling rate.
See also
Examples
>>> import neurokit2 as nk >>> >>> emg = nk.emg_simulate(duration=10, sampling_rate=1000, burst_number=3) >>> signals, info = nk.emg_process(emg, sampling_rate=1000) >>> fig = nk.emg_plot(signals) >>> fig
- emg_simulate(duration=10, length=None, sampling_rate=1000, noise=0.01, burst_number=1, burst_duration=1.0, random_state=42)[source]¶
Simulate an EMG signal.
Generate an artificial (synthetic) EMG signal of a given duration and sampling rate.
- Parameters
duration (int) – Desired recording length in seconds.
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).
length (int) – The desired length of the signal (in samples).
noise (float) – Noise level (gaussian noise).
burst_number (int) – Desired number of bursts of activity (active muscle periods).
burst_duration (float or list) – Duration of the bursts. Can be a float (each burst will have the same duration) or a list of durations for each bursts.
random_state (int) – Seed for the random number generator.
- Returns
array – Vector containing the EMG signal.
Examples
>>> import neurokit2 as nk >>> import pandas as pd >>> >>> emg = nk.emg_simulate(duration=10, burst_number=3) >>> fig = nk.signal_plot(emg) >>> fig
See also
ecg_simulate
,rsp_simulate
,eda_simulate
,ppg_simulate
References
This function is based on this script.
EEG¶
Submodule for NeuroKit.
- eeg_badchannels(eeg, bad_threshold=0.5, distance_threshold=0.99)[source]¶
Find bad channels.
- Parameters
eeg (np.ndarray) – An array (channels, times) of M/EEG data or a Raw or Epochs object from MNE.
bad_threshold (float) – The proportion of indices (for instance, the mean, the SD, the skewness, the kurtosis, etc.) on which an observation is considered an outlier to be considered as bad. The default, 0.5, means that a channel must score as an outlier on half or more of the indices.
distance_threshold (float) – The quantile that desfines the absolute distance from the mean, i.e., the z-score for a value of a variable to be considered an outlier. For instance, .975 becomes
scipy.stats.norm.ppf(.975) ~= 1.96
. The default value (.99) means that all observations beyond 2.33 SD from the mean will be classified as outliers.
- Returns
list – List of bad channel names
DataFrame – Information of each channel, such as standard deviation (SD), mean, median absolute deviation (MAD), skewness, kurtosis, amplitude, highest density intervals, number of zero crossings.
Examples
>>> import neurokit2 as nk >>> >>> eeg = nk.mne_data("filt-0-40_raw") >>> bads, info = nk.eeg_badchannels(eeg)
- eeg_diss(eeg, gfp=None, **kwargs)[source]¶
Global dissimilarity (DISS)
Global dissimilarity (DISS) is an index of configuration differences between two electric fields, independent of their strength. Like GFP, DISS was first introduced by Lehmann and Skrandies (1980). This parameter equals the square root of the mean of the squared differences between the potentials measured at each electrode (versus the average reference), each of which is first scaled to unitary strength by dividing by the instantaneous GFP.
- Parameters
eeg (np.ndarray) – An array (channels, times) of M/EEG data or a Raw or Epochs object from MNE.
gfp (list) – The Global Field Power (GFP). If None, will be obtained via
eeg_gfp()
.**kwargs – Optional arguments to be passed into
nk.eeg_gfp()
.
- Returns
np.ndarray – DISS of each sample point in the data.
Examples
>>> import neurokit2 as nk >>> >>> eeg = nk.mne_data("filt-0-40_raw") >>> eeg = eeg.set_eeg_reference('average') >>> >>> gfp = nk.eeg_gfp(eeg) >>> diss = nk.eeg_diss(eeg, gfp=gfp) >>> nk.signal_plot([gfp[0:300], diss[0:300]], standardize=True)
References
Lehmann, D., & Skrandies, W. (1980). Reference-free identification of components of
checkerboard-evoked multichannel potential fields. Electroencephalography and clinical neurophysiology, 48(6), 609-621.
- eeg_gfp(eeg, sampling_rate=None, normalize=False, method='l1', smooth=0, robust=False, standardize_eeg=False)[source]¶
Global Field Power (GFP)
Global Field Power (GFP) constitutes a reference-independent measure of response strength. GFP was first introduced by Lehmann and Skrandies (1980) and has since become a commonplace measure among M/EEG users. Mathematically, GFP is the standard deviation of all electrodes at a given time.
- Parameters
eeg (array) – An array (channels, times) of M/EEG data or a Raw or Epochs object from MNE.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Only necessary if smoothing is requested.
normalize (bool) – Normalize (divide each data point by the maximum value of the data) across time prior to GFP extraction.
method (str) – Can be either ‘l1’ or ‘l2’ to use the L1 or L2 norm.
smooth (float) – Can be either None or a float. If a float, will use this value, multiplied by the sampling rate.
robust (bool) – If True, the GFP extraction (the data standardization if requested) will be done using the median/MAD instead of the mean/SD.
standardize_eeg (bool) – Standardize (z-score) the data across time prior to GFP extraction using
nk.standardize()
.
- Returns
gfp (array) – The global field power of each sample point in the data.
Examples
>>> import neurokit2 as nk >>> >>> eeg = nk.mne_data("filt-0-40_raw") >>> eeg = nk.eeg_rereference(eeg, 'average') >>> eeg = eeg.get_data()[:, 0:500] # Get the 500 first data points >>> >>> # Compare L1 and L2 norms >>> l1 = nk.eeg_gfp(eeg, method="l1", normalize=True) >>> l2 = nk.eeg_gfp(eeg, method="l2", normalize=True) >>> nk.signal_plot([l1, l2]) >>> >>> # Mean-based vs. Median-based >>> gfp = nk.eeg_gfp(eeg, normalize=True) >>> gfp_r = nk.eeg_gfp(eeg, normalize=True, robust=True) >>> nk.signal_plot([gfp, gfp_r]) >>> >>> # Standardize the data >>> gfp = nk.eeg_gfp(eeg, normalize=True) >>> gfp_z = nk.eeg_gfp(eeg, normalize=True, standardize_eeg=True) >>> nk.signal_plot([gfp, gfp_z])
References
Lehmann, D., & Skrandies, W. (1980). Reference-free identification of components of
checkerboard-evoked multichannel potential fields. Electroencephalography and clinical neurophysiology, 48(6), 609-621.
- eeg_rereference(eeg, reference='average', robust=False, **kwargs)[source]¶
EEG Rereferencing
- Parameters
eeg (np.ndarray) – An array (channels, times) of M/EEG data or a Raw or Epochs object from MNE.
reference (str) – See
mne.set_eeg_reference()
. Can be a string (e.g., ‘average’, ‘lap’ for Laplacian “reference-free” transformation, i.e., CSD), or a list (e.g., [‘TP9’, ‘TP10’] for mastoid reference).robust (bool) – Only applied if reference is ‘average’. If True, will substract the median instead of the mean.
**kwargs – Optional arguments to be passed into
mne.set_eeg_rereference()
.
- Returns
object – The rereferenced raw mne object.
Examples
>>> import neurokit2 as nk >>> >>> eeg = nk.mne_data("filt-0-40_raw") >>> >>> # Difference between robust average >>> avg = nk.eeg_rereference(eeg, 'average', robust=False) >>> avg_r = nk.eeg_rereference(eeg, 'average', robust=True) >>> >>> nk.signal_plot([avg.get_data()[0, 0:1000], ... avg_r.get_data()[0, 0:1000]]) >>> >>> # Compare the rerefering of an array vs. the MNE object >>> data_mne = eeg.copy().set_eeg_reference('average', verbose=False).get_data() >>> data_nk = nk.eeg_rereference(eeg.get_data(), 'average') >>> >>> # Difference between average and LAP >>> lap = nk.eeg_rereference(eeg, 'lap') >>> >>> nk.signal_plot([avg.get_data()[0, 0:1000], ... lap.get_data()[0, 0:1000]], standardize=True)
References
Trujillo, L. T., Stanfield, C. T., & Vela, R. D. (2017). The effect of electroencephalogram (EEG)
reference choice on information-theoretic measures of the complexity and integration of EEG signals. Frontiers in Neuroscience, 11, 425.
- mne_channel_add(raw, channel, channel_type=None, channel_name=None, sync_index_raw=0, sync_index_channel=0)[source]¶
Add channel as array to MNE.
Add a channel to a mne’s Raw m/eeg file. It will basically synchronize the channel to the eeg data following a particular index and add it.
- Parameters
raw (mne.io.Raw) – Raw EEG data from MNE.
channel (list or array) – The signal to be added.
channel_type (str) – Channel type. Currently supported fields are ‘ecg’, ‘bio’, ‘stim’, ‘eog’, ‘misc’, ‘seeg’, ‘ecog’, ‘mag’, ‘eeg’, ‘ref_meg’, ‘grad’, ‘emg’, ‘hbr’ or ‘hbo’.
channel_name (str) – Desired channel name.
sync_index_raw (int or list) – An index (e.g., the onset of the same event marked in the same signal), in the raw data, by which to align the two inputs. This can be used in case the EEG data and the channel to add do not have the same onsets and must be aligned through some common event.
sync_index_channel (int or list) – An index (e.g., the onset of the same event marked in the same signal), in the channel to add, by which to align the two inputs. This can be used in case the EEG data and the channel to add do not have the same onsets and must be aligned through some common event.
- Returns
mne.io.Raw – Raw data in FIF format.
Example
>>> import neurokit2 as nk >>> import mne >>> >>> raw = nk.mne_data("filt-0-40_raw") >>> ecg = nk.ecg_simulate(length=50000) >>> >>> # Let the 42nd sample point in the EEG signal correspond to the 333rd point in the ECG >>> event_index_in_eeg = 42 >>> event_index_in_ecg = 333 >>> >>> raw = nk.mne_channel_add(raw, ... ecg, ... sync_index_raw=event_index_in_eeg, ... sync_index_channel=event_index_in_ecg, ... channel_type="ecg")
- mne_channel_extract(raw, what, name=None)[source]¶
Channel array extraction from MNE.
Select one or several channels by name and returns them in a dataframe.
- Parameters
raw (mne.io.Raw) – Raw EEG data.
what (str or list) – Can be ‘MEG’, which will extract all MEG channels, ‘EEG’, which will extract all EEG channels, or ‘EOG’, which will extract all EOG channels (that is, if channel names are named with prefixes of their type e.g., ‘EEG 001’ etc. or ‘EOG 061’). Provide exact a single or a list of channel’s name(s) if not (e.g., [‘124’, ‘125’]).
name (str or list) – Useful only when extracting one channel. Can also take a list of names for renaming multiple channels, Otherwise, defaults to None.
- Returns
DataFrame – A DataFrame or Series containing the channel(s).
Example
>>> import neurokit2 as nk >>> import mne >>> >>> raw = mne.io.read_raw_fif(mne.datasets.sample.data_path() + ... '/MEG/sample/sample_audvis_raw.fif', preload=True) >>> >>> raw_channel = nk.mne_channel_extract(raw, what=["EEG 060", "EEG 055"], name=['060', '055']) >>> eeg_channels = nk.mne_channel_extract(raw, "EEG") >>> eog_channels = nk.mne_channel_extract(raw, what='EOG', name='EOG')
- mne_data(what='raw', path=None)[source]¶
Utility function to easily access MNE datasets
- Parameters
what (str) – Can be ‘raw’ or ‘filt-0-40_raw’ (a filtered version).
path (str) – Defaults to None, assuming that the MNE data folder already exists. If not, specify the directory to download the folder.
- Returns
object – The raw mne object.
Examples
>>> import neurokit2 as nk >>> >>> raw = nk.mne_data(what="raw")
- mne_to_df(eeg)[source]¶
Convert mne Raw or Epochs object to dataframe or dict of dataframes.
- Parameters
eeg (Union[mne.io.Raw, mne.Epochs]) – Raw or Epochs M/EEG data from MNE.
- Returns
DataFrame – A DataFrame containing all epochs identifiable by the ‘Label’ column, which time axis is stored in the ‘Time’ column.
Examples
>>> import neurokit2 as nk >>> >>> eeg = nk.mne_data("filt-0-40_raw") >>> eeg = nk.eeg_rereference(eeg, 'average') >>> >>> gfp = nk.eeg_gfp(eeg) >>> gfp_z = nk.eeg_gfp(eeg, normalize=True) >>> gfp_zr = nk.eeg_gfp(eeg, normalize=True, robust=True) >>> gfp_s = nk.eeg_gfp(eeg, smooth=0.05) >>> nk.signal_plot([gfp[0:500], gfp_z[0:500], gfp_zr[0:500], gfp_s[0:500]], standardize=True)
- mne_to_dict(eeg)[source]¶
Convert MNE Raw or Epochs object to a dictionnary.
- Parameters
eeg (Union[mne.io.Raw, mne.Epochs]) – Raw or Epochs M/EEG data from MNE.
- Returns
DataFrame – A DataFrame containing all epochs identifiable by the ‘Label’ column, which time axis is stored in the ‘Time’ column.
Examples
>>> import neurokit2 as nk >>> import mne >>> >>> raw = nk.mne_data() >>> >>> nk.mne_to_dict(raw)
Signal Processing¶
Submodule for NeuroKit.
- signal_autocor(signal, lag=None, normalize=True)[source]¶
Auto-correlation of a 1-dimensional sequences.
- Parameters
signal (Union[list, np.array, pd.Series]) – Vector of values.
normalize (bool) – Normalize the autocorrelation output.
lag (int) – Time lag. If specified, one value of autocorrelation between signal with its lag self will be returned.
- Returns
r – The cross-correlation of the signal with itself at different time lags. Minimum time lag is 0, maximum time lag is the length of the signal. Or a correlation value at a specific lag if lag is not None.
Examples
>>> import neurokit2 as nk >>> >>> x = [1, 2, 3, 4, 5] >>> autocor = nk.signal_autocor(x) >>> autocor
- signal_binarize(signal, method='threshold', threshold='auto')[source]¶
Binarize a continuous signal.
Convert a continuous signal into zeros and ones depending on a given threshold.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
method (str) – The algorithm used to discriminate between the two states. Can be one of ‘mixture’ (default) or ‘threshold’. If ‘mixture’, will use a Gaussian Mixture Model to categorize between the two states. If ‘threshold’, will consider as activated all points which value is superior to the threshold.
threshold (float) – If method is ‘mixture’, then it corresponds to the minimum probability required to be considered as activated (if ‘auto’, then 0.5). If method is ‘threshold’, then it corresponds to the minimum amplitude to detect as onset. If “auto”, takes the value between the max and the min.
- Returns
list – A list or array depending on the type passed.
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> >>> signal = np.cos(np.linspace(start=0, stop=20, num=1000)) >>> binary = nk.signal_binarize(signal) >>> fig = pd.DataFrame({"Raw": signal, "Binary": binary}).plot() >>> fig
- signal_changepoints(signal, change='meanvar', penalty=None, show=False)[source]¶
Change Point Detection.
Only the PELT method is implemented for now.
- Parameters
signal (Union[list, np.array, pd.Series]) – Vector of values.
change (str) – Can be one of “meanvar” (default), “mean” or “var”.
penalty (float) – The algorithm penalty. Default to
np.log(len(signal))
.show (bool) – Defaults to False.
- Returns
Array – Values indicating the samples at which the changepoints occur.
Fig – Figure of plot of signal with markers of changepoints.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.emg_simulate(burst_number=3) >>> fig = nk.signal_changepoints(signal, change="var", show=True) >>> fig
References
Killick, R., Fearnhead, P., & Eckley, I. A. (2012). Optimal detection of changepoints with a linear
computational cost. Journal of the American Statistical Association, 107(500), 1590-1598.
- signal_decompose(signal, method='emd', n_components=None, **kwargs)[source]¶
Decompose a signal.
Signal decomposition into different sources using different methods, such as Empirical Mode Decomposition (EMD) or Singular spectrum analysis (SSA)-based signal separation method.
The extracted components can then be recombined into meaningful sources using
signal_recompose()
.- Parameters
signal (Union[list, np.array, pd.Series]) – Vector of values.
method (str) – The decomposition method. Can be one of ‘emd’ or ‘ssa’.
n_components (int) – Number of components to extract. Only used for ‘ssa’ method. If
None
, will default to 50.**kwargs – Other arguments passed to other functions.
- Returns
Array – Components of the decomposed signal.
See also
Examples
>>> import neurokit2 as nk >>> >>> # Create complex signal >>> signal = nk.signal_simulate(duration=10, frequency=1, noise=0.01) # High freq >>> signal += 3 * nk.signal_simulate(duration=10, frequency=3, noise=0.01) # Higher freq >>> signal += 3 * np.linspace(0, 2, len(signal)) # Add baseline and trend >>> signal += 2 * nk.signal_simulate(duration=10, frequency=0.1, noise=0) >>> >>> nk.signal_plot(signal) >>> >>> # EMD method >>> components = nk.signal_decompose(signal, method="emd") >>> fig = nk.signal_plot(components) # Visualize components >>> fig >>> >>> # SSA method >>> components = nk.signal_decompose(signal, method="ssa", n_components=5) >>> fig = nk.signal_plot(components) # Visualize components >>> fig
- signal_detrend(signal, method='polynomial', order=1, regularization=500, alpha=0.75, window=1.5, stepsize=0.02)[source]¶
Polynomial detrending of signal.
Apply a baseline (order = 0), linear (order = 1), or polynomial (order > 1) detrending to the signal (i.e., removing a general trend). One can also use other methods, such as smoothness priors approach described by Tarvainen (2002) or LOESS regression, but these scale badly for long signals.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
method (str) – Can be one of ‘polynomial’ (default; traditional detrending of a given order) or ‘tarvainen2002’ to use the smoothness priors approach described by Tarvainen (2002) (mostly used in HRV analyses as a lowpass filter to remove complex trends), ‘loess’ for LOESS smoothing trend removal or ‘locreg’ for local linear regression (the ‘runline’ algorithm from chronux).
order (int) – Only used if method is ‘polynomial’. The order of the polynomial. 0, 1 or > 1 for a baseline (‘constant detrend’, i.e., remove only the mean), linear (remove the linear trend) or polynomial detrending, respectively. Can also be ‘auto’, it which case it will attempt to find the optimal order to minimize the RMSE.
regularization (int) – Only used if method=’tarvainen2002’. The regularization parameter (default to 500).
alpha (float) – Only used if method is ‘loess’. The parameter which controls the degree of smoothing.
window (float) – Only used if method is ‘locreg’. The detrending ‘window’ should correspond to the desired low frequency band to remove multiplied by the sampling rate (for instance,
1.5*1000
will remove frequencies below 1.5Hz for a signal sampled at 1000Hz).stepsize (float) – Only used if method is ‘locreg’. Simialrly to ‘window’, ‘stepsize’ should also be multiplied by the sampling rate.
- Returns
array – Vector containing the detrended signal.
See also
signal_filter
,fit_loess
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> import matplotlib.pyplot as plt >>> >>> # Simulate signal with low and high frequency >>> signal = nk.signal_simulate(frequency=[0.1, 2], amplitude=[2, 0.5], sampling_rate=100) >>> signal = signal + (3 + np.linspace(0, 6, num=len(signal))) # Add baseline and linear trend >>> >>> # Apply detrending algorithms >>> baseline = nk.signal_detrend(signal, order=0) # Constant detrend (removes the mean) >>> linear = nk.signal_detrend(signal, order=1) # Linear detrend >>> quadratic = nk.signal_detrend(signal, order=2) # Quadratic detrend >>> cubic = nk.signal_detrend(signal, order=3) # Cubic detrend >>> poly10 = nk.signal_detrend(signal, order=10) # Linear detrend (10th order) >>> tarvainen = nk.signal_detrend(signal, method='tarvainen2002') # Tarvainen (2002) method >>> loess = nk.signal_detrend(signal, method='loess') # LOESS detrend (smooth removal) >>> locreg = nk.signal_detrend(signal, method='locreg', ... window=1.5*100, stepsize=0.02*100) # Local regression (100Hz) >>> >>> # Visualize different methods >>> axes = pd.DataFrame({"Original signal": signal, ... "Baseline": baseline, ... "Linear": linear, ... "Quadratic": quadratic, ... "Cubic": cubic, ... "Polynomial (10th)": poly10, ... "Tarvainen": tarvainen, ... "LOESS": loess, ... "Local Regression": locreg}).plot(subplots=True) >>> # Plot horizontal lines to better visualize the detrending >>> for subplot in axes: ... subplot.axhline(y=0, color='k', linestyle='--')
References
`Tarvainen, M. P., Ranta-Aho, P. O., & Karjalainen, P. A. (2002). An advanced detrending method
with application to HRV analysis. IEEE Transactions on Biomedical Engineering, 49(2), 172-175. <https://ieeexplore.ieee.org/document/979357>`_
- signal_distort(signal, sampling_rate=1000, noise_shape='laplace', noise_amplitude=0, noise_frequency=100, powerline_amplitude=0, powerline_frequency=50, artifacts_amplitude=0, artifacts_frequency=100, artifacts_number=5, linear_drift=False, random_state=None, silent=False)[source]¶
Signal distortion.
Add noise of a given frequency, amplitude and shape to a signal.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
noise_shape (str) – The shape of the noise. Can be one of ‘laplace’ (default) or ‘gaussian’.
noise_amplitude (float) – The amplitude of the noise (the scale of the random function, relative to the standard deviation of the signal).
noise_frequency (float) – The frequency of the noise (in Hz, i.e., samples/second).
powerline_amplitude (float) – The amplitude of the powerline noise (relative to the standard deviation of the signal).
powerline_frequency (float) – The frequency of the powerline noise (in Hz, i.e., samples/second).
artifacts_amplitude (float) – The amplitude of the artifacts (relative to the standard deviation of the signal).
artifacts_frequency (int) – The frequency of the artifacts (in Hz, i.e., samples/second).
artifacts_number (int) – The number of artifact bursts. The bursts have a random duration between 1 and 10% of the signal duration.
linear_drift (bool) – Whether or not to add linear drift to the signal.
random_state (int) – Seed for the random number generator. Keep it fixed for reproducible results.
silent (bool) – Whether or not to display warning messages.
- Returns
array – Vector containing the distorted signal.
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=10, frequency=0.5) >>> >>> # Noise >>> noise = pd.DataFrame({"Freq100": nk.signal_distort(signal, noise_frequency=200), ... "Freq50": nk.signal_distort(signal, noise_frequency=50), ... "Freq10": nk.signal_distort(signal, noise_frequency=10), ... "Freq5": nk.signal_distort(signal, noise_frequency=5), ... "Raw": signal}).plot() >>> noise >>> >>> # Artifacts >>> artifacts = pd.DataFrame({"1Hz": nk.signal_distort(signal, noise_amplitude=0, ... artifacts_frequency=1, artifacts_amplitude=0.5), ... "5Hz": nk.signal_distort(signal, noise_amplitude=0, ... artifacts_frequency=5, artifacts_amplitude=0.2), ... "Raw": signal}).plot() >>> artifacts
- signal_filter(signal, sampling_rate=1000, lowcut=None, highcut=None, method='butterworth', order=2, window_size='default', powerline=50)[source]¶
Filter a signal using ‘butterworth’, ‘fir’ or ‘savgol’ filters.
Apply a lowpass (if ‘highcut’ frequency is provided), highpass (if ‘lowcut’ frequency is provided) or bandpass (if both are provided) filter to the signal.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values. or “bandstop”.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
lowcut (float) – Lower cutoff frequency in Hz. The default is None.
highcut (float) – Upper cutoff frequency in Hz. The default is None.
method (str) – Can be one of ‘butterworth’, ‘fir’, ‘bessel’ or ‘savgol’. Note that for Butterworth, the function uses the SOS method from scipy.signal.sosfiltfilt, recommended for general purpose filtering. One can also specify “butterworth_ba’ for a more traditional and legacy method (often implemented in other software).
order (int) – Only used if method is ‘butterworth’ or ‘savgol’. Order of the filter (default is 2).
window_size (int) – Only used if method is ‘savgol’. The length of the filter window (i.e. the number of coefficients). Must be an odd integer. If ‘default’, will be set to the sampling rate divided by 10 (101 if the sampling rate is 1000 Hz).
powerline (int) – Only used if method is ‘powerline’. The powerline frequency (normally 50 Hz or 60 Hz).
See also
- Returns
array – Vector containing the filtered signal.
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=10, frequency=0.5) # Low freq >>> signal += nk.signal_simulate(duration=10, frequency=5) # High freq >>> >>> # Lowpass >>> fig1 = pd.DataFrame({"Raw": signal, ... "Butter_2": nk.signal_filter(signal, highcut=3, method='butterworth', order=2), ... "Butter_2_BA": nk.signal_filter(signal, highcut=3, method='butterworth_ba', order=2), ... "Butter_5": nk.signal_filter(signal, highcut=3, method='butterworth', order=5), ... "Butter_5_BA": nk.signal_filter(signal, highcut=3, method='butterworth_ba', order=5), ... "Bessel_2": nk.signal_filter(signal, highcut=3, method='bessel', order=2), ... "Bessel_5": nk.signal_filter(signal, highcut=3, method='bessel', order=5), ... "FIR": nk.signal_filter(signal, highcut=3, method='fir')}).plot(subplots=True) >>> fig1
>>> # Highpass >>> fig2 = pd.DataFrame({"Raw": signal, ... "Butter_2": nk.signal_filter(signal, lowcut=2, method='butterworth', order=2), ... "Butter_2_ba": nk.signal_filter(signal, lowcut=2, method='butterworth_ba', order=2), ... "Butter_5": nk.signal_filter(signal, lowcut=2, method='butterworth', order=5), ... "Butter_5_BA": nk.signal_filter(signal, lowcut=2, method='butterworth_ba', order=5), ... "Bessel_2": nk.signal_filter(signal, lowcut=2, method='bessel', order=2), ... "Bessel_5": nk.signal_filter(signal, lowcut=2, method='bessel', order=5), ... "FIR": nk.signal_filter(signal, lowcut=2, method='fir')}).plot(subplots=True) >>> fig2
>>> # Bandpass in real-life scenarios >>> original = nk.rsp_simulate(duration=30, method="breathmetrics", noise=0) >>> signal = nk.signal_distort(original, noise_frequency=[0.1, 2, 10, 100], noise_amplitude=1, ... powerline_amplitude=1) >>> >>> # Bandpass between 10 and 30 breaths per minute (respiratory rate range) >>> fig3 = pd.DataFrame({"Raw": signal, ... "Butter_2": nk.signal_filter(signal, lowcut=10/60, highcut=30/60, ... method='butterworth', order=2), ... "Butter_2_BA": nk.signal_filter(signal, lowcut=10/60, highcut=30/60, ... method='butterworth_ba', order=2), ... "Butter_5": nk.signal_filter(signal, lowcut=10/60, highcut=30/60, ... method='butterworth', order=5), ... "Butter_5_BA": nk.signal_filter(signal, lowcut=10/60, highcut=30/60, ... method='butterworth_ba', order=5), ... "Bessel_2": nk.signal_filter(signal, lowcut=10/60, highcut=30/60, ... method='bessel', order=2), ... "Bessel_5": nk.signal_filter(signal, lowcut=10/60, highcut=30/60, ... method='bessel', order=5), ... "FIR": nk.signal_filter(signal, lowcut=10/60, highcut=30/60, ... method='fir'), ... "Savgol": nk.signal_filter(signal, method='savgol')}).plot(subplots=True) >>> fig3
- signal_findpeaks(signal, height_min=None, height_max=None, relative_height_min=None, relative_height_max=None, relative_mean=True, relative_median=False, relative_max=False)[source]¶
Find peaks in a signal.
Locate peaks (local maxima) in a signal and their related characteristics, such as height (prominence), width and distance with other peaks.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
height_min (float) – The minimum height (i.e., amplitude in terms of absolute values). For example,``height_min=20`` will remove all peaks which height is smaller or equal to 20 (in the provided signal’s values).
height_max (float) – The maximum height (i.e., amplitude in terms of absolute values).
relative_height_min (float) – The minimum height (i.e., amplitude) relative to the sample (see below). For example,
relative_height_min=-2.96
will remove all peaks which height lies below 2.96 standard deviations from the mean of the heights.relative_height_max (float) – The maximum height (i.e., amplitude) relative to the sample (see below).
relative_mean (bool) – If a relative threshold is specified, how should it be computed (i.e., relative to what?).
relative_mean=True
will use Z-scores.relative_median (bool) – If a relative threshold is specified, how should it be computed (i.e., relative to what?). Relative to median uses a more robust form of standardization (see
standardize()
).relative_max (bool) – If a relative threshold is specified, how should it be computed (i.e., relative to what?). Reelative to max will consider the maximum height as the reference.
- Returns
dict – Returns a dict itself containing 5 arrays: - ‘Peaks’ contains the peaks indices (as relative to the given signal). For instance, the value 3 means that the third data point of the signal is a peak. - ‘Distance’ contains, for each peak, the closest distance with another peak. Note that these values will be recomputed after filtering to match the selected peaks. - ‘Height’ contains the prominence of each peak. See scipy.signal.peak_prominences(). - ‘Width’ contains the width of each peak. See scipy.signal.peak_widths(). - ‘Onset’ contains the onset, start (or left trough), of each peak. - ‘Offset’ contains the offset, end (or right trough), of each peak.
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> import scipy.misc >>> >>> signal = nk.signal_simulate(duration=5) >>> info = nk.signal_findpeaks(signal) >>> fig1 = nk.events_plot([info["Onsets"], info["Peaks"]], signal) >>> fig1 >>> >>> signal = nk.signal_distort(signal) >>> info = nk.signal_findpeaks(signal, height_min=1) >>> fig2 = nk.events_plot(info["Peaks"], signal) >>> fig2 >>> >>> # Filter peaks >>> ecg = scipy.misc.electrocardiogram() >>> signal = ecg[0:1000] >>> info1 = nk.signal_findpeaks(signal, relative_height_min=0) >>> info2 = nk.signal_findpeaks(signal, relative_height_min=1) >>> fig3 = nk.events_plot([info1["Peaks"], info2["Peaks"]], signal) >>> fig3
See also
scipy.signal.find_peaks
,scipy.signal.peak_widths
,peak_prominences.signal.peak_widths
,eda_findpeaks
,ecg_findpeaks
,rsp_findpeaks
,signal_fixpeaks
- signal_fixpeaks(peaks, sampling_rate=1000, iterative=True, show=False, interval_min=None, interval_max=None, relative_interval_min=None, relative_interval_max=None, robust=False, method='Kubios')[source]¶
Correct erroneous peak placements.
Identify and correct erroneous peak placements based on outliers in peak-to-peak differences (period).
- Parameters
peaks (list or array or DataFrame or Series or dict) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed to be obtained with ecg_findpeaks() or ppg_findpeaks() and to be of the same length as the input signal.
sampling_rate (int) – The sampling frequency of the signal that contains the peaks (in Hz, i.e., samples/second).
iterative (bool) – Whether or not to apply the artifact correction repeatedly (results in superior artifact correction).
show (bool) – Whether or not to visualize artifacts and artifact thresholds.
interval_min (float) – The minimum interval between the peaks.
interval_max (float) – The maximum interval between the peaks.
relative_interval_min (float) – The minimum interval between the peaks as relative to the sample (expressed in standard deviation from the mean).
relative_interval_max (float) – The maximum interval between the peaks as relative to the sample (expressed in standard deviation from the mean).
robust (bool) – Use a robust method of standardization (see standardize()) for the relative thresholds.
method (str) – Either “Kubios” or “Neurokit”. “Kubios” uses the artifact detection and correction described in Lipponen, J. A., & Tarvainen, M. P. (2019). Note that “Kubios” is only meant for peaks in ECG or PPG. “neurokit” can be used with peaks in ECG, PPG, or respiratory data.
- Returns
peaks_clean (array) – The corrected peak locations.
artifacts (dict) – Only if method=”Kubios”. A dictionary containing the indices of artifacts, accessible with the keys “ectopic”, “missed”, “extra”, and “longshort”.
See also
signal_findpeaks
,ecg_findpeaks
,ecg_peaks
,ppg_findpeaks
,ppg_peaks
Examples
>>> import neurokit2 as nk >>> import numpy as np >>> import matplotlib.pyplot as plt >>> >>> # Kubios >>> ecg = nk.ecg_simulate(duration=240, noise=0.25, heart_rate=70, random_state=42) >>> rpeaks_uncorrected = nk.ecg_findpeaks(ecg) >>> artifacts, rpeaks_corrected = nk.signal_fixpeaks(rpeaks_uncorrected, iterative=True, ... show=True, method="Kubios") >>> rate_corrected = nk.signal_rate(rpeaks_corrected, desired_length=len(ecg)) >>> rate_uncorrected = nk.signal_rate(rpeaks_uncorrected, desired_length=len(ecg)) >>> >>> fig, ax = plt.subplots() >>> ax.plot(rate_uncorrected, label="heart rate without artifact correction") >>> ax.plot(rate_corrected, label="heart rate with artifact correction") >>> ax.legend(loc="upper right") >>> >>> # NeuroKit >>> signal = nk.signal_simulate(duration=4, sampling_rate=1000, frequency=1) >>> peaks_true = nk.signal_findpeaks(signal)["Peaks"] >>> peaks = np.delete(peaks_true, [1]) # create gaps >>> >>> signal = nk.signal_simulate(duration=20, sampling_rate=1000, frequency=1) >>> peaks_true = nk.signal_findpeaks(signal)["Peaks"] >>> peaks = np.delete(peaks_true, [5, 15]) # create gaps >>> peaks = np.sort(np.append(peaks, [1350, 11350, 18350])) # add artifacts >>> >>> peaks_corrected = nk.signal_fixpeaks(peaks=peaks, interval_min=0.5, interval_max=1.5, method="neurokit") >>> # Plot and shift original peaks to the rightto see the difference. >>> fig = nk.events_plot([peaks + 50, peaks_corrected], signal) >>> fig
References
Lipponen, J. A., & Tarvainen, M. P. (2019). A robust algorithm for heart rate variability time
series artefact correction using novel beat classification. Journal of medical engineering & technology, 43(3), 173-181. 10.1080/03091902.2019.1640306
- signal_flatline(signal, threshold=0.01)[source]¶
Return the flatline percentage of the signal.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
threshold (float, optional) – Flatline threshold relative to the biggest change in the signal. This is the percentage of the maximum value of absolute consecutive differences.
- Returns
float – Percentage of signal where the absolute value of the derivative is lower then the threshold.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=5) >>> nk.signal_flatline(signal) 0.008
- signal_formatpeaks(info, desired_length, peak_indices=None, other_indices=None)[source]¶
Transforms an peak-info dict to a signal of given length.
- signal_interpolate(x_values, y_values, x_new=None, method='quadratic')[source]¶
Interpolate a signal.
Interpolate a signal using different methods.
- Parameters
x_values (Union[list, np.array, pd.Series]) – The samples corresponding to the values to be interpolated.
y_values (Union[list, np.array, pd.Series]) – The values to be interpolated.
x_new (Union[list, np.array, pd.Series] or int) – The samples at which to interpolate the y_values. Samples before the first value in x_values or after the last value in x_values will be extrapolated. If an integer is passed, nex_x will be considered as the desired length of the interpolated signal between the first and the last values of x_values. No extrapolation will be done for values before or after the first and the last valus of x_values.
method (str) – Method of interpolation. Can be ‘linear’, ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘previous’, ‘next’ or ‘monotone_cubic’. The methods ‘zero’, ‘slinear’, ‘quadratic’ and ‘cubic’ refer to a spline interpolation of zeroth, first, second or third order; whereas ‘previous’ and ‘next’ simply return the previous or next value of the point. An integer specifying the order of the spline interpolator to use. See here for details on the ‘monotone_cubic’ method.
- Returns
array – Vector of interpolated samples.
Examples
>>> import numpy as np >>> import neurokit2 as nk >>> import matplotlib.pyplot as plt >>> >>> # Generate points >>> signal = nk.signal_simulate(duration=1, sampling_rate=10) >>> >>> # List all interpolation methods and interpolation parameters >>> interpolation_methods = ["zero", "linear", "quadratic", "cubic", 5, "nearest", "monotone_cubic"] >>> x_values = np.linspace(0, 1, num=10) >>> x_new = np.linspace(0, 1, num=1000) >>> >>> # Visualize all interpolations >>> fig, ax = plt.subplots() >>> ax.scatter(x_values, signal, label="original datapoints", zorder=3) >>> for im in interpolation_methods: ... signal_interpolated = nk.signal_interpolate(x_values, signal, x_new=x_new, method=im) ... ax.plot(x_new, signal_interpolated, label=im) >>> ax.legend(loc="upper left")
- signal_merge(signal1, signal2, time1=[0, 10], time2=[0, 10])[source]¶
Arbitrary addition of two signals with different time ranges.
- Parameters
signal1 (Union[list, np.array, pd.Series]) – The first signal (i.e., a time series)s in the form of a vector of values.
signal2 (Union[list, np.array, pd.Series]) – The second signal (i.e., a time series)s in the form of a vector of values.
time1 (list) – Lists containing two numeric values corresponding to the beginning and end of
signal1
.time2 (list) – Same as above, but for
signal2
.
- Returns
array – Vector containing the sum of the two signals.
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> >>> signal1 = np.cos(np.linspace(start=0, stop=10, num=100)) >>> signal2 = np.cos(np.linspace(start=0, stop=20, num=100)) >>> >>> signal = nk.signal_merge(signal1, signal2, time1=[0, 10], time2=[-5, 5]) >>> nk.signal_plot(signal)
- signal_period(peaks, sampling_rate=1000, desired_length=None, interpolation_method='monotone_cubic')[source]¶
Calculate signal period from a series of peaks.
- Parameters
peaks (Union[list, np.array, pd.DataFrame, pd.Series, dict]) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed it is of the same length as the input signal in which occurrences of R-peaks are marked as “1”, with such containers obtained with e.g., ecg_findpeaks() or rsp_findpeaks().
sampling_rate (int) – The sampling frequency of the signal that contains peaks (in Hz, i.e., samples/second). Defaults to 1000.
desired_length (int) – If left at the default None, the returned period will have the same number of elements as peaks. If set to a value larger than the sample at which the last peak occurs in the signal (i.e., peaks[-1]), the returned period will be interpolated between peaks over desired_length samples. To interpolate the period over the entire duration of the signal, set desired_length to the number of samples in the signal. Cannot be smaller than or equal to the sample at which the last peak occurs in the signal. Defaults to None.
interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.
- Returns
array – A vector containing the period.
See also
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1) >>> info = nk.signal_findpeaks(signal) >>> >>> period = nk.signal_period(peaks=info["Peaks"], desired_length=len(signal)) >>> nk.signal_plot(period)
- signal_phase(signal, method='radians')[source]¶
Compute the phase of the signal.
The real phase has the property to rotate uniformly, leading to a uniform distribution density. The prophase typically doesn’t fulfill this property. The following functions applies a nonlinear transformation to the phase signal that makes its distribution exactly uniform. If a binary vector is provided (containing 2 unique values), the function will compute the phase of completion of each phase as denoted by each value.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
method (str) – The values in which the phase is expressed. Can be ‘radians’ (default), ‘degrees’ (for values between 0 and 360) or ‘percents’ (for values between 0 and 1).
See also
- Returns
array – A vector containing the phase of the signal, between 0 and 2*pi.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=10) >>> phase = nk.signal_phase(signal) >>> nk.signal_plot([signal, phase]) >>> >>> rsp = nk.rsp_simulate(duration=30) >>> phase = nk.signal_phase(rsp, method="degrees") >>> nk.signal_plot([rsp, phase]) >>> >>> # Percentage of completion of two phases >>> signal = nk.signal_binarize(nk.signal_simulate(duration=10)) >>> phase = nk.signal_phase(signal, method="percents") >>> nk.signal_plot([signal, phase])
- signal_plot(signal, sampling_rate=None, subplots=False, standardize=False, labels=None, **kwargs)[source]¶
Plot signal with events as vertical lines.
- Parameters
signal (array or DataFrame) – Signal array (can be a dataframe with many signals).
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Needs to be supplied if the data should be plotted over time in seconds. Otherwise the data is plotted over samples. Defaults to None.
subplots (bool) – If True, each signal is plotted in a subplot.
standardize (bool) – If True, all signals will have the same scale (useful for visualisation).
labels (str or list) – Defaults to None.
**kwargs (optional) – Arguments passed to matplotlib plotting.
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=10, sampling_rate=1000) >>> nk.signal_plot(signal, sampling_rate=1000, color="red") >>> >>> data = pd.DataFrame({"Signal2": np.cos(np.linspace(start=0, stop=20, num=1000)), ... "Signal3": np.sin(np.linspace(start=0, stop=20, num=1000)), ... "Signal4": nk.signal_binarize(np.cos(np.linspace(start=0, stop=40, num=1000)))}) >>> nk.signal_plot(data, labels=['signal_1', 'signal_2', 'signal_3'], subplots=True) >>> nk.signal_plot([signal, data], standardize=True)
- signal_power(signal, frequency_band, sampling_rate=1000, continuous=False, show=False, normalize=True, **kwargs)[source]¶
Compute the power of a signal in a given frequency band.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
frequency_band (tuple or list) – Tuple or list of tuples indicating the range of frequencies to compute the power in.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
continuous (bool) – Compute instant frequency, or continuous power.
show (bool) – If True, will return a Poincaré plot. Defaults to False.
normalize (bool) – Normalization of power by maximum PSD value. Default to True. Normalization allows comparison between different PSD methods.
**kwargs – Keyword arguments to be passed to signal_psd().
See also
- Returns
pd.DataFrame – A DataFrame containing the Power Spectrum values and a plot if show is True.
Examples
>>> import neurokit2 as nk >>> import numpy as np >>> >>> # Instant power >>> signal = nk.signal_simulate(duration=60, frequency=10) + 3*nk.signal_simulate(duration=60, frequency=20) >>> power_plot = nk.signal_power(signal, frequency_band=[(8, 12), (18, 22)], method="welch", show=True) >>> power_plot >>> >>> # Continuous (simulated signal) >>> signal = np.concatenate((nk.ecg_simulate(duration=30, heart_rate=75), nk.ecg_simulate(duration=30, heart_rate=85))) >>> power = nk.signal_power(signal, frequency_band=[(72/60, 78/60), (82/60, 88/60)], continuous=True) >>> processed, _ = nk.ecg_process(signal) >>> power["ECG_Rate"] = processed["ECG_Rate"] >>> nk.signal_plot(power, standardize=True) >>> >>> # Continuous (real signal) >>> signal = nk.data("bio_eventrelated_100hz")["ECG"] >>> power = nk.signal_power(signal, sampling_rate=100, frequency_band=[(0.12, 0.15), (0.15, 0.4)], continuous=True) >>> processed, _ = nk.ecg_process(signal, sampling_rate=100) >>> power["ECG_Rate"] = processed["ECG_Rate"] >>> nk.signal_plot(power, standardize=True)
- signal_psd(signal, sampling_rate=1000, method='welch', show=False, normalize=True, min_frequency=0, max_frequency=inf, window=None, window_type='hann', order=16, order_criteria='KIC', order_corrected=True, **kwargs)[source]¶
Compute the Power Spectral Density (PSD).
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
method (str) – Either ‘multitapers’ (default; requires the ‘mne’ package), or ‘welch’ (requires the ‘scipy’ package).
show (bool) – If True, will return a plot. If False, will return the density values that can be plotted externally.
normalize (bool) – Normalization of power by maximum PSD value. Default to True. Normalization allows comparison between different PSD methods.
min_frequency (float) – The minimum frequency.
max_frequency (float) – The maximum frequency.
window (int) – Length of each window in seconds (for Welch method). If None (default), window will be automatically calculated to capture at least 2 cycles of min_frequency. If the length of recording does not allow the formal, window will be default to half of the length of recording.
window_type (str) – Desired window to use. Defaults to ‘hann’. See scipy.signal.get_window() for list of windows.
order (int) – The order of autoregression (only used for autoregressive (AR) methods such as ‘burg’).
order_criteria (str) – The criteria to automatically select order in parametric PSD (only used for autoregressive (AR) methods such as ‘burg’).
order_corrected (bool) – Should the order criteria (AIC or KIC) be corrected? If unsure which method to use to choose the order, rely on the default (i.e., the corrected KIC).
**kwargs (optional) – Keyword arguments to be passed to scipy.signal.welch().
See also
signal_filter
,mne.time_frequency.psd_array_multitaper
,scipy.signal.welch
- Returns
data (pd.DataFrame) – A DataFrame containing the Power Spectrum values and a plot if show is True.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(frequency=5) + 0.5*nk.signal_simulate(frequency=20) >>> >>> psd_multitapers = nk.signal_psd(signal, method="multitapers", show=True) >>> psd_welch = nk.signal_psd(signal, method="welch", min_frequency=1, show=True) >>> psd_burg = nk.signal_psd(signal, method="burg", min_frequency=1, show=True) >>> psd_lomb = nk.signal_psd(signal, method="lomb", min_frequency=1, show=True)
- signal_rate(peaks, sampling_rate=1000, desired_length=None, interpolation_method='monotone_cubic')[source]¶
Calculate signal rate from a series of peaks.
This function can also be called either via
ecg_rate()
,`ppg_rate()
orrsp_rate()
(aliases provided for consistency).- Parameters
peaks (Union[list, np.array, pd.DataFrame, pd.Series, dict]) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed it is of the same length as the input signal in which occurrences of R-peaks are marked as “1”, with such containers obtained with e.g., ecg_findpeaks() or rsp_findpeaks().
sampling_rate (int) – The sampling frequency of the signal that contains peaks (in Hz, i.e., samples/second). Defaults to 1000.
desired_length (int) – If left at the default None, the returned rated will have the same number of elements as peaks. If set to a value larger than the sample at which the last peak occurs in the signal (i.e., peaks[-1]), the returned rate will be interpolated between peaks over desired_length samples. To interpolate the rate over the entire duration of the signal, set desired_length to the number of samples in the signal. Cannot be smaller than or equal to the sample at which the last peak occurs in the signal. Defaults to None.
interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.
- Returns
array – A vector containing the rate.
See also
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1) >>> info = nk.signal_findpeaks(signal) >>> >>> rate = nk.signal_rate(peaks=info["Peaks"], desired_length=len(signal)) >>> fig = nk.signal_plot(rate) >>> fig
- signal_recompose(components, method='wcorr', threshold=0.5, keep_sd=None, **kwargs)[source]¶
Combine signal sources after decomposition.
Combine and reconstruct meaningful signal sources after signal decomposition.
- Parameters
components (array) – Array of components obtained via
signal_decompose()
.method (str) – The decomposition method. Can be one of ‘wcorr’.
threshold (float) – The threshold used to group components together.
keep_sd (float) – If a float is specified, will only keep the reconstructed components that are superior or equal to that percentage of the max standard deviaiton (SD) of the components. For instance,
keep_sd=0.01
will remove all components with SD is lower that 1% of the max SD. This can be used to filter out noise.**kwargs – Other arguments to override for instance
metric='chebyshev'
.
- Returns
Array – Components of the recomposed components.
Examples
>>> import neurokit2 as nk >>> >>> # Create complex signal >>> signal = nk.signal_simulate(duration=10, frequency=1, noise=0.01) # High freq >>> signal += 3 * nk.signal_simulate(duration=10, frequency=3, noise=0.01) # Higher freq >>> signal += 3 * np.linspace(0, 2, len(signal)) # Add baseline and trend >>> signal += 2 * nk.signal_simulate(duration=10, frequency=0.1, noise=0) >>> >>> # Decompose signal >>> components = nk.signal_decompose(signal, method='emd') >>> >>> # Recompose >>> recomposed = nk.signal_recompose(components, method='wcorr', threshold=0.90) >>> fig = nk.signal_plot(components) # Visualize components >>> fig
- signal_resample(signal, desired_length=None, sampling_rate=None, desired_sampling_rate=None, method='interpolation')[source]¶
Resample a continuous signal to a different length or sampling rate.
Up- or down-sample a signal. The user can specify either a desired length for the vector, or input the original sampling rate and the desired sampling rate. See https://github.com/neuropsychology/NeuroKit/scripts/resampling.ipynb for a comparison of the methods.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
desired_length (int) – The desired length of the signal.
sampling_rate (int) – The original sampling frequency (in Hz, i.e., samples/second).
desired_sampling_rate (int) – The desired (output) sampling frequency (in Hz, i.e., samples/second).
method (str) – Can be ‘interpolation’ (see scipy.ndimage.zoom()), ‘numpy’ for numpy’s interpolation (see numpy.interp()),’pandas’ for Pandas’ time series resampling, ‘poly’ (see scipy.signal.resample_poly()) or ‘FFT’ (see scipy.signal.resample()) for the Fourier method. FFT is the most accurate (if the signal is periodic), but becomes exponentially slower as the signal length increases. In contrast, ‘interpolation’ is the fastest, followed by ‘numpy’, ‘poly’ and ‘pandas’.
- Returns
array – Vector containing resampled signal values.
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> >>> signal = np.cos(np.linspace(start=0, stop=20, num=100)) >>> >>> # Downsample >>> downsampled_interpolation = nk.signal_resample(signal, method="interpolation", ... sampling_rate=1000, desired_sampling_rate=500) >>> downsampled_fft = nk.signal_resample(signal, method="FFT", ... sampling_rate=1000, desired_sampling_rate=500) >>> downsampled_poly = nk.signal_resample(signal, method="poly", ... sampling_rate=1000, desired_sampling_rate=500) >>> downsampled_numpy = nk.signal_resample(signal, method="numpy", ... sampling_rate=1000, desired_sampling_rate=500) >>> downsampled_pandas = nk.signal_resample(signal, method="pandas", ... sampling_rate=1000, desired_sampling_rate=500) >>> >>> # Upsample >>> upsampled_interpolation = nk.signal_resample(downsampled_interpolation, ... method="interpolation", ... sampling_rate=500, desired_sampling_rate=1000) >>> upsampled_fft = nk.signal_resample(downsampled_fft, method="FFT", ... sampling_rate=500, desired_sampling_rate=1000) >>> upsampled_poly = nk.signal_resample(downsampled_poly, method="poly", ... sampling_rate=500, desired_sampling_rate=1000) >>> upsampled_numpy = nk.signal_resample(downsampled_numpy, method="numpy", ... sampling_rate=500, desired_sampling_rate=1000) >>> upsampled_pandas = nk.signal_resample(downsampled_pandas, method="pandas", ... sampling_rate=500, desired_sampling_rate=1000) >>> >>> # Compare with original >>> fig = pd.DataFrame({"Original": signal, ... "Interpolation": upsampled_interpolation, ... "FFT": upsampled_fft, ... "Poly": upsampled_poly, ... "Numpy": upsampled_numpy, ... "Pandas": upsampled_pandas}).plot(style='.-') >>> fig >>> >>> # Timing benchmarks >>> %timeit nk.signal_resample(signal, method="interpolation", ... sampling_rate=1000, desired_sampling_rate=500) >>> %timeit nk.signal_resample(signal, method="FFT", ... sampling_rate=1000, desired_sampling_rate=500) >>> %timeit nk.signal_resample(signal, method="poly", ... sampling_rate=1000, desired_sampling_rate=500) >>> %timeit nk.signal_resample(signal, method="numpy", ... sampling_rate=1000, desired_sampling_rate=500) >>> %timeit nk.signal_resample(signal, method="pandas", ... sampling_rate=1000, desired_sampling_rate=500)
See also
scipy.signal.resample_poly
,scipy.signal.resample
,scipy.ndimage.zoom
- signal_sanitize(signal)[source]¶
Reset indexing for Pandas Series
- Parameters
signal (Series) – The indexed input signal (pandas set_index())
- Returns
Series – The default indexed signal
Examples
>>> import pandas as pd >>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1) >>> df = pd.DataFrame({'signal': signal, 'id': [x*2 for x in range(len(signal))]}) >>> >>> df = df.set_index('id') >>> default_index_signal = nk.signal_sanitize(df.signal) >>>
- signal_simulate(duration=10, sampling_rate=1000, frequency=1, amplitude=0.5, noise=0, silent=False)[source]¶
Simulate a continuous signal.
- Parameters
duration (float) – Desired length of duration (s).
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).
frequency (float or list) – Oscillatory frequency of the signal (in Hz, i.e., oscillations per second).
amplitude (float or list) – Amplitude of the oscillations.
noise (float) – Noise level (amplitude of the laplace noise).
silent (bool) – If False (default), might print warnings if impossible frequencies are queried.
- Returns
array – The simulated signal.
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> >>> fig = pd.DataFrame({"1Hz": nk.signal_simulate(duration=5, frequency=1), ... "2Hz": nk.signal_simulate(duration=5, frequency=2), ... "Multi": nk.signal_simulate(duration=5, frequency=[0.5, 3], amplitude=[0.5, 0.2])}).plot() >>> fig
- signal_smooth(signal, method='convolution', kernel='boxzen', size=10, alpha=0.1)[source]¶
Signal smoothing.
Signal smoothing can be achieved using either the convolution of a filter kernel with the input signal to compute the smoothed signal (Smith, 1997) or a LOESS regression.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
method (str) – Can be one of ‘convolution’ (default) or ‘loess’.
kernel (Union[str, np.array]) – Only used if method is ‘convolution’. Type of kernel to use; if array, use directly as the kernel. Can be one of ‘median’, ‘boxzen’, ‘boxcar’, ‘triang’, ‘blackman’, ‘hamming’, ‘hann’, ‘bartlett’, ‘flattop’, ‘parzen’, ‘bohman’, ‘blackmanharris’, ‘nuttall’, ‘barthann’, ‘kaiser’ (needs beta), ‘gaussian’ (needs std), ‘general_gaussian’ (needs power, width), ‘slepian’ (needs width) or ‘chebwin’ (needs attenuation).
size (int) – Only used if method is ‘convolution’. Size of the kernel; ignored if kernel is an array.
alpha (float) – Only used if method is ‘loess’. The parameter which controls the degree of smoothing.
- Returns
array – Smoothed signal.
See also
fit_loess
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> >>> signal = np.cos(np.linspace(start=0, stop=10, num=1000)) >>> distorted = nk.signal_distort(signal, noise_amplitude=[0.3, 0.2, 0.1, 0.05], noise_frequency=[5, 10, 50, 100]) >>> >>> size = len(signal)/100 >>> signals = pd.DataFrame({"Raw": distorted, ... "Median": nk.signal_smooth(distorted, kernel='median', size=size-1), ... "BoxZen": nk.signal_smooth(distorted, kernel='boxzen', size=size), ... "Triang": nk.signal_smooth(distorted, kernel='triang', size=size), ... "Blackman": nk.signal_smooth(distorted, kernel='blackman', size=size), ... "Loess_01": nk.signal_smooth(distorted, method='loess', alpha=0.1), ... "Loess_02": nk.signal_smooth(distorted, method='loess', alpha=0.2), ... "Loess_05": nk.signal_smooth(distorted, method='loess', alpha=0.5)}) >>> fig = signals.plot() >>> fig_magnify = signals[50:150].plot() # Magnify >>> fig_magnify
References
Smith, S. W. (1997). The scientist and engineer’s guide to digital signal processing.
- signal_synchrony(signal1, signal2, method='hilbert', window_size=50)[source]¶
Compute the synchrony (coupling) between two signals.
Compute a continuous index of coupling between two signals either using the ‘Hilbert’ method to get the instantaneous phase synchrony, or using rolling window correlation.
The instantaneous phase synchrony measures the phase similarities between signals at each timepoint. The phase refers to the angle of the signal, calculated through the hilbert transform, when it is resonating between -pi to pi degrees. When two signals line up in phase their angular difference becomes zero.
For less clean signals, windowed correlations are widely used because of their simplicity, and can be a good a robust approximation of synchrony between two signals. The limitation is the need to select a window.
- Parameters
signal1 (Union[list, np.array, pd.Series]) – Time series in the form of a vector of values.
signal2 (Union[list, np.array, pd.Series]) – Time series in the form of a vector of values.
method (str) – The method to use. Can be one of ‘hilbert’ or ‘correlation’.
window_size (int) – Only used if method=’correlation’. The number of samples to use for rolling correlation.
See also
- Returns
array – A vector containing the phase of the signal, between 0 and 2*pi.
Examples
>>> import neurokit2 as nk >>> >>> signal1 = nk.signal_simulate(duration=10, frequency=1) >>> signal2 = nk.signal_simulate(duration=10, frequency=1.5) >>> >>> coupling_h = nk.signal_synchrony(signal1, signal2, method="hilbert") >>> coupling_c = nk.signal_synchrony(signal1, signal2, method="correlation", window_size=1000/2) >>> >>> fig = nk.signal_plot([signal1, signal2, coupling_h, coupling_c]) >>> fig
References
- signal_timefrequency(signal, sampling_rate=1000, min_frequency=0.04, max_frequency=None, method='stft', window=None, window_type='hann', mode='psd', nfreqbin=None, overlap=None, analytical_signal=True, show=True)[source]¶
Quantify changes of a nonstationary signal’s frequency over time. The objective of time-frequency analysis is to offer a more informative description of the signal which reveals the temporal variation of its frequency contents.
There are many different Time-Frequency Representations (TFRs) available:
- Linear TFRs: efficient but create tradeoff between time and frequency resolution
Short Time Fourier Transform (STFT): the time-domain signal is windowed into short segments
and FT is applied to each segment, mapping the signal into the TF plane. This method assumes that the signal is quasi-stationary (stationary over the duration of the window). The width of the window is the trade-off between good time (requires short duration window) versus good frequency resolution (requires long duration windows) - Wavelet Transform (WT): similar to STFT but instead of a fixed duration window functrion, a varying window length by scaling the axis of the window is used. At low frequency, WT proves high spectral resolution but poor temporal resolution. On the other hand, for high frequencies, the WT provides high temporal resolution but poor spectral resolution.
Quadratic TFRs: better resolution but computationally expensive and suffers from having
- cross terms between multiple signal components
Wigner Ville Distribution (WVD): while providing very good resolution in time and frequency
of the underlying signal structure, because of its bilinear nature, existence of negative values, the WVD has misleading TF results in the case of multi-component signals such as EEG due to the presence of cross terms and inference terms. Cross WVD terms can be reduced by using moothing kernal functions as well as analyzing the analytic signal (instead of the original signal) - Smoothed Pseudo Wigner Ville Distribution (SPWVD): to address the problem of cross-terms suppression, SPWVD allows two independent analysis windows, one in time and the other in frequency domains.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
method (str) – Time-Frequency decomposition method.
min_frequency (float) – The minimum frequency.
max_frequency (float) – The maximum frequency.
window (int) – Length of each segment in seconds. If None (default), window will be automatically calculated. For stft method
window_type (str) – Type of window to create, defaults to ‘hann’. See
scipy.signal.get_window()
to see full options of windows. For stft method.mode (str) – Type of return values for stft method. Can be ‘psd’, ‘complex’ (default, equivalent to output of stft with no padding or boundary extension), ‘magnitude’, ‘angle’, ‘phase’. Default to ‘psd’.
nfreqbin (int, float) – Number of frequency bins. If None (default), nfreqbin will be set to 0.5*sampling_rate.
overlap (int) – Number of points to overlap between segments. If None, noverlap = nperseg // 8. Defaults to None.
analytical_signal (bool) – If True, analytical signal instead of actual signal is used in Wigner Ville Distrubution methods.
show (bool) – If True, will return two PSD plots.
- Returns
frequency (np.array) – Frequency.
time (np.array) – Time array.
stft (np.array) – Short Term Fourier Transform. Time increases across its columns and frequency increases down the rows.
Examples
>>> import neurokit2 as nk >>> import numpy as np >>> signal = nk.signal_simulate(100, sampling_rate=100, frequency=10.0) >>> signal += 2 * nk.signal_simulate(100, sampling_rate=100, frequency=3.0) >>> sampling_rate=100 >>> f, t, stft = nk.signal_timefrequency(signal, sampling_rate, max_frequency=20, method="stft", show=True) >>> f, t, cwtm = nk.signal_timefrequency(signal, sampling_rate, max_frequency=20, method="cwt", show=True) >>> f, t, wvd = nk.signal_timefrequency(signal, sampling_rate, max_frequency=20, method="wvd", show=True) >>> f, t, pwvd = nk.signal_timefrequency(signal, sampling_rate, max_frequency=20, method="pwvd", show=True)
- signal_zerocrossings(signal, direction='both')[source]¶
Locate the indices where the signal crosses zero.
Note that when the signal crosses zero between two points, the first index is returned.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
direction (str) – Direction in which the signal crosses zero, can be “positive”, “negative” or “both” (default).
- Returns
array – Vector containing the indices of zero crossings.
Examples
>>> import numpy as np >>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=5) >>> zeros = nk.signal_zerocrossings(signal) >>> fig = nk.events_plot(zeros, signal) >>> fig >>> >>> # Only upward or downward zerocrossings >>> up = nk.signal_zerocrossings(signal, direction='up') >>> down = nk.signal_zerocrossings(signal, direction='down') >>> fig = nk.events_plot([up, down], signal) >>> fig
Events¶
Submodule for NeuroKit.
- events_create(event_onsets, event_durations=None, event_labels=None, event_conditions=None)[source]¶
Create events dictionnary from list of onsets.
- Parameters
event_onsets (array or list) – A list of events onset.
event_durations (array or list) – A list of durations. If none is passed, will take the duration between each onset (i.e., will assume that events are consecutive).
event_labels (list) – A list containing unique event identifiers. If None, will use the event index number.
event_conditions (list) – An optional list containing, for each event, for example the trial category, group or experimental conditions.
- Returns
dict – Dict containing 3 or 4 arrays, ‘onset’ for event onsets, ‘duration’ for event durations, ‘label’ for the event identifiers and the optional ‘conditions’ passed to event_conditions.
See also
Example
>>> import neurokit2 as nk >>> >>> events = nk.events_create(event_onsets = [500, 1500, 2500, 5000]) >>> events {'onset': [...], 'duration': array(...), 'label': array(...)}
- events_find(event_channel, threshold='auto', threshold_keep='above', start_at=0, end_at=None, duration_min=1, duration_max=None, inter_min=0, discard_first=0, discard_last=0, event_labels=None, event_conditions=None)[source]¶
Find and select events in a continuous signal (e.g., from a photosensor).
- Parameters
event_channel (array or list) – The channel containing the events.
threshold (str or float) – The threshold value by which to select the events. If “auto”, takes the value between the max and the min.
threshold_keep (str) – “above” or “below”, define the events as above or under the threshold. For photosensors, a white screen corresponds usually to higher values. Therefore, if your events are signaled by a black colour, events values are the lower ones, and you should set the cut to “below”.
start_at (int) – Keep events which onset is after a particular time point.
end_at (int) – Keep events which onset is before a particular time point.
duration_min (int) – The minimum duration of an event to be considered as such (in time points).
duration_max (int) – The maximum duration of an event to be considered as such (in time points).
inter_min (int) – The minimum duration after an event for the subsequent event to be considered as such (in time points). Useful when spurious consecutive events are created due to very high sampling rate.
discard_first (int) – Discard first or last n events. Useful if the experiment starts with some spurious events. If discard_first=0, no first event is removed.
discard_last (int) – Discard first or last n events. Useful if the experiment ends with some spurious events. If discard_last=0, no last event is removed.
event_labels (list) – A list containing unique event identifiers. If None, will use the event index number.
event_conditions (list) – An optional list containing, for each event, for example the trial category, group or experimental conditions.
- Returns
dict – Dict containing 3 or 4 arrays, ‘onset’ for event onsets, ‘duration’ for event durations, ‘label’ for the event identifiers and the optional ‘conditions’ passed to event_conditions.
See also
Example
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=4) >>> events = nk.events_find(signal) >>> events {'onset': array(...), 'duration': array(...), 'label': array(...)} >>> >>> nk.events_plot(events, signal) <Figure ...>
- events_plot(events, signal=None, show=True, color='red', linestyle='--')[source]¶
Plot events in signal.
- Parameters
events (list or ndarray or dict) – Events onset location. Can also be a list of lists, in which case it will mark them with different colors. If a dict is passed (e.g., from ‘events_find()’), will select only the ‘onset’ list.
signal (array or DataFrame) – Signal array (can be a dataframe with many signals).
show (bool) – If True, will return a plot. If False, will return a DataFrame that can be plotted externally.
color (str) – Argument passed to matplotlib plotting.
linestyle (str) – Argument passed to matplotlib plotting.
- Returns
fig – Figure representing a plot of the signal and the event markers.
See also
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> >>> fig = nk.events_plot([1, 3, 5]) >>> fig >>> >>> # With signal >>> signal = nk.signal_simulate(duration=4) >>> events = nk.events_find(signal) >>> fig1 = nk.events_plot(events, signal) >>> fig1 >>> >>> # Different events >>> events1 = events["onset"] >>> events2 = np.linspace(0, len(signal), 8) >>> fig2 = nk.events_plot([events1, events2], signal) >>> fig2 >>> >>> # Conditions >>> events = nk.events_find(signal, event_conditions=["A", "B", "A", "B"]) >>> fig3 = nk.events_plot(events, signal) >>> fig3 >>> >>> # Different colors for all events >>> signal = nk.signal_simulate(duration=20) >>> events = nk.events_find(signal) >>> events = [[i] for i in events['onset']] >>> fig4 = nk.events_plot(events, signal) >>> fig4
- events_to_mne(events, event_conditions=None)[source]¶
Create MNE compatible events for integration with M/EEG.
- Parameters
events (list or ndarray or dict) – Events onset location. Can also be a dict obtained through ‘events_find()’.
event_conditions (list) – An optional list containing, for each event, for example the trial category, group or experimental conditions. Defaults to None.
- Returns
tuple – MNE-formatted events and the event id, that can be added via ‘raw.add_events(events), and a dictionary with event’s names.
See also
Examples
>>> import numpy as np >>> import pandas as pd >>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=4) >>> events = nk.events_find(signal) >>> events, event_id = nk.events_to_mne(events) >>> events array([[ 1, 0, 0], [1001, 0, 0], [2001, 0, 0], [3001, 0, 0]]) >>> event_id {'event': 0} >>> >>> # Conditions >>> events = nk.events_find(signal, event_conditions=["A", "B", "A", "B"]) >>> events, event_id = nk.events_to_mne(events) >>> event_id {'B': 0, 'A': 1}
Data¶
Submodule for NeuroKit.
- data(dataset='bio_eventrelated_100hz')[source]¶
Download example datasets.
Download and load available example datasets. Note that an internet connexion is necessary.
- Parameters
dataset (str) – The name of the dataset. The list and description is available here.
- Returns
DataFrame – The data.
Examples
>>> import neurokit2 as nk >>> >>> data = nk.data("bio_eventrelated_100hz")
- read_acqknowledge(filename, sampling_rate='max', resample_method='interpolation', impute_missing=True)[source]¶
Read and format a BIOPAC’s AcqKnowledge file into a pandas’ dataframe.
The function outputs both the dataframe and the sampling rate (retrieved from the AcqKnowledge file).
- Parameters
filename (str) – Filename (with or without the extension) of a BIOPAC’s AcqKnowledge file (e.g., ‘data.acq’).
sampling_rate (int) – Sampling rate (in Hz, i.e., samples/second). Since an AcqKnowledge file can contain signals recorded at different rates, harmonization is necessary in order to convert it to a DataFrame. Thus, if sampling_rate is set to ‘max’ (default), will keep the maximum recorded sampling rate and upsample the channels with lower rate if necessary (using the signal_resample() function). If the sampling rate is set to a given value, will resample the signals to the desired value. Note that the value of the sampling rate is outputted along with the data.
resample_method (str) – Method of resampling (see signal_resample()).
impute_missing (bool) – Sometimes, due to connections issues, the signal has some holes (short periods without signal). If ‘impute_missing’ is True, will automatically fill the signal interruptions using padding.
- Returns
df (DataFrame) – The AcqKnowledge file as a pandas dataframe.
sampling rate (int) – The sampling rate at which the data is sampled.
See also
signal_resample
Example
>>> import neurokit2 as nk >>> >>> data, sampling_rate = nk.read_acqknowledge('file.acq')
- read_bitalino(filename, sampling_rate='max', resample_method='interpolation')[source]¶
Read and format a OpenSignals file (e.g., from BITalino) into a pandas’ dataframe.
The function outputs both the dataframe and the sampling rate (retrieved from the OpenSignals file).
- Parameters
filename (str) – Filename (with or without the extension) of an OpenSignals file (e.g., ‘data.txt’).
sampling_rate (int) – Sampling rate (in Hz, i.e., samples/second). Defaults to the original sampling rate at which signals were sampled if set to “max”. If the sampling rate is set to a given value, will resample the signals to the desired value. Note that the value of the sampling rate is outputted along with the data.
resample_method (str) – Method of resampling (see signal_resample()).
- Returns
df (DataFrame) – The BITalino file as a pandas dataframe.
sampling rate (int) – The sampling rate at which the data is sampled.
See also
read_acqknowledge
,signal_resample
Examples
>>> import neurokit2 as nk >>> >>> #data, sampling_rate = nk.read_bitalino("data.txt")
Epochs¶
Submodule for NeuroKit.
- epochs_create(data, events=None, sampling_rate=1000, epochs_start=0, epochs_end='from_events', event_labels=None, event_conditions=None, baseline_correction=False)[source]¶
Epoching a dataframe.
- Parameters
data (DataFrame) – A DataFrame containing the different signal(s) as different columns. If a vector of values is passed, it will be transformed in a DataFrame with a single ‘Signal’ column.
events (list or ndarray or dict) – Events onset location. If a dict is passed (e.g., from
events_find()
), will select only the ‘onset’ list. If an integer is passed, will use this number to create an evenly spaced list of events. If None, will chunk the signal into successive blocks of the set duration.sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
epochs_start (int, list) – Epochs start relative to events_onsets (in seconds). The start can be negative to start epochs before a given event (to have a baseline for instance). An integer can be specified to have the same start for all epochs. A list of equal length to the events can be specified to have a different start for each epoch.
epochs_end (int, list) – Epochs end relative to events_onsets (in seconds). An integer can be specified to have the same end for all epochs. A list of equal length to the events can be specified to have a different end for each epoch. If “from_events”, events must be a dict (from ``events_find()`). Duration from events will be used as epochs_end.
event_labels (list) – A list containing unique event identifiers. If None, will use the event index number.
event_conditions (list) – An optional list containing, for each event, for example the trial category, group or experimental conditions.
baseline_correction (bool) – Defaults to False.
- Returns
dict – A dict containing DataFrames for all epochs.
See also
events_find
,events_plot
,epochs_to_df
,epochs_plot
Examples
>>> import neurokit2 as nk >>> >>> # Get data >>> data = nk.data("bio_eventrelated_100hz") >>> >>> # Find events >>> events = nk.events_find(data["Photosensor"], ... threshold_keep='below', ... event_conditions=["Negative", "Neutral", "Neutral", "Negative"]) >>> fig1 = nk.events_plot(events, data) >>> fig1 >>> >>> # Create epochs >>> epochs = nk.epochs_create(data, events, sampling_rate=100, epochs_end=3) >>> fig2 = nk.epochs_plot(epochs) >>> fig2 >>> >>> # Baseline correction >>> epochs = nk.epochs_create(data, events, sampling_rate=100, epochs_end=3, baseline_correction=True) >>> fig3 = nk.epochs_plot(epochs) >>> fig3 >>> >>> # Chunk into n blocks of 1 second >>> epochs = nk.epochs_create(data, sampling_rate=100, epochs_end=1)
- epochs_plot(epochs, legend=True, show=True)[source]¶
Plot epochs.
- Parameters
epochs (dict) – A dict containing one DataFrame per event/trial. Usually obtained via epochs_create().
legend (bool) – Display the legend (the key of each epoch).
show (bool) – If True, will return a plot. If False, will return a DataFrame that can be plotted externally.
- Returns
epochs (dict) – dict containing all epochs.
See also
events_find
,events_plot
,epochs_create
,epochs_to_df
Examples
>>> import neurokit2 as nk >>> >>> # Example with data >>> data = nk.data("bio_eventrelated_100hz") >>> events = nk.events_find(data["Photosensor"], ... threshold_keep='below', ... event_conditions=["Negative", "Neutral", "Neutral", "Negative"]) >>> epochs = nk.epochs_create(data, events, sampling_rate=200, epochs_end=1) >>> fig1 = nk.epochs_plot(epochs) >>> fig1 >>> >>> # Example with ECG Peaks >>> signal = nk.ecg_simulate(duration=10) >>> events = nk.ecg_findpeaks(signal) >>> epochs = nk.epochs_create(signal, events=events["ECG_R_Peaks"], epochs_start=-0.5, epochs_end=0.5) >>> fig2 = nk.epochs_plot(epochs) >>> fig2
- epochs_to_array(epochs)[source]¶
Convert epochs to an array.
TODO: make it work with uneven epochs (not the same length).
- Parameters
epochs (dict) – A dict containing one DataFrame per event/trial. Usually obtained via epochs_create().
- Returns
array – An array containing all signals.
See also
events_find
,events_plot
,epochs_create
,epochs_plot
Examples
>>> import neurokit2 as nk >>> import pandas as pd >>> >>> # Get data >>> signal = nk.signal_simulate(sampling_rate=100) >>> >>> # Create epochs >>> epochs = nk.epochs_create(signal, events=[400, 430, 460], sampling_rate=100, epochs_end=1) >>> X = nk.epochs_to_array(epochs) >>> nk.signal_plot(X.T)
- epochs_to_df(epochs)[source]¶
Convert epochs to a DataFrame.
- Parameters
epochs (dict) – A dict containing one DataFrame per event/trial. Usually obtained via epochs_create().
- Returns
DataFrame – A DataFrame containing all epochs identifiable by the ‘Label’ column, which time axis is stored in the ‘Time’ column.
See also
events_find
,events_plot
,epochs_create
,epochs_plot
Examples
>>> import neurokit2 as nk >>> import pandas as pd >>> >>> # Get data >>> data = pd.read_csv("https://raw.githubusercontent.com/neuropsychology/NeuroKit/dev/data/bio_eventrelated_100hz.csv") >>> >>> # Find events >>> events = nk.events_find(data["Photosensor"], ... threshold_keep='below', ... event_conditions=["Negative", "Neutral", "Neutral", "Negative"]) >>> fig = nk.events_plot(events, data) >>> fig >>> >>> # Create epochs >>> epochs = nk.epochs_create(data, events, sampling_rate=200, epochs_end=3) >>> data = nk.epochs_to_df(epochs)
Statistics¶
Submodule for NeuroKit.
- cluster(data, method='kmeans', n_clusters=2, random_state=None, optimize=False, **kwargs)[source]¶
Performs clustering of data according to different algorithms.
- Parameters
data (np.ndarray) – Matrix array of data (E.g., an array (channels, times) of M/EEG data).
method (str) – The algorithm for clustering. Can be one of ‘kmeans’ (default), modified k-means algorithm ‘kmod’, ‘kmedoids’ (k-centers or k-medoids clustering), ‘pca’ (Principal Component Analysis), ‘ica’ (Independent Component Analysis), ‘agglomerative’ (Atomize and Agglomerate Hierarchical Clustering), ‘hierarchical’, ‘spectral’, ‘mixture’, ‘mixturebayesian’. See
sklearn
for methods details.n_clusters (int) – The desired number of clusters.
random_state (Union[int, numpy.random.RandomState]) – The
RandomState
for the random number generator. Defaults toNone
, in which case a different random state is chosen each time this function is called.optimize (bool) – To use a new optimized method in https://www.biorxiv.org/content/10.1101/289850v1.full.pdf. For the Kmeans modified method. Default to False.
**kwargs – Other arguments to be passed into
sklearn
functions.
- Returns
clustering (DataFrame) – Information about the distance of samples from their respective clusters.
clusters (np.ndarray) – Coordinates of cluster centers, which has a shape of n_clusters x n_features.
info (dict) – Information about the number of clusters, the function and model used for clustering.
Examples
>>> import neurokit2 as nk >>> import matplotlib.pyplot as plt >>> >>> # Load the iris dataset >>> data = nk.data("iris") >>> >>> # Cluster using different methods >>> clustering_kmeans, clusters_kmeans, info = nk.cluster(data, method="kmeans", n_clusters=3) >>> clustering_spectral, clusters_spectral, info = nk.cluster(data, method="spectral", n_clusters=3) >>> clustering_hierarchical, clusters_hierarchical, info = nk.cluster(data, method="hierarchical", n_clusters=3) >>> clustering_agglomerative, clusters_agglomerative, info= nk.cluster(data, method="agglomerative", n_clusters=3) >>> clustering_mixture, clusters_mixture, info = nk.cluster(data, method="mixture", n_clusters=3) >>> clustering_bayes, clusters_bayes, info = nk.cluster(data, method="mixturebayesian", n_clusters=3) >>> clustering_pca, clusters_pca, info = nk.cluster(data, method="pca", n_clusters=3) >>> clustering_ica, clusters_ica, info = nk.cluster(data, method="ica", n_clusters=3) >>> clustering_kmod, clusters_kmod, info = nk.cluster(data, method="kmod", n_clusters=3) >>> clustering_kmedoids, clusters_kmedoids, info = nk.cluster(data, method="kmedoids", n_clusters=3) >>> clustering_aahc, clusters_aahc, info = nk.cluster(data, method='aahc_frederic', n_clusters=3) >>> >>> # Visualize classification and 'average cluster' >> fig, axes = plt.subplots(ncols=2, nrows=5) >> axes[0, 0].scatter(data.iloc[:,[2]], data.iloc[:,[3]], c=clustering_kmeans['Cluster']) >> axes[0, 0].scatter(clusters_kmeans[:, 2], clusters_kmeans[:, 3], c='red') >> axes[0, 0].set_title("k-means") >> axes[0, 1].scatter(data.iloc[:,[2]], data.iloc[:, [3]], c=clustering_spectral['Cluster']) >> axes[0, 1].scatter(clusters_spectral[:, 2], clusters_spectral[:, 3], c='red') >> axes[0, 1].set_title("Spectral") >> axes[1, 0].scatter(data.iloc[:,[2]], data.iloc[:,[3]], c=clustering_hierarchical['Cluster']) >> axes[1, 0].scatter(clusters_hierarchical[:, 2], clusters_hierarchical[:, 3], c='red') >> axes[1, 0].set_title("Hierarchical") >> axes[1, 1].scatter(data.iloc[:,[2]], data.iloc[:,[3]], c=clustering_agglomerative['Cluster']) >> axes[1, 1].scatter(clusters_agglomerative[:, 2], clusters_agglomerative[:, 3], c='red') >> axes[1, 1].set_title("Agglomerative") >> axes[2, 0].scatter(data.iloc[:,[2]], data.iloc[:,[3]], c=clustering_mixture['Cluster']) >> axes[2, 0].scatter(clusters_mixture[:, 2], clusters_mixture[:, 3], c='red') >> axes[2, 0].set_title("Mixture") >> axes[2, 1].scatter(data.iloc[:,[2]], data.iloc[:,[3]], c=clustering_bayes['Cluster']) >> axes[2, 1].scatter(clusters_bayes[:, 2], clusters_bayes[:, 3], c='red') >> axes[2, 1].set_title("Bayesian Mixture") >> axes[3, 0].scatter(data.iloc[:,[2]], data.iloc[:,[3]], c=clustering_pca['Cluster']) >> axes[3, 0].scatter(clusters_pca[:, 2], clusters_pca[:, 3], c='red') >> axes[3, 0].set_title("PCA") >> axes[3, 1].scatter(data.iloc[:,[2]], data.iloc[:,[3]], c=clustering_ica['Cluster']) >> axes[3, 1].scatter(clusters_ica[:, 2], clusters_ica[:, 3], c='red') >> axes[3, 1].set_title("ICA") >> axes[4, 0].scatter(data.iloc[:,[2]], data.iloc[:,[3]], c=clustering_kmod['Cluster']) >> axes[4, 0].scatter(clusters_kmod[:, 2], clusters_kmod[:, 3], c='red') >> axes[4, 0].set_title("modified K-means") >> axes[4, 1].scatter(data.iloc[:,[2]], data.iloc[:,[3]], c=clustering_aahc['Cluster']) >> axes[4, 1].scatter(clusters_aahc[:, 2], clusters_aahc[:, 3], c='red') >> axes[4, 1].set_title("AAHC (Frederic's method)")
Park, H. S., & Jun, C. H. (2009). A simple and fast algorithm for K-medoids
clustering. Expert systems with applications, 36(2), 3336-3341.
- cluster_findnumber(data, method='kmeans', n_max=10, show=False, **kwargs)[source]¶
Find the optimal number of clusters based on different metrices of quality.
- Parameters
data (np.ndarray) – An array (channels, times) of M/EEG data.
method (str) – The clustering algorithm to be passed into
nk.cluster()
.n_max (int) – Runs the clustering alogrithm from 1 to n_max desired clusters in
nk.cluster()
with quality metrices produced for each cluster number.show (bool) – Plot indices normalized on the same scale.
**kwargs – Other arguments to be passed into
nk.cluster()
andnk.cluster_quality()
.
- Returns
DataFrame – The different quality scores for each number of clusters: - Score_Silhouette - Score_Calinski - Score_Bouldin - Score_VarianceExplained - Score_GAP - Score_GAPmod - Score_GAP_diff - Score_GAPmod_diff
See also
Examples
>>> import neurokit2 as nk >>> >>> # Load the iris dataset >>> data = nk.data("iris") >>> >>> # How many clusters >>> results = nk.cluster_findnumber(data, method="kmeans", show=True)
- cluster_quality(data, clustering, clusters=None, info=None, n_random=10, **kwargs)[source]¶
Compute quality of the clustering using several metrices.
- Parameters
data (np.ndarray) – A matrix array of data (e.g., channels, sample points of M/EEG data)
clustering (DataFrame) – Information about the distance of samples from their respective clusters, generated from
nk.cluster()
.clusters (np.ndarray) – Coordinates of cluster centers, which has a shape of n_clusters x n_features, generated from
nk.cluster()
.info (dict) – Information about the number of clusters, the function and model used for clustering, generated from
nk.cluster()
.n_random (int) – The number of random initializations to cluster random data for calculating the GAP statistic.
**kwargs – Other argument to be passed on, for instance GFP as ‘sd’ in microstates.
- Returns
individual (DataFrame) – Indices of cluster quality scores for each sample point.
general (DataFrame) – Indices of cluster quality scores for all clusters.
Examples
>>> import neurokit2 as nk >>> import matplotlib.pyplot as plt >>> >>> # Load the iris dataset >>> data = nk.data("iris") >>> >>> # Cluster >>> clustering, clusters, info = nk.cluster(data, method="kmeans", n_clusters=3) >>> >>> # Compute indices of clustering quality >>> individual, general = nk.cluster_quality(data, clustering, clusters, info) >>> general n_Clusters Score_Silhouette ... Score_GAP_sk Score_GAPmod_sk 0 ... ... ... ... ...
[1 rows x 10 columns]
References
Tibshirani, R., Walther, G., & Hastie, T. (2001). Estimating the number of clusters in a
data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2), 411-423.
Mohajer, M., Englmeier, K. H., & Schmid, V. J. (2011). A comparison of Gap statistic
definitions with and without logarithm function. arXiv preprint arXiv:1103.4767.
- cor(x, y, method='pearson', show=False)[source]¶
Density estimation.
Computes kernel density estimates.
- Parameters
x (Union[list, np.array, pd.Series]) – Vectors of values.
y (Union[list, np.array, pd.Series]) – Vectors of values.
method (str) – Correlation method. Can be one of ‘pearson’, ‘spearman’, ‘kendall’.
show (bool) – Draw a scatterplot with a regression line.
- Returns
r – The correlation coefficient.
Examples
>>> import neurokit2 as nk >>> >>> x = [1, 2, 3, 4, 5] >>> y = [3, 1, 5, 6, 6] >>> corr = nk.cor(x, y, method="pearson", show=True) >>> corr
- density(x, desired_length=100, bandwith=1, show=False)[source]¶
Density estimation.
Computes kernel density estimates.
- Parameters
x (Union[list, np.array, pd.Series]) – A vector of values.
desired_length (int) – The amount of values in the returned density estimation.
bandwith (float) – The bandwith of the kernel. The smaller the values, the smoother the estimation.
show (bool) – Display the density plot.
- Returns
x, y – The x axis of the density estimation.
y – The y axis of the density estimation.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.ecg_simulate(duration=20) >>> x, y = nk.density(signal, bandwith=0.5, show=True) >>> >>> # Bandwidth comparison >>> x, y1 = nk.density(signal, bandwith=0.5) >>> x, y2 = nk.density(signal, bandwith=1) >>> x, y3 = nk.density(signal, bandwith=2) >>> pd.DataFrame({"x": x, "y1": y1, "y2": y2, "y3": y3}).plot(x="x")
- distance(X=None, method='mahalanobis')[source]¶
Distance.
Compute distance using different metrics.
- Parameters
X (array or DataFrame) – A dataframe of values.
method (str) – The method to use. One of ‘mahalanobis’ or ‘mean’ for the average distance from the mean.
- Returns
array – Vector containing the distance values.
Examples
>>> import neurokit2 as nk >>> >>> X = nk.data("iris") >>> vector = nk.distance(X) >>> vector
- fit_error(y, y_predicted, n_parameters=2)[source]¶
Calculate the fit error for a model.
Also specific and direct access functions can be used, such as fit_mse(), fit_rmse() and fit_r2().
- Parameters
y (Union[list, np.array, pd.Series]) – The response variable (the y axis).
y_predicted (Union[list, np.array, pd.Series]) – The fitted data generated by a model.
n_parameters (int) – Number of model parameters (for the degrees of freedom used in R2).
- Returns
dict – A dictionary containing different indices of fit error.
Examples
>>> import neurokit2 as nk >>> >>> y = np.array([-1.0, -0.5, 0, 0.5, 1]) >>> y_predicted = np.array([0.0, 0, 0, 0, 0]) >>> >>> # Master function >>> x = nk.fit_error(y, y_predicted) >>> x >>> >>> # Direct access >>> nk.fit_mse(y, y_predicted) 0.5 >>> >>> nk.fit_rmse(y, y_predicted) 0.7071067811865476 >>> >>> nk.fit_r2(y, y_predicted, adjusted=False) 0.7071067811865475 >>> >>> nk.fit_r2(y, y_predicted, adjusted=True, n_parameters=2) 0.057190958417936755
- fit_loess(y, X=None, alpha=0.75, order=2)[source]¶
Local Polynomial Regression (LOESS)
Performs a LOWESS (LOcally WEighted Scatter-plot Smoother) regression.
- Parameters
y (Union[list, np.array, pd.Series]) – The response variable (the y axis).
X (Union[list, np.array, pd.Series]) – Explanatory variable (the x axis). If ‘None’, will treat y as a continuous signal (useful for smoothing).
alpha (float) – The parameter which controls the degree of smoothing, which corresponds to the proportion of the samples to include in local regression.
order (int) – Degree of the polynomial to fit. Can be 1 or 2 (default).
- Returns
array – Prediciton of the LOESS algorithm.
See also
signal_smooth
,signal_detrend
,fit_error
Examples
>>> import pandas as pd >>> import neurokit2 as nk >>> >>> signal = np.cos(np.linspace(start=0, stop=10, num=1000)) >>> distorted = nk.signal_distort(signal, noise_amplitude=[0.3, 0.2, 0.1], noise_frequency=[5, 10, 50]) >>> >>> pd.DataFrame({ "Raw": distorted, "Loess_1": nk.fit_loess(distorted, order=1), ... "Loess_2": nk.fit_loess(distorted, order=2)}).plot()
References
- fit_mixture(X=None, n_clusters=2)[source]¶
Gaussian Mixture Model.
Performs a polynomial regression of given order.
- Parameters
X (Union[list, np.array, pd.Series]) – The values to classify.
n_clusters (int) – Number of components to look for.
- Returns
pd.DataFrame – DataFrame containing the probability of belongning to each cluster.
See also
signal_detrend
,fit_error
Examples
>>> import pandas as pd >>> import neurokit2 as nk >>> >>> x = nk.signal_simulate() >>> probs = nk.fit_mixture(x, n_clusters=2) >>> fig = nk.signal_plot([x, probs["Cluster_0"], probs["Cluster_1"]], standardize=True) >>> fig
- fit_polynomial(y, X=None, order=2)[source]¶
Polynomial Regression.
Performs a polynomial regression of given order.
- Parameters
y (Union[list, np.array, pd.Series]) – The response variable (the y axis).
X (Union[list, np.array, pd.Series]) – Explanatory variable (the x axis). If ‘None’, will treat y as a continuous signal.
order (int) – The order of the polynomial. 0, 1 or > 1 for a baseline, linear or polynomial fit, respectively. Can also be ‘auto’, it which case it will attempt to find the optimal order to minimize the RMSE.
- Returns
array – Prediction of the regression.
See also
signal_detrend
,fit_error
,fit_polynomial_findorder
Examples
>>> import pandas as pd >>> import neurokit2 as nk >>> >>> y = np.cos(np.linspace(start=0, stop=10, num=100)) >>> >>> pd.DataFrame({"y": y, "Poly_0": nk.fit_polynomial(y, order=0), ... "Poly_1": nk.fit_polynomial(y, order=1), ... "Poly_2": nk.fit_polynomial(y, order=2), ... "Poly_3": nk.fit_polynomial(y, order=3), "Poly_5": nk.fit_polynomial(y, order=5), ... "Poly_auto": nk.fit_polynomial(y, order='auto')}).plot()
- fit_polynomial_findorder(y, X=None, max_order=6)[source]¶
Polynomial Regression.
Find the optimal order for polynomial fitting. Currently, the only method implemented is RMSE minimization.
- Parameters
y (Union[list, np.array, pd.Series]) – The response variable (the y axis).
X (Union[list, np.array, pd.Series]) – Explanatory variable (the x axis). If ‘None’, will treat y as a continuous signal.
max_order (int) – The maximum order to test.
- Returns
int – Optimal order.
See also
Examples
>>> import neurokit2 as nk >>> >>> y = np.cos(np.linspace(start=0, stop=10, num=100)) >>> >>> nk.fit_polynomial_findorder(y, max_order=10) 9
- hdi(x, ci=0.95, show=False, **kwargs)[source]¶
Highest Density Interval (HDI)
Compute the Highest Density Interval (HDI) of a distribution. All points within this interval have a higher probability density than points outside the interval. The HDI can be used in the context of uncertainty characterisation of posterior distributions (in the Bayesian farmework) as Credible Interval (CI). Unlike equal-tailed intervals that typically exclude 2.5% from each tail of the distribution and always include the median, the HDI is not equal-tailed and therefore always includes the mode(s) of posterior distributions.
- Parameters
x (Union[list, np.array, pd.Series]) – A vector of values.
ci (float) – Value of probability of the (credible) interval - CI (between 0 and 1) to be estimated. Default to .95 (95%).
show (bool) – If True, the function will produce a figure.
**kwargs (Line2D properties) – Other arguments to be passed to
density()
.
See also
- Returns
float(s) – The HDI low and high limits.
fig – Distribution plot.
Examples
>>> import numpy as np >>> import neurokit2 as nk >>> >>> x = np.random.normal(loc=0, scale=1, size=100000) >>> ci_min, ci_high = nk.hdi(x, ci=0.95, show=True)
- mad(x, constant=1.4826, **kwargs)[source]¶
Median Absolute Deviation: a “robust” version of standard deviation.
- Parameters
x (Union[list, np.array, pd.Series]) – A vector of values.
constant (float) – Scale factor. Use 1.4826 for results similar to default R.
- Returns
float – The MAD.
Examples
>>> import neurokit2 as nk >>> nk.mad([2, 8, 7, 5, 4, 12, 5, 1]) 3.7064999999999997
References
- mutual_information(x, y, method='varoquaux', bins=256, sigma=1, normalized=True)[source]¶
Computes the (normalized) mutual information (MI) between two vectors from a joint histogram. The mutual information of two variables is a measure of the mutual dependence between them. More specifically, it quantifies the “amount of information” obtained about one variable by observing the other variable.
- Parameters
x (Union[list, np.array, pd.Series]) – A vector of values.
y (Union[list, np.array, pd.Series]) – A vector of values.
method (str) – Method to use. Can either be ‘varoquaux’ or ‘nolitsa’.
bins (int) – Number of bins to use while creating the histogram.
sigma (float) – Sigma for Gaussian smoothing of the joint histogram. Only used if method==’varoquaux’.
normalized (book) – Compute normalised mutual information. Only used if method==’varoquaux’.
- Returns
float – The computed similariy measure.
Examples
>>> import neurokit2 as nk >>> >>> x = [3, 3, 5, 1, 6, 3] >>> y = [5, 3, 1, 3, 4, 5] >>> >>> nk.mutual_information(x, y, method="varoquaux") 0.2360075... >>> >>> nk.mutual_information(x, y, method="nolitsa") 1.4591479...
References
Studholme, jhill & jhawkes (1998). “A normalized entropy measure of 3-D medical image alignment”.
in Proc. Medical Imaging 1998, vol. 3338, San Diego, CA, pp. 132-143.
- rescale(data, to=[0, 1], scale=None)[source]¶
Rescale data.
Rescale a numeric variable to a new range.
- Parameters
data (Union[list, np.array, pd.Series]) – Raw data.
to (list) – New range of values of the data after rescaling.
scale (list) – A list or tuple of two values specifying the actual range of the data. If None, the minimum and the maximum of the provided data will be used.
- Returns
list – The rescaled values.
Examples
>>> import neurokit2 as nk >>> >>> nk.rescale([3, 1, 2, 4, 6], to=[0, 1]) [0.4, 0.0, 0.2, 0.6000000000000001, 1.0]
- standardize(data, robust=False, window=None, **kwargs)[source]¶
Standardization of data.
Performs a standardization of data (Z-scoring), i.e., centering and scaling, so that the data is expressed in terms of standard deviation (i.e., mean = 0, SD = 1) or Median Absolute Deviance (median = 0, MAD = 1).
- Parameters
data (Union[list, np.array, pd.Series]) – Raw data.
robust (bool) – If True, centering is done by substracting the median from the variables and dividing it by the median absolute deviation (MAD). If False, variables are standardized by substracting the mean and dividing it by the standard deviation (SD).
window (int) – Perform a rolling window standardization, i.e., apply a standardization on a window of the specified number of samples that rolls along the main axis of the signal. Can be used for complex detrending.
**kwargs (optional) – Other arguments to be passed to
pandas.rolling()
.
- Returns
list – The standardized values.
Examples
>>> import neurokit2 as nk >>> import pandas as pd >>> >>> # Simple example >>> nk.standardize([3, 1, 2, 4, 6, np.nan]) [...] >>> nk.standardize([3, 1, 2, 4, 6, np.nan], robust=True) [...] >>> nk.standardize(np.array([[1, 2, 3, 4], [5, 6, 7, 8]]).T) array(...) >>> nk.standardize(pd.DataFrame({"A": [3, 1, 2, 4, 6, np.nan], ... "B": [3, 1, 2, 4, 6, 5]})) A B 0 ... ... ... >>> >>> # Rolling standardization of a signal >>> signal = nk.signal_simulate(frequency=[0.1, 2], sampling_rate=200) >>> z = nk.standardize(signal, window=200) >>> nk.signal_plot([signal, z], standardize=True)
Complexity¶
Submodule for NeuroKit.
- complexity_apen(signal, delay=1, dimension=2, r='default', corrected=False, **kwargs)¶
Approximate entropy (ApEn)
Python implementations of the approximate entropy (ApEn) and its corrected version (cApEn). Approximate entropy is a technique used to quantify the amount of regularity and the unpredictability of fluctuations over time-series data. The advantages of ApEn include lower computational demand (ApEn can be designed to work for small data samples (< 50 data points) and can be applied in real time) and less sensitive to noise. However, ApEn is heavily dependent on the record length and lacks relative consistency.
This function can be called either via
entropy_approximate()
orcomplexity_apen()
, and the corrected version viacomplexity_capen()
.- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see
delay_optimal()
).dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.
r (float) – Tolerance (similarity threshold). It corresponds to the filtering level - max absolute difference between segments. If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).
corrected (bool) – If true, will compute corrected ApEn (cApEn), see Porta (2007).
**kwargs – Other arguments.
See also
- Returns
float – The approximate entropy as float value.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=2, frequency=5) >>> entropy1 = nk.entropy_approximate(signal) >>> entropy1 >>> entropy2 = nk.entropy_approximate(signal, corrected=True) >>> entropy2
References
EntroPy <https://github.com/raphaelvallat/entropy>`_
Sabeti, M., Katebi, S., & Boostani, R. (2009). Entropy and complexity measures for EEG signal classification of schizophrenic and control participants. Artificial intelligence in medicine, 47(3), 263-274.
Shi, B., Zhang, Y., Yuan, C., Wang, S., & Li, P. (2017). Entropy analysis of short-term heartbeat interval time series during regular walking. Entropy, 19(10), 568.
- complexity_capen(signal, delay=1, dimension=2, r='default', *, corrected=True, **kwargs)¶
Approximate entropy (ApEn)
Python implementations of the approximate entropy (ApEn) and its corrected version (cApEn). Approximate entropy is a technique used to quantify the amount of regularity and the unpredictability of fluctuations over time-series data. The advantages of ApEn include lower computational demand (ApEn can be designed to work for small data samples (< 50 data points) and can be applied in real time) and less sensitive to noise. However, ApEn is heavily dependent on the record length and lacks relative consistency.
This function can be called either via
entropy_approximate()
orcomplexity_apen()
, and the corrected version viacomplexity_capen()
.- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see
delay_optimal()
).dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.
r (float) – Tolerance (similarity threshold). It corresponds to the filtering level - max absolute difference between segments. If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).
corrected (bool) – If true, will compute corrected ApEn (cApEn), see Porta (2007).
**kwargs – Other arguments.
See also
- Returns
float – The approximate entropy as float value.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=2, frequency=5) >>> entropy1 = nk.entropy_approximate(signal) >>> entropy1 >>> entropy2 = nk.entropy_approximate(signal, corrected=True) >>> entropy2
References
EntroPy <https://github.com/raphaelvallat/entropy>`_
Sabeti, M., Katebi, S., & Boostani, R. (2009). Entropy and complexity measures for EEG signal classification of schizophrenic and control participants. Artificial intelligence in medicine, 47(3), 263-274.
Shi, B., Zhang, Y., Yuan, C., Wang, S., & Li, P. (2017). Entropy analysis of short-term heartbeat interval time series during regular walking. Entropy, 19(10), 568.
- complexity_cmse(signal, scale='default', dimension=2, r='default', *, composite=True, refined=False, fuzzy=False, show=False, **kwargs)¶
Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.
Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).
This function can be called either via
entropy_multiscale()
orcomplexity_mse()
. Moreover, variants can be directly accessed viacomplexity_cmse()
, complexity_rcmse()`,complexity_fuzzymse()
andcomplexity_fuzzyrcmse()
.- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use
range(len(signal) / (dimension + 10))
(see discussion here). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.
r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).
composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.
refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)
fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).
show (bool) – Show the entropy values for each scale factor.
**kwargs – Optional arguments.
- Returns
float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curvee, which is essentially the sum of sample entropy values over the range of scale factors.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=2, frequency=5) >>> entropy1 = nk.entropy_multiscale(signal, show=True) >>> entropy1 >>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True) >>> entropy2 >>> entropy3 = nk.entropy_multiscale(signal, show=True, refined=True) >>> entropy3
References
pyEntropy <https://github.com/nikdon/pyEntropy>`_
Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.
Costa, M., Goldberger, A. L., & Peng, C. K. (2005). Multiscale entropy analysis of biological signals. Physical review E, 71(2), 021906.
Gow, B. J., Peng, C. K., Wayne, P. M., & Ahn, A. C. (2015). Multiscale entropy analysis of center-of-pressure dynamics in human postural control: methodological considerations. Entropy, 17(12), 7926-7947.
Norris, P. R., Anderson, S. M., Jenkins, J. M., Williams, A. E., & Morris Jr, J. A. (2008). Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock, 30(1), 17-22.
Liu, Q., Wei, Q., Fan, S. Z., Lu, C. W., Lin, T. Y., Abbod, M. F., & Shieh, J. S. (2012). Adaptive computation of multiscale entropy and its application in EEG signals for monitoring depth of anesthesia during surgery. Entropy, 14(6), 978-992.
- complexity_d2(signal, delay=1, dimension=2, r=64, show=False)¶
Correlation Dimension.
Python implementation of the Correlation Dimension D2 of a signal.
This function can be called either via
fractal_correlation()
orcomplexity_d2()
.- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see
delay_optimal()
).dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.
r (str or int or list) – The sequence of radiuses to test. If an integer is passed, will get an exponential sequence ranging from 2.5% to 50% of the distance range. Methods implemented in other packages can be used via setting
r='nolds'
orr='Corr_Dim'
.show (bool) – Plot of correlation dimension if True. Defaults to False.
- Returns
D2 (float) – The correlation dimension D2.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=2, frequency=5) >>> >>> fractal1 = nk.fractal_correlation(signal, r="nolds", show=True) >>> fractal1 >>> fractal2 = nk.fractal_correlation(signal, r=32, show=True) >>> fractal2 >>> >>> signal = nk.rsp_simulate(duration=120, sampling_rate=50) >>> >>> fractal3 = nk.fractal_correlation(signal, r="nolds", show=True) >>> fractal3 >>> fractal4 = nk.fractal_correlation(signal, r=32, show=True) >>> fractal4
References
Bolea, J., Laguna, P., Remartínez, J. M., Rovira, E., Navarro, A., & Bailón, R. (2014). Methodological framework for estimating the correlation dimension in HRV signals. Computational and mathematical methods in medicine, 2014.
Boon, M. Y., Henry, B. I., Suttle, C. M., & Dain, S. J. (2008). The correlation dimension: A useful objective measure of the transient visual evoked potential?. Journal of vision, 8(1), 6-6.
- complexity_delay(signal, delay_max=100, method='fraser1986', show=False)[source]¶
Estimate optimal Time Delay (tau) for time-delay embedding.
The time delay (Tau) is one of the two critical parameters involved in the construction of the time-delay embedding of a signal.
Several authors suggested different methods to guide the choice of Tau:
Fraser and Swinney (1986) suggest using the first local minimum of the mutual information between the delayed and non-delayed time series, effectively identifying a value of tau for which they share the least information.
Theiler (1990) suggested to select Tau where the autocorrelation between the signal and its lagged version at Tau first crosses the value 1/e.
Casdagli (1991) suggests instead taking the first zero-crossing of the autocorrelation.
Rosenstein (1993) suggests to the point close to 40% of the slope of the average displacement from the diagonal (ADFD).
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
delay_max (int) – The maximum time delay (Tau or lag) to test.
method (str) – Correlation method. Can be one of ‘fraser1986’, ‘theiler1990’, ‘casdagli1991’, ‘rosenstein1993’.
show (bool) – If true, will plot the mutual information values for each value of tau.
- Returns
int – Optimal time delay.
See also
Examples
>>> import neurokit2 as nk >>> >>> # Artifical example >>> signal = nk.signal_simulate(duration=10, frequency=1, noise=0.01) >>> nk.signal_plot(signal) >>> >>> delay = nk.complexity_delay(signal, delay_max=1000, show=True, method="fraser1986") >>> delay = nk.complexity_delay(signal, delay_max=1000, show=True, method="theiler1990") >>> delay = nk.complexity_delay(signal, delay_max=1000, show=True, method="casdagli1991") >>> delay = nk.complexity_delay(signal, delay_max=1000, show=True, method="rosenstein1993") >>> >>> # Realistic example >>> ecg = nk.ecg_simulate(duration=60*6, sampling_rate=150) >>> signal = nk.ecg_rate(nk.ecg_peaks(ecg, sampling_rate=150), sampling_rate=150, desired_length=len(ecg)) >>> nk.signal_plot(signal) >>> >>> delay = nk.complexity_delay(signal, delay_max=1000, show=True)
References
Gautama, T., Mandic, D. P., & Van Hulle, M. M. (2003, April). A differential entropy based method for determining the optimal embedding parameters of a signal. In 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). (Vol. 6, pp. VI-29). IEEE.
Camplani, M., & Cannas, B. (2009). The role of the embedding dimension and time delay in time series forecasting. IFAC Proceedings Volumes, 42(7), 316-320.
Rosenstein, M. T., Collins, J. J., & De Luca, C. J. (1994). Reconstruction expansion as a geometry-based framework for choosing proper delay times. Physica-Section D, 73(1), 82-98.
- complexity_dfa(signal, windows='default', overlap=True, integrate=True, order=1, multifractal=False, q=2, show=False)¶
(Multifractal) Detrended Fluctuation Analysis (DFA or MFDFA).
Python implementation of Detrended Fluctuation Analysis (DFA) or Multifractal DFA of a signal. Detrended fluctuation analysis, much like the Hurst exponent, is used to find long-term statistical dependencies in time series.
This function can be called either via fractal_dfa() or complexity_dfa(), and its multifractal variant can be directly accessed via fractal_mfdfa() or complexity_mfdfa().
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
windows (list) – A list containing the lengths of the windows (number of data points in each subseries). Also referred to as ‘lag’ or ‘scale’. If ‘default’, will set it to a logarithmic scale (so that each window scale has the same weight) with a minimum of 4 and maximum of a tenth of the length (to have more than 10 windows to calculate the average fluctuation).
overlap (bool) – Defaults to True, where the windows will have a 50% overlap with each other, otherwise non-overlapping windows will be used.
integrate (bool) – It is common practice to convert the signal to a random walk (i.e., detrend and integrate, which corresponds to the signal ‘profile’). Note that it leads to the flattening of the signal, which can lead to the loss of some details (see Ihlen, 2012 for an explanation). Note that for strongly anticorrelated signals, this transformation should be applied two times (i.e., provide np.cumsum(signal - np.mean(signal)) instead of signal).
order (int) – The order of the polynomial trend for detrending, 1 for the linear trend.
multifractal (bool) – If true, compute Multifractal Detrended Fluctuation Analysis (MFDFA), in which case the argument q is taken into account.
q (list or np.array (default 2)) – The sequence of fractal exponents when multifractal=True. Must be a sequence between -10 and 10 (note that zero will be removed, since the code does not converge there). Setting q = 2 (default) gives a result of a standard DFA. For instance, Ihlen (2012) uses q = [-5, -3, -1, 0, 1, 3, 5]. In general, positive q moments amplify the contribution of fractal components with larger amplitude and negative q moments amplify the contribution of fractal with smaller amplitude (Kantelhardt et al., 2002)
show (bool) – Visualise the trend between the window size and the fluctuations.
- Returns
dfa (dict) – If multifractal is False, the dictionary contains q value, a series of windows, fluctuations of each window and the slopes value of the log2(windows) versus log2(fluctuations) plot. If multifractal is True, the dictionary additionally contains the parameters of the singularity spectrum (scaling exponents, singularity dimension, singularity strength; see singularity_spectrum() for more information).
Examples
>>> import neurokit2 as nk >>> import numpy as np >>> >>> signal = nk.signal_simulate(duration=10, noise=0.05) >>> dfa1 = nk.fractal_dfa(signal, show=True) >>> dfa1 >>> dfa2 = nk.fractal_mfdfa(signal, q=np.arange(-5, 6), show=True) >>> dfa2
References
Ihlen, E. A. F. E. (2012). Introduction to multifractal detrended fluctuation analysis in Matlab. Frontiers in physiology, 3, 141.
Kantelhardt, J. W., Zschiegner, S. A., Koscielny-Bunde, E., Havlin, S., Bunde, A., & Stanley, H. E. (2002). Multifractal detrended fluctuation analysis of nonstationary time series. Physica A: Statistical Mechanics and its Applications, 316(1-4), 87-114.
Hardstone, R., Poil, S. S., Schiavone, G., Jansen, R., Nikulin, V. V., Mansvelder, H. D., & Linkenkaer-Hansen, K. (2012). Detrended fluctuation analysis: a scale-free view on neuronal oscillations. Frontiers in physiology, 3, 450.
- complexity_dimension(signal, delay=1, dimension_max=20, method='afnn', show=False, R=10.0, A=2.0, **kwargs)[source]¶
Estimate optimal Dimension (m) for time-delay embedding.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see
complexity_delay()
).dimension_max (int) – The maximum embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’) to test.
method (str) – Method can either be afnn (average false nearest neighbour) or fnn (false nearest neighbour).
show (bool) – Visualize the result.
R (float) – Relative tolerance (for fnn method).
A (float) – Absolute tolerance (for fnn method)
**kwargs – Other arguments.
- Returns
int – Optimal dimension.
See also
Examples
>>> import neurokit2 as nk >>> >>> # Artifical example >>> signal = nk.signal_simulate(duration=10, frequency=1, noise=0.01) >>> delay = nk.complexity_delay(signal, delay_max=500) >>> >>> values = nk.complexity_dimension(signal, delay=delay, dimension_max=20, show=True)
References
Cao, L. (1997). Practical method for determining the minimum embedding dimension of a scalar time series. Physica D: Nonlinear Phenomena, 110(1-2), 43-50.
- complexity_embedding(signal, delay=1, dimension=3, show=False)[source]¶
Time-delay embedding of a time series (a signal)
A dynamical system can be described by a vector of numbers, called its ‘state’, that aims to provide a complete description of the system at some point in time. The set of all possible states is called the ‘state space’.
Takens’s (1981) embedding theorem suggests that a sequence of measurements of a dynamic system includes in itself all the information required to completely reconstruct the state space. Delay coordinate embedding attempts to identify the state s of the system at some time t by searching the past history of observations for similar states, and, by studying the evolution of similar states, infer information about the future of the system.
How to visualize the dynamics of a system? A sequence of state values over time is called a trajectory. Depending on the system, different trajectories can evolve to a common subset of state space called an attractor. The presence and behavior of attractors gives intuition about the underlying dynamical system. We can visualize the system and its attractors by plotting the trajectory of many different initial state values and numerically integrating them to approximate their continuous time evolution on discrete computers.
This function is adapted from EntroPy and is equivalent to the delay_embedding() function from ‘nolds’.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see
delay_optimal()
).dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.
show (bool) – Plot the reconstructed attractor.
- Returns
array – Embedded time-series, of shape (n_times - (order - 1) * delay, order)
See also
embedding_delay
,embedding_dimension
Examples
>>> import neurokit2 as nk >>> >>> # Artifical example >>> signal = nk.signal_simulate(duration=2, frequency=5, noise=0.01) >>> >>> embedded = nk.complexity_embedding(signal, delay=50, dimension=2, show=True) >>> embedded = nk.complexity_embedding(signal, delay=50, dimension=3, show=True) >>> embedded = nk.complexity_embedding(signal, delay=50, dimension=4, show=True) >>> >>> # Realistic example >>> ecg = nk.ecg_simulate(duration=60*4, sampling_rate=200) >>> signal = nk.ecg_rate(nk.ecg_peaks(ecg, sampling_rate=200)[0], sampling_rate=200, desired_length=len(ecg)) >>> >>> embedded = nk.complexity_embedding(signal, delay=250, dimension=2, show=True) >>> embedded = nk.complexity_embedding(signal, delay=250, dimension=3, show=True) >>> embedded = nk.complexity_embedding(signal, delay=250, dimension=4, show=True)
References
Gautama, T., Mandic, D. P., & Van Hulle, M. M. (2003, April). A differential entropy based method for determining the optimal embedding parameters of a signal. In 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). (Vol. 6, pp. VI-29). IEEE.
- complexity_fuzzycmse(signal, scale='default', dimension=2, r='default', *, composite=True, refined=False, fuzzy=True, show=False, **kwargs)¶
Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.
Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).
This function can be called either via
entropy_multiscale()
orcomplexity_mse()
. Moreover, variants can be directly accessed viacomplexity_cmse()
, complexity_rcmse()`,complexity_fuzzymse()
andcomplexity_fuzzyrcmse()
.- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use
range(len(signal) / (dimension + 10))
(see discussion here). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.
r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).
composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.
refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)
fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).
show (bool) – Show the entropy values for each scale factor.
**kwargs – Optional arguments.
- Returns
float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curvee, which is essentially the sum of sample entropy values over the range of scale factors.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=2, frequency=5) >>> entropy1 = nk.entropy_multiscale(signal, show=True) >>> entropy1 >>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True) >>> entropy2 >>> entropy3 = nk.entropy_multiscale(signal, show=True, refined=True) >>> entropy3
References
pyEntropy <https://github.com/nikdon/pyEntropy>`_
Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.
Costa, M., Goldberger, A. L., & Peng, C. K. (2005). Multiscale entropy analysis of biological signals. Physical review E, 71(2), 021906.
Gow, B. J., Peng, C. K., Wayne, P. M., & Ahn, A. C. (2015). Multiscale entropy analysis of center-of-pressure dynamics in human postural control: methodological considerations. Entropy, 17(12), 7926-7947.
Norris, P. R., Anderson, S. M., Jenkins, J. M., Williams, A. E., & Morris Jr, J. A. (2008). Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock, 30(1), 17-22.
Liu, Q., Wei, Q., Fan, S. Z., Lu, C. W., Lin, T. Y., Abbod, M. F., & Shieh, J. S. (2012). Adaptive computation of multiscale entropy and its application in EEG signals for monitoring depth of anesthesia during surgery. Entropy, 14(6), 978-992.
- complexity_fuzzyen(signal, delay=1, dimension=2, r='default', **kwargs)¶
Fuzzy entropy (FuzzyEn)
Python implementations of the fuzzy entropy (FuzzyEn) of a signal.
This function can be called either via
entropy_fuzzy()
orcomplexity_fuzzyen()
.- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see
delay_optimal()
).dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.
r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).
**kwargs – Other arguments.
- Returns
float – The fuzzy entropy as float value.
See also
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=2, frequency=5) >>> entropy = nk.entropy_fuzzy(signal) >>> entropy
- complexity_fuzzymse(signal, scale='default', dimension=2, r='default', composite=False, refined=False, *, fuzzy=True, show=False, **kwargs)¶
Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.
Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).
This function can be called either via
entropy_multiscale()
orcomplexity_mse()
. Moreover, variants can be directly accessed viacomplexity_cmse()
, complexity_rcmse()`,complexity_fuzzymse()
andcomplexity_fuzzyrcmse()
.- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use
range(len(signal) / (dimension + 10))
(see discussion here). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.
r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).
composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.
refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)
fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).
show (bool) – Show the entropy values for each scale factor.
**kwargs – Optional arguments.
- Returns
float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curvee, which is essentially the sum of sample entropy values over the range of scale factors.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=2, frequency=5) >>> entropy1 = nk.entropy_multiscale(signal, show=True) >>> entropy1 >>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True) >>> entropy2 >>> entropy3 = nk.entropy_multiscale(signal, show=True, refined=True) >>> entropy3
References
pyEntropy <https://github.com/nikdon/pyEntropy>`_
Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.
Costa, M., Goldberger, A. L., & Peng, C. K. (2005). Multiscale entropy analysis of biological signals. Physical review E, 71(2), 021906.
Gow, B. J., Peng, C. K., Wayne, P. M., & Ahn, A. C. (2015). Multiscale entropy analysis of center-of-pressure dynamics in human postural control: methodological considerations. Entropy, 17(12), 7926-7947.
Norris, P. R., Anderson, S. M., Jenkins, J. M., Williams, A. E., & Morris Jr, J. A. (2008). Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock, 30(1), 17-22.
Liu, Q., Wei, Q., Fan, S. Z., Lu, C. W., Lin, T. Y., Abbod, M. F., & Shieh, J. S. (2012). Adaptive computation of multiscale entropy and its application in EEG signals for monitoring depth of anesthesia during surgery. Entropy, 14(6), 978-992.
- complexity_fuzzyrcmse(signal, scale='default', dimension=2, r='default', composite=False, *, refined=True, fuzzy=True, show=False, **kwargs)¶
Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.
Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).
This function can be called either via
entropy_multiscale()
orcomplexity_mse()
. Moreover, variants can be directly accessed viacomplexity_cmse()
, complexity_rcmse()`,complexity_fuzzymse()
andcomplexity_fuzzyrcmse()
.- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use
range(len(signal) / (dimension + 10))
(see discussion here). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.
r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).
composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.
refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)
fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).
show (bool) – Show the entropy values for each scale factor.
**kwargs – Optional arguments.
- Returns
float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curvee, which is essentially the sum of sample entropy values over the range of scale factors.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=2, frequency=5) >>> entropy1 = nk.entropy_multiscale(signal, show=True) >>> entropy1 >>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True) >>> entropy2 >>> entropy3 = nk.entropy_multiscale(signal, show=True, refined=True) >>> entropy3
References
pyEntropy <https://github.com/nikdon/pyEntropy>`_
Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.
Costa, M., Goldberger, A. L., & Peng, C. K. (2005). Multiscale entropy analysis of biological signals. Physical review E, 71(2), 021906.
Gow, B. J., Peng, C. K., Wayne, P. M., & Ahn, A. C. (2015). Multiscale entropy analysis of center-of-pressure dynamics in human postural control: methodological considerations. Entropy, 17(12), 7926-7947.
Norris, P. R., Anderson, S. M., Jenkins, J. M., Williams, A. E., & Morris Jr, J. A. (2008). Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock, 30(1), 17-22.
Liu, Q., Wei, Q., Fan, S. Z., Lu, C. W., Lin, T. Y., Abbod, M. F., & Shieh, J. S. (2012). Adaptive computation of multiscale entropy and its application in EEG signals for monitoring depth of anesthesia during surgery. Entropy, 14(6), 978-992.
- complexity_lempelziv(signal, threshold='median', normalize=True)[source]¶
Computes Lempel Ziv Complexity (LZC) to quantify the regularity of the signal, by scanning symbolic sequences for new patterns, increasing the complexity count every time a new sequence is detected. Regular signals have a lower number of distinct patterns and thus have low LZC whereas irregular signals are characterized by a high lZC.
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
threshold (str) – Method for partitioning the signal into a binary sequence. Current options are “median” (default) or “mean”, where each data point is assigned 0 if lower than the median or mean of signal respecitvely, and 1 if higher.
normalize (bool) – Defaults to True, to obtain a complexity measure independent of sequence length.
- Returns
float – Lempel Ziv Complexity.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=2, frequency=5, noise=10) >>> >>> lzc = nk.complexity_lempelziv(signal, threshold="median") >>> lzc
References
Lempel, A., & Ziv, J. (1976). On the complexity of finite sequences. IEEE Transactions on information theory,
22(1), 75-81.
Nagarajan, R. (2002). Quantifying physiological data with Lempel-Ziv complexity-certain issues.
IEEE Transactions on Biomedical Engineering, 49(11), 1371–1373. doi:10.1109/tbme.2002.804582.
Kaspar, F., & Schuster, H. G. (1987). Easily calculable measure for the complexity of spatiotemporal patterns.
Physical Review A, 36(2), 842.
Zhang, Y., Hao, J., Zhou, C., & Chang, K. (2009). Normalized Lempel-Ziv complexity and
its application in bio-sequence analysis. Journal of mathematical chemistry, 46(4), 1203-1212.
- complexity_mfdfa(signal, windows='default', overlap=True, integrate=True, order=1, *, multifractal=True, q=2, show=False)¶
(Multifractal) Detrended Fluctuation Analysis (DFA or MFDFA).
Python implementation of Detrended Fluctuation Analysis (DFA) or Multifractal DFA of a signal. Detrended fluctuation analysis, much like the Hurst exponent, is used to find long-term statistical dependencies in time series.
This function can be called either via fractal_dfa() or complexity_dfa(), and its multifractal variant can be directly accessed via fractal_mfdfa() or complexity_mfdfa().
- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
windows (list) – A list containing the lengths of the windows (number of data points in each subseries). Also referred to as ‘lag’ or ‘scale’. If ‘default’, will set it to a logarithmic scale (so that each window scale has the same weight) with a minimum of 4 and maximum of a tenth of the length (to have more than 10 windows to calculate the average fluctuation).
overlap (bool) – Defaults to True, where the windows will have a 50% overlap with each other, otherwise non-overlapping windows will be used.
integrate (bool) – It is common practice to convert the signal to a random walk (i.e., detrend and integrate, which corresponds to the signal ‘profile’). Note that it leads to the flattening of the signal, which can lead to the loss of some details (see Ihlen, 2012 for an explanation). Note that for strongly anticorrelated signals, this transformation should be applied two times (i.e., provide np.cumsum(signal - np.mean(signal)) instead of signal).
order (int) – The order of the polynomial trend for detrending, 1 for the linear trend.
multifractal (bool) – If true, compute Multifractal Detrended Fluctuation Analysis (MFDFA), in which case the argument q is taken into account.
q (list or np.array (default 2)) – The sequence of fractal exponents when multifractal=True. Must be a sequence between -10 and 10 (note that zero will be removed, since the code does not converge there). Setting q = 2 (default) gives a result of a standard DFA. For instance, Ihlen (2012) uses q = [-5, -3, -1, 0, 1, 3, 5]. In general, positive q moments amplify the contribution of fractal components with larger amplitude and negative q moments amplify the contribution of fractal with smaller amplitude (Kantelhardt et al., 2002)
show (bool) – Visualise the trend between the window size and the fluctuations.
- Returns
dfa (dict) – If multifractal is False, the dictionary contains q value, a series of windows, fluctuations of each window and the slopes value of the log2(windows) versus log2(fluctuations) plot. If multifractal is True, the dictionary additionally contains the parameters of the singularity spectrum (scaling exponents, singularity dimension, singularity strength; see singularity_spectrum() for more information).
Examples
>>> import neurokit2 as nk >>> import numpy as np >>> >>> signal = nk.signal_simulate(duration=10, noise=0.05) >>> dfa1 = nk.fractal_dfa(signal, show=True) >>> dfa1 >>> dfa2 = nk.fractal_mfdfa(signal, q=np.arange(-5, 6), show=True) >>> dfa2
References
Ihlen, E. A. F. E. (2012). Introduction to multifractal detrended fluctuation analysis in Matlab. Frontiers in physiology, 3, 141.
Kantelhardt, J. W., Zschiegner, S. A., Koscielny-Bunde, E., Havlin, S., Bunde, A., & Stanley, H. E. (2002). Multifractal detrended fluctuation analysis of nonstationary time series. Physica A: Statistical Mechanics and its Applications, 316(1-4), 87-114.
Hardstone, R., Poil, S. S., Schiavone, G., Jansen, R., Nikulin, V. V., Mansvelder, H. D., & Linkenkaer-Hansen, K. (2012). Detrended fluctuation analysis: a scale-free view on neuronal oscillations. Frontiers in physiology, 3, 450.
- complexity_mse(signal, scale='default', dimension=2, r='default', composite=False, refined=False, fuzzy=False, show=False, **kwargs)¶
Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.
Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).
This function can be called either via
entropy_multiscale()
orcomplexity_mse()
. Moreover, variants can be directly accessed viacomplexity_cmse()
, complexity_rcmse()`,complexity_fuzzymse()
andcomplexity_fuzzyrcmse()
.- Parameters
signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.
scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use
range(len(signal) / (dimension + 10))
(see discussion here). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.
r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).
composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.
refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)
fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).
show (bool) – Show the entropy values for each scale factor.
**kwargs – Optional arguments.
- Returns
float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curvee, which is essentially the sum of sample entropy values over the range of scale factors.
Examples
>>> import neurokit2 as nk >>> >>> signal = nk.signal_simulate(duration=2, frequency=5) >>> entropy1 = nk.entropy_multiscale(signal, show=True) >>> entropy1 >>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True) >>> entropy2 >>> entropy3 = nk.entropy_multiscale(signal,