Please Read First! These datasets are from early academic research work in 2016/2017. They have several known errata and are NOT currently used within DeepSig products. We HIGHLY recommend that researchers develop their own datasets using basic modulation tools such as MATLAB or GNU Radio, or use REAL data recorded over the air! DeepSig provides several supported and vetted datasets for commercial customers, which are not provided here. Unfortunately, due to overwhelming demand, we are not able to provide support, revisions, or assistance for these open datasets.

DeepSig’s team created several small example datasets that were used in the team's early research on modulation recognition. These are made available here for historical and educational use. Unfortunately, we are not able to support them, and we do not recommend using them with OmniSIG.

We recommend researchers and ML engineers create their own datasets using real data for new work and usage!


All datasets provided by DeepSig Inc. are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License (CC BY-NC-SA 4.0). If an alternative license is needed, please contact us at

Please reference this page or our relevant academic papers when using these datasets.

Historical Dataset:

RADIOML 2018.01A

(from 2017)

A validated dataset of 24 digital and analog modulation types that includes synthetic simulated channel effects. This dataset was used in our paper “Over-the-air deep learning based radio signal classification”, published in 2017 in the IEEE Journal of Selected Topics in Signal Processing, which provides additional details and a description of the dataset.

Data are stored in HDF5 format as complex floating-point values, with 2 million examples, each 1024 samples long.
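The figures above give a rough idea of how much memory the sample data needs, which affects whether the file can be read in one go. The sketch below is a back-of-the-envelope estimate only. It assumes each complex sample is held as two 32-bit floats (8 bytes); the stored precision is not stated here, so `BYTES_PER_COMPLEX_SAMPLE` is an assumption, and the helper names `estimate_bytes` and `batch_ranges` are illustrative.

```python
# Rough size estimate for the 2018.01A dataset, using the figures quoted above.
NUM_EXAMPLES = 2_000_000        # "2 million examples" (approximate figure from the text)
SAMPLES_PER_EXAMPLE = 1024      # "each 1024 samples long"
BYTES_PER_COMPLEX_SAMPLE = 8    # ASSUMPTION: two float32 values (I and Q) per sample


def estimate_bytes(num_examples, samples_per_example, bytes_per_sample):
    """Return the raw payload size in bytes, ignoring labels and file overhead."""
    return num_examples * samples_per_example * bytes_per_sample


def batch_ranges(num_examples, batch_size):
    """Yield (start, stop) index pairs that cover the dataset in fixed-size batches."""
    for start in range(0, num_examples, batch_size):
        yield start, min(start + batch_size, num_examples)


total = estimate_bytes(NUM_EXAMPLES, SAMPLES_PER_EXAMPLE, BYTES_PER_COMPLEX_SAMPLE)
print(f"Estimated payload: {total / 1e9:.1f} GB")
```

Under these assumptions the sample data alone comes to roughly 16 GB, so reading it in index ranges (as `batch_ranges` produces) rather than all at once is likely the safer default on most workstations.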

Historical Dataset:

RADIOML 2016.10A

(from 2016)

A synthetic dataset, generated with GNU Radio, consisting of 11 modulations (8 digital and 3 analog) at varying signal-to-noise ratios. This dataset was first released at the 6th Annual GNU Radio Conference.

This is a cleaner and more normalized version of the 2016.04C dataset, which it supersedes. The file is a “pickle” file, which can be opened, for example, in Python using cPickle.load(…).
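A minimal sketch of the loading step under Python 3: `cPickle` exists only in Python 2, and its Python 3 counterpart is the standard `pickle` module. Pickles written by Python 2 usually need `encoding="latin1"` to load under Python 3, particularly when they contain NumPy arrays (in which case NumPy must also be installed). The function names `load_radioml_pickle` and `summarize` and the file path are hypothetical. The structure of the loaded object is not described here, so the sketch only reports its type and size. Because unpickling can execute arbitrary code, only open files obtained from a source you trust.

```python
import pickle


def load_radioml_pickle(path):
    """Load a pickle file written by Python 2 from Python 3.

    encoding="latin1" lets Python 3 decode Python 2 byte strings, which is
    usually needed for pickles that contain NumPy arrays.
    """
    with open(path, "rb") as f:
        return pickle.load(f, encoding="latin1")


def summarize(obj):
    """Return the type name and, for sized containers, the number of entries."""
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size


# Hypothetical usage (the path is a placeholder, not the real file name):
# data = load_radioml_pickle("path/to/dataset.pkl")
# print(summarize(data))
```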

Signal Generation Software: (Warning! These modules are not maintained)


Historical Dataset:

RADIOML 2016.04C

(from 2016)

A synthetic dataset, generated with GNU Radio, consisting of 11 modulations. This is a variable-SNR dataset with moderate LO drift, light fading, and numerous different labeled SNR increments for use in measuring performance across different signal and noise power scenarios.
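Labeled SNR increments make it possible to report classifier accuracy separately for each SNR instead of as a single average. A minimal sketch of that bookkeeping, assuming each test example has already been reduced to a `(snr_db, true_label, predicted_label)` triple; that record layout and the function name `accuracy_by_snr` are illustrative assumptions, not part of the dataset's format.

```python
from collections import defaultdict


def accuracy_by_snr(records):
    """Compute classification accuracy separately for each labeled SNR.

    records: iterable of (snr_db, true_label, predicted_label) triples.
    Returns a dict mapping snr_db -> fraction of correct predictions,
    ordered by increasing SNR.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for snr_db, true_label, predicted_label in records:
        total[snr_db] += 1
        if true_label == predicted_label:
            correct[snr_db] += 1
    return {snr: correct[snr] / total[snr] for snr in sorted(total)}


# Toy records for illustration only; these are not results from the dataset.
example = [
    (-10, "A", "B"),
    (-10, "A", "A"),
    (0, "B", "B"),
    (0, "A", "A"),
]
print(accuracy_by_snr(example))
```

Keeping the per-SNR counts separate avoids letting the easier high-SNR examples mask poor performance at low SNR.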

This dataset was used for the “Convolutional Radio Modulation Recognition Networks” and “Unsupervised Representation Learning of Structured Radio Communications Signals” papers, found on our Publications Page.

There are three variations within this dataset with the following characteristics and labeling:

© 2024 DeepSig Inc. All rights reserved.
