Large Scale Radio Frequency Signal Classification
Abstract
Existing datasets used to train deep learning models
for narrowband radio frequency (RF) signal
classification lack the diversity in signal types
and channel impairments needed to assess
model performance in the real world. We introduce
the Sig53 dataset consisting of 5 million
synthetically generated samples from 53 different
signal classes and expertly chosen impairments.
We also introduce TorchSig, a signal processing
machine learning toolkit that can be used to
generate this dataset. TorchSig incorporates data
handling principles that are common to the vision
domain, and it is meant to serve as an open-source
foundation for future signals machine learning research.
Initial experiments using the Sig53 dataset
are conducted using state-of-the-art (SoTA) convolutional
neural networks (ConvNets) and Transformers.
These experiments reveal that Transformers
outperform ConvNets without the need for
additional regularization or a ConvNet teacher,
which is contrary to results from the vision domain.
Additional experiments demonstrate that
TorchSig’s domain-specific data augmentations facilitate
model training, which ultimately benefits
model performance. Finally, TorchSig supports
on-the-fly synthetic data creation at training time,
thus enabling massive scale training sessions with
virtually unlimited datasets.
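
To make the idea of on-the-fly synthetic data creation concrete, the following is a minimal sketch in plain Python. It does not use the TorchSig API. The names `CONSTELLATIONS`, `synth_example`, `stream`, and `take_batch` are hypothetical, as are the two-class label set, the 0 to 20 dB SNR range, and the choice of impairments (a random phase offset plus additive white Gaussian noise). It also simplifies to one sample per symbol with no pulse shaping. The point it illustrates is that each example is generated at the moment it is requested, so the effective dataset size is bounded only by training time.

```python
import cmath
import itertools
import math
import random
from typing import Iterator, List, Tuple

# Hypothetical two-class label set for illustration; Sig53 defines 53 classes.
CONSTELLATIONS = {
    "bpsk": [1 + 0j, -1 + 0j],
    "qpsk": [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)],
}

Example = Tuple[List[complex], str]


def synth_example(rng: random.Random, num_symbols: int = 128) -> Example:
    """Create one labeled example: random unit-power symbols plus impairments."""
    label = rng.choice(sorted(CONSTELLATIONS))
    points = CONSTELLATIONS[label]
    snr_db = rng.uniform(0.0, 20.0)          # assumed SNR range
    phase = rng.uniform(0.0, 2.0 * math.pi)  # random carrier phase offset
    # Per-component noise std so that total noise power is 10^(-SNR/10)
    # relative to unit signal power.
    sigma = math.sqrt(10.0 ** (-snr_db / 10.0) / 2.0)
    rotation = cmath.exp(1j * phase)
    iq = [
        rng.choice(points) * rotation
        + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        for _ in range(num_symbols)
    ]
    return iq, label


def stream(seed: int) -> Iterator[Example]:
    """Endless stream of fresh examples; nothing is read from or written to disk."""
    rng = random.Random(seed)
    while True:
        yield synth_example(rng)


def take_batch(source: Iterator[Example], batch_size: int) -> List[Example]:
    """Pull the next batch_size examples from the stream."""
    return list(itertools.islice(source, batch_size))
```

A training loop would call `take_batch(stream(seed), batch_size)` repeatedly on a single stream object. Fixing the seed makes a run reproducible, while changing it yields an entirely new set of examples.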