Abstract: |
Source separation problems are a long-standing and well-studied challenge in signal processing and information sciences. The “Cocktail Party Phenomenon” and other classical source separation problems are vector representable and additive, and thus solvable by well-established linear algebra techniques. However, the proliferation and adoption of Internet-connected devices (e.g., IoT and distributed sensor networks) have led to a “Cambrian explosion” of data. Much of this data cannot be readily processed because it includes data objects that are categorical or non-additive superpositions (i.e., data not confined to signals). The Data Deconflation Problem refers to the challenge of identifying and separating the individual constituent elements of these complex data objects. Real-world data deconflation scenarios include pattern-of-life tracking (e.g., identifying recreational activities in conjunction with a business trip), multi-target tracking (e.g., occlusions and track assignment challenges), and network situational awareness (e.g., monitoring NATed network traffic, detecting and identifying shadow IT, network steganalysis). This paper details our approach to solving the Data Deconflation Problem, which uses Generative Adversarial Networks (GANs) and attention-based Transformers, as well as our experimental application of it to network situational awareness tasks.
We cover traditional source separation solutions and explain why they are inadequate for network monitoring tasks. Background information on GANs and Transformers is presented before a description of our architecture and of initial experimentation, which serves as a proof of concept. We then describe experiments applying our methodology to network monitoring tasks, in particular separating activities and shadow IT devices within double-NATed network traffic. Finally, we discuss our results and our methodology’s applicability to other network monitoring tasks, such as network steganalysis and covert channel detection.