Compromised through Compression: Python source code for DLMS compression privacy analysis & graphing

Python code (for Python 3.9 and Pandas 1.3.2) to generate the results used in "Compromised through Compression: Privacy Implications of Smart Meter Traffic Analysis".

Smart metering comes with risks to privacy. One concern is the possibility of an attacker seeing the traffic that reports the energy use of a household and deriving private information from it. Encryption helps to mask the actual energy measurements, but is not sufficient to cover all risks. One aspect which has so far gone unexplored, and where encryption does not help, is traffic analysis, i.e. whether the length of messages communicating energy measurements can leak privacy-sensitive information to an observer. In this paper we examine whether using encodings or compression for smart metering data could potentially leak information about household energy use. Our analysis is based on the real-world energy use data of approximately 80 Dutch households.

We find that traffic analysis could reveal information about the energy use of individual households if compression is used. As a result, when messages are sent daily, an attacker performing traffic analysis would be able to determine when all the members of a household are away or not using electricity for an entire day. We demonstrate this issue by recognizing when households from our dataset were on holiday. If messages are sent more often, more granular living patterns could likely be determined.

We propose a method of encoding the data that is nearly as effective as compression at reducing message size, but does not leak the information that compression leaks. Because this encoding does not require compression to achieve the best possible data savings, the risk of traffic analysis is eliminated.

This code operates on the relative energy measurements from the "Zonnedael dataset" from Liander N.V. This dataset needs to be obtained separately; see the instructions accompanying the code. The code transforms the dataset into absolute measurements such as would be taken by a smart meter. It then generates batch messages covering 24-hour periods starting at midnight, similar to how the Dutch infrastructure batches daily meter readings, in the different possible encodings, with and without compression applied. For an explanation of the different encodings, see the paper. The code then provides statistics on the efficiency of encoding and compression for the entire dataset, and attempts to find the periods of multi-day absences for each household. It also generates the graphs in the style used in the paper and presentation.
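The pipeline described above (relative measurements turned into absolute meter readings, batched into daily messages, then compressed) can be illustrated with a small standalone sketch. This is not the code from the deposit and it does not reproduce the DLMS encodings from the paper: the data is synthetic, the 15-minute reading interval is an assumption, the message layout (plain 4-byte big-endian integers) is hypothetical, and zlib merely stands in for whichever compression is applied. The names to_absolute and daily_messages are made up for this illustration.

```python
import random
import struct
import zlib
from itertools import accumulate

READINGS_PER_DAY = 96  # assumption: one reading every 15 minutes


def to_absolute(relative_wh, start_wh=0):
    """Turn per-interval consumption into cumulative meter readings."""
    # accumulate() yields the starting value first; drop it so that each
    # entry is the meter reading at the end of its interval.
    return list(accumulate(relative_wh, initial=start_wh))[1:]


def daily_messages(absolute_wh):
    """Pack each full day of readings into one message.

    Hypothetical layout: every reading is a 4-byte big-endian unsigned
    integer. Incomplete trailing days are ignored.
    """
    messages = []
    last_start = len(absolute_wh) - READINGS_PER_DAY
    for i in range(0, last_start + 1, READINGS_PER_DAY):
        day = absolute_wh[i:i + READINGS_PER_DAY]
        messages.append(struct.pack(f">{len(day)}I", *day))
    return messages


# Synthetic data: an active day, a day with no consumption, an active day.
rng = random.Random(0)
active_day = [rng.randint(20, 400) for _ in range(READINGS_PER_DAY)]
absent_day = [0] * READINGS_PER_DAY
relative = active_day + absent_day + active_day

messages = daily_messages(to_absolute(relative, start_wh=1_000_000))
for day, message in enumerate(messages):
    # Columns: day index, uncompressed length, compressed length.
    print(day, len(message), len(zlib.compress(message)))
```

In this sketch every uncompressed message has the same length, so message size alone says nothing about the day. After compression, the day without consumption consists of one repeated reading and shrinks far more than the active days, which is the kind of length difference an observer of encrypted traffic could still see.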

Identifier
DOI https://doi.org/10.17026/dans-2by-bna3
Metadata Access https://phys-techsciences.datastations.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.17026/dans-2by-bna3
Provenance
Creator P.J.M. van Aubel; E. Poll
Publisher DANS Data Station Phys-Tech Sciences
Contributor RU Radboud University
Publication Year 2021
Rights BSD-2-Clause; info:eu-repo/semantics/openAccess; http://opensource.org/licenses/BSD-2-Clause
OpenAccess true
Contact RU Radboud University
Representation
Resource Type Dataset
Format text/xml; text/plain; application/zip; text/markdown
Size 5795; 626; 12920; 20542; 792
Version 2.0
Discipline Construction Engineering and Architecture; Engineering; Engineering Sciences