Tweets used to study reports of food fraud related to fish products 2018

Data collected from the Twitter social media platform (8 June 2018 - 22 June 2018) to study reports of food fraud related to fish products in posts originating in the UK. The dataset contains Tweet IDs and the keywords used to search for Tweets via programmatic access to the public Twitter API. The keywords were generated using a machine learning tool and consisted of combinations of terms related to fish and terms related to fake.

Social media and other forms of online content have enormous potential as a way to understand people's opinions and attitudes, and as a means to observe emerging phenomena such as disease outbreaks. How might policy makers use such new forms of data to better assess existing policies and help formulate new ones? This one-year demonstrator project is a partnership between computer science academics at the University of Aberdeen and officers from Food Standards Scotland which aims to answer this question. Food Standards Scotland is the public-sector food body for Scotland created by the Food (Scotland) Act 2015. It regularly provides policy guidance to ministers in areas such as food hygiene monitoring and reporting, food-related health risks, and food fraud.

The project will develop a software tool (the Food Sentiment Observatory) that will be used to explore the role of data from sources such as Twitter, Facebook, and TripAdvisor in three policy areas selected by Food Standards Scotland:
- attitudes to the differing food hygiene information systems used in Scotland and the other UK nations;
- study of a historical E. coli outbreak to understand the effectiveness of monitoring and decision-making protocols;
- understanding the potential role of social media data in responding to new and emerging forms of food fraud.

The Observatory will integrate a number of existing software tools (developed in our recent research) to allow us to mine large volumes of data to identify important textual signals, extract opinions held by individuals or groups, and crucially, to document these data processing operations - to aid transparency of policy decision-making. Given the amount of noise appearing in user-generated online content (such as fake restaurant reviews) it is our intention to investigate methods to extract meaningful and reliable knowledge, to better support policy making.

The search for relevant data content was performed using a custom-built data collection module within the Observatory platform (see Related Resources). A public API provided by Twitter was used to gather all social media messages (Tweets) matching a specific set of keywords. Each line in the fish-keywords.txt file (group 1) and in the fake-keywords.txt file (group 2) contains a search keyword/phrase. A list of search terms was then created from all possible combinations of individual keywords/phrases from group 1 and group 2. A matching Tweet returned by the search had to include at least one such combination. The search string used by the API was therefore constructed as follows:

(group 1 keyword group 2 keyword) OR (group 1 keyword group 2 keyword) OR ...

Note: the space between the two keywords inside each pair of parentheses represents a logical AND in the Twitter API service.

The Twitter API allows historical searches to be restricted to Tweets associated with a specific location; however, the location can only be specified as a radius around a given latitude/longitude point. We used Twitter's geo-restricted search by defining a Lat/Long point and a radius (in kilometres). In order to cover the major areas of the UK we used the following four geo-restrictions:
- Latitude = 57.334942, Longitude = -4.395858, Radius = 253 km
- Latitude = 55.288000, Longitude = -2.374374, Radius = 282 km
- Latitude = 52.250808, Longitude = -0.660507, Radius = 198 km
- Latitude = 51.953880, Longitude = -2.989608, Radius = 198 km
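
The keyword-combination and geo-restriction steps described above can be illustrated with a short Python sketch. It is not the Observatory's actual collection module: the function names (load_keywords, build_query, geocode_param) are hypothetical, keywords/phrases are inserted into the query exactly as they appear in the files, and the "latitude,longitude,radius" string follows the geocode format of Twitter's standard search API, which is assumed here to be the form that was used.

```python
from itertools import product


def load_keywords(path):
    """Read one keyword or phrase per line, skipping blank lines (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def build_query(group1, group2):
    """Pair every group 1 term with every group 2 term.

    Inside each pair the space acts as a logical AND; the pairs are joined with OR.
    """
    pairs = (f"({first} {second})" for first, second in product(group1, group2))
    return " OR ".join(pairs)


# The four geo-restrictions listed in the description:
# (latitude, longitude, radius in km). Coordinates are kept as strings
# so that they are passed on exactly as published.
GEO_RESTRICTIONS = [
    ("57.334942", "-4.395858", 253),
    ("55.288000", "-2.374374", 282),
    ("52.250808", "-0.660507", 198),
    ("51.953880", "-2.989608", 198),
]


def geocode_param(latitude, longitude, radius_km):
    """Format one geo-restriction as 'latitude,longitude,radius' with a km unit (assumed format)."""
    return f"{latitude},{longitude},{radius_km}km"


if __name__ == "__main__":
    # Illustrative terms only; the real ones are in fish-keywords.txt and fake-keywords.txt,
    # e.g. build_query(load_keywords("fish-keywords.txt"), load_keywords("fake-keywords.txt")).
    print(build_query(["cod", "salmon"], ["fake", "mislabelled"]))
    for restriction in GEO_RESTRICTIONS:
        print(geocode_param(*restriction))
```

Under these assumptions, the same query string would be submitted once per geo-restriction, and the Tweets returned for the four overlapping circles would then be de-duplicated by Tweet ID.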

Identifier
DOI https://doi.org/10.5255/UKDA-SN-853378
Metadata Access https://datacatalogue.cessda.eu/oai-pmh/v0/oai?verb=GetRecord&metadataPrefix=oai_ddi25&identifier=798c06a445488d3ab35c5f4817015662e03fbc515f11e96da61f561280976781
Provenance
Creator Edwards, P, University of Aberdeen; Markovic, M, University of Aberdeen; Petrunova, N, University of Aberdeen; Chenghua, L, University of Aberdeen; Corsar, D, University of Aberdeen
Publisher UK Data Service
Publication Year 2018
Funding Reference Economic and Social Research Council
Rights Peter Edwards, University of Aberdeen; The Data Collection is available to any user without the requirement for registration for download/access.
OpenAccess true
Representation
Resource Type Text
Discipline Jurisprudence; Law; Social and Behavioural Sciences
Spatial Coverage United Kingdom