As TrustServista becomes more and more proficient at detecting fake news generated by professional online scammers or politically-biased groups, one area of mass misinformation still poses problems not only to automated AI engines, but also to the most skilled human analysts and investigators: geopolitical propaganda.
While fake news generally follows a series of patterns that can be identified rather easily by our machine learning trained algorithms (very specific punctuation, a writing style that aims to provoke, and a lack of factual information), geopolitical propaganda is elaborate: it is written in a style similar to that of news agencies, can skillfully interweave true and fake facts, or uses information that is very hard to verify.
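To give a rough idea of the kind of surface cues involved, the snippet below extracts a few stylistic signals of the sort mentioned above. It is only an illustrative sketch; the feature names and choices are assumptions, not TrustServista's actual feature set.

```python
import re

def style_features(text: str) -> dict:
    """Surface-level cues of the kind mentioned above
    (illustrative only; not TrustServista's actual feature set)."""
    words = text.split()
    n = max(len(words), 1)
    return {
        # heavy punctuation and ALL-CAPS words often signal provocation
        "exclamation_ratio": text.count("!") / n,
        "caps_ratio": sum(w.isupper() and len(w) > 2 for w in words) / n,
        # rough proxy for factual density: tokens containing digits (dates, counts)
        "digit_token_ratio": sum(any(c.isdigit() for c in w) for w in words) / n,
        # direct second-person address ("you won't believe...")
        "second_person_ratio": len(re.findall(r"\byou(?:r)?\b", text, re.I)) / n,
    }
```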
Geopolitical propaganda works best in conflict zones. With little to no independent reporters on the ground, most information comes from the belligerent parties, military officials, local sources that are most of the time anonymous, or social media accounts that cannot be verified. The result is a flow of information that is neither true nor false, but almost always untrustworthy or unverifiable. News about the Syrian conflict, for example, is most of the time contradictory, depending on its source: news outlets loyal to president Assad, Russian state-owned media, Coalition officials and the news agencies using information exclusively from them, and the different factions heavily relying on social media to spread their message.
One example we investigated recently was the claim that the US-led Coalition in Syria had evacuated 20 ISIS military leaders from Deir ez-Zor ahead of the Syrian Armed Forces retaking the town.
As the same information was also investigated by the Atlantic Council’s Digital Forensic Lab (https://medium.com/dfrlab/questionable-sources-on-syria-36fcabddc950), TrustServista was put to the task of tracking the information to its original source and verifying it using its content quality analysis, or TrustLevel for short.
TrustServista was able to automatically generate all the links between the articles covering this topic, tracking the information back to two sources (a simplified sketch of this tracing appears below):
- The D24 website on 25th August 2017. The source of the information was an anonymous correspondent: “A D24 correspondent said that…”.
- A later article from the Russian Sputnik news agency (7th September 2017), which again uses anonymous sources (“A military and diplomatic source has told Sputnik…”) to expand on the information provided by D24: https://sputniknews.com/middleeast/201709071057182244-daesh-deir-ez-zor-us-evacuation/.
These two articles are the most referenced, either directly or through intermediary webpages. Both contain information that is very difficult to verify and should be regarded as untrustworthy, because it is not backed up by images, videos or other independent witnesses.
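TrustServista's link tracing runs on its own article graph; the sketch below only illustrates the general idea, assuming we already have, for each article, the list of URLs it references. The reblog URLs and the D24 address are hypothetical placeholders, not real links.

```python
def trace_origins(references: dict) -> list:
    """references maps {article_url: [urls it cites]}.
    Returns the URLs that cite nothing themselves: the likely
    origins of the story."""
    cited_somewhere = {url for cited in references.values() for url in cited}
    all_urls = set(references) | cited_somewhere
    return sorted(url for url in all_urls if not references.get(url))

# Hypothetical reference graph for this story:
refs = {
    "https://example.com/reblog-1": [
        "https://sputniknews.com/middleeast/201709071057182244-daesh-deir-ez-zor-us-evacuation/"
    ],
    "https://example.com/reblog-2": ["https://d24.example/original-report"],
    "https://sputniknews.com/middleeast/201709071057182244-daesh-deir-ez-zor-us-evacuation/": [
        "https://d24.example/original-report"
    ],
}
print(trace_origins(refs))  # -> ['https://d24.example/original-report']
```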
However, the automatic quality scoring revealed that most of the articles found to be part of this story had a high score (above 70%). No clickbait techniques were used, the content was loaded with factual information, and in some cases the writing style was even revealed to have a neutral sentiment.
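The exact TrustLevel formula is not public; as a toy illustration of how independent quality signals can add up to a score above 70% even for well-crafted propaganda, consider the sketch below. The signal names and weights are assumptions made for the example.

```python
def content_quality_score(signals: dict) -> float:
    """Toy 0-100 quality score; signal names and weights are
    illustrative assumptions, not the actual TrustLevel formula."""
    weights = {
        "no_clickbait": 0.35,       # 1.0 when no clickbait patterns are found
        "factual_density": 0.35,    # share of sentences carrying factual claims
        "neutral_sentiment": 0.30,  # 1.0 when sentiment is close to neutral
    }
    return round(100 * sum(w * signals.get(k, 0.0) for k, w in weights.items()), 1)

# A well-written propaganda piece can score high on all three signals:
print(content_quality_score({
    "no_clickbait": 1.0,
    "factual_density": 0.8,
    "neutral_sentiment": 1.0,
}))  # -> 93.0
```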
The challenge of automatically detecting this type of content therefore lies not so much in analyzing the writing style as in tracing the origin of the information. As the example above shows, if the original URL can be detected accurately and linked to all its siblings, and the information source falls within a pattern of relying on anonymous sources, then propaganda detection is no longer such a difficult task.
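As a minimal sketch of what such an origin-based signal could look like, the snippet below flags articles whose sourcing matches typical anonymous-attribution phrases. The pattern list is an assumption made for illustration, not TrustServista's production rule set.

```python
import re

# Phrases that typically signal unverifiable, anonymous sourcing
# (illustrative list; a real system would curate or learn many more).
ANONYMOUS_SOURCE_PATTERNS = [
    r"\ba (?:\w+ ){0,3}correspondent said\b",
    r"\ba (?:military and diplomatic|security|government) source\b",
    r"\bsources? (?:told|said|claimed)\b",
    r"\bon condition of anonymity\b",
]

def relies_on_anonymous_sources(text: str) -> bool:
    """True when the article text matches any anonymous-sourcing pattern."""
    return any(re.search(p, text, re.IGNORECASE) for p in ANONYMOUS_SOURCE_PATTERNS)

print(relies_on_anonymous_sources(
    "A military and diplomatic source has told Sputnik that..."
))  # -> True
```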
TrustServista is in the process of creating a “propaganda recipe” based on a set of relevant examples and will release an improved version of its Content Quality algorithm, which will embed scoring specific to propaganda content.