Terrorism Dictionary 2014

After seeing the mass media’s strong response to the extremists’ attack on Charlie Hebdo, I started thinking about what I could do for this increasingly important topic. One simple contribution is a dictionary of keywords related to terrorism, so I created the Terrorism Dictionary 2014. The dictionary was built from newswire stories published by the Associated Press and Agence France-Presse in 2014, using the collocation-of-collocation technique.

The following top-30 keywords include many nasty words, although one proper noun (al-qaida) slipped in because named entity recognition failed.

terrorist          854.315998
terrorism          588.953735
terror             481.302104
terrorists         218.965274
attacks            210.076406
group              166.118157
groups             149.240685
militant           119.637842
murder             116.232324
criminal           110.881245
charges            110.290402
extremist          103.736008
organisation       98.508253
jihadist           97.517677
jihadists          91.178175
al-qaida           86.847945
threat             83.710184
acts               82.083663
organization       82.043942
strikes            81.820983
militants          81.399320
violence           78.733125
violent            73.914122
terrorism-related  73.734030
guilty             73.433767
attack             69.712507
fight              66.956416
extremists         64.436517
links              63.727690
charged            63.267324

This list of keywords can be used to find news stories or Twitter posts about terrorism. For example, if an item contains more than three of the top 100 keywords in the dictionary, it is very likely to be about terrorism.
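The rule of thumb above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `TOP_KEYWORDS` is a hypothetical excerpt holding the ten highest-scoring entries from the list rather than the full top 100, the function names are made up for this example, and counting distinct keywords (rather than every occurrence) is my reading of "more than three of the keywords".

```python
import re

# Hypothetical excerpt: the ten highest-scoring entries from the list above.
# A real filter would load the top 100 keywords of the dictionary.
TOP_KEYWORDS = {
    "terrorist", "terrorism", "terror", "terrorists", "attacks",
    "group", "groups", "militant", "murder", "criminal",
}


def count_keywords(text, keywords=TOP_KEYWORDS):
    """Count how many distinct dictionary keywords occur in the text."""
    # Lowercase and keep hyphenated tokens such as "terrorism-related" whole.
    tokens = set(re.findall(r"[a-z][a-z\-]*", text.lower()))
    return len(tokens & keywords)


def is_about_terrorism(text, keywords=TOP_KEYWORDS, threshold=3):
    """Flag the text when it contains more than `threshold` distinct keywords."""
    return count_keywords(text, keywords) > threshold
```

For example, `is_about_terrorism("The terrorist group claimed the attacks, and the militant was charged with murder.")` matches five keywords and is flagged, while a sentence that only mentions a "group" is not.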

Left-right policy position dictionary

Latent Semantic Scaling (LSS) works well not only with positive-negative sentiment but also with left-right positions on economic policy. The seed words for this dimension are {deficit, austerity, unstable, recession, inflation, currency, workforce} for the right and {poor, poverty, free, benefits, prices, money, workers} for the left.
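To give a rough idea of how seed words can turn into word scores, here is a much simplified sketch and not the actual LSS implementation. It assumes a plain term-document count matrix, a truncated SVD as the latent semantic space, and a score defined as the mean cosine similarity to one seed set minus the mean cosine similarity to the other. The function name `seed_polarity`, the parameter `k` and the choice to score the first seed set as positive are all assumptions made for this example.

```python
import numpy as np

# Seed words from the text above; the first set is scored as positive here.
RIGHT_SEEDS = ["deficit", "austerity", "unstable", "recession",
               "inflation", "currency", "workforce"]
LEFT_SEEDS = ["poor", "poverty", "free", "benefits",
              "prices", "money", "workers"]


def seed_polarity(docs, pos_seeds=RIGHT_SEEDS, neg_seeds=LEFT_SEEDS, k=2):
    """Score every word in `docs` (a list of token lists) against two seed sets."""
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}

    # Term-document count matrix: one row per word, one column per document.
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for w in doc:
            counts[index[w], j] += 1

    # Truncated SVD gives each word a k-dimensional vector.
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    vecs = u[:, :k] * s[:k]

    # Normalise rows so that dot products are cosine similarities.
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    vecs = vecs / np.where(norms == 0, 1.0, norms)

    def mean_similarity(seeds):
        rows = [index[w] for w in seeds if w in index]
        if not rows:
            return np.zeros(len(vocab))
        return (vecs @ vecs[rows].T).mean(axis=1)

    scores = mean_similarity(pos_seeds) - mean_similarity(neg_seeds)
    return {w: float(scores[index[w]]) for w in vocab}
```

In a toy corpus where "cuts" only co-occurs with "deficit" and "welfare" only co-occurs with "poverty", "cuts" receives a positive score and "welfare" a negative one.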

The left-right policy position dictionary was created from a corpus of UK and Irish news from 1996-1997. The first chart replicates the Wordscores paper by Benoit and Laver; black and red letters represent Irish and UK parties.

UK and IE 1997 manifestos

The second chart shows the result of machine coding UK party manifestos from 1987 to 2010 with the same dictionary. It shows a clear separation between the leftist and rightist parties until 2005. Why is there no difference between the three parties in 2010? Arguably because, from the perspective of 1990s politics, their economic policies became very similar after the economic crisis.

UK party manifestos 1987-2010

Immigration dictionary

This is probably the final version of my immigration dictionary. This text analysis dictionary was created from a British newspaper corpus using a technique called Latent Semantic Scaling, which is based on Latent Semantic Analysis.

The result of automated content analysis with this dictionary corresponds strongly to manual coding by Amazon Mechanical Turk workers, as you can see in the chart (whiskers represent 95% confidence intervals). Note, however, that the documents coded by the dictionary are only those sentences in the party manifestos that are about immigration, selected by keywords (‘immigra*’, ‘migra*’, ‘refugee*’, ‘asylum*’, ‘foreign*’).
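The keyword selection step can be sketched as follows. This is a minimal illustration, assuming that the patterns are glob-style wildcards matched against lowercased word tokens and that sentences can be split naively on end punctuation. The function names are hypothetical and the original selection may have been done with different tools.

```python
import re
from fnmatch import fnmatchcase

# Glob patterns quoted in the text above.
PATTERNS = ["immigra*", "migra*", "refugee*", "asylum*", "foreign*"]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def mentions_immigration(sentence, patterns=PATTERNS):
    """True if any lowercased word token matches one of the glob patterns."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return any(fnmatchcase(tok, pat) for tok in tokens for pat in patterns)


def select_sentences(text, patterns=PATTERNS):
    """Keep only the sentences that mention immigration according to the patterns."""
    return [s for s in split_sentences(text) if mentions_immigration(s, patterns)]
```

Only the sentences returned by `select_sentences` would then be passed on to the dictionary for scoring.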

UK 2010 manifestos on immigration

The dictionary is made up of 750 entry words. The following are the 30 most positive and the 30 most negative words in the dictionary. Many of them are intuitively positive or negative, but some are not. For example, ‘globalisation’ is positive only in the context of immigration. This is why texts are restricted to sentences on this subject. We can also spot words like ‘species’ and ‘wildebeest’, because the newspaper corpus contains stories about animal migration, but they are not too harmful.

# Positive words

1   skills            100
2   globalisation     88.24
3   chauffeured       86.93
4   airport           86.68
5   ranging           82.41
6   clearance         79.48
7   status            78.4
8   agency            74.98
9   issues            72.15
10  breed             69.45
11  claimed           68.84
12  vehemently        68.6
13  skill             67.3
14  test              65.91
15  attract           64.39
16  permanent         63.68
17  legal             59.23
18  melting-pot       57.34
19  species           57.27
20  wildebeest        56.96
21  overstaying       56.07
22  documents         55.9
23  routes            55.75
24  work              55.63
25  shambles          55.28
26  breeding          53.65
27  bringing          53.24
28  employ            52.76
29  passport          52.24
30  official          51.88


# Negative words

1   xenophobia        -141.27
2   control           -130.09
3   racist            -125.2
4   stemming          -122.5
5   tide              -122.46
6   working-class     -115.53
7   negative          -113.76
8   failure           -110.32
9   problems          -106.95
10  influx            -100.81
11  branded           -99.42
12  caused            -96.82
13  exploit           -94.11
14  first-generation  -90.78
15  warned            -89.93
16  families          -88.51
17  soaring           -86.53
18  ignored           -86.45
19  housed            -85.33
20  magnet            -84.47
21  borders           -83.18
22  newly-arrived     -83.12
23  accused           -82.89
24  evicted           -82.02
25  trickle           -81.42
26  rates             -79.42
27  fuelled           -78.34
28  flooded           -76.69
29  non-white         -76.48
30  lorries           -76.38
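
To show how a weighted dictionary like this can be applied to a sentence, here is a minimal sketch. It assumes that a sentence score is the mean weight of the dictionary words it contains, which is a common convention but not necessarily the exact formula used for the chart above. `WEIGHTS` is a hypothetical excerpt of six entries copied from the two lists, and `score_sentence` is a name made up for this example.

```python
import re

# Hypothetical excerpt of the 750-word dictionary, copied from the lists above.
WEIGHTS = {
    "skills": 100.0,
    "legal": 59.23,
    "work": 55.63,
    "control": -130.09,
    "problems": -106.95,
    "influx": -100.81,
}


def score_sentence(sentence, weights=WEIGHTS):
    """Return the mean weight of matched dictionary words, or None if none match."""
    # Keep hyphenated tokens such as "working-class" whole.
    tokens = re.findall(r"[a-z][a-z\-]*", sentence.lower())
    matched = [weights[tok] for tok in tokens if tok in weights]
    if not matched:
        return None
    return sum(matched) / len(matched)
```

A sentence such as "The influx caused problems" receives a clearly negative score from the two matched entries, while a sentence with no dictionary words is left unscored.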