Russia’s foreign policy priority

Methodological papers are tasteless and boring without nice examples. As an example application of my Newsmap, I downloaded all the news stories published by the ITAR-TASS news agency from 2009 to 2014 in both English and Russian. From a public diplomacy point of view, I was interested in which countries receive the most coverage in […]

Sentence segmentation

I believe that the sentence is the optimal unit of sentiment analysis, but splitting whole news articles into sentences is often tricky because news contains a lot of quotations. If we simply chop up texts at punctuation marks, quoted text gets split across different sentences. This code is meant to avoid such […]
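The idea can be illustrated with a minimal Python sketch. This is not the code from the post: the function name `split_sentences` and the rule it applies (split at `.`, `!` or `?` followed by whitespace, but only outside double quotation marks) are assumptions made for illustration.

```python
def split_sentences(text):
    """Split text at sentence-final punctuation, skipping punctuation inside double quotes.

    Hypothetical helper for illustration only; it handles straight (") and
    curly (“ ”) double quotes and ignores single quotes and apostrophes.
    """
    sentences = []
    start = 0          # index where the current sentence begins
    in_quote = False   # True while we are between an opening and a closing quote
    for i, ch in enumerate(text):
        if ch == '"':
            in_quote = not in_quote
        elif ch == '“':
            in_quote = True
        elif ch == '”':
            in_quote = False
        elif ch in ".!?" and not in_quote:
            # Split only when the mark ends the text or is followed by whitespace.
            if i + 1 == len(text) or text[i + 1].isspace():
                sentences.append(text[start:i + 1].strip())
                start = i + 1
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)  # keep any trailing text without final punctuation
    return sentences


example = 'The minister said "We agree. We will sign" on Monday. Markets rose.'
for sentence in split_sentences(example):
    print(sentence)
```

Here the period after "We agree" sits inside the quotation, so the quoted text stays in one sentence. A known limitation of this sketch is that a quotation ending with a period before the closing quotation mark is kept attached to whatever follows it.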

Nexis news importer updated

I posted the code for the Nexis importer last year, but it turned out that the HTML format of the database service is less consistent than I thought, so I changed the logic. The new version depends less on the structure of the HTML files and more on the format of the content. library(XML) #might need […]

The Latent Semantic Scaling

I have previously posted document scaling results on this blog for different dimensions, such as political left-right and immigration positive-negative, but I did not explain the details of the technique, called Latent Semantic Scaling. LSS is a type of lexicon expansion technique based on Latent Semantic Analysis. Please have a look at […]

Geographical dictionary making technique

My new draft paper Newsmap: Dictionary expansion technique for geographical classification of very short longitudinal texts explains how to create a large geographical dictionary for text classification. Its algorithm is an updated version of the International Newsmap, and it is simpler and more statistically grounded. As I argue in the paper, this technique could […]

International news coding instruction

It is already four years since I created my Newsmap. It is time to update the whole system by fully rewriting it in Python and developing a new classification algorithm. This is why I created a set of 5,000 human-coded international news stories using Prolific Academic. Thanks to crowd-sourcing services, recruiting is no longer a problem, […]

Terrorism Dictionary 2014

After seeing the mass media’s strong response to the extremists’ attack on Charlie Hebdo, I started thinking about what I could do for this increasingly important topic. One simple task is making a dictionary of keywords related to terrorism, so I created the Terrorism Dictionary 2014. This dictionary is made from newswires submitted by the Associated Press […]

Left-right policy position dictionary

The Latent Semantic Scaling (LSS) works well not only with positive-negative sentiment but also with left-right position on economic policy. The seed words for this dimension are {deficit, austerity, unstable, recession, inflation, currency, workforce} for the right and {poor, poverty, free, benefits, prices, money, workers} for the left. The left-right policy position dictionary was created from UK […]
