TL;DR

What is this all about? In this short post I'll give you an introduction to Twint Desktop. But let's start from the beginning: what is Twint?

Twint, formerly known as Tweep, is an advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.

Twint utilizes Twitter's search operators to let you scrape Tweets from specific users, scrape Tweets relating to certain topics, hashtags & trends, or sort out sensitive information from Tweets like e-mail and phone numbers. I find this very useful, and you can get really creative with it too.

Twint also makes special queries to Twitter that let you scrape a Twitter user's followers, the Tweets a user has liked, and who they follow, without any authentication, API, Selenium, or browser emulation.

You can find more information in the GitHub repository.

Twint never had an application of its own for visualizing data. I had already made a dashboard for Kibana, but that is not suitable in some cases and it requires some effort from the user. That's why I started building Twint Desktop, a solution with batteries included.

In short, Twint Desktop lets you graph data in four basic charts:

  • bar chart, shows weekly activity;

  • line chart, shows daily activity;

  • radar chart, shows top N hashtags;

  • doughnut chart, shows top N users.

N is temporarily fixed to 15 and will be customizable in the next release, among other settings.
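
To give an idea of the counting behind the bar and line charts, here is a minimal sketch and not Twint Desktop's actual code. It assumes "weekly activity" means tweets per day of the week and "daily activity" means tweets per hour of the day; that reading, the `activity` helper and the `created_at` key are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical sample records; the key names are assumed, not taken from Twint.
tweets = [
    {"username": "user1", "created_at": "2019-03-04 09:15:00"},  # a Monday
    {"username": "user1", "created_at": "2019-03-04 18:40:00"},  # same Monday
    {"username": "user2", "created_at": "2019-03-06 09:05:00"},  # a Wednesday
]

def activity(tweets):
    """Count tweets per weekday name and per hour of the day."""
    per_weekday = Counter()
    per_hour = Counter()
    for tweet in tweets:
        ts = datetime.strptime(tweet["created_at"], "%Y-%m-%d %H:%M:%S")
        per_weekday[WEEKDAYS[ts.weekday()]] += 1  # weekday(): 0 is Monday
        per_hour[ts.hour] += 1
    return per_weekday, per_hour

weekly, daily = activity(tweets)
```

With the sample data, `weekly` holds two tweets for Monday and one for Wednesday, while `daily` holds two tweets at hour 9 and one at hour 18.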

Advanced visualizations:

  • graph connections (user <-> hashtag);

  • map to view where a tweet was tweeted and who tweeted in that place (if the information is available);

  • word cloud to see which words are used the most.

After this brief introduction, let's dive into it!

0x00 - First things first

Do you have an Elasticsearch instance? If not, get one, because for now Twint Desktop gets data only from Elasticsearch.

0x01 - Setup

At this first stage we are releasing the application only for Debian-based OSes (like Ubuntu). The stable version will support Windows and macOS as well.

Let's start by downloading Twint Desktop and installing it.

0x02 - Initialization

Search for twint-desktop and open it; you will see this window:

Click the database icon and fill in the fields. Here is an example:

Twint is shipped with some default values, which are:

  • protocol: http;

  • hostname: localhost;

  • port: 9200;

  • index name: twinttweets;

  • type: items.
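
As a rough illustration of how these values fit together, the sketch below combines them into the address of the Elasticsearch search endpoint. The `EsSettings` class and its method are hypothetical names, not part of Twint Desktop. The `/_search` path on an index is standard Elasticsearch; how the application actually builds its requests, and how it uses the type, is not covered here.

```python
from dataclasses import dataclass

@dataclass
class EsSettings:
    """Hypothetical container for the connection fields listed above."""
    protocol: str = "http"
    hostname: str = "localhost"
    port: int = 9200
    index_name: str = "twinttweets"
    doc_type: str = "items"  # kept for completeness; not used in the URL below

    def search_url(self) -> str:
        # <protocol>://<hostname>:<port>/<index>/_search
        return f"{self.protocol}://{self.hostname}:{self.port}/{self.index_name}/_search"

defaults = EsSettings()
print(defaults.search_url())  # http://localhost:9200/twinttweets/_search
```

Overriding a single field, for example `EsSettings(hostname="es.local")`, keeps the other defaults untouched.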

0x03 - Fetch some data

In this early stage of Twint Desktop there are just a few params; there will be more, don't worry!

The field names are self-explanatory. What is worth noting is that we can filter tweets based on:

  • who tweeted;

  • which hashtags they contain;

  • which words they contain (full-text search).

On top of that, every field accepts multiple values, since the params are parsed as CSV. If we type “Google,Microsoft” (without quotation marks) in the Username field, we will get every (stored) tweet sent by either Google or Microsoft. The same approach works for the other two fields.
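
To show how comma-separated input could translate into a search, here is a minimal sketch that splits each field and builds an Elasticsearch query body. It is an illustration under assumptions, not the query Twint Desktop really sends: the helper names, the document field names (`username`, `hashtags`, `tweet`) and the choice to combine the three fields with AND are all mine. The `bool`, `terms` and `match` clauses themselves are standard Elasticsearch Query DSL.

```python
def split_csv(value):
    """Split 'a, b,,c' into ['a', 'b', 'c'], dropping empty entries."""
    return [part.strip() for part in value.split(",") if part.strip()]

def build_query(usernames="", hashtags="", words=""):
    """Build a query body from the three CSV text fields (assumed field names)."""
    must = []
    users = split_csv(usernames)
    tags = split_csv(hashtags)
    terms = split_csv(words)
    if users:
        # "terms" matches documents whose field equals ANY of the values
        must.append({"terms": {"username": users}})
    if tags:
        must.append({"terms": {"hashtags": tags}})
    if terms:
        # "match" runs a full-text search on the tweet text
        must.append({"match": {"tweet": " ".join(terms)}})
    return {"query": {"bool": {"must": must}}}

query = build_query(usernames="Google,Microsoft")
```

The `terms` clause is what gives the "either Google or Microsoft" behavior: one matching value is enough for a tweet to be returned.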

At the end of the rendering process, this could be a result:

Please note that only the radar chart has a single, constant color. The other charts use random colors, but each user keeps the same color across every chart: if user1 has blue bars in the bar chart, their line in the line chart and their slice in the doughnut chart will be blue as well.
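
One simple way to get "random, but the same everywhere" is to pick a color the first time a user is seen and remember it. The sketch below illustrates that idea; the `ColorRegistry` class is a hypothetical name and this is not necessarily how Twint Desktop does it.

```python
import random

class ColorRegistry:
    """Hand out one random hex color per user and reuse it afterwards."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)  # a seed makes runs reproducible
        self._colors = {}

    def color_for(self, username):
        if username not in self._colors:
            # Pick a random 24-bit value and format it as #rrggbb
            self._colors[username] = "#{:06x}".format(self._rng.randrange(0x1000000))
        return self._colors[username]

registry = ColorRegistry()
bar_color = registry.color_for("user1")       # used by the bar chart
line_color = registry.color_for("user1")      # same color for the line chart
doughnut_color = registry.color_for("user1")  # and for the doughnut chart
```

Since every chart asks the same registry, a user can never end up with two different colors in one session.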

For now the charts are interactive but not interconnected:

0x04 - Graph Explorer

The graph shows connections between users and the hashtags they used. To open it, just click on the fourth icon from the top:

A known bug is that the graph may not render the first time you open it. In that case, just close and reopen it. The result will be a 3D network like this:

Known issue: if you resize the window, the graph does not fill it completely. Workaround: reload the window by pressing CTRL-R; it takes only a few seconds, even for large graphs (4k nodes).

Please note that it seems to be an issue with the library that I'm using, not width or height CSS stuff.

0x041 - Graph - Interaction

If you hover over the links, you will see “directional particles”, useful when the graph starts getting bigger and bigger.

If you hover over dots you will see the text field that describes the node (usernames or hashtags).

If you click a node, its size will increase and it will act as a sort of pivot for your graph. This can be useful if you have to rearrange the rendered graph, especially since nodes are draggable.
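
To make the user <-> hashtag graph more concrete, here is a minimal sketch of how tweets could be turned into nodes and links. The nodes/links layout is a common input shape for force-directed graph libraries. The `build_graph` helper, the key names and the "#" prefix on hashtag ids are assumptions for illustration, not Twint Desktop's real data model.

```python
def build_graph(tweets):
    """Return a dict with one node per user/hashtag and one link per pair."""
    nodes = {}     # node id -> kind ("user" or "hashtag")
    links = set()  # a set removes duplicate (user, hashtag) pairs
    for tweet in tweets:
        user = tweet["username"]
        nodes[user] = "user"
        for tag in tweet.get("hashtags", []):
            tag_id = "#" + tag.lstrip("#")  # prefix avoids clashes with usernames
            nodes[tag_id] = "hashtag"
            links.add((user, tag_id))
    return {
        "nodes": [{"id": node_id, "kind": kind} for node_id, kind in sorted(nodes.items())],
        "links": [{"source": src, "target": dst} for src, dst in sorted(links)],
    }

# Hypothetical sample data
sample = [
    {"username": "user1", "hashtags": ["osint", "python"]},
    {"username": "user2", "hashtags": ["osint"]},
    {"username": "user1", "hashtags": ["osint"]},
]
graph = build_graph(sample)
```

For the sample above the graph has four nodes (user1, user2, #osint, #python) and three links, because the repeated user1/#osint pair is stored only once.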

0x05 - Tweets in the map

To open a world map, just click the map icon. If the tweets do not contain geo-coordinates, the map will be blank, since there is nothing to display. Otherwise, the map zooms in on the last tweet plotted. Use your mouse to move around the map and the mouse wheel to zoom in and out.
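
The behavior described above can be sketched roughly as follows: keep only the tweets that carry coordinates, and center the map on the last one. Everything here is an assumption for illustration, including the helper names and a `geo` key holding a "latitude,longitude" string; the real field layout in your index may differ.

```python
def map_points(tweets):
    """Keep only tweets with coordinates and parse them into floats."""
    points = []
    for tweet in tweets:
        geo = tweet.get("geo")
        if not geo:
            continue  # no coordinates: nothing to draw for this tweet
        lat, lon = (float(value) for value in geo.split(","))
        points.append({"username": tweet["username"], "lat": lat, "lon": lon})
    return points

def initial_center(points):
    """Center on the last plotted tweet, or None when the map stays blank."""
    if not points:
        return None
    return (points[-1]["lat"], points[-1]["lon"])

# Hypothetical sample data
sample = [
    {"username": "user1", "geo": "41.9,12.5"},
    {"username": "user2"},
    {"username": "user3", "geo": "48.85,2.35"},
]
points = map_points(sample)
center = initial_center(points)
```

With this sample, user2 is skipped and the map would be centered on user3's tweet.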

Q: How do I display tweets in the map?

A: Make sure you are indexing tweets into an Elasticsearch instance, then pass the --place/--geo argument (with proper values). If you are using Twint as a module, set c.Place or c.Geo instead.

Q: How do I know who tweeted in a specific location?

A: Just click on the square Twitter icon (note that its size does not change when you zoom in or out).

0x06 - WordCloud

Click on the “pencil with paper” icon and a new window will pop up:

The word cloud is re-rendered automatically when you resize the window.
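
Under the hood, a word cloud boils down to word frequencies: the more often a word appears, the bigger it is drawn. Here is a minimal sketch of that counting step. The `word_frequencies` helper, the tokenization rule and the minimum length are my own assumptions, not Twint Desktop's implementation.

```python
import re
from collections import Counter

def word_frequencies(texts, min_length=3):
    """Count lowercase words across all texts, skipping very short ones."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            if len(word) >= min_length:
                counts[word] += 1
    return counts

# Hypothetical sample data
frequencies = word_frequencies(["Twint is great", "twint scrapes tweets"])
```

In this sample "twint" is counted twice regardless of case, while "is" is dropped for being shorter than three characters.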

0x07 - Disabled icons

The last two icons (settings and envelope) are disabled. Twint Desktop is still in development, so for now I change params and other settings directly in the code. Do not worry, though: you will be able to customize every single setting to your needs.

0x08 - General FAQ

Q: I asked for data about 1k users/hashtags, but I just see the top 15. Why?

A: Data is automatically sorted by “weight”. This means that only the 15 users who tweeted the most will be displayed. Likewise, only the 15 most used hashtags will be shown in the radar chart.
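
Sorting by "weight" and keeping the top 15 can be sketched in a few lines. This is only an illustration of the idea: the `top_users` helper and the `username` key are assumptions, and the weight here is simply the number of tweets per user.

```python
from collections import Counter

TOP_N = 15  # currently fixed, as explained above

def top_users(tweets, n=TOP_N):
    """Return the n users with the most tweets as (username, count) pairs."""
    counts = Counter(tweet["username"] for tweet in tweets)
    return counts.most_common(n)  # sorted from most to least active

# Hypothetical data: 20 users, where user<i> has i + 1 tweets
tweets = [{"username": f"user{i}"} for i in range(20) for _ in range(i + 1)]
ranking = top_users(tweets)
```

Even though the sample contains 20 users, `ranking` holds only 15 entries, starting with the most active one. The same counting applied to hashtags would give the data for the radar chart.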

Q: I worked for a while, and now I want to focus on another search without mixing up the results. How do I do that?

A: Just restart Twint Desktop. A simple CTRL-R is enough, but you can close and reopen the application as well. There is no function to reset the charts yet.

Q: I store data in a database and in CSV/JSON files. Will I be able to fetch data from those sources?

A: With the caveat that full-text search is not suitable for those sources, we'll add support for them too.