A pretty snapshot of the Wiki brought to you by the Social Media Observatory at HBI
This list provides an overview of useful data collection tools for research on Twitter. If you face problems or issues with one of the applications on the list, feel free to post an Issue; this helps us maintain the list.
Most of these Twitter tools connect to the official Twitter APIs and therefore need an API key from Twitter. You can retrieve an API key easily; just follow the documentation. You are bound by the restrictions Twitter imposes; you can read about the rate limits here. Version 2 of the API will be more restrictive (at least it looks that way at the time of writing). As an academic, however, you can apply for access to the new academic track (Twitter Academic API Track Application) to raise your limit to 10 million tweets per month and gain access to the ‘historic’ archive, i.e., tweets older than 7 days.
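For orientation, here is a minimal sketch of what raw v2 access looks like once you have credentials: it calls the recent-search endpoint directly with the `requests` library. The query string and the environment variable name are only illustrative assumptions, not recommendations.

```python
# Minimal sketch: query the Twitter API v2 recent-search endpoint with a bearer token.
# Assumes the token is stored in the environment variable TWITTER_BEARER_TOKEN
# (the variable name is our choice, not a Twitter convention).
import os
import requests

BEARER_TOKEN = os.environ["TWITTER_BEARER_TOKEN"]

response = requests.get(
    "https://api.twitter.com/2/tweets/search/recent",
    headers={"Authorization": f"Bearer {BEARER_TOKEN}"},
    params={
        "query": "from:TwitterDev -is:retweet",   # example query
        "max_results": 10,                        # 10-100 per request
        "tweet.fields": "created_at,public_metrics",
    },
    timeout=30,
)
response.raise_for_status()

for tweet in response.json().get("data", []):
    print(tweet["created_at"], tweet["text"][:80])
```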
Some of the tools are scrapers that do not use the official APIs. Please be aware that the use of these tools might violate Twitter’s Terms of Service. Despite being public, Twitter data can be very personal. Make sure to inform yourself thoroughly about the data protection laws and ethical guidelines that apply to your research before starting your data collection.
Tool | API? | Last Tested | Language | Interfaces | Comments |
---|---|---|---|---|---|
Facepager | V1/V2 | 2/2/2021 | - | GUI | No programming needed |
focalevents | Academic | Not tested | Python | CLI | Depends on PostgreSQL |
twacapic | Academic | 7/4/2021 | Python | CLI | Early development |
twarc | V1/V2/Academic | 2/2/2021 | Python | CLI / Python Module | Programming possible |
TwitterAPI | V1/V2/Academic | 2/2/2021 | Python | Python Module | Programming needed |
Twint | Scraper | 2/2/2021 | Python | Python Module | Programming needed |
Twitterscraper | V1 | 2/2/2021 | Python | Python Module | Programming needed |
tweepy | V1/V2 | 2/2/2021 | Python | Python Module | Programming needed |
rtweet | V1 | Not tested | R | R Module | Programming needed |
twitter-explorer | V1/V2/Academic | 19/01/2023 | Python | GUI | No programming needed |
cta-tool | V2/Academic | 13/12/2021 | Python | Python Module | Programming needed; collects and counts conversations; requires MongoDB |
Twitter Downloader | Academic | 25/05/2022 | - | GUI | No programming needed; access to Tweets only |
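To illustrate the "Programming needed" rows, the following is a hedged sketch of a recent-search collection with tweepy (v4.x, which wraps API v2); the query and the requested tweet fields are placeholders chosen for this example.

```python
# Minimal sketch: recent search with tweepy (v4.x, Twitter API v2).
# Assumes a bearer token in the environment variable TWITTER_BEARER_TOKEN.
import os
import tweepy

client = tweepy.Client(bearer_token=os.environ["TWITTER_BEARER_TOKEN"])

response = client.search_recent_tweets(
    query="#openscience -is:retweet lang:en",    # example query
    tweet_fields=["created_at", "public_metrics"],
    max_results=100,
)

for tweet in response.data or []:
    print(tweet.id, tweet.created_at, tweet.text[:80])
```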
DMI-TCAT provides robust and reproducible data capture and analysis, and interlinks with existing analytical software. Captured data sets can be refined in different ways (search queries, exclusions, date range, etc.) and the resulting selections of tweets can be analyzed in various ways, mainly by outputting files in standard formats (CSV for tabular files and GEXF for network files).
The big plus of DMI-TCAT is that it is built around a MySQL database, which can run robustly 24/7 for months or even years. However, setting up DMI-TCAT on a server requires some command-line skills.
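Because DMI-TCAT hands you CSV and GEXF exports, downstream analysis can happen in whatever stack you prefer. Below is a small sketch in Python using pandas and networkx; the file names are placeholders standing in for your actual exports, not DMI-TCAT defaults.

```python
# Minimal sketch: analysing DMI-TCAT exports downstream.
# File names are placeholders for whatever you exported from the web interface.
import pandas as pd
import networkx as nx

# Tabular export (CSV): typically one row per tweet.
tweets = pd.read_csv("tcat_export_tweets.csv")
print(tweets.shape, list(tweets.columns)[:5])

# Network export (GEXF), e.g. a mention or hashtag co-occurrence network.
graph = nx.read_gexf("tcat_export_network.gexf")
print(graph.number_of_nodes(), "nodes,", graph.number_of_edges(), "edges")
```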
Known Issues and Limitations:
Notable Features:
Installation via: Accessible through a web application; no local installation needed.
Known Issues and Limitations:
Notable Features:
Installation via: A Google account is needed to install this sheet.
Notable Features:
Saves the data in JSONL format.
Plots can be exported as .gml/.csv/.gv (see the sketch after the installation steps below).
Installation via: An installation package is available for Windows, Linux, and macOS.
Requires Python 3.6 or above.

```bash
# replace XXX by release number
cd ~/Downloads/twitter-explorer-vXXX
pip install -r requirements.txt
```

After installation, we can collect data using the Streamlit collector:

```bash
streamlit run collector.py
```
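Building on the features listed above (JSONL data, .gml/.csv/.gv plot exports), here is a hedged sketch of reading the collected data and an exported network back into Python; the file names are placeholders, not twitter-explorer defaults.

```python
# Minimal sketch: working with twitter-explorer outputs downstream.
# File names are placeholders; adjust them to whatever the collector/visualizer wrote.
import pandas as pd
import networkx as nx

# The collector saves tweets as JSON Lines (one JSON object per line).
tweets = pd.read_json("collected_tweets.jsonl", lines=True)
print(len(tweets), "tweets loaded")

# Exported interaction networks (.gml) can be loaded with networkx.
graph = nx.read_gml("retweetnetwork.gml")
print(graph.number_of_nodes(), "nodes,", graph.number_of_edges(), "edges")
```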
Known Issues and Limitations:
Notable Features:
Installation via: CRAN
Known Issues and Limitations:
Notable Features:
Installation via: An installation package is available for Windows, Linux, and macOS.