#RESEARCH QUESTION
"Which different categories of crypto currencies can be identified, and what are their characteristics?"
1. Motivation
1.1
Cryptocurrencies have become increasingly popular, and more and more people are investing in them. This project aims to investigate the different categories of cryptocurrencies that exist and their characteristics. As cryptocurrencies are highly volatile, it is interesting to investigate these characteristics and how the prices of different types of cryptocurrencies are affected. The dataset provides users with a clear overview of cryptocurrency categories and their specific characteristics. The research question has high practical relevance, as it gives not only traders but also policy makers insights into the characteristics of different types of cryptocurrencies.
For generating this dataset, multiple relevant websites and APIs were considered. Crypto news and Newsdata.io contain articles that are linked to dates, authors and cryptocurrencies. Their data is only refreshed when articles are published, which makes scraping more difficult. Also, the data on these websites can still be modified after it has been published, which is not the case for Coinmarketcap.com. We therefore chose to use the API of Coinmarketcap.com, where data is refreshed every minute and which is more user friendly. This API fits our research question best and provides good insights into the different types of cryptocurrencies.
1.2
This dataset was created by team 7 of the course Online Data Collection and Management, consisting of Robbin de Waal, Efe Kiremitci, Sezen Birkan, Bram, and Xenia Tijssen. The course is taught by instructor Hannes Data at Tilburg University.
2. Composition
2.1
Each instance in the dataset represents a different cryptocurrency that we retrieve from the website. These cryptocurrencies are divided into categories, which are based on, or linked to, certain characteristics of the cryptocurrencies.
2.2
Cryptocurrencies: the dataset currently consists of 7114 different cryptocurrencies. However, this number is growing every day.
Categories: there are 124 categories available.
2.3
The data we scraped does not contain all possible instances, as we did not scrape all the cryptocurrencies available on the website.
2.4
For each observation the following data is collected:
CATEGORY DATA
- Average price change
- Description
- ID
- Last Updated
- Market Cap
- Market Cap Change
- Name
- Number of Tokens
- Title
- Volume
- Volume Change
CRYPTOCURRENCY DATA
- CMC rank
- Date Added
- Is active (yes/no)
- Is fiat (yes/no)
- Last Updated
- Maximum Supply
- Name
- Number of Market Pairs
- Platform
- Quote -> (Fully diluted market cap, last updated, market cap, market cap dominance, %-change 1hour, %-change 24hours, %-change 30days, %-change 60days, %-change 7days, %-change 90days, price, volume 24hours)
- Slug
- Symbol
- Tags
- Total Supply
- Circulating Supply
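
The Quote field above is the only nested field in the list, so it has to be flattened before a cryptocurrency fits on a single CSV row. The sketch below shows one possible way to do that. It assumes that the quote is nested under a currency code such as "USD"; the function name `flatten_quote`, the `quote_` column prefix and the example record are hypothetical and are not part of the project code.

```python
def flatten_quote(coin, currency="USD"):
    """Return a flat copy of `coin` with the quote fields as top-level columns.

    Assumption: the quote is nested as coin["quote"][currency].
    """
    # Copy every field except the nested quote itself.
    flat = {key: value for key, value in coin.items() if key != "quote"}
    # Lift each quote field to the top level with a "quote_" prefix.
    quote = coin.get("quote", {}).get(currency, {})
    for field, value in quote.items():
        flat[f"quote_{field}"] = value
    return flat


# Hypothetical record; the values are invented for illustration only.
example = {
    "name": "ExampleCoin",
    "symbol": "EXC",
    "quote": {"USD": {"price": 1.5, "percent_change_24h": -2.0}},
}
print(flatten_quote(example))
```

A record without a quote is returned unchanged, so the function can be applied to every instance without checking first.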
2.5
Cryptocurrencies: these are labeled by name and a short abbreviation consisting of three letters.
Categories: Categories are labeled by name.
2.6
There is no information missing on individual instances.
2.7
The relationship between the individual instances is that they are all cryptocurrencies, subdivided into different categories based on certain characteristics.
2.8
2.9
The dataset is self-contained.
2.10
All data in the dataset is available to everyone and is therefore not considered confidential.
2.11
No, the dataset only contains cryptocurrencies and their characteristics. It does not contain any data that might be considered offensive, insulting, or threatening, and the data will in no way cause anxiety.
3. Collection process
3.1 Technical extraction plan
The collection process started with the end in mind: collecting the categories of different cryptocurrencies. The website used for this process is coinmarketcap.com, which stores and displays current and historical data on a wide variety of cryptocurrencies. Coinmarketcap offers an extensive API for which amateur coders can sign up for free and use a limited number of tokens to request data. The API documentation explains how to extract data from Coinmarketcap. One of the more important parts of this documentation is the endpoint overview, which lists the endpoints from which data can be pulled. In order to extract the coins within each category, you first need to understand how the data is layered (see docs\screenshots). Extracting data on coins from a specific category requires an 'id'. This id is obtained by first extracting all the categories listed on Coinmarketcap. The extraction therefore takes three steps. First, a function needs to be built to retrieve all categories, including their 'id'. Second, the id must be extracted and passed to the function that extracts the cryptocurrencies within that category. Finally, the data needs to be parsed and saved as a .csv.
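
The three steps described above could be sketched roughly as follows. This is a minimal illustration and not the team's actual script. It assumes the host `https://pro-api.coinmarketcap.com`, the endpoints `/v1/cryptocurrency/categories` and `/v1/cryptocurrency/category`, the `X-CMC_PRO_API_KEY` header, and a response body with a `data` key, all of which should be checked against the Coinmarketcap API documentation. The function names, the `fetch` parameter and the CSV column names are hypothetical.

```python
import csv
import json
import urllib.parse
import urllib.request

BASE_URL = "https://pro-api.coinmarketcap.com"  # assumed API host
API_KEY = "YOUR_API_KEY"  # placeholder, replace with a personal key


def call_api(path, params=None):
    """Send one GET request and return the decoded JSON body."""
    query = "?" + urllib.parse.urlencode(params) if params else ""
    request = urllib.request.Request(
        BASE_URL + path + query,
        headers={"X-CMC_PRO_API_KEY": API_KEY, "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def get_categories(fetch=call_api):
    """Step 1: list all categories, each carrying its 'id'."""
    return fetch("/v1/cryptocurrency/categories")["data"]


def get_coins(category_id, fetch=call_api):
    """Step 2: use a category 'id' to list the coins in that category."""
    body = fetch("/v1/cryptocurrency/category", {"id": category_id})
    return body["data"].get("coins", [])


def collect_rows(fetch=call_api):
    """Combine steps 1 and 2 into one row per (category, coin) pair."""
    rows = []
    for category in get_categories(fetch):
        for coin in get_coins(category["id"], fetch):
            rows.append({
                "category_id": category["id"],
                "category_name": category.get("name"),
                "coin_name": coin.get("name"),
                "coin_symbol": coin.get("symbol"),
            })
    return rows


def save_csv(rows, path):
    """Step 3: write the parsed rows to a .csv file."""
    fieldnames = ["category_id", "category_name", "coin_name", "coin_symbol"]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

With a valid key, `save_csv(collect_rows(), "categories.csv")` would run the whole pipeline. Each category costs one extra request, which matters given the limited number of tokens available on a free account. Passing `fetch` as a parameter is a design choice that lets the parsing be tried out on saved responses without calling the API.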
4. Preprocessing
There was no need for preprocessing. Neither the data listed on Coinmarketcap nor the extracted data contained any personal information, which eliminated the need to anonymize the data. After going through the extracted data, we found no implausible observations, so no observations had to be removed or corrected either.
5. Uses
5.1 The dataset was created for educational purposes, and has not yet been used for any tasks.
5.2 The dataset has not yet been used in any papers or systems.
5.3 The dataset can be used for many different purposes and tasks that involve cryptocurrencies. It can, for example, help with research on market values, daily or weekly changes, price changes per category, or top gainer rankings.
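
As an illustration of two of these uses, the sketch below computes the average price change per category and a top gainer ranking. It is only an example under assumptions: the column names `category`, `name` and `percent_change_24h`, both function names, and the three rows are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean


def average_change_per_category(rows, field="percent_change_24h"):
    """Average the chosen price-change field over the coins in each category."""
    grouped = defaultdict(list)
    for row in rows:
        # Skip coins for which the field is missing.
        if row.get(field) is not None:
            grouped[row["category"]].append(float(row[field]))
    return {category: mean(values) for category, values in grouped.items()}


def top_gainers(rows, n=3, field="percent_change_24h"):
    """Return the n coins with the largest value of the chosen field."""
    valid = [row for row in rows if row.get(field) is not None]
    return sorted(valid, key=lambda row: float(row[field]), reverse=True)[:n]


# Invented rows for illustration only; they are not taken from the dataset.
rows = [
    {"category": "A", "name": "Coin1", "percent_change_24h": 4.0},
    {"category": "A", "name": "Coin2", "percent_change_24h": -2.0},
    {"category": "B", "name": "Coin3", "percent_change_24h": 1.5},
]
print(average_change_per_category(rows))
print([row["name"] for row in top_gainers(rows, n=2)])
```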
5.4 There is nothing about the composition of this dataset, or the way it was collected, that could result in any form of unfair treatment or other undesirable harms. The dataset only contains publicly available cryptocurrency categories and their characteristics, which does not involve any data that could cause harm.