Web scraping is the process of extracting, copying, storing, and reusing third-party content on the web.

In this tutorial, we will use web scraping in Python to get data from the CoinMarketCap website that we can use later in our projects.

The same approach can be reused on many other websites by modifying just a few lines of code.

We will write the code step by step. You will find the whole code at the end of the tutorial.

What do you need for this project?

  • Python installed on your PC
  • A few libraries (lxml, requests, beautifulsoup4)

1. Installing the libraries we need

You need to have Python and pip installed on your computer to be able to use the libraries.

To install the libraries, open the terminal on your computer (CMD on Windows).

Type the following command:

  pip install lxml requests beautifulsoup4
  

Try the command with pip3 if the previous command didn’t work:

  pip3 install lxml requests beautifulsoup4
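
To confirm that the installation worked, a small check like the following can help. It is only a sketch: the helper name missing_packages is made up for this example, and it relies on the fact that beautifulsoup4 is imported under the module name bs4.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# beautifulsoup4 is imported under the module name "bs4"
REQUIRED = ["lxml", "requests", "bs4"]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All libraries are installed.")
```

If the script reports a missing module, run the pip command again for that library.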
  

2. Import the libraries we installed

You need to import the libraries we downloaded.

Make a new file (e.g. price.py) and import the libraries by typing:

import requests
from bs4 import BeautifulSoup
  

3. Get the HTML code of the website we want to get data from

Now we want to get the prices of some cryptocurrencies from the CoinMarketCap website. For this, we need the link to the page of each currency. For example, to get the price of Ethereum, we find the link to the Ethereum price page on CoinMarketCap, send an HTTP request to that link, and get back the whole HTML code of the page.

It should look something like this:

eth_url = requests.get("https://coinmarketcap.com/de/currencies/ethereum/")
eth_src = eth_url.content
  

If you want to get the price of multiple currencies, you have to repeat these steps once for each currency. In my example, I want to get the price of BNB, Anime Token, Monero, and Ethereum. This means that I repeat the step 4 times and store the HTML code containing the price of each currency in 4 variables (bnb_src, ani_src, xmr_src, and eth_src).

bnb_url = requests.get("https://coinmarketcap.com/de/currencies/binance-coin/")
bnb_src = bnb_url.content

ani_url = requests.get("https://coinmarketcap.com/de/currencies/anime-token/")
ani_src = ani_url.content

xmr_url = requests.get("https://coinmarketcap.com/de/currencies/monero/")
xmr_src = xmr_url.content

eth_url = requests.get("https://coinmarketcap.com/de/currencies/ethereum/")
eth_src = eth_url.content
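
The four requests above differ only in the last part of the URL, so they could also be written as a loop. The following is a minimal sketch of that idea, not a required part of the tutorial: the names COIN_SLUGS, BASE_URL, fetch_html, and fetch_all are made up for this example, and the timeout of 10 seconds is an arbitrary choice.

```python
import requests

# Page slugs as they appear in the URLs used above
COIN_SLUGS = {
    "BNB": "binance-coin",
    "ANI": "anime-token",
    "XMR": "monero",
    "ETH": "ethereum",
}
BASE_URL = "https://coinmarketcap.com/de/currencies/{slug}/"


def fetch_html(slug, timeout=10):
    """Download one currency page and return its raw HTML bytes."""
    response = requests.get(BASE_URL.format(slug=slug), timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.content


def fetch_all(slugs, fetch=fetch_html):
    """Return a dict mapping each symbol to the HTML of its page."""
    return {symbol: fetch(slug) for symbol, slug in slugs.items()}
```

Calling fetch_all(COIN_SLUGS) would return a dictionary in which the entry for "ETH" plays the role of eth_src. The call to raise_for_status() makes the script stop with an error if the site answers with an error page instead of the page we asked for.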
  

4. BeautifulSoup settings

Now that we have the content of the page, we need to extract the data we want. For this, we use the BeautifulSoup library. We create a new variable (eth_soup) and call the BeautifulSoup constructor. The constructor takes two parameters: the source code and the name of the parser. We already installed the lxml parser in the first step, so we will use it to parse the page.

eth_soup = BeautifulSoup(eth_src, "lxml")

Since we need the price of 4 cryptocurrencies in our project, we have to repeat the step 4 times:

bnb_soup = BeautifulSoup(bnb_src, "lxml")
ani_soup = BeautifulSoup(ani_src, "lxml")
xmr_soup = BeautifulSoup(xmr_src, "lxml")
eth_soup = BeautifulSoup(eth_src, "lxml")
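
If the HTML sources are kept in a dictionary instead of four separate variables, the parsing step can be written once. This is only a sketch: the function name make_soups and the sample HTML are made up for this example. The demo at the bottom uses Python's built-in "html.parser" so that it runs without lxml, while the function defaults to lxml as in the tutorial.

```python
from bs4 import BeautifulSoup


def make_soups(pages, parser="lxml"):
    """Parse every HTML document in `pages` and return a dict of soups."""
    return {symbol: BeautifulSoup(html, parser) for symbol, html in pages.items()}


# Small static example instead of a downloaded page
sample = {"ETH": '<div class="priceValue"><span>€3,026.64</span></div>'}
soups = make_soups(sample, parser="html.parser")
print(soups["ETH"].div.text)  # prints €3,026.64
```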
  

5. Find out where the data you need is located in the HTML code

Select the data you need and open it with Inspect in Google Chrome (right-click the element and choose Inspect), as shown in the picture.

[Image: opening Inspect on the CoinMarketCap page]

Now find out which div contains the data in the HTML code and copy the class name of that div. In our example, the class is called “priceValue”.

[Image: the CoinMarketCap HTML code in the inspector]

6. Find the data with BeautifulSoup

Now we know which div contains our data. We now want to extract that information with Python using BeautifulSoup. The following code discards the rest of the HTML and returns only the div we are looking for:

eth_ = eth_soup.find("div", {"class":"priceValue"})

Output of eth_:
<div class="priceValue"><span>€3,026.64</span></div>
  

We want to remove the HTML tags and keep only the text, so we end up with:

eth_list = []
eth_ = eth_soup.find("div", {"class":"priceValue"})
eth_list.append(eth_.text) #get just the text

Output of eth_list:
['€3,026.64']
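
One thing to keep in mind is that find() returns None when no matching div exists, for example if the site changes its class names, and eth_.text would then fail with an AttributeError. The sketch below wraps the lookup in a function with a clearer error message. The name extract_price_text is made up for this example, and the class name "priceValue" is the one found in step 5.

```python
from bs4 import BeautifulSoup


def extract_price_text(html, css_class="priceValue", parser="lxml"):
    """Return the text of the first <div> with the given class.

    Raises ValueError if no such <div> exists, for example after the
    site changed its markup.
    """
    soup = BeautifulSoup(html, parser)
    tag = soup.find("div", class_=css_class)
    if tag is None:
        raise ValueError(f"no <div class={css_class!r}> found")
    return tag.get_text(strip=True)
```

With the page source from step 3, extract_price_text(eth_src) would return the same string that was appended to eth_list above.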
  

Repeat this step for each of the 4 currencies we worked with in the previous steps.

7. Make the necessary edits to the data

Since we received ['€3,026.64'] as output, we now want to make a few changes. We need the price as a floating-point number, so we remove the “€” and “,” characters and convert the string to a float.

x = eth_list[0][0]       #this is the "€" symbol (if USD it should be the "$" symbol)
eth_list[0] = eth_list[0].replace(x, '').replace(",", '')   #removing the "€" and the ","
ETH = float(eth_list[0])   #convert the string number to float
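
The snippet above assumes that the first character of the string is always a currency symbol. If the string ever started with a digit, that digit would be removed everywhere in the number. A slightly more defensive variant is sketched below. The function name parse_price is made up for this example, and it assumes the number format shown above (comma as thousands separator, dot as decimal separator).

```python
def parse_price(text):
    """Convert a price string such as '€3,026.64' to a float."""
    # Keep only digits and the decimal point; this drops currency
    # symbols, thousands separators and surrounding whitespace.
    cleaned = "".join(ch for ch in text if ch in "0123456789.")
    if not cleaned:
        raise ValueError(f"no number found in {text!r}")
    return float(cleaned)


print(parse_price("€3,026.64"))  # prints 3026.64
```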
  

This step should also be repeated for each of the 4 currencies we worked with. You will find the whole code in the next step.

8. The whole code

With this code, you can find out how much your holdings are worth: put the amount of each cryptocurrency you own in the variables my_BNB, my_ANI, my_XMR, and my_ETH. The output will look something like this:

[Image: output of the CoinMarketCap program]

##you need to install the following libraries
##pip install lxml  this is the parser we need to use
##pip install requests
##pip install beautifulsoup4
import requests
from bs4 import BeautifulSoup

##We need these lists to store the value of each currency
bnb_list = [] 
ani_list = []
xmr_list = []
eth_list = [] 


##get the html code of the page you want to get the information from
bnb_url = requests.get("https://coinmarketcap.com/de/currencies/binance-coin/")
bnb_src = bnb_url.content

ani_url = requests.get("https://coinmarketcap.com/de/currencies/anime-token/")
ani_src = ani_url.content

xmr_url = requests.get("https://coinmarketcap.com/de/currencies/monero/")
xmr_src = xmr_url.content

eth_url = requests.get("https://coinmarketcap.com/de/currencies/ethereum/")
eth_src = eth_url.content


##Using the lxml parser
bnb_soup = BeautifulSoup(bnb_src, "lxml")
ani_soup = BeautifulSoup(ani_src, "lxml")
xmr_soup = BeautifulSoup(xmr_src, "lxml")
eth_soup = BeautifulSoup(eth_src, "lxml")


##Search on each CoinMarketCap html for the value we need
bnb_ = bnb_soup.find("div", {"class":"priceValue"})  
bnb_list.append(bnb_.text)   #get just the text and append it in bnb_list

x = bnb_list[0][0]    # this is the "€" symbol we want to remove from the string
bnb_list[0] = bnb_list[0].replace(x, '').replace(",", '')   #removing the "€" and the ","
BNB = float(bnb_list[0])  #typecast the string to float



##same steps as before
ani_ = ani_soup.find("div", {"class":"priceValue"})   
ani_list.append(ani_.text)   

x = ani_list[0][0]       #this is the "€" symbol (if USD it should be the "$" symbol)    
ani_list[0] = ani_list[0].replace(x, '').replace(",", '')   #removing the "€" and the ","
ANI = float(ani_list[0])   



##same steps as before
xmr_ = xmr_soup.find("div", {"class":"priceValue"})  
xmr_list.append(xmr_.text)  

x = xmr_list[0][0]       #this is the "€" symbol (if USD it should be the "$" symbol)
xmr_list[0] = xmr_list[0].replace(x, '').replace(",", '')   #removing the "€" and the ","
XMR = float(xmr_list[0])



##same steps as before
eth_ = eth_soup.find("div", {"class":"priceValue"})  
eth_list.append(eth_.text)   #get just the text  

x = eth_list[0][0]       #this is the "€" symbol (if USD it should be the "$" symbol)
eth_list[0] = eth_list[0].replace(x, '').replace(",", '')   #removing the "€" and the ","
ETH = float(eth_list[0])



##how much you own of each currency (you have to put values that suit you)
my_BNB = 0.5
my_ANI = 1000000
my_XMR = 5.234
my_ETH = 12


##value in euro
BNB_EUR = my_BNB * BNB
ANI_EUR = my_ANI * ANI
XMR_EUR = my_XMR * XMR
ETH_EUR = my_ETH * ETH

##how much you have in total 
summary = BNB_EUR + ANI_EUR + XMR_EUR + ETH_EUR

print("\n\n==================================\n\n  - Binance: %f BNB\n  ==== %f € ====\n\n  - AnimeToken: %f ANI\n  ==== %f € ====\n\n  - Monero: %f XMR \n  ==== %f € ====\n\n  - Ethereum: %f ETH  \n  ==== %f € ====\n\n\n======== %f € ========\n\n==================================" %(my_BNB, BNB_EUR, my_ANI, ANI_EUR, my_XMR, XMR_EUR, my_ETH, ETH_EUR, summary))
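
The calculation at the end of the program can also be written with dictionaries, which avoids one variable per currency. The following is a sketch under that assumption, not a replacement for the program above: the function names portfolio_value and format_report are made up for this example, the amounts are the ones used in the tutorial, and the prices are placeholder numbers rather than real quotes.

```python
def portfolio_value(holdings, prices):
    """Return the value of each position and the overall total."""
    values = {symbol: amount * prices[symbol]
              for symbol, amount in holdings.items()}
    return values, sum(values.values())


def format_report(holdings, prices, currency="€"):
    """Build a plain-text report, one line per coin plus a total line."""
    values, total = portfolio_value(holdings, prices)
    lines = [f"- {symbol}: {holdings[symbol]} {symbol} = {values[symbol]:.2f} {currency}"
             for symbol in holdings]
    lines.append(f"Total: {total:.2f} {currency}")
    return "\n".join(lines)


# Amounts taken from the tutorial; the prices are made-up placeholders.
# In a real run they would come from the scraped pages.
holdings = {"BNB": 0.5, "ANI": 1000000, "XMR": 5.234, "ETH": 12}
prices = {"BNB": 400.0, "ANI": 0.00001, "XMR": 200.0, "ETH": 3000.0}
print(format_report(holdings, prices))
```

Adding a fifth currency then only requires one more entry in each dictionary.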