How To Parse Historical BTC Data From Coinmarketcap?

I am trying to learn how to web scrape BTC historical data from Coinmarketcap.com using Python, requests, and BeautifulSoup.

I would like to parse the following:

1) Date
2) Close
3) Volume
4) Market Cap

Here is my code so far:

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent

ua = UserAgent()
header = {'user-agent': ua.chrome}
response = requests.get('https://coinmarketcap.com/currencies/bitcoin/historical-data/', headers=header)

# Parse with lxml ('html.parser' is the built-in alternative)
soup = BeautifulSoup(response.content, 'lxml')

tags = soup.find_all('td')
print(tags)

I am able to scrape the data I need, but I am not sure how to parse it correctly. I would prefer to have the dates go back as far as possible ('All Time'). Any advice would be greatly appreciated. Thanks in advance!

Answer

You could write a function that takes the number of months to return (months is just one choice of unit; you could use another), then use pandas read_html to grab the table and select the columns you need. As written, the date range always ends at today's date.

import pandas as pd
from datetime import datetime
from dateutil.relativedelta import relativedelta

def get_date_range(number_of_months: int) -> str:
    """Return a start/end query string covering the last number_of_months months."""
    now = datetime.now()
    dt_end = now.strftime("%Y%m%d")
    dt_start = (now - relativedelta(months=number_of_months)).strftime("%Y%m%d")
    return f'start={dt_start}&end={dt_end}'

number_of_months = 3

# read_html returns a list of DataFrames, one per table found; take the first
url = f'https://coinmarketcap.com/currencies/bitcoin/historical-data/?{get_date_range(number_of_months)}'
table = pd.read_html(url)[0]
# Keep only the requested columns, using the headers as they appear in the table
table = table[['Date', 'Close**', 'Volume', 'Market Cap']]
print(table)
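
If you want the dates to go back as far as possible rather than a fixed number of months, one option is to pass an explicit start date instead. The following is only a sketch: get_full_date_range and ASSUMED_EARLIEST are hypothetical names, and 28 April 2013 is an assumption about the earliest date the site has data for, so check it against the page before relying on it.

```python
from datetime import date
from typing import Optional

# Assumption: earliest date worth requesting; verify against the site.
ASSUMED_EARLIEST = date(2013, 4, 28)

def get_full_date_range(start: date = ASSUMED_EARLIEST,
                        end: Optional[date] = None) -> str:
    """Build a start/end query string in the same YYYYMMDD format as get_date_range."""
    if end is None:
        end = date.today()  # default to today, like get_date_range does
    if start > end:
        raise ValueError("start must not be after end")
    return f"start={start:%Y%m%d}&end={end:%Y%m%d}"

# A fixed end date is used here only to make the example reproducible.
print(get_full_date_range(end=date(2020, 1, 31)))
```

The returned string can be dropped into the URL in place of get_date_range(number_of_months).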
source: stackoverflow.com