How To Use Requests POST Method To Get The Search Result From Website?

- 1 answer

I am trying to get the search results from a website using the requests POST method. Below you can see the FORM and INPUT HTML from the page. I need to get the search result for each value entered into the form's input.

I have tried the code below, but it returns nothing:

import requests
from bs4 import BeautifulSoup

# FORM from website
# <form name="form1" method="post" action="payerOrVoenChecker.jsp">

# INPUT from the website
# <input type="text" name="voen" size="38" style="BACKGROUND-COLOR: #ffffff; BORDER-BOTTOM-STYLE: groove;
# BORDER-LEFT-STYLE: groove; BORDER-RIGHT-STYLE: groove; BORDER-TOP-STYLE: groove; COLOR: #000000; FONT-FAMILY:
# Tahoma, Arial; FONT-SIZE: 12px" value="">

request_headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Host': 'www.e-taxes.gov.az',
    'Origin': 'https://www.e-taxes.gov.az',
    'Referer': 'https://www.e-taxes.gov.az/ebyn/payerOrVoenChecker.jsp',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'YOUR USER AGENT',
}

voens = {2000460031,
         1000877741,
         1000877741,
         1500403661,
         1000877741,
         3000489411,
         1000877741,
         1802142932,
         }

tip = ['L',
       'P',
       ]

form_data = {
    'tip': tip,
    'voenOrName': 'V',
    'voen': voens,
    'name': '',
    'submit': '  Yoxla   ',
}


url = 'https://www.e-taxes.gov.az/ebyn/payerOrVoenChecker.jsp'

for voen in voens:
    form_data['voen'] = voen
    form_data['tip'] = tip
    response = requests.post(url, data=form_data, headers=request_headers)
    s = BeautifulSoup(response.content, 'lxml')
    sContent = s.findAll('table', {'class': 'com'})[0].findAll('tr', recursive=False)[1]
    outcome = sContent.get_text().strip()
    # .find("tr", recursive=False)
    print(outcome)

The expected result is a table. I am adding screenshots of the page before and after the search; the highlighted area is the table I need to get.

[Screenshots: the search page before and after submitting the form]

Answer

You're missing some of the fields that have to be sent in the body of the POST request.

Try this:

import requests

request_headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Host': 'www.e-taxes.gov.az',
    'Origin': 'https://www.e-taxes.gov.az',
    'Referer': 'https://www.e-taxes.gov.az/ebyn/payerOrVoenChecker.jsp',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'YOUR USER AGENT',
}

form_data = {
    'tip': 'L',
    'voenOrName': 'V',
    'voen': '1700393071',
    'name': '',
    'submit': '  Yoxla   ',
}

# Same endpoint as in the question
url = 'https://www.e-taxes.gov.az/ebyn/payerOrVoenChecker.jsp'

response = requests.post(url, data=form_data, headers=request_headers)

The best way to figure these out is to look at the Network tab in the browser's developer tools. In both Chrome and Firefox, F12 opens the developer tools.

  • Once the developer tools are open, go to the Network tab.
  • Now perform the action you want to automate in the page (in this case, fill in the ID and click Submit).

It'll show you a list of all the requests that the browser is sending behind the scenes. Click on the one that matches your URL.

A pane will open on the right that shows what method (GET / POST) was used, what headers were passed in the request, what data was sent (in case of POST), etc.
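To compare what your script would send against what that pane shows, you can build the request without sending it. This is a minimal sketch using a prepared request from the requests library; the URL and form fields are the ones from this thread, and nothing is sent over the network.

```python
import requests

url = 'https://www.e-taxes.gov.az/ebyn/payerOrVoenChecker.jsp'

form_data = {
    'tip': 'L',
    'voenOrName': 'V',
    'voen': '1700393071',
    'name': '',
    'submit': '  Yoxla   ',
}

# Build the request exactly as requests.post() would, but do not send it.
prepared = requests.Request('POST', url, data=form_data).prepare()

# Compare these with the method, headers and form data in the Network tab.
print(prepared.method, prepared.url)
print(prepared.headers['Content-Type'])
print(prepared.body)
```

Note that if a value in the dictionary is a list (such as 'tip': ['L', 'P']), requests encodes it as a repeated field (tip=L&tip=P), which may not be what the form sends when a single option is selected.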

All I did was paste the request headers and form data from that tab.

[Screenshot: inspecting the Network tab]
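To run the same request for several IDs, as the question asks, something like the following may work. It is only a sketch: build_form_data, extract_rows and lookup are hypothetical helper names, 'tip' is fixed to 'L' as in the snippet above, and the parsing assumes the result is in a table with class "com" (taken from the question's code), which I have not verified against the live page.

```python
import requests
from bs4 import BeautifulSoup

URL = 'https://www.e-taxes.gov.az/ebyn/payerOrVoenChecker.jsp'


def build_form_data(voen, tip='L'):
    """Return the POST body for one ID; every value is a plain string."""
    return {
        'tip': tip,
        'voenOrName': 'V',
        'voen': str(voen),
        'name': '',
        'submit': '  Yoxla   ',
    }


def extract_rows(html):
    """Return the cell texts of each row in the first table with class 'com'.

    Assumption: the result table has class 'com'. Cells of any nested
    tables are included as well, so adjust this to the real markup.
    """
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', {'class': 'com'})
    if table is None:
        return []
    rows = []
    for tr in table.find_all('tr'):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(['td', 'th'])]
        if cells:
            rows.append(cells)
    return rows


def lookup(voens, headers=None):
    """POST the form once per ID and collect the parsed rows."""
    results = {}
    with requests.Session() as session:
        for voen in voens:
            response = session.post(URL, data=build_form_data(voen),
                                    headers=headers, timeout=30)
            response.raise_for_status()
            results[voen] = extract_rows(response.text)
    return results
```

Usage would be along the lines of lookup([2000460031, 1000877741], headers=request_headers), followed by printing each entry of the returned dictionary.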

source: stackoverflow.com