How To Get DIV Attributes With Puppeteer?


I get the href of all a elements with:

const hrefs = await page.evaluate(() => 
Array.from(document.body.querySelectorAll('a'), ({ href }) => href));

but when I try to get aria-label or data-xx of div elements, this method does not work.

Why is that and how can I get aria-label or data-xx attributes of div elements?


<div class="test" aria-label="something" data-all="something"></div>


Problem: DOM node attribute ≠ HTML element attribute

Only some HTML attributes are exposed as properties on the DOM node, and even an exposed property may hold a different value: the href property of the DOM node is not the same as the href attribute written in the HTML (<a href="..."></a>). For example:

<a id="link" href="test.html">Link</a>

Accessing document.querySelector('#link').href returns the fully resolved URL (the attribute value resolved against the page's base URL) instead of test.html. To get the attribute exactly as it is written in the HTML, you have to use getAttribute.
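The resolution step can be reproduced outside the browser, because the href property is essentially the attribute value resolved against the document's base URL. A minimal sketch using the standard URL constructor; the base URL below is a made-up assumption standing in for whatever page the link lives on:

```javascript
// Hypothetical page URL (assumption); in a browser this would be document.baseURI.
const baseUrl = 'https://example.com/docs/page.html';

// What is written in the HTML, i.e. what getAttribute('href') returns.
const attributeValue = 'test.html';

// Roughly what the href property returns: the attribute resolved against the base URL.
const propertyValue = new URL(attributeValue, baseUrl).href;

console.log(attributeValue); // test.html
console.log(propertyValue);  // https://example.com/docs/test.html
```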


Coming back to your code, that means you can read aria-label and data-all by using getAttribute like this:

Array.from(document.body.querySelectorAll('div'), (el) => el.getAttribute('aria-label'));
Array.from(document.body.querySelectorAll('div'), (el) => el.getAttribute('data-all'));
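Put together with Puppeteer, this could look roughly like the sketch below. It is not the only way to do it: it uses page.$$eval instead of page.evaluate, loads a small piece of markup modelled on the div from the question via page.setContent, and collectAttributes is a hypothetical helper name chosen for this example.

```javascript
// Runs inside the page: receives the matched elements plus the attribute names to read.
// getAttribute returns null when an element does not have the attribute.
const collectAttributes = (elements, names) =>
  elements.map((el) =>
    Object.fromEntries(names.map((name) => [name, el.getAttribute(name)])));

async function main() {
  // Loaded lazily so the helper above can be tried without Puppeteer installed.
  const { default: puppeteer } = await import('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Hypothetical markup (assumption), modelled on the div from the question.
    await page.setContent(
      '<div class="test" aria-label="something" data-all="something"></div>');
    // $$eval runs querySelectorAll('div') in the page and passes the matches as an array.
    const attrs = await page.$$eval('div', collectAttributes, ['aria-label', 'data-all']);
    console.log(attrs);
  } finally {
    await browser.close();
  }
}

// Uncomment to run against a real browser:
// main().catch(console.error);
```

Returning one object per div keeps the values of a single element together, which is usually easier to work with than two separate arrays.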

For data attributes there is an additional option. Every data-* attribute is exposed through the element's dataset property, so you can read the value of data-xx like this:

Array.from(document.body.querySelectorAll('div'), (el) => el.dataset.xx);
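One thing to keep in mind is that the dataset key is not always identical to the attribute name: the data- prefix is dropped, and a dash followed by a lowercase letter becomes that letter in upper case. A small sketch of that mapping, where datasetKey is a hypothetical helper written for illustration and not a DOM API:

```javascript
// Hypothetical helper: converts an attribute name such as "data-user-id"
// into the key under which el.dataset exposes it ("userId").
function datasetKey(attributeName) {
  if (!attributeName.startsWith('data-')) {
    throw new Error(`not a data attribute: ${attributeName}`);
  }
  return attributeName
    .slice('data-'.length)
    .replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());
}

console.log(datasetKey('data-xx'));      // xx
console.log(datasetKey('data-all'));     // all
console.log(datasetKey('data-user-id')); // userId
```

So for the div in the question, el.dataset.all returns the same value as el.getAttribute('data-all').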