
Web Scraping With JavaScript?

I'm having a hard time figuring out how to scrape this webpage to get this wedding list into my one-pager. It doesn't seem complicated at first, but once I get into the code I can't get any results.

I've tried ygrab.js, which was fairly simple and got me somewhere, but I can't seem to scrape the images, and it only prints the output to the console (there isn't much documentation to go on).

    $(function() {
        var $listResult = $('#list-result');
        var kado = [];
        var data = [
            {
                url: 'https://www.kadolog.com/fr/list/liste-de-mariage-laura-julien',
                selector: '.kado-not-full',
                loop: true,
                result: [
                    {
                        name: 'photo',
                        find: '.views-field-field-photo',
                        grab: {
                            by: 'attr',
                            value: 'src'
                        }
                    },
                    {
                        name: 'title',
                        find: '.views-field-title .field-content',
                        grab: {
                            by: 'text',
                            value: ''
                        }
                    },
                    {
                        name: 'description',
                        find: '.views-field-body .field-content',
                        grab: {
                            by: 'text',
                            value: ''
                        }
                    },
                    {
                        name: 'price',
                        find: '.price',
                        grab: {
                            by: 'text',
                            value: ''
                        }
                    },
                    {
                        name: 'remaining',
                        find: '.topinfo',
                        grab: {
                            by: 'text',
                            value: ''
                        }
                    },
                    {
                        name: 'link',
                        find: '.views-field-nothing .field-content .btn',
                        grab: {
                            by: 'attr',
                            value: 'href'
                        }
                    }
                ]
            }
        ];
        ygrab(data, function(result) {
            console.log(JSON.stringify(result, null, 2)); // photos = undefined
        });
    });
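
To get the scraped list onto the page instead of the console, the callback could turn the result into markup. The following is only a sketch: `renderGifts` and `escapeHtml` are hypothetical helpers, and it assumes the result is an array of objects keyed by the `name` values from the config, which the ygrab.js output would need to confirm.

```javascript
// Escape text so scraped strings cannot inject markup into the page.
function escapeHtml(text) {
  return String(text == null ? '' : text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build one <li> per gift.
// ASSUMPTION: each item looks like { photo, title, price, link }, i.e. the
// keys match the `name` fields in the config. Adjust to the real output.
function renderGifts(items) {
  return items.map(function (item) {
    var photo = item.photo
      ? '<img src="' + escapeHtml(item.photo) + '" alt="' + escapeHtml(item.title) + '">'
      : '';
    return '<li>' + photo +
      '<a href="' + escapeHtml(item.link) + '">' + escapeHtml(item.title) + '</a> ' +
      '<span class="price">' + escapeHtml(item.price) + '</span></li>';
  }).join('\n');
}

// Hypothetical usage inside the existing callback:
// ygrab(data, function (result) { $listResult.html(renderGifts(result)); });
```

Building the string in a separate function keeps the rendering easy to check without loading the remote page.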

Then there's Node.js with Request and Cheerio (and I tried Crawler too), but I have no idea how node works.

    var request = require("request");

This gives me an error in the console saying require is not defined. Fair enough, so I added require.js to the scripts in my page, but then I got another error ("Uncaught Error: Mismatched anonymous define() module: ...").
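
For context, `require` here belongs to Node's own module system and does not exist in browsers; RequireJS is a different (AMD) loader, so adding it to the page does not make Node packages loadable. A Node script is run from a terminal instead. The sketch below swaps the Request package for Node's built-in `https` module so that nothing needs installing; `fetchHtml` and the file name `scrape.js` are hypothetical, and the returned HTML would still need a parser such as Cheerio to pick out the fields.

```javascript
// scrape.js: run from a terminal with `node scrape.js`, not from a <script> tag.
var https = require('https'); // built into Node, no install needed

// Download a page and resolve with its HTML as a string.
function fetchHtml(url) {
  return new Promise(function (resolve, reject) {
    https.get(url, function (res) {
      if (res.statusCode !== 200) {
        res.resume(); // discard the body so the socket is released
        reject(new Error('Unexpected status ' + res.statusCode));
        return;
      }
      var body = '';
      res.setEncoding('utf8');
      res.on('data', function (chunk) { body += chunk; });
      res.on('end', function () { resolve(body); });
    }).on('error', reject);
  });
}

// Usage (performs a real network request, so it is left commented out):
// fetchHtml('https://www.kadolog.com/fr/list/liste-de-mariage-laura-julien')
//   .then(function (html) { console.log(html.length); })
//   .catch(console.error);
```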


My question is this: is there a simple JavaScript way (possibly without involving Node) to scrape the wedding list I'm trying to get? Or maybe a tutorial that walks through something similar step by step?

I'd be truly grateful for any help or advice.


Answer

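I think your only issue is the image selector. The current one presumably matches the wrapper around the photo, while it is the img element itself that carries the src attribute. Change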

    {
        name: 'photo',
        find: '.views-field-field-photo',
        grab: {
            by: 'attr',
            value: 'src'
        }
    },

to this:

    {
        name: 'photo',
        find: '.views-field-field-photo .field-content img',
        grab: {
            by: 'attr',
            value: 'src'
        }
    },

I can't test this right now, but it should work.
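
One thing to watch once the src values come through: if the source page uses relative image paths (an assumption, not checked against the real markup), they will not load from another site until they are resolved against the page URL. A minimal sketch using the standard `URL` constructor, with a hypothetical helper `toAbsoluteUrl` and made-up image paths:

```javascript
// Resolve a possibly relative src against the page it was scraped from.
function toAbsoluteUrl(src, pageUrl) {
  return new URL(src, pageUrl).href;
}

var page = 'https://www.kadolog.com/fr/list/liste-de-mariage-laura-julien';

// The image paths below are hypothetical examples, not taken from the real page.
console.log(toAbsoluteUrl('/sites/default/files/gift.jpg', page));
// → https://www.kadolog.com/sites/default/files/gift.jpg
console.log(toAbsoluteUrl('https://cdn.example.com/gift.jpg', page));
// → https://cdn.example.com/gift.jpg
```

Absolute URLs pass through unchanged, so the helper is safe to apply to every scraped photo and link.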

source: stackoverflow.com