Firestore - Get Large Collection And Parse. Request Is Aborted

I've had a look around but can't find an obvious solution.

I have a collection with 130k documents. I need to export these as a CSV file. (The CSV part I think I have sorted.)

My code works fine with smaller collections, but when I try it on the collection with 130k documents it hangs and I get "Request Aborted". What would be the best way to handle this?

My code:

db.collection("games")
.doc(req.params.docid)
.collection("players")
.onSnapshot(snapshot => {
  console.log("On Snapshot")
  snapshot.docs.forEach(data => {

    const doc = data.data();

    downloadArray.push(doc);

  });

  jsonexport(downloadArray, function(err, csv) {
    if (err) return console.log(err);

    fs.writeFile("out.csv", csv, function() {
      res.sendFile(path.join(__dirname, "../out.csv"), err => {
        console.log(err);
      });
    });
  });
});

I'm trying out pagination as suggested, but I'm having trouble understanding how to keep requesting the next batch until the whole collection has been read, since I sometimes won't know the collection size, and counting the documents in such a large collection takes 1-2 minutes or more.

let first = db
  .collection("games")
  .doc(req.params.docid)
  .collection("players")
  .orderBy("uid")
  .limit(500);

let paginate = first.get().then(snapshot => {
  // ...
  snapshot.docs.forEach(doc => {
    console.log(doc.data());
  });

  // Get the last document of this batch
  let last = snapshot.docs[snapshot.docs.length - 1];

  // Construct a new query starting after that document snapshot
  let next = db
    .collection("games")
    .doc(req.params.docid)
    .collection("players")
    .orderBy("uid")
    .startAfter(last)
    .limit(500);
});

Answer

You could paginate your query with cursors to reduce the size of the result set to something more manageable, and keep paging forward until the collection is fully iterated.

Also, you will want to use get() instead of onSnapshot(), as an export process is probably not interested in receiving updates for any document in the set that might be added, changed, or deleted.
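
Putting the two together, here is a minimal sketch of what the export loop could look like, assuming the Node Admin SDK, the same games/{docid}/players structure, and the jsonexport callback from the question. The function name exportPlayers and the page size of 500 are illustrative, not prescribed:

const fs = require("fs");
const jsonexport = require("jsonexport");

async function exportPlayers(db, docid) {
  const rows = [];

  // First page: order by a field so the cursor is stable.
  let query = db
    .collection("games")
    .doc(docid)
    .collection("players")
    .orderBy("uid")
    .limit(500);

  while (true) {
    const snapshot = await query.get();
    if (snapshot.empty) break; // no more documents: we are done

    snapshot.docs.forEach(doc => rows.push(doc.data()));

    // Start the next page after the last document of this one.
    const last = snapshot.docs[snapshot.docs.length - 1];
    query = db
      .collection("games")
      .doc(docid)
      .collection("players")
      .orderBy("uid")
      .startAfter(last)
      .limit(500);
  }

  // Convert the accumulated rows to CSV and write them out.
  return new Promise((resolve, reject) => {
    jsonexport(rows, (err, csv) => {
      if (err) return reject(err);
      fs.writeFile("out.csv", csv, writeErr =>
        writeErr ? reject(writeErr) : resolve(csv)
      );
    });
  });
}

This way only one page of snapshots is held at a time (beyond the accumulated plain row data), and the loop terminates on the first empty page, so you don't need to know the collection size in advance.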
