Darkscribes Community

How to performantly bulk remove dormant users from a forum?

    zipit@community.nodebb.org wrote (#1):

    Hey everyone,

    I am trying to clean dormant and spam users out of our forum. We have roughly 60000 accounts (sic!), of which about 56000 are spam accounts with no posts at all.

    I have written a small Python script that reaches into our MongoDB database and identifies 'invalid' accounts based on a handful of criteria, such as the user having no posts, URLs in the user's profile, and more. With it I can separate spam from legit accounts quite accurately. The problem: when I just delete these documents and their directly related documents (e.g., for user:100 also user:100:emails, user:100:settings, ...) in the Mongo database, I end up with a NodeBB instance that is functional at first glance. But secondary data has not been updated, as NodeBB does not seem to be very atomic. For example, the users list on the dummy forum now has countless empty pages: the users are gone, but whatever feeds that user list has not been updated. I already rebuilt the forum, but this did not change anything.
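A minimal sketch of the kind of filter described above, for illustration only. The original script is not shown in the post, so the field names (`postcount`, `website`, `aboutme`, `signature`), the URL pattern, and the way the criteria are combined are all assumptions, and the sample records are made up.

```python
import re

# Assumption: a URL is anything starting with http(s):// or www.
URL_PATTERN = re.compile(r"https?://|www\.", re.IGNORECASE)

# Assumption: these are the profile fields worth scanning for links.
PROFILE_FIELDS = ("website", "aboutme", "signature")


def looks_like_spam(user: dict) -> bool:
    """Flag accounts that have no posts and carry a URL in their profile."""
    if int(user.get("postcount") or 0) > 0:
        return False  # accounts with posts are never flagged here
    profile_text = " ".join(str(user.get(field) or "") for field in PROFILE_FIELDS)
    return bool(URL_PATTERN.search(profile_text))


# Hypothetical user documents, shaped like plain dicts read from the database.
users = [
    {"uid": 1, "postcount": 12, "website": "https://example.org"},
    {"uid": 2, "postcount": 0, "aboutme": "cheap pills at www.example.com"},
    {"uid": 3, "postcount": 0, "aboutme": "just lurking"},
]

spam_uids = [user["uid"] for user in users if looks_like_spam(user)]
print(spam_uids)
```

Keeping the classification in a pure function like this makes it easy to check against known-good accounts before anything is deleted.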

    I also had a look at the Write API. I did not (yet) get the bulk user account deletion to work, but when I use the endpoint /api/v3/users/{uid}, my script ends up like this:

        Processing users: 1%| 320/56329 [11:13<32:23:18, 2.08s/user]

    I.e., it takes NodeBB about 2 seconds to delete a single user account, which adds up to more than a day of processing time in total. I cannot be the first one with this problem, right? I did not find any solutions to it. I also found /nodebb/src/api/users.js:processDeletion and the lower-level nodebb/src/user/delete.js:User.deleteAccount, but it is not clear to me which database documents I have to delete and update.
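The "more than a day" figure follows directly from the numbers in the progress line. A small sketch of the arithmetic, assuming the 2.08 s/user rate stays constant over the whole run (the function name is made up for illustration):

```python
def estimated_hours(total_users: int, seconds_per_user: float) -> float:
    """Total wall-clock time in hours for strictly sequential deletions."""
    return total_users * seconds_per_user / 3600


# Figures taken from the progress line in the post.
total_hours = estimated_hours(56329, 2.08)
print(f"{total_hours:.1f} hours")  # prints "32.5 hours"
```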

    Cheers,
    zipit

      julian@community.nodebb.org wrote (#2):

      zipit, if the accounts have no actual content, you can just call .deleteAccount, as that's more lightweight.

      User deletion takes so long because of all those cross-referenced sets. There are probably opportunities for optimization there.
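To illustrate why cross-referenced sets matter, here is a toy in-memory model (not NodeBB code). It assumes the users list is paginated from an ordered index of uids that is stored separately from the user objects. The key name `users:joindate` and all helper names are assumptions used only for this sketch.

```python
# Toy stand-in for the database: user objects plus a separate ordered index.
objects = {f"user:{uid}": {"uid": uid, "username": f"user{uid}"} for uid in range(1, 7)}
index = {"users:joindate": [1, 2, 3, 4, 5, 6]}  # hypothetical index feeding the list


def users_page(page: int, per_page: int = 3) -> list:
    """Look up one page of users by walking the index, then fetching objects."""
    start = (page - 1) * per_page
    uids = index["users:joindate"][start:start + per_page]
    return [objects.get(f"user:{uid}") for uid in uids]


def delete_object_only(uid: int) -> None:
    """Remove the user object but leave every index entry behind."""
    objects.pop(f"user:{uid}", None)


def delete_with_index(uid: int) -> None:
    """Remove the user object and its entry in every index."""
    objects.pop(f"user:{uid}", None)
    for members in index.values():
        if uid in members:
            members.remove(uid)


for uid in (1, 2, 3):
    delete_object_only(uid)
stale_page = users_page(1)  # index still lists uids 1-3, so the page is empty slots

for uid in (1, 2, 3):
    delete_with_index(uid)
clean_page = users_page(1)  # index cleaned up, so the page shows users 4-6
```

In this model, deleting only the objects reproduces the empty pages described in the first post, while the cleanup of every index is the extra per-user work that makes a full deletion slower.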
