The Grim Squeaker wrote: 2024-12-29 09:36am
Solauren wrote: 2024-12-24 06:52am
LadyTevar wrote: 2024-12-20 06:39pm
And how much can those store?
Well, I downloaded the WOTC discussion forums, at least the D&D section of them, before they closed.
Compressed, it was just over a Gigabyte. Standard Text compression of 20:1 would put it at about 22 Gigabytes.
That's 4 DVDs in capacity, probably 5 when you deal with having to sort things so they all fit without stuff being divided up.
Given the sheer amount of activity on that board, that's probably more than SD.NET by a comfortable margin.
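The rough arithmetic behind those figures, as a quick sketch. It assumes "just over a Gigabyte" means about 1.1 GB and that a single-layer DVD holds 4.7 GB; neither number is stated above, so treat both as guesses.

```python
import math

compressed_gb = 1.1   # assumed value for "just over a Gigabyte"
ratio = 20            # the 20:1 text compression ratio quoted above
dvd_gb = 4.7          # assumed single-layer DVD capacity

uncompressed_gb = compressed_gb * ratio
discs_exact = uncompressed_gb / dvd_gb   # raw capacity, ignoring how files split
discs_needed = math.ceil(discs_exact)    # whole discs you would actually burn

print(f"{uncompressed_gb:.0f} GB uncompressed, {discs_exact:.2f} discs of capacity, "
      f"{discs_needed} discs in practice")
```

Under those assumptions the raw figure lands between 4 and 5 discs, so 5 once nothing is allowed to be split across discs.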
Can you download that without admin access? I'd like that as a dataset (The SDN board, nvm others)
Yeah, but it's not a quick thing.
When you get down to it, almost all Internet BBSes are simply large websites. The individual threads are webpages (possibly several linked ones, in the case of threads with multiple pages), with a built-in editor for posting.
So, downloading one into a non-functional but viewable copy is as simple as tossing all the thread webpage addresses into a downloader program.
For example, the 'Testing Policy' thread is at the web address of
http://bbs.stardestroyer.net/viewtopic.php?f=21&t=138477
The most recent thread about the Order of the Stick Webcomic is at
http://bbs.stardestroyer.net/viewtopic.php?f=32&t=167586
with page 2 starting on
http://bbs.stardestroyer.net/viewtopic.php?f=32&t=167586&start=25
going up to the current page (33) at &start=800
(25 posts per page)
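The page arithmetic above can be sketched like this. The function names are made up for illustration; the only facts taken from the post are 25 posts per page, page 2 starting at 25, and page 33 starting at 800.

```python
POSTS_PER_PAGE = 25  # from the post: 25 posts per page


def page_start(page):
    """Return the &start= offset for a 1-based page number."""
    if page < 1:
        raise ValueError("pages are numbered from 1")
    return (page - 1) * POSTS_PER_PAGE


def all_page_starts(page_count):
    """Return the &start= offsets for every page of a thread."""
    return [page_start(page) for page in range(1, page_count + 1)]


print(page_start(2), page_start(33))
```

Page 1 comes out as offset 0, which is the same page you get when &start= is left off entirely.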
The breakdown of the address is 'Website/command' (stardestroyer.net, viewtopic), Forum # (f=), Thread # (t=), Starting Post (&start=),
and if you want, you can add "&view=print" to simplify the view.
Now, you can actually ignore the forum #.
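To make that breakdown concrete, here is a small sketch that builds and takes apart these addresses with Python's standard urllib.parse. The helper names (thread_url, split_thread_url) are hypothetical; the parameter names f, t, start and view=print are the ones shown in the addresses above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://bbs.stardestroyer.net/viewtopic.php"


def thread_url(thread_id, start=0, forum_id=None, printable=False):
    """Build a thread page address; the forum # is optional, as noted above."""
    params = {}
    if forum_id is not None:
        params["f"] = forum_id
    params["t"] = thread_id
    if start:
        params["start"] = start
    if printable:
        params["view"] = "print"
    return BASE + "?" + urlencode(params)


def split_thread_url(url):
    """Return the query parameters of a thread address as a plain dict."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


print(thread_url(167586, start=25, forum_id=32))
print(split_thread_url(thread_url(138477, printable=True)))
```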
So, there are two ways it could be done.
One is to go into each forum and get a count of all the thread-list pages (e.g. Fantasy has 129 pages' worth of threads displayed).
Save each of those, add a little programmer magic, and you'd know the #, title, and # of pages of each thread.
Easy enough to turn into a download list. (A quick look at the code via the view source option in most web browsers shows it's very clean and efficient compared to most bbs background code, so it would actually be simple to work with).
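A minimal sketch of that "programmer magic", using only Python's standard library. It assumes each saved listing page links to its threads with ordinary anchors whose href contains viewtopic.php and a t= parameter; I have not checked the board's actual markup, so the class and function names here are illustrative only. It pulls out thread # and title, and leaves the per-thread page count for whoever knows the real page layout.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlsplit


class ThreadLinkParser(HTMLParser):
    """Collect {thread id: title} from links that point at viewtopic.php."""

    def __init__(self):
        super().__init__()
        self.threads = {}
        self._current = None  # thread id of the <a> currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        parts = urlsplit(dict(attrs).get("href") or "")
        query = parse_qs(parts.query)
        # Skip non-thread links and in-listing pagination links (&start=...).
        if not parts.path.endswith("viewtopic.php") or "start" in query:
            return
        if "t" in query and query["t"][0].isdigit():
            self._current = int(query["t"][0])
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            title = "".join(self._text).strip()
            if title:
                # Keep the first link text seen for each thread.
                self.threads.setdefault(self._current, title)
            self._current = None


def extract_threads(html):
    parser = ThreadLinkParser()
    parser.feed(html)
    parser.close()
    return parser.threads
```

Run extract_threads over every saved listing page and merge the dicts to get the list of thread #s to download.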
Repeat on each forum that can be viewed without login. (You'd need someone who has access to each subforum to get them all.
And then there is also the fact that you'd have to configure the downloader to log in with that account's password. Annoying, but doable.)
Add a second page grab on all the threads to account for activity since you started making the list to make sure...
And the end result is each thread, all pages, ready to download, nice and organized. Drop them into your favourite mass downloader, and wait.
(Note to the Admins: I use one with speed limits to keep any board I'm downloading from being abused by the process.)
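For anyone rolling their own instead of using a mass downloader, here is a rough sketch of the polite version. Note it is not a true speed limit: it swaps in a fixed pause between requests as a simple stand-in. The function names and the 2-second default are my own assumptions, not anything the board requires.

```python
import time
from urllib.request import urlopen


def fetch_page(url):
    """Download one page and return its raw bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def download_all(urls, delay_seconds=2.0, fetch=fetch_page, sleep=time.sleep):
    """Fetch each URL in order, pausing between requests to go easy on the server."""
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request after the first
        pages[url] = fetch(url)
    return pages
```

The fetch and sleep arguments are only there so the loop can be tried out with stand-ins before pointing it at a real board.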
The other is to brute force it: ignore which thread #s actually exist, start at Thread #1 and count up, assuming 2000 posts per thread.
(I believe most threads on here are locked after 1000 posts; it's been a while since any got that big.)
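The brute-force list is easy to generate. A sketch, assuming the forum # really can be dropped and that &start=0 gives the first page; the function name and the last_thread cutoff are placeholders you would pick yourself.

```python
BASE = "http://bbs.stardestroyer.net/viewtopic.php"


def brute_force_urls(first_thread, last_thread, assumed_posts=2000, posts_per_page=25):
    """Yield every candidate page address for every thread # in the range."""
    for thread_id in range(first_thread, last_thread + 1):
        for start in range(0, assumed_posts, posts_per_page):
            yield f"{BASE}?t={thread_id}&start={start}"


urls = list(brute_force_urls(1, 2))
print(len(urls), urls[0], urls[-1])
```

That is 80 addresses per thread #, and most of them will point at pages that don't exist, so in practice you would probably want to stop a thread early once the board stops returning new posts.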