pizza_gator

I have a backup of the top 30 pages of this subverse from 10 days ago (12-05-16) link: http://multimirrorupload.com/so_xtpaa86bbi5d Mirror1: http://multimirrorupload.com/zp_xtpaa86bbi5d Mirror2: http://multimirrorupload.com/sw_xtpaa86bbi5d

Ivegotredditcancer

I've got a new WD Black 6TB HDD that needs to be filled. I'll be happy to store the work if I can get a magnet.

adam_danischewski

To really make a utility with snapshot capabilities that updates only what's new, you need the software to run on the server side. The problem is trust: you don't really know whether Voat itself is capable of not being infiltrated by attacks. Every intelligence agency on the planet is interested in this type of scandal and is attempting to position itself to control the flow of data and exposés emanating from it.

I looked briefly into what options exist in open source; there is IRC, but it uses centralized servers and doesn't allow for the look and feel of this type of message board.

What's really needed is a solid, well-tested P2P-style message board that cannot be tampered with. Anything that takes control away from servers explicitly or implicitly controlled at will by the large players will likely meet fierce resistance. Yet if you get something that proves itself capable, that would be very, very worthwhile and would probably help more than just this scandal. It could save the entire internet one day.

The current Open Source technology that looks the most promising out there is AETHER:

http://getaether.net

"Aether is an app you install to your computer to connect to Aether network. This network is made of different boards (forums) where people post and discuss things. On the surface, it's fairly similar to Slashdot, Metafilter, Reddit, or any other community site on the Internet.

The different thing about Aether is that it doesn't have a server somewhere. The only thing the app does is that it finds and connects to other people using Aether. In other words, it's a distributed, peer-to-peer network."

In the meantime, what you can do is ask Voat if you could MIRROR their website for them; perhaps, if they trust you, they would allow failover to your mirror via DNS in case the site goes down. Again, there are many trust issues involved. But if you can at least mirror their website for them, you could patch the mirrored website onto Aether, making the message board serverless.

Pizzafundthrowaway

Very interesting suggestion, thank you!

gittttttttttttttttt

Hey thanks for the mention. Just sent a DM to OP.

ParadiseFaIIs

I don't typically respond but I have a server I can store a copy of voat.co/v/pizzagate on along with scripted bots I can quickly write up to crawl this directory once a week. I'm only doing this because I need the cash for a pizzagate project I'm doing myself. Message me if interested.

Pizzafundthrowaway

Thanks for the offer! I'll let you know

pembo210

I was taking a snapshot of the hot 100 frontpage of a few pizza subs and a bunch of other subs every 4 hours for a while. Here's a one-week demo of the three pizza ones. The lower table has search and sort options. Unfortunately I ran out of room on my test server on Dec 5 and decided to stop archiving so I could do other things.

https://pembo210.com/pages/pizzasnaps.php?date=pizzagate_2016-12-05%3A06

Limitations:

This pulls from https://voat.co/api/subversefrontpage?subverse=pizzagate which is only the hot 100 and does not include heavily downvoted posts. Also, I haven't updated the site in many months; it's been on autopilot collecting data, so some parts are broken. (Once the new API is ready, we can go back to any time and see any comment.)
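A periodic snapshot like the one described here could be sketched as a small cron-driven script. The API path is the one quoted above; the filename scheme, paths, and schedule are illustrative guesses, and the fetch command is only printed (dry run) rather than executed:

```shell
# Sketch of a 4-hourly snapshot job (snap.sh). Endpoint from the comment
# above; everything else here is a hypothetical example.
SUBVERSE="pizzagate"
STAMP="$(date -u +%Y-%m-%d:%H)"   # e.g. 2016-12-05:06, matching the demo URL
URL="https://voat.co/api/subversefrontpage?subverse=${SUBVERSE}"
OUT="${SUBVERSE}_${STAMP}.json"
# Dry run: print the fetch command instead of executing it here.
FETCH="curl -s '${URL}' -o '${OUT}'"
echo "$FETCH"
# A crontab entry to take a snapshot every 4 hours might look like:
#   0 */4 * * * /home/user/snap.sh
```

Each run lands in its own timestamped file, so the collection doubles as a crude history of the hot-100 page.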

I'll poke around this weekend to see if I can fix some stuff to get it going again.

Pizzafundthrowaway

I updated my OP. There's someone working on this already. Please check the link and work with him to complete the project. There will still be a small reward for your effort if you can get it done together as a team.

pembo210

That's pretty good, go with that. It gets the comment pages too.

Pizzafundthrowaway

Although the implementation is slightly different than what I had envisioned, it still gets the primary job done of allowing us to download everything as a giant .zip file any time we want.

Keeping true to my word, I'm still donating $100 to the cause. It looks like our fearless archive bot creator needs some help, so I'll give him $50 for the existing proof of concept and another $50 once it's done, which can be split between him and one of you willing to take a look at his work and give him a hand.

totesgoats908234

Neat! Thanks.

Total posts for the last week per day

401 421 341 415 444 524 495

3041 posts in 7 days, 434 average per day.
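The totals quoted above check out; a quick shell sanity check, using the per-day counts from this comment:

```shell
# Verify the per-day post counts quoted above.
COUNTS="401 421 341 415 444 524 495"
TOTAL=0
for c in $COUNTS; do TOTAL=$((TOTAL + c)); done
AVG=$((TOTAL / 7))   # integer average per day
echo "$TOTAL posts, ~$AVG per day"
```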

I wonder how many posts there are in total; it has to be more than what Google shows for a blank search.

bikergang_accountant

I can do it. My big question is about hosting preference. If it can get taken down here, it could get taken down wherever it's distributed from the archive.

Basically how would you like the archive to be stored and made available?

To be honest, $100 might be a little short if you want every feature I can think of. Grabbing the posts? Sure. Hosting it all? Sure. But grabbing every comment (comments can be made days after the post) and making it available live means SQL and constant scraping. Absolutely doable, but still a pretty decent amount of work. The comments are valuable, and they load via AJAX.

This is all very doable.

So if the requirements are reasonably constrained, it can be done for that amount. I'd think you would want the full thing, though.

Pizzafundthrowaway

Thank you for your willingness to help. Someone is already working on this, but he needs help. Can you follow the link in the updated OP and see what you can do? I'm willing to split the final $50 between the original developer and anyone else who helps him.

bikergang_accountant

Hmm, I think for the sake of being polite to the main dev I would be a little uncomfortable with that arrangement. I'd rather work with him directly in such a case.

@freetibet, I'm available if any particular part gives you a hiccup. What needs to be done or improved?

I guess if we do work together and he acknowledges that I helped, you can pay my $50 to him and we'll work things out between us.

totesgoats908234

Downloading a website like Voat can be tricky without it taking up a huge amount of space. I have a web crawler that I've used to archive websites that use an MVC coding style to generate their content, while trying to take up as little space as possible. I'll give it a shot and let you know if it works here.

If I were to just recursively download the whole subverse with HTTP GETs, it would take a lot of extra space. Google says there are about 7600 pages in this subverse, and downloading one generates ~800KB, so doing it this way would need ~6GB of space to download what is here to date.

If you use Google to limit the search to a single day (I used the 14th), it doesn't show the total number of pages at the top, but there are 10 results per page and 32 pages of results, so estimate ~300 new posts a day @ 800KB, which means ~240MB of daily delta snapshot data you'll need space for.

This is just my guesstimate using Google and a single page's download size as a rule of thumb. I'm not 100% sure how accurate single-day search is for estimating the number of pages per day, but meh.
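Those estimates can be reproduced with a couple of lines of shell arithmetic. The page count, page size, and posts-per-day figures are the ones from this comment, using decimal KB/MB/GB units:

```shell
# Reproduce the space estimates above (figures from the comment, decimal units).
PAGES=7600          # pages Google reports for the subverse
KB_PER_PAGE=800     # approximate size of one downloaded page
TOTAL_GB=$((PAGES * KB_PER_PAGE / 1000 / 1000))   # full recursive mirror
DAILY_POSTS=300     # estimated new posts per day
DAILY_MB=$((DAILY_POSTS * KB_PER_PAGE / 1000))    # daily delta snapshot
echo "~${TOTAL_GB} GB total, ~${DAILY_MB} MB per day"
```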

Other than that, security is a concern. I would put it in a hidden service to make it harder to take down.

Pizzafundthrowaway

Thank you so much for your willingness to help. Someone is already working on this, but he needs help. Can you follow the link in the updated OP and see what you can do? I'm willing to split the final $50 between the original developer and anyone else who helps him.

totesgoats908234

To be fair, whoever created the backup script you linked to is not familiar with software development. It's more of a one-off shell script: it has no functions, flow, or error handling, so it will be prone to breaking and won't be reliable. Not to be offensive, but it is just a hacked-together bash shell script. I'd prefer to use existing, well-developed tools like httrack ( https://www.httrack.com/ ), which can be compiled for Linux (what all web hosts will be using). It is well developed, has existing functions to update changed web pages, and is easy to script for a cron job.
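An httrack-based cron setup along those lines might look like the following. `-O` (output path) and `--update` are real httrack options, but the URL filter, paths, and schedule are only a sketch, and the commands are printed here (dry run) rather than executed:

```shell
# Dry-run sketch of an httrack mirror job; paths and schedule are illustrative.
MIRROR_DIR="$HOME/voat-mirror"
# Initial mirror, restricted to the subverse via an httrack scan rule:
FIRST_RUN="httrack https://voat.co/v/pizzagate -O ${MIRROR_DIR} +voat.co/v/pizzagate*"
# Subsequent runs refetch only changed pages:
UPDATE_RUN="httrack --update -O ${MIRROR_DIR}"
echo "$FIRST_RUN"
echo "$UPDATE_RUN"
# Crontab entry for a nightly incremental update:
#   0 3 * * * httrack --update -O "$HOME/voat-mirror"
```

The appeal of this approach over a hand-rolled script is exactly what the comment says: httrack already handles link rewriting, retries, and incremental updates.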

I'm willing to set this up if you use all the mentioned funds to purchase an anonymous hosting service. I won't ask for payment to set up and secure the server. I noticed that another member offered to set up a clone that will use a database backend; if you don't hear back from them, I'll offer to set up a more simplistic backup using httrack. I can set up a prototype on a spare Linode server that I have if you PM me from your throwaway account.

I found other Voat-specific projects on GitHub for backing up Voat posts and comments ( https://github.com/voatarc/voatarc ), but it would take more work than what I will do for free to set up such a replica. The source code for Voat is also available for anyone to fork ( https://github.com/voat/voat ). There are also existing scrapers for copying a website ( https://github.com/guillemhs/ScraperBot ), which is a better solution than what the other user is doing by creating a one-off shell script for backup and replication.

Test using httrack to replicate - https://i.sli.mg/6RZAhv.png - https://i.sli.mg/XvfaLT.png - https://i.sli.mg/SJj64X.png - https://i.sli.mg/OxsHpB.png

Pizzafundthrowaway

Thank you very much for the suggestion. I still haven't issued any payment yet. I'm learning about how to use bitcoin (my first time). I'm also learning about the alternative suggestions people have made.

Please, can you guys work together on this? With all the censorship and content takedowns, I'm really afraid this site is in serious jeopardy.

I'm still learning to pick the best option, but if you and the others who offered their services can arrive at a consensus about how to tackle this, and deliver proof of concept, I'll double my contribution.

Teamwork, please. And consensus. We can do this!

RedGreenAlliance

Could some type of script not be written to copy every post and comment automatically and send them to a backup location?

SomeD

Just use the archive.is button and download the zip file of your archived site. I use a txt file where I write comments about the links I have.

And, of course, I rename the zip file, but I leave the original name at the end. Being organized is the key to backups.

RedGreenAlliance

Upvote for diligence and greater exposure. Wish I had the skills. Instead I'll stick to the #AskAlefantis campaign I started on Twitter. Now up to 3.5 MILLION impressions!

Schade

Great! In the event that this website is real and 15,000 new Podesta emails are released, it will need a good campaign for maximum exposure.

Bring_Down_CF

Kek. #askalefantis. Almost worth joining the twatter just for the lols that must be generating.

LolturdFerguson

Apparently, Twitter hated my username. I signed up without a phone number, searched for the pizzagate hashtag... and then my account was locked. Never had a chance to post! Damn, they are stepping up their censoring game... Waaa Waaa Waaaaah

RedGreenAlliance

https://mobile.twitter.com/search?q=%40momeetsaisha%3A+%23AskAlefantis

Here's my archive of #AskAlefantis tweets, and that's just my contribution. Several big-hitter accounts took up the fight. You don't have to join to view. MAYHEM!

PS: good source of downloadable memes and pics

Thorshamster

I'm all for this. Make it happen, nerds! Get that money!