Official Everybody Edits Forums

#1 2016-01-29 16:52:16, last edited by Hexagon (2016-03-01 16:05:43)

Hexagon
Member
Joined: 2015-04-22
Posts: 1,213

EE Analytics torrent [Feb 2016 available]

Available Downloads wrote:

Data Set | Size | Torrent
November 2015 | 5.2 GiB | https://mega.nz/#!odAxwbQA!Fc8PTZpzCqrmLToU02z8vSlX4Ba_2NIcm6l1Ip2FmfQ
December 2015 | 4.8 GiB | magnet:?xt=urn:btih:8a6100df38765c94289b129c4477d0a44f8fa9b2&dn=2015-12&tr=udp%3a%2f%2ftracker.openbittorrent.com%3a80&ws=http%3a%2f%2fweb.lrussell.net%2f~hexagon%2f2015-12%2f
January 2016 | 4.0 GiB | https://mega.nz/#!5Vg0ETLT!OzA5Ec_iMq5krxVD0lo_89zOVnjLrO6HwaGwwzxwLGY
February 2016 | 4.2 GiB | https://mega.nz/#!FJwz2DrB!1giLyyuQSikWPVzawGsIxTVslfcXvExlccnnHwYW944

It's very important for the history of EE to be archived--most of the funny chats, worlds, lobby/players online, usernames, player movements, and much more. These records provide the social fabric for EE, during the highs and lows. Unfortunately, the time has come and the space available for EE Analytics is shrinking. After going through a couple of hard drives and some SD cards, I have only one copy of the data, and not much space left.

I'd like to upload the data to archive.org, and am asking permission from the community to upload the data as-is (plaintext with usernames), because encrypting the data would mean that it could get removed. The usernames could be stripped, but that would mean that anyone else who would like to use it would not be able to customize an experience for a specific user. Another benefit is that the community can do what they want with the data, like making visualizations and predictions.

Offline

#2 2016-01-29 17:26:09

Zumza
Member
From: root
Joined: 2015-02-17
Posts: 4,656

Re: EE Analytics torrent [Feb 2016 available]

User Generated Material
In this agreement you give Everybody Edits the right, without special compensation, to publish the Worlds, Games and Material you make for an unlimited time throughout the world on the internet, on TV or any other medium. Thus you give Everybody Edits the right to use, copy, present, reproduce, display, edit, integrate, license as well as distribute the Games and Material which you have created, for both commercial and noncommercial use.

I think EE servers should be in charge of that data not archive.org


Everybody edits, but some edit more than others

Offline

Wooted by: (3)

#3 2016-01-29 17:27:11

Processor
Member
Joined: 2015-02-15
Posts: 2,246

Re: EE Analytics torrent [Feb 2016 available]

Remove old data and replace it with newer data.


I have never thought of programming for reputation and honor. What I have in my heart must come out. That is the reason why I code.

Offline

#4 2016-01-29 17:27:58

Kaslai
Official Caroler
From: SEAͩT̓͑TLͯͥͧͪ̽ͧE͑̚
Joined: 2015-02-17
Posts: 787

Re: EE Analytics torrent [Feb 2016 available]

Do you really think the data is worth pushing to archive.org?

It doesn't really have much historical significance...

Offline

#5 2016-01-29 17:40:01, last edited by Hexagon (2016-01-29 17:45:38)

Hexagon
Member
Joined: 2015-04-22
Posts: 1,213

Re: EE Analytics torrent [Feb 2016 available]

Processor wrote:

Remove old data and replace it with newer data.

I'll have to resort to this if there are no other options left. It removes the historical value, though.

Kaslai wrote:

Do you really think the data is worth pushing to archive.org?

It doesn't really have much historical significance...

Well, depending on how long EE lives on, it could be historical many years from now. It's not really historical to society at large, but it is historical to the EE community.

Zumza wrote:

User Generated Material
In this agreement you give Everybody Edits the right, without special compensation, to publish the Worlds, Games and Material you make for an unlimited time throughout the world on the internet, on TV or any other medium. Thus you give Everybody Edits the right to use, copy, present, reproduce, display, edit, integrate, license as well as distribute the Games and Material which you have created, for both commercial and noncommercial use.

I think EE servers should be in charge of that data not archive.org

I'd be happy to give it to one/some of the moderators/admins. The TOS just says that EE has complete control over your data, and they can pretty much do whatever they want with it. An interesting section is:

You may not use any of Everybody Edits material, such as avatars, blocks, images, logos, icons, audio files etc. for commercial purposes outside of the Everybody Edits website. This means that you may not copy, distribute, sell, publish, send or otherwise recirculate Everybody Edits material to a third party without the written consent of Everybody Edits. You may not change, revise or replace any of the material found in the game or website, either in its entirety or parts thereof.

Since I'm not selling the data, and the website is non-profit (i.e., non-commercial), it could be okay, as it's user-generated material (which the TOS doesn't say anything about recirculating).

Offline

#6 2016-01-29 18:06:42

lrussell
Member
From: Saturn's Titan
Joined: 2015-02-15
Posts: 843
Website

Re: EE Analytics torrent [Feb 2016 available]

How much space does this data require? I have a NAS drive that I'm planning on upgrading. I'm just not sure how my ISP would feel about say... a terabyte of data being downloaded.

Offline

#7 2016-01-29 18:09:58, last edited by Hexagon (2016-01-29 18:11:29)

Hexagon
Member
Joined: 2015-04-22
Posts: 1,213

Re: EE Analytics torrent [Feb 2016 available]

lrussell wrote:

How much space does this data require? I have a NAS drive that I'm planning on upgrading. I'm just not sure how my ISP would feel about say... a terabyte of data being downloaded.

It's only like 1.5TB (I think) uncompressed so it's not too bad. If you could store it that would be awesome!

Offline

#8 2016-01-29 18:38:27

Zumza
Member
From: root
Joined: 2015-02-17
Posts: 4,656

Re: EE Analytics torrent [Feb 2016 available]

I'm not sure why you recorded and want this data to be posted. Could you tell us what this massive data consists of?


Everybody edits, but some edit more than others

Offline

#9 2016-01-29 18:52:41

Hexagon
Member
Joined: 2015-04-22
Posts: 1,213

Re: EE Analytics torrent [Feb 2016 available]

Zumza wrote:

I'm not sure why you recorded and want this data to be posted. Could you tell us what this massive data consists of?

I'm very interested in data and statistics, which is why I recorded it. I'd like to post it so that other people can use it (as it's just sitting on my hard drive, unable to be accessed), and I don't really have enough space left.

It consists of (roughly):
- (95%) World event data (five or so of the most popular worlds, determined on a minute basis). Event data is everything that your EE client receives when you join a room (i.e., *everything*: movements, blocks, usernames, join, init, left, potions, zombie, trolling, chats, but nothing private, since clients don't get that data). This data is from August 2014 to present, recording virtually 24/7 since then
- (1%) Old EEforum posts and user avatars; EE user profiles
- (1%) lobby data (on a per-second basis) and some corrupted lobby data that I couldn't figure out how to decompress
- (0.5%) a ping log to playerio, and website logs
- (0.49%) a bunch of EE worlds
- (2%) some world data that is in an extremely bloated XML format (not sure if I still have it)
- (0.05%) almost every EE swf file from August 2014, recorded every six hours, except I'm missing one for December 2015 I think. Probably not 0.05% but something very small
- (0.05%) Source code for many projects, including unreleased ones, for EE. Also block data for those
- (?%) lrussell's eeindex, preprocessed data, server vm, other third-party EE related source code, png minimaps of thousands of worlds, programmatically generated EE chat screenshots

Offline

Wooted by: (2)

#10 2016-01-29 19:07:13

Zumza
Member
From: root
Joined: 2015-02-17
Posts: 4,656

Re: EE Analytics torrent [Feb 2016 available]

I bet you have some kind of google glass and record every moment of your life.
Or you are more interested in others?


Everybody edits, but some edit more than others

Offline

#11 2016-01-29 19:10:16

N1KF
Wiki Mod
From: ဪဪဪဪဪ From: ဪဪဪဪဪ From: ဪဪဪဪဪ
Joined: 2015-02-15
Posts: 11,113
Website

Re: EE Analytics torrent [Feb 2016 available]

We need not store this, as NSA probably has all this data locked up in a vault somewhere.

Offline

Wooted by: (3)

#12 2016-01-29 19:29:46

den3107
Member
From: Netherlands
Joined: 2015-04-24
Posts: 1,025

Re: EE Analytics torrent [Feb 2016 available]

I personally would only ever be interested in:
• Chats;
• versions of worlds;
• forum archives (which, for the record, aren't yet being archived by the awesome Wayback Machine (archive.org));
• I guess EE projects.

As for the events, I personally think only the block placement events might be fun, so that players can request a speedbuild of rooms.
Old EE versions aren't interesting, as you can't use them: they depend on the server, which will block them for being outdated.

I personally think a lot can be thrown out, though far from everything, but that's my opinion.

Offline

#13 2016-01-29 20:24:46

Different55
Forum Admin
Joined: 2015-02-07
Posts: 16,575

Re: EE Analytics torrent [Feb 2016 available]

Hexagon wrote:
lrussell wrote:

How much space does this data require? I have a NAS drive that I'm planning on upgrading. I'm just not sure how my ISP would feel about say... a terabyte of data being downloaded.

It's only like 1.5TB (I think) uncompressed so it's not too bad. If you could store it that would be awesome!

How much is it compressed?


"Sometimes failing a leap of faith is better than inching forward"
- ShinsukeIto

Offline

#14 2016-01-29 20:25:41

Hexagon
Member
Joined: 2015-04-22
Posts: 1,213

Re: EE Analytics torrent [Feb 2016 available]

Different55 wrote:

How much is it compressed?

About 150GB, which is very manageable; however, I have a lot of my own stuff, which takes up a lot of space.

Offline

#15 2016-01-29 21:24:02

W24
Member
From: USA
Joined: 2015-05-30
Posts: 591
Website

Re: EE Analytics torrent [Feb 2016 available]

So are you the one behind those bot accounts randomly connecting to worlds?


Offline

Wooted by: (2)

#16 2016-01-29 21:25:18

Hexagon
Member
Joined: 2015-04-22
Posts: 1,213

Re: EE Analytics torrent [Feb 2016 available]

W24 wrote:

So are you the one behind those bot accounts randomly connecting to worlds?

For the most part, yeah.

There was a period of time when other random accounts were connecting to worlds for a couple of days, but that was a long time ago. That wasn't me.

Offline

#17 2016-01-29 21:49:15

SmittyW
Member
Joined: 2015-03-13
Posts: 2,085

Re: EE Analytics torrent [Feb 2016 available]

Hexagon wrote:
Zumza wrote:

I'm not sure why you recorded and want this data to be posted. Could you tell us what this massive data consists of?

I'm very interested in data and statistics, which is why I recorded it and I'd like to post it so that other people can use it (as it's just sitting on my harddrive, unable to be accessed) and I don't really have enough space left.

It consists of (roughly):
- (95%) World event data (five or so of the most popular worlds, determined on a minute basis). Event data is everything that your EE client receives when you join a room (i.e *everything*--movements, blocks, usernames, join, init, left, potions, zombie, trolling, chats, but nothing private since clients don't get that data.) This data is from August 2014 to present, recording virtually 24/7 since then
- (1%) Old EEforum posts and user avatars; EE user profiles
- (1%) lobby data (on a per-second basis) and some corrupted lobby data that I couldn't figure how to decompress
- (0.5%) a ping log to playerio, and website logs
- (0.49%) a bunch of EE worlds
- (2%) some world data that is in an extremely bloated XML format (not sure if I still have it)
- (0.05%) almost every EE swf file from August 2014, recorded every six hours, except I'm missing one for December 2015 I think. Probably not 0.05% but something very small
- (0.05%) Source code for many projects, including unreleased ones, for EE. Also block data for those
- (?%) lrussell's eeindex, preprocessed data, server vm, other third-party EE related source code, png minimaps of thousands of worlds, programatically generated EE chat screenshots

Here's an idea. If you get rid of that 95% you'll save a lot of space.

But seriously though that 95% seems entirely worthless, unless it partially includes some world storage. Keeping records of chat, movement, and block placements isn't worth much besides some line graphs. I don't think people will be so keen on releasing chat records either, and this has been strongly argued after ninjasup's huge chat-recording dilemma. I know you spent a lot of time hoarding it but it's time to let it go man.

Offline

Wooted by: (2)

#18 2016-01-30 00:28:36, last edited by sthegreat (2016-01-30 00:29:21)

sthegreat
Member
Joined: 2015-04-25
Posts: 409

Re: EE Analytics torrent [Feb 2016 available]

I would archive it.

EDIT: I mean make it public to keep the data


Offline

#19 2016-01-30 01:37:21

lrussell
Member
From: Saturn's Titan
Joined: 2015-02-15
Posts: 843
Website

Re: EE Analytics torrent [Feb 2016 available]

Hexagon wrote:
Zumza wrote:

I'm not sure why you recorded and want this data to be posted. Could you tell us what this massive data consists of?

I'm very interested in data and statistics, which is why I recorded it and I'd like to post it so that other people can use it (as it's just sitting on my harddrive, unable to be accessed) and I don't really have enough space left.

It consists of (roughly):
- (95%) World event data (five or so of the most popular worlds, determined on a minute basis). Event data is everything that your EE client receives when you join a room (i.e *everything*--movements, blocks, usernames, join, init, left, potions, zombie, trolling, chats, but nothing private since clients don't get that data.) This data is from August 2014 to present, recording virtually 24/7 since then
- (1%) Old EEforum posts and user avatars; EE user profiles
- (1%) lobby data (on a per-second basis) and some corrupted lobby data that I couldn't figure how to decompress
- (0.5%) a ping log to playerio, and website logs
- (0.49%) a bunch of EE worlds
- (2%) some world data that is in an extremely bloated XML format (not sure if I still have it)
- (0.05%) almost every EE swf file from August 2014, recorded every six hours, except I'm missing one for December 2015 I think. Probably not 0.05% but something very small
- (0.05%) Source code for many projects, including unreleased ones, for EE. Also block data for those
- (?%) lrussell's eeindex, preprocessed data, server vm, other third-party EE related source code, png minimaps of thousands of worlds, programatically generated EE chat screenshots

I'd be willing to store all of it compressed, however transferring that much over my internet connection isn't doable. I can take everything except the World Event data (which isn't even useful, really). My current VPS has 2TB of bandwidth if you want me to give you credentials to upload it there to distribute to others, but it only has ~16GB of HDD space left.

Offline

#20 2016-01-30 01:43:43, last edited by Hexagon (2016-01-30 01:57:07)

Hexagon
Member
Joined: 2015-04-22
Posts: 1,213

Re: EE Analytics torrent [Feb 2016 available]

I have an idea, but it's a little "overkill":

- I have a world file, and it contains all of the events in the world from a certain time period.
- I duplicate that file, and remove the sensitive portions of the chat (i.e., usernames, maybe more)
- Then, I create a binary diff between the two files, keep the diff on my HD, and upload the "sanitized" version to archive.org

Now, I have a very small file which is able to completely reconstruct the original from the file downloaded from archive.org. If someone downloaded the public file, all they would really get is a bunch of block commands with opaque ids. This means that I can save ~48x as much data.
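The sanitize-then-diff idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: the event lines and usernames below are made up, the `make_patch` and `apply_patch` helpers are hypothetical names, and `difflib` stands in for a real binary diff tool (it is far too slow for multi-gigabyte files, where something like xdelta or bsdiff would be the realistic choice).

```python
from difflib import SequenceMatcher


def make_patch(sanitized: bytes, original: bytes):
    """Record how to rebuild `original` from `sanitized`.

    Unchanged regions are stored as (start, end) references into the
    sanitized file; only the bytes that differ are stored literally.
    """
    matcher = SequenceMatcher(None, sanitized, original, autojunk=False)
    patch = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            patch.append(("copy", i1, i2))
        else:
            # replace / insert / delete: keep the original bytes (may be empty)
            patch.append(("data", original[j1:j2]))
    return patch


def apply_patch(sanitized: bytes, patch) -> bytes:
    """Rebuild the original file from the public copy plus the private patch."""
    out = bytearray()
    for op in patch:
        if op[0] == "copy":
            out += sanitized[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


# Hypothetical event log; the real recording format is not shown in this thread.
original = b"say alice hello\nsay bob hi alice\n"
sanitized = original.replace(b"alice", b"u1").replace(b"bob", b"u2")

patch = make_patch(sanitized, original)      # stays on the private disk
restored = apply_patch(sanitized, patch)     # public file + patch -> original
```

The patch only holds literal bytes for the regions that were changed, which is why it stays small when the sanitizing step touches little more than usernames.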

It is trivial to take a screenshot of the chat and post it everywhere (i.e put it in your user signature) but I do understand that chats might be private.

Offline

#21 2016-01-30 04:29:26

Different55
Forum Admin
Joined: 2015-02-07
Posts: 16,575

Re: EE Analytics torrent [Feb 2016 available]

Hexagon wrote:
Different55 wrote:

How much is it compressed?

About 150GB, which is very manageable; however, I have a lot of my own stuff, which takes up a lot of space.

In that case I can definitely save a few compressed copies with my regular backups. Make a torrent to host it and I will download and seed whenever I can.


"Sometimes failing a leap of faith is better than inching forward"
- ShinsukeIto

Offline

#22 2016-01-30 04:54:23, last edited by lrussell (2016-01-30 05:11:12)

lrussell
Member
From: Saturn's Titan
Joined: 2015-02-15
Posts: 843
Website

Re: EE Analytics torrent [Feb 2016 available]

Different55 wrote:
Hexagon wrote:
Different55 wrote:

How much is it compressed?

About 150GB, which is very manageable; however, I have a lot of my own stuff, which takes up a lot of space.

In that case I can definitely save a few compressed copies with my regular backups. Make a torrent to host it and I will download and seed whenever I can.

Actually, a torrent wouldn't be a bad idea. It would make the overall download speed faster because of multiple seeders/leechers. I suppose I could store the full data-set if I downloaded it in bursts to not scare my ISP (150 GB in 6 hours). I'm upgrading my NAS to 2TB of space so it's doable for me I suppose. But it would probably be better to have a "base" data-set that's accessible to more people. Perhaps there's a way to compress it more, 7-Zip Ultra maybe?

Offline

#23 2016-01-30 12:54:46, last edited by Hexagon (2016-01-30 14:45:08)

Hexagon
Member
Joined: 2015-04-22
Posts: 1,213

Re: EE Analytics torrent [Feb 2016 available]

Different55 wrote:

In that case I can definitely save a few compressed copies with my regular backups. Make a torrent to host it and I will download and seed whenever I can.

Sounds good, I'll begin the process of making one as soon as I get things cleaned up. Since I'm giving this data to a few people, I might upload it to archive.org first then use them as a tracker if that's okay.

lrussell wrote:

I suppose I could store the full data-set if I downloaded it in bursts to not scare my ISP (150 GB in 6 hours). I'm upgrading my NAS to 2TB of space so it's doable for me I suppose. But it would probably be better to have a "base" data-set that's accessible to more people. Perhaps there's a way to compress it more, 7-Zip Ultra maybe?

I could give 7z ultra a shot and see if that works. I'd have to tar it first, though. By a base set, do you mean splitting the data up into smaller chunks?

EDIT: With 7z it looks like I can get it under ~120GB, which looks good. I can compress it with xz on extreme down to ~107GB, but that will take ~35 days. I'll have to start zipping shortly.
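For a rough sense of what these figures imply, here is a back-of-the-envelope calculation in Python. It assumes decimal units (1 GB = 10^9 bytes) and takes the approximate sizes quoted in this thread at face value (1.5TB uncompressed, 150GB for the current archive, ~120GB with 7z, ~107GB with xz, 35 days of compression, and the 150 GB in 6 hours download burst), so the results are estimates only.

```python
GB = 10**9
TB = 10**12

uncompressed = 1.5 * TB  # "only like 1.5TB (I think)", so approximate

# Compressed sizes as quoted in the thread (approximate).
sizes = {
    "current archive": 150 * GB,
    "7z": 120 * GB,
    "xz extreme": 107 * GB,
}

# Compression ratio = uncompressed size / compressed size.
ratios = {name: round(uncompressed / size, 1) for name, size in sizes.items()}

# Average input rate xz would need to finish 1.5TB in 35 days.
xz_bytes_per_sec = uncompressed / (35 * 24 * 3600)

# Average download rate for a 150 GB burst spread over 6 hours.
burst_bytes_per_sec = 150 * GB / (6 * 3600)

for name, r in ratios.items():
    print(f"{name}: about {r}:1")
print(f"xz input rate: about {xz_bytes_per_sec / 1e6:.2f} MB/s")
print(f"burst download rate: about {burst_bytes_per_sec / 1e6:.1f} MB/s")
```

Under those assumptions, moving from the current archive to xz improves the ratio from roughly 10:1 to roughly 14:1, at a compression speed of about half a megabyte per second.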

Offline

#24 2016-01-30 18:23:58

lrussell
Member
From: Saturn's Titan
Joined: 2015-02-15
Posts: 843
Website

Re: EE Analytics torrent [Feb 2016 available]

Hexagon wrote:
Different55 wrote:

In that case I can definitely save a few compressed copies with my regular backups. Make a torrent to host it and I will download and seed whenever I can.

Sounds good, I'll begin the process of making one as soon as I get things cleaned up. Since I'm giving this data to a few people, I might upload it to archive.org first then use them as a tracker if that's okay.

lrussell wrote:

I suppose I could store the full data-set if I downloaded it in bursts to not scare my ISP (150 GB in 6 hours). I'm upgrading my NAS to 2TB of space so it's doable for me I suppose. But it would probably be better to have a "base" data-set that's accessible to more people. Perhaps there's a way to compress it more, 7-Zip Ultra maybe?

I could give 7z ultra a shot and see if that works. I'd have to tar it first, though. By a base set, do you mean splitting the data up into smaller chunks?

EDIT: With 7z it looks like I can get it under ~120GB, which looks good. I can compress it with xz on extreme down to ~107GB, but that will take ~35 days. I'll have to start zipping shortly.

By base set I mean the most important data. 107 GB is still out of reach for people who want to help keep the data safe. I'd say it should be no more than 15 GB. Knowing what block someone placed, when, and where or if they used the Grinch smiley isn't all that important. It might be fun if someone made a "player" for it, showing everything as it happened. But did you store the timing between each message to do even that?

Offline

#25 2016-01-31 13:19:30

Hexagon
Member
Joined: 2015-04-22
Posts: 1,213

Re: EE Analytics torrent [Feb 2016 available]

lrussell wrote:

By base set I mean the most important data. 107 GB is still out of reach for people who want to help keep the data safe. I'd say it should be no more than 15 GB. Knowing what block someone placed, when, and where or if they used the Grinch smiley isn't all that important. It might be fun if someone made a "player" for it, showing everything as it happened. But did you store the timing between each message to do even that?

I agree that 107GB is still very large, and even if I could compress it down to 1GB, nobody would be able to do anything with it because it would have to expand to 1.5TB in order to be read. What I could do is split it into months, or even into days, in gzip (each month compressed is about 10GB and each day compressed is about 250MB, which is manageable). As long as I get enough volunteers to each host a section of the data, the entire data set can be reassembled.
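A minimal Python sketch of the per-day split might look like the following. It assumes each event is a `(unix_timestamp, text_line)` pair, which is a made-up format for illustration, and the `split_by_day` and `reassemble` helpers are hypothetical names; the real recordings may be laid out quite differently.

```python
import gzip
import os
import tempfile
from collections import defaultdict
from datetime import datetime, timezone


def split_by_day(events, out_dir):
    """Write one gzip file per UTC day; return {day: path}.

    Note: this buckets everything in memory, which is fine for a demo.
    A real run over the full data set would need to stream instead.
    """
    buckets = defaultdict(list)
    for ts, line in events:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
        buckets[day].append(line)

    paths = {}
    for day, lines in sorted(buckets.items()):
        path = os.path.join(out_dir, f"{day}.log.gz")
        with gzip.open(path, "wt", encoding="utf-8") as fh:
            fh.write("\n".join(lines) + "\n")
        paths[day] = path
    return paths


def reassemble(paths):
    """Read the daily files back in date order and return all lines."""
    lines = []
    for day in sorted(paths):
        with gzip.open(paths[day], "rt", encoding="utf-8") as fh:
            lines.extend(fh.read().splitlines())
    return lines


# Hypothetical events spanning two UTC days.
events = [
    (1454025600, "join alice"),   # 2016-01-29 00:00:00 UTC
    (1454029200, "say hi"),       # 2016-01-29 01:00:00 UTC
    (1454112000, "left alice"),   # 2016-01-30 00:00:00 UTC
]

with tempfile.TemporaryDirectory() as tmp:
    paths = split_by_day(events, tmp)
    days = sorted(paths)
    restored = reassemble(paths)
```

Because each daily file is an independent gzip stream, a volunteer can host and decompress a single day without needing any of the others, and sorting the file names by date is enough to put the full set back in order.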

Most of the data is move events, and I could remove them (or compress them separately in a numerical format) to see if I can get the file sizes down a bit more. As for the timing between each message, it was recorded; however, a fatal error in my implementation for at least a couple of months means that some timings are off by a lot (when played back, it will look like time is slowing down during high activity and then suddenly speeding up when the activity stops).

Offline
