Do you think I could just leave this part blank and it'd be okay? We're just going to replace the whole thing with a header image anyway, right?
Move events are unnecessary: not only could they be deprecated, but years' worth of move events can be generated in a day.
If someone were to make an AI (think NEAT, NeuroEvolution of Augmenting Topologies), they could generate tons of datasets simply by using time and/or coins as a fitness variable.
The only potentially valuable data is init, godmode, chat, trophy and saved room packets, essentially anything that can't be easily computed in the future.
There are large privacy implications with storing chat logs in the first place, regardless of them being in publicly available rooms.
The godmode would be essential for knowing who the owners' friends are by seeing who they grant access to the most, and in what timespan and worlds, etc.
I'd think anonymised friendship graphs would be neat. Though, with anything nowadays, you could just simply request an admin to make it for you.
Though, it would be quite neat to have a repository of Everybody Edits related material.
For this, I would recommend something like BT Sync (if only it weren't proprietary...), which is highly related to torrents, only with the addition of being synchronised in realtime.
In any case, storing a large amount of data has its complications.
When it comes to replicating data using torrents as a means of backing up, you have to be aware that torrents aren't synchronised with realtime data by default, so data redundancy is a factor as well.
That being said, nobody is going to allocate tons of digital storage space and bandwidth for things they don't use or find useful.
*u stinky*
Offline
lrussell wrote:By base set I mean the most important data. 107 GB is still out of reach for people who want to help keep the data safe. I'd say it should be no more than 15 GB. Knowing what block someone placed, when, and where or if they used the Grinch smiley isn't all that important. It might be fun if someone made a "player" for it, showing everything as it happened. But did you store the timing between each message to do even that?
I agree that 107GB is still very large, and even if I could compress it down to 1GB, nobody would be able to do anything with it because it would have to expand to 1.5TB to be read. What I could do is split the data into months, or even into days, in gzip (each compressed month is about 10GB and each compressed day is about 250MB, which is manageable.) As long as I get enough volunteers to each host a section of the data, the entire data set can be reassembled.
Most of the data is move events, and I could remove them (or compress them separately in a numerical format) to see if I can get the file sizes down a bit more. As for the timing between each message, it was recorded; however, a fatal error in my implementation for at least a couple of months means that some timings are off by a lot (when played back, it will look like time is slowing down during high activity and then suddenly speeding up when the activity stops.)
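To make the "split into days and drop move events" idea concrete, here's a rough sketch. The record layout is purely an assumption for illustration (dicts with a hypothetical `type` field and a `ts` Unix timestamp); the real archive format isn't described in this thread, and the function names are made up.

```python
import gzip
import json
from collections import defaultdict
from datetime import datetime, timezone


def split_by_day(events, drop_types=("move",)):
    """Group events into per-day buckets (UTC), skipping unwanted types."""
    buckets = defaultdict(list)
    for event in events:
        if event["type"] in drop_types:
            continue  # e.g. move events, which make up most of the data
        day = datetime.fromtimestamp(event["ts"], tz=timezone.utc).strftime("%Y-%m-%d")
        buckets[day].append(event)
    return dict(buckets)


def compress_day(day_events):
    """Serialise one day's events as JSON lines and gzip them in memory."""
    payload = "\n".join(json.dumps(e) for e in day_events).encode("utf-8")
    return gzip.compress(payload)


# Hypothetical sample records, not real data.
sample = [
    {"type": "move", "ts": 1448928000, "user": 1},
    {"type": "init", "ts": 1448928060, "user": 1},
    {"type": "godmode", "ts": 1449014400, "user": 2},
]

buckets = split_by_day(sample)
archives = {day: compress_day(evts) for day, evts in buckets.items()}
```

Each entry in `archives` could then be handed to a different volunteer, and the full set reassembled by decompressing the days in date order.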
Do the month thing. I like month things.
"Sometimes failing a leap of faith is better than inching forward"
- ShinsukeIto
Offline
Combining a few ideas together from the community, the mirrors (hosters) could just host the last three months of data, and then the rest lives on archive.org. Three months of data will be about 14GB, which is much less than 150GB. If those files are still too large, I can do what Atilla and lrussell are saying, which is to provide a base set with only the essentials.
Right now I'm sanitizing the data and compressing it on ultra 7z settings, and I'll let you know once the data is available for download.
EDIT: I remembered that open worlds have not been recorded from about six months ago until now, so there might be an inconsistency if you see them in the lobby data.
- also the lobby data is very, well... granular, as it was recorded in chunks of 30 minutes and contains only user ids and world ids.
Offline
I bet you have some kind of google glass and record every moment of your life.
Or you are more interested in others?
Recorded on a minute-to-minute basis though
Thank you eleizibeth ^
I stack my signatures rather than delete them so I don't lose them
Offline
Later today we'll be releasing the first full data archive, from December 2015, via torrent, and if no issues are detected it will be accessible to everyone (I'll update this post with a magnet link.) The smaller base archives will take some time to generate, as it takes around 2 hours for each event-removal pass (6 hours for sanitization.)
Offline
Later today we'll be releasing the first full data archive, from December 2015, via torrent, and if no issues are detected it will be accessible to everyone (I'll update this post with a magnet link.) The smaller base archives will take some time to generate, as it takes around 2 hours for each event-removal pass (6 hours for sanitization.)
Leave out the chatlogs. We've expressed disapproval of this before: http://forums.everybodyedits.com/viewto … 35#p556035
No u.
Offline
Leave out the chatlogs. We've expressed disapproval of this before: http://forums.everybodyedits.com/viewto … 35#p556035
Chat logs aren't going to be included. Sorry, I meant to say that the "full" set is the sanitized one, and the base one is even more condensed.
EDIT: magnet link available for the 2015-12 data:
magnet:?xt=urn:btih:45d8d6ee3877011960c5346f4eaf7273a9ec6cd1&dn=2015-12&tr=udp%3a%2f%2ftracker.coppersurfer.tk%3a6969&tr=udp%3a%2f%2ftracker.leechers-paradise.org%3a6969&tr=udp%3a%2f%2ftracker.openbittorrent.com%3a80&tr=udp%3a%2f%2fopen.demonii.com%3a1337
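For anyone curious what is actually inside that magnet link, here's a small sketch that pulls it apart with Python's standard library. The `parse_magnet` helper is just an illustrative name, not part of any torrent client.

```python
from urllib.parse import urlparse, parse_qs

MAGNET = (
    "magnet:?xt=urn:btih:45d8d6ee3877011960c5346f4eaf7273a9ec6cd1"
    "&dn=2015-12"
    "&tr=udp%3a%2f%2ftracker.coppersurfer.tk%3a6969"
    "&tr=udp%3a%2f%2ftracker.leechers-paradise.org%3a6969"
    "&tr=udp%3a%2f%2ftracker.openbittorrent.com%3a80"
    "&tr=udp%3a%2f%2fopen.demonii.com%3a1337"
)


def parse_magnet(uri):
    """Extract the info-hash, display name and tracker list from a magnet URI."""
    parsed = urlparse(uri)
    if parsed.scheme != "magnet":
        raise ValueError("not a magnet URI")
    params = parse_qs(parsed.query)  # also percent-decodes the tracker URLs
    xt = params["xt"][0]
    prefix = "urn:btih:"
    if not xt.startswith(prefix):
        raise ValueError("unsupported xt value")
    return {
        "info_hash": xt[len(prefix):],
        "name": params.get("dn", [None])[0],
        "trackers": params.get("tr", []),
    }


info = parse_magnet(MAGNET)
```

The `xt` part is the torrent's info-hash, `dn` is the display name, and each `tr` is a tracker that peers can announce to.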
Offline
I will definitely seed with all my 10kbps speeds.
*u stinky*
Offline
I have an average of 200 KB/s
Everybody edits, but some edit more than others
Offline
Zumza wrote:
I have an average of 200 KB/s
It takes a long time to start up (like 10 minutes or so.) I know that another peer has been downloading the file successfully so far.
It was just a joke.
Also I will seed this file while people are interested.
It seems that the wireless router warmed up and gives me almost 1 MB/s. I'm kinda disappointed that the upload speed is 3 times greater than the download speed.
Everybody edits, but some edit more than others
Offline
We fully understand why chat messages shouldn't be in the data; however, we're considering re-adding the add messages, because otherwise it's very difficult to reconstruct a user based on when their first move event occurs. This means that usernames will be in the data. Add messages are just messages saying that someone has joined a room.
We'd really like them back in, but we're asking whether you'd be okay with that.
Offline
We fully understand why chat messages shouldn't be in the data; however, we're considering re-adding the add messages, because otherwise it's very difficult to reconstruct a user based on when their first move event occurs. This means that usernames will be in the data. Add messages are just messages saying that someone has joined a room.
We'd really like them back in, but we're asking whether you'd be okay with that.
I don't see a problem in having the "add" messages included. It'd be a lot more useful for statistics and rather harmless since chat isn't being recorded.
Offline
Where do you draw the line when it comes to anonymising the data? It should be obvious that some users won't want the time and places they've joined worlds tracked.
*u stinky*
Offline
Where do you draw the line when it comes to anonymising the data? It should be obvious that some users won't want the time and places they've joined worlds tracked.
I draw the line wherever the community says they wouldn't feel comfortable sharing a certain piece of data. For example, the community doesn't want chat messages in there, so chat messages won't be public.
---
November 2015 has been posted, and is free to download.
Offline
For example, the community doesn't want chat messages in there, so chat messages won't be public
public
I'm not one to care about chat tracking but you should have said not tracked and kept it secret; ignorance is bliss
Thank you eleizibeth ^
I stack my signatures rather than delete them so I don't lose them
Offline
We understand that there are some users who feel that their privacy has been reduced when EE Analytics joins their world. If you'd like the bot not to join your world, PM me and I'll give you a form that you can fill out. The form must be completed in order to opt out of EE Analytics (a PM won't suffice.)
Offline
I have to fill out a form to opt-out of an event I never signed up for.
Good one.
Offline
I have to fill out a form to opt-out of an event I never signed up for.
Good one.
Everyone is opted in by default, as it follows internet indexing patterns (i.e. you have to opt your website out of Google.)
Offline
Everyone is opted in by default, as it follows internet indexing patterns (i.e. you have to opt your website out of Google.)
The problem is that most people won't know they're being tracked unless they happen to read this thread.
Or does the bot tell you you're being message stalked?
One bot to rule them all, one bot to find them. One bot to bring them all... and with this cliché blind them.
Offline
Hexagon wrote: Everyone is opted in by default, as it follows internet indexing patterns (i.e. you have to opt your website out of Google.)
The problem is that most people won't know they're being tracked unless they happen to read this thread.
Or does the bot tell you you're being message stalked?
The bot acts like a typical bot would, making it intentionally easy to detect (i.e. it just idles and doesn't send any messages on death or anything.) Because of this behaviour, a large portion of the EE population has detected my bots.
Offline
The bot acts like a typical bot would, making it intentionally easy to detect
Well, there are a lot of cases where it won't be detected when just idling. It's still secretly stalking anyone who doesn't know, and although what they don't know doesn't hurt them, they do have a right to know.
Maybe have it say something like "[EEAnalytics] Connected." and add some messages that explain the bot or link players to this thread. You could add chat commands for people to opt out of tracking for a certain world or time period, and have a vote kick thing like zerk bot had.
I'm not entirely sure I care, but if it's no longer entirely anonymous with "add" being recorded, and chat messages are logged, you should be careful that this doesn't happen again.
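Just to illustrate the chat-command idea, here's a minimal sketch of how an opt-out could be tracked. Everything in it is hypothetical: the `!optout` / `!optin` command names, the `OptOutRegistry` class and the reply text are made up for this example and are not how EE Analytics actually works.

```python
class OptOutRegistry:
    """Tracks which worlds have asked not to be recorded (illustrative only)."""

    def __init__(self):
        self._opted_out = set()

    def handle_chat(self, world_id, is_owner, message):
        """Return the bot's reply to a command, or None if nothing applies."""
        if not is_owner:
            return None  # assumption: only the world owner may change tracking
        command = message.strip().lower()
        if command == "!optout":
            self._opted_out.add(world_id)
            return "[EEAnalytics] This world is no longer being recorded."
        if command == "!optin":
            self._opted_out.discard(world_id)
            return "[EEAnalytics] Recording resumed for this world."
        return None

    def should_record(self, world_id):
        """The bot would check this before storing any event from a world."""
        return world_id not in self._opted_out


registry = OptOutRegistry()
reply = registry.handle_chat("PWexample", is_owner=True, message="!optout")
```

A time-limited opt-out could work the same way by storing an expiry timestamp next to each world id instead of using a plain set.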
One bot to rule them all, one bot to find them. One bot to bring them all... and with this cliché blind them.
Offline
Hexagon wrote: Everyone is opted in by default, as it follows internet indexing patterns (i.e. you have to opt your website out of Google.)
The problem is that most people won't know they're being tracked unless they happen to read this thread.
Or does the bot tell you you're being message stalked?
Being completely honest, barely anybody knows their Facebook page is stalked by Google, Yahoo, etc.
I also do website spidering for a living myself, and you'd be surprised how many sites don't have a robots.txt (a text file that tells bots the site doesn't want to be spidered (indexed)).
Google won't tell every site that it's being spidered; site owners have to find that out themselves and see how to prevent it.
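As a quick illustration of how a well-behaved spider honours robots.txt, here's a sketch using Python's built-in `urllib.robotparser`. The robots.txt content, the `ExampleBot` user agent and the example.com URLs are all invented for the example.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt that asks every bot to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


can_index_public = allowed(ROBOTS_TXT, "ExampleBot", "http://example.com/public/page")
can_index_private = allowed(ROBOTS_TXT, "ExampleBot", "http://example.com/private/page")
```

The file is only a request: nothing stops a bot from ignoring it, which is why a site without one is indexed by default.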
Offline
Hexagon wrote: The bot acts like a typical bot would, making it intentionally easy to detect
Well, there are a lot of cases where it won't be detected when just idling. It's still secretly stalking anyone who doesn't know, and although what they don't know doesn't hurt them, they do have a right to know.
Maybe have it say something like "[EEAnalytics] Connected." and add some messages that explain the bot or link players to this thread. You could add chat commands for people to opt out of tracking for a certain world or time period, and have a vote kick thing like zerk bot had.
I'm not entirely sure I care, but if it's no longer entirely anonymous with "add" being recorded, and chat messages are logged, you should be careful that this doesn't happen again.
We would be more than willing to work with anyone who feels that their privacy is being invaded or that they're being stalked. Feel free (this goes for anyone else too) to PM me if you have any concerns with your world being spidered, and we'll see what we can do (we don't normally reach an agreement here because it might bump the post too many times.)
If there is enough demand, we can put in more controls as to what data is being collected from your world, and how it is published too.
Offline