Good news and bad news.
First off, the bad news.
The server our gallery was on experienced a fairly serious hard drive failure, along with issues rebuilding the RAID array.
Running fsck did not bring back our data, and we had to go into full file recovery mode because we had not yet gotten to the point of making a backup on that server.
We were unable to recover image data to a workable degree.
Much of the image data we did recover showed signs of corruption.
Now that the bad news is out of the way, here is the good news.
We did manage to recover the integration code that allows our gallery to integrate with our user accounts and Flockmod itself.
This allowed us to restart the gallery on a new server.
We now have backups of this code, which should let us restart more quickly should this sort of thing happen again.
We are looking into backup solutions for the images themselves, though given how quickly the data changes and how large it is, that is somewhat harder than backing up the code.
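For the curious, one common way to deal with a large, fast-changing pile of images is an incremental copy that only picks up files that are new or changed since the last run. The sketch below is only an illustration of that idea, assuming the images live as plain files in a directory. The function name and layout are made up for the example, and it is not a description of a backup system we have chosen.

```python
import shutil
from pathlib import Path


def incremental_backup(source_dir, backup_dir):
    """Copy files that are missing from, or newer than, the backup copy.

    Hypothetical helper for illustration only.
    Returns the relative paths (POSIX style) of the files that were copied.
    """
    source = Path(source_dir)
    backup = Path(backup_dir)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        # Skip files whose backup copy is at least as recent as the original.
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(rel.as_posix())
    return copied
```

Because only changed files are copied, a second run right after the first has nothing to do, which is what keeps this manageable as the data grows.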
In short, we had a near total hard drive failure within a week of launching the gallery.
This has caused us considerable downtime in attempting to recover the system.
We did not have backups of the system, as it was brand new.
We will try to provide more consistency and redundancy in the future.
Rest assured that the actual drawing servers are a fair bit stronger than the gallery server.
Murphy’s law took full effect here, though, and we got hit in one of the few places we had not yet strengthened.
We are extremely sorry about the downtime and confusion. Hopefully the gallery’s new home will stave off these issues.
So, I figured I owe everyone an update.
The gallery is down again, and has been for a while now.
There was a misunderstanding on my part of what the host had done.
They did not migrate the repaired drives to a new machine as I had previously thought.
They placed the repaired drives back into the same box, which came online and started doing data checks in the background but immediately started having problems writing to the drives.
They believed the issue was related to the data checks (an array rebuild) failing under the load of the websites.
They took the OS offline to do the array rebuild without any load.
Then, the array rebuild failed with no load.
They are now running a final test on the drives to verify they are working correctly.
If they are, then the problem most likely lies with the RAID controller itself.
They will then replace the RAID controller in the server, and rebuild the array one more time.
If they allow the OS to run while it rebuilds, we might be up in 3-4 hours.
If they don’t, it’s gonna be another 10-12 hours.
We are deeply sorry about the issues. Hopefully this gets sorted out soon.
So, it turns out that the machine we were running the gallery on was unable to remain online long enough to properly offload the data.
Our host has performed a ddrescue to recover the data and then migrated the machine.
It’s taken a while, but we are now off that godforsaken failing server and onto a brand new box!
This should allow us to run much more smoothly now, and not go offline every 5 hours!
Hopefully this is the end of the gallery server nightmare.
Our gallery is back up for the moment!
Our host is still doing some analysis on the machine to see exactly what went wrong and strengthen against it in the future.
But I figured I’d at least post something in the meantime.
This article is going to be mostly for the tech geeks among us, but feel free to read along if you are just curious what’s been going on with the gallery or what we experience behind the scenes.
We had three major issues that caused our gallery to experience downtime, slowness, and errors for the past few days.
The first issue was that the driver we were using for our hard drive was, more or less, in legacy mode.
This caused uploads to lag randomly and error out at times, along with slow image and thumbnail loads.
We became aware of this problem rather quickly, and had it ironed out within about 36 hours.
This one is on us; I should have been more aware of the difference in speed between the two modes.
I am acutely aware now, and this should not happen again.
The second issue we ran into was a failure in our host’s RAID array.
This caused complete and total downtime of the gallery, with 502 Bad Gateway error messages when you tried to go to it.
The RAID array manages the hard drives; if anything is wrong with it, it shuts down immediately to prevent data corruption.
The process to bring the RAID array back online takes them a couple of hours at minimum.
Unfortunately, this has happened twice in three days, and three times in two weeks.
They are currently investigating the RAID array to try to remedy the problem.
Honestly, this may happen again sometime in the next week or two. I am not convinced they have it stabilized yet.
We are hoping they can keep it under control though.
The third and final issue we’ve run into was a glitch in our host’s internal networking.
This has caused the on-and-off lag and long page loads in the gallery.
We have an internal network to send data between our servers in the datacenter.
For some reason, the machine that runs the gallery is having difficulty reaching our database via the internal network.
It has intermittent periods of extremely high connection latency (lag).
It took us quite a while and a lot of back and forth with our host’s tech team to narrow it down to the internal network as the culprit.
We managed to re-route the connection so it bypasses the bad network, and it appears to have fixed the latency issues.
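For the tech geeks: intermittent latency like this is easiest to spot by timing connections repeatedly rather than once. Here is a rough sketch of that kind of check, assuming nothing more than a host and port that accept TCP connections. The function names are made up for this post, and this is not the exact tooling we or our host used.

```python
import socket
import time


def connect_latency_ms(host, port, timeout=2.0):
    """Return how long a TCP connection to host:port takes, in milliseconds.

    Returns None if the connection fails or times out.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:
        return None
    return (time.perf_counter() - start) * 1000.0


def sample_latency(host, port, samples=5, timeout=2.0):
    """Take several measurements; report the slowest one and the failure count."""
    results = [connect_latency_ms(host, port, timeout) for _ in range(samples)]
    ok = [r for r in results if r is not None]
    return {
        "worst_ms": max(ok) if ok else None,
        "failures": len(results) - len(ok),
    }
```

Running the same sampling over two different routes to the database and comparing the worst case is one simple way to tell whether a particular network path is the problem.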
So that’s where we are currently at.
It’s been a long couple of days for us here trying to keep up with these bugs and issues that have been popping up.
Things seem to be calming down a bit for now, but I could write another one of these tomorrow saying the exact opposite.
We will keep you posted, enjoy the improvements for now.
Hopefully things stay stable and we can move forward with other features soon.
Hotfix 11.07B is out.
This should allow users to save locally automatically if the gallery is down.
It was supposed to be in 11.07A but that feature was broken.
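The idea behind the local save fallback is simple: try the gallery first, and if that fails, write the drawing to disk instead. The snippet below is a minimal sketch of that pattern and not the app's actual code. The `upload` callable and the function name are hypothetical stand-ins.

```python
from pathlib import Path


def save_drawing(image_bytes, filename, upload, local_dir):
    """Try the gallery first; if that fails, keep a local copy instead.

    `upload` is any callable that sends the bytes to the gallery and raises
    an OSError (for example ConnectionError) when the gallery is unreachable.
    This interface is an assumption made for the example.
    Returns "gallery" or "local" to say where the drawing ended up.
    """
    try:
        upload(filename, image_bytes)
        return "gallery"
    except OSError:
        # Gallery is down or unreachable: fall back to a local file.
        target = Path(local_dir) / filename
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(image_bytes)
        return "local"
```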
Which brings me to my next point:
Our gallery has been down a bunch!
Not to mention slow when it is up.
Don’t think we haven’t noticed.
It’s even down right now, which is why I am pushing this update.
Our host has taken notice and is aware that the machine we are on is having issues.
They are working to bring it back up, and will likely be performing some maintenance soon to try to prevent this in the future.
We will continue to work hard to support this gallery launch.
Please bear with us as we iron out the bugs!
A hotfix is being released:
This hotfix fixes some save glitchiness, preventing multiple saves to the gallery at once and providing a useful popup when a user tries.
(EDIT: this hotfix had a failed launch during the early half of the day; it is not currently out)
(Hotfix was re-released successfully about 3-4 hours after failure)
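For anyone wondering what "preventing multiple saves at once" looks like in practice, the usual pattern is a guard that refuses a second save while the first is still running, and hands back a message to show the user. This is a generic sketch of that pattern, not the app's real code. The class name and message text are made up.

```python
import threading


class SaveGuard:
    """Lets only one gallery save run at a time (illustrative sketch)."""

    def __init__(self):
        self._lock = threading.Lock()

    def try_save(self, do_save):
        """Run do_save() unless a save is already in progress.

        Returns (True, result) when the save ran, or (False, message) when
        it was refused; the message is what a popup could show the user.
        """
        # Non-blocking acquire: refuse right away instead of queueing up.
        if not self._lock.acquire(blocking=False):
            return False, "A save is already in progress, please wait."
        try:
            return True, do_save()
        finally:
            self._lock.release()
```

Refusing the second save, rather than queueing it, is what avoids duplicate uploads when someone clicks save several times in a row.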
Flockmod 11.07 is out!
Improved custom brush lag in rooms at the expense of more lag for the client who uses it when they load it.
In other words, custom brushes will lag rooms less, but will lag the users who load them more when they load them.
Fixed issues with bad custom brush lag while drawing with a complex brush and a tablet.
Fixed major lag issues when you change size, alpha, or blur with a custom brush.
Fixed angular and incremental mode brushes, also fixed random mode brushes not syncing properly.
Custom brushes can now be disabled by the room owner.
Added status messages with /status <message>
Added client-side chat ignore with /ignore <user> (or you can right click their name in the userlist and click ignore).
Use /unignore <user> or right click unignore in the userlist to unignore.
/unignore without a user will unignore everyone you have ignored.
Alpha in color preview can now be toggled on or off in the config settings.
Gallery added to the site and integrated with flockmod’s save.
You can turn this off in settings if you don’t want to use the gallery.
Help section added to the site and the app.
News section added to the site and the app.
Link to home page added to the app.
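To make the /ignore and /unignore behavior in the list above concrete, here is a small sketch of how a client-side ignore list can work. It is an illustration only, not the app's code: the class and method names are invented, and treating usernames as case-insensitive is an assumption.

```python
class IgnoreList:
    """Client-side ignore list driven by /ignore and /unignore (sketch)."""

    def __init__(self):
        self._ignored = set()

    def handle_command(self, line):
        """Apply an /ignore or /unignore command; return True if handled."""
        parts = line.strip().split(maxsplit=1)
        if not parts:
            return False
        command = parts[0].lower()
        # Assumption: usernames are compared case-insensitively.
        user = parts[1].strip().lower() if len(parts) > 1 else None
        if command == "/ignore" and user:
            self._ignored.add(user)
            return True
        if command == "/unignore":
            if user:
                self._ignored.discard(user)
            else:
                # No user given: clear the whole list.
                self._ignored.clear()
            return True
        return False

    def should_show(self, sender):
        """Messages from ignored users are hidden on this client only."""
        return sender.lower() not in self._ignored
```

Since the list lives on the client, ignoring someone only changes what you see; nothing changes for the ignored user or anyone else in the room.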
We hope you’ll enjoy the changes!
Well, it’s been an exciting couple of months over here on the Flockmod team!
Since we haven’t really had a chance to recap what’s happened lately, I’ll try to give a brief overview of the main changes since v11.
We finally released Flockmod v11 at the start of October.
The initial release brought tools like layers, alpha (transparency), blend modes, new brushes, and more hotkeys.
Features like room listings, board loading, and an inbox for messages were also added.
Tons of code was worked on behind the scenes, and much of the stuff we had put off for years was checked out and refactored.
It was, undoubtedly, our largest release to date.
After the initial release, we put out 11.01 a week later.
It was mostly a bucket of bugfixes for things that came out after launch.
We also made zoom a little more granular though, and made the system a bit more stable during the connection phase.
About a week and a half after that, we put out 11.02.
11.02 brought flexible undo settings, the harderaser to erase all three layers at once, and a limit on chat message size and inbox messages.
We also cleaned up the graphics on some of the dialogs on the app.
11.03 was released on October 23rd.
We added a feedback button on the main room dialog screen, and linked it with our ideas page.
We also did various bug fixes, including fixing a bug that froze the app if you set undo to save too often.
11.04 was released on November 14th.
We updated save so it does not save hidden layers, and fixed hotkeys breaking in certain room types
Added right-click save to undo and layer previews
Separated chat inbox and PM sounds
Added alpha to color previews
Added a better right-click menu for the userlist
Added user coloring in chat and the userlist
11.05 was released on the 23rd of November.
This was a largely tablet focused release.
We released our tablet app that adds pressure sensitivity for drawing tablets.
We also added a shift-enabled shortcut tool, aimed at tablet users but useful for mouse users as well.
We also updated alpha to be 0-100% rather than 0-255.
It got two hotfixes (11.05b/c) while we worked on our secret bell pepper brush for the next official version.
The hotfixes included adding a max size slider for tablet users to quell pressure over-sensitivity, a moderator chat, and removal of the alpha in color previews (due to popular demand).
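The alpha change mentioned above (0-100% instead of 0-255) comes down to a simple rescale. Here is a quick sketch of the arithmetic; the function names are made up, and the rounding choice is an assumption rather than what the app necessarily does.

```python
def alpha_byte_to_percent(alpha_byte):
    """Convert a 0-255 alpha value to a whole percentage (0-100)."""
    if not 0 <= alpha_byte <= 255:
        raise ValueError("alpha must be between 0 and 255")
    return round(alpha_byte * 100 / 255)


def alpha_percent_to_byte(alpha_percent):
    """Convert a 0-100 percentage back to a 0-255 alpha value."""
    if not 0 <= alpha_percent <= 100:
        raise ValueError("alpha must be between 0 and 100")
    return round(alpha_percent * 255 / 100)
```

Note that the round trip is not exact for every value, since 101 percentage steps cannot cover all 256 byte values.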
11.06 (our most recent release as of this writing) had only one change to speak of.
Custom brushes! We implemented support for GIMP brushes in this release, opening the door for a whole new array of user-driven tools and brushes!
We really feel this, along with other recent drawing developments, is moving us toward a more flexible, useful, and professional app.
So now you’re all caught up!
Hope you enjoyed the quick rundown.
We will try to provide more information about the updates as we go on where possible from now on.
I realize that we have had issues with delivering a clear message about changes between versions and the overall goings-on at the site.
This will mark the start of our attempt to deliver reliable centralized updates to you guys on our new platform.
Wish us luck!