HyperGrid reported a few days ago on an attack on servers that brought down hundreds of web sites and OpenSim regions. The attack took down simulators hosted on Linux shared virtual servers, leaving grids like Reaction Grid, which runs on Windows servers, immune.
I don’t know enough about this stuff, but it sounds to me like OpenSim itself wasn’t the target, but was swept up in the server take-downs. Regardless, the event left hundreds of regions in melt-down, and owners scrambling to pull back-ups out of inventory to restore the work.
As OpenSim gains traction, the measure of its success will increasingly shift from whether it works (whether your avatars crash when they cross regions or stall in mid-air, flailing as you fly like the good old days of Second Life) to whether the regions are professionally hosted, with the kinds of support and security services that users have come to expect from service providers.
Grids like Reaction Grid have the opportunity to establish a circle of trust with their users because they treat their businesses like businesses – they may still be trying to hold it all together with wire and a lot of late night attention, but expect to see them grow into something resembling a real operation, with hours, and support staff, and security patches. The others, the ones being run out of a garage or basement, will have trouble maintaining the trust of their users, who may have migrated from other Grids and just sort of assume that ‘support’ is somehow paid for in the $20 a month sim fees.
The Strangeness of Stability
In counterpoint to the OpenSim meltdown, this week’s Metanomics was a massive fail.
And what a horrible episode to have it happen – Obama transition team member Kevin Werbach, who still works at the FCC, and Mitch Wagner, who’s a Second Life enthusiast perhaps, but who also believes that the future of mass adoption immersive applications may lie elsewhere, were left chatting away to dead air.
I hardly think we’ll get Obama or Steve Jobs on the show after this one.
Voice went down and stuttered. Linden Lab swooped in to try to fix things, recommended a region restart, half the stage disappeared, and although the audience seemed pretty engaged in chatting amongst themselves on the state of Apple, the iPhone, and politics – the show ended up canceled. Things just wouldn’t work like they’re supposed to.
Now, we haven’t owned Metanomics for that long, and we’ve had pretty much a smooth run for 18 shows or whatever it’s been – but from what I gather this kind of thing used to happen a bit more a year or so ago, which matches my own experience of a Grid that is WAY more stable than it was a year ago, and epically more stable than two years ago.
I can’t remember the last time I lagged out. Even my inventory seems to load moderately faster.
And it’s kind of strange, in a way: lag, crashes, and screaming frustration with inventory loads or trying to pull a texture on the upper face of a prim – those are part of the CULTURE of Second Life (hey, even Boellstorff said so in his book), and the Metanomics fail reminded me of the “good old days” when, with the world crashing down around you, people would still chat, dance, snark, whatever….and when you had managed to get back in again after the third re-log, simply said a quick “WB” and picked up the conversation as if nothing had happened.
I guess the odd thing about stability is you stop thinking about how the damn thing actually WORKS. When it crashes, you suddenly marvel that it works at all: how somehow a bunch of convoluted code manages to bring a couple of hundred people together to listen, watch video, chat, share, friend each other, and wear prim hair.
And with nary a hacker in sight, you’re just THERE, which can be a nice place to be, but sometimes it’s good to be reminded of how grateful I am that there’s a there there at all.


More than a hundred thousand hosts all running one particular software package. A lot of Linux-based servers were likewise immune (they didn’t use this particular piece of commercial software).
On our Treet TV show Designing Worlds we use Skype for audio. It’s a lot more predictable than Voice and although the audio quality is sometimes not as good as one would like, it doesn’t exhibit the erratic behaviour that Voice does and it’s a lot more flexible in the production environment.
Personally, I don’t like putting all my eggs in one basket. If you’re doing a TV show in-world then at least if SL has problems or you crash out, Skype audio is still there; similarly if you’re playing or DJing at an event the music stream will still be there even if you aren’t visible for a couple of minutes.
I am really glad, for example, that streaming audio is left to tried and tested external systems and not to internal Viewer functions beyond a player library.
Dusan, you nailed it: late nights and wire plus a little duct tape keep OpenSim running. My first thought when considering hosting OpenSim was, “we have to do superior support” to make users happy.
As you saw with the Lindens’ response yesterday & our struggle with a recent “Rocking The Metaverse” event, virtual worlds are still working towards the reliability of the 2D web.
I had that same feeling yesterday when I saw your Metanomics team respond with utmost concern & with Linden follow up. I felt that you all cared so much that no matter what things would be worked out soon with such passionate people at the helm.
My wife Robin & I restored a 38 foot wooden schooner when we left the Navy & it was a struggle just to keep her afloat in the early years (one night I had to bail water for 5 hours when we awoke to a flood) but we did it and we ended up with a fine sailing vessel that even million dollar sportfishers would compliment us on.
This taught us a valuable lesson that you have to be persistent and the rewards you seek will come.
We all need to keep in mind the complexity of a virtual world. From servers to client it taxes every aspect of a computer like no other website or Flash or Silverlight application does. It needs a lot of care to run well and this will be true for some time to come.
So Dusan, Linden Lab, OSGrid & the many other virtual world platform developers and end users need to buckle down and keep in mind we’re still early in this game, still laying foundations, still patching leaks to keep our ships afloat, but that things will improve.
I myself will be back to Metanomics & I expect the Obama administration is not so short sighted as to not return. Virtual Worlds are already changing our lives with positive, forward thinking projects & I myself am proud of having peers like Metanomics & LL on the journey with us.
So though the water may rise and the ship may rock a bit some days we continue on because smooth sailing is coming & is well worth the wait and effort.
See you all in 3D - Kyle G, CEO RG
I’m still pleasantly surprised every time I log on and everything works. Having no idea how things actually function, in the back of my mind there always seems to be the possibility that the entire enterprise will collapse into a heap of smoldering rubble at any moment.
It’s been an awfully long time since we have seen the “We’re fixing stuff and breaking things” screen which was a regular feature when I was new. For all the crabbing that goes on about a bit of lag or temporary asset server problems, I’m still delighted that everything works so well and stability has greatly improved in the last two years. It seems to me that the cancellation of a Metanomics show has become noteworthy because it is now unusual in the day to day SL experience.
@Tateru – yes, I should have put more emphasis on that. I fear that a lot of people read the headline “OpenSim hacked” and did not realize that it wasn’t the specific target, but a bystander in a wider take-down.
@Elrik – I agree. We work with Treet as well, and have been through many of the same audio checks/decisions with them. We’ve tested across platforms, in world voice, Skype, etc. We actually have three configurations, one of which includes a guest calling in from a land line (phone! wow!) and the voice being streamed through their avatar. Kind of a smoke and mirrors thing a little but we use it in a pinch when the guest can’t get in world (rarely used, but still).
@Kyle & Corc – wow, you know, your kind words and thoughtful response really brightened my day. Thank you so much for that.
Mitch Kapor can say we’re past the frontier days, but every now and then we’re reminded that we’re still riding a brand new wave, and sometimes the tidal pull is a bit stronger than we expected – but what a ride it is.
reading. don’t think i have anything to add.
( I do know during the exact time of the show yesterday my PC wanted to download and install 13 updates… i dunno if all the PCs in the world suddenly were being pressed to deal with the computer updates.)
“I hardly think we’ll get Obama or Steve Jobs on the show after this one.”
Well, it is still early stage after all. We are experimenting. We are trying to push the boundaries of communication and connection and make geography collapse.
I cannot believe Obama or Jobs would not find this interesting and worth at least a try.
I am unsure why you put the opensim and metanomics topics in the same blog, Dusan. Ok, they are co-located in time and they both involve examples of virtual worlds breaking, but the differences make the juxtaposition confusing.
It looks like the hacking story involved OpenSim only as collateral damage – the server hosts were really at the center of this topic.
And yesterday’s metanomics fail… after missing one after the other this was my first to attend live (albeit on the webcast) and I had plenty other work to occupy me as I watched the lighthearted chatbridge traffic scrolling from top to bottom. I have to admit I was reminded of why I avoid video teleconferences… it seems at least 20 or 30 minutes are required to dink around and get things working while the meeting is stalled… (granted, i am referring to the prehistoric system where i work) …and I have to wonder if it is really worth it.
Of course, it makes sense to ‘walk the walk’ with metanomics and explore the technology while talking about the technology. That said, I am glad I didn’t follow up on my initial thought to call in colleagues when I was starting to watch the metanomics presentation. That would not have been a good first impression for doing meetings in VW.
It sounded to me that Mr. Werbach took the glitches in stride. Most tech savvy people learn that one of the most important skills is to tolerate frustration or else move into another line of work.
I’m sorry this had to happen with such a high-profile visitor — ouch, you guys don’t deserve that at all. I hope you will get someone from the administration soon.
Yes, Skype and TalkShoe or some other service like that are good backups.
Re: OpenSim, we would never know this had happened to this overhyped HypeGrid if you hadn’t reported it. Thanks! And all the scrambling to justify it from Tateru doesn’t cut it — it’s an awful collapse, of the kind that SL hasn’t seen since 2004 when the Lindens actually decided to monetarily compensate people in Linden dollars for 2 days of no service.
And…we don’t know that in fact it wasn’t the target. Why assume that?
I had the worst attack of inventory loss I’ve ever had in SL this week. I had my boards and posters and rezzers set up to go put in the installation at the LandExpo when whoops, half my inventory disappeared. This happens all the time, my inventory fluctuates daily from 20,000 to 24,000 items, and I even keep items stashed inworld like a squirrel, but this was now down to 12,000, and losing precisely every single thing I had accessed in the last 48 hours.
I tried everything to flush it all back, even trying a new trick which involves logging on to aditi and then to agni a few times, which forced in some of it, but a lot of my house rezzers are still gone, which amounts to hundreds of US dollars.
Having to hastily make a kind of scrappy display at LandExpo after that, I felt pretty hosed. Walking around the Expo like a fly crawling in molasses, with the sims being reset by Lindens and people bitching in the group, I gave up.
And I, too, realized that I never try to “work in SL” or “on a schedule”. I’ve made my small business there completely adaptable with all kinds of redundancies, workarounds, tactics, feints, etc. to try to deal with the never ending chain of bad mojo — builds lost from sims inexplicably, rental boxes resetting their times on sim restarts, scripts being turned into mush by some new patch, etc. etc. I keep four different kinds of rental boxes going, I have land spread all over on 60 sims so if one is down, I can send a customer to another, etc. etc.
And all of this is sort of “dysfunctional” because when I have to try to do something like “I would in real life,” i.e. make a plan, make a chart, perform it on schedule, it turns into hash.
I really have a new respect for people like you or Fleep Tuque or others who try to “work like we do in real life” with what we call “plans” and “schedules”.
Does it ever get better?
always seems like important meetings and demos using computers and then the web haven’t worked on schedule in 20 years:)
Just another episode of the holodeck breaking? or is there something deeper not being learned after 20 years.
more real than real? hopefully not.
glad my os sim is on ReactionGrid.:) keep it running Kyle:)
As I think about this some more, I’m more and more convinced that Linden Lab’s severe problems could be symptomatic of a major hack, too. They never, ever tell when they are hacked anyway. They make a policy of never, ever admitting that when sims are crashed it is deliberate. So I think this question definitely needs to be asked.
We were working in the background with Treet.TV throughout the show to try to handle the many issues we encountered. There were far more things going on than just voice problems. We do have a Skype fall-back available for voice, and have used it several times on shows in the last couple of months (maybe no one noticed when we did, which would be a good thing).
The Treet.TV cameras and several key avatars on-set were experiencing repeated disconnects or crashes, to the point where we could not keep a solid enough video feed nor keep the guest inworld to consistently run the show.
There was something quite wrong with the region that several restarts did not resolve, including a problem that would generate an error any time any avatar would try to rez any attachment (and the attachment would fail to load). Collectively, the number of issues, and the distributed nature of the problems, got to be enough that we opted to cancel and attempt to reschedule the show.
At this point, through time inworld and with Concierge, I believe that we have the region back to a reasonable operating profile, and hope that for next week’s show we’re back to smooth operation.
Now that the region has had a thorough smackdown, it’s actually running more smoothly than we’ve ever seen it – including during the period when we were putting the new studio build together. I’m thinking at the moment that there was something wrong with it over the long term that was slowly degrading, and that hopefully now we’ll be back in a good position for next week’s show.
Thank you all for your support!
Joel Savard
@Prokofy Deliberate interference is always a possibility, however given the number of behavioral issues we have now resolved, I think it’s entirely possible for the problems we saw yesterday to have been fully the result of non-malicious system stability issues. We will be watching closely going forward…
Joel
Dusan –
Thanks for the mention. The HyperVM hack took down over 100,000 websites — and just over 100 OpenSim regions — so it seems pretty clear that the OpenSim regions were just collateral damage, hosted on the affected servers. Unless it was a really malicious hacker who specifically wanted to destroy a few homesteads, office buildings and an up-and-coming convention center.
And covered it up by taking down all the other stuff as well.
However, most OpenSim regions stayed up. OSGrid alone has over 2,000 regions. ReactionGrid is fully up. All the home-based regions were unaffected, of course, since they don’t use the virtualization software that was hacked. And many regions had full off-line backups, such as Simon Gutteridge’s regions on PioneerX, and will be fully recovered as soon as the servers are back up.
I love the degree of control OpenSim (and similar grid servers) offer their operators. You can host at home, on a cheap shared server, on an expensive, high-end server, or on the Amazon cloud, if you wanted to. A friend of mine is talking about setting up an OpenSim-based conference center using Sim-OnDemand’s Amazon Cloud hosting service. That’s less than $10 for 10 hours of use… get the region all set up, then activate it when you need it, shut it down when you don’t. I’ll be interested in trying it out if the project gets going — and I’m sure there will be several like it pretty soon.
- Maria, Hypergrid Business
There’s a sad coda to this story, first picked up as a rumor by Tateru, but confirmed online:
http://www.theregister.co.uk/2009/06/09/lxlabs_funder_death/
“The boss of Indian software firm LxLabs was found dead in a suspected suicide on Monday.
Reports of the death of K T Ligesh, 32, come in the wake of the exploitation of a critical vulnerability in HyperVM, a virtualization application made by LXLabs, to wipe out data on 100,000 sites hosted by the UK web hosting firm VAserv.”
Oh, the death was confirmed at the time of writing — whether it is suicide or not, I believe, remains unconfirmed until the coroner’s report becomes public. That was the only part of the statement I was really in doubt about.
At least LL responded, I am mostly left twisting in the wind.