Microsoft’s Photosynth wowed them at TED, and now the tech demo is live, letting you play with the same environments (and a few extras) they showed then.
Words fail me. It really is an astonishing, brilliant and enriching experience. Forget Papervision for now. Forget Google Virtual Earth. This is a taste of the future. You can almost imagine being able to fly through the environment with an avatar. To drill down through the photos to see who took them. To link them to geotags. To layer a semantic Web on top.
For all the talk of the metaverse roadmap, augmented reality, and information shadows, it’s often hard to picture what it will look like. Well, this is pretty close to where I imagine we might be headed: culling photos from public databases (or your own), mapping them to three dimensions, flying through the architecture they create. Add layers of information on top, an avatar, and (once processing power increases) multiple users, and the mirror world has truly arrived. (Photos after the fold)
If you have the processing power and (grrr) a PC, check it out.
(P.S. I had to shut down other applications, Second Life included, to free up enough processing power, and I’m on a pretty heavy machine.)
From the Photosynth site:
The Photosynth Technology Preview is a taste of the newest - and, we hope, most exciting - way to view photos on a computer. Our software takes a large collection of photos of a place or an object, analyzes them for similarities, and then displays the photos in a reconstructed three-dimensional space, showing you how each one relates to the next.
In our collections, you can access gigabytes of photos in seconds, view a scene from nearly any angle, find similar photos with a single click, and zoom in to make the smallest detail as big as your monitor.
Pais looks up after diving into Photosynth and flying around, leans back, gets a 10K km stare, and strokes his chin…
It looks like they are creating some derived geometries of camera and target position, but I am not sure how robust that is. They may be relying more on stitching together the photos to pull things into a topological framework.
The reason I’m thinking this is that I’m wondering how to map things to real-world coordinates. Maybe if the framework were augmented with something like LIDAR (http://en.wikipedia.org/wiki/Lidar), deployed a bit like Google Street View (http://en.wikipedia.org/wiki/Google_street_view), so that the textures from images are mapped onto a real-world 3D coordinate framework, we’d have the basis for really creating an immersive virtual map of our spaces. A tricky bit will be dealing with the temporal nature of the photos, since it may be troublesome to see the same scene stitched together from one shot with trees fully leafed in summer and another with snow drifts.
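To make my guess a bit more concrete, here’s roughly the kind of thing I imagine the photo-matching step looks like: find shared features between two overlapping photos and recover where the cameras were relative to each other. This is definitely not Photosynth’s actual code, just a rough sketch with OpenCV; the filenames and the intrinsics matrix are made up.

```python
import cv2
import numpy as np

# Two overlapping photos of the same scene (hypothetical filenames).
img1 = cv2.imread("facade_a.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("facade_b.jpg", cv2.IMREAD_GRAYSCALE)

# Detect and describe local features in each photo.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Match descriptors between the two photos and keep only the clearly better matches.
matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

# A made-up camera intrinsics matrix (focal length and principal point in pixels).
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 480.0],
              [0.0, 0.0, 1.0]])

# Recover the relative rotation and translation between the two camera positions.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
print("relative rotation:\n", R)
print("relative translation (only defined up to scale):\n", t)
```

Notice the translation only comes out up to an unknown scale, which is exactly why something external, LIDAR, a GPS tag, or a known measurement, would be needed to pin the whole reconstruction to real-world coordinates.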
Taking a different tack… I was looking at the artist’s studio collection in Photosynth and was thinking that if I could “see” stuff in Second Life like that, it might even be worth going to the various galleries and museums (or even stores). Pais gets tired of waiting for things to rez around him.
Gosh, Pais, I wish I were as technical as you. But it strikes me that what’s happened here is that the people taking the photos are themselves the ones tagging that the image is what they say it is. Interesting that in the TED demo he shows how one photo was actually a photo of a photo! Talk about crowdsourcing content!
While I imagine that it would be good to get accurate measures of location based on some sort of computer cross-match to geographies, I also wonder whether that doesn’t end up getting solved in the long run by the photographers themselves. Based on how they put this together, they’ve been able to create a 3D object out of different photos taken at different times and have been able to compute the exact location of each camera in reference to that object. Doesn’t it just take one of those photographers to also tag the photo with a GPS position?
So, for example, even though it’s Google, sync Photosynth up with something like their GPicSync application:
http://code.google.com/p/gpicsync/
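To give a sense of how little it takes on the reading side, here’s a rough sketch (using the exifread Python library; the filename is hypothetical) that pulls the latitude and longitude a GPS-aware camera or a tool like GPicSync has written into a photo’s EXIF block:

```python
import exifread

def dms_to_degrees(tag, ref):
    # EXIF stores coordinates as degree/minute/second rationals.
    d, m, s = [float(v.num) / float(v.den) for v in tag.values]
    degrees = d + m / 60.0 + s / 3600.0
    return -degrees if ref in ("S", "W") else degrees

# Hypothetical geotagged photo.
with open("holiday_photo.jpg", "rb") as f:
    tags = exifread.process_file(f)

lat = dms_to_degrees(tags["GPS GPSLatitude"], str(tags["GPS GPSLatitudeRef"]))
lon = dms_to_degrees(tags["GPS GPSLongitude"], str(tags["GPS GPSLongitudeRef"]))
print("photo was taken at", lat, lon)
```

Which is pretty much the point above: if even a few of the photographers tag their shots this way, the rest of the reconstruction can be anchored to those known positions.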
You bring up the intriguing idea that the photos themselves can stitch together an actual “textured build”. On that front, I start to wonder whether you don’t end up with a mash-up somewhere down the road:
- For an advanced version of a “build”, use something like Stanford’s 3D camera:
http://news-service.stanford.edu/news/2008/march19/camera%20-031908.html
- Map the underlying skeleton of the building using something like Photosynth (which also gives you a deep source of information artefacts)
- Combine the two and import into 3DVIA, SketchUp, etc. for some more “polished” work
- Probably script some stuff in here while we’re at it, using, I’m sure, LSL
- Load it up to a Google Street View type of thing.
It’s the combination of technology with user-generated inputs that’s so interesting. After all, you can spend 100 years running around with a 3D camera or creating an algorithm to assess geo-positions, or you can count on people out there taking photos anyway and let them tag them with a GPS-enabled iPhone, for example.
*head spins*
I gotta lie down I think.
Geotagging from cell phones…
http://digitalurban.blogspot.com/2007/04/how-to-geotag-photographs-from-nokia.html
Oh, and one more app, just cuz I’m turning this page into a little bit of a personal bookmarking thing for future reference
(Excuse the self indulgence)
http://digitalurban.blogspot.com/2007/10/google-earth-photooverlay-download-and_25.html
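And to close the loop on the bookmarking: once a photo has coordinates, getting it into Google Earth is just a matter of writing a bit of KML. Here’s a minimal hand-rolled sketch of a PhotoOverlay; the coordinates, viewing angles, and photo URL are all placeholder values:

```python
# Minimal KML PhotoOverlay generator; all values below are placeholders.
KML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <PhotoOverlay>
    <name>{name}</name>
    <Camera>
      <longitude>{lon}</longitude>
      <latitude>{lat}</latitude>
      <altitude>10</altitude>
      <heading>0</heading>
      <tilt>90</tilt>
      <roll>0</roll>
    </Camera>
    <Icon>
      <href>{photo_url}</href>
    </Icon>
    <ViewVolume>
      <leftFov>-25</leftFov>
      <rightFov>25</rightFov>
      <bottomFov>-16</bottomFov>
      <topFov>16</topFov>
      <near>10</near>
    </ViewVolume>
    <Point>
      <coordinates>{lon},{lat}</coordinates>
    </Point>
  </PhotoOverlay>
</kml>
"""

def photo_overlay_kml(name, lat, lon, photo_url):
    """Return a KML document string that Google Earth can open directly."""
    return KML_TEMPLATE.format(name=name, lat=lat, lon=lon, photo_url=photo_url)

if __name__ == "__main__":
    kml = photo_overlay_kml("Artist's studio, photo 1", 51.5074, -0.1278,
                            "http://example.org/photos/studio_01.jpg")
    with open("studio_overlay.kml", "w") as f:
        f.write(kml)
```

Opening the resulting .kml in Google Earth flies you to the camera position and drapes the photo in front of you, which is a crude, one-photo version of what Photosynth does for a whole collection.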
It’s very “styley” in interface and interactions… but beyond a “photo album” style of map, I don’t see the utility.
The sense of place is marred by the static nature of the still photography. One feels no sense of the immediate, which is part of immersion.
Even after adding avatars as the mediating interface, the app, like Papervision’s (2D Flash) limited “fully 3D” navigation at a constant presentation level, feels more presentational than confrontational, which is a key factor for in-scene immersion psychologically.
I think it’ll just end up as another tech demo for the portfolio.
BTW, the whole 3D map / mirror world meme is way overblown. Consider it the video phone circa AT&T 1970. :)
Time will tell.
c3
Sure, Larry - and I don’t think Photosynth is meant to be anything more than a presentation interface for photographic artefacts. At least, they’re not advertising it as anything more. All these tools and geotags and mappings of 3D models from photos seem to be gaining traction, which, when combined with 3D modeling in SketchUp and the like, does make it seem like new types of representation of data where geo-position is important are arriving.
The work out of London on city planning and urban issues is a great practical use for this stuff, avatars or not, and is really compelling to look at, if not necessarily to participate in.
I’ve written elsewhere (who knows which post it was, I don’t tag very well) that my interest is NOT in augmented reality or mirror worlds. I’m far more interested in immersive environments. I’m also of the opinion that we can find clues for how these immersive environments will evolve by looking at what other tools and visual languages are evolving in related fields.
I DO think Photosynth, or at least the idea of photographic artefacts layered over 3D spaces, when combined with some of the other technologies that are evolving, does suggest that the mirror world will be far richer than a 3D version of a bunch of buildings, that it will have layers of data far different from how we think of information on “flat” Web pages, and that as these forms of visualization expand and improve and eventually add avatars, we’ll be opening up a new language for conceptualizing space and content.
My interest, however, as I say, is in extrapolating to immersive worlds, because it’s there that the deep sense of presence and, as you say, confrontation not only allows us to access these deeper models for information display, but to combine them with avatar expression, identity, immersion, and emotion - and it is THIS combination that will open the door to a paradigm shift that hopefully won’t go the way of the video phone (which, by the way, Cisco is making every attempt to resurrect, hehe).
Well, the video phone has been around now for a decade, as the webcam… The main “problem” it solved was timeshifting, removing the issue of getting a video phone call while in the shower… although the most prolific use of the one-to-one video call has again become attached to nudity… lol
Anyhow, attaching deep data to 3D has been the goal for over a decade… X3D is 3D plus XML, specifically designed as the major extension of the earlier VRML language for exactly this reason…
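Just to make that concrete, here’s a bare-bones sketch of what “3D plus XML” means in practice: a shape with a metadata record hung directly off it, pointing at (say) the source photo the geometry came from. Field names are from the X3D spec as I remember them, so treat this as illustrative rather than gospel.

```python
import xml.etree.ElementTree as ET

# Build a tiny X3D scene: one box with a metadata string attached to the shape,
# pointing at the (hypothetical) source photo the geometry was derived from.
x3d = ET.Element("X3D", profile="Interchange", version="3.2")
scene = ET.SubElement(x3d, "Scene")
shape = ET.SubElement(scene, "Shape")
ET.SubElement(shape, "Box", size="2 2 2")
ET.SubElement(shape, "MetadataString",
              containerField="metadata",
              name="sourcePhoto",
              value='"http://example.org/photos/facade_a.jpg"')

print(ET.tostring(x3d, encoding="unicode"))
```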
Its success or failure, though, has been political and financial, now more than ever as the third Web3D bubble rages on and masses of newbies reinvent Web3D wheels for fame and VC glory :)
Every Flash site is now a virtual world, and everything 3D is now avatars… this demo thankfully excluded, which is why I don’t see it as the future of the metaverse interface, but only, and it seems we agree, as a single presentation interface that may or may not offer any real power vs. many others that will be shown.