This is a rumination on 3D graphics, the Internet, and irrational exuberance for a new medium. It closes with several musings on long-term applications for 3D Internet space. This was written right after the ratification of the VRML 2.0 specification in 1997. In many ways it prefigures the reassessment of 3D graphics' role in enriching the Internet that seems to have occurred right before the turn of the century.
written at the World Movers Conference
San Francisco
January 29, 1997
Where It's Been
For the most part the Internet currently consists of hyper-linked enriched text, like print media in layout. It also has some real-time content distribution capabilities that liken it to limited broadcast television, and interactive feedback that gives it somewhat telephone-like capabilities. In short: print-like layout, television-like 'themed' channels, and simple interactivity.
The popularity of the Internet did not start to grow significantly until standard Internet protocols for navigating visual information domains became widely adopted. For a standard to be adopted, it must offer a clear common benefit to everyone it affects.
HTML for the most part has been a tremendous success because it allows existing media (documents and pictures) to be posted in a format that any HTML browser can read. HTML, being text based, requires no special or proprietary development tools other than a reasonable text editor (often free and already installed on every PC). Its popularity grew rapidly because plenty of content already existed in near-HTML form and the larger public was already very familiar with productivity applications like word processors and paint programs.
Flat 2D Techniques Easily Keep Focus on Rich Content
HTML is a markup language for formatting text with pictures using a simple page layout metaphor. The by-product of this formatting is a 'web page' of 2D information that looks very familiar to users of print media like magazines and newspapers. This 'product similarity' allows web pages to leverage the enormous amount of design experience and production technology that traditional media have garnered over hundreds of years of use.
Web pages are rarely viewed in their entirety. Various pervasive graphical user interface techniques, developed over more than a decade of computer interface exposure, are used to make navigating and viewing HTML based web pages simple and intuitive. Additionally, a standard design for HTML 'browser' interfaces has now migrated to virtually every make and model of personal computer and workstation commonly available. Of particular note is that at all times the context of information displayed with an HTML web browser is immediately evident. HTML web pages are linear with a top and bottom, and this is fed back with various 'scroll bars' and 'percent read' indicators built into the standard viewing interfaces. This interface and the print-like content make web pages very familiar and easy to digest.
The process of digging deeper into linked web sites is also very linear. At any point the previous link can be returned to. This makes navigating the myriad links and the relationships denoted by them easy to approach, learn, and later take advantage of directly. This linear page and link metaphor does not deviate significantly from the notion of all HTML web pages organized as a 'gigantic book' with multiple 'themed' indexes or tables of contents. Analogies like 'Fodor's guide to traveling the geography of the planet' and 'The Hitchhiker's Guide to the Galaxy' easily come to mind.
This standard interface and powerful legacy of content make for a very alluring information technology. The organization of web page information is perhaps its most significant achievement. The breadth, depth and flexibility of organizing hyper-linked information, combined with the standard interface on commonly available devices, yields a tremendous 'compendium of planetary information' the likes of which have never been seen before in mankind's history.
This tremendous amount of content, viewed with commonly available and standard browser technology, is easy to use. All web page display and link navigation follow structured 'linear' relationships that are intuitive and simple to understand. As will be pointed out later, an HTML web page browser never displays a document in a disorienting manner or traverses links in an unstructured fashion. This has been a significant factor in the World Wide Web's acceptance.
Unfettered 3D - How to Stand in a Corner and Stare at the Wall
VRML is an attempt at a 3D Internet authoring file format. Like an HTML formatted file, a VRML file is a specially formatted text file. The current VRML 2.0 file format specification makes provision for describing the placement of 3D surfaces, the pre-programmed animation of those surfaces, and preset viewpoint positions and orientations. An interface similar in some respects to an HTML 'browser' is used to view the 3D representations of information described in a VRML file that is downloaded from a selected web site. VRML links to other VRML sites can be embedded in a VRML file. Using the VRML browsing interface, different viewpoints of the 3D information represented in the VRML file can be selected or navigated towards. A link may be followed by activating an appropriate object when it is visible in the VRML browser interface.
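To make the 'specially formatted text file' concrete, here is a minimal sketch in Python that writes out a tiny VRML 2.0 world containing one preset viewpoint and one box that acts as a link. The function name, the example URL, and the particular geometry are assumptions chosen for illustration; only the general shape of the file (header line, Viewpoint node, Anchor node with children) follows the VRML 2.0 format.

```python
def minimal_vrml_world(link_url: str) -> str:
    """Return the text of a small VRML 2.0 world (illustrative sketch only)."""
    return "\n".join([
        "#VRML V2.0 utf8",
        "# A preset viewpoint the browser can jump to.",
        "Viewpoint {",
        "  position 0 0 10",
        '  description "Front view"',
        "}",
        "# A red box that acts as a link: activating it loads link_url.",
        "Anchor {",
        f'  url "{link_url}"',
        "  children [",
        "    Shape {",
        "      appearance Appearance { material Material { diffuseColor 1 0 0 } }",
        "      geometry Box { size 2 2 2 }",
        "    }",
        "  ]",
        "}",
        "",
    ])


# Hypothetical destination; any other .wrl URL would do.
world = minimal_vrml_world("http://example.com/next.wrl")
print(world)
```

Even at this size the file hints at the readability problem: the nesting of braces and brackets grows quickly once real scenes with many surfaces are described.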
Unlike traditional 2D, which has a long legacy of information authoring, presentation, and navigation methodologies, 3D visual information representations like VRML have only a few crude established methodologies. It can take considerable time for workable methodologies for new information representations to be developed. For instance, a VRML link may be hidden at first, and become accessible only after a very specific non-intuitive viewpoint reorientation. Without a legacy of 3D information methodologies in place, and with new methodologies still in development and not widely understood, it can be difficult for truly useful and alluring applications to develop. Furthermore, a commonly available VRML browser that's easy to use on a wide variety of commonly available devices has not been realized yet. Most importantly, VRML and other 3D information representation systems have little if any legacy of easy to assimilate content or commonly available, easy to use authoring tools (a simple text editor can be employed to edit a VRML text file, however such files tend to be hard to read).
One increasingly common application of information represented in 3D and VRML is simulating a physical environment. This can be a very ambitious undertaking. The crude methodologies currently available are severely overextended when simulating physical environments. VRML 2.0 has no physics or collaborative user provisions built in. Some scripting and escape opportunities exist; however, because they lie outside the VRML 2.0 specification, they may not be implemented widely or in a standard way, if at all. The 3D visual component of physical simulation can be the least challenging feature of such systems. To compare favorably to the rich sensual experience of the 'real world', a simulation may require considerable physics computation and elaborate apparatus for feedback and display. Such simulations are usually limited to a fixed scale of interaction and impose limited degrees of freedom (aircraft cockpit, automobile driver's seat, spaceship bridge helm, ...). Reducing the complexity of physics calculation and viewing apparatus may leave only a very crude methodology for presenting and navigating information represented in 3D. There are plenty of 3D viewing programs and VRML browsers in which there is no object collision and physics is non-existent. Those that do impose simple physics and collision detection do so in a very limited and specific fashion. This can make for a very disorienting and unsatisfying 3D navigation methodology.
Although VRML and other 3D information representations can simulate a physical environment, this is only a subset of their potential. With time, non-physical simulation applications are sure to mature and redefine what is useful for 3D information representation. The physics of the 'real world' should soon prove irrelevant to 3D information visualization as more sophisticated and optimized methodologies for interacting develop which are better suited to commonly available display and computational devices.
The Rarity of Navigating Consistently Toward Rich 3D Points of View
One of the most challenging problems with representing information in 3D versus 2D is quality and consistency of presentation. 2D information displays can rely on an extremely limited viewpoint being strictly maintained. 3D displays by definition are capable of displaying almost any conceivable view of information. This is a mixed blessing. 2D displays allow orientation-locked panning and orthogonal zooming of information. 3D displays can spin information upside down, view it from all sides (which, if it's flat, may make it invisible), and allow positioning and view orientation almost anywhere. Studying information display from a stochastic point of view, it's easy to conclude that exercising all 2D display settings will always yield a coherent display that includes at least some of the target information. Varying 3D settings stochastically, however, can easily lose every trace of the target information. Consider how many viewpoint positions and attitudes are possible in which the target information is still visible and sensibly oriented to the viewer versus those in which the target is not visible and/or is oriented so as to be unrecognizable.
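The stochastic argument can be tried out with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration: the 2D case is a window scrolled over a page ten windows tall, the 3D target is a flat card at the origin facing the +z axis, the camera is dropped uniformly in a cube around it with a uniformly random view direction, and 'visible' means the card's centre falls inside a 60 degree view cone while the camera is on the card's front side.

```python
import math
import random


def random_unit_vector(rng):
    """Return a direction drawn uniformly from the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def fraction_2d_visible(trials, rng, page_height=10.0, window_height=1.0):
    """Fraction of random scroll positions that still show part of the page."""
    hits = 0
    for _ in range(trials):
        # A scroll bar only offers positions that keep the window on the page.
        top = rng.uniform(0.0, page_height - window_height)
        overlap = min(top + window_height, page_height) - max(top, 0.0)
        if overlap > 0.0:
            hits += 1
    return hits / trials


def fraction_3d_visible(trials, rng, half_extent=10.0, fov_degrees=60.0):
    """Fraction of random camera poses that see the front of the card."""
    cos_half_fov = math.cos(math.radians(fov_degrees / 2.0))
    hits = 0
    for _ in range(trials):
        pos = tuple(rng.uniform(-half_extent, half_extent) for _ in range(3))
        dist = math.sqrt(sum(c * c for c in pos))
        if dist == 0.0:
            continue  # camera exactly on the target; count as a miss
        to_target = tuple(-c / dist for c in pos)
        view = random_unit_vector(rng)
        in_cone = sum(a * b for a, b in zip(view, to_target)) >= cos_half_fov
        in_front = pos[2] > 0.0  # behind the card the content is reversed
        if in_cone and in_front:
            hits += 1
    return hits / trials


rng = random.Random(1997)
print("2D settings showing the page:", fraction_2d_visible(20000, rng))
print("3D poses showing the card:  ", fraction_3d_visible(20000, rng))
```

Under these assumptions every 2D setting shows part of the page by construction, which is exactly the point, while only a few percent of the random 3D poses show the target at all. The exact figure depends on the assumed field of view and on how 'sensibly oriented' is defined.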
What is needed is some standardized 3D User Interface physics. With a 2D display, like a train on a track, it's hard to steer oneself into oblivion. With a 3D interface, unless some benevolent physics are in place, varying the view of information is tantamount to crash landing an airplane blind. With 2D, display interfaces are generally very simple, provide clear feedback, and are easy to understand. Before 3D information display technologies can become pervasive, they need to adopt their own very simple common user interface metaphor. As the mouse is to 2D, a similar irrevocable relationship must exist for 3D. There are mice with thumb wheels. Space balls have also become more common. However, no dominant relationship has suggested itself at this time.
Assuming an appropriate 3D mousing device does become pervasive, some limitations and default capabilities on viewing interfaces are then needed. There are many ways to confine 3D displays for more controlled viewing. One suggestion is for 3D information to have special view paths, or routes, associated with it. Such paths should have visible markings that can be seen easily if a viewer strays away from a path. A viewer could then use a simple control to move about a 3D web page, assured of a consistent quality view of 3D information. Perhaps a new 'web search engine' market may arise to develop new and exciting paths from which to view information. With VRML 2.0 animation capabilities, these paths could also include temporal data to coincide with 3D information, for 'Internet News Movies'. This is just one example; other very useful simplifications are sure to develop.
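One way to picture a view path is as a camera that is confined to a polyline of waypoints and always turned toward the information, so that a single slider is the only control the viewer needs. The sketch below is a minimal illustration under those assumptions; the function name, the waypoint coordinates, and the choice of straight-line interpolation are all hypothetical, and the path is assumed never to pass through the target itself.

```python
import math


def camera_on_path(waypoints, target, t):
    """Return (position, unit look direction) for slider value t in [0, 1].

    waypoints: list of at least two (x, y, z) points defining the path.
    target:    (x, y, z) point the camera keeps facing.
    """
    t = min(1.0, max(0.0, t))            # the slider cannot leave the path
    segments = len(waypoints) - 1
    scaled = t * segments
    i = min(int(scaled), segments - 1)   # index of the current segment
    local = scaled - i                   # 0..1 position within that segment
    a, b = waypoints[i], waypoints[i + 1]
    pos = tuple(pa + (pb - pa) * local for pa, pb in zip(a, b))
    offset = tuple(tc - pc for tc, pc in zip(target, pos))
    length = math.sqrt(sum(c * c for c in offset))
    look = tuple(c / length for c in offset)
    return pos, look


# A hypothetical arc of three waypoints in front of content at the origin.
path = [(-5.0, 0.0, 5.0), (0.0, 0.0, 8.0), (5.0, 0.0, 5.0)]
for step in range(5):
    position, look = camera_on_path(path, (0.0, 0.0, 0.0), step / 4)
    print(step / 4, position, look)
```

Because the position is clamped to the path and the orientation is derived rather than steered, every slider setting yields a view that contains the target, which restores the 'train track' property of 2D scrolling.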
Fashion, Security, Buzz, Gratification, Power
What makes any given technology alluring is that it allows expressions of individuality that are easily noticed by others. Clothes, cars, snow boards, mountain bikes, cell phones, laptops, to some extent televisions, camcorders, and VCRs, and especially eye wear are all ways of differentiating oneself from others. Internet browsers allow us to see the electric clothes of others, be they individuals, corporations, or artificial electric life (most often advertisements). As long as ease of access to the Internet is maintained, a gratifying mode for self-expression is sustained and promotes the medium.
For personal and corporate web sites to develop interesting and worthwhile content that others will seek out, persistence and security of Internet space are crucial. If domain name conflicts and electric vandals make consistent access to favorite sites with legitimate content difficult, then the expressiveness of the medium is compromised and its popularity and use will diminish.
Truly extraordinary web spaces deploy novel and premiere content long before others catch on. The potential to be recognized as extraordinary, on the crest of the wave to follow, can be a powerful incentive for independent content developers, corporate public relations departments, and advertisers to develop for a medium. A key factor for a new medium is how easily it can be digested and viewed, and the degree to which innovators are rewarded for their efforts when they develop new ways of expressing themselves with it. Yahoo's 'What's Cool' during the early days of the formation of the World Wide Web is an excellent small scale example of this phenomenon in practice. The ubiquitous hits counter, and a variety of 'top 5%' or 'cool site of the week' badges from a plethora of different institutions, are other marks of recognition. Of late, more traditional media have sprung up with full featured online newspapers that highlight and critique content on the web. In one such online newspaper, a colleague's personal web page created to honor Pez Candy was recognized as getting close to the same number of hits per week as HotWired's latest offerings. This potential for individuals to garner the same recognition, however brief, as established institutions makes the Internet a tremendously alluring form of expression.
The Net Is Long, The Net Is Slow
The often alluded to ideal of instant information access from anywhere is sadly contrary to the fundamentals of modern science and engineering. Even as basic Internet connection service providers enhance their bandwidth capabilities, end user and application demands seem to always outpace infrastructure improvements. The real-time collaborative Internet game will always be a concept whose ideal implementation lies just beyond the next infrastructure upgrade. Location based and LAN oriented communications will maintain their superiority in bandwidth and security forever.
The promise of 3D Avatars, user controlled Internet puppets, guiding and interacting on the Internet is a very ambitious undertaking. Truly immersive real-time 2 way interactivity requires modest to ambitious fixed pipe communication streams between each client and the Avatar server. Collaborative client only implementations can only exacerbate this requirement. Avatar servers, even under ideal circumstances, will be taxed to accommodate additional simultaneous Avatars. The methodology for Avatar interaction must have sufficient interactivity for the desired quality while keeping state and interaction physics as simple as possible. Complex 3D avatar environments can easily necessitate enormous amounts of state and environment physics, both of which place extreme demands on client and server network and computational bandwidth. The cost of this extra bandwidth must be compensated by a comparable increase in interactive experience for the environment to be viable. An additional complication is that often the higher the bandwidth requirement for an Avatar environment, the more difficult it is to set up. This added complication can seriously limit the audience to an elite class that can not be easily penetrated by more common users. Ultimately, the most popular collaborative user environments are simple text oriented message passing chat systems that are easy for common users to set up and operate.
Some of the most successful interactive Internet services, like mail, Usenet, and IRC, are not Avatar based at all. Nevertheless, all of them offer true interactivity over the Internet and can accommodate significant delays without breaking down.
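A back-of-the-envelope calculation shows why additional simultaneous Avatars tax a server so quickly. The sketch below assumes the simplest relay design, in which every client's state update is forwarded to every other client; the update sizes, update rates, and the 28.8 kbit/s dial-up link used for comparison are illustrative assumptions, not measurements.

```python
def client_inbound(users, bytes_per_update, updates_per_second):
    """Bytes per second one client receives when it hears every other user."""
    return (users - 1) * bytes_per_update * updates_per_second


def server_outbound(users, bytes_per_update, updates_per_second):
    """Bytes per second a relaying server sends to all clients combined."""
    return users * client_inbound(users, bytes_per_update, updates_per_second)


# Assumed 28.8 kbit/s dial-up link, expressed in bytes per second.
MODEM_BYTES_PER_SECOND = 28_800 // 8

for users in (2, 10, 50):
    # Assumed avatar traffic: 50-byte position/orientation update, 10 per second.
    avatar = client_inbound(users, bytes_per_update=50, updates_per_second=10)
    # Assumed chat traffic: one 100-byte line every 5 seconds per user.
    chat = client_inbound(users, bytes_per_update=100, updates_per_second=0.2)
    print(f"{users:3d} users: avatar {avatar} B/s "
          f"(fits modem: {avatar <= MODEM_BYTES_PER_SECOND}), chat {chat:.0f} B/s")
```

Per-client load grows linearly with the number of users and the server's relay load grows with its square, so under these assumptions an avatar space outgrows a dial-up link at around ten users, while text chat stays comfortable at fifty.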
As 3D Internet standards mature and evolve, new methodologies for 3D information visualization for the Internet must continue to accommodate low bandwidth and significant latency to be embraced and integrated into common web presence.
Threaded TV - Push a Button, Get 3D Programming
Imagine a combined 3D TV and Internet tuner. 2D cable TV shows are fed live onto surfaces of a 3D web page with no bandwidth hit. Streaming low bandwidth 3D surface and object animation information combined with live TV signals could present an extremely alluring and easy to use application of 3D. Suddenly, with 3D Internet, there is an extreme demand for 500 cable TV channels to service the new potential 'WebVision' industry. WebVisions could become the dominant entertainment device that electronics manufacturers have been longing for. WebVisions could easily start to outsell televisions and PCs!
Passwords, Hard Virtual Physics, Deep Intimate Space
Beyond using the Internet for public expression is using it as a medium to develop and express highly sensitive and otherwise incommunicable ideas and concepts. This is a logical end to which 3D information representation is perhaps most appropriate.
Beyond 'real space' physics, and more abstract than any existing medium, are abstract 'transaction' spaces that can only exist in a dynamic abstract medium like the Internet. By defining a set of transactions (or class methods, if you will) by which an authorized user may interact with an abstract information space, bandwidth and latency issues can be accommodated. In its simplest manifestation, this could simply be a database with a 3D front end. More complex possibilities include a pure net based equity trading space. Perhaps the most sophisticated possibility is a dynamically altering labyrinth to provide deep security and anonymity for 'SwissBank' like transactions that are untraceable by any institution or government. International conglomerates, passionate artists, and organized crime alike could deliberate in 'untraceable' space. This could have incredible implications for national interests and democratic reform alike, bad for the former, good for the latter.
The realms appropriate for this kind of information space require extraordinary security measures. Hyper secure passwords will require more elaborate mechanisms than simple code key sequence pairs and other existing mechanisms. A mechanism for generating hyper dynamic login protocols must accommodate enormous numbers of password and password authentication sequence pairs. An abstract 3D login methodology could dramatically enhance such hyper login protocols by presenting a rich abstract 3D environment that must be interacted with quickly to authenticate the hyper-password encoded in the 3D login environment.
Abstract 3D information environments that persist and evolve, and that allow 'user transactions' to interact with them, require hard virtual physics for the spaces to have significant value. Hard physics include permanently retained server side environment state, client/server SQL like transactions and logging, and accurate state transmission to clients. This premise is similar to the mechanisms in place for stock and equity trading. The very nature of these mechanisms imbues transactions with the credibility and verifiability necessary for valuable assets, be they abstract 3D information representations, simple files, pure equity, or service compensation, to be exchanged. Hard virtual space commerce need not be limited to capital and equity transactions. Of particular interest are transactions that change the environment and the nature of transactions themselves. A server with limited bandwidth and storage space could literally offer its transaction and storage bandwidth to the Internet via a hard virtual physics interactive 3D information representation and immediately be taken advantage of by Internet users. The value of its service and the equity exchanged for it could be settled immediately by the transaction space. This is potentially an extremely efficient method for organizing information storage, service compensation, and equity. Media, finance, and service become inextricable components of the Internet. Physical 'real' commerce is reduced to Internet service companies that exchange Internet equity for physical material.
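The three ingredients of hard virtual physics named above (retained server side state, SQL like transactions, and logging) can be sketched with Python's built-in sqlite3 module. The table names, the notion of integer 'units' of equity, and the function names are all assumptions made for illustration; a 3D front end would sit on top of calls like these rather than replace them.

```python
import sqlite3


def open_space(path=":memory:"):
    """Create (or reopen) the retained state of a hypothetical transaction space."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS balances ("
        "owner TEXT PRIMARY KEY, units INTEGER NOT NULL CHECK (units >= 0))"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, sender TEXT, receiver TEXT, units INTEGER)"
    )
    db.commit()
    return db


def transfer(db, sender, receiver, units):
    """Move units from sender to receiver atomically; return True if accepted."""
    if units <= 0:
        return False
    try:
        with db:  # commits on success, rolls back if an exception escapes
            for delta, owner in ((-units, sender), (units, receiver)):
                cur = db.execute(
                    "UPDATE balances SET units = units + ? WHERE owner = ?",
                    (delta, owner),
                )
                if cur.rowcount != 1:
                    raise sqlite3.IntegrityError("unknown owner")
            db.execute(
                "INSERT INTO log (sender, receiver, units) VALUES (?, ?, ?)",
                (sender, receiver, units),
            )
        return True
    except sqlite3.IntegrityError:
        # Overdrafts violate the CHECK constraint; nothing is changed or logged.
        return False


def balance(db, owner):
    """Return the current units held by owner."""
    return db.execute(
        "SELECT units FROM balances WHERE owner = ?", (owner,)
    ).fetchone()[0]


db = open_space()
db.execute("INSERT INTO balances VALUES ('alice', 10), ('server', 0)")
db.commit()
print(transfer(db, "alice", "server", 4), balance(db, "alice"), balance(db, "server"))
print(transfer(db, "alice", "server", 100), balance(db, "alice"), balance(db, "server"))
```

The design choice worth noting is that a rejected transaction leaves no trace in either the state or the log, which is what gives the surviving log entries their credibility.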
Lastly, it should be possible for truly intimate abstract 3D realms, tailored to the needs and tastes of users and corporations, to develop. Such spaces could include simple 3D information representations that authorized guests can log into and interact with. Advanced spaces could include personally developed transaction services for enterprise and/or personal equity, service, and information development and trade. Special abstract information representations and hard physics that interface to other Internet equity, service, or information servers could be developed, maintained, and cultivated personally, like a 'cyber-garden' of equity, information, and function groomed for personal preference and performance.
With the Internet, three groupings of virtual transaction spaces ultimately develop: physical 'real' to Internet equity; Internet equity to Internet service and information; and personal/private information, equity, and functions. To interact with these transaction spaces effectively and efficiently, 3D information representations of them can have significant security and dynamic representation advantages over classical 2D interface capabilities.