For this month's feature, we're honored to have Richard A. Bartle's thoughts on voice communication in multiplayer online games.
By Richard A. Bartle
When I first heard that the X-Box would support real-time voice communication between players, my heart sank. It didn't sink because of the effect it would have on X-Box games; it sank because of the effect it would inevitably have on virtual worlds.
You can see how the logic goes. "Virtual worlds are multi-player computer games. The latest multi-player computer games feature real-time voice communication, and they're a blast! Players are coming to expect it, current virtual worlds don't have it. Hey, we should put voice in the new virtual world we're developing and blow away the competition!".
This is so depressing…
Newbies, I can forgive. Newbies have heard that virtual worlds are fun. They're looking to play one, but they have no easy way to determine which of those on offer is the best. Are they going to head for the one with all the latest bells and whistles, or the one that looks like it was created in 1999? Are they going to play the game written for broadband or the game written for 28.8K baud? Even if they've never experienced X-Box Live, they may have read reports extolling the amazingness of trash talk in Unreal Championship. Who wants to read, or use keyboards?
Designers are not newbies. Designers should know better. Maybe some of them do, and are right now locking horns with the marketing director and threatening to resign over the issue. Yeah, like that's going to happen…
The thing is, most designers of virtual worlds don't know enough about what they're designing. Design is about consequences. One of the consequences of adding real-time voice communication to virtual worlds is that it will attract newbies; this is why marketers want it. Another of the consequences is that when players cease to be newbies they won't stay for as long; this is why designers should be telling marketers they can't have it. Unfortunately, many of them don't give a moment's thought to the possibility that real-time voice communication might be A Bad Idea for virtual worlds. This is what's so depressing: it exposes just how little they grasp about their craft.
A know-nothing designer works on instincts acquired from being a player. They'll remember going on plane raids with their group, imagine how cool it would have been if they'd been able to talk to one another, and relish the thought of creating a virtual world where this would be a reality.
A more thoughtful designer might vaguely be aware of the concept of immersion, and have an inkling that real-time voice from players could present them with some difficulties in that area. They may also have some dim recollection of the importance of anonymity (or pseudonymity) in virtual worlds. However, these are bridges that can be crossed when they come to them, once the rest of the design has been fleshed out. No need to worry, la la la.
Designers who know their trade will realise that the introduction of voice - real-time or otherwise - will seriously influence the way their virtual world is played. They will also have absolute confidence in their ability to design round the problem, however. A little modulation here and there will give people voices that aren't their own. Sure, there may be some teething problems tackling the abuses that are certain to arise - it's an awful lot of data to log - but nothing intractable. Audible channels can be gagged as easily as textual ones. Things will work out.
Designers who simply understand will recoil in horror, despairing that anyone could even contemplate such an immersion-busting, reality-intrusive, anti role-playing debasement of what virtual worlds are. Don't these fools see what damage they're going to do?
Virtual worlds are just that, virtual. People play them to get away from reality; they play them to get away from themselves. In a virtual world, you can be someone else. By being someone else, you can become a better you. Why do people play the same game for hour after hour, night after night, for week after week, month after month? It's not because they like the game; it's because they like who they are.
Designers who don't understand that should go away and not come back until they do.
If you introduce reality into a virtual world, it's no longer a virtual world: it's just an adjunct to the real world. It ceases to be a place, and reverts to being a medium. Immersion is enhanced by closeness to reality, but thwarted by isomorphism with it: the act of will required to suspend disbelief is what sustains a player's drive to be, but it disappears when there is no disbelief required.
Adding reality to a virtual world robs it of what makes it compelling - it takes away that which is different between virtual worlds and the real world: the fact that they are not the real world.
Voice is reality.
"But it's not your voice". Well yes, gee, instead of sounding like I do in real life I can sound like someone in real life does after they've had their voice put through a processor. It fools no-one. Besides, even if the pitch changes were good enough to make men sound like women and vice versa (which they aren't), it wouldn't alter accents. "Hey, this elf babe is from England!". Hello reality.
This is what we're going to get. Virtual worlds will appear with voice. They'll attract newbies. They won't hold these players, but they'll condition them to expect voice in whatever virtual world they decamp to instead. To compete for newbies, new virtual worlds - and perhaps some well-financed older ones - will also add voice. Eventually, they'll all have it, their players will all be unsatisfied because of it, and everyone will wonder what the fuss with virtual worlds was all about. They're just like regular multi-player computer games except with more players.
I'm being pessimistic, I know, but still… Are virtual worlds as we know them doomed?
Fortunately, no, they're not. It's not that we shouldn't have voice in virtual worlds; it's that we shouldn't have it yet.
Voice isn't in itself any more disruptive of the virtual world experience than are photo-realistic graphics. It's fine to fool the senses, to make virtual worlds appear to be real, so long as that final step - their actually being real - is not taken.
Here's a look into the future…
Even if voice becomes the norm in virtual worlds, text as a means of communication will still exist: not all players will be able to use voice. My wife can watch TV while I visit virtual worlds, but she wouldn't be able to if I were talking the whole time in the next room - it would be way too annoying. So I'd have to type; so would plenty of other people.
Having two distinct input channels - typing and speaking - is non-problematical, because no player experiences both simultaneously. Output, however, is problematical. If I'm talking with text and someone else is talking with voice, the person being talked to must read some conversations while listening to others. Ideally, they should either read them all or hear them all. As to which, well, you could make it a switch: those that prefer to read could see spoken words rendered into text; those that prefer to listen could hear written text rendered into speech.
OK, the technology to do this isn't quite there yet, but suppose it were. You'd have something that converts speech to text and something else that converts text to speech. So in theory, I could say something in my male, English voice, it could be converted into text, then replayed to listeners in a female, New English voice. It would be real-time voice communication, but no more "me" than my graphical avatar: just clothing for an alternative identity.
It works because it sounds real, but we know it isn't (hence we have disbelief to suspend). It works because it permits us to role-play (to become the someone that we want to become). It just works.
At the moment, though, it's mere whimsy. Current speech-from-text generation software isn't quite as bad as that used by Professor Stephen Hawking, but it still flows very awkwardly. Text-from-speech is pretty good once trained to your voice, but not if you start getting emotional (like when you're screaming for help because a dragon is eating you).
Give it a few years, though, and who knows? This could add a whole new dimension to virtual worlds! Not only do you look like a marsh troll, but you sound like one, too. How groovy is that?
Very groovy! Unfortunately, while we're waiting for it we may have to endure some otherwise excellent games ruined by the ill-conceived, premature use of an inappropriate form of the technology.
Real-time voice communication in virtual worlds does promise great things - just not yet.
The mustachioed man pictured to the left, Richard Bartle, has an excellent, thorough web site: www.mud.co.uk/richard. It explains his projects and insights better than this short biography can. In 1979, Bartle co-created the text-based MUD, the first system for players to share adventures online. He has continued developing games in many forms since that time, and he works as an advisor and commentator in various capacities. His contribution of a simple taxonomy of MMOG players (Killers, Achievers, Explorers and Socializers) has been a valuable framework for discussing online player behavior.
Recently Bartle collected his expertise in book form, now available on Amazon: Designing Virtual Worlds.
I've been playing in virtual worlds for decades. In their computer incarnations, I play in near silence. My husband games in the next room with audio, but I keep mine off because I find jarring some of the in-game music choices/events. Another reason is that I have to listen for and talk to other household residents.
Maybe I'm spoiled because I'm a touch typist, but I'm for keeping voice chat away from my virtual worlds as long as possible!
Posted by: ancarett | 08/09/2003 at 04:55 AM
I can appreciate the issues associated with suspension of disbelief.
Of more concern to me is the impact of having 8 people screaming while I'm trying to stay alive in a combat situation. You'd need a very disciplined team or a rigorous protocol... Perhaps one person in a party, say a caster type removed from the heat of battle, giving instructions/commentary while everyone else bites their tongue.
Posted by: funken gruven | 08/09/2003 at 05:04 AM
What's this? Whining that because we can't we shouldn't try?
Speech input works great on team FPS and flight simulators. Granted, it doesn't necessarily fit RPG worlds where you're playing a character very different from yourself. But why superstitiously avoid voice input? Why not just put the emphasis on speech recognition and synthesis? You could radically alter your voice or even language on the fly! Not to say that that would be easy, or even merely hard, but it's possible, and that's all the more reason to investigate rather than avoid it.
Posted by: anonymous coward | 08/09/2003 at 05:19 AM
Come on baby give daddy some loving
Posted by: Jesus | 08/09/2003 at 05:42 AM
Anyone who thinks that voice communication will ruin the immersion of MMORPGs is suffering from two problems. 1) They haven't played any of the current crop of MMORPGs. 2) They're stuck in an ivory tower of idealism.
The fact of the matter is that there is no immersion now, and there never was immersion in MUDs. How immersive is it to defeat the evil dragon which has been harassing the townsfolk only to have it reappear days later?
How immersive is it to bring an army to slay a god, and once you've slain the god still suffer from his followers being just as zealous, his influence in the world being just as strong? For the god to reappear inexplicably after a timer has finished timing?
The fact of the matter is that the current games are very advanced whack-a-mole machines which facilitate social interactions and throw in a little bit of persistence on how big your mallet is. If you want immersion, many things need to change. Voice communication could be one of them, but there's many more fundamental issues that need to be resolved before targeting voice. Many of these are problems that MMORPGs inherited from their text based predecessors like Diku and LP MUD.
With all due respect to Mr. Bartle, whom I believe was instrumental in starting a wonderful ball rolling, this article shows a profound ignorance and idealism. Game design requires sacrificing parts of the ideal for broader appeal and accessibility. Voice communication will not destroy things any more than respawns, random drops, and PVP flags do.
Posted by: swanson marpalum | 08/09/2003 at 05:57 AM
What I don't see is how an "adjunct to the real world" fails to be a "place". My house is certainly connected to the real world (sometimes against my wishes); if you show up in my house you'll be able to identify my voice and see my face. You'd probably laugh at me if I pretended to be an elf. But my house is most definitely a place.
Likewise, I would see these new voice worlds as very interesting places where I could have conversations with a room full of people from very far away. Playing in a massively multiplayer voice world would be like maintaining a dual-citizenship in a new country with different laws of physics with people from around the world. To me, that sounds infinitely cooler than pretending to be an elf while on the treadmill of pointless level building.
So we trade virtual worlds for virtual nations--worlds that are tied down to the real world. Yet by tying them down, paradoxically I expect them to become more creative. The need to roleplay constrains the context of a virtual world to ones universally understood--which is perhaps why the successful MMORPGs are all set in a northern European mythological world, or in the Star Wars universe, or in some other typical sci-fi setting. Something most of the people interested in playing these games can grasp. If we give up the requirement of role playing, we build a tolerance for inconsistency, paving the way for a postmodern explosion of inconceivable places filled with unpredictable characters, each fully aware of their context as a creation of the physical world. I'm talking enclaves of 1920s Prohibition mobsters from Australia joining forces with the Californian Samurai guild to resist immigration by the Indonesian Cowboy Society. A beautiful, horrific microcosm of the real world's problems, in this case participated in by both self-aware kitsch junkies and nationalistic bigots alike. A place both more unreal and real than any virtual world yet seen.
And, if my dream turns out to suck, well, after proper text-to-speech/speech-to-text technology is developed, we can all go back to pretending to be female gnomes while playing ProgressQuest. Yay. Just watch for the technology to develop, Mr. Bartle, and you could be the one who gets rich bringing back the Glorious Past of Role Playing to the jaded eyes of future children.
Posted by: ontology | 08/09/2003 at 06:28 AM
Devil's advocate here. I guess the author never did any real roleplaying around a tabletop? Folks have been doing it for years, and the fact that they can hear each other's voices, see each other's faces and, hell, eat each other's pixystix... well, it never seemed to detract from immersion.
My 2C.
Alli
Posted by: Alli | 08/09/2003 at 06:31 AM
Regardless of MMORPGs, voice chat is the BEST thing for on-line FPSs. Case in point - Battlefield 1942. The public server I play on actively encourages TeamSpeak use. Why? An FPS player is not trying to be somebody else, and in the heat of battle, stopping to type a message is difficult at best, often deadly. Voice chat allows us to report enemy positions, call for airstrikes, beg for help, and call out team-killers for immediate kick and ban. The server admins monitor and participate in the chat, so any excessive swearing, insulting, etc. can be squashed. They're not anal about it, but they can keep it within limits. We have a few female players who participate. I have never once heard them get hit on. I think most of the male gamers are just happy to have some women on the channel, and would like to encourage more women gamers by not being meat-heads.
Posted by: PudriK | 08/09/2003 at 06:32 AM
There is big (relatively big) money paid to professional voice actors in games. Even if the characters are voiced by amateurs, there is professional sound editing done. That's why games sound good. If you let everyone input their own voice into the system, the quality will definitely degrade. People are not good actors - they don't know how to express emotions, how to modulate their voice, how to play with it.
And remember, the audio version of "OMFG! FUCKING LAG! YOU FAGS!" would be much harder to ignore.
Posted by: Norma | 08/09/2003 at 06:37 AM
Richard A. Bartle - this must really be your sacred cow ... sacred cows make the best burgers. Voice in virtual worlds is the next thing ... don't care about the accent.
Posted by: Jesper | 08/09/2003 at 06:37 AM
Well, I really see the author's point, as virtuality is the basic fascination of the game. That could even be seen when voice talk was enabled in Counter Strike, a game that really suffered from the non-existence of voice support. Sitting duck in a corner and *typing* for help while your teammate is somewhere at the other end of the map doesn't really help you react to enemy activities, say, knifing you down.
But on the other hand, it was, like, terrible. Suddenly all those 13-year-old kids that should be sleeping at 2 o'clock in the morning were whining through the game, and hey, 13-year-olds are normally not on a S.W.A.T. team, are they? Really bad.
Trash talking was not really a problem, as it was really easy to mute other players.
So reality is a bad thing.
On the other hand, those who don't like voice support in MMORPGs might watch a few episodes of .hack//SIGN (or .hack//DUSK). Those series take place *in* a very modern MMORPG (like, not yet available), where players can talk to each other - and it seems proper. Well, of course the only abuse in that game is PKs, as in all MMORPGs now. But I think the way those people can use the voice support of "The World" is the way the author wants; it does not destroy the virtuality. Even if the characters are using their own voices, as it seems. Maybe it's just a matter of time. And a matter of moderating out those who are destroying the fun of games. You wouldn't play chess with someone mocking your hair all the time, would you?
Posted by: LGW | 08/09/2003 at 06:45 AM
If you're one of those people who uses the phrase, "It's just a game!" you won't understand where he's coming from. The current games may not be immersive enough to suspend your disbelief, so that makes it okay to make them less immersive?
I wish there were a way to simplify the keyboard input. A way to quickly input meaning, so the computer can parse that meaning into language. I'd like typing to be easier than talking. When you type or talk in either of the ways to do it now, the game itself never knows what you're actually saying, I'd like the game to be able to eavesdrop. The way to communicate would have to be enormously simple, but what is communicated would have to remain as complex as vanilla typing or voice chat.
Posted by: Roop Dirump | 08/09/2003 at 08:20 AM
I felt I had to submit my own experience with voice chat in a not-so-MMORPG, "Neverwinter Nights", because it differed so much from what the author seems to be expecting.
Our game ran between 4 and 8 players in a custom-created NWN module. We noticed that the voice (via MS GameVoice) was so immersive (we were talking in character) that our focus shifted completely away from the onscreen text. So much so that the 1 or 2 players who didn't have microphones found it nearly impossible to get our attention if there was any action on our screens.
Reading text and typing responses works great in a MUD, but when you are manipulating a mouse- and keyboard-controlled character in a 2D/3D realtime environment, it is very cumbersome and distracting to reposition your left hand (from the directional keyset you are using) and look away from the monitor, even for a few seconds, to start typing, as opposed to just speaking into a microphone.
As long as it comes with a good method of muting annoying players, I am all for it!
Posted by: Travis | 08/09/2003 at 08:36 AM
The solution, Andrew, would be to name your character "D'hűd".
Posted by: William | 08/09/2003 at 08:38 AM
Speech to text would be nice for slow typists, when you're too busy to take your hand off your mouse, and / or typing other commands.
Posted by: bradm | 08/09/2003 at 09:05 AM
swanson said: "Anyone who thinks that voice communication will ruin the immersion of MMORPGs is suffering from two problems. 1) They haven't played any of the current crop of MMORPGs. 2) They're stuck in an ivory tower of idealism."
I'm going to have to partially agree. Any game that currently requires player cooperation is vastly improved by voice chat. Players can still use text chat when they want to role-play. For example, role-playing and combat generally have small overlap.
Sacrificing interface (voice chat allows vastly better organization and coordination of players) to try and make it easier for players to suspend their disbelief is a mistake: the majority of MMORPG gamers are game players first, then role-players.
Suspension of disbelief can be improved after the game is fun and has a great interface, any designer who thinks otherwise should actually play some games. :P
Posted by: Anon_Customer | 08/09/2003 at 09:25 AM
I find interesting the similarities between what Bartle is saying here and a conversation I had recently with Yukihiro "Matz" Matsumoto (the designer of the programming language Ruby). The conversation took place in a very Japanese restaurant in Portland. The waitresses addressed Matz in very formal Japanese with an extreme degree of rigor in performing the forms of classical Japanese culture (eye-contact, ritual, the whole works).
The conversation (much of it about the Whorf-Sapir hypothesis and other relevant linguistic theory) was conducted in English (mainly due to my limited skills in Japanese). Matz expressed a high degree of frustration with the effort he had to make shifting back and forth between thinking and responding in Japanese and the process he had to go through to talk (which he said involved "thinking in Japanese while translating to English"). He said this was actually harder for him than just talking to people in English.
I find the similarity between this and Bartle's description of the difficulties people will have dealing simultaneously with audio and text "speech" in virtual worlds interesting.
Posted by: scotus | 08/09/2003 at 09:33 AM
While the comments by the author do have some justification under certain circumstances, they do not apply in all cases.
While the points may be argued for some games, others simply demand voice communication.
One example has been provided, Battlefield 1942. And while I don't play that game, I do play the similarly themed MMOG/MMORPG World War II Online. My squad/clan numbers in the hundreds, and at times we have more than 60 players on during a given battle. Our ability to communicate by voice is imperative, due to the complexity and nature of the information we have to impart.
Assume you have to inform a group of people defending the Hoogstraten depot to Breda. You have noted that there are 4 pieces of German armor advancing (2 PIVDs and 1 unidentified) as well as multiple softskinned vehicles containing upwards of 20 infantry. They are approaching from the ENE. They are being covered by a squadron of Ju87 divebombers at 1500ft, who appear to have escorts flying at around 3200ft.
Now, while that could be typed in some form of short hand, you end up having half the people (who don't have the time or inclination to memorize the hundreds of necessary short hand translations) typing 'Huh?','Pardon?' and 'what the &%$&^ does that mean?'. By which time the enemy has over run your position, killed you all and established a foothold on your city.
It's not that all games should or need to have voice, but some do, and the blanket of negativity the author puts over the entire idea is disheartening.
Posted by: Silicon Rat | 08/09/2003 at 09:54 AM
Pardon me if this has already been covered...
One thing you don't seem to be taking into account is the fact that people have been using Voice to play RPG games ever since they were created. Show me any Pen and Paper game that doesn't require the players to talk with each other... As far as I know, they have done quite well over the years.
The only reason we haven't been using voice on the computer is because it hasn't been technically possible until now. I'll be the first one to point out that we're going to have some growing pains, but I think voice chat will be a good thing in the end...
Posted by: Geoff | 08/09/2003 at 10:02 AM
I'm an avid gamer, and have been since I was 3 years old (so say my parents). I have an Xbox and Xbox Live. I used to play an RPG called Phantasy Star Online for Dreamcast; we communicated with a keyboard. I got the same game for Xbox and we use the headset to talk to each other. At first everyone is a little shy, not willing to say the same things they would have typed; after a couple of weeks we all got used to each other and became friends, and we started to get more into it, joking and being gamers. At the end of the day voice chat was better than the KB ever could be. Sure, it's awkward at first. Why? Because it's something new. Sooner or later you will get used to it, and the immersion factor will go higher than it ever did when you were using a keyboard.
Posted by: Steve | 08/09/2003 at 10:16 AM
well, i've been mooing for years and have always been myself. i don't like any of the roleplay stuff at all. i don't do muds or the lambda rpg because they don't appeal to me. i'd rather just chat.
similarly, when i am playing NHL2K3 on xbox live, i am just a guy playing a hockey video game. i say things like "have you heard of defense?" and "nice goal, get'em while ya can."
i like this ability, it's a very, very good thing. as a veteran of online games, your heart might sink, but mine is pounding hard as i fly down the ice, put one in the net, and scream "you've got mail!" into my headset.
Posted by: pax | 08/09/2003 at 10:44 AM
I think the most important part of not implementing voice in virtual worlds is that some of us still really don't want to deal with noobs screaming for this and that. For me, I can deal with some level 1 moron coming up and asking me for money/items/PL/whatever, in text. But if I had to listen to them, I would go positively *insane* after about five minutes. I would need an option to turn off voice.
But that would arguably put me at a disadvantage. Without perfect voice-to-text translation, I would never be able to know what all those other people - presumably some of which I need to intercommunicate with - are trying to tell me. Those of us who hate the prospect of listening to noobs whining all day would pretty much be screwed, because as Richard said, once the first game with some level of popularity has voice, they'll all have to cave in, and I'll be left with no choices from the market of virtual world games.
And, as Richard said, it'll be a non-issue once voice-to-text is perfected. I could then clack away happily at my keyboard, otherwise in silence, while Joe Moron is screaming at his computer for me to give him buffs.
Posted by: Dachannien | 08/09/2003 at 10:58 AM
Speaking as a server programmer for an in-development MMORPG, there's a good reason that I am trying to eliminate the possibility of voice chat in my game: network bandwidth. As anyone in the business knows, bandwidth = money. Voice chat could easily use up more bandwidth than the game itself, forcing us to either figure out how to use next to zero traffic for game packets (not likely), or raise the subscription price. Mentioning the latter to management usually makes them forget about the whole voice idea in the first place, fortunately. But there are still some strong adherents to the idea, and I'll admit that it does make certain parts of the game easier and more fun to play. It will be interesting to see what the eventual resolution is.
Posted by: Vaxhacker | 08/09/2003 at 11:01 AM