Tag Archives: voice recognition

Things 119: Journey, Tree Record, Climb and Descent

Game: Journey
If you’re a gamer, you’ve probably heard about Journey. If you’re not a gamer, then you should have heard about it anyway, because it’s quite beautiful and amazing, and only takes 2-3 hours to play through, which means you could visit a friend who has a PS3 and play it in one sitting.

But why would you want to do that?

In this interview, Jenova Chen, the game’s creative director, says:

“Augustine wrote: ‘People will venture out to the height of the mountain to seek for wonder. They will stand and stare at the width of the ocean to be filled with wonder. But they will pass one another in the street and feel nothing. Yet every individual is a miracle. How strange that nobody sees the wonder in one another.’

“There’s this assumption in video games that if you run into a random player over the Internet, it’s going to be a bad experience. You think that they will be an asshole, right? But listen: none of us was born to be an asshole. […] It is the system that made the player cruel, not the player themselves. So if I get the system correct, the players are human and their humanity will be drawn out. I want to bring the human value into a game and change the player’s assumption.”

The reason I say the game is amazing is that it succeeds at this seemingly impossible aim. I’ve played through it a few times now, and each time I’ve had at least one incredibly positive and sustained play experience with a complete stranger.

http://www.youtube.com/watch?v=_mF8KkDiIdk

[Video not working, try this search – metatim, 02/08/15]

Film: The Cabin in the Woods
If you like horror films, you really should watch The Cabin in the Woods. I don’t think it quite succeeds at Joss Whedon’s stated aim (which you shouldn’t look up until after you’ve seen it), but it’s worth it for the wonderfully insane final half hour or so, which, impressively, the trailer largely resists showing any of:

http://www.youtube.com/watch?v=7ENUBUdFswM

[Video not working, try this search – metatim, 02/08/15]

Video: Tree Record / Years
The technology to turn wistful ideas into a reality is in our hands. Look at this device and imagine what you want it to do:

Now check out the video, where it does exactly that:

Read a bit about it here.

Puzzle: Climb and Descent
Tarim recently introduced me to levels 1 and 2 of a puzzle I’d only ever previously heard set at level 3. This week: level 1.

On Day 1, Joss Whedon hikes his way up a mountain, starting at the bottom at midday, and reaching the top (with a few rest stops along the way) 12 hours later, at midnight. He basks in the glory of his achievement for 12 hours, then at midday on Day 2 sets off back down the mountain, reaching the bottom 12 hours later again, at midnight.

The question: is there a particular time at which he passed through exactly the same altitude on both his Day 1 ascent and Day 2 descent?
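If you’d rather poke at the question numerically before reasoning it out, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the 2000 m summit, the shape of the `ascent` and `descent` profiles (including the rest stop), and the helper `find_matching_time`. It only checks one invented pair of journeys, so it doesn’t settle the puzzle in general; swap in your own profiles and see what happens.

```python
def ascent(t):
    """Hypothetical Day 1 altitude in metres, t hours after midday."""
    if t < 4:
        return 250.0 * t                      # steady climb to 1000 m
    if t < 6:
        return 1000.0                         # two-hour rest stop
    return 1000.0 + 1000.0 * (t - 6) / 6      # reach 2000 m at midnight


def descent(t):
    """Hypothetical Day 2 altitude in metres, t hours after midday."""
    return 2000.0 * (1 - t / 12) ** 2         # quick at first, slower near the bottom


def find_matching_time(up, down, hours=12.0, steps=1200):
    """Scan the clock for a moment where both journeys are at the same altitude.

    Returns the time in hours after midday, or None if the scan finds none.
    """
    lo = 0.0
    gap_lo = up(lo) - down(lo)
    if gap_lo == 0:
        return lo
    for i in range(1, steps + 1):
        hi = hours * i / steps
        gap_hi = up(hi) - down(hi)
        if gap_hi == 0:
            return hi
        if (gap_lo < 0) != (gap_hi < 0):
            # The altitude gap changed sign between lo and hi: narrow it down.
            for _ in range(50):
                mid = (lo + hi) / 2
                gap_mid = up(mid) - down(mid)
                if (gap_lo < 0) == (gap_mid < 0):
                    lo, gap_lo = mid, gap_mid
                else:
                    hi = mid
            return (lo + hi) / 2
        lo, gap_lo = hi, gap_hi
    return None


t = find_matching_time(ascent, descent)
if t is None:
    print("No matching time found for these profiles")
else:
    minutes = round(t * 60)
    print(f"Same altitude about {minutes // 60}h {minutes % 60:02d}m after midday")
```

Try making the descent much faster, or adding more rest stops to the ascent, and see whether you can construct a pair of journeys for which the function returns `None`.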

Answer: Voice recognition
A long time ago I asked what one could do to improve the chances of having your words understood by one of the many would-be voice-recognition services we find around us today.

After a bunch of googling around, the answers seem to be:

  • Reduce ambient noise where possible
  • Don’t speak too loudly or too close to the microphone
  • Leave longer gaps between words than you might in natural speech
  • Speak with the accent the device was tested for

That last point is the one I’m most interested in. The question is, what accent should you use?

It seems the various companies offering this service (Apple/Siri, Google voice search, Xbox Kinect) do have to release different versions for different parts of the English-speaking world (I don’t have a good source for that, but it’s the impression I get from their staged releases, people’s reported experiences, and common sense).

My next plan is to carry out a small personal test in which I try putting on different accents. Results will of course be reported here.

@metatim

Things 116: Cloud Phase Time-Lapse, 3D Map, Better Tube Map

Video
Point a camera at the sky, create a time lapse video of the clouds. Do the same thing every day of the year. Play back all the videos simultaneously in a grid. Voilà: a kind of phase-diagram visualisation, with seconds representing minutes and space representing seasons. Brilliant.

More detail here. Via Data Pointed.

Link
This is apparently pretty old, and with Google Earth and Street View already taken for granted it’s difficult to appreciate how impressive this is: in-browser 3D maps of major cities by Nokia. A plugin is required, and the sad thing is that I imagine that small barrier is enough to vastly reduce the number of people that will actually try it out.

Picture
Various incarnations of the London tube map regularly feature in Things: in the past I’ve posted about a to-scale tube map, a curvy tube map, and a travel-time interactive tube map.

Unsurprisingly, I rather like Mark Noad’s version, which is an ambitious attempt to make a tube map that is not just interestingly different but actually better than the current canonical version. By retaining the simplicity of design but improving geographic accuracy, I would say it succeeds.

Puzzle
This week, a very first world problem. If voice recognition software fails to understand something you say (e.g. Google voice search, Xbox 360 Kinect voice commands, or Siri), what do you do? Having had this happen a few times now, I’m very aware that the natural human response of just saying the same thing but louder might not actually be the best thing to do. (I also imagine my neighbours don’t need to hear me shouting “Xbox go back! Xbox! Go! Back! Xbox go frickin’ back! Fine, don’t then!”)

For example, other approaches to ensure your input is recognised could include: reduce background noise; enunciate more clearly; speak in a monotone; move closer to or further away from the microphone; use a different phrasing; or attempt to put on an American accent.

Which of these is most likely to work? Is there a better approach that I’ve not included here? Is just speaking loudly actually the best approach after all?

Or is the failure rate of voice recognition inevitable and unacceptable in most contexts, and the whole notion flawed from the outset?

@metatim