9/16, Zach Coriarty “An Immersive System with Multi-modal Human-computer Interaction”

In “An Immersive System with Multi-modal Human-computer Interaction,” Zhao et al. argue that a multi-modal human-computer system that incorporates facial behavior, body gesture, and spatial location can help decipher users’ intent.

Zhao and his team use existing technology to monitor human behavior. For example, IBM’s Bluemix is used for speech recognition and helps them map the intent behind the words a user speaks; the example given in the text is that “a user could say ‘hello’ or some variation of it and the conversation service will map the intent as ‘greeting’.” They also use Kinect cameras for gesture tracking, so if a student points to something, the system can get an idea of what they’re interested in.
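To make the “hello” to “greeting” idea concrete, here is a minimal sketch of intent mapping in Python. This is not how Bluemix or the paper's conversation service works internally; it is a simple keyword-overlap approach I'm assuming for illustration, and the `INTENT_KEYWORDS` table and `map_intent` function are hypothetical names of my own.

```python
import re

# Hypothetical keyword table for illustration only; a real conversation
# service would learn intents from example utterances instead.
INTENT_KEYWORDS = {
    "greeting": {"hello", "hi", "hey"},
    "ask_translation": {"how", "say"},
}


def map_intent(utterance: str) -> str:
    """Return the intent whose keywords overlap most with the utterance."""
    # Lowercase and keep only alphabetic words so punctuation is ignored.
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    best_intent, best_score = "unknown", 0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(words & keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


if __name__ == "__main__":
    print(map_intent("Hello"))
    print(map_intent("How do I say noodles?"))
```

Under these assumptions, variations like “hi” or “hey there” land on the same “greeting” intent, which is roughly the behavior the quoted example describes.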

The system was put into action by simulating a restaurant and having people try to learn a new language by ordering a meal in Mandarin. The results of this test pointed overwhelmingly toward success. Most students liked that they could just ask how to say something and get a response; they also liked that they could simply point to a menu item they didn’t understand and the computer would tell them about it. And this is just one application of the system.

The idea behind this is similar to the TED talk we watched in class the other week, where the entrepreneur made a device that would tell you about whatever you place on it; for example, if you were in an airport and put a boarding pass on the device, it would tell you where your gate is. This idea of creating devices that tell you what you want to know before you even know it is extremely interesting to me and is why I study machine learning. If we could make a similar device that could tend to kids in a classroom while the teacher is busy, maybe with another child, then we would be opening up a whole new world of equality. To go into a bit more detail, it is well known that children learn at different rates and in different ways, and these differences cannot always be catered to in a classroom with only one teacher. So, if there were a computer that could teach children while also catering to their specific learning styles, the playing field would be far more level, and I think that is one of the most promising applications of AI in the short term.

4 thoughts on “9/16, Zach Coriarty “An Immersive System with Multi-modal Human-computer Interaction”

  1. I think the topic of this paper is really interesting. I didn’t even think about how this type of device could have such a giant impact on equality until you mentioned it. There is so much potential for AI, but, based on your idea, for it to have such an impact on learning could be life changing. While some fear AI, I think, if implemented correctly, it can lead to great societal improvement, along with an increase in opportunities for those who may not have had them otherwise.

  2. Similar to the other response on your blog post, I find it really interesting that you related this work to the concept of equality. While I know you mean this in terms of equality within perhaps one classroom or learning environment, I wonder if this could lead to issues regarding equality universally. Having access to artificial intelligence machinery like this I assume would be expensive and I’m sure it would take a long time for it to be universally accessible. However, I do agree with your point that it could “even out the playing field” amongst students with different learning styles and needs within the classroom.

  3. Building off of what everyone else said, your thoughts in regard to creating devices to create/ensure equality were interesting. I also think that devices like these could fill in the gaps and fulfill some needs that are not currently being satisfied. On the other hand, it is important to be mindful of the fact that technology can also increase disparities if not enough care is taken. If people do not have the same opportunities or access to technological advancements like these, they will fall farther behind. How can we ensure that technology closes gaps and doesn’t increase them? Any thoughts?

    1. That’s definitely a good point. Oftentimes, as we try to solve one problem we create another, and that is something to keep in mind for sure. However, I think the solution isn’t as complex as it seems because of the technological structure in the world right now. Between Google’s GCP, Amazon’s AWS, and Microsoft’s Azure, there is enough cloud computing power to support GPT-3 (the current, most advanced, publicized AI) in any school with a laptop. So what I’m trying to say is, the beauty of AI lies in that it could be almost entirely software based and accessed from any Chromebook-level computer around the globe (with wifi) through cloud computing. So, as long as we can keep the software at a low cost (or free), possibly through government programs, then I think we can minimize the expected disparities you mentioned.
      Though this idea does stray away from the hardware-heavy platform discussed in the article, I think it is not too far of a step to consider.
