What is Natural Language Processing?
Natural language processing, commonly referred to as NLP, is a broad, multidisciplinary subarea of artificial intelligence that deals with automating communication in natural languages. Its goal is to enable machines to converse naturally with people in the languages those people already speak.
Why is it difficult?
Interdisciplinary (Reference)
To understand NLP and implement it well, one must be able to integrate knowledge from multiple disciplines, namely math, computer science, linguistics, cognitive science, psychology, and philosophy.
Ambiguity (Reference)
Ambiguity is an inherent part of natural language that, unfortunately, makes NLP much more difficult.
"I saw a man on a hill with a telescope."
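Who has the telescope, the speaker or the man? Who is on the hill? A human reader usually settles on one reading from context, but how would a computer interpret ambiguous sentences like this one?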
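One way to make the ambiguity concrete is to count how many parse trees a grammar assigns to the sentence. The sketch below is only an illustration: the grammar, the lexicon, and the function name count_parses are hypothetical and hand-written for this one sentence, not taken from any real parser. It uses CKY-style dynamic programming over spans of the sentence.

```python
# Toy lexicon: each word maps to the categories it can have (an assumption
# made for this example, not a real dictionary).
LEXICON = {
    "I": {"NP"},
    "saw": {"V"},
    "a": {"Det"},
    "man": {"N"}, "hill": {"N"}, "telescope": {"N"},
    "on": {"P"}, "with": {"P"},
}

# Toy grammar in binary form: (parent, left child, right child).
# A prepositional phrase (PP) may attach to a verb phrase or a noun phrase,
# which is exactly where the ambiguity comes from.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]


def count_parses(words):
    """Return the number of distinct parse trees rooted at S."""
    n = len(words)
    chart = {}  # (start, end, symbol) -> number of subtrees

    # Length-1 spans come straight from the lexicon.
    for i, word in enumerate(words):
        for symbol in LEXICON.get(word, ()):
            chart[(i, i + 1, symbol)] = 1

    # Build longer spans from every split point and every rule.
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    ways = (chart.get((start, mid, left), 0)
                            * chart.get((mid, end, right), 0))
                    if ways:
                        key = (start, end, parent)
                        chart[key] = chart.get(key, 0) + ways

    return chart.get((0, n, "S"), 0)


sentence = "I saw a man on a hill with a telescope".split()
print(count_parses(sentence))  # 5
```

Under this toy grammar the sentence has five structurally different readings, depending on what "on a hill" and "with a telescope" attach to. A real system would need some way to rank those readings, not just enumerate them.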
Sarcasm (Reference)
Understanding a sarcastic statement requires context. Machines are usually not given that context, which makes sarcasm detection considerably more difficult. Humans have the same problem when they are asked to interpret a statement with no context.
Multilingual support (Reference)
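If you thought natural language processing was hard enough for one language, try doing it in several. Each language has its own vocabulary, grammar, and idioms, so work done for one language does not automatically carry over to another.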
Question generation (Reference)
Suppose a machine is given the sentence:
"A triangle is a polygon with three edges and three vertices."
How would a machine go about asking questions related to the sentence without being explicitly programmed to do so? Would it ask "How many edges does a triangle have?" Or maybe it would ask "What is a polygon with three edges and three vertices?"
What is it used for?
Probably the most popular use of NLP is automating tasks that involve natural language. These tasks include word suggestion, spell checking, grammar checking, translation, conversation, data extraction, image captioning, and text summarization.
Word suggestion (Reference)
With NLP, software can suggest the next word or sequence of words as people type, helping them type faster and with fewer mistakes.
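As a rough illustration of how word suggestion can work, the sketch below counts which word follows which in a small corpus (a bigram model) and suggests the most frequent followers. The corpus and the function names train_bigrams and suggest are made up for this example. Real keyboards use far larger models, so treat this as a minimal sketch rather than how any particular product works.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def suggest(following, word, k=3):
    """Return up to k words most often seen after `word`."""
    counts = following.get(word.lower())
    if not counts:
        return []  # the word was never seen, so there is nothing to suggest
    return [candidate for candidate, _ in counts.most_common(k)]


# Hypothetical training text, used only for illustration.
corpus = [
    "natural language processing is fun",
    "natural language is ambiguous",
    "natural language processing is hard",
]
model = train_bigrams(corpus)
print(suggest(model, "language"))  # ['processing', 'is']
```

Here "processing" comes first because it follows "language" twice in the corpus, while "is" follows it only once.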
Translation (Reference)
There is so much information available to everyone these days, but without translation much of it remains inaccessible to anyone who does not read the language it was written in. NLP allows translation software to be more accurate than ever.
Conversational AI/Chatbots (Reference)
Conversational AI can provide a better experience for clients. Instead of waiting a long time for someone on the other end to pick up, they can talk to a machine that is on duty 24/7. Other applications of chatbots include therapy, group chat, and voice user interfaces.
Data extraction from unstructured text (Reference)
When documents or records are written in natural language, it is often difficult to extract meaning from them in a timely manner. NLP can help by quickly pulling meaningful information out of such documents. This matters especially in the healthcare industry, where time can be a deciding factor in life-or-death situations.
Image captioning (Reference)
NLP can generate text that describes what is happening in an image. Since this works for single images, I would expect it to extend to videos as well, using previous frames as context for the next frame to be captioned. This could be incredibly useful for people with hearing or visual impairments, especially for situational awareness, for example when driving a car.
Text summarization (Reference)
Automated text summarization can make it easier for people to understand complex or lengthy documents. It can be very helpful in the mass media industry for providing quick summaries of content such as news or podcasts.
Quiz making (Reference)
NLP could potentially be great for the education industry, allowing students or educators to generate quizzes. These quizzes could be generated using all sorts of different sources, such as informational websites, news articles, or textbook pages.
How does it work?
Rule-based NLP
Rule-based NLP is a classical approach that simply maps messages to responses using hand-written rules. Because it does not use statistics, this approach can be extremely limiting: each and every rule must be entered manually, and any message that no rule covers goes unanswered.
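To show what "mapping messages to responses" can look like in practice, here is a minimal sketch of a rule-based responder. The rules, the replies, and the names RULES, FALLBACK, and respond are all invented for this example. Each rule is a regular expression paired with a fixed reply, and the first rule that matches wins.

```python
import re

# Hand-written rules: (pattern, response). Every rule must be typed in
# manually, which is the main limitation of this approach.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE),
     "Hello! How can I help you?"),
    (re.compile(r"\bopening hours\b", re.IGNORECASE),
     "We are open from 9 am to 5 pm."),
    (re.compile(r"\b(bye|goodbye)\b", re.IGNORECASE),
     "Goodbye!"),
]

# Reply used when no rule matches.
FALLBACK = "Sorry, I do not understand."


def respond(message):
    """Return the response of the first rule whose pattern matches."""
    for pattern, response in RULES:
        if pattern.search(message):
            return response
    return FALLBACK


print(respond("Hello there"))                   # Hello! How can I help you?
print(respond("What are your opening hours?"))  # We are open from 9 am to 5 pm.
print(respond("Tell me a joke"))                # Sorry, I do not understand.
```

The last call shows the limitation: nothing in the rule list covers jokes, so the responder can only fall back to a canned apology until someone writes a new rule.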
Statistical NLP
Statistical NLP uses machine learning to estimate a probability distribution over possible responses given a message and its context (the previous messages), then selects the response with the highest probability.
Try it out now using Dialogflow!
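The idea of picking the most probable response can be sketched with simple counting. The example below is a deliberately naive illustration, not how modern systems are built: it estimates the probability of each response as its relative frequency among responses previously seen for the same context and message. The training triples and the names train, response_distribution, and best_response are hypothetical.

```python
from collections import Counter, defaultdict


def train(dialogues):
    """Count responses seen for each (context, message) pair.

    `dialogues` is an iterable of (context, message, response) triples,
    where context stands for the previous message(s) as a string.
    """
    counts = defaultdict(Counter)
    for context, message, response in dialogues:
        counts[(context, message)][response] += 1
    return counts


def response_distribution(counts, context, message):
    """Estimate P(response | context, message) by relative frequency."""
    observed = counts.get((context, message))
    if not observed:
        return {}  # this pair was never seen during training
    total = sum(observed.values())
    return {response: n / total for response, n in observed.items()}


def best_response(counts, context, message):
    """Select the response with the highest estimated probability."""
    distribution = response_distribution(counts, context, message)
    if not distribution:
        return None
    return max(distribution, key=distribution.get)


# Hypothetical training data, used only for illustration.
data = [
    ("", "hi", "hello"),
    ("", "hi", "hello"),
    ("", "hi", "hey"),
    ("hi", "how are you", "fine, thanks"),
]
model = train(data)
print(best_response(model, "", "hi"))  # hello
```

In this toy data "hello" is chosen because it accounts for two of the three responses observed for "hi". A counting model like this cannot answer anything it has not seen verbatim, which is why practical systems learn to generalize across similar messages instead.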
Resources
- https://scholar.google.com/scholar?as_ylo=2017&q=natural+language+processing&hl=en&as_sdt=0,39
- https://en.wikipedia.org/wiki/Natural_language_processing
- https://www.cl.cam.ac.uk/teaching/2002/NatLangProc/revised.pdf
- https://www.cs.utexas.edu/~mooney/cs343/slide-handouts/nlp.pdf
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-864-advanced-natural-language-processing-fall-2005/lecture-notes/lec01.pdf
- https://karczmarczuk.users.greyc.fr/TEACH/TAL/Doc/Handbook%20Of%20Natural%20Language%20Processing,%20Second%20Edition%20Chapman%20&%20Hall%20Crc%20Machine%20Learning%20&%20Pattern%20Recognition%202010.pdf
- https://aisel.aisnet.org/cgi/viewcontent.cgi?article=1672&context=amcis2001
- http://nlp.seas.harvard.edu/papers/
- https://nlp.stanford.edu/pubs/
- https://courses.cs.washington.edu/courses/cse447/18wi/slides/Parsing.pdf
- https://web.stanford.edu/class/cs224n/midterm/cs224n-midterm-2018-solution.pdf
- https://www.cs.cornell.edu/courses/cs5740/2018sp/lectures/02-textclass.pdf
- http://modsimworld.org/papers/2015/Natural_Language_Processing.pdf
- http://cs229.stanford.edu/proj2017/final-posters/5147972.pdf
- https://www2.deloitte.com/content/dam/Deloitte/ie/Documents/ie-dispruptive-chat-bots.pdf
- https://cs.stanford.edu/people/karpathy/cvpr2015.pdf
- http://media-lab.ccny.cuny.edu/wordpress/Publications/ACVR201604.pdf
- https://www.cs.colorado.edu/~martin/SLP/Updates/1.pdf
- http://www.cs.cmu.edu/~ark/mheilman/questions/papers/heilman-question-generation-dissertation.pdf
- https://ijpds.org/article/view/381/362
- https://arxiv.org/pdf/1702.01101.pdf
- http://cs229.stanford.edu/proj2015/044_report.pdf