I’ve always been casually interested in Natural Language Processing (NLP), the field of computer science concerned with extracting information from natural human language. I have no training or education in the field, so I’m not in a position to contribute much, but I am definitely interested in seeing where the state of the art is, and in particular how powerful open-source NLP libraries have gotten (Google and Microsoft certainly have more powerful closed-source systems, but those don’t really help me).
A few years ago I started playing with Apache’s OpenNLP project. I’m a big fan of the Apache foundation and their libraries, but I found myself very frustrated by OpenNLP’s lack of documentation and the hacky-feeling interfaces the library exposed. Recently, however, I took another look at the available NLP libraries and came across Stanford’s CoreNLP project. CoreNLP, as it turns out, is an awesome project, and it took almost zero effort to get their example demo working.
As a total NLP beginner, the sentence parsing functionality was the most immediately approachable example. Sentence parsing takes a natural English sentence:
“I am parsing an example sentence.”
and breaks it down into component tokens and their relations:
(ROOT (S (NP (PRP I)) (VP (VBP am) (VP (VBG parsing) (NP (DT an) (NN example) (NN sentence)))) (. .)))
where each token type corresponds to a particular phrase or word type: “NP” means “Noun Phrase”, “VBG” means “Verb, gerund or present participle”, and so forth (I’ve been referencing this as a complete token list.)
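The bracketed output is simple enough to pick apart without any NLP library at all. As a minimal sketch (plain Python, not part of CoreNLP's own API), here's one way to pull the (tag, word) pairs out of the leaves of a parse like the one above:

```python
import re

def leaves(parse):
    """Return the (tag, word) pairs at the leaves of a bracketed parse.

    Leaves look like "(TAG word)" with no nested parentheses inside,
    so a simple regular expression is enough to find them.
    """
    return re.findall(r"\((\S+) ([^()\s]+)\)", parse)

tree = ("(ROOT (S (NP (PRP I)) (VP (VBP am) (VP (VBG parsing) "
        "(NP (DT an) (NN example) (NN sentence)))) (. .)))")

print(leaves(tree))
# [('PRP', 'I'), ('VBP', 'am'), ('VBG', 'parsing'),
#  ('DT', 'an'), ('NN', 'example'), ('NN', 'sentence'), ('.', '.')]
```

This only looks at the leaves; recovering the full nested structure (the “NP”, “VP” phrase groupings) would take a small recursive parser, but for skimming part-of-speech tags this is plenty.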
When I have time to work with the two libraries a bit more, I’ll hopefully update this post with something more interesting.