This talk covers NLP analysis with Julia, with a focus on graph analysis.
A basic pipeline is shown, including pre-processing, graph embeddings, and semantic embeddings with TextGraphs.jl.
We briefly study the internals of this package along with a use case. For this, the heteronyms of the Portuguese poet Fernando Pessoa are contrasted with each other. We see how the structural properties of speech reflect different writing styles.
This talk covers NLP analysis with Julia, with a focus on graph analysis.
A basic pipeline with TextGraphs.jl is shown, including stemming, lemmatization, grammatical tagging, graph embeddings, and semantic embeddings.
We briefly study how software is written in Julia and the internals of this package, then proceed to a use case. For this, the heteronyms of the Portuguese poet Fernando Pessoa are contrasted with each other. We see how the structural properties of speech reflect different writing styles.
Graph embeddings transform text into networks based on the recurrence of symbols. The properties of these networks (e.g. density, centrality, size of the largest connected component) contain information about text structure and are useful in psychometrics.
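To make the idea concrete, here is a minimal, library-free sketch in Python. It does not use or reproduce the TextGraphs.jl API: every function name below is hypothetical, and the construction (one node per distinct word, one directed edge per pair of consecutive words) is an assumption chosen for illustration. A graph built from a single unbroken word sequence is always weakly connected, so the sketch reports the largest strongly connected component instead; that is also a choice made for this example.

```python
from collections import defaultdict


def build_word_graph(text):
    """Return (nodes, edges): one node per distinct word, one directed
    edge per pair of distinct consecutive words (assumed construction)."""
    words = text.lower().split()
    nodes = set(words)
    # A set keeps each edge once; immediate repetitions (self-loops) are skipped.
    edges = {(a, b) for a, b in zip(words, words[1:]) if a != b}
    return nodes, edges


def density(nodes, edges):
    """Directed density: observed edges over the n * (n - 1) possible ones."""
    n = len(nodes)
    return len(edges) / (n * (n - 1)) if n > 1 else 0.0


def degree_centrality(nodes, edges):
    """Total degree (in-degree plus out-degree) of every node."""
    degree = {node: 0 for node in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def _reachable(start, adjacency):
    """All nodes reachable from `start` by following `adjacency`."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def largest_strongly_connected_component(nodes, edges):
    """Largest set of nodes that can all reach one another.
    Simple quadratic approach, adequate for short texts."""
    forward = defaultdict(set)
    backward = defaultdict(set)
    for a, b in edges:
        forward[a].add(b)
        backward[b].add(a)
    best = set()
    for node in nodes:
        component = _reachable(node, forward) & _reachable(node, backward)
        if len(component) > len(best):
            best = component
    return best


nodes, edges = build_word_graph("the cat saw the dog and the dog saw the cat run")
print(len(nodes), len(edges))                                   # 6 8
print(round(density(nodes, edges), 3))                          # 0.267
print(len(largest_strongly_connected_component(nodes, edges)))  # 5
```

In this toy sentence the word "run" is reached but never left, so it falls outside the largest strongly connected component, while the recurring word "the" has the highest degree.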
TextGraphs.jl uses RCall.jl and R::udpipe for some of its functionality, which gives us a glimpse of how effortlessly R and Julia interoperate.
Semantic embeddings provide measures of speech coherence. They are also useful for characterizing the ideas in a text.
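One common way to turn semantic embeddings into a coherence score is to average the cosine similarity between consecutive sentences. The Python sketch below illustrates that idea only; it is not the measure implemented in TextGraphs.jl. The word vectors are made-up three-dimensional values (real embeddings come from a trained model), and all names are hypothetical.

```python
import math

# Hypothetical toy vectors, invented for this illustration.
TOY_VECTORS = {
    "sea": (0.9, 0.1, 0.0),
    "ship": (0.8, 0.2, 0.1),
    "wave": (0.85, 0.05, 0.1),
    "tax": (0.0, 0.1, 0.9),
}


def sentence_vector(sentence, vectors):
    """Average the vectors of the known words; None if no word is known."""
    known = [vectors[w] for w in sentence.lower().split() if w in vectors]
    if not known:
        return None
    dims = len(known[0])
    return tuple(sum(v[i] for v in known) / len(known) for i in range(dims))


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coherence(sentences, vectors):
    """Mean cosine similarity between consecutive sentence vectors."""
    embedded = [sentence_vector(s, vectors) for s in sentences]
    embedded = [v for v in embedded if v is not None]
    sims = [cosine(u, v) for u, v in zip(embedded, embedded[1:])]
    return sum(sims) / len(sims) if sims else 0.0


on_topic = coherence(["sea ship", "wave sea"], TOY_VECTORS)
off_topic = coherence(["sea ship", "tax"], TOY_VECTORS)
print(on_topic > off_topic)  # True: staying on one topic scores higher
```

Averaging word vectors is the simplest possible sentence representation; it is used here only to keep the sketch self-contained.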