Imperial Library

Index
Title Page · Copyright and Credits
Natural Language Processing with Java Second Edition
Dedication · Packt Upsell
Why subscribe? · PacktPub.com
Contributors
About the authors · About the reviewers · Packt is searching for authors like you
Preface
Who this book is for · What this book covers · To get the most out of this book
Download the example code files · Download the color images · Conventions used
Get in touch
Reviews
Introduction to NLP
What is NLP? · Why use NLP? · Why is NLP so hard? · Survey of NLP tools
Apache OpenNLP · Stanford NLP · LingPipe · GATE · UIMA · Apache Lucene Core
Deep learning for Java · Overview of text-processing tasks
Finding parts of text · Finding sentences · Feature-engineering · Finding people and things · Detecting parts of speech · Classifying text and documents · Extracting relationships · Using combined approaches
Understanding NLP models
Identifying the task · Selecting a model · Building and training the model · Verifying the model · Using the model
Preparing data · Summary
Finding Parts of Text
Understanding the parts of text · What is tokenization?
Uses of tokenizers
Simple Java tokenizers
Using the Scanner class
Specifying the delimiter
Using the split method · Using the BreakIterator class · Using the StreamTokenizer class · Using the StringTokenizer class · Performance considerations with Java core tokenization
NLP tokenizer APIs
Using the OpenNLPTokenizer class
Using the SimpleTokenizer class · Using the WhitespaceTokenizer class · Using the TokenizerME class
Using the Stanford tokenizer
Using the PTBTokenizer class · Using the DocumentPreprocessor class · Using a pipeline · Using LingPipe tokenizers
Training a tokenizer to find parts of text · Comparing tokenizers
Understanding normalization
Converting to lowercase · Removing stopwords
Creating a StopWords class · Using LingPipe to remove stopwords
Using stemming
Using the Porter Stemmer · Stemming with LingPipe
Using lemmatization
Using the StanfordLemmatizer class · Using lemmatization in OpenNLP
Normalizing using a pipeline
Summary
Finding Sentences
The SBD process · What makes SBD difficult? · Understanding the SBD rules of LingPipe's HeuristicSentenceModel class · Simple Java SBDs
Using regular expressions · Using the BreakIterator class
Using NLP APIs
Using OpenNLP
Using the SentenceDetectorME class · Using the sentPosDetect method
Using the Stanford API
Using the PTBTokenizer class · Using the DocumentPreprocessor class · Using the StanfordCoreNLP class
Using LingPipe
Using the IndoEuropeanSentenceModel class · Using the SentenceChunker class · Using the MedlineSentenceModel class
Training a sentence-detector model
Using the Trained model · Evaluating the model using the SentenceDetectorEvaluator class
Summary
Finding People and Things
Why is NER difficult? · Techniques for name recognition
Lists and regular expressions · Statistical classifiers
Using regular expressions for NER
Using Java's regular expressions to find entities · Using the RegExChunker class of LingPipe
Using NLP APIs
Using OpenNLP for NER
Determining the accuracy of the entity · Using other entity types · Processing multiple entity types
Using the Stanford API for NER · Using LingPipe for NER
Using LingPipe's named entity models · Using the ExactDictionaryChunker class
Building a new dataset with the NER annotation tool · Training a model
Evaluating a model
Summary
Detecting Part of Speech
The tagging process
The importance of POS taggers · What makes POS difficult?
Using the NLP APIs
Using OpenNLP POS taggers
Using the OpenNLP POSTaggerME class for POS taggers · Using OpenNLP chunking · Using the POSDictionary class
Obtaining the tag dictionary for a tagger · Determining a word's tags · Changing a word's tags · Adding a new tag dictionary · Creating a dictionary from a file
Using Stanford POS taggers
Using Stanford MaxentTagger · Using the MaxentTagger class to tag textese · Using the Stanford pipeline to perform tagging
Using LingPipe POS taggers
Using the HmmDecoder class with Best_First tags · Using the HmmDecoder class with NBest tags · Determining tag confidence with the HmmDecoder class
Training the OpenNLP POSModel
Summary
Representing Text with Features
N-grams · Word embedding · GloVe · Word2vec · Dimensionality reduction · Principal component analysis · Distributed stochastic neighbor embedding · Summary
Information Retrieval
Boolean retrieval · Dictionaries and tolerant retrieval
Wildcard queries · Spelling correction · Soundex
Vector space model · Scoring and term weighting · Inverse document frequency · TF-IDF weighting · Evaluation of information retrieval systems · Summary
Classifying Texts and Documents
How classification is used · Understanding sentiment analysis · Text-classifying techniques · Using APIs to classify text
Using OpenNLP
Training an OpenNLP classification model · Using DocumentCategorizerME to classify text
Using the Stanford API
Using the ColumnDataClassifier class for classification · Using the Stanford pipeline to perform sentiment analysis
Using LingPipe to classify text
Training text using the Classified class · Using other training categories · Classifying text using LingPipe · Sentiment analysis using LingPipe · Language identification using LingPipe
Summary
Topic Modeling
What is topic modeling? · The basics of LDA · Topic modeling with MALLET
Training · Evaluation
Summary
Using Parsers to Extract Relationships
Relationship types · Understanding parse trees · Using extracted relationships · Extracting relationships · Using NLP APIs
Using OpenNLP · Using the Stanford API
Using the LexicalizedParser class · Using the TreePrint class · Finding word dependencies using the GrammaticalStructure class
Finding coreference resolution entities
Extracting relationships for a question-answer system
Finding the word dependencies · Determining the question type · Searching for the answer
Summary
Combined Pipeline
Preparing data · Using boilerpipe to extract text from HTML · Using POI to extract text from Word documents · Using PDFBox to extract text from PDF documents · Using Apache Tika for content analysis and extraction · Pipelines · Using the Stanford pipeline · Using multiple cores with the Stanford pipeline · Creating a pipeline to search text · Summary
Creating a Chatbot
Chatbot architecture · Artificial Linguistic Internet Computer Entity
Understanding AIML · Developing a chatbot using ALICE and AIML
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think

Chief Librarian: Las Zenow <zenow@riseup.net>

This is a mirror of the Tor onion service:
http://kx5thpx2olielkihfyo4jgjqfb7zx7wxr3sd4xzt26ochei4m6f7tayd.onion