Mapping WordPress Posts to Elasticsearch

greg.blog

I thought I’d share the Elasticsearch type mapping I am using for WordPress posts. We’ve refined it over a number of iterations and it combines dynamic templates and multi_field mappings along with a number of more standard mappings. So this is probably a good general example of how to index real data from a traditional SQL database into Elasticsearch.

If you aren’t familiar with the WordPress database scheme it looks like this:

These Elasticsearch mappings focus on the wp_posts, wp_term_relationships, wp_term_taxonomy, and wp_terms tables.

To simplify things I’ll just index using an English analyzer and leave discussing multi-lingual analyzers to a different post.

A few notes on the analyzers:

  • The minimal_english stemmer only removes plurals rather than potentially butchering the difference between words like “computer”, “computes”, and “computing”.
  • Lowercase keyword analyzer makes doing an exact search without case possible.

Let’s take a look at the post mapping:

Most of the…

View original post 379 more words

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s