How To Search a Postgres Database With Node.js, Elasticsearch, and Sequelize.   Mark Penovich -



Introduction

In this tutorial you will learn how to search over a database (Postgres, MySQL) using Node.js, Elasticsearch, and Sequelize.

Assumptions
  • A NodeJS app is running with Sequelize installed.
  • Sequelize is hooked up to a database (Postgres, MySQL, etc...).
  • Your database is populated with data and you have models or other ways to retrieve your data.
  • Elasticsearch has been downloaded onto your computer and is running on localhost:9200. If you are on a Mac I recommend using homebrew.


Step 1: Prepare your project

Create a file in your projects main directory called elasticsearch.js (This is for demonstration/testing purposes. This code can live anywhere in your project.)

touch elasticsearch.js

Download elasticsearch with NPM

npm install elasticsearch


Step 2: Connect to Elasticsearch with node

In elasticsearch.js enter the following code and run it with node elasticsearch.js. If the value returned is "Connected to Elasticsearch was successful" move on to the next step. If an error is returned, check to make sure you have an Elasticsearch server running on localhost:9200.

                
                  const elasticsearch = require('elasticsearch');

                  const client = new elasticsearch.Client({
                      hosts: [ 'http://localhost:9200']
                  });

                  client.ping({
                      requestTimeout: 30000,
                  }, function(error) {
                      if (error) {
                          console.error('Cannot connect to Elasticsearch.');
                      } else {
                          console.log('Connected to Elasticsearch was successful!');
                      }
                  });
                
            

elasticsearch.js


Step 3: Create an Index

Now we will create our index. This is where the data returned from our database will live for Elasticsearch. I give it the same name as my project. We won't go too much into detail about what indices are but you can read more about them here. If the index is created succesfully you should delete this code from the file because we only need one index for this project. An Error may occur if you do not remove this code after running it succesfully.

Remove the Elasticsearch ping() method from Step 2 and replace it with the Elasticsearch indices.create() method. Run node elasticsearch.js.

                
                  const elasticsearch = require('elasticsearch');

                  const client = new elasticsearch.Client({
                      hosts: [ 'http://localhost:9200']
                  });

                  client.indices.create({
                        index: 'csjoblist'
                    }, function(error, response, status) {
                        if (error) {
                            console.log(error);
                        } else {
                            console.log("created a new index", response);
                        }
                  });
                
            

elasticsearch.js


Step 4: Getting Data from our Database and Putting it into our Index

This is the step that does most of the heavy lifting. First thing we do is require in our Sequelize model called Job. Your model name and file requirement may look a bit different depending on your projects structure.

This time we replace the indices.create() method from Step 3 with an immediately invoked async function. We do this so we can use Javascript's async features.

Inside the function we make an empty array called let bulk = [];

Next, we fetch the data we want to search over using Sequelize. Here I am using the await keyword to resolve the promise returned by Sequelize, we can do this because we are inside an async function. The line const jobs = await Job.findAll() is returning all the jobs we have stored in our database.

Once all the jobs from our database have been returned, we loop through them and pull the data we want and store it into a throwaway object called "data". We also do a little data sanitization with the replace() function.

After that, we begin pushing our data into the bulk array. But first we push a json object that contains information for Elasticsearch to properly index our data. The _index and _type parameters are required. For _index, name it whatever you named the index is Step 3. For _type, name it whatever you want thats associated with the data you are putting into it. I called mine "jobs_list". The _id field is not required, however in order to avoid duplicate data I recommend defining it. I use my regular database id's for these but it can be any unique value you want from your database.

Once we push the json object for Elasticsearch to our bulk array, we then push our throwaway object with all our data to the bulk array. bulk.push(data)

Finally, once all our data has been properly formatted, we send the bulk array we created to the Elasticsearch client using the bulk() method provided by Elasticsearch. client.bulk({body:bulk})

                
                  const Job = require('./database/models').Job;
                  const elasticsearch = require('elasticsearch');

                  const client = new elasticsearch.Client({
                      hosts: [ 'http://localhost:9200']
                  });

                  (async () => {
                      let bulk = [];
                      const jobs = await Job.findAll()

                      jobs.forEach((job, i) => {

                        let data = {
                          id: job.id,
                          company: job.company,
                          title: job.title,
                          location: job.location,
                          description: job.description.replace(/<[^>]*>?/gm, '').replace(/\r?\n|\r/g, " ")
                        }

                        bulk.push({index:{
                                _index:"csjoblist",
                                _type:"jobs_list",
                                _id: job.id
                            }
                        })

                        bulk.push(data)
                      });

                      client.bulk({body:bulk}, function( error, response  ){
                        if( error ){
                          console.log("Failed Bulk operation", error)
                        } else {
                          console.log(response);
                        }
                      });
                  })();
                
            

elasticsearch.js


Step 5: Search for Keyword in your newly created Index

Replace the immediately invoked async function from Step 4 with the following code. We will be using Elasticsearch's search() method. We pass in an object that contains the index, type, and body. Inside the body parameter is where we specify what to look up and what database field to look in. In this case we are looking in the database field "description" for the keyword "python".

This will return your search results. In a few easy steps we now have some comprehensive searching capabilities over our database.

                
                  const Job = require('./database/models').Job;
                  const elasticsearch = require('elasticsearch');

                  const client = new elasticsearch.Client({
                      hosts: [ 'http://localhost:9200']
                  });

                  client.search({
                      index: 'csjoblist',
                      type: 'jobs_list',
                      body: {
                        query: {
                          match: { "description": "python" }
                        },
                      }
                  }, function (error, response,status) {
                      if (error){
                        console.log(error)
                      }
                      else {
                        console.log(response);
                        response.hits.hits.forEach(function(hit){
                          console.log(hit);
                        })
                      }
                  });
                
            

elasticsearch.js