In this tutorial you will learn how to search over a database (Postgres, MySQL) using Node.js, Elasticsearch, and Sequelize.
localhost:9200
. If you are on a Mac I recommend using homebrew.Create a file in your projects main directory called elasticsearch.js (This is for demonstration/testing purposes. This code can live anywhere in your project.)
touch elasticsearch.js
Download elasticsearch with NPM
npm install elasticsearch
In elasticsearch.js enter the following code and run it with node elasticsearch.js
. If the value returned is "Connected to Elasticsearch was successful" move on to the next step. If an error is returned, check to make sure you have an Elasticsearch server running on localhost:9200.
const elasticsearch = require('elasticsearch'); const client = new elasticsearch.Client({ hosts: [ 'http://localhost:9200'] }); client.ping({ requestTimeout: 30000, }, function(error) { if (error) { console.error('Cannot connect to Elasticsearch.'); } else { console.log('Connected to Elasticsearch was successful!'); } });
Now we will create our index. This is where the data returned from our database will live for Elasticsearch. I give it the same name as my project. We won't go too much into detail about what indices are but you can read more about them here. If the index is created succesfully you should delete this code from the file because we only need one index for this project. An Error may occur if you do not remove this code after running it succesfully.
Remove the Elasticsearch ping()
method from Step 2 and replace it with the Elasticsearch indices.create()
method. Run node elasticsearch.js
.
const elasticsearch = require('elasticsearch'); const client = new elasticsearch.Client({ hosts: [ 'http://localhost:9200'] }); client.indices.create({ index: 'csjoblist' }, function(error, response, status) { if (error) { console.log(error); } else { console.log("created a new index", response); } });
This is the step that does most of the heavy lifting. First thing we do is require in our Sequelize model called Job. Your model name and file requirement may look a bit different depending on your projects structure.
This time we replace the indices.create()
method from Step 3 with an immediately invoked async function. We do this so we can use Javascript's async features.
Inside the function we make an empty array called let bulk = [];
Next, we fetch the data we want to search over using Sequelize. Here I am using the await
keyword to resolve the promise returned by Sequelize, we can do this because we are inside an async function. The line const jobs = await Job.findAll()
is returning all the jobs we have stored in our database.
Once all the jobs from our database have been returned, we loop through them and pull the data we want and store it into a throwaway object called "data". We also do a little data sanitization with the replace()
function.
After that, we begin pushing our data into the bulk
array. But first we push a json object that contains information for Elasticsearch to properly index our data. The _index
and _type
parameters are required. For _index, name it whatever you named the index is Step 3. For _type, name it whatever you want thats associated with the data you are putting into it. I called mine "jobs_list". The _id
field is not required, however in order to avoid duplicate data I recommend defining it. I use my regular database id's for these but it can be any unique value you want from your database.
Once we push the json object for Elasticsearch to our bulk array, we then push our throwaway object with all our data to the bulk array. bulk.push(data)
Finally, once all our data has been properly formatted, we send the bulk array we created to the Elasticsearch client using the bulk()
method provided by Elasticsearch. client.bulk({body:bulk})
const Job = require('./database/models').Job; const elasticsearch = require('elasticsearch'); const client = new elasticsearch.Client({ hosts: [ 'http://localhost:9200'] }); (async () => { let bulk = []; const jobs = await Job.findAll() jobs.forEach((job, i) => { let data = { id: job.id, company: job.company, title: job.title, location: job.location, description: job.description.replace(/<[^>]*>?/gm, '').replace(/\r?\n|\r/g, " ") } bulk.push({index:{ _index:"csjoblist", _type:"jobs_list", _id: job.id } }) bulk.push(data) }); client.bulk({body:bulk}, function( error, response ){ if( error ){ console.log("Failed Bulk operation", error) } else { console.log(response); } }); })();
Replace the immediately invoked async function from Step 4 with the following code. We will be using Elasticsearch's search()
method. We pass in an object that contains the index, type, and body. Inside the body parameter is where we specify what to look up and what database field to look in. In this case we are looking in the database field "description" for the keyword "python".
This will return your search results. In a few easy steps we now have some comprehensive searching capabilities over our database.
const Job = require('./database/models').Job; const elasticsearch = require('elasticsearch'); const client = new elasticsearch.Client({ hosts: [ 'http://localhost:9200'] }); client.search({ index: 'csjoblist', type: 'jobs_list', body: { query: { match: { "description": "python" } }, } }, function (error, response,status) { if (error){ console.log(error) } else { console.log(response); response.hits.hits.forEach(function(hit){ console.log(hit); }) } });