Background

The Fear and Greed Index provided by the CNN site is a representative index that shows the sentiment of the stock market.

This index is calculated by aggregating MARKET MOMENTUM, STOCK PRICE STRENGTH, STOCK PRICE BREADTH, PUT AND CALL OPTIONS, MARKET VOLATILITY, SAFE HAVEN DEMAND, JUNK BOND DEMAND. In this post, I want to explain how to crawl the historical data of Fear-and-Greed Index and store it as a CSV file.

crawl the time series data of the Fear-Greed Index at the top of the site.

Prework

Crawling data from Interactive Chart

The way to get chart data on the web page was to refer to Medium Posting and found URL endpoints that provide chart data as JSON using Chrome Devtool.

url : https://production.dataviz.cnn.io/index/fearandgreed/graphdata

Click on the link to see that the time series data you want is included in the "fear_and_greed_historical" key value of the JSON.

💡

Set the date (yyyy-mm-DD) within the URL to find the index data from the set date.
https://production.dataviz.cnn.io/index/fearandgreed/graphdata/2023-02-10

Isn't it too simple?

If I could fetched the JSON data in this simple way, I wouldn't have left a post.
The problem was that when I tried to get the above data with a programming code, not a browser, the request was blocked by the server policy.
The Headless Browser I posted in the past makes it easy to solve these problems and get data.

Let's Coding!

I have described the basic project setup method in the post above, so please refer to it and start with the basic code base as shown below.

const puppeteer = require("puppeteer");

const run = async () => {
  try {
    const browser = await puppeteer.launch({
      args: ["--no-sandbox"],
    });
    const page = await browser.newPage();

  } catch (e) {
    console.log(e);
  }
};

run();

index.js

Now we can go to the page url we want to extract and select the appropriate selector to cut the data part neatly.
In this case, the selector value was selected as 'pre' because the targeted JSON data is well wrapped in the <pre></pre> tag.

const puppeteer = require("puppeteer");
const fs = require("fs");

const run = async () => {
  try {
    const browser = await puppeteer.launch({
      args: ["--no-sandbox"],
    });
    const page = await browser.newPage();
    await page.goto('https://production.dataviz.cnn.io/index/fearandgreed/graphdata');
    const selector = 'pre';
    await page.waitForSelector(selector);

    let data = await page.$eval(selector, (element) => element.textContent);
    let data_json = JSON.parse(data);
    
    console.log(data_json);
    browser.close();   
   
  } catch (e) {
    console.log(e);
  }
};

run();

index.js

When you run the script, you can see that JSON data is well taken.


node index.js


(out){
(out)  fear_and_greed: {
(out)    score: 70.1142857142857,
(out)    rating: 'greed',
(out)    timestamp: '2023-02-10T23:44:47+00:00',
(out)    previous_close: 72.1714285714286,
(out)    previous_1_week: 76.2571428571429,
(out)    previous_1_month: 51.8285714285714,
(out)    previous_1_year: 33.6
(out)  },
(out)  fear_and_greed_historical: {
(out)    timestamp: 1676072687000,
(out)    score: 70.1142857142857,
(out)    rating: 'greed',
(out)    data: [
(out)      [Object], [Object], [Object], [Object], [Object], [Object],
(out) ...

Transform Data

Now add the following data processing functions and complete the code that stores it in the form of a CSV file.

Convert the TIMESTAMP value of Index to the format as 'yyyy-mm-dd'
Convert the Array of Object data format to CSV String format

const puppeteer = require("puppeteer");
const fs = require("fs");

const run = async () => {
  try {
    const browser = await puppeteer.launch({
      args: ["--no-sandbox"],
    });
    const page = await browser.newPage();
    await page.goto('https://production.dataviz.cnn.io/index/fearandgreed/graphdata');
    const selector = 'pre';
    await page.waitForSelector(selector);

    let data = await page.$eval(selector, (element) => element.textContent);
    let data_json = JSON.parse(data);

    let data_array = data_json.fear_and_greed_historical.data

    browser.close();
    const csv_data = [
    [
      "datetime",
      "index_value",
      "rating"
    ],
    ...data_array.map(item => [
      timestamptodatestr(item.x),
      item.y,
      item.rating
    ])
  ]
   .map(e => e.join(","))
   .join("\n");
  
   console.log(csv_data);
   fs.appendFileSync("./index_data.csv", csv_data);
  } catch (e) {
    console.log(e);
  }
};

const timestamptodatestr = (ts) => {
    const d = new Date(ts);
    let datestr = d .toISOString().slice(0,10);
    return datestr;
}
run();

index.js

You can see that the data is well printed to the console window and stored as index_data.csv file!

Wrap-up

If the server blocks the request other than a browser, it seems to be a useful example of crawling data by using Headless Browser. (Note that excessive requests can be block from the server.)
The next post is to use the time series data extracted here to backtest simple strategy with this index in Google Spreadsheet ~ Thank you for reading! 😊

Crawling historical CNN-Fear-and-Greed Index data

Background

Prework

Crawling data from Interactive Chart

Isn't it too simple?

Let's Coding!

Transform Data

Wrap-up

You might also like

Why Yield to Maturity Matters for Fixed Income ETFs (HYG Case Study)

Simulating ETF trading strategy using CNN-Fear-and-Greed index

Writing a web crawling code using ChatGPT (Blackrock ETF example)

Live chat with stock and cryptocurrency price charts (Discord x Alpha.bot)

Subscribe to new posts.