{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# AYLIEN NEWS API: A Starter Guide for Python Users" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Download Jupyter notebook [here](https://learn.aylien.com/download/news_api_python_starter_guide.ipynb)\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "toc": true }, "source": [ "

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "In this document, we will review four of the AYLIEN News API's most commonly used endpoints:\n", "- stories (pull articles that have been enriched by AYLIEN's NLP technology)\n", "- timeseries (pull the volume of stories that meet your query over time)\n", "- trends (identify the most prevalent entities, concepts or keywords that appear in the stories that meet your criteria)\n", "- clusters (identify clusters of similar news stories to investigate events)\n", "\n", "We will utilise AYLIEN's Python SDK (Software Development Kit) and also show you some helpful code to start wrangling the data in Python using Pandas and visualizing it using Plotly. \n", "\n", "As an exercise, we will focus on pulling news stories related to Citibank, to show how these different endpoints can be used in combination to investigate a topic of your choice. \n", "\n", "Please note, comprehensive documentation on how to use the News API can be found [here](https://docs.aylien.com/newsapi/#getting-started)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Technical Set-Up\n", "Here we will outline how to connect to AYLIEN's News API and define some useful functions to make pulling and analysing our data easier. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configuring Your API Connection\n", "First things first — we need to connect to the News API. Make sure that you have installed the aylien_news_api library using pip. The code below demonstrates how to connect to the API and also imports some other libraries that will be useful later. \n", "\n", "Don't forget to enter your API credentials in order to connect to the API! If you don't have any credentials yet, you can sign up for a free trial [here](https://newsapi.aylien.com/signup)." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Complete\n" ] } ], "source": [ "from __future__ import print_function\n", "\n", "\n", "# install packages if not installed already\n", "# !pip install requests\n", "# !pip install pandas\n", "# !pip install numpy\n", "# !pip install plotly\n", "# !pip install aylien_news_api\n", "# !pip install chart_studio\n", "# !pip install tqdm\n", "# !pip install python-dateutil\n", "# !pip install wordcloud\n", "\n", "import os\n", "import requests\n", "import datetime\n", "from dateutil.tz import tzutc\n", "import json\n", "import time\n", "import pandas as pd\n", "import numpy as np\n", "import math\n", "from tqdm import tqdm\n", "from pprint import pprint\n", "\n", "# for visualization\n", "import plotly.graph_objs as go\n", "import chart_studio.plotly as py\n", "from plotly.subplots import make_subplots\n", "\n", "# replace 'ID' and 'KEY' with your own API credentials\n", "headers = {\n", "    'X-AYLIEN-NewsAPI-Application-ID': 'ID',\n", "    'X-AYLIEN-NewsAPI-Application-Key': 'KEY'\n", "}\n", "\n", "\n", "print('Complete')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define Functions to Pull Data\n", "The functions below pull data from the API using GET requests. Some return a list of objects (e.g. the get_stories function); others return Pandas dataframes (e.g. the get_timeseries function)." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "#=======================================================================================\n", "def get_timeseries(params, print_params = None, print_count = None):\n", "    if print_params is None or print_params == 'yes':\n", "        pprint(params)\n", "\n", "    response = requests.get('https://api.aylien.com/news/time_series', params=params, headers=headers).json()\n", "\n", "    if 'errors' in response or 'error' in response:\n", "        pprint(response)\n", "\n", "    # convert to dataframe\n", "    timeseries_data = pd.DataFrame(response['time_series'])\n", "\n", "    # parse the timestamps, then keep only the date\n", "    timeseries_data['published_at'] = pd.to_datetime(timeseries_data['published_at'])\n", "    timeseries_data['published_at'] = timeseries_data['published_at'].dt.date\n", "\n", "    if print_count is None or print_count == 'yes':\n", "        print('Number of stories returned : ' + str(format(timeseries_data['count'].sum(), \",d\")))\n", "\n", "    return timeseries_data\n", "\n", "\n", 
"#=======================================================================================\n", "def get_stories(params, print_params = None, print_count = None, print_story = None):\n", "    if print_params is None or print_params == 'yes':\n", "        pprint(params)\n", "\n", "    # work on a copy so that setting the cursor does not modify the caller's dict\n", "    params = dict(params)\n", "\n", "    fetched_stories = []\n", "    stories = None\n", "    while stories is None or len(stories) > 0:\n", "        try:\n", "            response = requests.get('https://api.aylien.com/news/stories', params=params, headers=headers).json()\n", "        except Exception as e:\n", "            # stop instead of retrying forever if the request keeps failing\n", "            print(e)\n", "            break\n", "\n", "        if 'errors' in response or 'error' in response:\n", "            # an error response has no 'stories' key, so stop paging here\n", "            pprint(response)\n", "            break\n", "\n", "        stories = response['stories']\n", "\n", "        if len(stories) > 0:\n", "            print(stories[0]['title'])\n", "            print(stories[0]['links']['permalink'])\n", "\n", "        # request the next page on the following iteration\n", "        params['cursor'] = response['next_page_cursor']\n", "\n", "        fetched_stories += stories\n", "\n", "        if (print_story is None or print_story == 'yes') and len(stories) > 0:\n", "            pprint(stories[0]['title'])\n", "\n", "        if print_count is None or print_count == 'yes':\n", "            print(\"Fetched %d stories. Total story count so far: %d\" %(len(stories), len(fetched_stories)))\n", "\n", "    return fetched_stories\n", "\n", 
"#=======================================================================================\n", "def get_top_ranked_stories(params, no_stories, print_params = None, print_count = None):\n", "    if print_params is None or print_params == 'yes':\n", "        pprint(params)\n", "\n", "    # work on a copy so that setting the cursor does not modify the caller's dict\n", "    params = dict(params)\n", "\n", "    fetched_stories = []\n", "    stories = None\n", "    while stories is None or (len(stories) > 0 and len(fetched_stories) < no_stories):\n", "        try:\n", "            response = requests.get('https://api.aylien.com/news/stories', params=params, headers=headers).json()\n", "        except Exception as e:\n", "            # stop instead of retrying forever if the request keeps failing\n", "            print(e)\n", "            break\n", "\n", "        if 'errors' in response or 'error' in response:\n", "            # an error response has no 'stories' key, so stop paging here\n", "            pprint(response)\n", "            break\n", "\n", "        stories = response['stories']\n", "\n", "        if len(stories) > 0:\n", "            print(stories[0]['title'])\n", "            print(stories[0]['links']['permalink'])\n", "\n", "        # request the next page on the following iteration\n", "        params['cursor'] = response['next_page_cursor']\n", "\n", "        fetched_stories += stories\n", "\n", "        if print_count is None or print_count == 'yes':\n", "            print(\"Fetched %d stories. Total story count so far: %d\" %(len(stories), len(fetched_stories)))\n", "\n", "    return fetched_stories\n", "\n", "\n", 
"#=======================================================================================\n", "def get_clusters(params={}):\n", "    #pprint(params)\n", "\n", "    response = requests.get('https://api.aylien.com/news/clusters', params=params, headers=headers).json()\n", "\n", "    if 'errors' in response or 'error' in response:\n", "        pprint(response)\n", "\n", "    return response\n", "\n", "\n", "#=======================================================================================\n", "# pull trends data to identify most frequently occurring entities or keywords\n", "def get_trends(params={}):\n", "    #pprint(params)\n", "\n", "    response = requests.get('https://api.aylien.com/news/trends', params=params, headers=headers).json()\n", "\n", "    if 'errors' in response or 'error' in response:\n", "        pprint(response)\n", "\n", "    return response\n", "\n", "#=======================================================================================\n", "def get_cluster_from_trends(params, print_params = None):\n", "    \"\"\"\n", "    Returns a list of up to 100 clusters that meet the parameters set out.\n", "    \"\"\"\n", "    if print_params is None or print_params == 'yes':\n", "        pprint(params)\n", "\n", "    response = requests.get('https://api.aylien.com/news/trends', params=params, headers=headers).json()\n", "\n", "    if 'errors' in response or 'error' in response:\n", "        pprint(response)\n", "\n", "    if len(response) > 0:\n", "        return response[\"trends\"]\n", "\n", "\n", 
"#=======================================================================================\n", "# identify the top ranked story per cluster\n", "def get_top_stories_in_cluster(cluster_id, no_stories):\n", "    top_story_params = {\n", "        'clusters[]' : [cluster_id]\n", "        , 'sort_by' : \"source.rankings.alexa.rank\"\n", "        , 'per_page' : no_stories\n", "        , 'return[]' : ['id', 'language', 'links', 'title', 'source', 'translations', 'clusters', 'published_at']\n", "    }\n", "\n", "    response = requests.get('https://api.aylien.com/news/stories', params=top_story_params, headers=headers).json()\n", "\n", "    if 'errors' in response or 'error' in response:\n", "        pprint(response)\n", "\n", "    # an error response has no 'stories' key, so fall back to an empty list\n", "    stories = response.get('stories', [])\n", "    if len(stories) > 0:\n", "        return stories\n", "    else:\n", "        return None\n", "\n", "\n", "#=======================================================================================\n", "# helper endpoint that takes a string of characters and an entity type (such as sources, or DBpedia entities) and returns matching entities of the specified type along with additional metadata\n", "# params = {'type' : 'source_names', 'term' : 'Times of India' }\n", "\n", "def autocompletes(params={}):\n", "    \"\"\"\n", "    Prints the entities of the specified type that match the search term.\n", "    \"\"\"\n", "    pprint(params)\n", "\n", "    response = requests.get('https://api.aylien.com/news/autocompletes', params=params, headers=headers).json()\n", "\n", "    # the response is printed whether or not it contains an error\n", "    pprint(response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define Other Useful Functions\n", "These other functions will help us format data as necessary." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# return translated title or body of a story (specify in params)\n", "def return_translated_content(story_x, text_x):\n", "    if story_x.get('translations') and 'en' in story_x['translations']:\n", "        return story_x['translations']['en'][text_x]\n", "    else:\n", "        return story_x[text_x]\n", "\n", "\n", "# create smaller lists from big lists\n", "def chunks(lst, n):\n", "    return list(lst[i:i + n] for i in range(0, len(lst), n))\n", "\n", "\n", "#=======================================================================================\n", "# split title string over multiple lines for legibility on graph\n", "def split_title_string(dataframe_x, column_x):\n", "    title_strings = []\n", "\n", "    for index, row in dataframe_x.iterrows():\n", "        word_array = row[column_x].split()\n", "        counter = 0\n", "        string = ''\n", "        for word in word_array:\n", "            if counter == 7:\n", "                # '<br>' is the line-break tag that Plotly renders in text labels\n", "                string += (word + '<br>')\n", "                counter = 0\n", "            else:\n", "                string += (word + ' ')\n", "                counter += 1\n", "        title_strings.append(string)\n", "\n", "    dataframe_x[column_x + '_string'] = (title_strings)\n", "\n", "\n", 
"#=======================================================================================\n", "def print_keyword_mention(story_x, element_x, keyword_x):\n", "    body_x = story_x[element_x]\n", "\n", "    if 'translations' in story_x and story_x['translations'] is not None and 'en' in story_x['translations']:\n", "        body_x = story_x['translations']['en'][element_x]\n", "\n", "    # extract a window around key entity\n", "    e_idx = body_x.find(keyword_x)\n", "    e_end = e_idx + len(keyword_x)\n", "    if e_idx >= 0:\n", "        # clamp the window start at 0 so a keyword near the start of the text does not give a negative slice index\n", "        e_str = body_x[max(0, e_idx-100):e_idx] + \"\\033[1m\" + body_x[e_idx:e_end] + \"\\033[0m \" + body_x[e_end+1:e_end+51]\n", "        print(e_str)\n", "\n", "    elif element_x == 'title':\n", "        print(story_x['title'])\n", "\n", "\n", 
"#=======================================================================================\n", "def print_entities(story_x, element_x = None, surface_form_x = None, version_x = None):\n", "\n", "    # search the body unless the title is requested\n", "    element = ''\n", "    if element_x is None or element_x == 'body':\n", "        element = 'body'\n", "    else:\n", "        element = 'title'\n", "\n", "    # if no surface_form, print every entity\n", "    if surface_form_x is None:\n", "        for entity in story_x['entities']:\n", "            pprint(entity)\n", "\n", "    else:\n", "\n", "        for entity in story_x['entities']:\n", "            x = 0\n", "            for surface_form in entity[element]['surface_forms']:\n", "                if surface_form_x.lower() in surface_form['text'].lower():\n", "                    x = 1\n", "\n", "            if x != 0:\n", "                pprint(entity)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Making Your First Calls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The Stories Endpoint\n", "The most granular data point we can extract from the News API is a story; all other endpoints are aggregations or extrapolations of stories. 
Stories are news articles that have been enriched using AYLIEN's machine learning process. We will learn more about this enrichment later.\n", "\n", "For now we will pull one story published in English in the last hour." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'language[]': ['en'],\n", " 'per_page': 1,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-1HOUR'}\n", "USA TODAY Sports Josh McDaniels on Raiders' Resiliency in OT Win Over Seattle Originally posted on FanNation Raider Maven By Aidan Champion \t\t\t\t\t\t |  \t\t\t\t\tLast updated 11/28/22\n", "https://www.yardbarker.com/nfl/articles/josh_mcdaniels_on_raiders_resiliency_in_ot_win_over_seattle/s1_16640_38177445\n", "Fetched 1 stories. Total story count so far: 1\n", "\n", "[{'author': {'id': 27406279, 'name': 'Aidan Champion'},\n", " 'body': 'For the second week in a row, the Las Vegas Raiders found the will '\n", " 'to win in overtime on the road.\\n'\n", " ' A week after defeating the Denver Broncos on a walk-off play in '\n", " 'OT, the Raiders did the same against a solid Seattle Seahawks '\n", " 'team. \"I think our team is obviously learning how to be '\n", " 'resilient,\" Raiders coach Josh McDaniels said in his postgame press '\n", " 'conference Sunday. \"And give Seattle a lot of credit. This is a '\n", " \"good football team, they're well coached like we thought they would \"\n", " 'be. Pete [Carroll] does a great job, and they gave us some fits on '\n", " 'some things and made some adjustments and we had to make some '\n", " 'adjustments and it was a very interesting game in that regard. But '\n", " 'I thought our guys were tough.\" Sunday\\'s game was a sequence of '\n", " 'ups and downs, with the Raiders even falling behind by a touchdown '\n", " 'with just over 5 and half minutes remaining in regulation. 
\"You '\n", " 'got to focus on the next drive, the next sequence, the next group '\n", " 'that\\'s going to go out there on the field,\" McDaniels said. \"I '\n", " \"mean, it started from the first play to the last play. First play's \"\n", " \"an interception and the last play's a touchdown. There was a lot of \"\n", " 'swings, and I credit our coaches. Our coaches did a really good job '\n", " 'of staying neutral at times when they needed to be and trying to '\n", " 'fix the problems if there were any and address those without having '\n", " 'a bunch of emotion in it.\" As promising as the back-to-back '\n", " 'victories have been for the Silver and Black, McDaniels has always '\n", " 'felt optimistic his team was heading in the right direction. '\n", " '\"I\\'ve never doubted that it was,\" he said. \"And like I said, the '\n", " \"NFL, there's a lot of close games every week, and sometimes it \"\n", " 'takes a little while to learn how to get over the hump on some of '\n", " \"those things, and that's what we attribute it to. Doesn't guarantee \"\n", " \"us anything going forward. We're going to stick with our process, \"\n", " 'we think we have a really close-knit group here that works hard, we '\n", " \"believe in what we're doing, we believe in what we're coaching, we \"\n", " \"believe in trying to win the way we're trying to win. And I think \"\n", " 'our guys do, too.\" This article first appeared on FanNation Raider '\n", " 'Maven and was syndicated with permission. 
More must-reads:',\n", " 'categories': [{'id': 'IAB17',\n", " 'label': 'Sports',\n", " 'links': {'self': 'https://api.aylien.com/api/v1/classify/taxonomy/iab-qag/IAB17'},\n", " 'score': 0.33,\n", " 'taxonomy': 'iab-qag'},\n", " {'id': 'IAB17-12',\n", " 'label': 'Football',\n", " 'links': {'parents': ['https://api.aylien.com/api/v1/classify/taxonomy/iab-qag/IAB17'],\n", " 'self': 'https://api.aylien.com/api/v1/classify/taxonomy/iab-qag/IAB17-12'},\n", " 'score': 0.24,\n", " 'taxonomy': 'iab-qag'},\n", " {'id': '15003000',\n", " 'label': 'American football',\n", " 'links': {'parents': ['https://api.aylien.com/api/v1/classify/taxonomy/iptc-subjectcode/15000000'],\n", " 'self': 'https://api.aylien.com/api/v1/classify/taxonomy/iptc-subjectcode/15003000'},\n", " 'score': 0.54,\n", " 'taxonomy': 'iptc-subjectcode'},\n", " {'id': 'ay.lifesoc.prosport',\n", " 'label': 'Professional Sports',\n", " 'links': {'parents': ['https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.sports'],\n", " 'self': 'https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.lifesoc.prosport'},\n", " 'score': 1,\n", " 'taxonomy': 'aylien'},\n", " {'id': 'ay.sports',\n", " 'label': 'Sports',\n", " 'links': {'parents': [],\n", " 'self': 'https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.sports'},\n", " 'score': 1,\n", " 'taxonomy': 'aylien'},\n", " {'id': 'ay.sports.football',\n", " 'label': 'Football (American)',\n", " 'links': {'parents': ['https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.sports.team'],\n", " 'self': 'https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.sports.football'},\n", " 'score': 1,\n", " 'taxonomy': 'aylien'},\n", " {'id': 'ay.sports.nfl',\n", " 'label': 'National Football League',\n", " 'links': {'parents': ['https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.lifesoc.prosport',\n", " 'https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.sports.football'],\n", " 'self': 
'https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.sports.nfl'},\n", " 'score': 1,\n", " 'taxonomy': 'aylien'},\n", " {'id': 'ay.sports.team',\n", " 'label': 'Team Sports',\n", " 'links': {'parents': ['https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.sports'],\n", " 'self': 'https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.sports.team'},\n", " 'score': 1,\n", " 'taxonomy': 'aylien'},\n", " {'id': 'ay.lifesoc.gensport',\n", " 'label': 'General Sports',\n", " 'links': {'parents': ['https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.sports'],\n", " 'self': 'https://api.aylien.com/api/v1/classify/taxonomy/aylien/ay.lifesoc.gensport'},\n", " 'score': 0.9,\n", " 'taxonomy': 'aylien'}],\n", " 'characters_count': 2187,\n", " 'clusters': [409119966],\n", " 'entities': [{'body': {'sentiment': {'confidence': 0.59,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 387,\n", " 'start': 380},\n", " 'sentiment': {'confidence': 0.59,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'Seattle'}]},\n", " 'external_ids': {},\n", " 'id': 'Q5083',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q5083',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Seattle'},\n", " 'overall_frequency': 2,\n", " 'overall_prominence': 0.98,\n", " 'overall_sentiment': {'confidence': 0.74,\n", " 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.89,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 80,\n", " 'start': 73},\n", " 'sentiment': {'confidence': 0.89,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'Seattle'}]},\n", " 'types': ['Local_government',\n", " 'Corporation',\n", " 'Location',\n", " 'Political_organisation',\n", " 'City',\n", " 'Government',\n", " 'Community',\n", " 'Company',\n", " 'Organization']},\n", " {'body': {'sentiment': {'confidence': 0.74,\n", " 'polarity': 'positive'},\n", " 
'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 323,\n", " 'start': 309},\n", " 'sentiment': {'confidence': 0.74,\n", " 'polarity': 'positive'}}],\n", " 'text': 'Josh McDaniels'}]},\n", " 'external_ids': {},\n", " 'id': 'Q3810320',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q3810320',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Josh_McDaniels'},\n", " 'overall_frequency': 2,\n", " 'overall_prominence': 0.98,\n", " 'overall_sentiment': {'confidence': 0.88,\n", " 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.88,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 34,\n", " 'start': 20},\n", " 'sentiment': {'confidence': 0.88,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'Josh McDaniels'}]},\n", " 'types': ['Human']},\n", " {'body': {'sentiment': {'confidence': 0.72,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 3,\n", " 'mentions': [{'index': {'end': 179,\n", " 'start': 172},\n", " 'sentiment': {'confidence': 0.72,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 302,\n", " 'start': 295},\n", " 'sentiment': {'confidence': 0.67,\n", " 'polarity': 'positive'}},\n", " {'index': {'end': 775,\n", " 'start': 768},\n", " 'sentiment': {'confidence': 0.59,\n", " 'polarity': 'negative'}}],\n", " 'text': 'Raiders'}]},\n", " 'external_ids': {},\n", " 'id': 'Q5870124',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q5870124',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/History_of_the_Oakland_Raiders'},\n", " 'overall_frequency': 4,\n", " 'overall_prominence': 0.98,\n", " 'overall_sentiment': {'confidence': 0.82,\n", " 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.91,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 45,\n", " 'start': 38},\n", " 'sentiment': {'confidence': 
0.91,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'Raiders'}]},\n", " 'types': []},\n", " {'body': {'sentiment': {'confidence': 0.79,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 2134,\n", " 'start': 2112},\n", " 'sentiment': {'confidence': 0.79,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'FanNation Raider Maven'}]},\n", " 'external_ids': {},\n", " 'id': 'N186086181726508417844685281276398801348',\n", " 'overall_frequency': 2,\n", " 'overall_prominence': 0.98,\n", " 'overall_sentiment': {'confidence': 0.85,\n", " 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.9,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 132,\n", " 'start': 110},\n", " 'sentiment': {'confidence': 0.9,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'FanNation Raider '\n", " 'Maven'}]},\n", " 'types': ['Location']},\n", " {'body': {'surface_forms': []},\n", " 'external_ids': {},\n", " 'id': 'N130591893304718568511464573285032572817',\n", " 'overall_frequency': 1,\n", " 'overall_prominence': 0.98,\n", " 'overall_sentiment': {'confidence': 0.89,\n", " 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.89,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 16,\n", " 'start': 0},\n", " 'sentiment': {'confidence': 0.89,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'USA TODAY Sports'}]},\n", " 'types': ['Organization']},\n", " {'body': {'surface_forms': []},\n", " 'external_ids': {},\n", " 'id': 'N223547351047399335585713261784924445595',\n", " 'overall_frequency': 1,\n", " 'overall_prominence': 0.91,\n", " 'overall_sentiment': {'confidence': 0.89,\n", " 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.89,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 
1,\n", " 'mentions': [{'index': {'end': 72,\n", " 'start': 64},\n", " 'sentiment': {'confidence': 0.89,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'Win Over'}]},\n", " 'types': ['Location']},\n", " {'body': {'surface_forms': []},\n", " 'external_ids': {},\n", " 'id': 'Q132148',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q132148',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Aidan_of_Lindisfarne'},\n", " 'overall_frequency': 1,\n", " 'overall_prominence': 0.8,\n", " 'overall_sentiment': {'confidence': 0.89,\n", " 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.89,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 144,\n", " 'start': 139},\n", " 'sentiment': {'confidence': 0.89,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'Aidan'}]},\n", " 'types': ['Human']},\n", " {'body': {'surface_forms': []},\n", " 'external_ids': {},\n", " 'id': 'N283599043316305970941549218810195124075',\n", " 'overall_frequency': 1,\n", " 'overall_prominence': 0.76,\n", " 'overall_sentiment': {'confidence': 0.88,\n", " 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.88,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 174,\n", " 'start': 170},\n", " 'sentiment': {'confidence': 0.88,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'Last'}]},\n", " 'types': ['Human']},\n", " {'body': {'sentiment': {'confidence': 0.72,\n", " 'polarity': 'positive'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 51,\n", " 'start': 34},\n", " 'sentiment': {'confidence': 0.72,\n", " 'polarity': 'positive'}}],\n", " 'text': 'Las Vegas Raiders'}]},\n", " 'external_ids': {},\n", " 'id': 'Q324523',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q324523',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Las_Vegas_Raiders'},\n", " 
'overall_frequency': 1,\n", " 'overall_prominence': 0.68,\n", " 'overall_sentiment': {'confidence': 0.72,\n", " 'polarity': 'positive'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': ['Nonprofit_organization', 'Organization']},\n", " {'body': {'sentiment': {'confidence': 0.72,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 141,\n", " 'start': 127},\n", " 'sentiment': {'confidence': 0.72,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'Denver Broncos'}]},\n", " 'external_ids': {},\n", " 'id': 'Q223507',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q223507',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Denver_Broncos'},\n", " 'overall_frequency': 1,\n", " 'overall_prominence': 0.55,\n", " 'overall_sentiment': {'confidence': 0.72,\n", " 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': ['Nonprofit_organization', 'Organization']},\n", " {'body': {'sentiment': {'confidence': 0.75,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 166,\n", " 'start': 164},\n", " 'sentiment': {'confidence': 0.75,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'OT'}]},\n", " 'external_ids': {},\n", " 'id': 'Q186982',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q186982',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Overtime_(sports)'},\n", " 'overall_frequency': 1,\n", " 'overall_prominence': 0.49,\n", " 'overall_sentiment': {'confidence': 0.75,\n", " 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': []},\n", " {'body': {'sentiment': {'confidence': 0.58,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 225,\n", " 'start': 209},\n", " 'sentiment': {'confidence': 0.58,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'Seattle Seahawks'}]},\n", " 
'external_ids': {},\n", " 'id': 'Q221878',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q221878',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Seattle_Seahawks'},\n", " 'overall_frequency': 1,\n", " 'overall_prominence': 0.43,\n", " 'overall_sentiment': {'confidence': 0.58,\n", " 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': ['Nonprofit_organization', 'Organization']},\n", " {'body': {'sentiment': {'confidence': 0.91,\n", " 'polarity': 'positive'},\n", " 'surface_forms': [{'frequency': 2,\n", " 'mentions': [{'index': {'end': 995,\n", " 'start': 986},\n", " 'sentiment': {'confidence': 0.83,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 1458,\n", " 'start': 1449},\n", " 'sentiment': {'confidence': 0.91,\n", " 'polarity': 'positive'}}],\n", " 'text': 'McDaniels'}]},\n", " 'external_ids': {},\n", " 'id': 'Q16846249',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q16846249',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/K._J._McDaniels'},\n", " 'overall_frequency': 2,\n", " 'overall_prominence': 0.34,\n", " 'overall_sentiment': {'confidence': 0.91,\n", " 'polarity': 'positive'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': ['Human']},\n", " {'body': {'sentiment': {'confidence': 0.97,\n", " 'polarity': 'positive'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 491,\n", " 'start': 487},\n", " 'sentiment': {'confidence': 0.97,\n", " 'polarity': 'positive'}}],\n", " 'text': 'Pete'}]},\n", " 'external_ids': {},\n", " 'id': 'N334899751825691118615243936337416130988',\n", " 'overall_frequency': 1,\n", " 'overall_prominence': 0.03,\n", " 'overall_sentiment': {'confidence': 0.97,\n", " 'polarity': 'positive'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': ['Human']},\n", " {'body': {'sentiment': {'confidence': 0.9,\n", " 'polarity': 'positive'},\n", " 'surface_forms': 
[{'frequency': 1,\n", " 'mentions': [{'index': {'end': 1447,\n", " 'start': 1442},\n", " 'sentiment': {'confidence': 0.9,\n", " 'polarity': 'positive'}}],\n", " 'text': 'Black'}]},\n", " 'external_ids': {},\n", " 'id': 'N309782724290245396668082628620805636318',\n", " 'overall_frequency': 1,\n", " 'overall_prominence': 0.02,\n", " 'overall_sentiment': {'confidence': 0.9,\n", " 'polarity': 'positive'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': ['Human']},\n", " {'body': {'sentiment': {'confidence': 0.6,\n", " 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 1600,\n", " 'start': 1597},\n", " 'sentiment': {'confidence': 0.6,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'NFL'}]},\n", " 'external_ids': {},\n", " 'id': 'Q1215884',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q1215884',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/National_Football_League'},\n", " 'overall_frequency': 1,\n", " 'overall_prominence': 0.02,\n", " 'overall_sentiment': {'confidence': 0.6, 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': ['Business', 'Organization']}],\n", " 'hashtags': ['#Seattle',\n", " '#JoshMcDaniels',\n", " '#WalkoffHomeRun',\n", " '#USAToday',\n", " '#Touchdown',\n", " '#SeattleSeahawks',\n", " '#Overtime',\n", " '#OaklandRaiders',\n", " '#NationalFootballLeague',\n", " '#NFL',\n", " '#MavenHuffman',\n", " '#Interception',\n", " '#Emotion',\n", " '#DenverBroncos',\n", " '#BroadcastSyndication',\n", " '#AssociationFootball'],\n", " 'id': 5079455953,\n", " 'industries': [],\n", " 'keywords': ['things',\n", " 'OT',\n", " 'Seattle Seahawks',\n", " 'week',\n", " 'Aidan Champion',\n", " 'Seattle',\n", " 'Maven',\n", " 'team',\n", " 'play',\n", " 'emotion',\n", " 'touchdown',\n", " 'Josh McDaniels',\n", " 'Win Over',\n", " 'overtime',\n", " 'Black',\n", " 'interception',\n", " 'walk-off',\n", " 'Pete',\n", " 
'syndicated',\n", " 'McDaniels',\n", " 'Last',\n", " 'Raiders',\n", " 'USA TODAY',\n", " 'Denver Broncos',\n", " 'NFL',\n", " 'football',\n", " 'Sunday'],\n", " 'language': 'en',\n", " 'license_type': 0,\n", " 'links': {'clusters': '/stories?clusters[]=409119966',\n", " 'permalink': 'https://www.yardbarker.com/nfl/articles/josh_mcdaniels_on_raiders_resiliency_in_ot_win_over_seattle/s1_16640_38177445',\n", " 'related_stories': '/related_stories?story_id=5079455953'},\n", " 'media': [{'format': 'JPEG',\n", " 'height': 900,\n", " 'type': 'image',\n", " 'url': 'https://www.yardbarker.com/media/7/6/76d5488c5ae0f0dd9b411351871bdc7a7b623a6b/thumb_16x9/usatsi_19517121_168390101_lowres.jpg?v=1',\n", " 'width': 1600}],\n", " 'paragraphs_count': 2,\n", " 'published_at': '2022-11-28T16:58:47Z',\n", " 'sentences_count': 22,\n", " 'sentiment': {'body': {'polarity': 'positive', 'score': 0.6},\n", " 'title': {'polarity': 'neutral', 'score': 0.75}},\n", " 'source': {'domain': 'yardbarker.com',\n", " 'home_page_url': 'https://www.yardbarker.com/',\n", " 'id': 117069,\n", " 'locations': [{'country': 'US'}],\n", " 'logo_url': '',\n", " 'name': 'Yardbarker',\n", " 'scopes': []},\n", " 'summary': {'sentences': ['For the second week in a row, the Las Vegas '\n", " 'Raiders found the will to win in overtime on the '\n", " 'road.\\n'\n", " ' ',\n", " 'A week after defeating the Denver Broncos on a '\n", " 'walk-off play in OT, the Raiders did the same '\n", " 'against a solid Seattle Seahawks team. 
',\n", " '\"I think our team is obviously learning how to be '\n", " 'resilient,\" Raiders coach Josh McDaniels said in '\n", " 'his postgame press conference Sunday.',\n", " '\"And like I said, the NFL, there\\'s a lot of '\n", " 'close games every week, and sometimes it takes a '\n", " 'little while to learn how to get over the hump on '\n", " \"some of those things, and that's what we \"\n", " 'attribute it to.',\n", " \"Sunday's game was a sequence of ups and downs, \"\n", " 'with the Raiders even falling behind by a '\n", " 'touchdown with just over 5 and half minutes '\n", " 'remaining in regulation. ']},\n", " 'title': \"USA TODAY Sports Josh McDaniels on Raiders' Resiliency in OT \"\n", " 'Win Over Seattle Originally posted on FanNation Raider '\n", " 'Maven By Aidan Champion \\t\\t\\t\\t\\t\\t\\xa0|\\xa0 \\t\\t\\t\\t\\tLast '\n", " 'updated 11/28/22',\n", " 'words_count': 420}]\n" ] } ], "source": [ "# define parameters\n", "params = {\n", " 'published_at.start': 'NOW-1HOUR'\n", " , 'published_at.end': 'NOW'\n", " , 'language[]' : ['en']\n", " , 'per_page' : 1\n", " } \n", "\n", "stories = get_top_ranked_stories(params, 1)\n", "\n", "print()\n", "pprint(stories)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that the story output is a list containing one dictionary object, which represents the story we queried. The story object includes the title, body text, summary sentences and lots of other contextual information that has been made available via AYLIEN's enrichment process. \n", "\n", "We can loop through the object's key names to give us a flavour of what is available." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "author\n", "body\n", "categories\n", "industries\n", "characters_count\n", "clusters\n", "entities\n", "hashtags\n", "id\n", "keywords\n", "language\n", "links\n", "media\n", "paragraphs_count\n", "published_at\n", "sentences_count\n", "sentiment\n", "source\n", "summary\n", "title\n", "words_count\n", "license_type\n" ] } ], "source": [ "for key in stories[0]:\n", " print(key)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Refining Your Query" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using Keyword Search and the Cursor\n", "Using a keyword search, we can search the AYLIEN database for words that appear in the title or body of an article. Here we will search for \"Citigroup\" in the title.\n", "\n", "We will also limit the date range (if we don't, we could return thousands of stories that feature \"Citigroup\" in the title) and define the language as English (\"en\"). Defining the language not only limits our output to English-language content, it also allows the query to remove any relevant stopwords. Learn about stopwords [here](https://docs.aylien.com/tap/#removing-stopwords).\n", "\n", "We will also introduce the cursor. We don't know how many stories we'll get, and the cursor will allow us to scan through results. Learn more about using the cursor [here](https://docs.aylien.com/newsapi/common-workflows/#pagination-of-results). \n", "\n", "The per_page parameter defines how many stories are returned for each API call, with 100 being the max.\n", "\n", "The default parameters below will use relative times to ensure you can access recent news data (historical data is restricted). 
You can try changing the time periods by altering the parameters using the following formats:\n", "- 'NOW-5DAYS'\n", "- 'NOW-5MINS'\n", "- '2020-11-01T00:00:00Z'" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'cursor': '*',\n", " 'language[]': ['en'],\n", " 'per_page': 50,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-2DAYS',\n", " 'title': 'Citigroup'}\n", "BUZZ-Live Nation rises as Citigroup lifts rating to 'buy'\n", "https://www.swissquote.ch/sqi_premium/market/news/News.action?id=14986092\n", "\"BUZZ-Live Nation rises as Citigroup lifts rating to 'buy'\"\n", "Fetched 50 stories. Total story count so far: 50\n", "Caribou Biosciences (NASDAQ:CRBU) Given New $37.00 Price Target at Citigroup\n", "https://dakotafinancialnews.com/2022/11/27/caribou-biosciences-nasdaqcrbu-given-new-37-00-price-target-at-citigroup.html\n", "'Caribou Biosciences (NASDAQ:CRBU) Given New $37.00 Price Target at Citigroup'\n", "Fetched 50 stories. Total story count so far: 100\n", "Northern Oil and Gas (NYSE:NOG) PT Raised to $46.00 at Citigroup\n", "https://baseballnewssource.com/2022/11/27/northern-oil-and-gas-nysenog-pt-raised-to-46-00-at-citigroup/7852403.html\n", "'Northern Oil and Gas (NYSE:NOG) PT Raised to $46.00 at Citigroup'\n", "Fetched 7 stories. Total story count so far: 107\n", "Fetched 0 stories. 
Total story count so far: 107\n", "************\n", "Fetched 107 stories\n" ] } ], "source": [ "# define the query parameters\n", "params = {\n", " 'language[]': ['en'],\n", " 'title': 'Citigroup',\n", " 'published_at.start':'NOW-2DAYS',\n", " 'published_at.end':'NOW',\n", " 'cursor': '*',\n", " 'per_page' : 50\n", "}\n", "\n", "stories = get_stories(params)\n", "\n", "print('************')\n", "print(\"Fetched %s stories\" %(len(stories)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Depending on what parameters you used (and, of course, how much Citigroup featured in the news), your number of stories may vary. Let's print the first 10 titles to get a feel for the stories we have pulled." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5079435568\n", "BUZZ-Live Nation rises as Citigroup lifts rating to 'buy'\n", "\n", "5079419698\n", "Citigroup Inc. (NYSE: C) Is Rated A Buy By Analysts.\n", "\n", "5079358521\n", "Ensign Peak Advisors Inc Has $57.29 Million Stock Holdings in Citigroup Inc. 
(NYSE:C)\n", "\n", "5079233045\n", "Citigroup Trims Galera Therapeutics (NASDAQ:GRTX) Target Price to $18.00\n", "\n", "5079229312\n", "Citigroup Upgrades Live Nation Entertainment (NYSE:LYV) to “Buy”\n", "\n", "5079207000\n", "Citigroup Upgrades Live Nation Entertainment (NYSE:LYV) to “Buy”\n", "\n", "5079202754\n", "MeridianLink (NYSE:MLNK) Price Target Lowered to $16.00 at Citigroup\n", "\n", "5079174029\n", "NuCana (NASDAQ:NCNA) PT Lowered to $2.00 at Citigroup\n", "\n", "5079172560\n", "Citigroup Raises Five Below (NASDAQ:FIVE) Price Target to $186.00\n", "\n", "5079171435\n", "MeridianLink (NYSE:MLNK) Price Target Lowered to $16.00 at Citigroup\n", "\n" ] } ], "source": [ "for story in stories[0:10]:\n", " print(story['id'])\n", " print(story['title'])\n", " print('')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Boolean Search\n", "What if we want to refine our keyword search further? We can create more complicated searches using Boolean statements. For instance, if we were interested in searching for news that mentioned Citigroup or Bank of America and that also mentioned \"shares\" but not \"sell\", we could write the following query. Note that the \"Bank of America\" search term is wrapped in double quotes. If it weren't, each word would be treated as an individual search term, but we want to search for the full phrase." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'cursor': '*',\n", " 'language[]': ['en'],\n", " 'per_page': 50,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-2DAYS',\n", " 'title': '(\"Citigroup\" OR \"Bank of America\" ) AND \"shares\" NOT \"sell\"'}\n", "Bank of America Corporation Announces Hypothetical Accrued Dividends and Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its Cash Tender Offers and Amendments to the Offer to Purchase\n", "https://www.informnny.com/news/business/press-releases/cision/20221128NY47953/bank-of-america-corporation-announces-hypothetical-accrued-dividends-and-hypothetical-total-consideration-for-libor-depositary-shares-sought-in-its-cash-tender-offers-and-amendments-to-the-offer-to-pu/\n", "('Bank of America Corporation Announces Hypothetical Accrued Dividends and '\n", " 'Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its '\n", " 'Cash Tender Offers and Amendments to the Offer to Purchase')\n", "Fetched 28 stories. Total story count so far: 28\n", "Fetched 0 stories. 
Total story count so far: 28\n", "************\n", "Fetched 28 stories\n", "************\n", "Bank of America Corporation Announces Hypothetical Accrued Dividends and Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its Cash Tender Offers and Amendments to the Offer to Purchase\n", "\n", "Bank of America Corporation Announces Hypothetical Accrued Dividends and Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its Cash Tender Offers and Amendments to the Offer to Purchase\n", "\n", "Bank of America Corporation Announces Hypothetical Accrued Dividends and Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its Cash Tender Offers and Amendments to the Offer to Purchase\n", "\n", "Bank of America Corporation Announces Hypothetical Accrued Dividends and Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its Cash Tender Offers and Amendments to the Offer to Purchase\n", "\n", "Bank of America Corporation Announces Hypothetical Accrued Dividends and Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its Cash Tender Offers and Amendments to the Offer t...\n", "\n", "Bank of America Corporation Announces Hypothetical Accrued Dividends and Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its Cash Tender Offers and Amendments to the Offer to Purchase\n", "\n", "Bank of America Corporation Announces Hypothetical Accrued Dividends and Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its Cash Tender Offers and Amendments to the Offer to Purchase\n", "\n", "BRIEF-Bank Of America Corporation Announces Hypothetical Accrued Dividends And Hypothetical Total Consideration For LIBOR Depositary Shares Sought In Its Cash Tender Offers And Amendments To Offer To Purchase\n", "\n", "Bank of America Corporation Announces Hypothetical Accrued Dividends and Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its Cash Tender Offers 
and Amendments to the Offer to Purchase\n", "\n", "Bank of America Corporation Announces Hypothetical Accrued Dividends and Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its Cash Tender Offers and Amendments to the Offer to Purchase\n", "\n", "Bank of America Corporation Announces Hypothetical Accrued Dividends and Hypothetical Total Consideration for LIBOR Depositary Shares Sought in its Cash Tender Offers and Amendments to the Offer to Purchase\n", "\n", "Strategic Blueprint LLC Acquires 447 Shares of Bank of America Co. (NYSE:BAC)\n", "\n", "Aramco unit hires HSBC, Citigroup for Riyadh share sale\n", "\n", "Aramco Unit Hires HSBC, Citigroup for Riyadh Share Sale\n", "\n", "Westover Capital Advisors LLC Acquires New Shares in Bank of America Co. (NYSE:BAC)\n", "\n", "Aramco Unit Hires HSBC, Citigroup for Riyadh Share Sale\n", "\n", "Saudi Aramco Base Oil hires HSBC, Citigroup for Riyadh share sale\n", "\n", "Aramco Unit Hires HSBC, Citigroup for $1 Billion Share Sale\n", "\n", "Aramco Unit Hires HSBC, Citigroup for $1 Billion Share Sale\n", "\n", "Aramco Unit Hires HSBC, Citigroup for Riyadh Share Sale\n", "\n", "Aramco Unit Hires HSBC, Citigroup for Riyadh Share Sale (1)\n", "\n", "Aramco Unit Hires HSBC, Citigroup for Riyadh Share Sale\n", "\n", "Aramco Unit Hires HSBC, Citigroup for Riyadh Share Sale\n", "\n", "Aramco Unit Hires HSBC, Citigroup for Riyadh Share Sale\n", "\n", "Aramco Unit Hires HSBC, Citigroup for Riyadh Share Sale\n", "\n", "Eubel Brady & Suttman Asset Management Inc. Purchases 3,906 Shares of Citigroup Inc. (NYSE:C)\n", "\n", "Bank of America Co. (NYSE:BAC) Shares Sold by Robertson Stephens Wealth Management LLC\n", "\n", "Eubel Brady & Suttman Asset Management Inc. Buys 3,906 Shares of Citigroup Inc. 
(NYSE:C)\n", "\n" ] } ], "source": [ "# define the query parameters\n", "params = {\n", " 'language[]': ['en'],\n", " 'title': '(\"Citigroup\" OR \"Bank of America\" ) AND \"shares\" NOT \"sell\"',\n", " 'published_at.start':'NOW-2DAYS',\n", " 'published_at.end':'NOW',\n", " 'cursor': '*',\n", " 'per_page' : 50\n", "}\n", "\n", "stories = get_stories(params)\n", "\n", "print('************')\n", "print(\"Fetched %s stories\" %(len(stories)))\n", "print('************')\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print('')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Categorical Search - IPTC\n", "We have seen that we can refine our query by adding Boolean operators to our keyword search. However, this can become more complicated if we want to cast our net wider. For instance, let's say we want to pull stories about the banking sector in general. Rather than writing a complicated keyword search, we can search by a news category.\n", "\n", "AYLIEN's NLP enrichment classifies stories into categories to allow us to make more powerful searches. Our classifier is capable of classifying content into two taxonomies, in each of which a code corresponds to a subject. Learn more [here](https://docs.aylien.com/newsapi/common-workflows/#working-with-languages).\n", "\n", "Here, we will search for all stories classified as \"banking\" (04006002) using the IPTC subject taxonomy. You can search for other IPTC codes [here](https://docs.aylien.com/newsapi/search-taxonomies/#search-labels-for-iptc-subject-codes).\n", "\n", "Many stories will be categorised under \"banking\", so we will restrict our output to the first 10.\n", "\n", "We can also perform categorical search using the IAB taxonomy or the AYLIEN Smart Tagger, which will be discussed later." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'language': ['en'], 'published_at.start': 'NOW-2DAYS', 'published_at.end': 'NOW', 'categories.taxonomy[]': 'iptc-subjectcode', 'categories.id[]': ['04006002'], 'cursor': '*', 'per_page': 10}\n", "{'categories.id[]': ['04006002'],\n", " 'categories.taxonomy[]': 'iptc-subjectcode',\n", " 'cursor': '*',\n", " 'language': ['en'],\n", " 'per_page': 10,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-2DAYS'}\n", "Wall Street slips as lockdown protests spread in China By DAMIAN J. TROISE - AP Business Writer Nov 28, 2022 Nov 28, 2022 Updated 4 min ago\n", "https://journaltimes.com/lifestyles/health-med-fit/wall-street-slips-as-lockdown-protests-spread-in-china/article_81621f36-0538-520e-9234-9e04289ca9ab.html\n", "Fetched 10 stories. Total story count so far: 10\n", "************\n", "Fetched 10 stories\n", "Wall Street slips as lockdown protests spread in China By DAMIAN J. TROISE - AP Business Writer Nov 28, 2022 Nov 28, 2022 Updated 4 min ago\n", "\n", "'The Bank of Canada Still Has Your Back, But It's Got a Knife In It': Experts Weigh In On Market Future\n", "\n", "Teton Advisors Inc. Lowers Holdings in Value Line, Inc. (NASDAQ:VALU)\n", "\n", "Safra New York Corporation To Acquire Delta National Bank and Trust\n", "\n", "The Bank-Run Phenomenon\n", "\n", "684. 
News: Daylight builds for the LGBTQ+ community and the FCA hits back at trading apps\n", "\n", "Safra New York Corporation To Acquire Delta National Bank and Trust\n", "\n", "Nigerian man flaunts over N1m saved in his piggy bank after he stopped doing 9k weekly data sub\n", "\n", "Get £175 for switching to Halifax…but there's a catch\n", "\n", "Keith Ligori\n", "\n" ] } ], "source": [ "# define the query parameters\n", "params = {\n", " 'language': ['en'],\n", " 'published_at.start':'NOW-2DAYS',\n", " 'published_at.end':'NOW',\n", " 'categories.taxonomy[]': 'iptc-subjectcode',\n", " 'categories.id[]': ['04006002'],\n", " 'cursor': '*',\n", " 'per_page' : 10\n", "}\n", "\n", "print(params)\n", "\n", "stories = get_top_ranked_stories(params, 10)\n", "\n", "print('************')\n", "print(\"Fetched %s stories\" %(len(stories)))\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print('')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sorting Your Query Response " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may find you want to sort your query response by some metric. In the examples above, we have taken the top N stories.\n", "\n", "These have been sorted - by default - by published date i.e. we are getting the most recent N stories that meet our search criteria.\n", "\n", "Sorting the query response is particularly useful when many stories meet our search criteria but we only want N stories. 
For example, say 1,000 stories met our search criteria - we could sort these stories by a range of metrics and return the top N.\n", "\n", "We can sort our response by the following metrics:\n", "- recency (used by default, utilises the published_at field)\n", "- Alexa ranking (how popular the publisher is, according to its Alexa rank)\n", "- relevance (when using a keyword search, this input returns the stories sorted by how significant the keywords are in the document)\n", "- AYLIEN Category or Industry confidence score (please see AYLIEN Smart Tagger for more)\n", "- entity prominence (please see entity prominence for more)\n", "\n", "You can read more about sorting in our docs.\n", "\n", "The metric is set using the 'sort_by' parameter. The sort order by default is descending, but we can explicitly state which direction we want to sort in using the 'sort_direction' parameter.\n", "\n", "In the following example, we perform a keyword search and sort by keyword relevance." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'language': ['en'], 'published_at.start': 'NOW-2DAYS', 'published_at.end': 'NOW', 'text': 'Microsoft AND (merge OR acquire)', 'cursor': '*', 'per_page': 10, 'sort_by': 'relevance'}\n", "{'cursor': '*',\n", " 'language': ['en'],\n", " 'per_page': 10,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-2DAYS',\n", " 'sort_by': 'relevance',\n", " 'text': 'Microsoft AND (merge OR acquire)'}\n", "In transaction documents between Microsoft COR and Activision Blizzard, Inc date of an exit of The Elder Scrolls 6] was foun\n", "https://news.myseldon.com/en/news/index/275465595\n", "Fetched 10 stories. 
Total story count so far: 10\n", "************\n", "Fetched 10 stories\n", "In transaction documents between Microsoft COR and Activision Blizzard, Inc date of an exit of The Elder Scrolls 6] was foun\n", "\n", "New Microsoft partnership to drive technical growth at MTN Group\n", "\n", "Autonomy Orders 2,500 VinFast VF 8 And VF 9 Electric Cars\n", "\n", "Microsoft Reported To Extend Call Of Duty Multiplatform Release To PlayStation For Ten Years\n", "\n", "Where to get a Choice Specs in Pokémon Scarlet and Violet How to get a Choice Specs in Pokémon Scarlet and Violet\n", "\n", "Sony wanted to bring PlayStation Plus to Xbox, but Microsoft “wouldn't let it happen,” says SIE\n", "\n", "Joe Jonas urges people to 'check in' with themselves and their friends\n", "\n", "6 biggest deal reports this week: Manchester United open to selling the club By\n", "\n", "Finally! Microsoft Reveals Why It Prefers Elder Scrolls 6 as Xbox\n", "\n", "Activision Blizzard, Inc and Microsoft COR accused of arrangement and falsification of the transaction on merge for $69 billion\n", "\n" ] } ], "source": [ "params = {\n", " 'language': ['en'],\n", " 'published_at.start':'NOW-2DAYS',\n", " 'published_at.end':'NOW',\n", " 'text' : 'Microsoft AND (merge OR acquire)',\n", " 'cursor': '*',\n", " 'per_page' : 10,\n", " 'sort_by' : 'relevance'\n", "}\n", "\n", "print(params)\n", "\n", "stories = get_top_ranked_stories(params, 10)\n", "\n", "print('************')\n", "print(\"Fetched %s stories\" %(len(stories)))\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print('')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## AYLIEN Query Language (AQL)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The AYLIEN Query Language (or AQL), is AYLIEN's custom 'flavour' of the Lucene syntax that enables users to make more powerful queries on our data.\n", "\n", "Queries in this syntax are made within an 'aql' parameter.\n", "\n", "AQL enables us to perform more 
sophisticated searches like boosting the importance of keywords and enhanced entity search.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Boost \n", "When making a query with many keywords, sometimes one keyword is more important to your search than others. Boosting enables you to add weight to the more important keyword or keywords so that results mentioning them are given a “boost” that moves them higher in the results order.\n", "\n", "For example, searching [\"John\", \"Frank\", \"Sarah\"] gives equal weight to each term, but [\"John\", \"Frank\"^2, \"Sarah\"] is like saying a mention of “Frank” is twice as important as a mention of “John” or “Sarah”. Stories mentioning “Frank” will therefore appear higher in the ranking of search results. We can also reduce the importance of a keyword by assigning it a decimal boost below 1, e.g. 0.5.\n", "\n", "Boosting does not make a keyword mandatory; it simply allows the user to specify the most important keywords in a list (i.e. if a story contains many mentions of non-boosted searched keywords, it could still be returned ahead of many stories that mention a boosted keyword). Boosting therefore does not exclude stories from the results; it only affects the order of returned results.\n", "\n", "The boost is allocated using the ^ symbol.\n", "\n", "In the example below, we search for a wide variety of keywords but give special significance to the \"radioactive\" keyword." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'title:((\"toxic\" \"chemical\" \"industrial\" \"radioactive\"^10 \"sewerage\") '\n", " 'AND (\"spill\" \"leak\" \"dump\" \"disaster\" \"contaminate\" \"waste\" '\n", " '\"pollute\"))',\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-1MONTH',\n", " 'sort_by': 'relevance'}\n", "Radioactive Waste Missouri School\n", "https://n.news.naver.com/mnews/article/077/0005779051\n", "Fetched 10 stories. Total story count so far: 10\n", "#############\n", "Radioactive Waste Missouri School\n", "https://n.news.naver.com/mnews/article/077/0005779051\n", "Radioactive Waste Missouri School\n", "https://n.news.naver.com/mnews/article/077/0005772026\n", "Radioactive Waste Missouri School\n", "https://n.news.naver.com/mnews/article/077/0005772025\n", "Norway-Kaupanger: Radioactive-, toxic-, medical- and hazardous waste services\n", "https://ted.europa.eu/udl?uri=TED:NOTICE:656593-2022:TEXT:EN:HTML\n", "State must stop plan to dump radioactive water\n", "\n", "Ukraine shelled radioactive waste storage – official\n", "https://www.rt.com/russia/566874-zaporozhye-waste-storage-shelled/?utm_source=rss&utm_medium=rss&utm_campaign=RSS\n", "UN NUCLEAR CHIEF DECLARES RADIOACTIVE WASTE RECYCLING DIFFICULT\n", "\n", "Chinese Radiation Protection Res Institute Seeks Patent for Radioactive Waste Resin Dehydration Metering Feeding Device\n", "\n", "EDF says radioactive leak at Civaux reactor not due to...\n", "https://www.dailymail.co.uk/wires/reuters/article-11403041/EDF-says-radioactive-leak-Civaux-reactor-not-welding.html?ns_mchannel=rss&ns_campaign=1490&ito=1490\n", "Chinese Radiation Protection Res Institute Submits Chinese Patent Application for Radioactive Waste Resin Wet Oxidation Device\n", "\n" ] } ], "source": [ "params = {\n", " 'published_at.start': 'NOW-1MONTH'\n", ", 'published_at.end': 'NOW'\n", ", 'aql': 
'title:((\"toxic\" \"chemical\" \"industrial\" \"radioactive\"^10 \"sewerage\") AND (\"spill\" \"leak\" \"dump\" \"disaster\" \"contaminate\" \"waste\" \"pollute\"))'\n", ", 'sort_by' : 'relevance'\n", "}\n", "\n", "stories = get_top_ranked_stories(params, 10)\n", "\n", "print('#############')\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print(story['links']['permalink'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Proximity Search\n", "Frequently, keywords of interest to us are mentioned in varying sequences of terms. For example, HSBC's division in China could appear in multiple forms: “HSBC China”, “HSBC’s branches in China”, “In China, HSBC is introducing new…”, etc.\n", "\n", "Proximity search is a feature that enables users to broaden the search criteria to return these combinations. “Proximity” refers to the distance, measured in words, between two searched terms in a story. For example, \"HSBC China\"~5 only returns stories that mention \"HSBC\" and \"China\", where there is a maximum of four words in between them.\n" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'body': '\"HSBC China\"~4',\n", " 'language[]': ['en'],\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-10DAYS',\n", " 'sort_by': 'relevance'}\n", "Mercedes-Benz becomes 1st MNC to issue green Panda bond in China\n", "http://www.shine.cn/biz/finance/2211283456/\n", "Fetched 5 stories. 
Total story count so far: 5\n", "##################\n", "2022-11-28T13:26:26Z\n", "5079173598\n", "Mercedes-Benz becomes 1st MNC to issue green Panda bond in China\n", "330\n", "http://www.shine.cn/biz/finance/2211283456/\n", "Keyword mention:\n", "\u001b[1mHSBC\u001b[0m announced it has helped Mercedes-Benz to issue 500\n", "\n", "##################\n", "2022-11-28T02:35:04Z\n", "5078530062\n", "HSBC's smart supply chain breaks the circle again to empower the digital future\n", "1620\n", "https://www.tellerreport.com/business/2022-11-28-hsbc-s-smart-supply-chain-breaks-the-circle-again-to-empower-the-digital-future.HyVx3aq-Po.html\n", "Keyword mention:\n", " reasonable growth of the quantity\" is the goal of the future supply chain development. Among them, \u001b[1mHSBC\u001b[0m China won the regional awards of \"Best Digital Tra\n", "\n", "##################\n", "2022-11-28T15:50:50Z\n", "5079379311\n", "October 31 Asia bond pipeline: What's coming up?\n", "1277\n", "\n", "Keyword mention:\n", "(Asia), CMB Wing Lung, Citi, CMBC, CEB, CTBC, China PA Securities, Guotai Junan, Guosen Securities, \u001b[1mHSBC\u001b[0m Huatai Intl, Haitong Intl, ICBC (Asia), Industria\n", "\n", "##################\n", "2022-11-28T15:54:58Z\n", "5079383946\n", "October 27 Asia bond pipeline: What's coming up?\n", "1251\n", "\n", "Keyword mention:\n", "Price to be set by Dutch auction | Tender deadline November 3\n", "\n", " \n", " \n", " \n", " \u001b[1mHSBC\u001b[0m (Dealer manager) | Kroll Issuer Services (Tender a\n", "\n", "##################\n", "2022-11-28T15:52:27Z\n", "5079381149\n", "October 28 Asia bond pipeline: What's coming up?\n", "1303\n", "\n", "Keyword mention:\n", "(Asia), CMB Wing Lung, Citi, CMBC, CEB, CTBC, China PA Securities, Guotai Junan, Guosen Securities, \u001b[1mHSBC\u001b[0m Huatai Intl, Haitong Intl, ICBC (Asia), Industria\n", "\n" ] } ], "source": [ "params = {\n", "'published_at.start': 'NOW-10DAYS'\n", ", 'published_at.end': 'NOW'\n", "#, 'body': 'HSBC 
AND China'\n", ", 'body': '\"HSBC China\"~4'\n", ", 'sort_by' : 'relevance'\n", ", 'language[]' : ['en']\n", ", 'per_page' : 5\n", "} \n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "keywords = [\"HSBC\"]\n", "\n", "for story in stories:\n", " print('##################')\n", " print(story['published_at'])\n", " print(story['id'])\n", " print(story['title'])\n", " print(story['words_count'])\n", " print(story['links']['permalink'])\n", " for item in keywords:\n", " print('Keyword mention:')\n", " print_keyword_mention(story, 'body', item)\n", " \n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## AYLIEN Smart Tagger\n", "AYLIEN leverages two industry-standard taxonomies in its news categorisation, but we also leverage our own proprietary \n", "taxonomy - the Smart Tagger.\n", "\n", "Smart Tagger leverages state-of-the-art classification models that have been built using a vast collection of manually tagged news articles based on domain-specific industry and topical taxonomies. Smart Tagger uses a highly effective rule-based classification system for identifying categorical and industry-related news content.\n", "\n", "As part of the Smart Tagger update we’re introducing 2 new classification taxonomies: the AYLIEN Industry Taxonomy and the AYLIEN Category Taxonomy, which incorporates 2 curated category groupings: Adverse Events and Trading Impact Events.\n", "\n", "You can explore these taxonomies [here](https://newsapi.aylien.com/docs/newsapi/search-taxonomies). " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### AYLIEN Categories\n", "A wide and deep collection of topical categories covering popular topics specifically curated for the business and finance world." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Search for a category using a category label." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'categories:{{taxonomy:aylien AND label:\"Environmental, Social and '\n", " 'Governance\"}}',\n", " 'language[]': ['en'],\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-10DAYS'}\n", "NOV 2022 WISCONSIN FORESTLAND SOLD REPORT Vernon County; Hunting, Timber, Investments! Market Snapshot\n", "http://activerain.com/blogsview/5760941/nov-2022-wisconsin-forestland-sold-report-vernon-county--hunting--timber--investments--market-snapshot#article-comments-section\n", "Fetched 5 stories. Total story count so far: 5\n", "##################\n", "2022-11-28T16:58:47Z\n", "NOV 2022 WISCONSIN FORESTLAND SOLD REPORT Vernon County; Hunting, Timber, Investments! Market Snapshot\n", "\n", "##################\n", "2022-11-28T16:58:33Z\n", "Residents urged to donate to Christmas clothing drive\n", "\n", "##################\n", "2022-11-28T16:58:12Z\n", "COP27 climate alarmists see oil demand hitting 18-year highs\n", "\n", "##################\n", "2022-11-28T16:57:50Z\n", "Tata Communications and Intertec Systems expand partnership, set up Cyber Security Operations Centre in UAE\n", "\n", "##################\n", "2022-11-28T16:57:18Z\n", "City of Houston Is Under a Water Boil Advisory, Affecting Millions\n", "\n" ] } ], "source": [ "params = {\n", "'published_at.start': 'NOW-10DAYS'\n", ", 'published_at.end': 'NOW'\n", ", 'language[]' : ['en']\n", ", 'aql': 'categories:{{taxonomy:aylien AND label:\"Environmental, Social and Governance\"}}'\n", ", 'per_page' : 5\n", "} \n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "\n", "for story in stories:\n", " print('##################')\n", " print(story['published_at'])\n", " print(story['title'])\n", " \n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Search for a category using a category ID." 
] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'categories:{{taxonomy:aylien AND id:ay.lifesoc.esg}}',\n", " 'language[]': ['en'],\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-10DAYS'}\n", "NOV 2022 WISCONSIN FORESTLAND SOLD REPORT Vernon County; Hunting, Timber, Investments! Market Snapshot\n", "http://activerain.com/blogsview/5760941/nov-2022-wisconsin-forestland-sold-report-vernon-county--hunting--timber--investments--market-snapshot#article-comments-section\n", "Fetched 5 stories. Total story count so far: 5\n", "##################\n", "2022-11-28T16:58:47Z\n", "NOV 2022 WISCONSIN FORESTLAND SOLD REPORT Vernon County; Hunting, Timber, Investments! Market Snapshot\n", "\n", "##################\n", "2022-11-28T16:58:12Z\n", "COP27 climate alarmists see oil demand hitting 18-year highs\n", "\n", "##################\n", "2022-11-28T16:58:01Z\n", "Leeward Renewable Energy closes funding for US solar projects\n", "\n", "##################\n", "2022-11-28T16:57:50Z\n", "Tata Communications and Intertec Systems expand partnership, set up Cyber Security Operations Centre in UAE\n", "\n", "##################\n", "2022-11-28T16:57:18Z\n", "City of Houston Is Under a Water Boil Advisory, Affecting Millions\n", "\n" ] } ], "source": [ "params = {\n", "'published_at.start': 'NOW-10DAYS'\n", ", 'published_at.end': 'NOW'\n", ", 'language[]' : ['en']\n", ", 'aql': 'categories:{{taxonomy:aylien AND id:ay.lifesoc.esg}}'\n", ", 'per_page' : 5\n", "} \n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "for story in stories:\n", " print('##################')\n", " print(story['published_at'])\n", " print(story['title'])\n", " \n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Search for one category but explicitly omit another category" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, 
"outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'categories:{{taxonomy:aylien AND label:\"Disasters\"}} NOT '\n", " 'categories:{{taxonomy:aylien AND label:\"Philanthropy\"}}',\n", " 'language[]': ['en'],\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-10DAYS'}\n", "Hawaii's Mauna Loa starts to erupt, sending ash nearby\n", "https://www.chronicle-tribune.com/news/wire/hawaii-s-mauna-loa-starts-to-erupt-sending-ash-nearby/article_dd2bdac5-bc7e-56e0-9f6e-3407d90865e4.html\n", "Fetched 5 stories. Total story count so far: 5\n", "##################\n", "2022-11-28T16:58:47Z\n", "Hawaii's Mauna Loa starts to erupt, sending ash nearby\n", "\n", "##################\n", "2022-11-28T16:58:41Z\n", "Landslide kills at least 14 attending funeral in Cameroon capital | CNN\n", "\n", "##################\n", "2022-11-28T16:57:11Z\n", "Hawaii's Mauna Loa volcano starts to erupt, sending ash nearby\n", "\n", "##################\n", "2022-11-28T16:56:07Z\n", "Hawaii's Mauna Loa, the world's largest active volcano, erupted for the first time in nearly 40 years\n", "\n", "##################\n", "2022-11-28T16:55:56Z\n", "Mauna Loa is erupting for the first time since 1984, prompting an ashfall advisory for Hawaii's Big Island\n", "\n" ] } ], "source": [ "params = {\n", "'published_at.start': 'NOW-10DAYS'\n", ", 'published_at.end': 'NOW'\n", ", 'language[]' : ['en']\n", ", 'aql': 'categories:{{taxonomy:aylien AND label:\"Disasters\"}} NOT categories:{{taxonomy:aylien AND label:\"Philanthropy\"}}'\n", ", 'per_page' : 5\n", "} \n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "for story in stories:\n", " print('##################')\n", " print(story['published_at'])\n", " print(story['title'])\n", " \n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Search for a list of categories" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": 
"stdout", "output_type": "stream", "text": [ "{'aql': 'categories:{{taxonomy:aylien AND label:(\"Disasters\" \"Fraud\")}}',\n", " 'language[]': ['en'],\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-10DAYS'}\n", "Hawaii's Mauna Loa starts to erupt, sending ash nearby\n", "https://www.chronicle-tribune.com/news/wire/hawaii-s-mauna-loa-starts-to-erupt-sending-ash-nearby/article_dd2bdac5-bc7e-56e0-9f6e-3407d90865e4.html\n", "Fetched 5 stories. Total story count so far: 5\n", "##################\n", "2022-11-28T16:58:47Z\n", "Hawaii's Mauna Loa starts to erupt, sending ash nearby\n", "\n", "##################\n", "2022-11-28T16:58:41Z\n", "Landslide kills at least 14 attending funeral in Cameroon capital | CNN\n", "\n", "##################\n", "2022-11-28T16:58:10Z\n", "Cuomo-era New York corruption cases go before U.S. Supreme Court\n", "\n", "##################\n", "2022-11-28T16:57:11Z\n", "Hawaii's Mauna Loa volcano starts to erupt, sending ash nearby\n", "\n", "##################\n", "2022-11-28T16:56:46Z\n", "Irishman who stole €185,000 in social welfare payments says 'it was a victimless crime' More for you React Comments | 12\n", "\n" ] } ], "source": [ "params = {\n", "'published_at.start': 'NOW-10DAYS'\n", ", 'published_at.end': 'NOW'\n", ", 'language[]' : ['en']\n", ", 'aql': 'categories:{{taxonomy:aylien AND label:(\"Disasters\" \"Fraud\")}}'\n", ", 'per_page' : 5\n", "} \n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "for story in stories:\n", " print('##################')\n", " print(story['published_at'])\n", " print(story['title'])\n", " \n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Search for a category over a threshold of confidence and sort by this confidence" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'categories:{{taxonomy:aylien AND label:(Disasters) AND 
score:[0.7 TO '\n", " '*] sort_by(score)}}',\n", " 'language[]': ['en'],\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-10DAYS'}\n", "Some without water as sinkhole opens ground under GA truck | Columbus Ledger-Enquirer\n", "https://www.ledger-enquirer.com/news/state/georgia/article268916217.html\n", "Fetched 5 stories. Total story count so far: 5\n", "##################\n", "2022-11-18T18:08:42Z\n", "Some without water as sinkhole opens ground under GA truck | Columbus Ledger-Enquirer\n", "\n", "##################\n", "2022-11-18T17:07:27Z\n", "Wildfires often lead to dust storms – and they’re getting bigger\n", "\n", "##################\n", "2022-11-18T17:50:20Z\n", "Earthquake of magnitude 6.9 shakes Indonesia\n", "\n", "##################\n", "2022-11-18T17:30:05Z\n", "Strong earthquake shakes western Indonesia; no tsunami alert\n", "\n", "##################\n", "2022-11-18T17:28:58Z\n", "When Is Hurricane Season In Florida And How To Prepare For It\n", "\n" ] } ], "source": [ "params = {\n", "'published_at.start': 'NOW-10DAYS'\n", ", 'published_at.end': 'NOW'\n", ", 'language[]' : ['en']\n", ", 'aql': 'categories:{{taxonomy:aylien AND label:(Disasters) AND score:[0.7 TO *] sort_by(score)}}'\n", ", 'per_page' : 5\n", "} \n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "for story in stories:\n", " print('##################')\n", " print(story['published_at'])\n", " print(story['title'])\n", " \n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### AYLIEN Industries\n", "A robust collection of multilevel tags that represent the industry a news article is covering.\n", "\n", "Users can search for industry verticals using syntax similar to that of AYLIEN Categories." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Search for Industries Using Labels" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'industries: {{\"Coal Mining\" \"Agriculture and Fishing\" AND score:[0.7 '\n", " 'TO *] sort_by(score)}}',\n", " 'language[]': ['en'],\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-10DAYS'}\n", "13th Agrovision sets up special pavilion for Agritech Startups & Grassroot Innovators\n", "https://agrospectrumindia.com/2022/11/18/13th-agrovision-to-promote-agritech-startups-grassroot-innovators-though-special-pavilion.html\n", "Fetched 5 stories. Total story count so far: 5\n", "##################\n", "2022-11-18T17:55:02Z\n", "13th Agrovision sets up special pavilion for Agritech Startups & Grassroot Innovators\n", "\n", "##################\n", "2022-11-18T17:56:29Z\n", "Kirin Holdings - Chateau Mercian Mariko Winery Chosen Yet Again By 'World's Best Vineyards 2022'\n", "\n", "##################\n", "2022-11-18T17:05:31Z\n", "Report suggests big changes for ag in Upper Rio Grande River basin\n", "\n", "##################\n", "2022-11-18T18:20:16Z\n", "Worldwide Microgreens Industry to 2027 - by Type, Farming Technique, Growth Medium, Distribution Channel, End-use, Company and Region\n", "\n", "##################\n", "2022-11-18T18:22:28Z\n", "Markham Vineyards Reopens Historic Tasting Room After Extensive Renovations\n", "\n" ] } ], "source": [ "params = {\n", "'published_at.start': 'NOW-10DAYS'\n", ", 'published_at.end': 'NOW'\n", ", 'language[]' : ['en']\n", ", 'aql': 'industries: {{\"Coal Mining\" \"Agriculture and Fishing\" AND score:[0.7 TO *] sort_by(score)}}'\n", ", 'per_page' : 5\n", "} \n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "for story in stories:\n", " print('##################')\n", " print(story['published_at'])\n", " print(story['title'])\n", " \n", " 
print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Search for Industries Using IDs" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'industries: {{in.mat.coalmine in.agfish AND score:[0.7 TO *] '\n", " 'sort_by(score)}}',\n", " 'language[]': ['en'],\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-10DAYS'}\n", "13th Agrovision sets up special pavilion for Agritech Startups & Grassroot Innovators\n", "https://agrospectrumindia.com/2022/11/18/13th-agrovision-to-promote-agritech-startups-grassroot-innovators-though-special-pavilion.html\n", "Fetched 5 stories. Total story count so far: 5\n", "##################\n", "2022-11-18T17:55:02Z\n", "13th Agrovision sets up special pavilion for Agritech Startups & Grassroot Innovators\n", "\n", "##################\n", "2022-11-18T17:31:11Z\n", "Developments In the World of Fishing Sonar\n", "\n", "##################\n", "2022-11-18T17:56:29Z\n", "Kirin Holdings - Chateau Mercian Mariko Winery Chosen Yet Again By 'World's Best Vineyards 2022'\n", "\n", "##################\n", "2022-11-18T18:05:27Z\n", "After 7,000 years, Turkish wines are hitting the big time\n", "\n", "##################\n", "2022-11-18T18:05:14Z\n", "Soft lending for Russian agriculture to grow nearly twofold in 2022 to 177 bln rubles - AgMin\n", "\n" ] } ], "source": [ "params = {\n", "'published_at.start': 'NOW-10DAYS'\n", ", 'published_at.end': 'NOW'\n", ", 'language[]' : ['en']\n", ", 'aql': 'industries: {{in.mat.coalmine in.agfish AND score:[0.7 TO *] sort_by(score)}}'\n", ", 'per_page' : 5\n", "} \n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "for story in stories:\n", " print('##################')\n", " print(story['published_at'])\n", " print(story['title'])\n", " \n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Working with Entities" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "### Entity Type Search\n", "Similarly, we may be interested in searching for certain recurring subjects appearing in the news, for example banks, companies, dogs or even aliens! We could do this using keyword search, but AYLIEN provides a better solution by classifying some words as \"entities\". \n", "\n", "What is an entity? The Oxford English Dictionary offers a basic starting point, defining an entity as \"a thing with distinct and independent existence\". Learn more about searching for entities [here](https://blog.aylien.com/why-searching-for-news-by-entity-is-better-than-keyword/).\n", "\n", "We can use entity types to search for groups of entities without needing to define an exhaustive list of Wikidata links. \n", "\n", "Returning to our query that pulled stories classified as \"banking\", let's pull all articles categorised as banking that also feature a \"Company\" or \"Bank\" entity type in the title:\n", "\n", "_N.B. AYLIEN's knowledge base switched from using DBPedia (V2 entities) to Wikidata (V3 entities) in February 2021. If you require syntax relating to V2, please contact sales@aylien.com._" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'language[]': ['en'], 'published_at.start': 'NOW-2DAYS', 'published_at.end': 'NOW', 'categories.taxonomy': 'iptc-subjectcode', 'categories.id[]': ['04006002'], 'entities.title.type[]': ['Company', 'Bank'], 'cursor': '*', 'per_page': 10}\n", "{'categories.id[]': ['04006002'],\n", " 'categories.taxonomy': 'iptc-subjectcode',\n", " 'cursor': '*',\n", " 'entities.title.type[]': ['Company', 'Bank'],\n", " 'language[]': ['en'],\n", " 'per_page': 10,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-2DAYS'}\n", "Wall Street slips as lockdown protests spread in China By DAMIAN J. 
TROISE - AP Business Writer Nov 28, 2022 Nov 28, 2022 Updated 4 min ago\n", "https://journaltimes.com/lifestyles/health-med-fit/wall-street-slips-as-lockdown-protests-spread-in-china/article_81621f36-0538-520e-9234-9e04289ca9ab.html\n", "Fetched 10 stories. Total story count so far: 10\n", "************\n", "Fetched 10 stories\n" ] } ], "source": [ "# define the query parameters\n", "params = {\n", " 'language[]': ['en'],\n", " 'published_at.start':'NOW-2DAYS',\n", " 'published_at.end':'NOW',\n", " 'categories.taxonomy': 'iptc-subjectcode',\n", " 'categories.id[]': ['04006002'],\n", " 'entities.title.type[]': [\"Company\", \"Bank\"],\n", " 'cursor': '*',\n", " 'per_page' : 10\n", "}\n", "\n", "print(params)\n", "\n", "stories = get_top_ranked_stories(params, 10)\n", "\n", "print('************')\n", "print(\"Fetched %s stories\" %(len(stories)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look closely at the first story in this output and review its entities. \n", "\n", "Note that some entities will be linked to a Wikidata URL. AYLIEN uses Wikidata to train a vast knowledge base in order to identify entities. \n", "\n", "Other entities may not be linked to a Wikidata URL. AYLIEN also utilises a Named Entity Recognition model to identify entities in cases where they can't be identified from the knowledge base. " ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Wall Street slips as lockdown protests spread in China By DAMIAN J. 
TROISE - AP Business Writer Nov 28, 2022 Nov 28, 2022 Updated 4 min ago\n", "##############################################\n", "{'body': {'sentiment': {'confidence': 0.64, 'polarity': 'negative'},\n", " 'surface_forms': [{'frequency': 4,\n", " 'mentions': [{'index': {'end': 278, 'start': 273},\n", " 'sentiment': {'confidence': 0.63,\n", " 'polarity': 'negative'}},\n", " {'index': {'end': 1534,\n", " 'start': 1529},\n", " 'sentiment': {'confidence': 0.76,\n", " 'polarity': 'negative'}},\n", " {'index': {'end': 1878,\n", " 'start': 1873},\n", " 'sentiment': {'confidence': 0.56,\n", " 'polarity': 'negative'}},\n", " {'index': {'end': 2475,\n", " 'start': 2470},\n", " 'sentiment': {'confidence': 0.61,\n", " 'polarity': 'negative'}}],\n", " 'text': 'China'},\n", " {'frequency': 1,\n", " 'mentions': [{'index': {'end': 2542,\n", " 'start': 2535},\n", " 'sentiment': {'confidence': 0.54,\n", " 'polarity': 'positive'}}],\n", " 'text': 'Chinese'}]},\n", " 'external_ids': {},\n", " 'id': 'Q148',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q148',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/China'},\n", " 'overall_frequency': 6,\n", " 'overall_prominence': 0.98,\n", " 'overall_sentiment': {'confidence': 0.64, 'polarity': 'negative'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.6, 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 54, 'start': 49},\n", " 'sentiment': {'confidence': 0.6,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'China'}]},\n", " 'types': ['Sovereign_state',\n", " 'Location',\n", " 'Community',\n", " 'Country',\n", " 'State_(polity)',\n", " 'Organization']}\n", "\n", "{'body': {'sentiment': {'confidence': 0.93, 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 5,\n", " 'mentions': [{'index': {'end': 85, 'start': 72},\n", " 'sentiment': {'confidence': 0.95,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 400, 'start': 387},\n", " 'sentiment': 
{'confidence': 0.93,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 856, 'start': 843},\n", " 'sentiment': {'confidence': 0.93,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 1212,\n", " 'start': 1199},\n", " 'sentiment': {'confidence': 0.93,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 4601,\n", " 'start': 4588},\n", " 'sentiment': {'confidence': 0.77,\n", " 'polarity': 'positive'}}],\n", " 'text': 'KEB Hana Bank'}]},\n", " 'external_ids': {},\n", " 'id': 'Q484047',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q484047',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Hana_Bank'},\n", " 'overall_frequency': 5,\n", " 'overall_prominence': 0.97,\n", " 'overall_sentiment': {'confidence': 0.93, 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': ['Business', 'Organization']}\n", "\n", "{'body': {'sentiment': {'confidence': 0.94, 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 5,\n", " 'mentions': [{'index': {'end': 107, 'start': 102},\n", " 'sentiment': {'confidence': 0.95,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 422, 'start': 417},\n", " 'sentiment': {'confidence': 0.94,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 878, 'start': 873},\n", " 'sentiment': {'confidence': 0.94,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 1234,\n", " 'start': 1229},\n", " 'sentiment': {'confidence': 0.93,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 4623,\n", " 'start': 4618},\n", " 'sentiment': {'confidence': 0.69,\n", " 'polarity': 'positive'}}],\n", " 'text': 'Seoul'}]},\n", " 'external_ids': {},\n", " 'id': 'Q8684',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q8684',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Seoul'},\n", " 'overall_frequency': 5,\n", " 'overall_prominence': 0.93,\n", " 'overall_sentiment': {'confidence': 0.94, 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': 
[]},\n", " 'types': ['City', 'Location', 'Organization', 'Community']}\n", "\n", "{'body': {'sentiment': {'confidence': 0.94, 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 5,\n", " 'mentions': [{'index': {'end': 120, 'start': 109},\n", " 'sentiment': {'confidence': 0.95,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 435, 'start': 424},\n", " 'sentiment': {'confidence': 0.94,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 891, 'start': 880},\n", " 'sentiment': {'confidence': 0.94,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 1247,\n", " 'start': 1236},\n", " 'sentiment': {'confidence': 0.93,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 4636,\n", " 'start': 4625},\n", " 'sentiment': {'confidence': 0.68,\n", " 'polarity': 'positive'}}],\n", " 'text': 'South Korea'},\n", " {'frequency': 1,\n", " 'mentions': [{'index': {'end': 762, 'start': 750},\n", " 'sentiment': {'confidence': 0.93,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'South Korean'}]},\n", " 'external_ids': {},\n", " 'id': 'Q884',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q884',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/South_Korea'},\n", " 'overall_frequency': 6,\n", " 'overall_prominence': 0.92,\n", " 'overall_sentiment': {'confidence': 0.94, 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': ['Sovereign_state',\n", " 'Location',\n", " 'Community',\n", " 'Country',\n", " 'State_(polity)',\n", " 'Organization']}\n", "\n", "{'body': {'surface_forms': []},\n", " 'external_ids': {},\n", " 'id': 'N279424833613967807707022612475825359786',\n", " 'overall_frequency': 1,\n", " 'overall_prominence': 0.91,\n", " 'overall_sentiment': {'confidence': 0.55, 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.55, 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 79, 'start': 63},\n", " 
'sentiment': {'confidence': 0.55,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'DAMIAN J. TROISE'}]},\n", " 'types': ['Human']}\n", "\n" ] } ], "source": [ "\n", "for story in stories[0:1]:\n", " print(story['title'])\n", " print('##############################################')\n", " for entity in story['entities'][0:5]:\n", " pprint(entity)\n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Depending on your query, we should see that the classifier picked up some entities. We can also see that some of the entities are linked to Wikidata URLs; we will return to this below. \n", "\n", "We are not limited to working with entities in the title, however. We can also search for entities in the body of the article. Let's print the first few entities that have surface forms in the body. We can see that AYLIEN's enrichment process identifies a whole range of entity types." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Wall Street slips as lockdown protests spread in China By DAMIAN J. 
TROISE - AP Business Writer Nov 28, 2022 Nov 28, 2022 Updated 4 min ago\n", "##############################################\n", "{'body': {'sentiment': {'confidence': 0.64, 'polarity': 'negative'},\n", " 'surface_forms': [{'frequency': 4,\n", " 'mentions': [{'index': {'end': 278, 'start': 273},\n", " 'sentiment': {'confidence': 0.63,\n", " 'polarity': 'negative'}},\n", " {'index': {'end': 1534,\n", " 'start': 1529},\n", " 'sentiment': {'confidence': 0.76,\n", " 'polarity': 'negative'}},\n", " {'index': {'end': 1878,\n", " 'start': 1873},\n", " 'sentiment': {'confidence': 0.56,\n", " 'polarity': 'negative'}},\n", " {'index': {'end': 2475,\n", " 'start': 2470},\n", " 'sentiment': {'confidence': 0.61,\n", " 'polarity': 'negative'}}],\n", " 'text': 'China'},\n", " {'frequency': 1,\n", " 'mentions': [{'index': {'end': 2542,\n", " 'start': 2535},\n", " 'sentiment': {'confidence': 0.54,\n", " 'polarity': 'positive'}}],\n", " 'text': 'Chinese'}]},\n", " 'external_ids': {},\n", " 'id': 'Q148',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q148',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/China'},\n", " 'overall_frequency': 6,\n", " 'overall_prominence': 0.98,\n", " 'overall_sentiment': {'confidence': 0.64, 'polarity': 'negative'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.6, 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 54, 'start': 49},\n", " 'sentiment': {'confidence': 0.6,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'China'}]},\n", " 'types': ['Sovereign_state',\n", " 'Location',\n", " 'Community',\n", " 'Country',\n", " 'State_(polity)',\n", " 'Organization']}\n", "\n", "{'body': {'sentiment': {'confidence': 0.64, 'polarity': 'negative'},\n", " 'surface_forms': [{'frequency': 4,\n", " 'mentions': [{'index': {'end': 278, 'start': 273},\n", " 'sentiment': {'confidence': 0.63,\n", " 'polarity': 'negative'}},\n", " {'index': {'end': 1534,\n", " 'start': 1529},\n", " 
'sentiment': {'confidence': 0.76,\n", " 'polarity': 'negative'}},\n", " {'index': {'end': 1878,\n", " 'start': 1873},\n", " 'sentiment': {'confidence': 0.56,\n", " 'polarity': 'negative'}},\n", " {'index': {'end': 2475,\n", " 'start': 2470},\n", " 'sentiment': {'confidence': 0.61,\n", " 'polarity': 'negative'}}],\n", " 'text': 'China'},\n", " {'frequency': 1,\n", " 'mentions': [{'index': {'end': 2542,\n", " 'start': 2535},\n", " 'sentiment': {'confidence': 0.54,\n", " 'polarity': 'positive'}}],\n", " 'text': 'Chinese'}]},\n", " 'external_ids': {},\n", " 'id': 'Q148',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q148',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/China'},\n", " 'overall_frequency': 6,\n", " 'overall_prominence': 0.98,\n", " 'overall_sentiment': {'confidence': 0.64, 'polarity': 'negative'},\n", " 'stock_tickers': [],\n", " 'title': {'sentiment': {'confidence': 0.6, 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 54, 'start': 49},\n", " 'sentiment': {'confidence': 0.6,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'China'}]},\n", " 'types': ['Sovereign_state',\n", " 'Location',\n", " 'Community',\n", " 'Country',\n", " 'State_(polity)',\n", " 'Organization']}\n", "\n", "{'body': {'sentiment': {'confidence': 0.93, 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 5,\n", " 'mentions': [{'index': {'end': 85, 'start': 72},\n", " 'sentiment': {'confidence': 0.95,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 400, 'start': 387},\n", " 'sentiment': {'confidence': 0.93,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 856, 'start': 843},\n", " 'sentiment': {'confidence': 0.93,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 1212,\n", " 'start': 1199},\n", " 'sentiment': {'confidence': 0.93,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 4601,\n", " 'start': 4588},\n", " 'sentiment': {'confidence': 0.77,\n", " 'polarity': 'positive'}}],\n", " 
'text': 'KEB Hana Bank'}]},\n", " 'external_ids': {},\n", " 'id': 'Q484047',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q484047',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Hana_Bank'},\n", " 'overall_frequency': 5,\n", " 'overall_prominence': 0.97,\n", " 'overall_sentiment': {'confidence': 0.93, 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': ['Business', 'Organization']}\n", "\n", "{'body': {'sentiment': {'confidence': 0.94, 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 5,\n", " 'mentions': [{'index': {'end': 107, 'start': 102},\n", " 'sentiment': {'confidence': 0.95,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 422, 'start': 417},\n", " 'sentiment': {'confidence': 0.94,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 878, 'start': 873},\n", " 'sentiment': {'confidence': 0.94,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 1234,\n", " 'start': 1229},\n", " 'sentiment': {'confidence': 0.93,\n", " 'polarity': 'neutral'}},\n", " {'index': {'end': 4623,\n", " 'start': 4618},\n", " 'sentiment': {'confidence': 0.69,\n", " 'polarity': 'positive'}}],\n", " 'text': 'Seoul'}]},\n", " 'external_ids': {},\n", " 'id': 'Q8684',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q8684',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Seoul'},\n", " 'overall_frequency': 5,\n", " 'overall_prominence': 0.93,\n", " 'overall_sentiment': {'confidence': 0.94, 'polarity': 'neutral'},\n", " 'stock_tickers': [],\n", " 'title': {'surface_forms': []},\n", " 'types': ['City', 'Location', 'Organization', 'Community']}\n", "\n" ] } ], "source": [ "for story in stories[0:1]:\n", " print(story['title'])\n", " print('##############################################')\n", " for entity in stories[0]['entities'][0:3]:\n", " for surface_form in entity['body']['surface_forms']:\n", " pprint(entity)\n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"### Entity Search Using Wikipedia URL\n", "We have seen how AYLIEN's NLP enrichment identifies entities and that some entities are tagged with Wikidata URLs. Entities can be useful when a keyword or search term can refer to multiple entities. For example, let's imagine we are interested in finding news regarding the company Apple: how do we restrict our search to the company and ignore the fruit? We *could* search for the keyword \"Apple\" and also filter on company entity types as described above, but then we would run the risk of returning titles that mention the fruit alongside companies other than Apple Inc. We can, however, perform a more specific search using Wikidata and Wikipedia URLs.\n", "\n", "Wikidata is a structured knowledge base that accompanies Wikipedia, in which distinct entities are referred to by URIs (like https://en.wikipedia.org/wiki/Apple_Inc. and https://www.wikidata.org/wiki/Q312). Using these URIs, we can perform very specific searches for topics and reduce the ambiguity in our query. Searching by URI will also identify different surface forms that link to Apple, e.g. \"Apple\", \"Apple Inc.\" and the Apple stock ticker, \"AAPL\".\n", "\n", "Below, we'll demonstrate a search for Citigroup using its Wikipedia URL.\n", "\n", "_N.B. AYLIEN's knowledge base switched from using DBPedia (V2 entities) to Wikidata (V3 entities) in February 2021. If you require syntax relating to V2, please contact sales@aylien.com._" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'entities: {{links.wikipedia:\"https://en.wikipedia.org/wiki/Citigroup\" '\n", " '}}',\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-1MONTH'}\n", "Wells Fargo & Company MN Sells 97,148 Shares of Analog Devices, Inc. 
(NASDAQ:ADI)\n", "https://www.dispatchtribunal.com/2022/11/28/wells-fargo-company-mn-sells-97148-shares-of-analog-devices-inc-nasdaqadi.html\n", "Fetched 5 stories. Total story count so far: 5\n", "#############\n", "Wells Fargo & Company MN Sells 97,148 Shares of Analog Devices, Inc. (NASDAQ:ADI)\n", "https://www.dispatchtribunal.com/2022/11/28/wells-fargo-company-mn-sells-97148-shares-of-analog-devices-inc-nasdaqadi.html\n", "Keyword mention:\n", "wn 85.22% of the company’s stock.\n", "\n", "A number of research analysts have issued reports on ADI shares. \u001b[1mCitigroup\u001b[0m upped their target price on shares of Analog Devic\n", "\n", "KPMG bets on Manchester with tech jobs and 'sprint' rooms\n", "https://www.accountingtoday.com/articles/kpmg-bets-on-manchester-with-tech-jobs-and-sprint-rooms\n", "Keyword mention:\n", "he latest international firm to grow beyond London, with banks such as Goldman Sachs Group Inc. and \u001b[1mCitigroup\u001b[0m Inc. finding it easier to secure lower costs and s\n", "\n", "Beaird Harris Wealth Management LLC Has $138,000 Holdings in DTE Energy (NYSE:DTE)\n", "https://www.com-unik.info/2022/11/28/beaird-harris-wealth-management-llc-has-138000-holdings-in-dte-energy-nysedte.html\n", "Keyword mention:\n", "erts: Wall Street Analysts Forecast Growth A number of research firms have issued reports on DTE. \u001b[1mCitigroup\u001b[0m cut their price target on DTE Energy from $146.00 \n", "\n", "Teton Advisors Inc. Has $1.72 Million Position in World Wrestling Entertainment, Inc. (NYSE:WWE)\n", "https://www.dailypolitical.com/2022/11/28/teton-advisors-inc-has-1-72-million-position-in-world-wrestling-entertainment-inc-nysewwe.html\n", "Keyword mention:\n", " to $50.00 and gave the stock an “underweight” rating in a research note on Wednesday, August 17th. \u001b[1mCitigroup\u001b[0m boosted their target price on shares of World Wres\n", "\n", "Gamco Investors INC. ET AL Raises Stock Position in Tredegar Co. 
(NYSE:TG)\n", "https://www.themarketsdaily.com/2022/11/28/gamco-investors-inc-et-al-raises-stock-position-in-tredegar-co-nysetg.html\n", "Keyword mention:\n", "rials company’s stock valued at $435,000 after buying an additional 1,826 shares during the period. \u001b[1mCitigroup\u001b[0m Inc. boosted its stake in Tredegar by 9.4% during \n", "\n" ] } ], "source": [ "params = {\n", " 'published_at.start': 'NOW-1MONTH'\n", ", 'published_at.end': 'NOW'\n", ", 'aql': 'entities: {{links.wikipedia:\"https://en.wikipedia.org/wiki/Citigroup\" }}'\n", ", 'per_page' : 5\n", "}\n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "print('#############')\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print(story['links']['permalink'])\n", " print('Keyword mention:')\n", " print_keyword_mention(story, 'body', 'Citigroup')\n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Search for an Entity by QID\n", "We can search for entities using their Wikidata ID as per below." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'entities: {{links.wikidata:\"https://www.wikidata.org/wiki/Q219508\" }}',\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-1MONTH'}\n", "Teton Advisors Inc. Has $1.72 Million Position in World Wrestling Entertainment, Inc. (NYSE:WWE)\n", "https://www.dailypolitical.com/2022/11/28/teton-advisors-inc-has-1-72-million-position-in-world-wrestling-entertainment-inc-nysewwe.html\n", "Fetched 5 stories. Total story count so far: 5\n", "#############\n", "Teton Advisors Inc. Has $1.72 Million Position in World Wrestling Entertainment, Inc. 
(NYSE:WWE)\n", "https://www.dailypolitical.com/2022/11/28/teton-advisors-inc-has-1-72-million-position-in-world-wrestling-entertainment-inc-nysewwe.html\n", "Keyword mention:\n", " to $50.00 and gave the stock an “underweight” rating in a research note on Wednesday, August 17th. \u001b[1mCitigroup\u001b[0m boosted their target price on shares of World Wres\n", "\n", "Gamco Investors INC. ET AL Raises Stock Position in Tredegar Co. (NYSE:TG)\n", "https://www.themarketsdaily.com/2022/11/28/gamco-investors-inc-et-al-raises-stock-position-in-tredegar-co-nysetg.html\n", "Keyword mention:\n", "rials company’s stock valued at $435,000 after buying an additional 1,826 shares during the period. \u001b[1mCitigroup\u001b[0m Inc. boosted its stake in Tredegar by 9.4% during \n", "\n", "Coherent slips even as Deutsche Bank upgrades, saying bear case 'not as bad as feared'\n", "https://seekingalpha.com/news/3911636-coherent-slips-even-as-deutsche-bank-upgrades-saying-bear-case-not-as-bad-as-feared?utm_source=feed_news_all&utm_medium=referral\n", "Keyword mention:\n", "\n", "Trian Fund Management L.P. Raises Stake in General Electric (NYSE:GE)\n", "https://mayfieldrecorder.com/2022/11/28/trian-fund-management-l-p-raises-stake-in-general-electric-nysege.html\n", "Keyword mention:\n", " to $78.00 and set an “overweight” rating on the stock in a research report on Monday, October 3rd. \u001b[1mCitigroup\u001b[0m increased their price objective on shares of Gener\n", "\n", "Pin Oak Investment Advisors Inc. Increases Position in Kimbell Royalty Partners, LP (NYSE:KRP)\n", "https://slatersentinel.com/news/2022/11/28/pin-oak-investment-advisors-inc-increases-position-in-kimbell-royalty-partners-lp-nysekrp.html\n", "Keyword mention:\n", "'s stock. Analyst Upgrades and Downgrades KRP has been the topic of a number of research reports. 
\u001b[1mCitigroup\u001b[0m assumed coverage on Kimbell Royalty Partners in a \n", "\n" ] } ], "source": [ "params = {\n", " 'published_at.start': 'NOW-1MONTH'\n", ", 'published_at.end': 'NOW'\n", ", 'aql': 'entities: {{links.wikidata:\"https://www.wikidata.org/wiki/Q219508\" }}'\n", ", 'per_page' : 5\n", "}\n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "print('#############')\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print(story['links']['permalink'])\n", " print('Keyword mention:')\n", " print_keyword_mention(story, 'body', 'Citigroup')\n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Search for an Entity by Surface Form\n", "Sometimes we might want to search for an entity by surface form (i.e. the text mentioned) rather than the wiki ID.\n", "This may be because we want to limit the search to a certain surface form (e.g. MSFT and not Microsoft) or because the entity is not in Wikidata and so not in our knowledge base. Our Named Entity Recognition model can still recognise entities that are not in Wikidata, based on the context of the document. This is useful for searching for lesser known companies, SMEs or start-ups.\n", "\n", "In the code below we use surface_forms.text, which performs a full-text search. This means that:\n", "- Punctuation is removed\n", "- The text of each surface form is split into individual words, e.g. a search for surface_forms.text: \"boeing\" or surface_forms.text: \"company\" would return mentions of the surface form \"Boeing Company\"\n", "\n", "In contrast, searching via surface_forms on its own performs an exact string match, i.e. case sensitive with special characters included. 
" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'entities: {{surface_forms.text:\"Boeing\"}}',\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-1MONTH'}\n", "ВОЙНА В Украине\n", "https://izvestia.kiev.ua/item/show/148204\n", "Fetched 5 stories. Total story count so far: 5\n", "#############\n", "ВОЙНА В Украине\n", "https://izvestia.kiev.ua/item/show/148204\n", "Keyword mention:\n", "\u001b[1mBoeing\u001b[0m has proposed to produce small diameter of land bas\n", "\n", "Russia Won't Stop Strikes until It Runs Out of Missiles, Ukraine's Zelenskiy Says\n", "https://english.aawsat.com/home/article/4013126/russia-won%E2%80%99t-stop-strikes-until-it-runs-out-missiles-ukraine%E2%80%99s-zelenskiy-says\n", "Keyword mention:\n", ". In the latest example of Western military aid to Kyiv, the Pentagon is considering a proposal by \u001b[1mBoeing\u001b[0m to supply Ukraine with cheap, small precision bomb\n", "\n", "محلل سياسى: استمرار الحرب الروسية الأوكرانية يضع مستقبل أوروبا على المحك\n", "https://www.youm7.com/story/2022/11/28/%D9%85%D8%AD%D9%84%D9%84-%D8%B3%D9%8A%D8%A7%D8%B3%D9%89-%D8%A7%D8%B3%D8%AA%D9%85%D8%B1%D8%A7%D8%B1-%D8%A7%D9%84%D8%AD%D8%B1%D8%A8-%D8%A7%D9%84%D8%B1%D9%88%D8%B3%D9%8A%D8%A9-%D8%A7%D9%84%D8%A3%D9%88%D9%83%D8%B1%D8%A7%D9%86%D9%8A%D8%A9-%D9%8A%D8%B6%D8%B9-%D9%85%D8%B3%D8%AA%D9%82%D8%A8%D9%84-%D8%A3%D9%88%D8%B1%D9%88%D8%A8%D8%A7-%D8%B9%D9%84%D9%89/5992901\n", "Keyword mention:\n", "the mental image of how America can help its allies and NATO countries.\n", "The Washington study of the \u001b[1mBoeing\u001b[0m proposal to provide \"Keeff\" with accurate bombs is\n", "\n", "USA harkitsee Boeingin ja Saabin kehittämän täsmäpommin lähettämistä Ukrainaan – GLSDB-pommi mahdollistaisi iskut yli 100 km Venäjän selustaan\n", 
"https://www.talouselama.fi/uutiset/usa-harkitsee-boeingin-ja-saabin-kehittaman-tasmapommin-lahettamista-ukrainaan-glsdb-pommi-mahdollistaisi-iskut-yli-100-km-venajan-selustaan/abaa49d0-d08e-492e-97f4-f0a0af921177\n", "Keyword mention:\n", "ng sending the GLSDB (Ground-Lunched Small Diamond Bomb) to Ukraine, which was developed jointly by \u001b[1mBoeing\u001b[0m and Saab. The news agency reports on the nameless \n", "\n", "Russia won't stop strikes until it runs out of missiles, Ukraine's Zelenskiy says\n", "https://nationalpost.com/pmn/news-pmn/russia-wont-stop-strikes-until-it-runs-out-of-missiles-ukraines-zelenskiy-says\n", "Keyword mention:\n", ". In the latest example of Western military aid to Kyiv, the Pentagon is considering a proposal by \u001b[1mBoeing\u001b[0m to supply Ukraine with cheap, small precision bomb\n", "\n" ] } ], "source": [ "params = {\n", " 'published_at.start': 'NOW-1MONTH'\n", ", 'published_at.end': 'NOW'\n", "#, 'aql': 'entities: {{surface_forms:\"Boeing\"}}'\n", ", 'aql': 'entities: {{surface_forms.text:\"Boeing\"}}'\n", ", 'per_page' : 5\n", "}\n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "print('#############')\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print(story['links']['permalink'])\n", " print('Keyword mention:')\n", " print_keyword_mention(story, 'body', 'Boeing')\n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Search for an Entity by Stock Ticker \n", "We can search for entities using their stock ticker (where supported). " ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'entities: {{stock_ticker:GOOGL }}',\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-1MONTH'}\n", "JEPI Vs. 
SPY: The Relative Lead Unlikely To Continue In 2023\n", "https://seekingalpha.com/article/4560842-jepi-vs-spy-the-relative-lead-unlikely-to-continue-in-2023?source=feed_all_articles\n", "Fetched 5 stories. Total story count so far: 5\n", "#############\n", "JEPI Vs. SPY: The Relative Lead Unlikely To Continue In 2023\n", "https://seekingalpha.com/article/4560842-jepi-vs-spy-the-relative-lead-unlikely-to-continue-in-2023?source=feed_all_articles\n", "Keyword mention:\n", "\n", "Investors Increasingly Impatient with Slow Pace of Autonomous Vehicles\n", "https://programbusiness.com/news/investors-increasingly-impatient-with-slow-pace-of-autonomous-vehicles/\n", "Keyword mention:\n", "ut costs during an economic slowdown. An influential hedge fund has also questioned Alphabet Inc.’s \u001b[1mGoogle\u001b[0m s years-long effort to advance self-driving techno\n", "\n", "Yahoo buys nearly 25% stake in advertising tech firm Taboola\n", "https://infotechlead.com/digital/yahoo-buys-nearly-25-stake-in-advertising-tech-firm-taboola-75712\n", "Keyword mention:\n", "\n", "Allianz Asset Management GmbH Acquires 30,898 Shares of Alphabet Inc. (NASDAQ:GOOG)\n", "https://mayfieldrecorder.com/2022/11/28/allianz-asset-management-gmbh-acquires-30898-shares-of-alphabet-inc-nasdaqgoog.html\n", "Keyword mention:\n", ", Europe, the Middle East, Africa, the Asia-Pacific, Canada, and Latin America. It operates through \u001b[1mGoogle\u001b[0m Services, Google Cloud, and Other Bets segments. 
T\n", "\n", "SPACs Slap Some Lipstick on Their Penny-Stock Pigs\n", "https://medworm.com/1053087611/spacs-slap-some-lipstick-on-their-penny-stock-pigs/\n", "Keyword mention:\n", "\n" ] } ], "source": [ "params = {\n", " 'published_at.start': 'NOW-1MONTH'\n", ", 'published_at.end': 'NOW'\n", ", 'aql': 'entities: {{stock_ticker:GOOGL }}'\n", ", 'per_page' : 5\n", "}\n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "print('#############')\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print(story['links']['permalink'])\n", " print('Keyword mention:')\n", " print_keyword_mention(story, 'body', 'Google')\n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Search for an Entity Specifying Entity Type\n", "When searching for an entity by surface form, we may want to specify the entity type to help identify the correct entity. This may be because the entity is not recognised in Wikidata and is therefore not in the AYLIEN knowledge base. \n", "\n", "However, our Named Entity Recognition model can predict the entity type (i.e. Person, Organization, Location etc.) even if the entity is not in Wikidata. This enables us to search for entity surface forms and explicitly state what type of entity they should be.\n", "\n", "Below we search for the surface form \"Apple\" and specify that we are looking for an Organization entity type. 
" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'entities:{{surface_forms.text:Apple AND type:Organization}}',\n", " 'categories_id': ['04000000'],\n", " 'categories_taxonomy': 'iptc-subjectcode',\n", " 'language': ['en'],\n", " 'per_page': 5}\n", "Ahead of Market: 10 things that will decide D-Street action on Tuesday\n", "https://economictimes.indiatimes.com/markets/stocks/news/ahead-of-market-10-things-that-will-decide-d-street-action-on-tuesday/articleshow/95835927.cms\n", "Fetched 5 stories. Total story count so far: 5\n", "#############\n", "Ahead of Market: 10 things that will decide D-Street action on Tuesday\n", "https://economictimes.indiatimes.com/markets/stocks/news/ahead-of-market-10-things-that-will-decide-d-street-action-on-tuesday/articleshow/95835927.cms\n", "Keyword mention:\n", "r Monday sales were set for a record.The biggest drag on the benchmark S&P 500 index, however, were \u001b[1mApple\u001b[0m Inc shares, which fell 1.5% after a report that th\n", "\n", "WhatsApp Message Yourself feature starts rolling out: Here's how to use it\n", "\n", "Keyword mention:\n", "eature, users must update the WhatsApp app on their smartphone. To do so, head to Google Play Store/\u001b[1mApple\u001b[0m App Store and install the latest version of the ap\n", "\n", "'A Christmas miracle': Woman kidnapped as child reunites with family 51 years later\n", "https://headtopics.com/us/a-christmas-miracle-woman-kidnapped-as-child-reunites-with-family-51-years-later-32220438\n", "Keyword mention:\n", "ry isn't too difficult to figure out, the change of pace for a Hallmark movie is welcomed. 
In 2021, \u001b[1mApple\u001b[0m agreed to broadcast A Charlie Brown Christmas on P\n", "\n", "The Best Cyber Monday deals available now\n", "https://headtopics.com/us/the-best-cyber-monday-deals-available-now-32218332\n", "Keyword mention:\n", "he 2021 iPad Pro 11-inch with an M1 chip, well, here is the follow-up: you can also get the 2nd gen \u001b[1mApple\u001b[0m Pencil that works great with it at a $40 off price\n", "\n", "Amazon, union organizer head to court over COVID-based class racial-bias lawsuit\n", "https://thegrio.com/2022/11/28/amazon-union-organizer-smalls-head-to-court-covid-based-class-racial-bias-lawsuit/\n", "Keyword mention:\n", "acility, it would weaken the claims within the racial-bias lawsuit.\n", "\n", "TheGrio is FREE on your TV via \u001b[1mApple\u001b[0m TV, Amazon Fire, Roku and Android TV. Also, please\n", "\n" ] } ], "source": [ "params = {\n", " \"aql\": \"entities:{{surface_forms.text:Apple AND type:Organization}}\"\n", " , \"categories_taxonomy\": \"iptc-subjectcode\"\n", " , \"categories_id\": [\"04000000\"]\n", " , \"language\": [\"en\"]\n", " , 'per_page' : 5\n", "}\n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "print('#############')\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print(story['links']['permalink'])\n", " print('Keyword mention:')\n", " print_keyword_mention(story, 'body', 'Apple')\n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Search for an Entity Specying Title or Body Element\n", "We can specify where in the article we want to find the entity by specifying the title or body elements." 
] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'entities:{{element:title AND surface_forms:Apple}}',\n", " 'categories_id': ['04000000'],\n", " 'categories_taxonomy': 'iptc-subjectcode',\n", " 'language': ['en'],\n", " 'per_page': 5}\n", "Apple’s Change to AirDrop Is Hurting Chinese Protests\n", "https://www.techinvestornews.com/Tech-News/Latest-Headlines/apples-change-to-airdrop-is-hurting-chinese-protests\n", "Fetched 5 stories. Total story count so far: 5\n", "#############\n", "Apple’s Change to AirDrop Is Hurting Chinese Protests\n", "https://www.techinvestornews.com/Tech-News/Latest-Headlines/apples-change-to-airdrop-is-hurting-chinese-protests\n", "\n", "The Best Apple Cyber Monday Deals\n", "https://headtopics.com/us/the-best-apple-cyber-monday-deals-32218396\n", "\n", "Gwyneth Paltrow Reunites With Look-Alike Daughter Apple, 18, In NYC On Teen's College Break: Photos\n", "https://www.newsbreak.com/news/2839249825578/gwyneth-paltrow-reunites-with-look-alike-daughter-apple-18-in-nyc-on-teen-s-college-break-photos\n", "\n", "Why Apple Stock Is Sinking Today\n", "https://www.fool.com/investing/2022/11/28/why-apple-stock-is-sinking-today/?source=iedfolrf0000001\n", "\n", "Snap up a £30 saving on the Apple Watch ultra this Cyber Monday\n", "https://theworldnews.net/gb-news/snap-up-a-ps30-saving-on-the-apple-watch-ultra-this-cyber-monday\n", "\n" ] } ], "source": [ "params = {\n", " \"aql\": \"entities:{{element:title AND surface_forms:Apple}}\"\n", " , \"categories_taxonomy\": \"iptc-subjectcode\"\n", " , \"categories_id\": [\"04000000\"]\n", " , \"language\": [\"en\"]\n", " , 'per_page' : 5\n", "}\n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "print('#############')\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print(story['links']['permalink'])\n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 
Searching For Multiple Entities at Once\n", "We can add logic to search for multiple entities at once. Note in this example we are using the OR operator to search for one of two entities." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'entities:{{element:title AND surface_forms: \"Deloitte\"}} OR '\n", " 'entities:{{element:title AND surface_forms: \"Accenture\"}}',\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-30DAYS'}\n", "BRITISH COLUMBIA INVESTMENT MANAGEMENT Corp Sells 24,262 Shares of Accenture plc (NYSE:ACN)\n", "https://baseballnewssource.com/2022/11/28/british-columbia-investment-management-corp-sells-24262-shares-of-accenture-plc-nyseacn/7861473.html\n", "Fetched 5 stories. Total story count so far: 5\n", "#############\n", "BRITISH COLUMBIA INVESTMENT MANAGEMENT Corp Sells 24,262 Shares of Accenture plc (NYSE:ACN)\n", "https://baseballnewssource.com/2022/11/28/british-columbia-investment-management-corp-sells-24262-shares-of-accenture-plc-nyseacn/7861473.html\n", "\n", "Tvh оцифровывает свою глобальную сеть складов с помощью Körber и Accenture — Data Intelligence.\n", "https://zephyrnet.com/ru/tvh-%D0%BE%D1%86%D0%B8%D1%84%D1%80%D0%BE%D0%B2%D1%8B%D0%B2%D0%B0%D0%B5%D1%82-%D1%81%D0%B2%D0%BE%D1%8E-%D0%B3%D0%BB%D0%BE%D0%B1%D0%B0%D0%BB%D1%8C%D0%BD%D1%83%D1%8E-%D1%81%D0%B5%D1%82%D1%8C-%D1%81%D0%BA%D0%BB%D0%B0%D0%B4%D0%BE%D0%B2-%D1%81-%D0%BF%D0%BE%D0%BC%D0%BE%D1%89%D1%8C%D1%8E-korber-%D0%B8-accure-5/\n", "\n", "Ensign Peak Advisors Inc Lowers Position in Accenture plc (NYSE:ACN)\n", "https://www.americanbankingnews.com/2022/11/28/ensign-peak-advisors-inc-lowers-position-in-accenture-plc-nyseacn.html\n", "\n", "Purdue, Accenture sign five-year agreement in support of smart manufacturing\n", "https://www.purdue.edu/newsroom/releases/2022/Q4/purdue,-accenture-sign-five-year-agreement-in-support-of-smart-manufacturing.html\n", 
"\n", "Deloitte mandated for revisited Hassyan IWP\n", "\n", "\n" ] } ], "source": [ "params = {\n", " 'published_at.start': 'NOW-30DAYS'\n", " , 'published_at.end': 'NOW'\n", " , 'aql': 'entities:{{element:title AND surface_forms: \"Deloitte\"}} OR entities:{{element:title AND surface_forms: \"Accenture\"}}'\n", " , 'per_page' : 5\n", " } \n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "print('#############')\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print(story['links']['permalink'])\n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Searching by Entity and Entity Level Sentiment Analysis\n", "We can also limit to the stories we want by enttiy sentiment, as exemplified below. Here we will search for negative mentions of Citigroup." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'entities:{{element:title AND surface_forms:Citigroup AND '\n", " 'sentiment:negative}}',\n", " 'language': ['en'],\n", " 'per_page': 5,\n", " 'period': '+1DAY',\n", " 'publised_at_start': 'NOW-10DAYS'}\n", "Magna International (NYSE:MGA) Downgraded by Citigroup\n", "https://www.com-unik.info/2022/11/27/magna-international-nysemga-downgraded-by-citigroup.html\n", "Fetched 5 stories. 
Total story count so far: 5\n", "#############\n", "Magna International (NYSE:MGA) Downgraded by Citigroup\n", "https://www.com-unik.info/2022/11/27/magna-international-nysemga-downgraded-by-citigroup.html\n", "\n", "Magna International (NYSE:MGA) Downgraded by Citigroup\n", "https://www.thelincolnianonline.com/2022/11/27/magna-international-nysemga-downgraded-by-citigroup.html\n", "\n", "MacroGenics (NASDAQ:MGNX) Downgraded by Citigroup\n", "https://www.com-unik.info/2022/11/27/macrogenics-nasdaqmgnx-downgraded-by-citigroup.html\n", "\n", "MacroGenics (NASDAQ:MGNX) Downgraded by Citigroup\n", "https://www.thelincolnianonline.com/2022/11/27/macrogenics-nasdaqmgnx-downgraded-by-citigroup.html\n", "\n", "Magna International (NYSE:MGA) Downgraded by Citigroup\n", "https://zolmax.com/investing/magna-international-nysemga-downgraded-by-citigroup/8168431.html\n", "\n" ] } ], "source": [ "params = {\n", " \"aql\": \"entities:{{element:title AND surface_forms:Citigroup AND sentiment:negative}}\"\n", " , \"published_at.start\": \"NOW-10DAYS\"\n", " , \"period\": \"+1DAY\"\n", " , \"language\": [\"en\"]\n", " , 'per_page' : 5\n", "}\n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "print('#############')\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print(story['links']['permalink'])\n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we will isolate the Citigroup entity in the first story to show that its title mention is classified with negative sentiment (the body and overall sentiment may differ). 
" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'body': {'sentiment': {'confidence': 0.72, 'polarity': 'neutral'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 9, 'start': 0},\n", " 'sentiment': {'confidence': 0.72,\n", " 'polarity': 'neutral'}}],\n", " 'text': 'Citigroup'}]},\n", " 'external_ids': {},\n", " 'id': 'Q219508',\n", " 'links': {'wikidata': 'https://www.wikidata.org/wiki/Q219508',\n", " 'wikipedia': 'https://en.wikipedia.org/wiki/Citigroup'},\n", " 'overall_frequency': 2,\n", " 'overall_prominence': 0.98,\n", " 'overall_sentiment': {'confidence': 0.72, 'polarity': 'neutral'},\n", " 'stock_tickers': ['C'],\n", " 'title': {'sentiment': {'confidence': 0.53, 'polarity': 'negative'},\n", " 'surface_forms': [{'frequency': 1,\n", " 'mentions': [{'index': {'end': 54, 'start': 45},\n", " 'sentiment': {'confidence': 0.53,\n", " 'polarity': 'negative'}}],\n", " 'text': 'Citigroup'}]},\n", " 'types': ['Business', 'Organization', 'Financial_institution']}\n" ] } ], "source": [ "for entity in stories[0]['entities']:\n", " for surface_form in entity['title']['surface_forms']:\n", " if 'Citigroup' in surface_form['text']:\n", " pprint(entity)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Entity Prominence\n", "Entity prominence is a measure of how significant a mention of an entity is on a scale of 0-1. \n", "\n", "Intuitively - as consumers of news - we know if an entity appears in the title, in the first paragaph or many times in an article, then it is pretty significant. AYLIEN's entioty prominence metric catpures this signficance. \n", "\n", "We can use this as a query paramter to filter out insignificant mentions of an entity by setting an entity prominence threshold. We can also sort by entity prominence to see the most significant mentions first. For more ways to sort your query output see here." 
] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'entities: {{surface_forms: \"Citigroup\" AND overall_prominence:[0.6 TO '\n", " '*] sort_by(overall_prominence)}}',\n", " 'per_page': 5,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-30DAYS'}\n", "Vestmark Advisory Solutions Inc. Purchases 10,915 Shares of Citigroup Inc. (NYSE:C)\n", "https://www.themarketsdaily.com/2022/10/29/vestmark-advisory-solutions-inc-purchases-10915-shares-of-citigroup-inc-nysec.html\n", "Fetched 5 stories. Total story count so far: 5\n", "##############\n", "Title:\n", "10,915 Shares of \u001b[1mCitigroup\u001b[0m Inc. (NYSE:C)\n", "\n", "Mention:\n", "\u001b[1mCitigroup\u001b[0m Inc. (NYSE:C – Get Rating) by 16.5% during the sec\n", "##############\n", "Title:\n", "\u001b[1mCitigroup\u001b[0m China Technology Forum 2022\n", "\n", "Mention:\n", "a new form of trade, but the brutal development of the industry\n", "In this context, on October 28, the \u001b[1mCitigroup\u001b[0m based Changed Opportunities & Technology Forum | 2\n", "##############\n", "Title:\n", "\u001b[1mCitigroup\u001b[0m Lowers Visa (NYSE:V) Price Target to $238.00\n", "\n", "Mention:\n", "\u001b[1mCitigroup\u001b[0m from $254.00 to $238.00 in a research note publish\n", "##############\n", "Title:\n", "OSCO SHIPPING (OTCMKTS:CICOF) Cut to Sell at \u001b[1mCitigroup\u001b[0m \n", "\n", "Mention:\n", "\u001b[1mCitigroup\u001b[0m downgraded shares of COSCO SHIPPING ( OTCMKTS:CICO\n", "##############\n", "Title:\n", "\u001b[1mCitigroup\u001b[0m Lowers YETI (NYSE:YETI) Price Target to $43.00\n", "\n", "Mention:\n", "\u001b[1mCitigroup\u001b[0m from $57.00 to $43.00 in a report issued on Friday\n" ] } ], "source": [ "params = {\n", " 'published_at.start': 'NOW-30DAYS'\n", " , 'published_at.end': 'NOW'\n", " , 'aql': 'entities: {{surface_forms: \"Citigroup\" AND overall_prominence:[0.6 TO *] 
sort_by(overall_prominence)}}'\n", " , 'per_page' : 5\n", " } \n", "\n", "stories = get_top_ranked_stories(params, 5)\n", "\n", "for story in stories:\n", " print('##############')\n", " print('Title:')\n", " print_keyword_mention(story, 'title', 'Citigroup')\n", " print()\n", " print('Mention:')\n", " print_keyword_mention(story, 'body', 'Citigroup')\n", " \n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Non-English Content\n", "So far we have pulled stories in English only. However, our News API supports 6 native languages and 10 translated languages:\n", "\n", "Native Languages:\n", "- English (en)\n", "- German (de)\n", "- French (fr)\n", "- Italian (it)\n", "- Spanish (es)\n", "- Portuguese (pt)\n", "\n", "Translated Languages:\n", "- Arabic (ar)\n", "- Danish (da)\n", "- Finnish (fi)\n", "- Dutch (nl)\n", "- Norwegian (no)\n", "- Russian (ru)\n", "- Swedish (sv)\n", "- Turkish (tr)\n", "- Chinese (simplified) (zh-cn)\n", "- Chinese (traditional) (zh-tw)\n", "\n", "Let's perform a search in some native languages other than English. Here we'll search for stories featuring Citigroup in the title and print each story's native-language title followed by its English translation." ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'language': ['de', 'fr', 'it', 'es', 'pt'], 'title': 'Citigroup', 'published_at.start': 'NOW-10DAYS', 'published_at.end': 'NOW', 'cursor': '*', 'per_page': 50}\n", "{'cursor': '*',\n", " 'language': ['de', 'fr', 'it', 'es', 'pt'],\n", " 'per_page': 50,\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-10DAYS',\n", " 'title': 'Citigroup'}\n", "¿Carlos Slim e Inbursa abandonan la carrera: ya no comprarán Banamex a Citigroup?\n", "https://www.capitalmexico.com.mx/mundo/carlos-slim-e-inbursa-abandonan-la-carrera-ya-no-compraran-banamex-a-citigroup/\n", "Fetched 28 stories. Total story count so far: 28\n", "Fetched 0 stories. 
Total story count so far: 28\n", "************\n", "Fetched 28 stories\n", "¿Carlos Slim e Inbursa abandonan la carrera: ya no comprarán Banamex a Citigroup?\n", "Are Slim and Inbursa leaving the race: will Banamex no longer buy Citigroup?\n", "\n", "Citigroup\n", "Citigroup\n", "\n", "Citigroup\n", "Citigroup\n", "\n", "Tesla, Inc. : Citigroup cambia a neutral | MarketScreener\n", "Tesla, Inc. : Citigroup changes to neutral | MarketScreener\n", "\n", "El multimillonario mexicano Carlos Slim descarta comprar Banamex a Citigroup\n", "Mexican millionaire Carlos Slim dismisses Banamex to Citigroup\n", "\n", "México: Banco de Slim se retira de compra de Banamex El grupo financiero Inbursa, del millonario mexicano Carlos Slim, anuncia su retiro del proceso de compra de Banamex, uno de los principales bancos de México que la corporación estadounidense Citigroup espera vender en los próximos meses Associated Press Nov 23, 2022 30 min ago\n", "Mexico: Slim Bank withdraws from Banamex The financial group Inbursa, of Mexican millionaire Carlos Slim, announces its withdrawal from the Banamex purchase process, one of Mexico's major banks that the American firm Citigroup expects to sell in the coming months sociated Press Nov 23, 2022 30 min ago\n", "\n", "México: Banco de Slim se retira de compra de Banamex El grupo financiero Inbursa, del millonario mexicano Carlos Slim, anuncia su retiro del proceso de compra de Banamex, uno de los principales bancos de México que la corporación estadounidense Citigroup espera vender en los próximos meses Associated Press Nov 23, 2022 14 min ago\n", "Mexico: Slim Bank withdraws from Banamex The financial group Inbursa, of Mexican millionaire Carlos Slim, announces its withdrawal from the Banamex purchase process, one of Mexico's major banks that the American firm Citigroup expects to sell in the coming months sociated Press Nov 23, 2022 14 min ago\n", "\n", "Reguladores de EEUU instan a Citigroup a corregir plan de simulación de quiebra\n", 
"US regulators urge Citigroup to correct bankruptcy simulation plan\n", "\n", "DICK'S Sporting Goods, Inc. : El Citigroup continua con su recomendación de compra | MarketScreener\n", "DICK'S Sporting Goods, Inc. : The Citigroup continues with its purchase recommendation | MarketScreener\n", "\n", "El multimillonario mexicano Carlos Slim descarta comprar Banamex a Citigroup\n", "Mexican millionaire Carlos Slim dismisses Banamex to Citigroup\n", "\n", "Reguladores de EE.UU. pidieron a Citigroup mejorar su plan de simulación de quiebras\n", "US regulators They asked Citigroup to improve its bankruptcy simulation plan\n", "\n", "Reguladores instan a Citigroup a corregir plan de quiebra\n", "Regulators urge Citigroup to correct bankruptcy plan\n", "\n", "Reguladores de EEUU instan a Citigroup a corregir plan de simulación de quiebra\n", "US regulators urge Citigroup to correct bankruptcy simulation plan\n", "\n", "Reguladores de EU instan a Citigroup a corregir plan de simulación de quiebra\n", "US regulators urge Citigroup to correct bankruptcy simulation plan\n", "\n", "Reguladores de EEUU instan a Citigroup a corregir plan de simulación de quiebra\n", "US regulators urge Citigroup to correct bankruptcy simulation plan\n", "\n", "Reguladores de EEUU instan a Citigroup a corregir plan de simulación de quiebra\n", "US regulators urge Citigroup to correct bankruptcy simulation plan\n", "\n", "Dell Technologies Inc. : Citigroup reitera su recomendación de compra | MarketScreener\n", "Dell Technologies Inc. : Citigroup reiterates its purchase recommendation | MarketScreener\n", "\n", "BP plc : Citigroup Cambia su recomendación a compra | MarketScreener\n", "BP Plc : Citigroup Change your purchase recommendation | MarketScreener\n", "\n", "Unity Software Inc. : Obtiene una recomendación de compra de Citigroup | MarketScreener\n", "Unit Software Inc. : Get a Citigroup Buy recommendation | MarketScreener\n", "\n", "Rackspace Technology, Inc. 
: El Citigroup se mantiene neutral | MarketScreener\n", "Rackspace Technology, Inc. : Citigroup remains neutral | MarketScreener\n", "\n", "Eneti Inc. : Obtiene una recomendación de compra de Citigroup | MarketScreener\n", "Eneti Inc. : Get a Citigroup Buy recommendation | MarketScreener\n", "\n", "Macy's, Inc. : El Citigroup se mantiene neutral | MarketScreener\n", "Macy's, Inc. : Citigroup remains neutral | MarketScreener\n", "\n", "Bath & Body Works, Inc. : recomendación de compra de Citigroup | MarketScreener\n", "Bath & Body Works, Inc. : Citigroup's Buy Recommendation | MarketScreener\n", "\n", "IBEX Limited : Citigroup permanece neutral | MarketScreener\n", "IBEX Limited : Citigroup remains neutral | MarketScreener\n", "\n", "Roblox Corporation : El Citigroup continua con un recomendación de compra | MarketScreener\n", "Robles Corporation : Citigroup continues with a shopping recommendation | MarketScreener\n", "\n", "NetEase, Inc. : Citigroup mantiene su recomendación de compra | MarketScreener\n", "NetEase, Inc. : Citigroup maintains its purchase recommendation | MarketScreener\n", "\n", "Autodesk, Inc. : Citigroup mantiene su recomendación de compra | MarketScreener\n", "Autodesk, Inc. : Citigroup maintains its purchase recommendation | MarketScreener\n", "\n", "HP Inc. : Citigroup se mantiene neutral. | MarketScreener\n", "HP Inc. : Citigroup remains neutral. 
| MarketScreener\n", "\n" ] } ], "source": [ "# define the query parameters\n", "params = {\n", " 'language': ['de', 'fr', 'it', 'es', 'pt'],\n", " 'title': 'Citigroup',\n", " 'published_at.start':'NOW-10DAYS',\n", " 'published_at.end':'NOW',\n", " 'cursor': '*',\n", " 'per_page' : 50\n", "}\n", "\n", "print(params)\n", "\n", "stories = get_top_ranked_stories(params, 100)\n", "\n", "print('************')\n", "print(\"Fetched %s stories\" %(len(stories)))\n", "\n", "for story in stories:\n", " print(story['title'])\n", " print(story['translations']['en']['title'])\n", " print('')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create a Pandas Dataframe From a List of Stories Dictionaries\n", "Up to now we have interrogated our News API output by converting the JSON objects to Python dictionaries, iterating through them and printing the elements. Sometimes we may wish to view the data in a more tabular format. Below, we will loop through our non-English stories and create a Pandas dataframe. This will also be useful later when we want to visualize our data.\n", "\n", "We'll also pull out some contextual information about each story, such as the article's permalink and the story's sentiment score. AYLIEN's enrichment process predicts the overall sentiment of a document's body and title as positive, negative or neutral, and also outputs a confidence score. " ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id title \\\n", "0 5074728360 ¿Carlos Slim e Inbursa abandonan la carrera: y... \n", "1 5074491026 Citigroup \n", "2 5074310047 Citigroup \n", "3 5074108547 Tesla, Inc. : Citigroup cambia a neutral | Mar... \n", "4 5073894044 El multimillonario mexicano Carlos Slim descar... \n", "\n", " title_eng \\\n", "0 Are Slim and Inbursa leaving the race: will Ba... \n", "1 Citigroup \n", "2 Citigroup \n", "3 Tesla, Inc. : Citigroup changes to neutral | M... \n", "4 Mexican millionaire Carlos Slim dismisses Bana... \n", "\n", " permalink published_at \\\n", "0 https://www.capitalmexico.com.mx/mundo/carlos-... 2022-11-24T17:49:13Z \n", "1 https://lado.mx/trending.php?id=5756 2022-11-24T14:33:36Z \n", "2 https://www.lado.mx/trending.php?id=5756 2022-11-24T12:28:53Z \n", "3 https://es.marketscreener.com/cotizacion/accio... 2022-11-24T09:48:26Z \n", "4 https://palabrasclaras.mx/economia/el-multimil... 2022-11-24T06:07:00Z \n", "\n", " source body_polarity body_polarity_score \n", "0 capitalmexico.com.mx negative 0.67 \n", "1 lado.mx positive 0.51 \n", "2 lado.mx positive 0.70 \n", "3 marketscreener.com neutral 0.60 \n", "4 palabrasclaras.mx negative 0.58 " ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# create dataframe in the format we want\n", "my_columns = ['id', 'title', 'title_eng', 'permalink', 'published_at', 'source', 'body_polarity', 'body_polarity_score']\n", "my_data_frame = []\n", "\n", "for story in stories:\n", " \n", " # make array of the fields we're interested in\n", " data = [\n", " story['id']\n", " , story['title']\n", " , story['translations']['en']['title']\n", " , story['links']['permalink']\n", " , story['published_at']\n", " , story['source']['domain']\n", " , story['sentiment']['body']['polarity']\n", " , story['sentiment']['body']['score']\n", " ]\n", " \n", " zipped = zip(my_columns, data)\n", " a_dictionary = dict(zipped)\n", " my_data_frame.append(a_dictionary)\n", " \n", 
"my_data_frame = pd.DataFrame(my_data_frame, columns = my_columns)\n", "\n", "my_data_frame.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Timeseries Endpoint" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pull Timeseries\n", "We have seen how we can pull granular stories using the Stories endpoint. However, if we want to investigate volumes of stories over time, we can use the Timeseries endpoint. This endpoint retrieves the stories that meet our criteria and aggregates them per minute, hour, day, month, or whatever interval we choose. This can be very useful for identifying spikes or dips in news volume relating to a subject of interest. By default, our query below will aggregate the volume of stories per day.\n", "\n", "The Timeseries endpoint outputs data in JSON format, but our function above will convert this to a pandas dataframe for legibility." ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-30DAYS',\n", " 'title': 'Citigroup'}\n", "Number of stories returned : 4,765\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " count published_at\n", "0 114 2022-10-29\n", "1 127 2022-10-30\n", "2 179 2022-10-31\n", "3 185 2022-11-01\n", "4 182 2022-11-02\n", "5 210 2022-11-03\n", "6 216 2022-11-04\n", "7 97 2022-11-05\n", "8 169 2022-11-06\n", "9 152 2022-11-07\n", "10 214 2022-11-08\n", "11 160 2022-11-09\n", "12 145 2022-11-10\n", "13 168 2022-11-11\n", "14 119 2022-11-12\n", "15 138 2022-11-13\n", "16 109 2022-11-14\n", "17 204 2022-11-15\n", "18 213 2022-11-16\n", "19 243 2022-11-17\n", "20 285 2022-11-18\n", "21 121 2022-11-19\n", "22 138 2022-11-20\n", "23 182 2022-11-21\n", "24 166 2022-11-22\n", "25 210 2022-11-23\n", "26 121 2022-11-24\n", "27 86 2022-11-25\n", "28 70 2022-11-26\n", "29 42 2022-11-27" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# define the query parameters\n", "params = {\n", " 'title': 'Citigroup',\n", " 'published_at.start':'NOW-30DAYS',\n", " 'published_at.end':'NOW',\n", "}\n", "\n", "timeseries_data = get_timeseries(params)\n", "\n", "timeseries_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualizing Timeseries\n", "We can make sense of timeseries data much more quickly if we visualize it. Below, we use the Plotly library to visualize the data. 
" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plotly.com" }, "data": [ { "line": { "color": "blue" }, "type": "scatter", "x": [ "2022-10-29", "2022-10-30", "2022-10-31", "2022-11-01", "2022-11-02", "2022-11-03", "2022-11-04", "2022-11-05", "2022-11-06", "2022-11-07", "2022-11-08", "2022-11-09", "2022-11-10", "2022-11-11", "2022-11-12", "2022-11-13", "2022-11-14", "2022-11-15", "2022-11-16", "2022-11-17", "2022-11-18", "2022-11-19", "2022-11-20", "2022-11-21", "2022-11-22", "2022-11-23", "2022-11-24", "2022-11-25", "2022-11-26", "2022-11-27" ], "y": [ 114, 127, 179, 185, 182, 210, 216, 97, 169, 152, 214, 160, 145, 168, 119, 138, 109, 204, 213, 243, 285, 121, 138, 182, 166, 210, 121, 86, 70, 42 ] } ], "layout": { "plot_bgcolor": "white", "title": { "text": "Volume of Stories Over Time" }, "xaxis": { "gridcolor": "rgb(204, 204, 204)", "linecolor": "rgb(204, 204, 204)" }, "yaxis": { "gridcolor": "rgb(204, 204, 204)", "linecolor": "rgb(204, 204, 204)" } } } }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# plot the daily story counts as a line chart\n", "fig = go.Figure( data = go.Scatter( \n", " x = timeseries_data['published_at']\n", " , y=timeseries_data['count']\n", " , line=dict(color='blue')\n", " ))\n", "# format the chart\n", "fig.update_layout(\n", " title='Volume of Stories Over Time',\n", " plot_bgcolor='white',\n", " xaxis=dict(\n", " gridcolor='rgb(204, 204, 204)',\n", " linecolor='rgb(204, 204, 204)'\n", " )\n", " , yaxis=dict(\n", " gridcolor='rgb(204, 204, 204)',\n", " linecolor='rgb(204, 204, 204)'\n", " )\n", ")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exploring Spikes in Timeseries Data\n", "We can see from the graph that there are several spikes in news volume. To explore what caused them, we can pull the stories published on those dates, sorted by the Alexa Ranking of their source, to get an indication of why Citigroup received so much attention. Alexa Ranking is an estimate of a site's popularity on the internet. Learn more about working with Alexa Ranking [here](https://docs.aylien.com/newsapi/common-workflows/#alexa-rankings).\n", "\n", "Below, we'll identify the three dates with the most stories, then pull the three highest-ranked stories for each of those dates using the same parameters we used to query the Timeseries endpoint." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'published_at.end': '2022-11-19T00:00:00Z',\n", " 'published_at.start': '2022-11-18T00:00:00Z',\n", " 'sort_by': 'source.rankings.alexa.rank',\n", " 'title': 'Citigroup'}\n", "Bank of America (NYSE:BAC) Downgraded by Citigroup to “Neutral”\n", "https://www.com-unik.info/2022/11/18/bank-of-america-nysebac-downgraded-by-citigroup-to-neutral.html\n", "Fetched 10 stories. 
Total story count so far: 10\n", "2022-11-18T07:41:24Z\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ ":37: FutureWarning:\n", "\n", "The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "{'cursor': 'OTQzNjA5Miw1MDY2MTAzNTAx',\n", " 'published_at.end': '2022-11-18T00:00:00Z',\n", " 'published_at.start': '2022-11-17T00:00:00Z',\n", " 'sort_by': 'source.rankings.alexa.rank',\n", " 'title': 'Citigroup'}\n", "Sight Sciences (NASDAQ:SGHT) Price Target Increased to $10.00 by Analysts at Citigroup\n", "https://dakotafinancialnews.com/2022/11/17/sight-sciences-nasdaqsght-price-target-increased-to-10-00-by-analysts-at-citigroup.html\n", "Fetched 10 stories. Total story count so far: 10\n", "2022-11-17T16:26:46Z\n", "{'cursor': 'MjA3NjMyNiw1MDY0MDc3ODUz',\n", " 'published_at.end': '2022-11-05T00:00:00Z',\n", " 'published_at.start': '2022-11-04T00:00:00Z',\n", " 'sort_by': 'source.rankings.alexa.rank',\n", " 'title': 'Citigroup'}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ ":37: FutureWarning:\n", "\n", "The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "The Berkeley Group (OTCMKTS:BKGFY) Rating Lowered to Neutral at Citigroup\n", "https://www.dispatchtribunal.com/2022/11/04/the-berkeley-group-otcmktsbkgfy-rating-lowered-to-neutral-at-citigroup.html\n", "Fetched 10 stories. Total story count so far: 10\n", "2022-11-04T05:28:57Z\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ ":37: FutureWarning:\n", "\n", "The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.\n", "\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " count published_at \\\n", "0 285 2022-11-18T00:00:00Z \n", "0 243 2022-11-17T00:00:00Z \n", "0 216 2022-11-04T00:00:00Z \n", "\n", " title_1 \\\n", "0 Bank of America (NYSE:BAC) Downgraded by Citig... \n", "0 Sight Sciences (NASDAQ:SGHT) Price Target Incr... \n", "0 The Berkeley Group (OTCMKTS:BKGFY) Rating Lowe... \n", "\n", " title_2 \\\n", "0 Becton, Dickinson and (NYSE:BDX) Price Target ... \n", "0 Labor Dept.: Proposed Exemption for Prohibited... \n", "0 Teck Resources (TSE:TECK.B) Cut to Neutral at ... \n", "\n", " title_3 \n", "0 Citigroup Boosts monday.com (NASDAQ:MNDY) Pric... \n", "0 Atlas Copco (OTCMKTS:ATLKY) Raised to “Buy” at... \n", "0 Pinterest (NYSE:PINS) Price Target Increased t... " ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# create dataframe to store the label data - note, the published_at and count fields are needed for x and y coords.\n", "# the count field will be populated with the total count of stories for each respective day\n", "my_columns = ['published_at', 'count', 'title_1', 'title_2', 'title_3']\n", "label_data = pd.DataFrame(columns = my_columns)\n", "\n", "# identify the three dates with the most stories\n", "top_3_dates = timeseries_data.sort_values(by=['count'], ascending = False)[0:3]\n", "\n", "# define the query parameters\n", "params = {\n", " 'title': 'Citigroup',\n", " 'published_at.start':'NOW-30DAYS',\n", " 'published_at.end':'NOW',\n", " 'sort_by' : \"source.rankings.alexa.rank\"\n", "}\n", "\n", "for index, row in top_3_dates.iterrows():\n", " \n", " # restrict the query to the 24 hours of the current date\n", " params['published_at.start'] = str(row['published_at'] ) + 'T00:00:00Z'\n", " params['published_at.end'] = str(row['published_at'] + datetime.timedelta(days=1)) + 'T00:00:00Z'\n", " \n", " # retrieve the three top-ranked stories for this date\n", " stories = get_top_ranked_stories(params, 3)\n", " \n", " print(stories[0]['published_at'])\n", " \n", " data = [[\n", " params['published_at.start'] # include the 
start date for visualization\n", " , row['count']\n", " # use function to return translated content if necessary\n", " , stories[0]['title']\n", " , stories[1]['title']\n", " , stories[2]['title']\n", " ]]\n", " \n", " data = pd.DataFrame(data, columns = my_columns)\n", " # DataFrame.append is deprecated (removed in pandas 2.0), so use pd.concat instead\n", " label_data = pd.concat([label_data, data], sort=True)\n", " \n", "label_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add Labels to Timeseries Spikes\n", "We will now add these titles to the spikes in the graph we previously created. If we hover over the markers, the tooltip will display the relevant story titles." ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plotly.com" }, "data": [ { "line": { "color": "blue" }, "name": "Volume of Stories", "type": "scatter", "x": [ "2022-10-29", "2022-10-30", "2022-10-31", "2022-11-01", "2022-11-02", "2022-11-03", "2022-11-04", "2022-11-05", "2022-11-06", "2022-11-07", "2022-11-08", "2022-11-09", "2022-11-10", "2022-11-11", "2022-11-12", "2022-11-13", "2022-11-14", "2022-11-15", "2022-11-16", "2022-11-17", "2022-11-18", "2022-11-19", "2022-11-20", "2022-11-21", "2022-11-22", "2022-11-23", "2022-11-24", "2022-11-25", "2022-11-26", "2022-11-27" ], "y": [ 114, 127, 179, 185, 182, 210, 216, 97, 169, 152, 214, 160, 145, 168, 119, 138, 109, 204, 213, 243, 285, 121, 138, 182, 166, 210, 121, 86, 70, 42 ] }, { "marker": { "color": "white", "line": { "color": "blue", "width": 2 }, "size": 10 }, "mode": "markers", "name": "Spike lable", "text": [ "285

Bank of America (NYSE:BAC) Downgraded by Citigroup to
“Neutral”

Becton, Dickinson and (NYSE:BDX) Price Target Cut to
$220.00 by Analysts at Citigroup

Citigroup Boosts monday.com (NASDAQ:MNDY) Price Target to $138.00


", "243

Sight Sciences (NASDAQ:SGHT) Price Target Increased to $10.00
by Analysts at Citigroup

Labor Dept.: Proposed Exemption for Prohibited Transaction Restrictions
Involving Citigroup, New York

Atlas Copco (OTCMKTS:ATLKY) Raised to “Buy” at Citigroup


", "216

The Berkeley Group (OTCMKTS:BKGFY) Rating Lowered to Neutral
at Citigroup

Teck Resources (TSE:TECK.B) Cut to Neutral at Citigroup


Pinterest (NYSE:PINS) Price Target Increased to $26.00 by
Analysts at Citigroup

" ], "type": "scatter", "x": [ "2022-11-18T00:00:00Z", "2022-11-17T00:00:00Z", "2022-11-04T00:00:00Z" ], "y": [ 285, 243, 216 ] } ], "layout": { "legend": { "orientation": "h", "y": -0.1 }, "plot_bgcolor": "white", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 
0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { 
"marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 
0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", 
"zerolinewidth": 2 } } }, "title": { "text": "Volume of Stories Over Time" }, "xaxis": { "gridcolor": "rgb(204, 204, 204)", "linecolor": "rgb(204, 204, 204)" }, "yaxis": { "gridcolor": "rgb(204, 204, 204)", "linecolor": "rgb(204, 204, 204)" } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# split title stings over multiple lines for legibility\n", "split_title_string(label_data, 'title_1')\n", "split_title_string(label_data, 'title_2')\n", "split_title_string(label_data, 'title_3')\n", "\n", "trace_1 = go.Scatter( \n", " x = timeseries_data['published_at']\n", " , y=timeseries_data['count']\n", " , name = 'Volume of Stories'\n", " , line=dict(color='blue')\n", " )\n", "\n", "trace_2 = go.Scatter(\n", " x = label_data['published_at']\n", " , y = label_data['count']\n", " , mode ='markers'\n", " , marker=dict(size=10,line=dict(width=2, color='blue'), color = 'white')\n", " , text = '' + label_data['count'].astype(str) + '

'\n", " + label_data['title_1_string'] + '

'\n", " + label_data['title_2_string'] + '

'\n", " + label_data['title_3_string'] + '

'\n", " , name = 'Spike lable'\n", " )\n", "\n", "data = [trace_1, trace_2]\n", "\n", "fig = go.Figure(data=data)\n", "\n", "# forrmat the chart\n", "fig.update_layout(\n", " title='Volume of Stories Over Time',\n", " legend = dict(orientation = 'h', y = -0.1),\n", " plot_bgcolor='white',\n", " xaxis=dict(\n", " gridcolor='rgb(204, 204, 204)',\n", " linecolor='rgb(204, 204, 204)'\n", " )\n", " , yaxis=dict(\n", " gridcolor='rgb(204, 204, 204)',\n", " linecolor='rgb(204, 204, 204)'\n", " )\n", ")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pull Document Timeseries by Sentiment\n", "We filter our timeseries queries in the same ways as stories, but one filter that is particularly interesting is filtering on sentiment. We have already discussed how stories are given a sentiment score at a granular level and we can use this score to pull volume of stories by title sentiment polarity over time.\n", "\n", "In the cell below, we run a function that pulls queries the Timeseries endpoint twice — once for positive sentiment stories and once for negative stories." 
] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "===========================================\n", " positive sentiment \n", "===========================================\n", "{'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-30DAYS',\n", " 'sentiment_title_polarity': 'positive',\n", " 'title': 'Citigroup'}\n", "Number of stories returned : 4,765\n", "Completed\n", "===========================================\n", " negative sentiment \n", "===========================================\n", "{'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-30DAYS',\n", " 'sentiment_title_polarity': 'negative',\n", " 'title': 'Citigroup'}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ ":27: FutureWarning:\n", "\n", "The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Number of stories returned : 4,765\n", "Completed\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ ":27: FutureWarning:\n", "\n", "The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.\n", "\n" ] }, { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>count</th>\n", "      <th>published_at</th>\n", "      <th>sentiment_title_polarity</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr>\n", "      <th>0</th>\n", "      <td>114</td>\n", "      <td>2022-10-29</td>\n", "      <td>positive</td>\n", "    </tr>\n", "    <tr>\n", "      <th>1</th>\n", "      <td>127</td>\n", "      <td>2022-10-30</td>\n", "      <td>positive</td>\n", "    </tr>\n", "    <tr>\n", "      <th>2</th>\n", "      <td>179</td>\n", "      <td>2022-10-31</td>\n", "      <td>positive</td>\n", "    </tr>\n", "    <tr>\n", "      <th>3</th>\n", "      <td>185</td>\n", "      <td>2022-11-01</td>\n", "      <td>positive</td>\n", "    </tr>\n", "    <tr>\n", "      <th>4</th>\n", "      <td>182</td>\n", "      <td>2022-11-02</td>\n", "      <td>positive</td>\n", "    </tr>\n", "  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ "   count  published_at  sentiment_title_polarity\n", "0    114    2022-10-29                  positive\n", "1    127    2022-10-30                  positive\n", "2    179    2022-10-31                  positive\n", "3    185    2022-11-01                  positive\n", "4    182    2022-11-02                  positive" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# define the query parameters\n", "params = {\n", "    'title': 'Citigroup',\n", "    'published_at.start': 'NOW-30DAYS',\n", "    'published_at.end': 'NOW'\n", "}\n", "\n", "polarities = ['positive', 'negative']\n", "\n", "# collect one dataframe per polarity, then combine them after the loop\n", "# (DataFrame.append was removed in pandas 2.0; pd.concat replaces it)\n", "frames = []\n", "\n", "for polarity in polarities:\n", "\n", "    print('===========================================')\n", "    print(' ' + polarity + ' sentiment ')\n", "    print('===========================================')\n", "\n", "    # restrict the query to stories whose title has this sentiment polarity\n", "    params['sentiment_title_polarity'] = polarity\n", "\n", "    api_response = get_timeseries(params)\n", "\n", "    # add polarity indicator\n", "    api_response['sentiment_title_polarity'] = polarity\n", "\n", "    frames.append(api_response)\n", "\n", "    print(\"Completed\")\n", "\n", "timeseries_sentiment_data = pd.concat(frames)\n", "\n", "timeseries_sentiment_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualizing Timeseries by Sentiment" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plotly.com" }, "data": [ { "fill": "tozeroy", "fillcolor": "rgba(138, 190, 6, 0.05)", "hovertemplate": "Date: %{x}<br>Stories: %{y}", "line": { "color": "green", "width": 1 }, "mode": "lines", "name": "Vol. stories positive", "type": "scatter", "x": [ "2022-10-29", "2022-10-30", "2022-10-31", "2022-11-01", "2022-11-02", "2022-11-03", "2022-11-04", "2022-11-05", "2022-11-06", "2022-11-07", "2022-11-08", "2022-11-09", "2022-11-10", "2022-11-11", "2022-11-12", "2022-11-13", "2022-11-14", "2022-11-15", "2022-11-16", "2022-11-17", "2022-11-18", "2022-11-19", "2022-11-20", "2022-11-21", "2022-11-22", "2022-11-23", "2022-11-24", "2022-11-25", "2022-11-26", "2022-11-27" ], "xaxis": "x", "y": [ 114, 127, 179, 185, 182, 210, 216, 97, 169, 152, 214, 160, 145, 168, 119, 138, 109, 204, 213, 247, 281, 121, 138, 182, 166, 210, 121, 86, 70, 42 ], "yaxis": "y" }, { "fill": "tozeroy", "fillcolor": "rgba(228, 42, 58, 0.05)", "hovertemplate": "Date: %{x}<br>Stories: %{y}", "line": { "color": "red", "width": 1 }, "mode": "lines", "name": "Vol. stories negative", "type": "scatter", "x": [ "2022-10-29", "2022-10-30", "2022-10-31", "2022-11-01", "2022-11-02", "2022-11-03", "2022-11-04", "2022-11-05", "2022-11-06", "2022-11-07", "2022-11-08", "2022-11-09", "2022-11-10", "2022-11-11", "2022-11-12", "2022-11-13", "2022-11-14", "2022-11-15", "2022-11-16", "2022-11-17", "2022-11-18", "2022-11-19", "2022-11-20", "2022-11-21", "2022-11-22", "2022-11-23", "2022-11-24", "2022-11-25", "2022-11-26", "2022-11-27" ], "xaxis": "x", "y": [ -114, -127, -179, -185, -182, -210, -216, -97, -169, -152, -214, -160, -145, -168, -119, -138, -109, -204, -213, -247, -281, -121, -138, -182, -166, -210, -121, -86, -70, -42 ], "yaxis": "y" } ], "layout": { "legend": { "orientation": "h", "y": -0.1 }, "plot_bgcolor": "white", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { 
"outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { 
"colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 
0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, 
"shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Volume of Positive & Negative Sentiment Stories Over Time" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "gridcolor": "rgb(204, 204, 204)", "linecolor": "rgb(204, 204, 204)" }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "gridcolor": "rgb(204, 204, 204)", "linecolor": "rgb(204, 204, 204)" } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "colours = {\n", " 'positive' : 'green'\n", " , 'positive_opaque' : 'rgba(138, 190, 6, 0.05)'\n", " \n", " , 'negative' : 'red'\n", " , 'negative_opaque' : 'rgba(228, 42, 58, 0.05)'\n", " \n", " , 'neutral' : 'rgb(40, 56, 78)'\n", " , 'neutral_opaque' : 'rgba(40, 56, 78, 0.05)'\n", " }\n", "\n", "\n", "# we will plot two subplots using the same axes\n", "fig = make_subplots(rows=1, cols=1)\n", "\n", "counter = 0\n", "\n", "# loop over postive and negative sentiment data to generate to line graphs\n", "# start of for loop ======================================================================================= \n", "for polarity in polarities: \n", " \n", " if polarity == 'negative':\n", " # multiply absolute number of stories by -1 to visualize negative sentiment stories\n", " factor = -1\n", " else:\n", " factor = 1\n", "\n", " # filter to the data we want to visualize based on sentiment \n", " data = timeseries_sentiment_data[timeseries_sentiment_data.sentiment_title_polarity == polarity]\n", "\n", " fig.append_trace(go.Scatter(\n", " x = data['published_at']\n", " , y = data['count']*factor\n", " , mode = 'lines'\n", " , name = 'Vol. stories '+polarity\n", " , line = dict(color = colours[polarity], width=1)\n", " , fill = 'tozeroy'\n", " , fillcolor = colours[polarity + \"_opaque\"]\n", " , hovertemplate = 'Date: %{x}
'\n", " +'Stories: %{y}'\n", " ) \n", " , col = 1\n", " , row = 1)\n", "\n", "# end of for loop =======================================================================================\n", "\n", "# forrmat the chart\n", "fig.update_layout(\n", " title='Volume of Positive & Negative Sentiment Stories Over Time',\n", " legend = dict(orientation = 'h', y = -0.1),\n", " plot_bgcolor='white',\n", " xaxis=dict(\n", " gridcolor='rgb(204, 204, 204)',\n", " linecolor='rgb(204, 204, 204)'\n", " )\n", " , yaxis=dict(\n", " gridcolor='rgb(204, 204, 204)',\n", " linecolor='rgb(204, 204, 204)'\n", " )\n", ")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualizing Entity Timeseries by Sentiment\n", "We can also track entity level sentiment over time." ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running...\n", "{'aql': 'entities:{{element:title AND surface_forms:Citigroup AND '\n", " 'sentiment:positive}}',\n", " 'period': '+7DAYS',\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-30DAYS'}\n", "Number of stories returned : 245\n", "{'aql': 'entities:{{element:title AND surface_forms:Citigroup AND '\n", " 'sentiment:neutral}}',\n", " 'period': '+7DAYS',\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-30DAYS'}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ ":21: FutureWarning:\n", "\n", "The frame.append method is deprecated and will be removed from pandas in a future version. 
Use pandas.concat instead.\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Number of stories returned : 5,390\n", "{'aql': 'entities:{{element:title AND surface_forms:Citigroup AND '\n", " 'sentiment:negative}}',\n", " 'period': '+7DAYS',\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-30DAYS'}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ ":21: FutureWarning:\n", "\n", "The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Number of stories returned : 268\n", "Complete\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ ":21: FutureWarning:\n", "\n", "The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.\n", "\n" ] } ], "source": [ "sentiments = ['positive', 'neutral', 'negative']\n", "\n", "# collect one dataframe per sentiment, then combine them after the loop\n", "# (DataFrame.append was removed in pandas 2.0; pd.concat replaces it)\n", "frames = []\n", "\n", "print('Running...')\n", "\n", "for sentiment in sentiments:\n", "\n", "    # count stories per 7-day period where the Citigroup entity in the title has this sentiment\n", "    params = {\n", "        'published_at.start': 'NOW-30DAYS'\n", "        , 'published_at.end': 'NOW'\n", "        , 'period' : '+7DAYS'\n", "        , \"aql\": \"entities:{{element:title AND surface_forms:Citigroup AND sentiment:\" + str(sentiment) + \"}}\"\n", "    }\n", "\n", "    timeseries = get_timeseries(params)\n", "    timeseries['sentiment'] = sentiment\n", "    frames.append(timeseries)\n", "\n", "# ignore_index=True gives the combined dataframe a fresh 0..n-1 index\n", "my_data_frame = pd.concat(frames, ignore_index=True)\n", "\n", "print('Complete')" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plotly.com" }, "data": [ { "line": { "color": "green", "width": 1 }, "mode": "lines", "name": "positive", "showlegend": 
true, "type": "scatter", "x": [ "2022-10-29", "2022-11-05", "2022-11-12", "2022-11-19", "2022-11-26" ], "xaxis": "x", "y": [ 37, 62, 74, 65, 7 ], "yaxis": "y" }, { "line": { "color": "#A2CAE5", "width": 1 }, "mode": "lines", "name": "neutral", "showlegend": true, "type": "scatter", "x": [ "2022-10-29", "2022-11-05", "2022-11-12", "2022-11-19", "2022-11-26" ], "xaxis": "x", "y": [ 1358, 1289, 1439, 1176, 128 ], "yaxis": "y" }, { "line": { "color": "red", "width": 1 }, "mode": "lines", "name": "negative", "showlegend": true, "type": "scatter", "x": [ "2022-10-29", "2022-11-05", "2022-11-12", "2022-11-19", "2022-11-26" ], "xaxis": "x", "y": [ 82, 70, 40, 72, 4 ], "yaxis": "y" } ], "layout": { "legend": { "orientation": "h", "y": -0.05 }, "plot_bgcolor": "white", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { 
"colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { 
"automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, 
"#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": 
"white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Sentiment Over Time" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ] }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ] } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# we will plot two subplots using the same axes\n", "fig = make_subplots(rows=1, cols=1)\n", "\n", "colours = {'positive': 'green', 'neutral':'#A2CAE5', 'negative':'red'}\n", "\n", "# loop over postive and negative sentiment data to generate to line graphs\n", "# start of for loop ======================================================================================= \n", "for sentiment in sentiments:\n", "\n", " mask = (my_data_frame['sentiment'] == sentiment)\n", "\n", " data = my_data_frame.loc[mask]\n", "\n", " fig.add_trace(\n", " go.Scatter(\n", " x = data['published_at']\n", " , y = data['count']\n", " , line = dict(color = colours[sentiment], width=1)\n", " , mode = 'lines'\n", " , name = sentiment\n", " , showlegend = True \n", " ) \n", " , row=1\n", " , col=1 \n", " )\n", "\n", " \n", "\n", "# end of for loop =======================================================================================\n", "\n", "# forrmat the chart\n", "fig.update_layout(\n", " title='Sentiment Over Time',\n", " legend = dict(orientation = 'h', y = -0.05),\n", " plot_bgcolor='white'\n", " \n", ")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Trends Endpoint\n", "Similar to the Timeseries endpoint, we may be interested in seeing themes and patterns over time that aren't immediately apparent when looking at individual stories. The Trends endpoint allows us to see the most frequently recurring entities, concepts or keywords that appear in articles that meet our search criteria.\n", "\n", "Below we will pull the most frequently occuring entities in the body of stories mentioning Citigroup over a month. \n", "\n", "Note- this query will take longer to run than previous endpoints as the News API is performing analysis on all entities included in all the stories that meet our search citeria. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pull Trends" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running...\n", "Completed\n", "{'field': 'entities.body.surface_forms.text',\n", " 'published_at.end': '2022-11-28T17:03:04.054697Z',\n", " 'published_at.start': '2022-10-29T17:03:04.054631Z',\n", " 'trends': [{'count': 4529, 'value': 'Citigroup'},\n", " {'count': 3415, 'value': 'MarketBeat.com'},\n", " {'count': 2425, 'value': 'Fly'},\n", " {'count': 1900, 'value': 'NYSE'},\n", " {'count': 1651, 'value': 'Receive News & Ratings'},\n", " {'count': 1206, 'value': 'Morgan Stanley'},\n", " {'count': 1183, 'value': 'MarketBeat'},\n", " {'count': 1174, 'value': 'United States'},\n", " {'count': 1100, 'value': 'NASDAQ'},\n", " {'count': 1033, 'value': 'Hold'},\n", " {'count': 1024, 'value': 'Barclays'},\n", " {'count': 1011, 'value': 'Moderate Buy'},\n", " {'count': 1005, 'value': 'PE'},\n", " {'count': 999, 'value': 'LLC'},\n", " {'count': 996, 'value': 'Shares'},\n", " {'count': 955, 'value': 'SEC'},\n", " {'count': 862, 'value': 'EPS'},\n", " {'count': 833, 'value': 'Inc'},\n", " {'count': 829, 'value': 'Europe'},\n", " {'count': 814, 'value': 'JPMorgan Chase & Co.'},\n", " {'count': 774, 'value': 'Goldman Sachs Group'},\n", " {'count': 690, 'value': 'Credit Suisse Group'},\n", " {'count': 688, 'value': 'Deutsche Bank'},\n", " {'count': 670, 'value': 'Securities & Exchange Commission'},\n", " {'count': 535, 'value': 'Hedge'},\n", " {'count': 505, 'value': 'North America'},\n", " {'count': 496, 'value': 'Africa'},\n", " {'count': 496, 'value': 'PEG'},\n", " {'count': 490, 'value': 'Royal Bank of Canada'},\n", " {'count': 463, 'value': 'the Middle East'},\n", " {'count': 461, 'value': 'Bank of America'},\n", " {'count': 458, 'value': 'Asia'},\n", " {'count': 453, 'value': 'Cowen'},\n", " {'count': 447, 'value': '12-month'},\n", " {'count': 416, 'value': 
'UBS'},\n", " {'count': 413, 'value': 'Jefferies Financial Group'},\n", " {'count': 404, 'value': 'Wells Fargo'},\n", " {'count': 388, 'value': '1-year'},\n", " {'count': 387, 'value': '“Moderate Buy'},\n", " {'count': 377, 'value': 'Latin America'},\n", " {'count': 377, 'value': 'TheStreet'},\n", " {'count': 371, 'value': '52-week'},\n", " {'count': 360, 'value': 'Piper Sandler'},\n", " {'count': 338, 'value': 'Truist'},\n", " {'count': 338, 'value': 'Vanguard Group Inc'},\n", " {'count': 306, 'value': 'JavaScript'},\n", " {'count': 306, 'value': 'Raymond James'},\n", " {'count': 288, 'value': 'State Street'},\n", " {'count': 281, 'value': 'Stock Target Advisor'},\n", " {'count': 264, 'value': 'Robert W. Baird'},\n", " {'count': 258, 'value': 'See'},\n", " {'count': 251, 'value': 'BlackRock Inc.'},\n", " {'count': 248, 'value': 'Stifel Nicolaus'},\n", " {'count': 241, 'value': 'GBX'},\n", " {'count': 235, 'value': 'Directors'},\n", " {'count': 229, 'value': 'GCB'},\n", " {'count': 228, 'value': 'ICG'},\n", " {'count': 228, 'value': 'Institutional Clients Group'},\n", " {'count': 228, 'value': 'Oppenheimer'},\n", " {'count': 227, 'value': 'Global Consumer Banking'},\n", " {'count': 226, 'value': 'DPR'},\n", " {'count': 221, 'value': 'Insider Buying'},\n", " {'count': 210, 'value': 'BMO'},\n", " {'count': 200, 'value': 'China'},\n", " {'count': 197, 'value': 'US'},\n", " {'count': 193, 'value': 'Featured Stories'},\n", " {'count': 189, 'value': 'Medium'},\n", " {'count': 187, 'value': 'Mizuho'},\n", " {'count': 183, 'value': '1.58'},\n", " {'count': 183, 'value': 'C – Get Rating'},\n", " {'count': 172, 'value': 'StockNews.com'},\n", " {'count': 171, 'value': 'Institutional'},\n", " {'count': 170, 'value': 'U.S.'},\n", " {'count': 168, 'value': 'Citi'},\n", " {'count': 165, 'value': 'Canada'},\n", " {'count': 164, 'value': 'KeyCorp'},\n", " {'count': 155, 'value': 'Insider Activity'},\n", " {'count': 155, 'value': 'MT Newswires'},\n", " {'count': 153, 'value': 
'MarketScreener'},\n", " {'count': 152, 'value': 'United Kingdom'},\n", " {'count': 147, 'value': 'ZoneBourse'},\n", " {'count': 146, 'value': 'Berenberg Bank'},\n", " {'count': 141, 'value': 'Americas'},\n", " {'count': 141, 'value': 'Captrust Financial Advisors'},\n", " {'count': 141, 'value': 'Reuters'},\n", " {'count': 138, 'value': 'Mexico'},\n", " {'count': 136, 'value': 'Asia Pacific'},\n", " {'count': 136, 'value': 'Australia'},\n", " {'count': 135, 'value': 'Benchmark'},\n", " {'count': 133, 'value': 'New York'},\n", " {'count': 131, 'value': 'BNP Paribas'},\n", " {'count': 128, 'value': 'Course Objective Mean'},\n", " {'count': 128, 'value': 'Institutional Inflows'},\n", " {'count': 122, 'value': 'Wedbush'},\n", " {'count': 118, 'value': 'INC'},\n", " {'count': 117, 'value': 'Needham & Company LLC'},\n", " {'count': 116, 'value': 'CWM LLC'},\n", " {'count': 116, 'value': 'View The Five Stocks Here'},\n", " {'count': 110, 'value': 'DA Davidson'},\n", " {'count': 107, 'value': 'HoldingsChannel.com'}]}\n" ] } ], "source": [ "# define the query parameters\n", "params = {\n", " 'title': 'Citigroup',\n", " 'published_at.start':'NOW-30DAYS',\n", " 'published_at.end':'NOW',\n", " 'field' : 'entities.body.surface_forms.text'\n", "}\n", "\n", "print(\"Running...\")\n", "\n", "trends = get_trends(params)\n", "\n", "print(\"Completed\")\n", "pprint(trends)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create Wordcloud From Trends\n", "We can visualize the output of the Trends endpoint as a wordcloud to help us quickly interpret the most prevalent keywords. " ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "scrolled": false }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "from PIL import Image\n", "from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator\n", "import matplotlib.pyplot as plt\n", "\n", "#convert data to dataframe for visualization\n", "trends_data = pd.DataFrame(trends['trends'])\n", "\n", "subset = trends_data[['value', 'count']]\n", "tuples = [tuple(x) for x in subset.values]\n", "\n", "# Custom Colormap\n", "from matplotlib.colors import ListedColormap # use when indexing directly yo a colour map\n", "\n", "word_colours = [ \n", " \"#495B70\" # aylien navy\n", " , \"#8BBE07\" # aylien green\n", " , \"#7A98B7\" # grey\n", " , \"#E77C05\" # orange\n", " , \"#0796BE\" # blue\n", " , \"#162542\" # dark grey\n", " ]\n", "\n", "# listed colour map\n", "cmap = ListedColormap(word_colours)\n", "\n", "wordcloud = WordCloud(background_color=\"white\", width=800, height=400, colormap=cmap).generate_from_frequencies(dict(tuples))\n", "plt.figure( figsize=(20,10) )\n", "plt.imshow(wordcloud, interpolation=\"bilinear\")\n", "plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Analysing Trends Over Time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have used a wordcloud to invesitage the most prominent entities in a one month period, but what if we want to investigate the frequency of mentions over time?\n", "\n", "We can loop over the Trends endpoint and create a timeseries to investigate the distribution of entities over time.\n", "\n", "First we will create a function to create a list of tupples containing daily intervals to allow us to search for trends daily within a defined period." 
] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [], "source": [ "# the time format we need to submit for News API queries\n", "AYLIEN_TIME_FORMAT = '%Y-%m-%dT%H:%M:%SZ'\n", "\n", "def to_date(date):\n", "    # accept either a datetime or a 'YYYY-MM-DD' string\n", "    if not isinstance(date, datetime.datetime):\n", "        date = str2date(date)\n", "    return date.strftime(AYLIEN_TIME_FORMAT)\n", "\n", "def str2date(string):\n", "    return datetime.datetime.strptime(string, '%Y-%m-%d')\n", "\n", "def get_intervals(start_date, end_date):\n", "    # build one (start, end) tuple per day between the two dates\n", "    start_date = str2date(start_date)\n", "    end_date = str2date(end_date)\n", "    return [(to_date(start_date + datetime.timedelta(days=d)),\n", "             to_date(start_date + datetime.timedelta(days=d + 1)))\n", "            for d in range((end_date - start_date).days)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we will define our date range, create a list of date tuples and iterate over those daily intervals to populate a dataframe that relates each entity, the number of times it was mentioned and the day the mentions occurred. " ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "scrolled": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 30/30 [04:11<00:00, 8.39s/it]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Completed\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "# define our daily intervals\n", "today = datetime.date.today().strftime('%Y-%m-%d')\n", "last_month = (datetime.date.today() - datetime.timedelta(days=30)).strftime('%Y-%m-%d')\n", "\n", "day_intervals = get_intervals(last_month, today)\n", "\n", "# the columns we want in the final dataframe\n", "my_columns = ['count', 'value', 'published_at']\n", "\n", "# collect one dataframe per day, then combine them at the end\n", "daily_frames = []\n", "\n", "# define the query parameters, including the field we want trends for\n", "params = {\n", "    'title': 'Citigroup'\n", "    , 'field' : 'entities.body.surface_forms.text'\n", "}\n", "\n", "for day in tqdm(day_intervals):\n", "    \n", "    # define time interval\n", "    params['published_at.start'] = day[0]\n", "    params['published_at.end'] = day[1]\n", "\n", "    api_response = get_trends(params)\n", "    \n", "    # convert to dataframe\n", "    api_response = pd.DataFrame(api_response['trends'])\n", "\n", "    # add in a day label\n", "    api_response['published_at'] = params['published_at.start']\n", "    \n", "    daily_frames.append(api_response)\n", "\n", "# combine the daily results (DataFrame.append was removed in pandas 2.0)\n", "trends_data_frame = pd.concat(daily_frames)[my_columns]\n", "\n", "print(\"Completed\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can loop over this dataframe and visualize the distribution of the different entities. Note that the code below visualizes only the top 10 entities. 
" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plotly.com" }, "data": [ { "mode": "lines", "name": "Citigroup", "type": "scatter", "x": [ "2022-10-29T00:00:00Z", "2022-10-30T00:00:00Z", "2022-10-31T00:00:00Z", "2022-11-01T00:00:00Z", "2022-11-02T00:00:00Z", "2022-11-03T00:00:00Z", "2022-11-04T00:00:00Z", "2022-11-05T00:00:00Z", "2022-11-06T00:00:00Z", "2022-11-07T00:00:00Z", "2022-11-08T00:00:00Z", "2022-11-09T00:00:00Z", "2022-11-10T00:00:00Z", "2022-11-11T00:00:00Z", "2022-11-12T00:00:00Z", "2022-11-13T00:00:00Z", "2022-11-14T00:00:00Z", "2022-11-15T00:00:00Z", "2022-11-16T00:00:00Z", "2022-11-17T00:00:00Z", "2022-11-18T00:00:00Z", "2022-11-19T00:00:00Z", "2022-11-20T00:00:00Z", "2022-11-21T00:00:00Z", "2022-11-22T00:00:00Z", "2022-11-23T00:00:00Z", "2022-11-24T00:00:00Z", "2022-11-25T00:00:00Z", "2022-11-26T00:00:00Z", "2022-11-27T00:00:00Z" ], "xaxis": "x", "y": [ 127, 112, 144, 160, 178, 173, 226, 163, 90, 185, 164, 174, 153, 161, 110, 108, 147, 136, 214, 213, 229, 197, 123, 160, 156, 214, 133, 120, 77, 64 ], "yaxis": "y" }, { "mode": "lines", "name": "MarketBeat.com", "type": "scatter", "x": [ "2022-10-29T00:00:00Z", "2022-10-30T00:00:00Z", "2022-10-31T00:00:00Z", "2022-11-01T00:00:00Z", "2022-11-02T00:00:00Z", "2022-11-03T00:00:00Z", "2022-11-04T00:00:00Z", "2022-11-05T00:00:00Z", "2022-11-06T00:00:00Z", "2022-11-07T00:00:00Z", "2022-11-08T00:00:00Z", "2022-11-09T00:00:00Z", "2022-11-10T00:00:00Z", "2022-11-11T00:00:00Z", "2022-11-12T00:00:00Z", "2022-11-13T00:00:00Z", "2022-11-14T00:00:00Z", "2022-11-15T00:00:00Z", "2022-11-16T00:00:00Z", "2022-11-17T00:00:00Z", "2022-11-18T00:00:00Z", "2022-11-19T00:00:00Z", "2022-11-20T00:00:00Z", "2022-11-21T00:00:00Z", "2022-11-22T00:00:00Z", "2022-11-23T00:00:00Z", "2022-11-24T00:00:00Z", "2022-11-25T00:00:00Z", "2022-11-26T00:00:00Z", "2022-11-27T00:00:00Z" ], "xaxis": "x", "y": [ 122, 99, 115, 
110, 127, 133, 190, 156, 86, 137, 108, 67, 107, 111, 108, 100, 108, 98, 158, 176, 184, 181, 120, 102, 93, 96, 99, 100, 73, 50 ], "yaxis": "y" }, { "mode": "lines", "name": "Fly", "type": "scatter", "x": [ "2022-10-29T00:00:00Z", "2022-10-30T00:00:00Z", "2022-10-31T00:00:00Z", "2022-11-01T00:00:00Z", "2022-11-02T00:00:00Z", "2022-11-03T00:00:00Z", "2022-11-04T00:00:00Z", "2022-11-05T00:00:00Z", "2022-11-06T00:00:00Z", "2022-11-07T00:00:00Z", "2022-11-08T00:00:00Z", "2022-11-09T00:00:00Z", "2022-11-10T00:00:00Z", "2022-11-11T00:00:00Z", "2022-11-12T00:00:00Z", "2022-11-13T00:00:00Z", "2022-11-14T00:00:00Z", "2022-11-15T00:00:00Z", "2022-11-16T00:00:00Z", "2022-11-17T00:00:00Z", "2022-11-18T00:00:00Z", "2022-11-19T00:00:00Z", "2022-11-20T00:00:00Z", "2022-11-21T00:00:00Z", "2022-11-22T00:00:00Z", "2022-11-23T00:00:00Z", "2022-11-24T00:00:00Z", "2022-11-25T00:00:00Z", "2022-11-26T00:00:00Z", "2022-11-27T00:00:00Z" ], "xaxis": "x", "y": [ 76, 56, 73, 88, 70, 100, 150, 107, 66, 107, 79, 58, 98, 94, 82, 79, 91, 59, 115, 111, 111, 106, 66, 71, 77, 86, 67, 67, 48, 24 ], "yaxis": "y" }, { "mode": "lines", "name": "NYSE", "type": "scatter", "x": [ "2022-10-29T00:00:00Z", "2022-10-30T00:00:00Z", "2022-10-31T00:00:00Z", "2022-11-01T00:00:00Z", "2022-11-02T00:00:00Z", "2022-11-03T00:00:00Z", "2022-11-04T00:00:00Z", "2022-11-05T00:00:00Z", "2022-11-06T00:00:00Z", "2022-11-07T00:00:00Z", "2022-11-08T00:00:00Z", "2022-11-09T00:00:00Z", "2022-11-10T00:00:00Z", "2022-11-11T00:00:00Z", "2022-11-12T00:00:00Z", "2022-11-13T00:00:00Z", "2022-11-14T00:00:00Z", "2022-11-15T00:00:00Z", "2022-11-16T00:00:00Z", "2022-11-17T00:00:00Z", "2022-11-18T00:00:00Z", "2022-11-19T00:00:00Z", "2022-11-20T00:00:00Z", "2022-11-21T00:00:00Z", "2022-11-22T00:00:00Z", "2022-11-23T00:00:00Z", "2022-11-24T00:00:00Z", "2022-11-25T00:00:00Z", "2022-11-26T00:00:00Z", "2022-11-27T00:00:00Z" ], "xaxis": "x", "y": [ 73, 65, 74, 72, 83, 87, 99, 76, 47, 67, 54, 39, 38, 40, 43, 37, 41, 51, 81, 99, 96, 101, 63, 54, 61, 
88, 65, 69, 53, 33 ], "yaxis": "y" }, { "mode": "lines", "name": "Receive News & Ratings", "type": "scatter", "x": [ "2022-10-29T00:00:00Z", "2022-10-30T00:00:00Z", "2022-10-31T00:00:00Z", "2022-11-01T00:00:00Z", "2022-11-02T00:00:00Z", "2022-11-03T00:00:00Z", "2022-11-04T00:00:00Z", "2022-11-05T00:00:00Z", "2022-11-06T00:00:00Z", "2022-11-07T00:00:00Z", "2022-11-08T00:00:00Z", "2022-11-09T00:00:00Z", "2022-11-10T00:00:00Z", "2022-11-11T00:00:00Z", "2022-11-12T00:00:00Z", "2022-11-13T00:00:00Z", "2022-11-14T00:00:00Z", "2022-11-15T00:00:00Z", "2022-11-16T00:00:00Z", "2022-11-17T00:00:00Z", "2022-11-18T00:00:00Z", "2022-11-19T00:00:00Z", "2022-11-20T00:00:00Z", "2022-11-21T00:00:00Z", "2022-11-22T00:00:00Z", "2022-11-23T00:00:00Z", "2022-11-24T00:00:00Z", "2022-11-25T00:00:00Z", "2022-11-26T00:00:00Z", "2022-11-27T00:00:00Z" ], "xaxis": "x", "y": [ 65, 63, 56, 55, 71, 66, 89, 71, 43, 80, 54, 36, 56, 62, 46, 52, 53, 44, 73, 74, 78, 70, 54, 40, 44, 47, 52, 46, 41, 22 ], "yaxis": "y" }, { "mode": "lines", "name": "Morgan Stanley", "type": "scatter", "x": [ "2022-10-29T00:00:00Z", "2022-10-30T00:00:00Z", "2022-10-31T00:00:00Z", "2022-11-01T00:00:00Z", "2022-11-02T00:00:00Z", "2022-11-03T00:00:00Z", "2022-11-04T00:00:00Z", "2022-11-05T00:00:00Z", "2022-11-06T00:00:00Z", "2022-11-07T00:00:00Z", "2022-11-08T00:00:00Z", "2022-11-09T00:00:00Z", "2022-11-10T00:00:00Z", "2022-11-11T00:00:00Z", "2022-11-12T00:00:00Z", "2022-11-13T00:00:00Z", "2022-11-14T00:00:00Z", "2022-11-15T00:00:00Z", "2022-11-16T00:00:00Z", "2022-11-17T00:00:00Z", "2022-11-18T00:00:00Z", "2022-11-19T00:00:00Z", "2022-11-20T00:00:00Z", "2022-11-21T00:00:00Z", "2022-11-22T00:00:00Z", "2022-11-23T00:00:00Z", "2022-11-24T00:00:00Z", "2022-11-25T00:00:00Z", "2022-11-26T00:00:00Z", "2022-11-27T00:00:00Z" ], "xaxis": "x", "y": [ 37, 27, 32, 42, 68, 57, 68, 50, 24, 39, 35, 24, 30, 40, 32, 25, 48, 45, 53, 64, 51, 64, 35, 27, 29, 72, 32, 29, 29, 30 ], "yaxis": "y" }, { "mode": "lines", "name": "MarketBeat", "type": 
"scatter", "x": [ "2022-10-29T00:00:00Z", "2022-10-30T00:00:00Z", "2022-10-31T00:00:00Z", "2022-11-01T00:00:00Z", "2022-11-02T00:00:00Z", "2022-11-03T00:00:00Z", "2022-11-04T00:00:00Z", "2022-11-05T00:00:00Z", "2022-11-06T00:00:00Z", "2022-11-07T00:00:00Z", "2022-11-08T00:00:00Z", "2022-11-09T00:00:00Z", "2022-11-10T00:00:00Z", "2022-11-11T00:00:00Z", "2022-11-12T00:00:00Z", "2022-11-13T00:00:00Z", "2022-11-14T00:00:00Z", "2022-11-15T00:00:00Z", "2022-11-16T00:00:00Z", "2022-11-17T00:00:00Z", "2022-11-18T00:00:00Z", "2022-11-19T00:00:00Z", "2022-11-20T00:00:00Z", "2022-11-21T00:00:00Z", "2022-11-22T00:00:00Z", "2022-11-23T00:00:00Z", "2022-11-24T00:00:00Z", "2022-11-25T00:00:00Z", "2022-11-26T00:00:00Z", "2022-11-27T00:00:00Z" ], "xaxis": "x", "y": [ 38, 36, 40, 55, 48, 45, 67, 39, 34, 45, 39, 27, 36, 46, 32, 34, 32, 36, 55, 65, 56, 45, 43, 38, 21, 40, 24, 36, 29, 22 ], "yaxis": "y" }, { "mode": "lines", "name": "United States", "type": "scatter", "x": [ "2022-10-29T00:00:00Z", "2022-10-30T00:00:00Z", "2022-10-31T00:00:00Z", "2022-11-01T00:00:00Z", "2022-11-02T00:00:00Z", "2022-11-03T00:00:00Z", "2022-11-04T00:00:00Z", "2022-11-05T00:00:00Z", "2022-11-06T00:00:00Z", "2022-11-07T00:00:00Z", "2022-11-08T00:00:00Z", "2022-11-09T00:00:00Z", "2022-11-10T00:00:00Z", "2022-11-11T00:00:00Z", "2022-11-12T00:00:00Z", "2022-11-13T00:00:00Z", "2022-11-14T00:00:00Z", "2022-11-15T00:00:00Z", "2022-11-16T00:00:00Z", "2022-11-17T00:00:00Z", "2022-11-18T00:00:00Z", "2022-11-19T00:00:00Z", "2022-11-20T00:00:00Z", "2022-11-21T00:00:00Z", "2022-11-22T00:00:00Z", "2022-11-23T00:00:00Z", "2022-11-24T00:00:00Z", "2022-11-25T00:00:00Z", "2022-11-26T00:00:00Z", "2022-11-27T00:00:00Z" ], "xaxis": "x", "y": [ 38, 26, 44, 44, 33, 39, 61, 58, 30, 49, 41, 31, 31, 26, 24, 25, 32, 40, 68, 72, 57, 61, 24, 24, 40, 40, 27, 43, 29, 25 ], "yaxis": "y" }, { "mode": "lines", "name": "NASDAQ", "type": "scatter", "x": [ "2022-10-29T00:00:00Z", "2022-10-30T00:00:00Z", "2022-10-31T00:00:00Z", 
"2022-11-01T00:00:00Z", "2022-11-02T00:00:00Z", "2022-11-03T00:00:00Z", "2022-11-04T00:00:00Z", "2022-11-05T00:00:00Z", "2022-11-06T00:00:00Z", "2022-11-07T00:00:00Z", "2022-11-08T00:00:00Z", "2022-11-09T00:00:00Z", "2022-11-10T00:00:00Z", "2022-11-11T00:00:00Z", "2022-11-12T00:00:00Z", "2022-11-13T00:00:00Z", "2022-11-14T00:00:00Z", "2022-11-15T00:00:00Z", "2022-11-16T00:00:00Z", "2022-11-17T00:00:00Z", "2022-11-18T00:00:00Z", "2022-11-19T00:00:00Z", "2022-11-20T00:00:00Z", "2022-11-21T00:00:00Z", "2022-11-22T00:00:00Z", "2022-11-23T00:00:00Z", "2022-11-24T00:00:00Z", "2022-11-25T00:00:00Z", "2022-11-26T00:00:00Z", "2022-11-27T00:00:00Z" ], "xaxis": "x", "y": [ 46, 32, 37, 28, 33, 31, 69, 56, 30, 61, 44, 19, 58, 52, 40, 49, 48, 31, 52, 58, 45, 34, 24, 26, 19, 29, 31, 22, 14, 14 ], "yaxis": "y" }, { "mode": "lines", "name": "Hold", "type": "scatter", "x": [ "2022-10-29T00:00:00Z", "2022-10-30T00:00:00Z", "2022-10-31T00:00:00Z", "2022-11-01T00:00:00Z", "2022-11-02T00:00:00Z", "2022-11-03T00:00:00Z", "2022-11-04T00:00:00Z", "2022-11-05T00:00:00Z", "2022-11-06T00:00:00Z", "2022-11-07T00:00:00Z", "2022-11-08T00:00:00Z", "2022-11-09T00:00:00Z", "2022-11-10T00:00:00Z", "2022-11-11T00:00:00Z", "2022-11-12T00:00:00Z", "2022-11-13T00:00:00Z", "2022-11-14T00:00:00Z", "2022-11-15T00:00:00Z", "2022-11-16T00:00:00Z", "2022-11-17T00:00:00Z", "2022-11-18T00:00:00Z", "2022-11-19T00:00:00Z", "2022-11-20T00:00:00Z", "2022-11-21T00:00:00Z", "2022-11-22T00:00:00Z", "2022-11-23T00:00:00Z", "2022-11-24T00:00:00Z", "2022-11-25T00:00:00Z", "2022-11-26T00:00:00Z", "2022-11-27T00:00:00Z" ], "xaxis": "x", "y": [ 26, 18, 39, 35, 50, 53, 53, 38, 23, 34, 35, 21, 39, 39, 39, 35, 39, 29, 56, 58, 41, 34, 33, 30, 11, 47, 26, 33, 20, 12 ], "yaxis": "y" } ], "layout": { "height": 700, "legend": { "orientation": "h", "y": -0.1 }, "plot_bgcolor": "white", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", 
"width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": 
"histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, 
"ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": 
"#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Trending Entities Over Time" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "gridcolor": "rgb(204, 204, 204)", "linecolor": "rgb(204, 204, 204)" }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "gridcolor": "rgb(204, 204, 204)", "linecolor": "rgb(204, 204, 204)" } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# we will plot two subplots using the same axes\n", "fig = make_subplots(rows=1, cols=1)\n", "\n", "# identify the top ten entities\n", "entities_total = trends_data_frame.groupby(['value'])['count'].agg('sum').reset_index().sort_values(by=['count'], ascending = False)\n", "\n", "top_ten_entities = entities_total[0:10]['value'].unique()\n", "\n", "# loop over postive and negative sentiment data to generate to line graphs\n", "# start of for loop ======================================================================================= \n", "for entity in top_ten_entities:\n", "\n", " # filter to the data we want to visualize based on sentiment \n", " data = trends_data_frame[trends_data_frame['value'] == entity]\n", "\n", " fig.append_trace(go.Scatter(\n", " x = data['published_at']\n", " , y = data['count']\n", " , mode = 'lines'\n", " , name = entity\n", " ) \n", " , col = 1\n", " , row = 1)\n", "\n", "# end of for loop =======================================================================================\n", "\n", "# forrmat the chart\n", "fig.update_layout(\n", " title='Trending Entities Over Time',\n", " legend = dict(orientation = 'h', y = -0.1),\n", " plot_bgcolor='white',\n", " xaxis=dict(\n", " gridcolor='rgb(204, 204, 204)',\n", " linecolor='rgb(204, 204, 204)'\n", " )\n", " , yaxis=dict(\n", " gridcolor='rgb(204, 204, 204)',\n", " linecolor='rgb(204, 204, 204)'\n", " )\n", " , height=700\n", ")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Clusters Endpoint\n", "Naturally, multiple news stories will exist that report on the same or similar topics. AYLIEN's clustering enrichment groups stories together that typically correspond to real-world events or topics. 
Clusters are made up of stories that sit close to one another in vector space, and the clustering enrichment links each cluster to a \"representative story\" at the centre of the cluster. Reading this representative story gives an indication of the general nature of the entire cluster.\n", "\n", "Similar to the Timeseries and Trends endpoints, clusters enable us to review stories over time and identify points of interest. We can search for an individual cluster using its cluster ID but, as with stories, we will generally not know the IDs of interest before we find them. Instead, we can search for clusters using the Trends endpoint, which allows us to filter clusters based on the stories they contain.\n", "\n", "The Trends endpoint returns cluster IDs sorted by the number of stories associated with each cluster. Once we have each cluster’s ID, we can retrieve its stories from the Stories endpoint. Note that the Trends endpoint only returns the top 100 clusters for a given query.\n", "\n", "The following script uses the Trends endpoint to identify clusters of news that feature the Citigroup entity and returns the top 3 stories in each cluster, ranked by Alexa rank. " ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'aql': 'entities: {{surface_forms.text:\"Citigroup\" AND '\n", " 'overall_prominence:[0.7 TO *]}}',\n", " 'field': 'clusters',\n", " 'published_at.end': 'NOW',\n", " 'published_at.start': 'NOW-30DAYS'}\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countvalue
072403153946
146412444374
242404373226
338406971372
436402698173
.........
959401457503
969402491999
979403191325
989403481293
999403944021
\n", "

100 rows × 2 columns

\n", "
" ], "text/plain": [ " count value\n", "0 72 403153946\n", "1 46 412444374\n", "2 42 404373226\n", "3 38 406971372\n", "4 36 402698173\n", ".. ... ...\n", "95 9 401457503\n", "96 9 402491999\n", "97 9 403191325\n", "98 9 403481293\n", "99 9 403944021\n", "\n", "[100 rows x 2 columns]" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# define the query parameters\n", "params = {\n", " 'published_at.start':'NOW-30DAYS',\n", " 'published_at.end':'NOW',\n", " 'aql': 'entities: {{surface_forms.text:\"Citigroup\" AND overall_prominence:[0.7 TO *]}}',\n", " 'field' : 'clusters'\n", "}\n", "\n", "cluster_ids = get_cluster_from_trends(params)\n", "cluster_ids = pd.DataFrame(cluster_ids)\n", "cluster_ids\n" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "scrolled": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 100/100 [04:42<00:00, 2.82s/it]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Complete\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "clusters_output = []\n", "\n", "for cluster_id in tqdm(cluster_ids['value'].unique()):\n", " # get cluster\n", " params = {'id[]' : [cluster_id]}\n", " cluster = get_clusters(params)['clusters']\n", " \n", " # get top alexa ranked stories associated with the cluster \n", " stories = get_top_stories_in_cluster(cluster_id, 3)\n", " \n", " cluster[0]['stories'] = stories\n", " \n", " clusters_output.extend(cluster)\n", " \n", " time.sleep(1)\n", " \n", "print('Complete')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we look at the first 3 clusters returned, we can see the number of stories associated with each cluster, the representative story title and the top 3 ranked stories." 
] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cluster ID: 403153946\n", "Story Count: 80\n", "Representative Story Title: Citigroup to purchase Deutsche Bank's Mexico license\n", "Top ranked stories in cluster:\n", " > Citigroup to acquire Deutsche Bank's licence in Mexico\n", " > Citigroup to Buy Deutsche Bank’s License in Mexico - Bloomberg\n", " > Deutsche Bank expands support for Egypt’s sustainability ambitions\n", "\n", "Cluster ID: 412444374\n", "Story Count: 53\n", "Representative Story Title: Reguladores instan a Citigroup a corregir plan de quiebra\n", "Top ranked stories in cluster:\n", " > Agencies announce results of resolution plan review for largest and most complex domestic banks\n", " > Bank regulators identify shortcoming in Citigroup resolution plan\n", " > Bank regulators tell Citigroup to take urgent action to fix resolution plan\n", "\n", "Cluster ID: 404373226\n", "Story Count: 63\n", "Representative Story Title: Carrols Restaurant Group (NASDAQ:TAST) Price Target Cut to $4.00 by Analysts at Deutsche Bank Aktiengesellschaft\n", "Top ranked stories in cluster:\n", " > First Watch Restaurant Group (NASDAQ:FWRG) Price Target Cut to $22.00\n", " > Barclays Raises First Watch Restaurant Group (NASDAQ:FWRG) Price Target to $20.00\n", " > Payoneer Global (NASDAQ:PAYO) PT Raised to $10.00\n", "\n" ] } ], "source": [ "for cluster in clusters_output[0:3]:\n", " print('Cluster ID: ' + str(cluster['id']))\n", " print('Story Count: ' + str(cluster['story_count']))\n", " print('Representative Story Title: ' + str(cluster['representative_story']['title']))\n", " print('Top ranked stories in cluster:')\n", " for story in cluster['stories']:\n", " indent_string = ' > '\n", " print(indent_string + story['title'])\n", " \n", " print('')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualizing Cluster Data\n", "We can easily visualize the cluster data 
to make it easier to digest and understand. Below we'll convert it to a Pandas dataframe and then visualize it with Plotly. " ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ ":17: FutureWarning:\n", "\n", "The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.\n", "\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
indexcluster_idpublished_atrepresentative_story_titlestory_counttop_story_title
003995962202022-11-01 03:52:48+00:00Goldman Sachs pense que la Fed va continuer à ...173The Fed will raise interest rates more aggress...
104061827312022-11-20 06:29:14+00:00StockNews.com Initiates Coverage on Airgain (N...157Inotiv (NASDAQ:NOTV) Stock Rating Lowered by T...
204095070572022-11-18 20:27:20+00:00Is Digital Dollar Coming Soon?133Cryptomonaries: A First Test of the Digital Do...
304107623102022-11-28 12:53:02+00:00JPMorgan Chase & Co. Boosts Cooper Companies (...126Citigroup Boosts Ross Stores (NASDAQ:ROST) Pri...
404093120772022-11-15 05:23:32+00:00Advance Auto Parts (NYSE:AAP) Upgraded to “Buy...116Advance Auto Parts (NYSE:AAP) PT Lowered to $1...
\n", "
" ], "text/plain": [ " index cluster_id published_at \\\n", "0 0 399596220 2022-11-01 03:52:48+00:00 \n", "1 0 406182731 2022-11-20 06:29:14+00:00 \n", "2 0 409507057 2022-11-18 20:27:20+00:00 \n", "3 0 410762310 2022-11-28 12:53:02+00:00 \n", "4 0 409312077 2022-11-15 05:23:32+00:00 \n", "\n", " representative_story_title story_count \\\n", "0 Goldman Sachs pense que la Fed va continuer à ... 173 \n", "1 StockNews.com Initiates Coverage on Airgain (N... 157 \n", "2 Is Digital Dollar Coming Soon? 133 \n", "3 JPMorgan Chase & Co. Boosts Cooper Companies (... 126 \n", "4 Advance Auto Parts (NYSE:AAP) Upgraded to “Buy... 116 \n", "\n", " top_story_title \n", "0 The Fed will raise interest rates more aggress... \n", "1 Inotiv (NASDAQ:NOTV) Stock Rating Lowered by T... \n", "2 Cryptomonaries: A First Test of the Digital Do... \n", "3 Citigroup Boosts Ross Stores (NASDAQ:ROST) Pri... \n", "4 Advance Auto Parts (NYSE:AAP) PT Lowered to $1... " ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# create dataframe in the format we want\n", "my_columns = ['cluster_id', 'representative_story_title', 'top_story_title', 'published_at', 'story_count']\n", "clusters_data_frame = pd.DataFrame(columns = my_columns)\n", "\n", "for cluster in clusters_output:\n", " \n", " data = [[\n", " cluster['id']\n", " # translate the stories to English where necessary\n", " , return_translated_content(cluster['representative_story'], 'title')\n", " , return_translated_content(cluster['stories'][0], 'title')\n", " , cluster['representative_story']['published_at']\n", " , cluster['story_count']\n", " ]]\n", " \n", " data = pd.DataFrame(data, columns = my_columns)\n", " # DataFrame.append was removed in pandas 2.0; pd.concat is the supported replacement\n", " clusters_data_frame = pd.concat([clusters_data_frame, data], sort=True)\n", " \n", "clusters_data_frame['published_at'] = pd.to_datetime(clusters_data_frame['published_at'], utc = True)\n", "\n", "pd.set_option('display.max_rows', 100)\n", "clusters_data_frame = 
clusters_data_frame.sort_values(by=['story_count'], ascending = False).reset_index(0)\n", "\n", "# convert story count to plotly friendly format\n", "clusters_data_frame['story_count'] = clusters_data_frame['story_count'].astype(np.int64)\n", "\n", "clusters_data_frame.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualize the Clusters in a scatterplot" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plotly.com" }, "data": [ { "marker": { "color": "rgba(40, 56, 78, 0.05)", "line": { "color": "rgb(40, 56, 78)", "width": 2 }, "size": [ 200, 181.5028901734104, 153.757225433526, 145.66473988439307, 134.10404624277456, 119.07514450867052, 117.91907514450867, 112.13872832369943, 92.48554913294798, 91.32947976878613, 90.17341040462428, 87.86127167630057, 80.92485549132948, 77.45664739884393, 76.30057803468208, 72.83236994219654, 71.67630057803468, 70.52023121387283, 61.27167630057804, 58.959537572254334, 58.959537572254334, 57.80346820809249, 57.80346820809249, 57.80346820809249, 56.64739884393064, 56.64739884393064, 55.49132947976879, 55.49132947976879, 54.335260115606935, 53.179190751445084, 52.02312138728324, 50.86705202312139, 48.554913294797686, 48.554913294797686, 48.554913294797686, 46.24277456647399, 46.24277456647399, 46.24277456647399, 45.08670520231214, 43.93063583815029, 42.77456647398844, 42.77456647398844, 41.61849710982659, 41.61849710982659, 40.46242774566474, 39.30635838150289, 39.30635838150289, 38.15028901734104, 38.15028901734104, 36.994219653179194, 35.83815028901734, 34.68208092485549, 33.52601156069364, 33.52601156069364, 33.52601156069364, 33.52601156069364, 32.369942196531795, 32.369942196531795, 32.369942196531795, 32.369942196531795, 32.369942196531795, 32.369942196531795, 32.369942196531795, 32.369942196531795, 31.213872832369944, 31.213872832369944, 31.213872832369944, 30.057803468208093, 
30.057803468208093, 30.057803468208093, 28.901734104046245, 28.901734104046245, 28.901734104046245, 27.745664739884393, 27.745664739884393, 25.433526011560694, 25.433526011560694, 24.277456647398843, 23.121387283236995, 21.965317919075144, 20.809248554913296, 19.653179190751445, 19.653179190751445, 18.497109826589597, 18.497109826589597, 18.497109826589597, 18.497109826589597, 17.341040462427745, 17.341040462427745, 16.184971098265898, 16.184971098265898, 16.184971098265898, 16.184971098265898, 16.184971098265898, 15.028901734104046, 15.028901734104046, 15.028901734104046, 13.872832369942197, 13.872832369942197, 12.716763005780347 ] }, "mode": "markers", "text": [ "Index: 0
No. Stories:173

Rep Story:
Goldman Sachs pense que la Fed va continuer
à remonter ses taux jusqu'en mars 2023
Top Story:
The Fed will raise interest rates more aggressively,
and Goldman expects interest rates to reach 5%
next March. ", "Index: 0
No. Stories:157

Rep Story:
StockNews.com Initiates Coverage on Airgain (NASDAQ:AIRG)
Top Story:
Inotiv (NASDAQ:NOTV) Stock Rating Lowered by TheStreet ", "Index: 0
No. Stories:133

Rep Story:
Is Digital Dollar Coming Soon?
Top Story:
Cryptomonaries: A First Test of the Digital Dollar
by the New York Fed ", "Index: 0
No. Stories:126

Rep Story:
JPMorgan Chase & Co. Boosts Cooper Companies (NYSE:COO)
Price Target to $330.00
Top Story:
Citigroup Boosts Ross Stores (NASDAQ:ROST) Price Target to
$116.00 ", "Index: 0
No. Stories:116

Rep Story:
Advance Auto Parts (NYSE:AAP) Upgraded to “Buy” by
StockNews.com
Top Story:
Advance Auto Parts (NYSE:AAP) PT Lowered to $170.00
", "Index: 0
No. Stories:103

Rep Story:
TJX Companies (NYSE:TJX) Given New $90.00 Price Target
at Citigroup
Top Story:
TJX Companies (NYSE:TJX) PT Raised to $85.00 at
Credit Suisse Group ", "Index: 0
No. Stories:102

Rep Story:
Robert W. Baird Trims Genpact (NYSE:G) Target Price
to $54.00
Top Story:
Open Text (NASDAQ:OTEX) PT Lowered to $48.00 ", "Index: 0
No. Stories:97

Rep Story:
Hanesbrands (NYSE:HBI) Hits New 1-Year Low at $6.41

Top Story:
Paysafe (NYSE:PSFE) Hits New 12-Month Low at $1.27
", "Index: 0
No. Stories:80

Rep Story:
Citigroup to purchase Deutsche Bank's Mexico license
Top Story:
Citigroup to acquire Deutsche Bank's licence in Mexico
", "Index: 0
No. Stories:79

Rep Story:
對俄制裁色厲內荏? 美媒:政府悄悄要求大行與俄企保持業務往來
Top Story:
Baiden administration promotes business cooperation with Russian companies
", "Index: 0
No. Stories:78

Rep Story:
AZEK (NYSE:AZEK) Given New $26.00 Price Target at
UBS Group
Top Story:
Similarweb (NYSE:SMWB) Given New $10.00 Price Target at
Citigroup ", "Index: 0
No. Stories:76

Rep Story:
Syneos Health (NASDAQ:SYNH) PT Lowered to $30.00
Top Story:
Bilibili (NASDAQ:BILI) Lowered to Neutral at Citigroup ", "Index: 0
No. Stories:70

Rep Story:
Citigroup Cuts C.H. Robinson Worldwide (NASDAQ:CHRW) Price Target
to $94.00
Top Story:
Herbalife Nutrition (NYSE:HLF) Reaches New 52-Week Low After
Analyst Downgrade ", "Index: 0
No. Stories:67

Rep Story:
大摩:新兴市场货币至暗时刻已经过去
Top Story:
Worst Looks Over for Emerging Currencies, Morgan Stanley
Says - Bloomberg ", "Index: 0
No. Stories:66

Rep Story:
Top Picks für 2023 Bank of America: China-Aktien
hui, US-Big-Techs pfui!
Top Story:
Goldman Turns More Bullish on China Stocks, Upgrades
South Korea - Bloomberg ", "Index: 0
No. Stories:63

Rep Story:
Carrols Restaurant Group (NASDAQ:TAST) Price Target Cut to
$4.00 by Analysts at Deutsche Bank Aktiengesellschaft
Top Story:
First Watch Restaurant Group (NASDAQ:FWRG) Price Target Cut
to $22.00 ", "Index: 0
No. Stories:62

Rep Story:
Citigroup Trims Teradata (NYSE:TDC) Target Price to $40.00

Top Story:
Teradata (NYSE:TDC) Price Target Cut to $40.00 ", "Index: 0
No. Stories:61

Rep Story:
Citigroup Increases Chegg (NYSE:CHGG) Price Target to $29.00

Top Story:
Chegg (NYSE:CHGG) Price Target Increased to $24.00 by
Analysts at Morgan Stanley ", "Index: 0
No. Stories:53

Rep Story:
Reguladores instan a Citigroup a corregir plan de
quiebra
Top Story:
Agencies announce results of resolution plan review for
largest and most complex domestic banks ", "Index: 0
No. Stories:51

Rep Story:
PulteGroup (NYSE:PHM) Price Target Raised to $65.00
Top Story:
Genpact (NYSE:G) Price Target Cut to $52.00 by
Analysts at Citigroup ", "Index: 0
No. Stories:51

Rep Story:
SSE (LON:SSE) Given “Neutral” Rating at Citigroup
Top Story:
SSE (LON:SSE) Given “Neutral” Rating at Citigroup ", "Index: 0
No. Stories:50

Rep Story:
Aena S.M.E. (OTCMKTS:ANNSF) Upgraded at BNP Paribas
Top Story:
Visteon (NASDAQ:VC) Rating Increased to Neutral at JPMorgan
Chase & Co. ", "Index: 0
No. Stories:50

Rep Story:
经济担忧持续 国际投行掀起裁员潮
Top Story:
Citigroup Cuts Dozens of Jobs Across Investment-Banking Unit
- Bloomberg ", "Index: 0
No. Stories:50

Rep Story:
Highwoods Properties (NYSE:HIW) Price Target Lowered to $36.00
at Truist Financial
Top Story:
Markforged (NYSE:MKFG) Price Target Cut to $2.00 by
Analysts at Citigroup ", "Index: 0
No. Stories:49

Rep Story:
US asks banks to keep some ties with
Russia – media reports
Top Story:
Banks Asked to Keep Some Ties to Russia
", "Index: 0
No. Stories:49

Rep Story:
Infosys (NYSE:INFY) Rating Increased to Buy at StockNews.com

Top Story:
Freshpet (NASDAQ:FRPT) Raised to Neutral at Citigroup ", "Index: 0
No. Stories:48

Rep Story:
Citigroup Cuts Entegris (NASDAQ:ENTG) Price Target to $92.00

Top Story:
DexCom (NASDAQ:DXCM) Price Target Increased to $117.00 by
Analysts at Citigroup ", "Index: 0
No. Stories:48

Rep Story:
Arcturus Therapeutics (NASDAQ:ARCT) Shares Gap Up Following Analyst
Upgrade
Top Story:
Arcturus Therapeutics (NASDAQ:ARCT) Shares Gap Up Following Analyst
Upgrade ", "Index: 0
No. Stories:47

Rep Story:
Danaos (NYSE:DAC) PT Lowered to $65.00 at Citigroup

Top Story:
Morgan Stanley Increases Brinker International (NYSE:EAT) Price Target
to $30.00 ", "Index: 0
No. Stories:46

Rep Story:
Cheesecake Factory’s (CAKE) “Outperform” Rating Reaffirmed at Wedbush

Top Story:
Cheesecake Factory (NASDAQ:CAKE) Earns “Outperform” Rating from Wedbush
", "Index: 0
No. Stories:45

Rep Story:
Citi completes sale of Malaysia, Thailand consumer banking
to UOB
Top Story:
UOB completes acquisition of Citigroup’s consumer banking businesses
in Malaysia and Thailand ", "Index: 0
No. Stories:44

Rep Story:
Buy Chinese Stocks and Sell Big US Tech
Brands, BofA's Hartnett Says
Top Story:
Citi Strategists Say Exit Short US Equity Positions
in Shift - Bloomberg ", "Index: 0
No. Stories:42

Rep Story:
Susquehanna Boosts AES (NYSE:AES) Price Target to $33.00

Top Story:
AES (NYSE:AES) Price Target Raised to $33.00 at
Susquehanna Bancshares ", "Index: 0
No. Stories:42

Rep Story:
Carlos Slim se baja de compra de Banamex

Top Story:
Inbursa, by Carlos Slim, will not continue in
Banamex's purchase process ", "Index: 0
No. Stories:42

Rep Story:
Barclays Increases Fluor (NYSE:FLR) Price Target to $30.00

Top Story:
Citigroup Boosts Fluor (NYSE:FLR) Price Target to $29.00
", "Index: 0
No. Stories:40

Rep Story:
Citi comprará licencia de Deutsche Bank en México
para negocio de banca corporativa
Top Story:
Citigroup agreed to purchase Deutsche Bank license in
Mexico ", "Index: 0
No. Stories:40

Rep Story:
Jack in the Box (NASDAQ:JACK) Price Target Raised
to $88.00 at Citigroup
Top Story:
Jack in the Box (NASDAQ:JACK) Price Target Cut
to $73.00 by Analysts at Barclays ", "Index: 0
No. Stories:40

Rep Story:
Goodyear Tire & Rubber (NASDAQ:GT) Price Target Cut
to $13.00
Top Story:
Deutsche Bank Aktiengesellschaft Trims Goodyear Tire & Rubber
(NASDAQ:GT) Target Price to $10.00 ", "Index: 0
No. Stories:39

Rep Story:
Mercado pode ter se enganado em relação a
Lula, diz Citi
Top Story:
The economic indicators that Lula will inherit and
the new government's uncertainties ", "Index: 0
No. Stories:38

Rep Story:
暴跌逾50%后已然触底 特斯拉重获大摩、花旗“芳心
Top Story:
Tesla Is Value Opportunity as It Nears Morgan
Stanley Bear Case - Bloomberg ", "Index: 0
No. Stories:37

Rep Story:
Kodiak Sciences (NASDAQ:KOD) Price Target Lowered to $6.00
at Citigroup
Top Story:
Citigroup Lowers Kodiak Sciences (NASDAQ:KOD) Price Target to
$6.00 ", "Index: 0
No. Stories:37

Rep Story:
GoodRx (NASDAQ:GDRX) Given New $8.00 Price Target at
The Goldman Sachs Group
Top Story:
The Goldman Sachs Group Raises ChampionX (NASDAQ:CHX) Price
Target to $32.00 ", "Index: 0
No. Stories:36

Rep Story:
Royal Bank of Canada Cuts Drax Group (OTCMKTS:DRXGF)
Price Target to GBX 950
Top Story:
Royal Bank of Canada Cuts Drax Group (OTCMKTS:DRXGF)
Price Target to GBX 950 ", "Index: 0
No. Stories:36

Rep Story:
PACCAR (NASDAQ:PCAR) PT Raised to $97.00
Top Story:
Deutsche Bank Aktiengesellschaft Increases PACCAR (NASDAQ:PCAR) Price Target
to $97.00 ", "Index: 0
No. Stories:35

Rep Story:
NU (NYSE:NU) Earns Outperform Rating from Analysts at
Credit Suisse Group
Top Story:
Nu Skin Enterprises (NYSE:NUS) Upgraded to “Buy” at
StockNews.com ", "Index: 0
No. Stories:34

Rep Story:
Capri (NYSE:CPRI) Price Target Cut to $67.00
Top Story:
Capri (NYSE:CPRI) Price Target Cut to $58.00 by
Analysts at Morgan Stanley ", "Index: 0
No. Stories:34

Rep Story:
Wells Fargo & Company Boosts Agilent Technologies (NYSE:A)
Price Target to $150.00
Top Story:
Agilent Technologies (NYSE:A) Given New $150.00 Price Target
at Citigroup ", "Index: 0
No. Stories:33

Rep Story:
Starry Group (NYSE:STRY) Stock Rating Lowered by Citigroup

Top Story:
Starry Group (NYSE:STRY) Lowered to Sell at Citigroup
", "Index: 0
No. Stories:33

Rep Story:
花旗(C.US)巴克萊(BCS.US)已裁員數百人 華爾街裁員潮來襲!
Top Story:
Factbox-Corporate America leans on job cuts as recession
fears mount ", "Index: 0
No. Stories:32

Rep Story:
APA (NASDAQ:APA) Price Target Increased to $62.00 by
Analysts at Citigroup
Top Story:
APA (NASDAQ:APA) PT Raised to $62.00 ", "Index: 0
No. Stories:31

Rep Story:
曝美私下要求银行与俄企保持联系 知情人士透露称为了将对美国的不利影响降至最低_军事频道_中华网
Top Story:
Biden asks major US banks to maintain ties
with certain Russian companies ", "Index: 0
No. Stories:30

Rep Story:
Banamex: Carlos Slim no comprará banco mexicano
Top Story:
Inbursa, of Slim, abandons process to acquire Citibanamex
assets ", "Index: 0
No. Stories:29

Rep Story:
Citigroup Inc. (NYSE:C) Shares Sold by CapWealth Advisors
LLC
Top Story:
Citigroup Inc. (NYSE:C) Stock Position Lessened by Spears
Abacus Advisors LLC ", "Index: 0
No. Stories:29

Rep Story:
Frontier Group (NASDAQ:ULCC) Price Target Cut to $18.00

Top Story:
Frontier Group (NASDAQ:ULCC) Price Target Cut to $18.00
by Analysts at Cowen ", "Index: 0
No. Stories:29

Rep Story:
SVB Leerink Initiates Coverage on Alpine Immune Sciences
(NASDAQ:ALPN)
Top Story:
Sight Sciences (NASDAQ:SGHT) Price Target Raised to $11.00
at Morgan Stanley ", "Index: 0
No. Stories:29

Rep Story:
Qiagen (NYSE:QGEN) Shares Gap Up to $41.93
Top Story:
Qiagen (NYSE:QGEN) Price Target Cut to $55.00 ", "Index: 0
No. Stories:28

Rep Story:
Ook banken kondigen ontslagrondes aan
Top Story:
Citigroup, Barclays commence layoffs in investment banking business
", "Index: 0
No. Stories:28

Rep Story:
Mosaic (NYSE:MOS) Price Target Raised to $61.00
Top Story:
Mosaic (NYSE:MOS) Price Target Lowered to $60.00 at
Royal Bank of Canada ", "Index: 0
No. Stories:28

Rep Story:
Citigroup Trims Alector (NASDAQ:ALEC) Target Price to $17.00

Top Story:
Citigroup Trims Alector (NASDAQ:ALEC) Target Price to $17.00
", "Index: 0
No. Stories:28

Rep Story:
Ionis Pharmaceuticals (NASDAQ:IONS) Price Target Raised to $31.00
at Citigroup
Top Story:
Ionis Pharmaceuticals (NASDAQ:IONS) PT Raised to $31.00 at
Citigroup ", "Index: 0
No. Stories:28

Rep Story:
Barclays Boosts Trinseo (NYSE:TSE) Price Target to $25.00

Top Story:
Trinseo (NYSE:TSE) PT Raised to $24.00 at Citigroup
", "Index: 0
No. Stories:28

Rep Story:
Sociedad Química y Minera de Chile (NYSE:SQM) Price
Target Lowered to $125.00 at Deutsche Bank Aktiengesellschaft

Top Story:
Sociedad Química y Minera de Chile (NYSE:SQM) PT
Lowered to $112.00 ", "Index: 0
No. Stories:28

Rep Story:
MercadoLibre (NASDAQ:MELI) Price Target Cut to $1,400.00 by
Analysts at Credit Suisse Group
Top Story:
MercadoLibre (NASDAQ:MELI) Price Target Cut to $1,400.00 by
Analysts at Credit Suisse Group ", "Index: 0
No. Stories:28

Rep Story:
Embraer (NYSE:ERJ) Trading Down 5.1%
Top Story:
Embraer (NYSE:ERJ) Shares Gap Down on Analyst Downgrade
", "Index: 0
No. Stories:27

Rep Story:
Iberdrola (OTCMKTS:IBDRY) Rating Lowered to Equal Weight at
Morgan Stanley
Top Story:
Iberdrola (OTCMKTS:IBDRY) Cut to Neutral at Citigroup ", "Index: 0
No. Stories:27

Rep Story:
Symbotic (NASDAQ:SYM) PT Raised to $20.00
Top Story:
Gogoro (NASDAQ:GGR) Price Target Cut to $6.00 by
Analysts at Citigroup ", "Index: 0
No. Stories:27

Rep Story:
Phathom Pharmaceuticals (NASDAQ:PHAT) PT Lowered to $30.00 at
Guggenheim
Top Story:
Xenon Pharmaceuticals (NASDAQ:XENE) Price Target Lowered to $49.00
at Guggenheim ", "Index: 0
No. Stories:26

Rep Story:
Cricut (NASDAQ:CRCT) Price Target Increased to $9.00 by
Analysts at Citigroup
Top Story:
Cricut (NASDAQ:CRCT) Price Target Increased to $9.00 by
Analysts at Citigroup ", "Index: 0
No. Stories:26

Rep Story:
ZTO Express (Cayman) (NYSE:ZTO) PT Lowered to $33.00
at HSBC
Top Story:
ZTO Express (Cayman) (NYSE:ZTO) Price Target Cut to
$32.80 ", "Index: 0
No. Stories:26

Rep Story:
StockNews.com Upgrades Harley-Davidson (NYSE:HOG) to “Buy”
Top Story:
Harley-Davidson (NYSE:HOG) Now Covered by Citigroup ", "Index: 0
No. Stories:25

Rep Story:
MasTec (NYSE:MTZ) Price Target Raised to $112.00
Top Story:
Summit Materials (NYSE:SUM) PT Raised to $33.00 at
Citigroup ", "Index: 0
No. Stories:25

Rep Story:
Barclays Reaffirms “Overweight” Rating for Centrica (LON:CNA)
Top Story:
Centrica (LON:CNA) Receives Overweight Rating from Barclays ", "Index: 0
No. Stories:25

Rep Story:
Morgan Stanley Trims Assurant (NYSE:AIZ) Target Price to
$165.00
Top Story:
IQVIA (NYSE:IQV) Downgraded to Neutral at Citigroup ", "Index: 0
No. Stories:24

Rep Story:
Amplifon (OTCMKTS:AMFPF) Price Target Raised to €33.00
Top Story:
Fielmann Aktiengesellschaft (OTCMKTS:FLMNF) Price Target Lowered to €37.00
at HSBC ", "Index: 0
No. Stories:24

Rep Story:
AMC Entertainment (NYSE:AMC) Price Target Cut to $1.20
by Analysts at Citigroup
Top Story:
AMC Entertainment (NYSE:AMC) Hits New 12-Month Low at
$5.46 ", "Index: 0
No. Stories:22

Rep Story:
DoorDash (NYSE:DASH) Price Target Cut to $100.00 by
Analysts at Citigroup
Top Story:
DoorDash (NYSE:DASH) PT Lowered to $100.00 at Citigroup
", "Index: 0
No. Stories:22

Rep Story:
Citigroup Cuts Chindata Group (NASDAQ:CD) Price Target to
$8.90
Top Story:
SWK (NASDAQ:SWKH) Price Target Raised to $25.00 ", "Index: 0
No. Stories:21

Rep Story:
Mifel, pretendiente de Citi México, recluta a Apollo
y ADIA para financiar la oferta: fuentes
Top Story:
Citi Mexico suitor Mifel enlists Apollo, ADIA to
fund bid -sources By Reuters ", "Index: 0
No. Stories:20

Rep Story:
Arcus Biosciences (NYSE:RCUS) PT Lowered to $37.00
Top Story:
Arcus Biosciences (NYSE:RCUS) PT Lowered to $37.00 at
Citigroup ", "Index: 0
No. Stories:19

Rep Story:
SentinelOne (NYSE:S) Given New $25.00 Price Target at
BTIG Research
Top Story:
Citigroup Cuts Similarweb (NYSE:SMWB) Price Target to $10.00
", "Index: 0
No. Stories:18

Rep Story:
Citigroup Increases World Wrestling Entertainment (NYSE:WWE) Price Target
to $86.00
Top Story:
Cohu (NASDAQ:COHU) Rating Increased to Buy at Citigroup
", "Index: 0
No. Stories:17

Rep Story:
金融危機前兆?繼科技業巨頭後 華爾街投行也掀起裁員潮
Top Story:
Citigroup cuts dozens of jobs across investment-banking unit
", "Index: 0
No. Stories:17

Rep Story:
Citigroup Lowers Magna International (TSE:MG) Price Target to
C$83.00
Top Story:
Magna International (TSE:MG) Price Target Cut to C$83.00
", "Index: 0
No. Stories:16

Rep Story:
Citigroup, funds in talks to end lawsuit over
errant $500 mln...
Top Story:
Citigroup, funds in talks to end lawsuit over
errant $500 mln Revlon payment ", "Index: 0
No. Stories:16

Rep Story:
Citigroup Cuts loanDepot (NYSE:LDI) Price Target to $1.25

Top Story:
Citigroup Cuts loanDepot (NYSE:LDI) Price Target to $1.25
", "Index: 0
No. Stories:16

Rep Story:
Experian’s (EXPN) Neutral Rating Reiterated at Citigroup
Top Story:
Citigroup Reiterates “Neutral” Rating for Experian (LON:EXPN) ", "Index: 0
No. Stories:16

Rep Story:
Citigroup Raises Corebridge Financial (NYSE:CRBG) Price Target to
$25.00
Top Story:
Citigroup Increases Hims & Hers Health (NYSE:HIMS) Price
Target to $9.00 ", "Index: 0
No. Stories:15

Rep Story:
TechnipFMC (NYSE:FTI) PT Raised to $12.40
Top Story:
TechnipFMC (NYSE:FTI) PT Raised to $12.40 ", "Index: 0
No. Stories:15

Rep Story:
Research Analysts' Price Target Changes for November 17th
(AAOI, ACVA, BOC, CBRL, CIR, DNA, HLLY, IMPL,
JACK, KORE)
Top Story:
Open Text (NASDAQ:OTEX) Given New $30.00 Price Target
at Citigroup ", "Index: 0
No. Stories:14

Rep Story:
Bank of America (NYSE:BAC) Downgraded by Citigroup to
Neutral
Top Story:
Bank of America (NYSE:BAC) Downgraded by Citigroup to
“Neutral” ", "Index: 0
No. Stories:14

Rep Story:
Citigroup Trims CRISPR Therapeutics (NASDAQ:CRSP) Target Price to
$63.00
Top Story:
CRISPR Therapeutics (NASDAQ:CRSP) Price Target Lowered to $63.00
at Citigroup ", "Index: 0
No. Stories:14

Rep Story:
Accel Entertainment (NYSE:ACEL) Price Target Lowered to $13.00
at Macquarie
Top Story:
PropertyGuru (NYSE:PGRU) Now Covered by Citigroup ", "Index: 0
No. Stories:14

Rep Story:
Citigroup Boosts American Axle & Manufacturing (NYSE:AXL) Price
Target to $11.00
Top Story:
American Axle & Manufacturing (NYSE:AXL) PT Raised to
$11.00 at Citigroup ", "Index: 0
No. Stories:14

Rep Story:
AKTIE IM FOKUS: Uniper fällt weiter - Auch
Citi sagt: Rally nicht gerechtfertigt
Top Story:
AKTIE IM FOKUS: Uniper continues - also says
Citi: Rally not justified ", "Index: 0
No. Stories:13

Rep Story:
Youdao (NYSE:DAO) PT Lowered to $6.50
Top Story:
Youdao (NYSE:DAO) PT Lowered to $6.50 ", "Index: 0
No. Stories:13

Rep Story:
Grab (NASDAQ:GRAB) PT Lowered to $5.00
Top Story:
Grab (NASDAQ:GRAB) PT Lowered to $5.00 ", "Index: 0
No. Stories:13

Rep Story:
Stora Enso Oyj (OTCMKTS:SEOAY) Downgraded by Citigroup to
Neutral
Top Story:
Stora Enso Oyj (OTCMKTS:SEOAY) Stock Rating Lowered by
Citigroup ", "Index: 0
No. Stories:12

Rep Story:
AMC Entertainment (NYSE:AMC) Given New $1.10 Price Target
at Citigroup
Top Story:
AMC Entertainment (NYSE:AMC) PT Lowered to $1.10 at
Citigroup ", "Index: 0
No. Stories:12

Rep Story:
Citigroup Lowers Chindata Group (NASDAQ:CD) Price Target to
$8.90
Top Story:
Citigroup Lowers Chindata Group (NASDAQ:CD) Price Target to
$8.90 ", "Index: 0
No. Stories:11

Rep Story:
Talkspace (OTCMKTS:TALK) Given New $1.00 Price Target at
Citigroup
Top Story:
Talkspace (OTCMKTS:TALK) Price Target Cut to $1.00 " ], "type": "scatter", "x": [ "2022-11-01T03:52:48+00:00", "2022-11-20T06:29:14+00:00", "2022-11-18T20:27:20+00:00", "2022-11-28T12:53:02+00:00", "2022-11-15T05:23:32+00:00", "2022-11-17T15:19:44+00:00", "2022-11-10T15:18:37+00:00", "2022-11-09T19:49:24+00:00", "2022-11-11T03:05:05+00:00", "2022-11-08T09:49:59+00:00", "2022-11-25T13:39:58+00:00", "2022-11-08T07:03:26+00:00", "2022-11-07T09:58:14+00:00", "2022-11-07T14:59:29+00:00", "2022-11-24T10:52:29+00:00", "2022-11-15T11:41:42+00:00", "2022-11-04T11:56:42+00:00", "2022-11-04T06:25:11+00:00", "2022-11-23T22:45:54+00:00", "2022-11-22T15:29:16+00:00", "2022-11-18T12:23:28+00:00", "2022-11-05T07:25:23+00:00", "2022-11-10T20:58:48+00:00", "2022-11-20T11:53:05+00:00", "2022-11-08T19:37:34+00:00", "2022-11-12T05:23:30+00:00", "2022-11-07T12:28:56+00:00", "2022-11-02T15:21:59+00:00", "2022-11-10T11:59:44+00:00", "2022-10-29T13:09:50+00:00", "2022-11-03T07:55:19+00:00", "2022-11-23T16:33:18+00:00", "2022-11-07T16:26:21+00:00", "2022-11-24T13:46:48+00:00", "2022-11-07T18:33:55+00:00", "2022-11-09T16:39:05+00:00", "2022-11-17T21:37:30+00:00", "2022-11-02T14:00:28+00:00", "2022-11-12T04:47:01+00:00", "2022-11-24T01:17:12+00:00", "2022-11-11T18:33:29+00:00", "2022-11-10T14:43:00+00:00", "2022-11-18T16:09:23+00:00", "2022-10-30T10:17:01+00:00", "2022-11-09T14:55:38+00:00", "2022-11-04T14:08:16+00:00", "2022-11-22T15:53:26+00:00", "2022-11-02T05:56:32+00:00", "2022-11-10T08:38:01+00:00", "2022-11-14T13:43:55+00:00", "2022-11-08T08:10:18+00:00", "2022-11-24T00:02:23+00:00", "2022-11-03T18:59:48+00:00", "2022-10-29T11:05:27+00:00", "2022-11-23T07:00:48+00:00", "2022-11-10T10:31:24+00:00", "2022-11-10T19:13:02+00:00", "2022-11-09T14:53:04+00:00", "2022-11-14T08:32:52+00:00", "2022-11-10T15:19:39+00:00", "2022-11-14T14:00:08+00:00", "2022-11-18T15:05:48+00:00", "2022-10-31T13:35:06+00:00", "2022-11-07T19:06:51+00:00", "2022-11-04T06:35:33+00:00", "2022-11-25T10:02:31+00:00", 
"2022-11-11T15:33:19+00:00", "2022-11-15T22:26:52+00:00", "2022-11-23T14:43:07+00:00", "2022-10-29T03:09:18+00:00", "2022-11-10T09:36:02+00:00", "2022-11-14T07:03:18+00:00", "2022-11-24T08:15:01+00:00", "2022-11-22T20:46:31+00:00", "2022-11-07T16:52:33+00:00", "2022-11-05T16:15:02+00:00", "2022-11-26T08:41:56+00:00", "2022-11-27T20:35:15+00:00", "2022-11-25T10:15:37+00:00", "2022-11-23T13:07:03+00:00", "2022-11-17T08:47:02+00:00", "2022-11-12T01:34:17+00:00", "2022-11-25T06:33:28+00:00", "2022-11-10T22:55:19+00:00", "2022-11-09T14:54:31+00:00", "2022-11-01T10:18:19+00:00", "2022-11-16T18:22:44+00:00", "2022-11-15T22:44:10+00:00", "2022-11-17T22:27:23+00:00", "2022-11-14T13:36:21+00:00", "2022-11-26T08:37:34+00:00", "2022-11-11T07:48:28+00:00", "2022-11-26T08:52:41+00:00", "2022-11-25T08:37:37+00:00", "2022-11-18T15:46:01+00:00", "2022-11-08T14:04:20+00:00", "2022-11-03T12:06:31+00:00", "2022-11-26T10:05:22+00:00", "2022-11-23T14:25:16+00:00", "2022-11-16T10:00:07+00:00" ], "y": [ 173, 157, 133, 126, 116, 103, 102, 97, 80, 79, 78, 76, 70, 67, 66, 63, 62, 61, 53, 51, 51, 50, 50, 50, 49, 49, 48, 48, 47, 46, 45, 44, 42, 42, 42, 40, 40, 40, 39, 38, 37, 37, 36, 36, 35, 34, 34, 33, 33, 32, 31, 30, 29, 29, 29, 29, 28, 28, 28, 28, 28, 28, 28, 28, 27, 27, 27, 26, 26, 26, 25, 25, 25, 24, 24, 22, 22, 21, 20, 19, 18, 17, 17, 16, 16, 16, 16, 15, 15, 14, 14, 14, 14, 14, 13, 13, 13, 12, 12, 11 ] } ], "layout": { "height": 700, "legend": { "orientation": "h", "y": -0.1 }, "plot_bgcolor": "white", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": 
"white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 
0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 
0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { 
"gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Story Clusters Over Time" }, "xaxis": { "gridcolor": "rgb(204, 204, 204)", "linecolor": "rgb(204, 204, 204)" }, "yaxis": { "gridcolor": "rgb(204, 204, 204)", "linecolor": "rgb(204, 204, 204)" } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# split title stings over multiple lines for legibility\n", "split_title_string(clusters_data_frame, 'representative_story_title')\n", "split_title_string(clusters_data_frame, 'top_story_title')\n", "\n", "colours = {\n", " 'positive' : 'green'\n", " , 'positive_opaque' : 'rgba(138, 190, 6, 0.05)'\n", " \n", " , 'negative' : 'red'\n", " , 'negative_opaque' : 'rgba(228, 42, 58, 0.05)'\n", " \n", " , 'neutral' : 'rgb(40, 56, 78)'\n", " , 'neutral_opaque' : 'rgba(40, 56, 78, 0.05)'\n", " }\n", "\n", "#biggest cluster size\n", "big_cluster_size = 200\n", "\n", "# calculate the factor by which we will mutlipy all clusters to fit them on the graph\n", "factor = clusters_data_frame['story_count'].max()/big_cluster_size\n", "\n", "fig = go.Figure(data=go.Scatter(\n", " x=clusters_data_frame['published_at'],\n", " y=clusters_data_frame['story_count'],\n", " mode='markers',\n", " marker=dict(\n", " size=clusters_data_frame['story_count']/factor\n", " , line = dict(width=2, color = colours['neutral'])\n", " , color = colours['neutral' + '_opaque']\n", " ),\n", " text = 'Index: ' + clusters_data_frame['index'].astype(str)\n", " + '
No. Stories:' + clusters_data_frame['story_count'].astype(str)\n", " + '

Rep Story:
' \n", " + clusters_data_frame['representative_story_title_string'] \n", " + '
Top Story:
' \n", " + clusters_data_frame['top_story_title_string']\n", "))\n", "\n", "# format the chart\n", "fig.update_layout(\n", " title='Story Clusters Over Time',\n", " legend = dict(orientation = 'h', y = -0.1),\n", " plot_bgcolor='white',\n", " xaxis=dict(\n", " gridcolor='rgb(204, 204, 204)',\n", " linecolor='rgb(204, 204, 204)'\n", " )\n", " , yaxis=dict(\n", " gridcolor='rgb(204, 204, 204)',\n", " linecolor='rgb(204, 204, 204)'\n", " )\n", " , height=700\n", ")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion\n", "Here we have given a quick introduction to getting up and running with four of the AYLIEN News API's most frequently used endpoints: stories, timeseries, trends and clusters. With these code and visualization examples, you should be able to start exploring news data in no time!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "360px" },
"toc_section_display": true, "toc_window_display": true }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }